Syslink notify implementation and linux kernel notify implementation (on DM814x)

Other Parts Discussed in Thread: TMS320DM8148

Hello,

We are working on a DSP application that communicates with user mode linux via SysLink on TMS320DM8148 rev 2.1. Our setup so far uses a Syslink kernel module compiled with USE_SYSLINK_NOTIFY=0 (i.e. we are using the linux kernel notify implementation).

We are seeing different behavior in our DSP application depending on whether SR0 is cacheable or non-cacheable [1], and we are considering using the notify implementation that comes with SysLink. Before we move to SysLink notify, we'd like to know which of the notify implementations is used during validation of the SysLink releases and for which one we are likely to get better support.

We have tried SysLink versions: 2.10.00.12, 2.20.02.20, 2.21.00.03 (each with its recommended dependencies) and are currently using:
syslink 2.21.00.03
ti-cgt6x 7.2.2
ti-cgt470 5.0.2
ti-xdctools 3.24.03.33
ti-sysbios 6.34.03.19
ti-ipc 1.25.01.09
Linux 2.6.37-psp04.04.00.01.patch1

Thank you
--
Delio Brignoli
AudioScience Inc

[1] We have read the documentation and understand the limitation of the linux kernel implementation regarding the "cacheability of shared region 0" on slave processors.

  • On that device, assuming the M3's are in use (via the VPSS driver), I think you are required to not set USE_SYSLINK_NOTIFY=1.  That is, you must use the kernel's notify driver, not SysLink's.

    You only get one notify driver in the system and either it's the kernel's (which doesn't support cached SR0, but is the only one the VPSS driver supports) or it's SysLink's (which can support cached SR0).

    Often we recommend SR0 remain non-cached, and recommend creating a separate SharedRegion for data/etc that you want cached.  This approach also helps for your use case, where you may not have the option of making SR0 cached.

    To your question of validation - on that device we don't validate USE_SYSLINK_NOTIFY=1, mainly b/c of the kernel notify limitation constraining any real use cases.

    Chris

  • Hello Chris, thank you for answering. The M3s are not in use and the VPSS driver is not compiled into our design. We have 3 separate issues:

    1. Our board has little RAM, so having SR0 not cacheable makes the memory map layout more complicated. MAR register granularity is 16MBytes, so to avoid ending up with SR1 and DSP program space being non-cached we have to lay the memory out like so (linux is instructed to leave alone the memory where SR0 resides):
       *  8000_0000 - 83FF_FFFF   400_0000  (  64 MB) External Memory
       *  ------------------------------------------------------------------------
       *  8000_0000 - 82FE_FFFF   2FF_0000  (  48 MB - 64 KB) Linux
       *  82FF_0000 - 82FF_FFFF     1_0000  (  64 KB) SR_0 (ipc)
       *  8300_0000 - 831F_FFFF    20_0000  (2048 KB) SR_1 (MessageQ buffers)
       *  8320_0000 - 83FF_FFFF    E0_0000  (  14 MB) DSP_PROG (code, data)
      Cache.MAR128_159 = 0x20000000;
      Being able to have SR0 cacheable in the last 16MBytes of memory would simplify things for us.

    2. When we compile/run with SR0 cacheable (using syslink notify) our application works fine, while with SR0 not cacheable (using kernel notify) messages take seconds (!) to go from one processor to the other. Even the ex02_messageq sample app sometimes hangs for a moment on startup when SR0 is not cacheable. On the DSP side we wait for messages, process them and send back responses via a separate messageq, so it's fairly similar to the ex02_messageq sample.

    3. When the DSP tries to send a message using a messageq (different from the one used in point (2)) we get a kernel fault. It looks like the host side syslink (library or module) detects an issue and tries to shut down but crashes the kernel in the process.

    We are working to simplify our usermode<->dsp communication to have just one messageq in each direction. This way we hope it will be easier to isolate the problem.
    Given that the M3s are not in use and the information above, would you still advise us to leave SR0 not cacheable?

    Thanks
    --
    Delio Brignoli
    AudioScience Inc

  • As a clarification point, the cache setting of Shared Regions can be different on each side.  That is, the Linux side can have SR0 non-cached, and BIOS can have SR0 cached - they just need to "configure things right".

    To further clarify, my earlier reply assumed you were just talking about Linux-side caching of SR0 (which has the kernel notify limitations).  The BIOS-side caching doesn't have that limitation - you can configure SR0 _from the BIOS side_ to be either cached or not... has nothing to do with which Linux-side notify driver is being used.  That may help with your MAR bit granularity - I don't know.

    With that, maybe you can clarify some of your points (enable cache from which side?) and I can try replying to them?

    Chris

  • Hello Chris,

    In my post I was talking about the SYS/BIOS side configuration of SR0 cacheability. I thought cacheability on the linux side and SYS/BIOS side had to match up for SR0.

    I want to make sure I understand what you are saying before going any further. This is how I understood SR0 cacheability:

    1- I have two knobs I can tweak when compiling a SYS/BIOS application: Cache.MAR* registers and the cacheEnable attribute of SharedRegion.Entry. In my tests when cacheEnable = false then the corresponding bit in Cache.MAR* must be set else syslink does not work. When cacheEnable = true, the corresponding bit in Cache.MAR* must not be set else syslink does not work. (It would be nice to document what happens when the cacheEnable attr is set/unset and that Cache.MAR* must follow).

    2- I have no control on cacheability of SR0 in linux because the syslink driver takes care of mapping the memory and I thought the cacheEnable attribute of SharedRegion.Entry only affected the SYS/BIOS side.

    3- Given the above, I read "On TI81XX, if SR0 is made cacheable, the SysLink Notify driver must be used <snip> The in-kernel Notify driver only works with non-cached SR0 and does not perform any cache coherency operations, therefore making SR0 cacheable may cause notifications in SysLink to be lost" in [1] to mean that using the kernel syslink notify required SR0 to be non-cacheable on the SYS/BIOS side and conversely a cacheable SR0 on the SYS/BIOS side required using the syslink notify driver.

    However, you seem to be saying I can use the kernel notify API with a SYS/BIOS application compiled with SR0 cacheable even though it seems to contradict "On TI81XX, if SR0 is made cacheable, the SysLink Notify driver must be used <snip> The in-kernel Notify driver only works with non-cached SR0 and does not perform any cache coherency operations, therefore making SR0 cacheable may cause notifications in SysLink to be lost" in [1].

    Can you point me to documentation that explains if/how my assumptions above are wrong? Or, alternatively, documentation that helps me 'configure things right' as you say.

    Ideally we would like SR0 to live in the last 16MBytes of the memory map and have it cacheable (at least on the SYS/BIOS side) like this:

    *  8000_0000 - 82FF_FFFF   300_0000  (  48 MB) Linux
    *  8300_0000 - 8300_FFFF     1_0000  (  64 KB) SR_0 (ipc)
    *  8301_0000 - 831F_FFFF    1F_0000  (1984 KB) SR_1 (MessageQ buffers)
    *  8320_0000 - 83FF_FFFF    E0_0000  (  14 MB) DSP_PROG (code, data)
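A layout like the one above can be sanity-checked mechanically before it goes into a .cfg script or onto the kernel command line. The following is a minimal sketch in C, assuming only the addresses listed in the table; the `MemRegion` type and the `region_size` and `map_is_contiguous` helpers are hypothetical names introduced for illustration and are not part of SysLink or SYS/BIOS.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* One row of the memory map: inclusive start and end addresses. */
typedef struct {
    const char *name;
    uint32_t    start;
    uint32_t    end;   /* inclusive */
} MemRegion;

/* Size in bytes of an inclusive [start, end] range. */
uint32_t region_size(const MemRegion *r)
{
    assert(r != NULL && r->end >= r->start);
    return r->end - r->start + 1u;
}

/* Returns 1 when every region starts right after the previous one ends. */
int map_is_contiguous(const MemRegion *map, size_t count)
{
    size_t i;

    for (i = 1; i < count; i++) {
        if (map[i].start != map[i - 1].end + 1u) {
            return 0;   /* gap or overlap between two neighbours */
        }
    }
    return 1;
}

/* The proposed layout, copied from the table above. */
const MemRegion proposed_map[] = {
    { "Linux",    0x80000000u, 0x82FFFFFFu },
    { "SR_0",     0x83000000u, 0x8300FFFFu },
    { "SR_1",     0x83010000u, 0x831FFFFFu },
    { "DSP_PROG", 0x83200000u, 0x83FFFFFFu },
};
```

Calling `region_size(&proposed_map[2])` gives the SR_1 size to compare against the size column, and `map_is_contiguous(proposed_map, 4)` flags any gap or overlap between rows.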

    At the same time we would also like to pick the syslink configuration that is more likely to receive better/quicker support.

    Thanks
    --
    Delio Brignoli
    AudioScience Inc

    [1] http://processors.wiki.ti.com/index.php/SysLink_Notify#Limitations_with_respect_to_cache_usage

  • FYI, here's a related thread that may be worth reading:   http://e2e.ti.com/support/embedded/bios/f/355/t/201603.aspx

    And b/c I'll bring it up later, note that there are 2 ways to set up a SharedRegion - statically and dynamically.

    • Static SharedRegions are set up in your BIOS .cfg script (like I think you're doing).  These static SharedRegions persist for the life of your BIOS-side application, and are "always there".
    • Dynamic SharedRegions are set up at runtime using the SharedRegion_setEntry() API.
    Regardless of how they're created, static or dynamic, the SharedRegion has a .cacheEnable attribute that must be correctly set.

    Delio Brignoli said:
    1- I have two knobs I can tweak when compiling a SYS/BIOS application: Cache.MAR* registers and the cacheEnable attribute of SharedRegion.Entry. In my tests when cacheEnable = false then the corresponding bit in Cache.MAR* must be set else syslink does not work. When cacheEnable = true, the corresponding bit in Cache.MAR* must not be set else syslink does not work. (It would be nice to document what happens when the cacheEnable attr is set/unset and that Cache.MAR* must follow).

    Yes, on the BIOS side you have two cache-related knobs that must be consistent:

    1. Enabling the cache in the hardware.  On C6x devices, this is done via the Cache.MAR settings.
    2. Configuring each SharedRegion.cacheEnable field consistent with "however the HW is configured".  This enables SW using the SharedRegion (e.g. MessageQ) to determine whether cache management is necessary.

    Delio Brignoli said:
    2- I have no control on cacheability of SR0 in linux because the syslink driver takes care of mapping the memory and I thought the cacheEnable attribute of SharedRegion.Entry only affected the SYS/BIOS side.

    Not quite.  Forgetting about the in-kernel notify driver limitation for now(!), you have the same 2 knobs on the Linux side when using SysLink.

    1. Enable the cache in hardware - in Linux, this is done when SysLink's kernel driver maps the memory backing the SR.  You can use the SysLink_params variable to tell SysLink to map given SRs with cache enabled (in your SR0-specific case, set SysLink_params to "SharedRegion.entry[0].cacheEnable=TRUE;").  Be sure to set SysLink_params prior to calling SysLink_setup() for the first time.  (Note that the SL_PARAMS environment variable can also be used to set SysLink_params - that's nice as you don't have to change C files and recompile.)
    2. Configure the SharedRegion.cacheEnable field.  This one's a little tricky... SysLink detects statically created SharedRegions (like SR0!) when it loads the slave, and sets the .cacheEnable field based on the SysLink_params.
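As a rough illustration of knob 1, the sketch below builds the parameter string quoted above for a given region index and exports it through the SL_PARAMS environment variable. It uses only the C library; `format_sr_cache_enable` and `export_sl_params` are hypothetical helper names, not SysLink APIs, and the only string format assumed is the one quoted in this reply.

```c
#define _POSIX_C_SOURCE 200112L   /* for setenv() */
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/*
 * Format one parameter entry of the form quoted above, e.g.
 * "SharedRegion.entry[0].cacheEnable=TRUE;", into buf.
 * Returns 0 on success, -1 if buf is too small.
 */
int format_sr_cache_enable(char *buf, size_t len, unsigned region)
{
    int n;

    assert(buf != NULL && len > 0);
    n = snprintf(buf, len, "SharedRegion.entry[%u].cacheEnable=TRUE;", region);
    return (n < 0 || (size_t)n >= len) ? -1 : 0;
}

/*
 * Export the string through the SL_PARAMS environment variable so that a
 * SysLink application launched from this process inherits it.
 * Returns 0 on success, -1 on failure or if the string is empty.
 */
int export_sl_params(const char *params)
{
    assert(params != NULL);
    if (strlen(params) == 0) {
        return -1;
    }
    return setenv("SL_PARAMS", params, 1);
}
```

Inside the application itself, the equivalent step would be to assign the same string to SysLink_params before the first SysLink_setup() call, as described in point 1.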

    Now let's (finally!) remember that on TI81XX devices, the in-kernel notify driver doesn't know how to manage cache.  Because of this, if you're using that in-kernel notify driver on a TI81XX device, you can't enable cache on SR0.  In your DSP-only use case, maybe you don't have that limitation, and could switch to the SysLink Notify driver.

    Delio Brignoli said:
    3- Given the above, I read "On TI81XX, if SR0 is made cacheable, the SysLink Notify driver must be used <snip> The in-kernel Notify driver only works with non-cached SR0 and does not perform any cache coherency operations, therefore making SR0 cacheable may cause notifications in SysLink to be lost" in [1] to mean that using the kernel syslink notify required SR0 to be non-cacheable on the SYS/BIOS side and conversely a cacheable SR0 on the SYS/BIOS side required using the syslink notify driver.

    However, you seem to be saying I can use the kernel notify API with a SYS/BIOS application compiled with SR0 cacheable even though it seems to contradict "On TI81XX, if SR0 is made cacheable, the SysLink Notify driver must be used <snip> The in-kernel Notify driver only works with non-cached SR0 and does not perform any cache coherency operations, therefore making SR0 cacheable may cause notifications in SysLink to be lost" in [1].

    I've updated [1] to clarify it's only discussing the Linux-side cache setting, and some of the wording should be familiar from what we've been discussing here.  :)

    Delio Brignoli said:
    Ideally we would like SR0 to live in the last 16MBytes of the memory map and have it cacheable (at least on the SYS/BIOS side) like this:

    *  8000_0000 - 82FF_FFFF   300_0000  (  48 MB) Linux
    *  8300_0000 - 8300_FFFF     1_0000  (  64 KB) SR_0 (ipc)
    *  8301_0000 - 831F_FFFF    1F_0000  (1984 KB) SR_1 (MessageQ buffers)
    *  8320_0000 - 83FF_FFFF    E0_0000  (  14 MB) DSP_PROG (code, data)

    That should be fine - be sure to set the MAR bit to enable BIOS-side cache from 8300_0000 - 83FF_FFFF and configure both SR0 and SR1 to have .cacheEnable = true.
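To see which MAR register governs that range, the 16 MB granularity mentioned earlier in the thread can be turned into a small calculation. This is a sketch only: the helper names are hypothetical, and the mapping of a register index to a bit inside a 32-register mask such as Cache.MAR128_159 is an assumption (bit 0 taken as the first register of the group) that should be checked against the SYS/BIOS Cache module documentation.

```c
#include <assert.h>
#include <stdint.h>

/* Each MAR register controls cacheability of one 16 MB window. */
#define MAR_WINDOW_SHIFT 24u

/* Index of the MAR register whose 16 MB window contains addr. */
unsigned mar_index(uint32_t addr)
{
    return (unsigned)(addr >> MAR_WINDOW_SHIFT);
}

/* First address of the window controlled by MAR register `index`. */
uint32_t mar_window_base(unsigned index)
{
    assert(index < 256u);
    return (uint32_t)index << MAR_WINDOW_SHIFT;
}

/*
 * Bit for `index` inside a 32-register group mask.
 * ASSUMPTION: bit 0 of the mask corresponds to the first register of the
 * group (e.g. MAR128 for a mask named MAR128_159).
 */
uint32_t mar_group_bit(unsigned index, unsigned group_first)
{
    assert(index >= group_first && index - group_first < 32u);
    return (uint32_t)1u << (index - group_first);
}
```

Both `mar_index(0x83000000u)` and `mar_index(0x83FFFFFFu)` evaluate to 131, so the whole 8300_0000 - 83FF_FFFF range sits in a single MAR window, while an address such as 0x82FF0000 falls in window 130.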

    Delio Brignoli said:
    At the same time we would also like to pick the syslink configuration that is more likely to receive better/quicker support.

    On TI81XX devices, we only validate Linux systems using the in-kernel driver.  For the reasons I list above, the strong majority of our Linux customers configure their systems that way.  That said, when non-Linux OS's are run on the A8 (e.g. QNX, BIOS), we don't have the Linux in-kernel notify driver and therefore _do_ validate the SysLink Notify driver.

    There's always some risk when diverging from the beaten path.  But if you find issues, we'll be here to help.  :)

    Chris

  • Hello Chris,

    Thank you for the detailed answer. I have one more question about the details of the mechanism SysLink uses on the HLOS to decide if a region is cacheable or not. You wrote:

    Chris Ring said:

    Not quite.  Forgetting about the in-kernel notify driver limitation for now(!), you have the same 2 knobs on the Linux side when using SysLink.

    1. Enable the cache in hardware - in Linux, this is done when SysLink's kernel driver maps the memory backing the SR.  You can use the SysLink_params variable to tell SysLink to map given SRs with cache enabled (in your SR0-specific case, set SysLink_params to "SharedRegion.entry[0].cacheEnable=TRUE;").  Be sure to set SysLink_params prior to calling SysLink_setup() for the first time.  (Note that the SL_PARAMS environment variable can also be used to set SysLink_params - that's nice as you don't have to change C files and recompile.)
    2. Configure the SharedRegion.cacheEnable field.  This one's a little tricky... SysLink detects statically created SharedRegions (like SR0!) when it loads the slave, and sets the .cacheEnable field based on the SysLink_params.

    Now let's (finally!) remember that on TI81XX devices, the in-kernel notify driver doesn't know how to manage cache.  Because of this, if you're using that in-kernel notify driver on a TI81XX device, you can't enable cache on SR0.  In your DSP-only use case, maybe you don't have that limitation, and could switch to the SysLink Notify driver.

    I read point (2) (the tricky one) as meaning that, if I have a statically defined SR0 with .cacheEnable = true in a slave application, syslink will detect that "on slaveload" and use the value for setting .cacheEnable = true for the HLOS as well. Is this correct? If so, will the in-kernel notify implementation ignore the statically defined .cacheEnable = true attribute given that it cannot handle cacheable SR0?

    We'll be using the following memory map, setting .cacheEnable = true for SR0 statically and leaving MAR registers untouched (cacheable) with the in-kernel notify driver (USE_SYSLINK_NOTIFY=0):

    *  8000_0000 - 82FF_FFFF   300_0000  (  48 MB) Linux
    *  8300_0000 - 8300_FFFF     1_0000  (  64 KB) SR_0 (ipc)
    *  8301_0000 - 831F_FFFF    1F_0000  (1984 KB) SR_1 (MessageQ buffers)
    *  8320_0000 - 83FF_FFFF    E0_0000  (  14 MB) DSP_PROG (code, data)

    Does it look sane?

    Thank you
    --
    Delio Brignoli
    AudioScience Inc

  • Delio Brignoli said:
    I read point (2) (the tricky one) as meaning that, if I have a statically defined SR0 with .cacheEnable = true in a slave application, syslink will detect that "on slaveload" and use the value for setting .cacheEnable = true for the HLOS as well. Is this correct?

    No.  The HLOS side has its own .cacheEnable field (independent of the BIOS side's .cacheEnable field).  And the value of the HLOS-side field is set to the default (non-cached) unless it's overridden using the SysLink_params string (e.g. "SharedRegion.entry[0].cacheEnable=TRUE;").

    I've highlighted that in my original response here:

    Chris Ring said:
    2.  Configure the SharedRegion.cacheEnable field.  This one's a little tricky... SysLink detects statically created SharedRegions (like SR0!) when it loads the slave, and sets the .cacheEnable field based on the SysLink_params.

    I was too subtle maybe.
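The HLOS-side rule described in this reply (non-cached by default, cached only when the parameter string says so) can be modeled in a few lines of C. This is a model of the rule only and not SysLink's actual parser: `hlos_sr_cache_enabled` is a hypothetical function, and it recognizes nothing beyond the exact string form quoted above.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/*
 * Model of the rule only, NOT SysLink's real parsing code: returns 1 when
 * `params` contains an explicit cacheEnable=TRUE override for `region`,
 * otherwise 0 (the non-cached default). The BIOS-side .cacheEnable value
 * plays no part in the result.
 */
int hlos_sr_cache_enabled(const char *params, unsigned region)
{
    char key[64];
    int n;

    if (params == NULL) {
        return 0;   /* no override string: the default applies */
    }
    n = snprintf(key, sizeof key,
                 "SharedRegion.entry[%u].cacheEnable=TRUE;", region);
    assert(n > 0 && (size_t)n < sizeof key);
    return strstr(params, key) != NULL;
}
```

With no parameter string the function reports SR0 as non-cached on the HLOS side, whatever the slave's static configuration says. Only an explicit entry for that region index changes the answer.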

    Delio Brignoli said:

    We'll be using the following memory map, setting .cacheEnable = true for SR0 statically and leaving MAR registers untouched (cacheable) with the in-kernel notify driver (USE_SYSLINK_NOTIFY=0):

    *  8000_0000 - 82FF_FFFF   300_0000  (  48 MB) Linux
    *  8300_0000 - 8300_FFFF     1_0000  (  64 KB) SR_0 (ipc)
    *  8301_0000 - 831F_FFFF    1F_0000  (1984 KB) SR_1 (MessageQ buffers)
    *  8320_0000 - 83FF_FFFF    E0_0000  (  14 MB) DSP_PROG (code, data)

    Does it look sane?


    Sane as it gets!  :)  Hang in there.

    Chris

  • Chris,

    Chris Ring said:
    I was too subtle maybe.



    I thought the two points were introduced as separate ways of affecting cacheability on the HLOS side, but point (2) does not really affect cacheability, which is set by one of the methods described in point (1), hence my confusion. Anyway, all is good now!

    Thank you
    --
    Delio Brignoli
    AudioScience Inc