How to control rpmsg virtio buffer location (ARM-DSP IPC)?

Hi,

There are a couple of related posts in the forum (e.g. e2e.ti.com/.../1317596) but none of them answers the particular question presented here. This problem concerns a Keystone 2 device:

In DSP bios configuration the line

Cache.setMarMeta(0xA0000000, 0x0FFFFFFF, 0);

is said to disable caching in the given range because the vring buffers reside at that location. I have two related questions:

1) Does 0xa000 0000 refer to "real" DDR3B address or MPAX mapped DDR3A address at 0x08 2000 0000? The MPAX reset value for our device maps the range 0x8000 0000-0xffff ffff to 0x08 0000 0000 - 0x08 7fff ffff.

2) We have reconfigured CONFIG_VMSPLIT and increased CONFIG_CMA_SIZE_MBYTES. It is thus very likely that our CMA block now overlaps the "hard coded" region starting at 0xa000 0000. How can we relocate the rpmsg/virtio buffers? Or are they relocated automatically, so that we only have to find out the new location and change the cache disabling accordingly?

Prompt answers are appreciated.

Regards,

Marko

  • Marko,

    1.
    If DDR3A_REMAP_EN pin is '1': (first 2GB of DDR3A)
    DDR3A - 00 8000 0000 to 00 FFFF FFFF is an alias of 08 0000 0000 to 08 7FFF FFFF (Real).

    If DDR3A_REMAP_EN pin is '0': (2GB)
    DDR3B - 00 8000 0000 to 00 FFFF FFFF (Real)
    DDR3B (first 512MB) - 00 6000 0000 to 00 7FFF FFFF is an alias of 00 8000 0000 to 00 FFFF FFFF (Real)
  • Hi Raja,

    Yes, I understand the relationship between the DDR3A_REMAP_EN pin and the ARM memory view. Are you saying that the DSP BIOS configuration uses the ARM view in the instruction that disables caching, Cache.setMarMeta(0xA0000000, 0x0FFFFFFF, 0)? I would have thought that the DSP BIOS configuration is purely DSP related and that memory is seen through the "DSP view" described in http://www.ti.com/lit/ds/sprs866e/sprs866e.pdf (pages 93-94). In that case the address 0xa000 0000 would be in DDR3B unless the MPAX setting already redirects it elsewhere at the configuration phase.

    Could you please clarify this a bit.

    Marko

  • If I understand correctly, the question about the DSP BIOS relates to code that runs on the DSP. The DSP core knows ONLY logical addresses, so any address in the DSP BIOS (which is 32-bit) is always a logical address.

    Now for the physical address. Unless you changed them, the MPAX registers used when the DSP reads or writes memory directly have the default value you mentioned: logical 0x8000 0000 is mapped to 0x8 0000 0000. If another master moves the data, the same default mapping comes from the SES MPAX registers (a different set of registers, described in a different user guide).
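
    As a worked illustration of the default mapping described above, here is a minimal host-side C sketch. It assumes only the reset mapping quoted in this thread (logical 0x8000 0000-0xffff ffff to physical 0x08 0000 0000-0x08 7fff ffff); the function name dsp_logical_to_physical is hypothetical, and a reprogrammed MPAX would need different constants.

```c
#include <assert.h>
#include <stdint.h>

/* Default MPAX window as quoted in this thread (assumption: reset values). */
#define LOGICAL_BASE  0x80000000u      /* start of the 32-bit DSP view    */
#define PHYSICAL_BASE 0x800000000ull   /* start of the 36-bit DDR3A range */

/* Hypothetical helper: translate a 32-bit DSP logical address inside the
 * default window into the corresponding 36-bit physical address. */
static uint64_t dsp_logical_to_physical(uint32_t logical)
{
    /* Addresses below the window are not covered by this mapping. */
    assert(logical >= LOGICAL_BASE);
    return PHYSICAL_BASE + (uint64_t)(logical - LOGICAL_BASE);
}
```

    Under that assumption, the address 0xa000 0000 used in app.cfg corresponds to physical 0x08 2000 0000, which matches the value in the original question.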

    For the ARM to be able to communicate with the DSP over IPC, a special section of memory must be defined, and the MPM must load the DSP code. This is how the ARM knows the location of the DSP IPC.

    Is this what you are asking? Do you need more information?

    Ran
  • Hi Ran,

    Thanks for the reply. Please take a look at my original post in this thread. I have understood that there is a pre-defined memory location in DDR3A which is used for communication between ARM and DSP (rpmsg virtio buffers). However, no one from TI has been able to explain where and how this memory section is defined.

    This issue is very much ARM Linux / MCSDK related. We have increased the size of the Contiguous Memory Allocator (CMA) block from the MCSDK default of 16M to 512M. I suspect that the increase causes the CMA block to overlap the "hard coded" 0xa000 0000 (or probably 0x08 2000 0000 after MPAX translation), which corrupts the rpmsg buffers and causes IPC between ARM and DSP to fail.

    So the question still remains: In some TI examples there is a hard coded address 0xa000 0000 in app.cfg to disable cache in DDR3 due to rpmsg virtio buffers being located in that area. How do we relocate the rpmsg virtio buffers to somewhere else? Surely it cannot be a fixed, hard-coded address that just happens to work in one given configuration i.e. MCSDK Linux distro.

    regards,

    Marko

  • Marko,

    My current working theory is that the address (0xa0000000) found in the app.cfg must correspond to the address in the dspmem node in the device tree include file found at arch/arm/boot/dts/k2hk.dtsi in the linux kernel files. The default device tree node looks like this:

    dspmem: dspmem {
         compatible = "linux,rproc-user";
         mem = <0x0c000000 0x000600000
                         0xa0000000 0x20000000>;
         label = "dspmem";
    };

    You can also see that the compatible string ("linux,rproc-user") from the device tree node also matches the driver found in /drivers/remoteproc/remoteproc_user.c.

    I would suggest moving this dspmem memory location if you think that it is being overwritten by your enlarged memory block. Also make sure to update the app.cfg file to reflect the move.

    I am fairly certain that this device tree node determines where the vrings are placed in memory but I am checking with some other guys on my team to be sure. I will let you know what they respond with.

    Here is a link that shows how to rebuild the Linux kernel and the device tree blob after making any changes:

    Thanks,

    Jason Reeder

  • Marko,

    It seems that I misspoke about where the vrings are placed into memory. The vrings are located based on what is in the resource table in the DSP image. So, if you need to move your vrings due to the increase in your CMA block size then you will need to update the vring location in your resource table as well as make sure the app.cfg configures that section of memory as non-cacheable.
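
    When choosing the non-cacheable range for a relocated vring area, it may help to recall that on C66x cores cacheability is controlled per 16 MB window through the MAR registers, so the range passed to Cache.setMarMeta maps onto a run of MAR indices. Below is a minimal host-side C sketch, assuming MAR n covers the 16 MB window starting at n * 0x01000000 and that every window touched by the range is affected. The helper name mar_indices_for is hypothetical and not part of SYS/BIOS.

```c
#include <assert.h>
#include <stdint.h>

#define MAR_WINDOW_SHIFT 24u   /* each MAR covers 2^24 bytes = 16 MB */

typedef struct {
    unsigned first;   /* index of the first MAR touched by the range */
    unsigned last;    /* index of the last MAR touched by the range  */
} mar_range;

/* Hypothetical helper: compute which MAR indices a (base, size) pair
 * touches, using the same arguments as Cache.setMarMeta(base, size, val). */
static mar_range mar_indices_for(uint32_t base, uint32_t byte_size)
{
    mar_range r;
    uint64_t last_byte;

    assert(byte_size > 0);
    last_byte = (uint64_t)base + byte_size - 1u;
    assert(last_byte <= 0xFFFFFFFFu);   /* must stay in the 32-bit DSP view */

    r.first = base >> MAR_WINDOW_SHIFT;
    r.last  = (unsigned)(last_byte >> MAR_WINDOW_SHIFT);
    return r;
}
```

    For the default call Cache.setMarMeta(0xA0000000, 0x0FFFFFFF, 0) this gives MAR 160 through MAR 175, i.e. 256 MB starting at 0xa000 0000. A relocated vring area only needs the windows that actually contain the vrings and their buffers.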

    Thanks,

    Jason Reeder
  • Thank you Jason for your reply.
    Could you still be a bit more specific please? What do you mean by "resource table in the DSP image"?

    regards,
    Marko
  • Marko,

    Can you point me to the example that you started your development from? (I should have started with this question)

    There should be a memory section in your DSP program image that is named '.resource_table'. You should be able to check out the map file from your DSP build process and see this section. Before the ARM host loads the program image into the DSP, it parses the program image looking for this section (specifically by the name of '.resource_table'). This resource table is how the DSP tells the ARM core all of the resources (memory, vrings, etc.) that it will need in order to operate. The ARM then allocates these resources and has them available to the DSP before loading and running the DSP with its program image.

    Here is a wiki page that talks about the resource table usage in IPC 3.00.01 and newer. This wiki page will show how to override the default resource table and use your own: 

    For an example of a resource table you can look in mcsdk_3_00_04_18/ipc_3_00_04_29/packages/ti/ipc/remoteproc/rsc_table_tci6638.h. Notice that the RPMSG_VRING0_DA and RPMSG_VRING1_DA definitions are set to 0xA0000000 and 0xA0004000 respectively. Also notice that a DATA_SECTION #pragma is used to place the ti_ipc_remoteproc_ResourceTable into the ".resource_table" section.
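
    To make the idea concrete, here is a minimal sketch of what a custom resource table with relocated vrings could look like. It is modelled on the Linux remoteproc resource table layout rather than copied from rsc_table_tci6638.h, so the struct names, the table name example_resource_table, the vring alignment, size and notify IDs, and the 0xB0000000 / 0xB0004000 addresses are all illustrative assumptions. The real header should be used as the starting point.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical relocated vring addresses (DSP view). The TI header uses
 * 0xA0000000 and 0xA0004000 by default, as noted above. */
#define RPMSG_VRING0_DA 0xB0000000u
#define RPMSG_VRING1_DA 0xB0004000u

#define TYPE_VDEV       3u   /* RSC_VDEV in the Linux remoteproc format */
#define VIRTIO_ID_RPMSG 7u   /* virtio device ID for rpmsg              */

struct fw_rsc_vdev_vring {
    uint32_t da;        /* device address of the vring  */
    uint32_t align;     /* alignment of the vring       */
    uint32_t num;       /* number of buffers            */
    uint32_t notifyid;  /* ID used when kicking the ring */
    uint32_t reserved;
};

struct fw_rsc_vdev {
    uint32_t type;      /* resource type header, TYPE_VDEV here */
    uint32_t id;
    uint32_t notifyid;
    uint32_t dfeatures;
    uint32_t gfeatures;
    uint32_t config_len;
    uint8_t  status;
    uint8_t  num_of_vrings;
    uint8_t  reserved[2];
};

struct example_table {
    uint32_t ver;
    uint32_t num;          /* number of resource entries     */
    uint32_t reserved[2];
    uint32_t offset[1];    /* byte offset of each entry      */
    struct fw_rsc_vdev       rpmsg_vdev;
    struct fw_rsc_vdev_vring vring0;
    struct fw_rsc_vdev_vring vring1;
};

/* On the TI compiler the table is placed in the section the ARM loader
 * searches for; on a host build the pragma is skipped. */
#ifdef __TI_COMPILER__
#pragma DATA_SECTION(example_resource_table, ".resource_table")
#endif
struct example_table example_resource_table = {
    1,                                   /* version                */
    1,                                   /* one entry: the vdev    */
    { 0, 0 },
    { offsetof(struct example_table, rpmsg_vdev) },
    { TYPE_VDEV, VIRTIO_ID_RPMSG, 0, 0, 0, 0, 0, 2, { 0, 0 } },
    { RPMSG_VRING0_DA, 4096, 256, 1, 0 },   /* illustrative values */
    { RPMSG_VRING1_DA, 4096, 256, 2, 0 },
};
```

    Whatever addresses are chosen, they would need to stay clear of the Linux CMA block and be covered by a matching non-cacheable range in app.cfg.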

    I have asked the development team if there is any more collateral that exists on this topic and I will let you know if I find anything. 

    Jason Reeder

  • Hi Jason.

    Finally an answer that says it all! Thank you very much for taking the time and explaining all the details. It is starting to make sense now. Custom resource table is definitely something that we need to do to solve this issue. I'll drop a note here after I get this checked and verified.

    I tried to look for the exact piece of SW that we used as a starting point but couldn't come up with anything. My guess is that the example code was an IPC demo of some sort. It could be related to the image processing demos or to something we received at a Keystone 2 multicore workshop.

    Best regards & big thanks,

    Marko

  • Hi Jason Reeder,

    I went through the MCSDK image processing demo application. In it I found a SYS/BIOS config file that marks the vring address range 0xa0000000 to 0xa1ffffff as uncacheable.

    With this setting, I observed on the DSP that memcpy from this uncached section to a cached DDR memory section takes more time (around 4.5 MCPS for a PAL resolution image memcpy).

    If this uncached area is forced to be cacheable using the "Cache_setMar" function, the memcpy time drops considerably, but the processed image captured on the ARM side becomes all zeros.

    My question is how to make this vring address range cacheable on the TCI6638K2K platform.


    Regards,

    Hemanth, P

  • Hemanth,

    Check out the block diagram of the TCI6638K2K device in Figure 1-1 of the device data sheet: www.ti.com/.../tci6638k2k.pdf

    The ARM and DSP cores have separate L1 and L2 data caches. If you allow the DSP to view the memory range as cacheable, then the data the DSP tries to write to DDR will only make it as far as the DSP's L1 or L2 data cache. The ARM core then cannot see it in DDR, which seems to be what you are experiencing.

    If you did mark the memory range as cacheable for the DSP, your DSP code would be responsible for invalidating and writing back the cache each time you write something to DDR that you need the ARM to see (which would be the same as having the memory be noncacheable except it requires more code).

    This is why shared memory IPC configures the shared data structures to reside in noncacheable memories.
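
    If the range were made cacheable anyway, each write-back or invalidate would have to cover whole cache lines around the shared buffer. The sketch below is plain host-side C that only computes that line-aligned span. It assumes the 128-byte L2 line size of the C66x, and the helper name cache_line_span is hypothetical. On the DSP, the resulting span would be handed to the SYS/BIOS cache calls (Cache_wb after the DSP writes, Cache_inv before the DSP reads), which are not invoked here.

```c
#include <assert.h>
#include <stdint.h>

#define L2_LINE_SIZE 128u   /* assumption: C66x L2 cache line size in bytes */

typedef struct {
    uint32_t start;   /* line-aligned start address            */
    uint32_t len;     /* length in bytes, multiple of the line */
} cache_span;

/* Hypothetical helper: widen (addr, len) to whole cache lines so that a
 * write-back or invalidate covers every line the buffer touches. */
static cache_span cache_line_span(uint32_t addr, uint32_t len)
{
    cache_span s;
    uint32_t end;          /* one past the last byte of the buffer */
    uint32_t aligned_end;  /* end rounded up to a line boundary    */

    assert(len > 0);
    end = addr + len;
    s.start = addr & ~(L2_LINE_SIZE - 1u);
    aligned_end = (end + L2_LINE_SIZE - 1u) & ~(L2_LINE_SIZE - 1u);
    s.len = aligned_end - s.start;
    return s;
}
```

    Note that invalidating a partially owned line can discard neighbouring data, so shared buffers in a cacheable region would also need to be line-aligned and padded. This extra bookkeeping is the "more code" mentioned above.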

    Jason Reeder