RTOS/AM5728: IPC CMEM area and shared regions

Part Number: AM5728
Other Parts Discussed in Thread: SYSBIOS

Tool/software: TI-RTOS

Hello,

I've got a few questions regarding the IPC memory map used in ex41_forwardmsg:

The devicetree specifies these address origins/sizes:
IPU2: 0x9580 0000 / 0x0380 0000 (56M)
DSP1: 0x9900 0000 / 0x0400 0000 (64M)
IPU1: 0x9D00 0000 / 0x0200 0000 (32M)
DSP2: 0x9F00 0000 / 0x0080 0000 (8M)
CMEM: 0xA000 0000 / 0x0C00 0000 (192M)

In the config.bld the following information is provided:
DSP1:
 *  Virtual     Physical        Size            Comment
 *  ------------------------------------------------------------------------
 *  9500_0000   ???0_0000    10_0000  (  ~1 MB) EXT_CODE
 *  9510_0000   ???0_0000    10_0000  (   1 MB) EXT_DATA
 *  9520_0000   ???0_0000    30_0000  (   3 MB) EXT_HEAP
 *  9F00_0000   ???0_0000     6_0000  ( 384 kB) TRACE_BUF
 *  9F06_0000   ???6_0000     1_0000  (  64 kB) EXC_DATA
 *  9F07_0000   ???7_0000     2_0000  ( 128 kB) PM_DATA (Power mgmt)
 *  BFC0_0000   ???0_0000    10_0000  (   1 MB) SR0 (Shared region)
IPU1:
 *  Virtual     Physical        Size            Comment
 *  ------------------------------------------------------------------------
 *  0000_4000   ???0_4000     F_C000  (  ~1 MB) EXT_CODE
 *  8000_0000   ???0_0000    20_0000  (   2 MB) EXT_DATA
 *  8020_0000   ???0_0000    30_0000  (   3 MB) EXT_HEAP
 *  9F00_0000   ???0_0000     6_0000  ( 384 kB) TRACE_BUF
 *  9F06_0000   ???6_0000     1_0000  (  64 kB) EXC_DATA
 *  9F07_0000   ???7_0000     2_0000  ( 128 kB) PM_DATA (Power mgmt)
 *  BFC0_0000   ???0_0000    10_0000  (   1 MB) SR0 (Shared region)
(I added the Shared Region 0 info)

rsc_table_dsp1:
#define DSP_MEM_TEXT            0x95000000
#define DSP_MEM_IOBUFS          0x80000000
#define DSP_MEM_DATA            0x95100000
#define DSP_MEM_HEAP            0x95200000
#define DSP_SR0_VIRT            0xBFC00000
#define DSP_SR0                 0xBFC00000
#define DSP_MEM_IPC_DATA        0x9F000000
#define DSP_MEM_IPC_VRING       0xA0000000
#define DSP_MEM_RPMSG_VRING0    0xA0000000
#define DSP_MEM_RPMSG_VRING1    0xA0004000
#define DSP_MEM_VRING_BUFS0     0xA0040000
#define DSP_MEM_VRING_BUFS1     0xA0080000
#define PHYS_MEM_IPC_VRING      0x99000000

rsc_table_ipu1:
#define IPU_MEM_TEXT            0x0
#define IPU_MEM_DATA            0x80000000
#define IPU_SR0_VIRT            0xBFC00000
#define IPU_SR0                 0xBFC00000
#define IPU_MEM_IPC_DATA        0x9F000000
#define IPU_MEM_IPC_VRING       0x60000000
#define IPU_MEM_RPMSG_VRING0    0x60000000
#define IPU_MEM_RPMSG_VRING1    0x60004000
#define IPU_MEM_VRING_BUFS0     0x60040000
#define IPU_MEM_VRING_BUFS1     0x60080000
#define PHYS_MEM_IPC_VRING      0x9D000000

- Are the IPU/DSP addresses all remapped by the offset given in PHYS_MEM_IPC_VRING?

- The sections TRACE_BUF, EXC_DATA and PM_DATA are at large offsets beyond the carveout. To which physical addresses do they point?

- IPU_SR0 and DSP_SR0 are located at 0xBFC00000 (phys). This is inside the RAM but outside of any carveout. How does the Linux side know that this area is used by Shared Region 0?


The files I expect to modify to change the memory map are:
- linux device tree include file to specify cma origin and length, also cmem origin and length
- config.bld (linker info)
- ipc.cfg.xs (shared region config)
- rsc_table.h (address info; two versions, one for dsp one for ipu)

Did I miss something?


For our application I'd like to exchange messages between Linux/DSP/IPU.
- Is it safe to use one Shared Region by the IPU and the DSP at the same time to allocate memory (for a MessageQ message)?
- Do I need to set up a Shared Region to allocate memory for MessageQ messages, or could I use a part of the CMEM area? (Does CMEM implement IHeap?)
- Shared Regions are for SYS/BIOS only. Can I send messages allocated in a Shared Region to a Linux MessageQ?


I would also like to have a large memory block shared between Linux/DSP/IPU; for that I'd use a chunk of RAM reserved by CMEM. Is this the recommended way?


Best regards,
Lo2


  • Hello,

    one more question: I'm trying to add a Shared Region as shown in example ex41.

    Still I get:

    ti.sdo.ipc.heaps.HeapBufMP: ERROR: line 812: assertion failure: A_noHeap: Region has no heap
    ti.sdo.ipc.heaps.HeapBufMP: line 812: assertion failure: A_noHeap: Region has no heap

     

    I added an entry in config.bld, provided a custom rsc_table.h (with SR0 entries), added the module in Dsp1.cfg (var HeapBufMP   = xdc.useModule('ti.sdo.ipc.heaps.HeapBufMP');), enabled the custom table (Resource.customTable = true).

    The resulting map file shows an entry for the SR0:

    MEMORY CONFIGURATION

             name            origin    length      used     unused   attr    fill
    ----------------------  --------  ---------  --------  --------  ----  --------
      L2SRAM                00800000   00040000  00000000  00040000  RW X
      OCMC_RAM1             40300000   00080000  00000000  00080000  RW X
      OCMC_RAM2             40400000   00100000  00000000  00100000  RW X
      OCMC_RAM3             40500000   00100000  00000000  00100000  RW X
      EXT_CODE              95000000   00100000  0002ca68  000d3598  RW X
      EXT_DATA              95100000   00100000  000730c0  0008cf40  RW  
      EXT_HEAP              95200000   00300000  00000000  00300000  RW  
      TRACE_BUF             9f000000   00060000  00008004  00057ffc  RW  
      EXC_DATA              9f060000   00010000  00000200  0000fe00  RW  
      PM_DATA               9f070000   00020000  00000000  00020000  RW X
      SR_0                  bfc00000   00100000  00100000  00000000  RW

     

    I've searched the documentation (the wiki) but can't find the missing bit.

    Best regards,

    Lo2

  • Hi Lo2,

    I will try to answer all your questions. In the meantime, please check whether the Big Data IPC example in the Processor SDK helps:

    processors.wiki.ti.com/.../Processor_SDK_Big_Data_IPC_Examples

    "The SharedRegion and HeapMem modules are not currently supported for Linux in the TI Standard IPC package.

    The example provides these modules with same/similar API implemented for Linux with some limitations.

    The CMEM APIs provide user space allocation of contiguous memory for the Big data buffers. "

    Regards,
    Garrett
  • Hi Garrett,

    thanks for the big data example, I'll take a detailed look at it! I didn't know about this example, I only looked in the ipc/examples subdirs.

    Regarding the ex02 and ex41 examples for DRA7xx_linux:

    The ex02 doesn't call Ipc_attach at all, yet MessageQ works; ex41 does call Ipc_attach (as described in the manual). This confuses me.

    If I have multiple threads, how do I ensure the first one calls Ipc_attach?

    Is it safe to call it from the main context, before creating any threads? (If it fails, I would skip thread creation anyway)

    Best regards,

    Lo2

  • Lo2,

    Ipc.procSync configures the startup protocol. When using Ipc.ProcSync_PAIR, Ipc_attach needs to be called. For ex41, see

    ./dsp1/Dsp1.cfg:Ipc.procSync = Ipc.ProcSync_PAIR;
    ./ipu1/Ipu1.cfg:Ipc.procSync = Ipc.ProcSync_PAIR;

    Here both the DSP and IPU are running RTOS. You can find the details on Ipc_start and Ipc_attach in the slides processors.wiki.ti.com/.../IPC_Training_2_21.pdf

    You need to call Ipc_attach in a task/thread after BIOS_start(). I'm not sure why you need the first thread to call Ipc_attach, but you can manage the task/thread priorities so that the first (highest-priority) task/thread calls Ipc_attach.

    Regards,
    Garrett
  • Hi Garrett,

    In ex41 Dsp1.cfg has the following two entries:

    var BIOS        = xdc.useModule('ti.sysbios.BIOS');
    BIOS.addUserStartupFunction('&IpcMgr_ipcStartup');

    var Ipc = xdc.useModule('ti.sdo.ipc.Ipc');
    Ipc.procSync = Ipc.ProcSync_PAIR;
    BIOS.addUserStartupFunction('&IpcMgr_callIpcStart');

    I assume that BIOS_start() will then call Ipc_start() for me, right? (via IpcMgr_callIpcStart() in IpcMgr.c)

    So unless I use ProcSync_ALL I need to pair to each target CPU. I won't use the ProcSync_ALL because I disabled one DSP and one IPU, and I'm not sure that would be handled correctly.

    Thanks!

    Lo2

  • Lo2,

    >>I assume that BIOS_start() will then call Ipc_start() for me, right? (via IpcMgr_callIpcStart() in IpcMgr.c)
    Correct.

    You should still be able to use ProcSync_ALL if you update the procNameAry with known connected cores in ex41_forwardmsg/shared/ipc.cfg.xs:

    var procNameAry = ["HOST", "IPU1", "DSP1"];

    Regards,
    Garrett
  • Hi Garrett,

    The SharedRegion is apparently not very useful to me: most of my messages sent originate on the SYSBIOS side.

    If I allocate memory on a SharedRegion and send it over to the Linux side, a message arrives but the contents are wrong (first uint32_t is mangled).
    Can you confirm that this is not supposed to work?

    ex41 sends messages from the SYSBIOS side to Linux but they are always allocated on the Heap0. Heap1, the SharedRegion Heap is only used for messages to the DSP/IPU, not Linux.

    Now I can use heap0 but the default size is set to 3MB and apparently a lot of it is in use. A few new MessageQ_alloc calls later they fail.

    - How can I see heap usage / debug heap allocation of heap0?
    - Is it sufficient to increase the heap size in config.bld?
    - How much memory does a MessageQ and a MessageQ_Message allocate on the heap?

    (My messages are usually a header (32 B) plus just 100 more bytes of data)


    Best regards,
    Lo2
  • Lo2,

    >>If I allocate memory on a SharedRegion and send it over to the Linux side, a message arrives but the contents are wrong (first uint32_t is mangled). Can you confirm that this is not supposed to work?
    A message can be allocated with or without a heap. Messages travel between Linux and BIOS over rpmsg, not through a SharedRegion. If you allocate the message in a SharedRegion, it still reaches Linux through rpmsg, but you need to ensure that "the first field of the message must still be a MsgHeader. To make sure the MsgHeader has valid settings, the application must call MessageQ_staticMsgInit()", see processors.wiki.ti.com/.../MessageQ_Module. The SharedRegion is not actually shared between Linux and SYS/BIOS.

    >>- How can I see heap usage / debug heap allocation of heap0?
    >>- Is it sufficient to increase the heap size in config.bld?
    >>- How much memory does a MessageQ and a MessageQ_Message allocate on the heap?

    Regarding heap usage, see processors.wiki.ti.com/.../HeapMP_Modules

    However, increasing the heap size doesn't seem to be a solution. It looks as if you keep allocating messages until the heap is used up. Do you free each message by calling MessageQ_free after the host has received it?

    Each message is limited to 512 bytes by the rpmsg buffer size, and "the number of buffers will be computed from the number of buffers supported by the vring, upto a maximum of 512 buffers (256 in each direction). Each buffer will have 16 bytes for the msg header and 496 bytes for the payload. This will utilize a maximum total space of 256KB for the buffers." See drivers/rpmsg/virtio_rpmsg_bus.c in the Linux kernel. The vring buffers are allocated from the CMA memory defined in the Linux device tree and referenced in the resource table. Just FYI, there is a patch to tune the rpmsg buffer size, patchwork.kernel.org/.../ but it has not been approved or merged into mainline yet.

    Regards,
    Garrett
  • Hi Garrett,

    thanks for the fast reply!


    > However increasing heap size doesn't seem to be a solution. It seems you keep allocating message and then use up heap. Have you freed the message by calling MessageQ_free after it's received by host?

    I do call MessageQ_free() on the Linux side.

    What I have is basically a free running SYSBIOS thread on the IPU in an endless loop doing:

    /* header */
    typedef struct {
        MessageQ_MsgHeader reserved;
        request_t request;
        uint32_t size;
        void *address;
    } memory_message_t;

    typedef struct {
        MessageQ_QueueId memoryQueue;
        MessageQ_Handle memoryResponseQueue;
    } sharedmemoryHandle_t;

    /* init */
    MessageQ_Params_init(&msgqParams);
    do {
        status = MessageQ_open(MEMORY_MESSAGE_QUEUE_NAME, &shm->memoryQueue);
        sleep(1);
    } while (status == MessageQ_E_NOTFOUND);

    MessageQ_Params_init(&msgqParams);
    sprintf(name, "MEMORYREPLY%i%i", MultiProc_self(), unique_id);
    shm->memoryResponseQueue = MessageQ_create(name, &msgqParams);

    /* later */
    for (;;) {   /* endless loop */
        memory_message_t *msg;
        msg = (memory_message_t *)MessageQ_alloc(MEMORY_HEAP_ID, sizeof(memory_message_t));
        MessageQ_setReplyQueue(shm->memoryResponseQueue, (MessageQ_Msg)msg);
        MessageQ_put(shm->memoryQueue, (MessageQ_Msg)msg);
        MessageQ_get(shm->memoryResponseQueue, (MessageQ_Msg *)&msg, MessageQ_FOREVER);
        MessageQ_free((MessageQ_Msg)msg);
        Task_sleep(200);
    }

    Now MEMORY_HEAP_ID is zero (using the default heap)
    I previously tried using a SharedRegion Heap, but as you explained, that works only for DSP<->IPU (sysbios-sysbios).

    Using the code above, it works for a while, but then fails: the request can only be 0 or 1 (an enum), but it has funny values after a while (on the Linux receiving side).
    I assume that the memory queue is thread safe, as the (test) code listed above runs in multiple threads on an AM57xx IPU (dual M4). For now I'm running only one thread and it fails too, so that is not the issue.

    Linux receiving code:

    for (;;) {   /* endless loop, runs in a pthread */
        status = MessageQ_get(memoryQueue, (MessageQ_Msg *)&msg, MEMORY_POLL_INTERVAL_MS*1000);
        if (status == MessageQ_S_SUCCESS) {
            MessageQ_QueueId queueId;
            queueId = MessageQ_getReplyQueue(msg);
            /* do stuff */
            MessageQ_put(queueId, (MessageQ_Msg)msg);
        }
    }


    Best regards,
    Lo2
  • Hi Lo2,

    Can you clarify what you mean by "but then fails: the request can only be 0 or 1 (an enum), but it has funny values after a while?" What values are you expecting and what values are you getting?

    Regards,
    Sahin
  • Hi Sahin,

    I created a new post about this:
    e2e.ti.com/.../640731

    There is an issue with the information attached to the message.

    My setup on the Linux side:
    Two pthreads running.

    - First thread creates a MessageQ "LOGGER" and loop{ polls it every 250ms}
    (I could use wait FOREVER but then I can't terminate the loop)
    The messages it receives are of type:
    typedef struct {
        MessageQ_MsgHeader reserved;
        UInt8 loglevel;
        char logentry[256];
    } log_message_t;
    Messages are received and contents processed (to file, stdout,..) works fine.
    Tested from IPU->Linux, DSP->Linux and Linux->Linux

    Messages are allocated on the IPU and freed in Linux


    - Second thread creates a MessageQ "MEMORY" and loop{ polls it every 250ms}
    (same here: I could use wait FOREVER but then I can't terminate the loop)
    The messages it receives are of type:
    typedef struct {
        MessageQ_MsgHeader reserved;
        request_t request;
        uint32_t size;
        void *address;
    } memory_message_t;

    With: typedef enum{MEM_ALLOCATE, MEM_FREE} request_t;

    Intent is to get CMEM blocks from Linux to the IPU via pointers (as in cmem example, big data example)
    Messages are allocated on the IPU and freed in Linux. There is no return path yet, there is also no access to CMEM yet. For now this is just a queue sending messages from the IPU to Linux.


    On the IPU (AM5728, so dual M4?), SYSBIOS creates three Tasks:
    - First one is a management/control task, logs a message every 1000 ticks to the LOGGER MessageQ
    - Second one is an endless loop; for testing it now does nothing but send logs to the LOGGER queue
    - Third task uses the LOGGER queue too and also sends memory requests:
    MessageQ_Params mqp_memory;
    MessageQ_Handle mq_memory;
    MessageQ_QueueId mq_id;

    MessageQ_Params_init(&mqp_memory);
    do {
        status = MessageQ_open(MEMORY_MESSAGE_QUEUE_NAME, &mq_id);
        Task_sleep(1);
    } while (status == MessageQ_E_NOTFOUND);
    for (;;) {   /* LOOP */
        logger(3, "i task: alloc");
        /* allocate a message */
        /* use default heap 0: */
        memory_message_t *msg2 = (memory_message_t *)MessageQ_alloc(0, sizeof(memory_message_t));
        if (msg2 == NULL) System_abort("i task: Message alloc failed");

        /* fill contents... */
        msg2->request = MEM_ALLOCATE;
        msg2->size = i++;

        /* ...and send */
        MessageQ_put(mq_id, (MessageQ_Msg)msg2);
    }

    On the Linux side I get the messages and print the contents:
    for (;;) {   /* LOOP */
        status = MessageQ_get(mq_memory, (MessageQ_Msg *)&msg, MQ_POLL_INTERVAL_MS*1000);
        if (status == MessageQ_S_SUCCESS) {
            printf("mem: request: %i, size: %i, address: 0x%x\n",
                   msg->request, msg->size, msg->address);
            MessageQ_free((MessageQ_Msg)msg);
        }
        else if (status == MessageQ_E_TIMEOUT) {
            /* poll interval elapsed without a message */
        }
        else {
            printf("memory thread: error receiving message 0x%x\n", status);
        }
    }

    Now here's the output:
    mem: request: 0, size: 0, address: 0x0
    mem: request: 0, size: 1, address: 0x0
    mem: request: 0, size: 2, address: 0x0
    mem: request: 0, size: 3, address: 0x0
    mem: request: 0, size: 4, address: 0x0
    mem: request: 0, size: 5, address: 0x0
    mem: request: 0, size: 6, address: 0x0
    mem: request: 0, size: 7, address: 0x0
    mem: request: 0, size: 8, address: 0x0
    mem: request: 0, size: 9, address: 0x0
    mem: request: 0, size: 10, address: 0x0
    mem: request: 0, size: 11, address: 0x0
    mem: request: 0, size: 12, address: 0x0
    mem: request: 0, size: 13, address: 0x0
    mem: request: 0, size: 14, address: 0x0
    mem: request: 0, size: 15, address: 0x0
    mem: request: 0, size: 16, address: 0x0
    mem: request: 1768843520, size: 17, address: 0x74206b73
    mem: request: 1768843520, size: 18, address: 0x74206b73
    mem: request: 1768843520, size: 19, address: 0x74206b73
    mem: request: 1768843520, size: 20, address: 0x74206b73
    mem: request: 1768710400, size: 21, address: 0x206b7361
    mem: request: 1768843520, size: 22, address: 0x74206b73
    mem: request: 1768843520, size: 23, address: 0x74206b73
    mem: request: 1768843520, size: 24, address: 0x74206b73
    mem: request: 1768843520, size: 25, address: 0x74206b73
    mem: request: 1768710400, size: 26, address: 0x206b7361
    mem: request: 1768843520, size: 27, address: 0x74206b73
    mem: request: 1768843520, size: 28, address: 0x74206b73
    mem: request: 1768843520, size: 29, address: 0x74206b73
    mem: request: 1768843520, size: 30, address: 0x74206b73
    ....

    If I run a single task on the IPU this _seems_ to be ok, but I can't be sure.

    So I'd expect request to be '0' all the time. (Previously this could have been 0 or 1 but now I've reduced the code to always send 0).

    If you're interested I can send you the project by mail.

    Best regards,
    Lo2
  • Hi Lo2,

    Your code looks OK, can you please attach your project so we can test it on our end?

    Regards,
    Sahin
  • Hello Sahin,


    There is one issue I found already, and I assume it's the cause for the issue mentioned above:

    I expected the size of an enum to be that of 'int', so sizeof(my_enum_t)==sizeof(int).

    The ti arm compiler for the M4 (ipu of AM57xx) says: IPU sizeof(my_enum_t): 1
    Linux/gcc says: Linux sizeof(my_enum_t): 4

    So if you use such an enum inside a MessageQ_Msg that will cause issues.

    For the issue above, I used this construct in the code:

    typedef enum{MEM_ALLOCATE, MEM_FREE} request_t;

    typedef struct{
    MessageQ_MsgHeader reserved;
    request_t request;
    uint32_t size;
    void* address;
    }memory_message_t;

    Due to the required cast, I get no errors, but the size of request_t depends on the compiler used. I expected it to be the size of int (that was my reading of C11).
    So for an AM57xx system, I'd expect it to be 4 bytes on all three cores (A15, M4, C6000).
    I've checked for IPU and Linux and they are different.

    I'm not sure why the ti compiler makes it 1 byte only, I use the following compiler flags:
    -qq -pdsw225 -ppd=$@.dep -ppa -@configuro/compiler.opt
    With -@configuro/compiler.opt being whatever xs generated.

    Can you please check if the ti arm compiler is supposed to make enums of size 1 byte?
    What does the C6000 compiler do?

    If you need the code, can you please provide me an email address?

    Best regards,
    Lo2
  • Hi Lo2,

    For TI ARM, set the compiler flag --enum_type=int, this should set it to 4 bytes. The C6000 compiler sets it to 4 bytes by default, but users have the option of making it 1 byte by setting the compiler flag --small_enum.

    I will send you an e-mail shortly for the project.

    Best,
    Sahin