AM57X: CMEM problems with IPC between ARM and DSP

Hi,

Can anybody help me understand how to use CMEM? I have the following example:

I want to send an image from the ARM to the DSP so that the DSP can run a median filter on it. I learned from my last thread (https://e2e.ti.com/support/arm/sitara_arm/f/791/p/500457/1822502#1822502) that MessageQ only allows a payload of 512 bytes, so it is impossible to send a whole picture from the ARM to the DSP through MessageQ. Instead, I want to store the image in CMEM memory and send the DSP only the pointer. But it does not seem to work, and I can't find an example anywhere! The CMEM Overview page (http://processors.wiki.ti.com/index.php/CMEM_Overview?keyMatch=cmem&tisearch=Search-EN-Everything) isn't very useful to me: I need to see how the whole thing works, not just descriptions of the functions.

I am trying this:


ARM

        /* allocate message */
        msg = (App_Msg *)MessageQ_alloc(Module.heapId, Module.msgSize);

        if (msg == NULL) {
            status = -1;
            goto leave;
        }

        bufIn = CMEM_alloc(imgSize,&cMemAllocParams);
        if(bufIn == NULL)
            printf("CMEM_alloc failed!!!\n");
        else
            printf("CMEM_alloc okay (buf: %p)\n",bufIn);

        msg->info.in_data = (uint8_t*)CMEM_getPhys((void*)bufIn);
        if(msg->info.in_data == NULL)
            printf("CMEM_getPhys failed!!!\n");
        else
            printf("CMEM_getPhys okay (in: %p)\n",msg->info.in_data);


        /* set the return address in the message header */
        MessageQ_setReplyQueue(Module.hostQue, (MessageQ_Msg)msg);

        printf("Trying MessageQ_put...\n");
        /* send the message to the remote processor */
        if (MessageQ_put(Module.slaveQue1, (MessageQ_Msg)msg) < 0)
        {
            printf("MessageQ_put had a failure error\n");
            status = -1;
            goto leave;
        }
        printf("MessageQ_put okay!\n");

It seems to work, but with some strange output:

CMEM_alloc okay (buf: 0xa0161000)
[  161.087851] CMEMK Error: Failed to find a pool which fits 0x4b000
[  161.097870] CMEMK Error: get_phys: Unable to find phys addr for 0x0
[  161.109737] CMEMK Error: get_phys: get_user_pages() failed: -14
CMEM_getPhys okay (in: 0xa0000000)

Trying MessageQ_put...

MessageQ_put okay!



DSP:

        /* wait for inbound message */
        status = MessageQ_get(Module.slaveQue, (MessageQ_Msg *)&msg,
            MessageQ_FOREVER);

        Log_print0(Diags_ENTRY | Diags_INFO, "--> MessageQ_get okay!");

        if (status < 0) {
            goto leave;
        }

        if (msg->info.dataFlag== 1)
        {
            int i = 0;
            msg->info.dataFlag = 0;
            Log_print1(Diags_ENTRY | Diags_INFO, "--> msg->info.in_data (%p): ",msg->info.in_data);
            median_3x3_cn((unsigned char*)&msg->info.in_data, msg->info.cols,(unsigned char*)&msg->info.out_data);
        }

The address on the DSP seems right, but it is not working: there is wrong data at this address. What did I do wrong?
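
One detail in the DSP call above that may be worth checking: it passes `&msg->info.in_data`, which is the address of the pointer field inside the message, not the address stored in it. The sketch below illustrates the difference with a reduced, hypothetical stand-in for the message payload (`Info` and both helper names are made up for illustration). It says nothing about whether the stored address is valid on the DSP.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical, reduced stand-in for the payload part of App_Msg. */
typedef struct {
    uint8_t *in_data;   /* address of the image buffer */
    int      cols;
} Info;

/* Uses the address stored in the field: this points at the image. */
static const unsigned char *image_by_value(const Info *info)
{
    assert(info != NULL);
    return (const unsigned char *)info->in_data;
}

/* Uses the address OF the field, as in the DSP snippet: this points at
 * the pointer-sized slot inside the message, not at the image. */
static const unsigned char *image_by_address_of(const Info *info)
{
    assert(info != NULL);
    return (const unsigned char *)&info->in_data;
}
```

With an image buffer `img` and `Info info = { img, 4 };`, `image_by_value(&info)` returns `img`, while `image_by_address_of(&info)` returns the address of `info` itself, so a filter reading from it would process the bytes of the pointer and whatever follows it in the message.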

Here are the CMEM stats from "cat /proc/cmem":

root@am57xx-evm:~# cat /proc/cmem

Block 0: Pool 0: 1 bufs size 0xc000000 (0xc000000 requested)

Pool 0 busy bufs:

Pool 0 free bufs:
id 0: phys addr 0xa0000000

Target: EVM AM572x

IPC: 3.40.01.08

Processor SDK Linux: 2.00.01.07

DSP:SYS/BIOS: 6.45.00.19

  • I will ask the software team to comment.
  • Kevin,


    Here is a CMEM example:

    and look at the function void demoK2H_initCmemBufs(int cached).

    I have extracted the CMEM calls below:

      alloc_params.type = CMEM_POOL;
      alloc_params.alignment = 0;

      if ( cached == DEMO_BUFS_CACHED )
        alloc_params.flags = CMEM_CACHED;
      else
        alloc_params.flags = CMEM_NONCACHED;

      demoK2H_assert( (CMEM_init() == 0), 0, "ERROR: CMEM_init() ");

      cmem_buf_desc[0].physAddr = CMEM_allocPhys(2 * max_num_descs * demo_payload_size, &alloc_params);
      demoK2H_assert( (cmem_buf_desc[0].physAddr != 0), 0, "ERROR: CMEM_allocPhys() ");

      cmem_buf_desc[0].length = 2*max_num_descs*demo_payload_size;

      cmem_buf_desc[0].userAddr = CMEM_map( cmem_buf_desc[0].physAddr, cmem_buf_desc[0].length);
      demoK2H_assert( (cmem_buf_desc[0].userAddr != NULL), 0, "ERROR: CMEM_map() ");

    Also make sure you have the -D_FILE_OFFSET_BITS=64 option added to the compiler CFLAGS.

    Regards, Garrett
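
One possible reading of the -D_FILE_OFFSET_BITS=64 advice: the CMEM block on this board sits at physical address 0xa0000000, which is larger than a signed 32-bit offset can hold, so any interface that carries the address as an off_t needs the 64-bit variant. The sketch below only demonstrates that arithmetic. Both helper names are hypothetical and not part of CMEM, and the `#define` stands in for the compiler flag.

```c
/* Must come before any system header so that off_t becomes 64-bit on
 * 32-bit ARM Linux; equivalent to passing -D_FILE_OFFSET_BITS=64. */
#define _FILE_OFFSET_BITS 64

#include <assert.h>
#include <stdint.h>
#include <sys/types.h>

/* Hypothetical helper: returns 1 if the physical address survives a
 * round trip through off_t unchanged and non-negative. */
static int phys_fits_in_off_t(uint64_t phys)
{
    off_t off;

    /* Fails loudly if the 64-bit off_t setting did not take effect. */
    assert(sizeof(off_t) >= 8);
    off = (off_t)phys;
    return off >= 0 && (uint64_t)off == phys;
}

/* Same question for a signed 32-bit offset, which is what off_t is on
 * 32-bit ARM Linux when _FILE_OFFSET_BITS is not set to 64. */
static int phys_fits_in_32bit_offset(uint64_t phys)
{
    return phys <= (uint64_t)INT32_MAX;
}
```

For 0xa0000000 the first check passes and the second does not, which is the case the flag is meant to cover.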

  • If I try it like this, it is not working:

    #define PAYLOADSIZE 0xc000000

    unsigned char * payload;
    unsigned long payloadPhys;
    payloadPhys = CMEM_allocPhys(PAYLOADSIZE, &params);
    //payload = CMEM_alloc(PAYLOADSIZE, &params);
    if(payloadPhys == NULL)
    {
        printf("CMEM_allocPhys failed!!!\n");
        status = -1;
        goto leave;
    }
    else
        printf("CMEM_allocPhys okay (payloadPhys: %p)\n!",payloadPhys);

    //payloadPhys = CMEM_getPhys(payload);
    payload = CMEM_map(payloadPhys,PAYLOADSIZE);
    if(payload == NULL)
    {
        printf("CMEM_map failed!!!\n");
        status = -1;
        goto leave;
    }
    else
        printf("CMEM_map okay (payload: %p)\n!",payload);

    memcpy(payload, in_data, IMG_IPC_SIZE);


    Output:
    MessageQ_open OK!!
    CMEM_allocPhys okay (payloadPhys: 0xa0000000)
    CMEM Error: map: Failed to mmap buffer at physical address 0xc000000a0000000
    !CMEM_map failed!!!



    Is there any documentation for this example? Why do I have to use the CMEM_allocPhys and CMEM_map functions? The "nano_test.c" example uses CMEM_alloc and CMEM_getPhys.
  • Update:

    It is working now. For anybody who is interested, I will try to explain it here. This is a summary of all the threads I found that led to solving this IPC problem between ARM and DSP.
    There are several steps to consider to make IPC work between the ARM and DSP1:

    1. New entries in the RSC table for DSP1:
    - Use a custom RSC table (processors.wiki.ti.com/.../IPC_Resource_customTable) instead of the default one, rsc_table_vayu_dsp.h in ipc_x_xx_xx_xx/packages/ti/ipc/remoteproc/
    - Three new defines have to be set to tell the DSP where the physical memory of CMEM is, where it appears in the DSP's own virtual address space, and how large it is.
    Example:
    #define DSP_CMEM_IOBUFS 0x85000000 /* virtual address of the DSP memory section */
    #define CMEM_PHYS_IOBUFS 0xB0000000 /* physical address of the CMEM (shared) memory section */
    #define DSP_CMEM_IOBUFS_SIZE (SZ_1M * 90) /* size of the CMEM section */

    And don't forget the small modifications described in the custom table link, such as changing offset [18] to [19] and adding the new entry to the ResourceTable.
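
As a rough illustration of what these three values describe, the sketch below models the mapping window and checks whether a physical range falls inside it. `DevmemEntry` and `entry_covers` are hypothetical names used only here. The real entry layout is defined by the IPC resource table headers and has more fields.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define SZ_1M                 0x00100000u
#define DSP_CMEM_IOBUFS       0x85000000u   /* DSP-side address */
#define CMEM_PHYS_IOBUFS      0xB0000000u   /* physical address */
#define DSP_CMEM_IOBUFS_SIZE  (SZ_1M * 90)

/* Hypothetical, reduced mirror of a device-memory mapping entry. */
typedef struct {
    uint32_t da;    /* address as seen by the DSP */
    uint32_t pa;    /* physical address */
    uint32_t len;   /* length of the window in bytes */
} DevmemEntry;

static const DevmemEntry cmem_entry = {
    DSP_CMEM_IOBUFS, CMEM_PHYS_IOBUFS, DSP_CMEM_IOBUFS_SIZE
};

/* Returns 1 if [phys, phys + size) lies entirely inside the window. */
static int entry_covers(const DevmemEntry *e, uint32_t phys, uint32_t size)
{
    uint32_t offset;

    assert(e != NULL);
    if (phys < e->pa)
        return 0;
    offset = phys - e->pa;
    return offset <= e->len && size <= e->len - offset;
}
```

With the values from this post the window is 90 MB long, so `entry_covers(&cmem_entry, 0xB0000000u, 0x4b000u)` is 1, while a buffer starting 90 MB or more above 0xB0000000 is not covered.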


    2. Correct allocation of the CMEM
    - In my case the command line approach for CMEM described here (processors.wiki.ti.com/.../CMEM_Overview) was not working properly, so I had to change it in the file am57xx-evm-cmem.dtsi. The file is in the Linux Processor SDK folder under board-support/linux-4.1.13+......../arch/arm/boot/dts/. There you have to create an entry for the physical address described above.
    Example:
    cmem_block_mem_0: cmem_block_mem@b0000000 {
    reg = <0x0 0xb0000000 0x0 0x0c000000>;
    no-map;
    status = "okay";
    };

    In the CMEM section further down in this file, you can set the number of pools and the number and size of the buffers in each.
    Example:
    cmem_block_0: cmem_block@0 {
    reg = <0>;
    memory-region = <&cmem_block_mem_0>;
    cmem-buf-pools = <4 0x0 0x00100000>;
    };
    This defines one pool with 4 buffers of size 0x100000 each.

    - To build the changes, go to the top-level Linux directory linux-4.1.13... and run make ARCH=arm CROSS_COMPILE=arm-linux-gnueabihf- am57xx-evm.dtb
    This builds a new dtb file, which you have to copy into the boot folder on the target, overwriting the existing one. For the changes to take effect, you have to reboot the target. Afterwards you can check the result with cat /proc/cmem.
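
To make the effect of the pool sizes more concrete, here is a simplified model of pool-based allocation: a request is served by the pool with the smallest buffer size that still fits and has a free buffer. This is a sketch of the idea only, not the CMEM kernel module's code, and `Pool` and `find_pool` are hypothetical names.

```c
#include <assert.h>
#include <stddef.h>

/* Simplified, hypothetical model of one CMEM pool: a fixed buffer size
 * and the number of buffers that are still free. */
typedef struct {
    size_t   buf_size;
    unsigned free_bufs;
} Pool;

/* Returns the index of the pool with the smallest buffer size that fits
 * the request and has a free buffer, or -1 if no pool qualifies. */
static int find_pool(const Pool *pools, size_t npools, size_t request)
{
    int best = -1;
    size_t i;

    assert(pools != NULL || npools == 0);
    for (i = 0; i < npools; i++) {
        if (pools[i].free_bufs == 0 || pools[i].buf_size < request)
            continue;   /* exhausted, or buffers too small */
        if (best < 0 || pools[i].buf_size < pools[(size_t)best].buf_size)
            best = (int)i;
    }
    return best;
}
```

In this model, a configuration with buffers of 0x100000 bytes serves a request of 0x4b000 bytes, but a request of 0x200000 bytes finds no pool that fits, and so does any request once all buffers are in use.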

    3. Using the CMEM calls on the ARM side and the Resource calls on the DSP side
    - On the ARM side, you have to include the cmem.h file and link the CMEM libraries.
    - Call the following functions in this order:
    -> CMEM_init() => initializes CMEM
    -> virtualBuffer = CMEM_alloc(Size, &CMEM_params) => returns a virtual buffer into which you can copy your data
    -> CMEM_getPhys(virtualBuffer) => returns the physical address of that virtual buffer

    Then you can send the physical address to the DSP via MessageQ_put.
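
The ARM-side order of calls could be sketched roughly as follows. Several things here are assumptions: the header path `<ti/cmem.h>`, the `USE_TI_CMEM` build switch, and the helper names `stage_image` and `release_image` are made up for illustration, and the stand-in functions in the `#else` branch exist only so the sketch compiles on a machine without the TI library. They mimic the call shapes and return a fixed fake address.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>
#include <sys/types.h>

#ifdef USE_TI_CMEM                 /* hypothetical build switch */
#include <ti/cmem.h>               /* header path is an assumption */
#else
/* Host-only stand-ins, NOT the real CMEM implementation or values. */
typedef struct { int type; int flags; size_t alignment; } CMEM_AllocParams;
enum { CMEM_POOL = 1, CMEM_NONCACHED = 0 };
static int CMEM_init(void) { return 0; }
static void *CMEM_alloc(size_t size, CMEM_AllocParams *p)
{ (void)p; return malloc(size); }
static off_t CMEM_getPhys(void *ptr)
{ return ptr ? (off_t)0xB0000000LL : 0; }
static int CMEM_free(void *ptr, CMEM_AllocParams *p)
{ (void)p; free(ptr); return 0; }
#endif

static CMEM_AllocParams image_params(void)
{
    CMEM_AllocParams params;
    params.type = CMEM_POOL;
    params.flags = CMEM_NONCACHED;  /* keeps cache handling out of the sketch */
    params.alignment = 0;
    return params;
}

/* Copies an image into a CMEM buffer and stores the 32-bit physical
 * address that would go into the MessageQ message. Returns the ARM-side
 * buffer, or NULL on any failure. */
static void *stage_image(const uint8_t *img, size_t len, uint32_t *phys_out)
{
    CMEM_AllocParams params = image_params();
    void *buf;
    off_t phys;

    assert(img != NULL && phys_out != NULL);
    if (CMEM_init() != 0)           /* normally done once at startup */
        return NULL;
    buf = CMEM_alloc(len, &params);
    if (buf == NULL)
        return NULL;
    phys = CMEM_getPhys(buf);
    if (phys <= 0 || (uint64_t)phys > UINT32_MAX) {
        CMEM_free(buf, &params);    /* do not send an unusable address */
        return NULL;
    }
    memcpy(buf, img, len);
    *phys_out = (uint32_t)phys;
    return buf;
}

static void release_image(void *buf)
{
    CMEM_AllocParams params = image_params();
    CMEM_free(buf, &params);
}
```

Unlike the code in the first post, this version stops as soon as a step fails instead of printing a message and sending the message anyway. The value written to `phys_out` is what would be stored in the message before MessageQ_put.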

    - On the DSP side you have to poll your MessageQ. To learn more about MessageQ, check out the example ex02messageq of DRA7xx_linux. After the ARM has sent a message with the physical address, call Resource_physToVirt(physicalAddressCMEM, &virtualAddressDSP).
    -> For a buffer at the start of the CMEM section, virtualAddressDSP equals DSP_CMEM_IOBUFS as described above. With this address you can access the data you wrote on the ARM side.
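
Conceptually, the translation on the DSP side amounts to adding the buffer's offset within the CMEM section to the DSP-side base address. The sketch below shows that arithmetic with the example defines from step 1. `cmem_phys_to_dsp` is a hypothetical stand-in for illustration and not the IPC implementation of Resource_physToVirt.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define SZ_1M                 0x00100000u
#define DSP_CMEM_IOBUFS       0x85000000u   /* DSP-side address */
#define CMEM_PHYS_IOBUFS      0xB0000000u   /* physical address */
#define DSP_CMEM_IOBUFS_SIZE  (SZ_1M * 90)

/* Hypothetical stand-in: translates a CMEM physical address into the
 * DSP-side address. Returns 0 on success, -1 if the address lies
 * outside the mapped window. */
static int cmem_phys_to_dsp(uint32_t phys, uint32_t *dsp_addr)
{
    assert(dsp_addr != NULL);
    if (phys < CMEM_PHYS_IOBUFS ||
        phys - CMEM_PHYS_IOBUFS >= DSP_CMEM_IOBUFS_SIZE)
        return -1;
    *dsp_addr = DSP_CMEM_IOBUFS + (phys - CMEM_PHYS_IOBUFS);
    return 0;
}
```

A buffer at physical address 0xB0000000 maps to 0x85000000, and one at 0xB0100000 maps to 0x85100000. An address outside the section, such as 0xA0000000, is rejected.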



    And that's the magic. It took me a long time to get everything together, and I have described it the way I understood it. It may differ a bit from the correct explanation, but at least it leads to a result. So if I described something wrong, I am sorry; please correct me.
  • Thanks for that write up Kevin. You are a bit ahead of us as you are already integrating ARM and DSP. We are still writing the code. I'll be sure to share this with our Linux developers. Thanks again.

    (The usage of virtual and physical addresses is also interesting. See, I come from the DSP side only. So, I would have described DSP_CMEM_IOBUFS as the physical address to the DSP. I guess it is all relative. Your description is more correct though, I suppose. I prefer to just say the DSP address and the ARM address; saying virtual confuses me.)
  • Question about this comment, "On the DSP side you have to poll your MessageQ. " I had planned to do the following:

    In a separate task on the DSP,

    int status = MessageQ_get(msgQRx->Module.slaveQue, (MessageQ_Msg *)&msg, MessageQ_FOREVER);

    This call should only return when the ARM sends the DSP a message via MessageQ. So you should not have to poll. Did you learn something different?
  • Yes, I should just say DSP and ARM address, that is easier to understand :)

    Sorry for the confusion, that's what I meant by polling: using MessageQ_get and waiting until something is received. I thought "polling" would fit as the term here.
  • Let's call it pending. Polling implies a loop checking some status bit or magic data word over and over. Pending is much better, as it releases control to other threads. Glad you got it working!
  • Hi Alexander,

    The example worked with one DSP, DSP1. What needs to be changed to use both DSP1 and DSP2? As I understand it, the reserved memory space for DSP2 is different from that of DSP1.

    Thanks

  • Actually I didn't try it for both DSPs.
    As far as I understand, DSP2 is very similar to DSP1, except for memory allocation via the device tree.
    Now this is my guess:
    divide CMEM between the DSPs and write its map to config.bld,
    like:
    var evmDRA7XX_CMEM_DSP1 = {
            name: "CMEM", space: "data", access: "RW",
            base: 0xA0000000, len: 0x6000000,
            comment: "CMEM Memory (96 MB)"
    };
    var evmDRA7XX_CMEM_DSP2 = {
            name: "CMEM", space: "data", access: "RW",
            base: 0xA6000000, len: 0x6000000,
            comment: "CMEM Memory (96 MB)"
    };
    then put them in Build.platformTable["ti.platforms.evmDRA7XX:dsp1"] and Build.platformTable["ti.platforms.evmDRA7XX:dsp2"] respectively.

    Change PHYS_MEM_IPC_VRING for DSP2 in rsc_table_dsp.h to match the reserved-memory node in am57xx-beagle-x15-common.dtsi,
    for example dsp2_cma_pool: dsp2_cma@a3000000

    #define PHYS_MEM_IPC_VRING      0xa3000000

    Change this for both DSPs in rsc_table_dsp.h:

    (DSP1)
    #define PHYS_CMEM_IOBUFS 0xA0000000
    #define DSP_CMEM_IOBUFS_SIZE (SZ_1M * 96)

    (DSP2)
    #define PHYS_CMEM_IOBUFS 0xA6000000
    #define DSP_CMEM_IOBUFS_SIZE (SZ_1M * 96)

    For now I have no board accessible, so I cannot tell you if this will work. But it is a good starting point.

  • Now I am getting a 'Failed to find a pool which fits 0x70' error. That is a small PAYLOADSIZE, which one would think would not be a problem. The larger amount defined as (0x100000 * 7) also does not work.
    I think I will open another post for this.
    Thanks for the help.
  • Actually I was wrong, you should not split CMEM between cores.
    There is enough memory to fit the dsp1 + dsp2 maps for those arrays.
  • From what I understood regarding the reserved memory for DSP1 and DSP2, I thought the resource was correct in working with the amount reserved at 0x99000000 for DSP1 (64 MBytes). DSP2 is reserved at 0x9f000000 for 16 MBytes, so its resource table needs to work with the 16 MBytes allowed.

    Thanks
  • Looking at the rsc_table_dsp.h file the define,

    #define DSP_MEM_IPC_DATA 0x9F000000

    Appears to conflict with the DSP2 base address,

    #define PHYS_MEM_IPC_VRING 0x9F000000

    When I compile for both DSP1 and DSP2 and install, even DSP1 does not work.
  • netrover,

    DSP_MEM_IPC_DATA is a virtual address and is carved out from the CMA pool defined in the DTS. The wiki has more details on the memory map: processors.wiki.ti.com/.../Linux_IPC_on_AM57xx

    Regards,
    Garrett
  • I tried installing again using bind and get an error for DSP2

    [ 7427.886865] omap-rproc 41000000.dsp: dma_alloc_coherent err: 50331648

    Otherwise the install messages are what I usually see.
  • Hi,

    Kevin Schuster, Could you share your rsc_table_vayu_dsp.h file here?

    BR,

    vefone

  • Sorry, I don't have the sources anymore. But it was just a small modification like the one I posted on 21st April 2016.
  • Kevin Schuster,

    I have finished it.

    rsc_table_dsp.h

  • Example:
    cmem_block_0: cmem_block@0 {
    reg = <0>;
    memory-region = <&cmem_block_mem_0>;
    cmem-buf-pools = <4 0x0 0x00100000>;
    };
    4 Pools with each size of 0x100000.

    Hi, I would like to add another pool with a size of 0x200000 to cmem_block_0. How do I change the code?