AM57X: CMEM problems with IPC between ARM and DSP

Kevin Schuster

Hi,

Can anybody help me understand how to use CMEM? I have the following example:

I want to use the ARM for sending the DSP an image to run a median filter. Since I learned from my last thread (https://e2e.ti.com/support/arm/sitara_arm/f/791/p/500457/1822502#1822502) that the MessageQ-method will only allow a payload of 512 Bytes, it is impossible to send a whole picture from the ARM to the DSP with MessageQs. So I want to use CMEM to store the image in the specific CMEM memory and send the DSP only the pointer. But it looks like it is not working and I can't find an example anywhere!!! The CMEM Overview site (http://processors.wiki.ti.com/index.php/CMEM_Overview?keyMatch=cmem&tisearch=Search-EN-Everything) isn't very useful to me. I need to see how the whole thing works and not only some descriptions of the functions...

I am trying this:

ARM

        /* allocate message */
        msg = (App_Msg *)MessageQ_alloc(Module.heapId, Module.msgSize);

        if (msg == NULL) {
            status = -1;
            goto leave;
        }

        bufIn = CMEM_alloc(imgSize,&cMemAllocParams);
        if(bufIn == NULL)
            printf("CMEM_alloc failed!!!\n");
        else
            printf("CMEM_alloc okay (buf: %p)\n",bufIn);

        msg->info.in_data = (uint8_t*)CMEM_getPhys((void*)bufIn);
        if(msg->info.in_data == NULL)
            printf("CMEM_getPhys failed!!!\n");
        else
            printf("CMEM_getPhys okay (in: %p)\n",msg->info.in_data);

        /* set the return address in the message header */
        MessageQ_setReplyQueue(Module.hostQue, (MessageQ_Msg)msg);

        printf("Trying MessageQ_put...\n");
        /* send the message to the remote processor */
        if (MessageQ_put(Module.slaveQue1, (MessageQ_Msg)msg) < 0)
        {
            printf("MessageQ_put had a failure error\n");
            status = -1;
            goto leave;
        }
        printf("MessageQ_put okay!\n");

It seems to work, but with some strange output:

CMEM_alloc okay (buf: 0xa0161000)
[ 161.087851] CMEMK Error: Failed to find a pool which fits 0x4b000
[ 161.097870] CMEMK Error: get_phys: Unable to find phys addr for 0x0
[ 161.109737] CMEMK Error: get_phys: get_user_pages() failed: -14
CMEM_getPhys okay (in: 0xa0000000)

Trying MessageQ_put...

MessageQ_put okay!

DSP:

        /* wait for inbound message */
        status = MessageQ_get(Module.slaveQue, (MessageQ_Msg *)&msg,
            MessageQ_FOREVER);

        Log_print0(Diags_ENTRY | Diags_INFO, "--> MessageQ_get okay!");

        if (status < 0) {
            goto leave;
        }

        if (msg->info.dataFlag== 1)
        {
            int i = 0;
            msg->info.dataFlag = 0;
            Log_print1(Diags_ENTRY | Diags_INFO, "--> msg->info.in_data (%p): ",msg->info.in_data);
            median_3x3_cn((unsigned char*)&msg->info.in_data, msg->info.cols,(unsigned char*)&msg->info.out_data);
        }

The address for the DSP seems right, but it is not working...wrong data at this address...what did I do wrong?

Here are the CMEM stats "cat /proc/cmem":

root@am57xx-evm:~# cat /proc/cmem

Block 0: Pool 0: 1 bufs size 0xc000000 (0xc000000 requested)

Pool 0 busy bufs:

Pool 0 free bufs:
id 0: phys addr 0xa0000000

Target: EVM AM572x

IPC: 3.40.01.08

Processor SDK Linux: 2.00.01.07

DSP:SYS/BIOS: 6.45.00.19

over 9 years ago

0 Biser Gatchev-XID over 9 years ago

TI__Guru**** 393215 points

I will ask the software team to comment.

0 Garrett Ding over 9 years ago in reply to Biser Gatchev-XID

TI__Mastermind 43296 points

Kevin,

Here is a CMEM example:

Look at the function void demoK2H_initCmemBufs(int cached).

I have extracted the CMEM API calls below:

alloc_params.type = CMEM_POOL;
alloc_params.alignment = 0;

if ( cached == DEMO_BUFS_CACHED )
    alloc_params.flags = CMEM_CACHED;
else
    alloc_params.flags = CMEM_NONCACHED;

demoK2H_assert( (CMEM_init() == 0), 0, "ERROR: CMEM_init() ");

cmem_buf_desc[0].physAddr = CMEM_allocPhys(2 * max_num_descs * demo_payload_size, &alloc_params);
demoK2H_assert( (cmem_buf_desc[0].physAddr != 0), 0, "ERROR: CMEM_allocPhys() ");

cmem_buf_desc[0].length = 2*max_num_descs*demo_payload_size;

cmem_buf_desc[0].userAddr = CMEM_map( cmem_buf_desc[0].physAddr, cmem_buf_desc[0].length);
demoK2H_assert( (cmem_buf_desc[0].userAddr != NULL), 0, "ERROR: CMEM_map() ");

Also make sure you have -D_FILE_OFFSET_BITS=64 option added in compiler CFLAGS.
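
A small sketch of why the flag can matter, assuming the CMEM user-space API passes physical addresses as off_t (check the cmem.h in your SDK). If the library was built with a 64-bit off_t and the application with a 32-bit one, the arguments of calls such as CMEM_map would no longer line up. The helper name off_t_bits is made up for this example.

```c
/* Same effect as passing -D_FILE_OFFSET_BITS=64; must come before any include. */
#ifndef _FILE_OFFSET_BITS
#define _FILE_OFFSET_BITS 64
#endif

#include <assert.h>
#include <stdio.h>
#include <sys/types.h>

/* Stop the build early if off_t is not 64 bits wide. */
_Static_assert(sizeof(off_t) == 8, "build with -D_FILE_OFFSET_BITS=64");

/* Hypothetical helper: width of off_t in bits, for a quick runtime printout. */
static int off_t_bits(void)
{
    return (int)(sizeof(off_t) * 8);
}
```

Calling printf("%d\n", off_t_bits()); from the application at startup is a cheap way to confirm that the flag really reached the compiler.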

Regards, Garrett

0 Kevin Schuster over 9 years ago in reply to Garrett Ding

Expert 1210 points

If I try it like this, it is not working:

#define PAYLOADSIZE 0xc000000

unsigned char * payload;
unsigned long payloadPhys;
payloadPhys = CMEM_allocPhys(PAYLOADSIZE, &params);
//payload = CMEM_alloc(PAYLOADSIZE, &params);
if(payloadPhys == NULL)
{
    printf("CMEM_allocPhys failed!!!\n");
    status = -1;
    goto leave;
}
else
    printf("CMEM_allocPhys okay (payloadPhys: %p)\n!",payloadPhys);

//payloadPhys = CMEM_getPhys(payload);
payload = CMEM_map(payloadPhys,PAYLOADSIZE);
if(payload == NULL)
{
    printf("CMEM_map failed!!!\n");
    status = -1;
    goto leave;
}
else
    printf("CMEM_map okay (payload: %p)\n!",payload);

memcpy(payload, in_data, IMG_IPC_SIZE);

Output:
MessageQ_open OK!!
CMEM_allocPhys okay (payloadPhys: 0xa0000000)
CMEM Error: map: Failed to mmap buffer at physical address 0xc000000a0000000
!CMEM_map failed!!!

Is there any documentation for this example? Why do I have to use the CMEM_allocPhys and CMEM_map functions? The example of "nano_test.c" is using CMEM_alloc and CMEM_getPhys.

0 Kevin Schuster over 9 years ago in reply to Kevin Schuster

Expert 1210 points

Update:

It is working now. For anybody who is interested, I will try to explain it here. It is something like a summary of all the threads I have found that led to solving this whole IPC problem between ARM and DSP.
There are several steps you have to consider, to make IPC work between the ARM and DSP1:

1. New entries in the RSC table for DSP1:
- Use a custom RSC table (processors.wiki.ti.com/.../IPC_Resource_customTable) instead of the default one, rsc_table_vayu_dsp.h in ipc_x_xx_xx_xx/packages/ti/ipc/remoteproc/.
- Three new defines are needed so that the DSP knows where the physical CMEM memory is, which of its own virtual addresses map to it, and how large the region is.
Example:
#define DSP_CMEM_IOBUFS 0x85000000 => virtual address of the DSP memory section
#define CMEM_PHYS_IOBUFS 0xB0000000 => physical address of the CMEM (shared) memory section
#define DSP_CMEM_IOBUFS_SIZE (SZ_1M * 90) => size of the CMEM section

And don't forget the small modifications described in the custom table link, such as changing offset [18] to [19] and adding the new entry to the ResourceTable.
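
As a rough sketch of what the new ResourceTable entry looks like, here is a device-memory entry built from the three defines. The struct devmem_entry is a local stand-in written for this example (the real type comes from the IPC resource table headers), and the value of TYPE_DEVMEM is an assumption, so take the field layout from your own rsc_table header rather than from here.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

#define SZ_1M                0x00100000u
#define TYPE_DEVMEM          1u            /* assumed value, check your IPC headers */

#define DSP_CMEM_IOBUFS      0x85000000u   /* address as seen by the DSP */
#define CMEM_PHYS_IOBUFS     0xB0000000u   /* physical address of the CMEM block */
#define DSP_CMEM_IOBUFS_SIZE (SZ_1M * 90)  /* size of the mapped region */

/* Local stand-in for the devmem entry type of the resource table. */
struct devmem_entry {
    uint32_t type;
    uint32_t da;        /* device (DSP) address */
    uint32_t pa;        /* physical address */
    uint32_t len;       /* length in bytes */
    uint32_t flags;
    uint32_t reserved;
    char     name[32];
};

/* The additional entry that maps the CMEM block into the DSP address space. */
static const struct devmem_entry cmem_devmem = {
    TYPE_DEVMEM,
    DSP_CMEM_IOBUFS, CMEM_PHYS_IOBUFS, DSP_CMEM_IOBUFS_SIZE,
    0, 0, "DSP_CMEM_IOBUFS",
};
```

The offset change from [18] to [19] mentioned above follows from the same idea: one more entry in the table means one more offset in the header.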

2. Correct allocation of the CMEM
- In my case the command-line approach for CMEM described here (processors.wiki.ti.com/.../CMEM_Overview) was not working properly, so I had to change it in the file am57xx-evm-cmem.dtsi. The file is located in the Processor SDK Linux folder under board-support/linux-4.1.13+......../arch/arm/boot/dts/. There you have to create an entry for the physical address described above.
Example:
cmem_block_mem_0: cmem_block_mem@b0000000 {
reg = <0x0 0xb0000000 0x0 0x0c000000>;
no-map;
status = "okay";
};

In the CMEM section further down in this file, you can choose the number of pools and the size of each.
Example:
cmem_block_0: cmem_block@0 {
reg = <0>;
memory-region = <&cmem_block_mem_0>;
cmem-buf-pools = <4 0x0 0x00100000>;
};
This gives 4 pools, each of size 0x100000.
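
To make the pool sizing more concrete, here is a simplified model of how a request is matched against the configured pools. It is not the CMEM driver code, only a sketch under the assumption that a request is served from the smallest free buffer that is large enough. The names pool_model and find_fitting_pool are invented for this example.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Model of one CMEM pool: a number of equally sized buffers. */
struct pool_model {
    uint32_t buf_size;   /* size of each buffer in bytes */
    unsigned free_bufs;  /* buffers not yet handed out */
};

/*
 * Return the index of the pool with the smallest buffer size that still
 * fits the request and has a free buffer, or -1 if no pool fits.
 */
static int find_fitting_pool(const struct pool_model *pools, size_t n,
                             uint32_t request)
{
    int best = -1;

    for (size_t i = 0; i < n; i++) {
        if (pools[i].free_bufs == 0 || pools[i].buf_size < request)
            continue;
        if (best < 0 || pools[i].buf_size < pools[(size_t)best].buf_size)
            best = (int)i;
    }
    return best;
}
```

With the configuration above (buffers of 0x100000 bytes), a 0x4b000-byte image fits, while a 0x200000-byte request would fail with a "no pool fits" kind of error, because no single buffer is large enough.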

- To build the changes, go to the Linux top-level directory linux-4.1.13... and run the command make ARCH=arm CROSS_COMPILE=arm-linux-gnueabihf- am57xx-evm.dtb
This builds a new dtb file, which you have to copy into the boot folder on the target, overwriting the existing one. For the changes to take effect, you have to reboot the target. Afterwards you can check the result with cat /proc/cmem.

3. Using the CMEM functions on the ARM side and the Resource functions on the DSP side
- On the ARM side, you have to include the cmem.h file and link against the CMEM library.
- You have to call the following functions in this order:
-> CMEM_init() => initializes CMEM
-> virtualBuffer = CMEM_alloc(Size, &CMEM_params) => returns a virtual buffer into which you can copy your data
-> CMEM_getPhys(virtualBuffer) => returns the physical address of that virtual buffer

Then you can send the physical address to the DSP via MessageQ_put.
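
A minimal sketch of the message body that carries the address, assuming the physical address comes back as a 64-bit value on the ARM side and has to fit into a 32-bit field for the DSP. The struct ImageMsgInfo, its field names and image_msg_fill are hypothetical and are not the App_Msg layout from my code. A return value of 0 from the physical-address lookup is treated as a failure here.

```c
#include <assert.h>
#include <stdint.h>

#define MSG_PAYLOAD_LIMIT 512u  /* message size limit mentioned at the top of the thread */

/* Hypothetical message body: fixed-width fields only, no host pointers. */
typedef struct {
    uint32_t dataFlag;  /* 1 = the buffer holds a new image */
    uint32_t in_phys;   /* physical address of the CMEM input buffer */
    uint32_t out_phys;  /* physical address of the CMEM output buffer */
    uint32_t cols;
    uint32_t rows;
} ImageMsgInfo;

_Static_assert(sizeof(ImageMsgInfo) <= MSG_PAYLOAD_LIMIT,
               "message body must stay below the MessageQ limit");

/*
 * Fill the message from two physical addresses.
 * Returns 0 on success, -1 if an address is 0 or does not fit in 32 bits.
 */
static int image_msg_fill(ImageMsgInfo *info, uint64_t in_phys,
                          uint64_t out_phys, uint32_t cols, uint32_t rows)
{
    if (in_phys == 0 || out_phys == 0 ||
        in_phys > UINT32_MAX || out_phys > UINT32_MAX)
        return -1;

    info->dataFlag = 1;
    info->in_phys  = (uint32_t)in_phys;
    info->out_phys = (uint32_t)out_phys;
    info->cols     = cols;
    info->rows     = rows;
    return 0;
}
```

Only the addresses and the image dimensions travel through MessageQ; the pixels stay in the CMEM buffer.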

- On the DSP side you have to poll your MessageQ. To learn more about MessageQ, check out the example ex02messageq of DRA7xx_linux. After the ARM has sent a message with the physical address, you have to call Resource_physToVirt(physicalAddressCMEM, &virtualAddressDSP).
-> For a buffer at the start of the CMEM block, virtualAddressDSP is equal to DSP_CMEM_IOBUFS as described above. Through this address the DSP can use the data you put in on the ARM side.
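
The translation itself can be pictured with the following model. It is not the IPC implementation of Resource_physToVirt, only a sketch of the arithmetic for a single mapped region, using the example defines from step 1. The function name cmem_phys_to_dsp is made up.

```c
#include <assert.h>
#include <stdint.h>

#define SZ_1M                0x00100000u
#define DSP_CMEM_IOBUFS      0x85000000u   /* address as seen by the DSP */
#define CMEM_PHYS_IOBUFS     0xB0000000u   /* physical address of the CMEM block */
#define DSP_CMEM_IOBUFS_SIZE (SZ_1M * 90)  /* size of the mapped region */

/*
 * Translate a physical CMEM address into the address the DSP has to use.
 * Returns 0 and stores the result in *da, or -1 if pa is outside the region.
 */
static int cmem_phys_to_dsp(uint32_t pa, uint32_t *da)
{
    if (pa < CMEM_PHYS_IOBUFS ||
        pa - CMEM_PHYS_IOBUFS >= DSP_CMEM_IOBUFS_SIZE)
        return -1;

    /* Same offset into the region, different base address. */
    *da = DSP_CMEM_IOBUFS + (pa - CMEM_PHYS_IOBUFS);
    return 0;
}
```

A buffer that does not sit at the start of the block keeps its offset, so physical 0xB0100000 becomes 0x85100000 on the DSP. The filter then has to be given this translated address, not the address of the message field that stores it.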

And that's the magic. Took me a long time to get everything together and I described this in the way I understood. Maybe it is a bit different from the correct way to understand, but at least it leads to a result. So if I am describing something wrong, I am sorry for that, please correct me.

0 Christopher Peters over 9 years ago in reply to Kevin Schuster

Genius 3380 points

Thanks for that write up Kevin. You are a bit ahead of us as you are already integrating ARM and DSP. We are still writing the code. I'll be sure to share this with our Linux developers. Thanks again.

(It is also interesting to see how virtual and physical addresses are used. See, I come from the DSP side only, so I would have described DSP_CMEM_IOBUFS as the physical address to the DSP. I guess it is all relative. Your description is more correct though, I suppose. I prefer to just say the DSP address and the ARM address; saying virtual confuses me.)

0 Christopher Peters over 9 years ago in reply to Kevin Schuster

Genius 3380 points

Question about this comment, "On the DSP side you have to poll your MessageQ. " I had planned to do the following:

In a separate task on the DSP,

int status = MessageQ_get(msgQRx->Module.slaveQue, (MessageQ_Msg *)&msg, MessageQ_FOREVER);

This call should only return when the ARM sends the DSP a message via MessageQ. So you should not have to poll. Did you learn something different?

0 Kevin Schuster over 9 years ago in reply to Christopher Peters

Expert 1210 points

Yes, I should just say DSP and ARM address, that is easier to understand :)

Sorry for the confusion. That is what I meant by polling: calling MessageQ_get and waiting until something is received. I thought "polling" would fit as the term here.

0 Christopher Peters over 9 years ago in reply to Kevin Schuster

Genius 3380 points

Let's call it pending. Polling implies a loop checking some status bit or magic data word over and over. Pending is much better, as it releases control to other threads. Glad you got it working!

0 Alexander Pimenov over 8 years ago in reply to Kevin Schuster

Prodigy 30 points

Here you are :)
ger_zel@bitbucket.org/.../cmem.git

0 netrover over 8 years ago in reply to Alexander Pimenov

Intellectual 850 points

Hi Alexander,

The example worked with the one DSP, DSP1. What needs to be changed to also include using both DSP1 and DSP2? As I understand the reserved memory space for DSP2 is different than DSP1.

Thanks


0 Alexander Pimenov over 8 years ago in reply to netrover

Prodigy 30 points

Actually I didn't try it for both DSPs.
As far as I understand, DSP2 is very similar to DSP1, except for the memory allocation via the device tree.
Now this is my guess:
divide CMEM between the DSPs and write its map to config.bld
like:
var evmDRA7XX_CMEM_DSP1 = {
        name: "CMEM", space: "data", access: "RW",
        base: 0xA0000000, len: 0x6000000,
        comment: "CMEM Memory (96 MB)"
};
var evmDRA7XX_CMEM_DSP2 = {
        name: "CMEM", space: "data", access: "RW",
        base: 0xA6000000, len: 0x6000000,
        comment: "CMEM Memory (96 MB)"
};
then put them in Build.platformTable["ti.platforms.evmDRA7XX:dsp1"] and Build.platformTable["ti.platforms.evmDRA7XX:dsp2"] respectively.

Change PHYS_MEM_IPC_VRING for DSP2 in rsc_table_dsp.h to match the reserved-memory node in am57xx-beagle-x15-common.dtsi,
for example dsp2_cma_pool: dsp2_cma@a3000000

#define PHYS_MEM_IPC_VRING      0xa3000000

Change this for both DSPs in rsc_table_dsp.h:

(DSP1)
#define PHYS_CMEM_IOBUFS 0xA0000000
#define DSP_CMEM_IOBUFS_SIZE (SZ_1M * 96)

(DSP2)
#define PHYS_CMEM_IOBUFS 0xA6000000
#define DSP_CMEM_IOBUFS_SIZE (SZ_1M * 96)

I have no board available right now, so I cannot tell you whether this will work. But it is a good point to start from.

0 netrover over 8 years ago in reply to Alexander Pimenov

Intellectual 850 points

Now I am getting a 'Failed to find a pool which fits 0x70' error. That is a small PAYLOADSIZE that one would think would not be a problem. The larger amount defined as (0x100000 * 7) also does not work.
I think I will open another post for this.
Thanks for the help.

0 Alexander Pimenov over 8 years ago in reply to netrover

Prodigy 30 points

Actually I was wrong, you should not split CMEM between the cores.
There is enough memory to fit the dsp1 + dsp2 maps for those arrays.

0 netrover over 8 years ago in reply to Alexander Pimenov

Intellectual 850 points

From what I understood regarding the reserved memory for DSP1 and DSP2, I thought the resource was correct in working with the amount reserved at 0x99000000 for DSP1 (64 MBytes). DSP2 is reserved at 0x9f000000 for 16 MBytes, so its resource table needs to work with the 16 MBytes allowed.

Thanks

0 netrover over 8 years ago in reply to netrover

Intellectual 850 points

Looking at the rsc_table_dsp.h file, the define

#define DSP_MEM_IPC_DATA 0x9F000000

appears to conflict with the DSP2 base address:

#define PHYS_MEM_IPC_VRING 0x9F000000

When I compile for both DSP1 and DSP2 and install, even DSP1 does not work.

0 Garrett Ding over 8 years ago in reply to netrover

TI__Mastermind 43296 points

netrover,

DSP_MEM_IPC_DATA is a virtual address, and its memory is carved out from the CMA pool defined in the DTS. The wiki has more details on the memory map: processors.wiki.ti.com/.../Linux_IPC_on_AM57xx

Regards,
Garrett

0 netrover over 8 years ago in reply to netrover

Intellectual 850 points

I tried installing again using bind and got an error for DSP2:

[ 7427.886865] omap-rproc 41000000.dsp: dma_alloc_coherent err: 50331648

Otherwise the install messages are what I usually see.
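
For reading that number: assuming the value printed after "err:" is the requested allocation length in bytes (an assumption about this kernel message, not something confirmed in this thread), a small helper shows it in hex and MiB. The function names are made up for this example.

```c
#include <assert.h>
#include <inttypes.h>
#include <stdint.h>
#include <stdio.h>

/* Convert a byte count to whole MiB (rounds down). */
static uint32_t bytes_to_mib(uint32_t bytes)
{
    return bytes >> 20;
}

/* Print a byte count in decimal, hex and MiB. */
static void print_size(uint32_t bytes)
{
    printf("%" PRIu32 " bytes = 0x%" PRIx32 " = %" PRIu32 " MiB\n",
           bytes, bytes, bytes_to_mib(bytes));
}
```

Under that assumption, 50331648 is 0x3000000, i.e. 48 MiB, which would be more than the 16 MBytes reserved for DSP2 mentioned earlier.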

0 vefone over 8 years ago in reply to Kevin Schuster

Expert 2440 points

Hi,

Kevin Schuster, Could you share your rsc_table_vayu_dsp.h file here?

BR,

vefone

0 Kevin Schuster over 8 years ago in reply to vefone

Expert 1210 points

Sorry, I don't have the sources anymore. But it was just a small modification, as I described in my post of 21 April 2016.

0 vefone over 8 years ago in reply to Kevin Schuster

Expert 2440 points

Kevin Schuster,

I have finished it.

rsc_table_dsp.h

0 bin xu44 over 7 years ago in reply to Kevin Schuster

Prodigy 180 points

Example:
cmem_block_0: cmem_block@0 {
reg = <0>;
memory-region = <&cmem_block_mem_0>;
cmem-buf-pools = <4 0x0 0x00100000>;
};
This gives 4 pools, each of size 0x100000.

Hi, if I want to add another pool with a size of 0x200000 to cmem_block_0, how do I have to change this code?
