PROCESSOR-SDK-AM57X: Transitioning from using CMEM to DMA-heap/dma-buf

Part Number: PROCESSOR-SDK-AM57X

I am attempting to move from CMEM to DMA-heap/dma-buf on the AM57x using SDK 9.03.

The existing application has a CMEM area in the OCMC RAM for DSP1, and this memory is shared between the DSP and the ARM. The memory is divided into regions, and the DSP and the ARM invalidate the cache as control of a particular region moves from one processor to the other. As far as CMEM is concerned there is a single region of memory, but the DSP and the ARM use it as 40+ smaller regions.
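
To make the layout concrete, here is a minimal sketch of how one shared block can be carved into per-owner regions by offset. It is an illustration only, not the customer's code: `region_t`, `layout_regions`, and the 128-byte `REGION_ALIGN` are hypothetical names and values. The alignment is assumed to be at least the largest cache line of either core, so that no two regions share a line.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical descriptor for one sub-region of the shared block. */
typedef struct {
    size_t offset;  /* byte offset from the start of the shared block */
    size_t size;    /* size rounded up to a whole number of cache lines */
} region_t;

/* Assumed alignment (must be a power of two). */
#define REGION_ALIGN 128u

static size_t align_up(size_t n, size_t align)
{
    assert(align != 0 && (align & (align - 1)) == 0);
    return (n + align - 1) & ~(align - 1);
}

/*
 * Lay out `count` regions back to back inside a block of `block_size` bytes.
 * Returns the total number of bytes used, or 0 if the regions do not fit.
 */
static size_t layout_regions(const size_t *wanted, size_t count,
                             size_t block_size, region_t *out)
{
    size_t cursor = 0;
    for (size_t i = 0; i < count; i++) {
        size_t sz = align_up(wanted[i], REGION_ALIGN);
        if (sz > block_size - cursor)
            return 0;               /* region i does not fit */
        out[i].offset = cursor;
        out[i].size = sz;
        cursor += sz;
    }
    return cursor;
}
```

With CMEM, each `offset`/`size` pair could be handed to a cache operation on its own; that per-region granularity is what I am trying to reproduce.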

I have tried using DMA-heap and have added a reserved-memory area and seen the corresponding /dev/dma_heap/my_reserved_memory device file. The alloc ioctl can be used to get a dma-buf and the sync ioctls can be used to control the cache for the entire reserved-memory area. This is fine for a simple demo but is not sufficient for the existing application.

I was hoping the alloc ioctl could actually allocate a piece of the reserved memory but it seems this is not the case. There is only one block of memory and it is not clear how to divide it into pieces. The sync ioctls seem to only operate on the whole chunk of memory for the dma-buf.
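
For reference, the alloc and sync calls I am describing look roughly like the sketch below. It uses the standard `linux/dma-heap.h` and `linux/dma-buf.h` UAPI; the helper names `heap_alloc` and `buf_sync` are my own, and error handling is reduced to returning -1. Note that `struct dma_buf_sync` carries only a flags word, with no offset or length, which is why the sync applies to the whole dma-buf.

```c
#include <fcntl.h>
#include <stddef.h>
#include <sys/ioctl.h>
#include <unistd.h>
#include <linux/dma-buf.h>
#include <linux/dma-heap.h>

/* Allocate `len` bytes from the heap at `heap_path`.
 * Returns a dma-buf file descriptor, or -1 on failure. */
static int heap_alloc(const char *heap_path, size_t len)
{
    int heap_fd = open(heap_path, O_RDWR);
    if (heap_fd < 0)
        return -1;

    struct dma_heap_allocation_data req = {
        .len = len,
        .fd_flags = O_RDWR,
    };
    int ret = ioctl(heap_fd, DMA_HEAP_IOCTL_ALLOC, &req);
    close(heap_fd);                 /* the dma-buf fd outlives the heap fd */
    return ret < 0 ? -1 : (int)req.fd;
}

/* Bracket CPU access to the whole dma-buf. `flags` is DMA_BUF_SYNC_START or
 * DMA_BUF_SYNC_END combined with DMA_BUF_SYNC_READ, _WRITE, or _RW. */
static int buf_sync(int buf_fd, unsigned long long flags)
{
    struct dma_buf_sync sync = { .flags = flags };
    return ioctl(buf_fd, DMA_BUF_IOCTL_SYNC, &sync);
}
```

In my test this is called as `heap_alloc("/dev/dma_heap/my_reserved_memory", size)`, followed by `mmap()` on the returned descriptor and `buf_sync()` around each access.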

It is possible to specify multiple reserved-memory areas (up to 7 with default kernel code or more with a change to a #define and a rebuilt kernel) but that seems like having to put more information into the device tree than might be desirable.

Is it possible to specify pool details so that the dma-heap pool consists of multiple blocks rather than just one? The genalloc system seems to have some ability to support this for devices, but the carveout-dma-heap code does not seem to make use of it.

How can a Linux user program get cache control over specific regions of memory without having to specify low-level details in the device tree? This was possible with CMEM. How can it be done using dma-heap and dma-bufs? Am I misunderstanding the capabilities of dma-heap/dma-buf? Is having 40+ dma-heap reserved-memory regions the only available alternative?

  • Hello John,

    I will confer internally and get back to you this week on this.

    -Josue

  • Hello John,

    First of all I would like to reiterate that the Big data IPC example was dropped in SDK 8.2.

    Second, memory pools should not have been placed in OCMC RAM, only in DDR.

    Could you help me understand how you came to the following conclusion?

    "I was hoping the alloc ioctl could actually allocate a piece of the reserved memory but it seems this is not the case. There is only one block of memory and it is not clear how to divide it into pieces. The sync ioctls seem to only operate on the whole chunk of memory for the dma-buf."

    -Josue

  • I wasn't starting with the Big data example. It is actually a stripped-down example taken from a real customer application that was using CMEM.

    Sorry to hear that it is only for DDR. CMEM seems to have been working fine with OCMC.

    Thinking it was possible to do multiple allocations from a single dma-heap was a wrong assumption on my part. I probably should have realized that the "alloc" ioctl was not at all like a malloc since there wasn't any way to give back something that was allocated.

    I switched to using DDR with just 3 different heaps to implement a circular queue. The read index is one heap with read/write access on the ARM. The write index is a second heap with read-only access on the ARM. The buffers are a third heap with read-only access on the ARM. A similar use of the memory was working on the older SDK 6 using CMEM.
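
The index handling for this queue can be sketched as follows. The empty condition (read index equal to write index) matches the initialization described in this post; the slot count and the "one slot left unused" full condition are assumptions made for illustration, and the function names are hypothetical.

```c
#include <stdint.h>

#define QUEUE_SLOTS 8u   /* assumed number of buffer slots */

/* Empty when both indices match. */
static int queue_empty(uint32_t rd, uint32_t wr)
{
    return rd == wr;
}

/* Full when advancing the write index would collide with the read index
 * (assumed convention: one slot is always left unused). */
static int queue_full(uint32_t rd, uint32_t wr)
{
    return (wr + 1u) % QUEUE_SLOTS == rd;
}

/* Advance an index by one slot, wrapping at the end of the ring. */
static uint32_t queue_next(uint32_t idx)
{
    return (idx + 1u) % QUEUE_SLOTS;
}

/* Number of filled slots waiting for the reader. */
static uint32_t queue_count(uint32_t rd, uint32_t wr)
{
    return (wr + QUEUE_SLOTS - rd) % QUEUE_SLOTS;
}
```

Only the ARM ever writes the read index and only the DSP ever writes the write index, so each index has a single writer and the only shared-memory requirement is that each side sees the other's latest value.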

    With the new SDK 9, using dma-heap instead of CMEM, I get very inconsistent results. During initialization the ARM opens the heaps, allocates the buffers to get the dma-buf file descriptors, and obtains the physical addresses, which are sent to the DSP in an RPMsg message.

    The read index, which has read/write access, first gets a DMA_BUF_SYNC_START. It is then set to its initial value, and a DMA_BUF_SYNC_END is done in the hope that this performs a cache write-back (the equivalent of Cache_wb) to get the value into DDR. Another DMA_BUF_SYNC_START follows, so that the normal state for the read index is with the ARM inside a SYNC_START. Whenever the read index is incremented, the new value is written and then a SYNC_END, SYNC_START sequence is done to push the value to memory and return to the open SYNC_START state.
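
The update sequence for the read index looks roughly like this sketch. `sync_rw` and `publish_index` are names made up for this post, and `slot` is assumed to point into the `mmap()`ed dma-buf. Whether the SYNC_END actually results in a cache write-back depends on the heap's dma-buf implementation, which is exactly what I am unsure about.

```c
#include <stdint.h>
#include <sys/ioctl.h>
#include <linux/dma-buf.h>

/* Issue one DMA_BUF_IOCTL_SYNC with read/write access.
 * `phase` is DMA_BUF_SYNC_START or DMA_BUF_SYNC_END. */
static int sync_rw(int buf_fd, uint64_t phase)
{
    struct dma_buf_sync s = { .flags = phase | DMA_BUF_SYNC_RW };
    return ioctl(buf_fd, DMA_BUF_IOCTL_SYNC, &s);
}

/*
 * Store `value` in the shared word, then close and reopen the CPU access
 * window. Assumes the caller is already inside a SYNC_START.
 * Returns 0 on success, -1 if either ioctl fails.
 */
static int publish_index(int buf_fd, volatile uint32_t *slot, uint32_t value)
{
    *slot = value;
    if (sync_rw(buf_fd, DMA_BUF_SYNC_END) < 0)    /* hoped-for write-back */
        return -1;
    return sync_rw(buf_fd, DMA_BUF_SYNC_START);   /* back to the open state */
}
```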

    When the DSP gets the RPMsg message, it converts the physical addresses to virtual addresses. It then reads the initial read-index value (after a Cache_inv) and writes it to the write index so that the circular buffer looks empty. A Cache_wb() is done on the write index to push the value into memory. Then a reply message is sent to the ARM.

    When the ARM gets the reply message, it does a SYNC_START on the write-index which hopefully does a Cache_inv and then reads the write-index value. It is expecting this value to be the initial read-index value but this is not always the case.

    This initial sequence seems pretty simple but the results are not consistent.

    The DSP frequently does not seem to get the correct initial read-index value, and the ARM does not always get the write-index value from the DSP. I have tried using devmem2 on the ARM to verify the values. Sometimes it seems that the value from the DSP was never written to DDR. Other times devmem2 shows the correct write-index value, but the test program did not see that value when it read the write index.

    It almost seems like the cache routines are not working as expected on either the ARM or the DSP. I am not sure how to deal with that or verify this. 

    Once my frustration level goes down, I can probably put together a simpler program that could be posted. For now, if any of the above prompts suggestions from you, they would be most appreciated.

    Thanks.

  • John,

    Due to bandwidth constraints, I will have a look at this next week.

    -Josue

  • John,

    I will not have time for this this week. 

    I appreciate your patience.

    -Josue