Memory shared between peripheral DMA and DSP

Other Parts Discussed in Thread: DM3730

Hi!

I have a buffer filled by an external device attached to my DM3730 via GPMC. DMA handles this well enough.

But then I want to process the data on the DSP (using DSPLink) and send it elsewhere. So, how can I map the DMA buffer and the userspace DSPLink channel buffer onto the same memory region to avoid unnecessary data copying?

Linux kernel v2.6.37, based on the TI PSP.

  • Sharing of a buffer between DSP and Linux on ARM can be accomplished using the TI CMEM product:
    http://processors.wiki.ti.com/index.php/CMEM_Overview

    It's fairly easy for a user application to allocate a buffer (w/ CMEM_alloc()), get its physical address (w/ CMEM_getPhys()), and package that in a MSGQ message that is sent to the DSP with MSGQ_put().  The harder part is to get the DMA to use this buffer as its destination, since DMA is typically handled internally by the Linux kernel, and there's no easy way to give this buffer to the kernel from user space.

    CMEM manages memory that is kept away from the Linux kernel (via restricting the Linux kernel memory w/ "mem=##M" on the kernel command line).  If this is an avenue that you'd like to consider, I can tell you more about the details (unless the details are already clear to you).

    Regards,

    - Rob

  • Thank you, Robert!

    Concerning buffer allocation: I guess it should be allocated by a kernel driver, because the application can be relaunched, can hang, etc. So, if I allocate the buffer with dma_alloc_coherent(), I suppose I can call mmap() with suitable parameters to access it from my user application. But, as you said, I cannot easily request a buffer from reserved memory... so I would really like to know any possible ways.
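
The user-space half of that idea can be sketched as follows, assuming the kernel driver exposes its buffer through an mmap file operation on a device node. The helper names map_shared_buffer() and unmap_shared_buffer() are hypothetical and not part of any TI package; only open(), mmap(), munmap() and close() are standard POSIX calls. The driver side is not shown.

```c
#include <fcntl.h>
#include <stddef.h>
#include <sys/mman.h>
#include <unistd.h>

/* Map len bytes of a driver-exported buffer into this process.
 * dev_path is whatever node the driver registers (hypothetical here).
 * Returns NULL on failure. */
static void *map_shared_buffer(const char *dev_path, size_t len)
{
    int fd = open(dev_path, O_RDWR | O_SYNC);
    if (fd < 0)
        return NULL;

    /* MAP_SHARED so that writes made by the DMA/driver side are visible here. */
    void *p = mmap(NULL, len, PROT_READ | PROT_WRITE, MAP_SHARED, fd, 0);

    close(fd); /* the mapping stays valid after the descriptor is closed */
    return (p == MAP_FAILED) ? NULL : p;
}

/* Release a mapping obtained from map_shared_buffer(); returns 0 on success. */
static int unmap_shared_buffer(void *p, size_t len)
{
    return munmap(p, len);
}
```

A call might look like `buf = map_shared_buffer("/dev/mydma", 4 * 256 * 1024);`, where /dev/mydma is a made-up node name. Whether the mapping is cached or uncached is decided by the driver's mmap handler, not by these user-space flags.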

  • 1) Robert, I have encountered strange behaviour: when I allocate memory for buffers with CMEM_alloc(), the MSGQ system starts failing to allocate memory for MSGQ messages.

    Size of the memory reserved for the DSP = 128 MB. I've tried with 120, 100 and 80 MB allocated with CMEM_alloc(), and MSGQ_alloc() fails anyway. I cannot understand how I should fix the problem...

    UPDATE: cmem init string is "pools=10x30000, phys_start=0x98000000 phys_end=0xa0000000"

    cmem printout: "allocated heap buffer 0xda000000 of size 0x7fb0000"

    Looks like all 128 MB is heap, and there is no pool? Maybe that's why MSGQ_alloc() fails?

    2) Concerning memory sharing, I've decided to give up trying to get the kernel, the DSP and userspace to share one memory block. With kernel 2.6.37, I cannot allocate large coherent DMA buffers (4 x 256 KB is the maximum I get), whereas userspace needs 4 x 32 MB buffers to smooth out NAND write latency.
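
The numbers in point 1 can be checked with a little arithmetic. The sketch below assumes that the pool size in "pools=10x30000" is decimal bytes, that CMEM rounds each pool buffer up to a 4 KiB page, and that the heap is whatever remains of the phys_start..phys_end window after the pools. It also assumes the DM3730 SDRAM starts at 0x80000000 in order to derive the matching "mem=" value. All function names are made up for illustration.

```c
#include <stdint.h>

#define PAGE_SIZE_ASSUMED 4096u       /* assumption: 4 KiB pages */
#define RAM_BASE_ASSUMED  0x80000000u /* assumption: SDRAM base on DM3730 */

/* Round a byte count up to the next page multiple. */
static uint32_t page_round_up(uint32_t bytes)
{
    return (bytes + PAGE_SIZE_ASSUMED - 1u) & ~(PAGE_SIZE_ASSUMED - 1u);
}

/* Bytes used by one "COUNTxSIZE" pool spec, each buffer page-rounded. */
static uint32_t pool_bytes(uint32_t count, uint32_t buf_size)
{
    return count * page_round_up(buf_size);
}

/* Heap bytes left in [phys_start, phys_end) once the pools are carved out. */
static uint32_t heap_bytes(uint32_t phys_start, uint32_t phys_end,
                           uint32_t pools_total)
{
    return (phys_end - phys_start) - pools_total;
}

/* Value for the kernel's "mem=##M" so that Linux stops below phys_start. */
static uint32_t linux_mem_mb(uint32_t phys_start)
{
    return (phys_start - RAM_BASE_ASSUMED) >> 20;
}
```

Under those assumptions, heap_bytes(0x98000000, 0xa0000000, pool_bytes(10, 30000)) is 0x7fb0000, the same size as in the cmem printout. That suggests the ten pool buffers were created and the heap is simply the remainder, so the printout by itself does not show that the pool is missing. With the same assumptions, linux_mem_mb(0x98000000) is 384, i.e. "mem=384M".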

  • In the DSPLink docs I've read that the _create, _execute and _delete GPP-side code blocks should be called from the same thread. Does this rule apply to CMEM functions too?

    UPDATE: I've moved the CMEM functions into the same thread as the other DSP functions, but it made no difference. MSGQ still fails at alloc(), and Oops 817 appears on the MSGQ_transportClose(0) call.

    GPP-side code, based on the message example from the DSPLink package:

    1512.gpp_main.c

    8284.gpp_main.h

    7167.scale.c

    5582.scale.h

    2308.scale_os.c

    6646.scale_os.h

    And the user app thread that calls all the DSP functions:

    4011.p347_dev_helper.cpp

  • Looks like I'm almost done... The issue has been partially solved by allocating memory with CMEM_alloc() and passing the addresses to DMA via dma_map_single(). So I've constructed a DMA chain using dma_map_single() and dma_unmap_single().

    Unfortunately, I've encountered another issue... maybe it's related to the cache, I'm not sure. The new problem: sometimes all the received data reads as 0. Is it an ARM prefetching issue or something else? Calling flush_cache_all() before mapping and unmapping has no effect.

    UPDATE: simplified call order

    To start DMA:
    for (all channels in chain) {
        dat = dma_map_single(NULL, _some_surely_correct_virtual_address_, 2048, DMA_FROM_DEVICE);
        dma_sync_single_for_device(NULL, dat, 2048, DMA_FROM_DEVICE);
    }
    omap_start_dma(first_channel_in_chain);

    In the DMA callback:
    omap_stop_dma(last_channel_in_chain);
    for (all channels in chain) {
        dma_unmap_single(NULL, _stored_from_above_dma_address_, 2048, DMA_FROM_DEVICE);
        dma_sync_single_for_cpu(NULL, dat, 2048, DMA_FROM_DEVICE);
    }

  • So... I have allocated the buffers for DMA from the HEAP instead of a POOL, and the issue has disappeared. I don't know why. POOL and HEAP were both used with CMEM_NONCACHED in the params. Anyway, I have allocated a "fake" buffer at the start of the DSPLink reserved memory to prevent corruption of the RESETCTRL and CODEMEMORY sections (using the CMEM HEAP, I cannot specify a desired start address for a buffer).

    Now I have some new questions:

    1) Can I place the RESETCTRL and CODEMEMORY sections of DSPLink at the end of the reserved memory space?

    2) Is using two shared memory regions compulsory? There are SHAREDENTRYID0 and SHAREDENTRYID1. Can I allocate only one region, or none?

    3) Maybe there is some hidden method to allocate a buffer in the CMEM heap with a preferred start address? I don't see any in the CMEM functions, but who knows...

  • Here is some info regarding your DSPLink-specific questions.

    The DSPLink sections can be relocated, but you need to be very careful.  Ultimately, you need to make sure the GPP-side and DSP-side configurations agree, or your application can go south pretty quickly.

    Take a look at the following TI Wiki pages regarding some of this.

    http://processors.wiki.ti.com/index.php/Changing_DSPLink_Memory_Map

    http://processors.wiki.ti.com/index.php/Determining_DSPLink_shared_memory_size_requirements