TDA4VM: Shared memory with dma_buf heaps not syncing

Part Number: TDA4VM


Hey TI!

I have the following situation:

A72

running Linux 5.4 from the SDK with dma_buf enabled and the following in devicetree:

&reserved_memory {
	#address-cells = <2>;
	#size-cells = <2>;

	shared_mem_test {
		compatible = "dma-heap-carveout";
		reg = <0x00 0xaf000000 0x00 0x2d000000>;
	};
...

And the following code adapted from the SDK (only important lines listed):

#define DMA_HEAP_NAME "/dev/dma_heap/shared_mem_test"
#define DMA_HEAP_ALLOC_FLAGS (0u)

m_dma_heap_fd = open(DMA_HEAP_NAME, O_RDONLY | O_CLOEXEC);

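struct dma_heap_allocation_data data = {};  // zero-initialise so data.fd and unused bits start at 0
data.len        = size;
data.fd_flags   = O_CLOEXEC | O_RDWR;
data.heap_flags = DMA_HEAP_ALLOC_FLAGS;

m_dma_buf_fd = -1;
int ret = ioctl(m_dma_heap_fd, DMA_HEAP_IOCTL_ALLOC, &data);
if (ret == 0)
    m_dma_buf_fd = data.fd;  // data.fd is only meaningful when the ioctl succeeded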

m_virtAddr = mmap(nullptr, size, PROT_WRITE | PROT_READ, MAP_SHARED, m_dma_buf_fd, 0u);
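For reference, the open/alloc/mmap sequence above could be written with error checks roughly as follows. This is only a sketch: `HeapMapping` and `map_from_heap` are hypothetical names, not SDK APIs, and it assumes the kernel headers provide `<linux/dma-heap.h>`.

```cpp
#include <fcntl.h>
#include <linux/dma-heap.h>
#include <sys/ioctl.h>
#include <sys/mman.h>
#include <unistd.h>

#include <cerrno>
#include <cstddef>

// Hypothetical result type for this sketch; not part of the SDK.
struct HeapMapping {
    int   buf_fd = -1;       // dma_buf file descriptor, -1 on failure
    void* addr   = nullptr;  // CPU mapping, nullptr on failure
    int   err    = 0;        // errno of the first failing call, 0 on success
};

// Hypothetical helper: allocate `size` bytes from the heap at `heap_path`
// and map them into the calling process.
HeapMapping map_from_heap(const char* heap_path, std::size_t size)
{
    HeapMapping out;

    const int heap_fd = open(heap_path, O_RDONLY | O_CLOEXEC);
    if (heap_fd < 0) {
        out.err = errno;
        return out;
    }

    // Zero-initialise so that `fd` and any unused bits start out as 0.
    struct dma_heap_allocation_data data = {};
    data.len        = size;
    data.fd_flags   = O_CLOEXEC | O_RDWR;
    data.heap_flags = 0;

    if (ioctl(heap_fd, DMA_HEAP_IOCTL_ALLOC, &data) < 0) {
        out.err = errno;  // capture errno before close() can overwrite it
        close(heap_fd);
        return out;
    }
    close(heap_fd);  // the buffer fd is independent of the heap fd

    void* addr = mmap(nullptr, size, PROT_READ | PROT_WRITE, MAP_SHARED,
                      static_cast<int>(data.fd), 0);
    if (addr == MAP_FAILED) {
        out.err = errno;
        close(static_cast<int>(data.fd));
        return out;
    }

    out.buf_fd = static_cast<int>(data.fd);
    out.addr   = addr;
    return out;
}
```

With the heap from the device tree above, the call would be `map_from_heap("/dev/dma_heap/shared_mem_test", size)`. A non-zero `err` shows which step failed instead of silently continuing with an invalid fd or mapping.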

C71 DSP

Bare metal software like this:

volatile uint64_t* val = reinterpret_cast<volatile uint64_t*>(0xaf000000);
*val                   = 10;
Cache_wb(reinterpret_cast<void*>(0xaf000000), 8, Cache_Type_ALL, TRUE);


while (run)
{
    // sleep 1 sec
    (*val)++;
    Cache_wb(reinterpret_cast<void*>(0xaf000000), 8, Cache_Type_ALL, TRUE);
}

Observation

Reading the values on the A72 by polling works in principle, but something is wrong with the synchronization.

Sometimes I see the value incremented by the DSP immediately; sometimes the A72 test program lags for seconds and misses values.

I also tried syncing the cache/mem on the A72 via:

struct dma_buf_sync sync_flags;
sync_flags.flags = DMA_BUF_SYNC_RW | DMA_BUF_SYNC_END;
int status = ioctl(m_dma_buf_fd, DMA_BUF_IOCTL_SYNC, &sync_flags);

That did not help either. Since the same test worked well when accessing the memory via /dev/mem and mmap(), it must have something to do with dma_buf.
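Note that the kernel's dma-buf documentation describes CPU access as a bracket: DMA_BUF_SYNC_START before touching the mapping and DMA_BUF_SYNC_END afterwards, whereas the snippet above issues only the END half. A minimal sketch of a bracketed read follows. `dma_buf_sync_call` and `read_synced` are hypothetical helper names, and whether this changes the behaviour with the carveout heap in this SDK is an assumption that was not verified in this thread.

```cpp
#include <linux/dma-buf.h>
#include <sys/ioctl.h>

#include <cerrno>
#include <cstdint>

// Hypothetical helper: issue one DMA_BUF_IOCTL_SYNC.
// Returns 0 on success or the errno of the failed ioctl.
int dma_buf_sync_call(int buf_fd, std::uint64_t flags)
{
    struct dma_buf_sync sync = {};
    sync.flags = flags;
    return ioctl(buf_fd, DMA_BUF_IOCTL_SYNC, &sync) < 0 ? errno : 0;
}

// Hypothetical helper: read one 64-bit value inside a START/END read bracket.
// `src` points into the mmap()ed dma_buf; `dst` is written only if START succeeds.
int read_synced(int buf_fd, const volatile std::uint64_t* src, std::uint64_t* dst)
{
    // Announce the start of CPU read access before touching the mapping.
    int err = dma_buf_sync_call(buf_fd, DMA_BUF_SYNC_START | DMA_BUF_SYNC_READ);
    if (err != 0)
        return err;

    *dst = *src;

    // Announce the end of CPU read access.
    return dma_buf_sync_call(buf_fd, DMA_BUF_SYNC_END | DMA_BUF_SYNC_READ);
}
```

In the polling loop this would be called as `read_synced(m_dma_buf_fd, val, &current)` once per iteration, checking the return value before using `current`.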

Can you give me any advice on this?

PS: I know that polling is not good and I will use interrupts/mailbox later. This is only about accessing shared memory from multiple CPUs.

  • Hi Jan,

    Can you share the test case so that I can reproduce it locally?

    Best Regards,
    Keerthy

  • Yes sure.

    On the DSP it's just a loop which increments the value in shared memory every second.

    On the A72 I now have this code:

    volatile uint64_t* val = reinterpret_cast<volatile uint64_t*>(virtAddress);
    uint64_t _val          = 1000;
    
    while (m_runInLoop)
    {
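        const uint64_t current = *val;  // read the shared value once per iteration
        if (current != _val)
        {
            // PRIu64 (from <cinttypes>) matches uint64_t; %u does not
            std::printf("val: %" PRIu64 "\n", current);
            _val = current;
        }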
        std::this_thread::sleep_for(std::chrono::milliseconds(50));
    }

    The code uses the dma_buf heap mapping in userspace from the post above.

    The output then looks like this:

    12
    13
    15
    17
    22
    28
    29
    37
    ...

    Normally I would expect that the loop detects that the value changes every second and prints it.

    But instead, sometimes it works fine and sometimes no value change is detected for a few seconds.
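To put a number on the gaps, the observed values can be checked for skipped increments. The sketch below assumes, as in the DSP loop, that the counter only ever increases by one per step. `count_missed` is a hypothetical helper name.

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical helper: given the values the A72 printed, in order, count how
// many increments were never observed. Assumes a counter that steps by 1.
std::uint64_t count_missed(const std::vector<std::uint64_t>& seen)
{
    std::uint64_t missed = 0;
    for (std::size_t i = 1; i < seen.size(); ++i) {
        if (seen[i] > seen[i - 1])
            missed += seen[i] - seen[i - 1] - 1;  // values skipped between two prints
    }
    return missed;
}
```

For the eight values listed above (12, 13, 15, 17, 22, 28, 29, 37) this gives 18 missed updates: the counter went through 26 values from 12 to 37, but only 8 were seen.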

    I even put the dma_buf sync code from the initial post in the loop before reading the value and it does not help.

    Br Jan

  • Hi Jan,

    I went through the code and your objective seems very close to an already existing vision application.

    https://software-dl.ti.com/jacinto7/esd/processor-sdk-rtos-jacinto7/07_00_00_11/index_FDS.html

    I am using the latest 7.0 based PSDKRA.

    I booted to Linux prompt and did the following:

    cd /opt/vision_apps
    source ./vision_apps_init.sh
    
    ./vx_app_linux_arm_ipc.out

    Output: https://pastebin.ubuntu.com/p/s88wDKp7Ht/

    So can you try that and let me know if that is working for you?

    Best Regards,
    Keerthy

  • Turns out the fast polling of memory on the A72 side was the problem.
    When I slow down the polling it works fine.
    Maybe you can explain why fast polling on the A72 side is a problem? Using /dev/mem to access the physical shared memory did not show this problem.

    Aside from this, the issue is solved, I guess.

    Note: I'll implement interrupts/mailbox hardware anyway. The polling was just a test.

    Br Jan