DVRRDK question on allocating buffer in DSP

Thomas Lo

Other Parts Discussed in Thread: SYSBIOS

I want to create a frame-sized buffer on the DSP that I can use for processing and temporary storage.

Following the SCD module, I tried allocating the buffer with Utils_memAlloc(frame_size, SharedRegion_getCacheLineSize(SYSTEM_IPC_SR_CACHED));

But the buffer is then cached, so I have to call ti_sysbios_family_c64p_Cache_inv and ti_sysbios_family_c64p_Cache_wb, which seems to use up a lot of DSP load.

I tried calling just Cache_inv and Cache_wb with no other processing in between, and the DSP load increased by 30%. Is this normal? I am passing 16ch D1 + 16ch CIF to the DSP, but only the D1 channels call Cache_inv and Cache_wb in my test.

Or should I allocate the buffer from somewhere else? Thanks.

over 12 years ago

0 Ritesh Rajore over 12 years ago

TI__Expert 3065 points

Thomas,

You need to take care of cache_inv and cache_wb only if you have enabled cache on the DSP side. You can verify this in the cfg file FC_RMAN_IRES_c6xdsp.cfg: check which MAR bits are enabled. By default the complete region is cached for the DSP.

Utils_memAlloc() tries to allocate memory from SR2 first; if none is available there, it looks in the Tiler region and/or the (SR1) Bitstream buffer. Check the memory map and set the MAR bits for the (SR2) Frame Buffer Region accordingly. To verify whether a region is cached or uncached, check the MAR bit status of the region the memory was allocated from. See the AlgLink_ScdVACreate() function in scdLink_alg.c for more details.
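To make the MAR check concrete, here is a minimal sketch of how a buffer address maps to its MAR register on a C64x+/C674x core, where each MAR covers 16 MB and MAR0 sits at 0x01848000. The helper names (mar_index, mar_reg_addr, mar_count) are made up for illustration and are not part of DVRRDK or SYS/BIOS. The code only does the address arithmetic, so it runs on a host as well.

```c
#include <stdint.h>

/* C64x+/C674x: one 32-bit MAR register per 16 MB of address space,
 * starting with MAR0 at 0x01848000. Bit 0 (PC) set means cacheable. */
#define MAR_BASE_ADDR   0x01848000u
#define MAR_RANGE_SHIFT 24u          /* 2^24 bytes = 16 MB per MAR */

/* Index of the MAR register that governs the given byte address. */
uint32_t mar_index(uint32_t addr)
{
    return addr >> MAR_RANGE_SHIFT;
}

/* Memory-mapped address of that MAR register. */
uint32_t mar_reg_addr(uint32_t addr)
{
    return MAR_BASE_ADDR + 4u * mar_index(addr);
}

/* Number of MAR registers that the buffer [addr, addr + size) spans. */
uint32_t mar_count(uint32_t addr, uint32_t size)
{
    if (size == 0u) {
        return 0u;
    }
    return mar_index(addr + size - 1u) - mar_index(addr) + 1u;
}
```

For a buffer returned by Utils_memAlloc() at, say, 0x80000000, this gives MAR128 at 0x01848200. On the target you would then read that register and test bit 0. A buffer that straddles a 16 MB boundary is governed by more than one MAR, which is what mar_count reports.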

If you are requesting frame memory, you can use the simpler API Utils_memFrameAlloc(). It's easy to use.

0 Badri Narayanan over 12 years ago in reply to Ritesh Rajore

TI__Guru 59700 points

Why are you doing cache invalidate operations? If you use it as a temporary buffer, there is no need for any cache coherency operation. SCD does cache coherency operations because the buffer is sent to the A8.

0 Thomas Lo over 12 years ago in reply to Ritesh Rajore

Expert 1560 points

Thanks Ritesh, I will try that.

Badri, I don't want cache coherency operations. But I followed the SCD module to allocate the buffer, and it requires cache handling; otherwise the buffer is not updated. So I am asking what the better way to allocate a temporary buffer is.

0 Badri Narayanan over 12 years ago in reply to Thomas Lo

TI__Guru 59700 points

Allocating a buffer from SharedRegion 1 does _not_ mean you have to do any cache coherency operation. Remove the cache coherency code in your case, as it is not meaningful. Cache management is done automatically by the c674 core. If only the c674 accesses the buffer you allocate, _do_ _not_ do cache coherency operations, as they unnecessarily waste CPU cycles. As I mentioned, cache coherency is required in the SCD case because the buffer is sent to the A8, which will not happen in your case. Cache coherency has nothing to do with buffer allocation.

0 Thomas Lo over 12 years ago in reply to Badri Narayanan

Expert 1560 points

Sorry for asking so much; I am not familiar with cache issues.

If I need to use DMA to copy between this buffer and the frame buffer, is a cache operation needed? Thanks.

0 Badri Narayanan over 12 years ago in reply to Thomas Lo

TI__Guru 59700 points

If you read from a buffer in DDR that is in a cache-enabled region, it will be fetched into the c674 L2/L1 cache. All further references to that buffer will go to the cache.

If the contents of the buffer are then modified in DDR by another initiator such as EDMA, the contents of DDR (the actual updated contents) and the contents of the L2/L1 cache are no longer coherent.

The c674 has no h/w support for snoop and invalidate between DDR and L2 cache, so you have to programmatically do Cache_inv before reading the buffer contents.

If you write to a buffer in DDR that is in a cache-enabled region, the data will be written into the c674 L2/L1 cache and the line will be marked as dirty. Data will be updated in DDR only when the line is evicted by the cache controller.

If the contents of the buffer are then read from DDR by another initiator such as EDMA, the contents of DDR (stale data) and the contents of the L2/L1 cache (updated data) are no longer coherent.

The c674 has no h/w support for snoop and invalidate between DDR and L2 cache, so you have to programmatically do Cache_wb before the buffer contents are read by EDMA.
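The two cases above can be shown with a toy model in plain C. This is only a sketch: ToyBuf and every function in it are hypothetical stand-ins, not SYS/BIOS or DVRRDK calls, and the whole buffer is treated as one cache line to keep it short. toy_cache_inv and toy_cache_wb play the roles of Cache_inv and Cache_wb.

```c
#include <string.h>

#define BUF_SIZE 8

/* Toy model: one buffer in "DDR" plus the DSP's cached copy of it. */
typedef struct {
    unsigned char ddr[BUF_SIZE];   /* what EDMA and other initiators see */
    unsigned char cache[BUF_SIZE]; /* what the CPU sees once it is cached */
    int valid;                     /* cache currently holds the buffer */
    int dirty;                     /* cached copy is newer than DDR */
} ToyBuf;

/* Fill the cache from DDR on a miss. */
static void toy_fill(ToyBuf *b)
{
    if (!b->valid) {
        memcpy(b->cache, b->ddr, BUF_SIZE);
        b->valid = 1;
        b->dirty = 0;
    }
}

/* CPU accesses always go through the cache. */
unsigned char cpu_read(ToyBuf *b, int i)
{
    toy_fill(b);
    return b->cache[i];
}

void cpu_write(ToyBuf *b, int i, unsigned char v)
{
    toy_fill(b);
    b->cache[i] = v;
    b->dirty = 1;      /* DDR is not updated yet */
}

/* DMA accesses go straight to DDR and never see the cache. */
void dma_write(ToyBuf *b, int i, unsigned char v) { b->ddr[i] = v; }
unsigned char dma_read(const ToyBuf *b, int i)    { return b->ddr[i]; }

/* Invalidate: drop the cached copy so the next CPU read refetches DDR. */
void toy_cache_inv(ToyBuf *b)
{
    b->valid = 0;
    b->dirty = 0;
}

/* Write back: push dirty cached data out to DDR. */
void toy_cache_wb(ToyBuf *b)
{
    if (b->valid && b->dirty) {
        memcpy(b->ddr, b->cache, BUF_SIZE);
        b->dirty = 0;
    }
}
```

In this model a cpu_read after a dma_write keeps returning the old value until toy_cache_inv is called, and a dma_read after a cpu_write keeps returning the old value until toy_cache_wb is called. If neither DMA function is ever used on the buffer, neither cache call changes anything the CPU can observe, which is the temporary-buffer case discussed earlier.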

Based on the above, you can decide whether you need cache coherency operations. It is preferable for the CPU not to touch the buffer in DDR at all. Optimized algorithms generally keep a ping-pong buffer in L2 SRAM.

A line of data is fetched from DDR to L2 SRAM by DMA, processed, and written back to DDR by DMA.

While this processing is happening, the next line is fetched in parallel into the pong buffer.

This way the CPU wait time for the DMA transfer to complete is minimized, and no cache coherency operation is needed on the buffer in DDR, as the CPU never accesses it directly.
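A rough sketch of the ping-pong flow described above, in portable C. Everything here is an assumption for illustration: dma_copy is a memcpy stand-in for an EDMA transfer (a real one runs asynchronously and needs a completion wait), process_line is a made-up kernel that inverts pixels, and the two-line array stands in for a buffer placed in L2 SRAM.

```c
#include <stddef.h>
#include <string.h>

#define LINE_BYTES 16

/* Stand-in for an EDMA transfer. On the target this would be
 * asynchronous and paired with a wait-for-completion call. */
static void dma_copy(unsigned char *dst, const unsigned char *src, size_t n)
{
    memcpy(dst, src, n);
}

/* Hypothetical per-line kernel: invert every pixel in place. */
static void process_line(unsigned char *line, size_t n)
{
    size_t i;
    for (i = 0; i < n; i++) {
        line[i] = (unsigned char)(255 - line[i]);
    }
}

/* Process a frame line by line; the CPU only touches the two
 * small line buffers, never the frame in DDR. */
void process_frame(unsigned char *frame, size_t lines)
{
    unsigned char l2[2][LINE_BYTES];  /* ping and pong (L2 SRAM on target) */
    size_t cur = 0;
    size_t y;

    if (lines == 0) {
        return;
    }

    dma_copy(l2[cur], frame, LINE_BYTES);       /* prefetch line 0 */

    for (y = 0; y < lines; y++) {
        size_t next = cur ^ 1u;

        /* Start fetching the next line into the other buffer; with a
         * real DMA this overlaps with the processing below. */
        if (y + 1 < lines) {
            dma_copy(l2[next], frame + (y + 1) * LINE_BYTES, LINE_BYTES);
        }

        process_line(l2[cur], LINE_BYTES);

        /* Write the processed line back to DDR by DMA. */
        dma_copy(frame + y * LINE_BYTES, l2[cur], LINE_BYTES);

        cur = next;
    }
}
```

With asynchronous transfers, the write-back of a line buffer has to finish before that same buffer is reused as a fetch target two iterations later, so a real version needs a wait at that point.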
