• Resolved

TMS320C6678: C6678 Memory transfer (copy) performance

Part Number: TMS320C6678

Hi,

I would like to ask what the fastest way is to transfer data from MSMC to MSMC.

The source buffer is in a virtual non-cached MSMC region, while the destination buffer is in the standard cached MSMC region. The cache in use is L1.

The buffer holds 2048 integers (8192 B, 8 kB), and timings are measured with the TSCL/TSCH registers. I have run several tests:

1) Transferring data with memcpy using the real cached address instead of the virtual address for the source buffer, without any cache_inv or cache_wb, takes 2.3 us on average. Of course, in this case the data is not really transferred from memory to memory; it stays in the cache of that core. Throughput (in cache) is 8 kB / 2.3 us = 3.48 GB/s, which is not even close to the 16 GB/s declared for MSMC, even though this is cache, which should be faster.

2) Transferring data with memcpy using the real cached address instead of the virtual address for the source buffer, with cache_wb only, takes 3 us on average. The memcpy takes a little more than 2 us and the cache_wb a little less than 1 us. Throughput (data is read from cache and written to MSMC by the write-back) is 8 kB / 3 us = 2.7 GB/s.

3) Transferring data with memcpy using the real cached address instead of the virtual address for the source buffer, with both cache_wb and cache_inv, takes 6.1 us on average. The memcpy takes 4.5 us, the invalidate 0.9 us and the write-back 0.7 us. I can't understand why the memcpy takes so much longer than before for the same operation. Throughput (data correctly transferred from MSMC to MSMC) is 8 kB / 6.1 us = 1.3 GB/s.

 

4) Transferring data with memcpy using the virtual address for the source buffer, with cache_wb, takes 40 us on average. I won't break down the two operations in detail; the memcpy alone takes around 39 us, and I can't understand why. Throughput (data correctly transferred from MSMC to MSMC) is 8 kB / 40 us = 200 MB/s.


5) Transferring data with EDMA needs no cache operations; the data is correctly transferred from MSMC to MSMC and it takes around 4 us. Allowing 1 us of overhead (even though I know it is less), throughput is 8 kB / 3 us = 2.7 GB/s.
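
The timings and throughput figures in the five tests above can be cross-checked with a small helper. This is a minimal sketch, not the code actually used for the tests: it assumes a 1 GHz core clock (so one time stamp tick is 1 ns), and all function names are hypothetical. On the device the two halves would come from the TSCL and TSCH registers, with TSCL read first; here they are passed in as plain values so the arithmetic can be checked on any host.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Combine the two 32-bit halves of the time stamp counter into one
 * 64-bit cycle count. On the device, TSCL is read first (that read
 * latches the matching TSCH value); here both are plain arguments. */
static uint64_t tsc_combine(uint32_t tscl, uint32_t tsch)
{
    return ((uint64_t)tsch << 32) | (uint64_t)tscl;
}

/* Convert an elapsed cycle count to microseconds.
 * core_hz is an assumption, e.g. 1000000000 for a 1 GHz core. */
static double cycles_to_us(uint64_t start, uint64_t end, uint64_t core_hz)
{
    assert(core_hz > 0 && end >= start);
    return (double)(end - start) * 1e6 / (double)core_hz;
}

/* Effective throughput in GB/s (decimal, 1 GB = 1e9 B) for `bytes`
 * bytes moved in `micros` microseconds. B/us equals MB/s. */
static double throughput_gbps(size_t bytes, double micros)
{
    assert(micros > 0.0);
    return (double)bytes / micros / 1000.0;
}
```

With the full 8192 B, 2300 cycles at 1 GHz (test 1) works out to about 3.56 GB/s, slightly above the figure quoted in the list, which is rounded from 8 kB = 8000 B. Test 4 (40 us) comes out at about 0.2 GB/s.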

Based on this topic  I would expect a much faster transfer. Is there something wrong with what I am doing?

I would like this data transfer to take as little time as possible. Since a copy accesses MSMC twice (read and write), I would expect something close to 8 GB/s for the complete transfer. Am I wrong?
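
To put that expectation in numbers: at 8 GB/s, an 8192 B copy would take about 1 us. The sketch below shows the arithmetic. The 8 GB/s target is only the assumption stated above (16 GB/s shared between the read and the write), not a verified device limit, and the function names are hypothetical.

```c
#include <assert.h>
#include <stddef.h>

/* Time in microseconds to move `bytes` bytes at `gbps` GB/s
 * (decimal, 1 GB = 1e9 B, so 1 GB/s = 1000 B/us). */
static double ideal_copy_us(size_t bytes, double gbps)
{
    assert(gbps > 0.0);
    return (double)bytes / (gbps * 1000.0);
}

/* How many times slower a measured time is than the ideal time. */
static double slowdown(double measured_us, double ideal_us)
{
    assert(ideal_us > 0.0);
    return measured_us / ideal_us;
}
```

For example, `ideal_copy_us(8192, 8.0)` gives 1.024 us, so the 6.1 us of test 3 is roughly 6 times slower than the expected figure, and the 3 us attributed to the EDMA transfer roughly 3 times slower.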

Any advice or suggestions would be much appreciated.

Thank you very much in advance for your help.

Best Regards,

Fabrizio