I am trying to read data from DDR3 (512 MBytes) with one core of a TMS320C6678. The test bench conditions are as follows:
- L1D Cache=32 KBytes ; L1P Cache=32 KBytes
- L2 Cache=256 KBytes
- Read buffer=512 MBytes
- Consecutive reads from the buffer, all the way to the banks (core level)
=> Throughput of about 2 GBytes per second
I find this very low compared with L1, for example (about 17 GBytes/s).
Am I doing something wrong, or is this normal?
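
For reference, here is a minimal sketch of the kind of sequential read test described above. It is not the exact test bench code: the function names (`read_all`, `gbytes_per_second`, `measure_read_gbps`) are made up for illustration, `clock()` is only a portable stand-in for whatever cycle counter is used on the DSP, and 1 GByte is assumed to mean 1e9 bytes.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <time.h>

/* Read every 64-bit word of buf once, in address order.
 * The volatile qualifier forces each load to be performed, and the sum
 * is returned so the result of the reads is actually used. */
uint64_t read_all(const volatile uint64_t *buf, size_t n_words)
{
    uint64_t sum = 0;
    for (size_t i = 0; i < n_words; i++) {
        sum += buf[i];
    }
    return sum;
}

/* Convert a byte count and an elapsed time into GBytes/s.
 * Assumption: 1 GByte = 1e9 bytes. */
double gbytes_per_second(double bytes, double seconds)
{
    assert(seconds > 0.0);
    return bytes / seconds / 1e9;
}

/* Time one sequential pass over buf and return the throughput in GBytes/s.
 * clock() is a portable placeholder; on the real target a hardware cycle
 * counter would give much better resolution.
 * Returns 0.0 when the pass was too short for clock() to measure. */
double measure_read_gbps(const uint64_t *buf, size_t n_words, uint64_t *checksum)
{
    clock_t start = clock();
    *checksum = read_all(buf, n_words);
    clock_t stop = clock();

    double seconds = (double)(stop - start) / (double)CLOCKS_PER_SEC;
    if (seconds <= 0.0) {
        return 0.0;
    }
    return gbytes_per_second((double)n_words * sizeof(uint64_t), seconds);
}
```

In the real test the buffer would be the 512 MByte DDR3 region, and the loop would be timed over the whole buffer, so the figure reflects DDR3 rather than data already sitting in L1D or L2.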
Thanks