I am using the EDMA3 low-level driver (edma3_lld_01_06_00_01) and DSP/BIOS 5.33.06 with CCS 3.3 on a DSK6455. I extracted the DAT functions from the CSL2_DAT_DEMO example included in the LLD and built a library (DAT_ll_*) from them. I have verified the functionality of the DAT functions in my application manually with the debugger and at runtime. The problem is their performance, even though the DAT overhead is minimal and consists almost entirely of EDMA3_DRV calls.
Copying 1 GiB of data from external to internal memory in chunks yields a total throughput of about 230 MiB/s. All sections are mapped to internal memory, except the "dext" data section. Thus, the memory interface should not be stressed. Here are some code snippets from my test project:
#pragma DATA_ALIGN(dstBuff, 8)
#pragma DATA_SECTION(dstBuff, "dint")
unsigned int dstBuff[SIZE];
#pragma DATA_ALIGN(srcBuff1, 8)
#pragma DATA_SECTION(srcBuff1, "dext")
unsigned int srcBuff1[SIZE];
...
{
float time_cycles, time_ms;
unsigned long long int u64T1, u64T2;
TSCH = 0; TSCL = 0;
u64T1 = _itoll(TSCH, TSCL);
for (i = 0; i < (1024 * 1024 * (1024)) / SIZE; i++)
{
waitId[0] = DAT_ll_copy2d(DAT_2D2D, srcBuff1, dstBuff, 4, SIZE / 8, 8);
DAT_ll_wait(waitId[0]);
}
u64T2 = _itoll(TSCH, TSCL);
time_cycles = (float)(u64T2 - u64T1);
time_ms = time_cycles / (float)GBL_getFrequency(); // GBL_getFrequency() returns kHz, so this is in msec
printf("performance DAT_ll_copy2d: %fMiB/s\n", 1024. * 1000. / time_ms); // 1024 MiB / (time_ms / 1000) s
}
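To make the throughput arithmetic of the measurement above easy to check, here is a minimal host-side sketch in plain C. It assumes that DAT_ll_copy2d keeps the CSL DAT_copy2d convention, where the fourth argument is the number of bytes per line and the fifth is the number of lines; under that assumption the call above would move 4 * (SIZE / 8) = SIZE / 2 bytes per iteration, which is worth checking against the intended total. The helper names bytes_per_copy2d and throughput_mib_per_s are hypothetical and not part of the LLD or CSL.

```c
#include <stdint.h>

/* Payload of one 2D copy. Assumption: line_len is the byte count per line
   and line_cnt the number of lines; the line pitch adds no payload. */
static uint64_t bytes_per_copy2d(uint32_t line_len, uint32_t line_cnt)
{
    return (uint64_t)line_len * line_cnt;
}

/* Throughput in MiB/s from total bytes moved, elapsed CPU cycles,
   and the CPU clock in kHz (the unit assumed for GBL_getFrequency()). */
static double throughput_mib_per_s(uint64_t total_bytes,
                                   uint64_t cycles,
                                   uint32_t cpu_khz)
{
    double seconds = (double)cycles / ((double)cpu_khz * 1000.0);
    return ((double)total_bytes / (1024.0 * 1024.0)) / seconds;
}
```

For example, throughput_mib_per_s(n_calls * bytes_per_copy2d(4, SIZE / 8), u64T2 - u64T1, GBL_getFrequency()) would report the rate based on the bytes actually requested per call rather than on a fixed 1 GiB.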
I expected a much higher throughput than 230 MiB/s. How can I increase the performance?
Regards, Christoph