Other Parts Discussed in Thread: FFTLIB
Hello!
I am using the VLFFT demo from e2e.ti.com/.../303599 to calculate a 1024K FFT. It runs on the TMDSEVM6678LE evaluation board.
I use SDK CCS 6.2.0.
I am using the following products to compile the VLFFT project:
XDCTools 3.23.4.60
EDMA3 LLD 2.11.5
IPC 1.24.2.27
MCSDK 2.1.2.6
SYS/BIOS 6.33.4.39
I can't achieve the performance reported in the document "Very large FFT for TMS320C6678 processors". I get the following results for the 1024K FFT:
8 Core: 32.484165 ms vs 6.403 ms in document;
4 Core: 35.519242 ms vs 9.605 ms in document;
2 Core: 58.152080 ms vs 19.328 ms in document;
1 Core: 115.402698 ms vs 38.557 ms in document.
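To make the gap easier to see, here is a small Python sketch that only redoes the arithmetic on the numbers listed above. The function names are my own and are not part of the VLFFT demo.

```python
# Times for the 1024K FFT in milliseconds, keyed by number of working cores.
# Values are copied from the list above; nothing here is measured by the script.
measured_ms = {8: 32.484165, 4: 35.519242, 2: 58.152080, 1: 115.402698}
documented_ms = {8: 6.403, 4: 9.605, 2: 19.328, 1: 38.557}


def slowdown(cores):
    """How many times slower my run is than the documented figure."""
    return measured_ms[cores] / documented_ms[cores]


def speedup_vs_single_core(times_ms, cores):
    """Scaling relative to the 1-core time within one set of results."""
    return times_ms[1] / times_ms[cores]


if __name__ == "__main__":
    for cores in (1, 2, 4, 8):
        print(
            f"{cores} core(s): {slowdown(cores):.2f}x slower than the document, "
            f"scaling {speedup_vs_single_core(measured_ms, cores):.2f}x measured vs "
            f"{speedup_vs_single_core(documented_ms, cores):.2f}x documented"
        )
```

On these numbers, the 1-core and 2-core runs are about 3 times slower than the document, and the gap grows to about 5 times at 8 cores. Going from 1 to 8 cores gives roughly 3.6x in my runs versus roughly 6x in the document.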
Also, the calculated results do not match the reference results. The log is:
max error index: 0
real, 8.515877, real_ref: 36592189439.999998
imag, 8.515877, imag_ref: 36592189439.999998
Fail!!!
Why does this happen?
vlfft.zip
[C66xx_0] DMA channel 0: 0
DMA channel 1: 0
pass init 1
pass init 2
pass init 3
max num of cores: 8
num of working cores: 8
total size FFT: 1048576
1st iter FFT: 1024
2nd iter FFT: 1024
[C66xx_1] DMA channel 0: 2
[C66xx_2] DMA channel 0: 4
[C66xx_3] DMA channel 0: 6
[C66xx_4] DMA channel 0: 8
[C66xx_5] DMA channel 0: 10
[C66xx_6] DMA channel 0: 12
[C66xx_7] DMA channel 0: 14
[C66xx_1] DMA channel 1: 2
[C66xx_2] DMA channel 1: 4
[C66xx_3] DMA channel 1: 6
[C66xx_4] DMA channel 1: 8
[C66xx_5] DMA channel 1: 10
[C66xx_6] DMA channel 1: 12
[C66xx_7] DMA channel 1: 14
[C66xx_1] The test start!
[C66xx_2] The test start!
[C66xx_4] The test start!
[C66xx_5] The test start!
[C66xx_6] The test start!
[C66xx_3] The test start!
[C66xx_7] The test start!
[C66xx_0] Core0 start initializing data array
Core0 finish initializing data array
Sync up all the cores
[C66xx_1] vlfft initial sync
[C66xx_2] vlfft initial sync
[C66xx_3] vlfft initial sync
[C66xx_4] vlfft initial sync
[C66xx_5] vlfft initial sync
[C66xx_6] vlfft initial sync
[C66xx_7] vlfft initial sync
[C66xx_0] The test is starting!
start of loop: 0
[C66xx_3] The test is complete!
[C66xx_4] The test is complete!
[C66xx_5] The test is complete!
[C66xx_6] The test is complete!
[C66xx_7] The test is complete!
[C66xx_0] The test is complete
[C66xx_1] The test is complete!
[C66xx_2] The test is complete!
[C66xx_0] Number of Clocks per FFT = 32484165
Avg timer per fft = 32.484165 ms
max error index: 0
real, 8.515877, real_ref: 36592189439.999998
imag, 8.515877, imag_ref: 36592189439.999998
Fail!!!

[C66xx_0] DMA channel 0: 0
DMA channel 1: 0
pass init 1
pass init 2
pass init 3
max num of cores: 8
num of working cores: 4
total size FFT: 1048576
1st iter FFT: 1024
2nd iter FFT: 1024
[C66xx_1] DMA channel 0: 2
[C66xx_4] DMA channel 0: 8
[C66xx_6] DMA channel 0: 12
[C66xx_1] DMA channel 1: 2
[C66xx_4] DMA channel 1: 8
[C66xx_6] DMA channel 1: 12
[C66xx_1] The test start!
[C66xx_4] The test start!
[C66xx_6] The test start!
[C66xx_3] DMA channel 0: 6
DMA channel 1: 6
The test start!
[C66xx_5] DMA channel 0: 10
DMA channel 1: 10
The test start!
[C66xx_2] DMA channel 0: 4
[C66xx_7] DMA channel 0: 14
[C66xx_2] DMA channel 1: 4
[C66xx_7] DMA channel 1: 14
[C66xx_2] The test start!
[C66xx_7] The test start!
[C66xx_0] Core0 start initializing data array
Core0 finish initializing data array
Sync up all the cores
[C66xx_1] vlfft initial sync
[C66xx_2] vlfft initial sync
[C66xx_3] vlfft initial sync
[C66xx_4] vlfft initial sync
[C66xx_5] vlfft initial sync
[C66xx_6] vlfft initial sync
[C66xx_7] vlfft initial sync
[C66xx_0] The test is starting!
start of loop: 0
[C66xx_3] The test is complete!
[C66xx_4] The test is complete!
[C66xx_5] The test is complete!
[C66xx_6] The test is complete!
[C66xx_7] The test is complete!
[C66xx_0] The test is complete
[C66xx_1] The test is complete!
[C66xx_2] The test is complete!
[C66xx_0] Number of Clocks per FFT = 35519242
Avg timer per fft = 35.519242 ms
max error index: 0
real, 36860624895.999998, real_ref: 8.578377
imag, 36860624895.999998, imag_ref: 8.578377
Fail!!!

[C66xx_7] DMA channel 0: 14
DMA channel 1: 14
[C66xx_5] DMA channel 0: 10
[C66xx_6] DMA channel 0: 12
[C66xx_5] DMA channel 1: 10
[C66xx_6] DMA channel 1: 12
[C66xx_1] DMA channel 0: 2
DMA channel 1: 2
[C66xx_3] DMA channel 0: 6
DMA channel 1: 6
[C66xx_0] DMA channel 0: 0
DMA channel 1: 0
pass init 1
pass init 2
pass init 3
max num of cores: 8
num of working cores: 2
total size FFT: 1048576
1st iter FFT: 1024
2nd iter FFT: 1024
[C66xx_2] DMA channel 0: 4
[C66xx_4] DMA channel 0: 8
[C66xx_2] DMA channel 1: 4
[C66xx_4] DMA channel 1: 8
The test start!
[C66xx_1] The test start!
[C66xx_3] The test start!
[C66xx_5] The test start!
[C66xx_6] The test start!
[C66xx_7] The test start!
[C66xx_2] The test start!
[C66xx_0] Core0 start initializing data array
Core0 finish initializing data array
Sync up all the cores
[C66xx_1] vlfft initial sync
[C66xx_2] vlfft initial sync
[C66xx_3] vlfft initial sync
[C66xx_4] vlfft initial sync
[C66xx_5] vlfft initial sync
[C66xx_6] vlfft initial sync
[C66xx_7] vlfft initial sync
[C66xx_0] The test is starting!
start of loop: 0
[C66xx_6] The test is complete!
[C66xx_7] The test is complete!
[C66xx_0] The test is complete
[C66xx_1] The test is complete!
[C66xx_2] The test is complete!
[C66xx_3] The test is complete!
[C66xx_4] The test is complete!
[C66xx_5] The test is complete!
[C66xx_0] Number of Clocks per FFT = 58152080
Avg timer per fft = 58.152080 ms

[C66xx_0] DMA channel 0: 0
DMA channel 1: 0
pass init 1
pass init 2
pass init 3
max num of cores: 8
num of working cores: 1
total size FFT: 1048576
1st iter FFT: 1024
2nd iter FFT: 1024
[C66xx_1] DMA channel 0: 2
[C66xx_2] DMA channel 0: 4
[C66xx_7] DMA channel 0: 14
[C66xx_1] DMA channel 1: 2
[C66xx_2] DMA channel 1: 4
[C66xx_7] DMA channel 1: 14
[C66xx_1] The test start!
[C66xx_2] The test start!
[C66xx_7] The test start!
[C66xx_5] DMA channel 0: 10
[C66xx_6] DMA channel 0: 12
[C66xx_5] DMA channel 1: 10
[C66xx_6] DMA channel 1: 12
[C66xx_5] The test start!
[C66xx_6] The test start!
[C66xx_4] DMA channel 0: 8
DMA channel 1: 8
The test start!
[C66xx_3] DMA channel 0: 6
DMA channel 1: 6
The test start!
[C66xx_0] Core0 start initializing data array
Core0 finish initializing data array
Sync up all the cores
[C66xx_1] vlfft initial sync
[C66xx_2] vlfft initial sync
[C66xx_3] vlfft initial sync
[C66xx_4] vlfft initial sync
[C66xx_5] vlfft initial sync
[C66xx_6] vlfft initial sync
[C66xx_7] vlfft initial sync
[C66xx_0] The test is starting!
start of loop: 0
[C66xx_5] The test is complete!
[C66xx_6] The test is complete!
[C66xx_7] The test is complete!
[C66xx_0] The test is complete
[C66xx_1] The test is complete!
[C66xx_2] The test is complete!
[C66xx_3] The test is complete!
[C66xx_4] The test is complete!
[C66xx_0] Number of Clocks per FFT = 115402698
Avg timer per fft = 115.402698 ms
max error index: 0
real, 8.519783, real_ref: 36592189439.999998
imag, 8.519783, imag_ref: 36592189439.999998
Fail!!!
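As a sanity check on the timing lines, the sketch below works out the clock rate implied by each "Number of Clocks per FFT" / "Avg timer per fft" pair in the log. I do not know how the demo itself converts clocks to milliseconds, so this only shows what the printed numbers imply; `implied_clock_hz` is a name I made up for the illustration.

```python
# (clock count, reported time in ms) pairs copied from the log above,
# for the 8-, 4-, 2- and 1-core runs in that order.
runs = [
    (32484165, 32.484165),
    (35519242, 35.519242),
    (58152080, 58.152080),
    (115402698, 115.402698),
]


def implied_clock_hz(clocks, elapsed_ms):
    """Clock frequency that makes `clocks` cycles take `elapsed_ms` milliseconds."""
    return clocks / (elapsed_ms / 1000.0)


if __name__ == "__main__":
    for clocks, ms in runs:
        ghz = implied_clock_hz(clocks, ms) / 1e9
        print(f"{clocks} clocks in {ms} ms -> {ghz:.3f} GHz")
```

Every pair works out to 1 GHz, so the reported milliseconds are the cycle counts scaled on the assumption of a 1 GHz core clock. If the cores were actually running at a different frequency, the wall-clock times would differ from the printed ones.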