Hello,
I want to benchmark the floating-point FFT on the C6678. I ran the DSPLIB routine _DSPF_sp_fftSPxSP (the latest update, written in linear assembly). For N = 16K (16384) I measured 466,244 clock cycles. All my data is in L2, L1D is configured as cache (32K), and compiler optimizations are enabled (-o3).
Is this the result I should expect?
Thanks
Hi,
There is a discussion about FFT performance here: http://e2e.ti.com/support/dsp/c6000_multi-core_dsps/f/639/t/125157.aspx
For the 16K FFT, my benchmark on DDR3 (EVM), with 512K of L2 cache enabled, gives 530887 cycles. Your result seems reasonable to me.
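One rough way to compare the two numbers is to normalize each cycle count by N*log2(N) and to look at the relative difference between them. The sketch below is plain C that runs on a host PC. The function names are made up for illustration and are not part of DSPLIB, and N*log2(N) is only a convenient yardstick here, not an official cycle formula for the routine.

```c
#include <assert.h>
#include <stdint.h>

/* Integer log2 of a power-of-two FFT length (e.g. 16384 -> 14). */
unsigned ilog2_pow2(uint32_t n)
{
    unsigned bits = 0;
    assert(n != 0 && (n & (n - 1)) == 0); /* precondition: power of two */
    while (n > 1) {
        n >>= 1;
        ++bits;
    }
    return bits;
}

/* Measured cycles divided by N*log2(N): a size-independent yardstick. */
double cycles_per_nlog2n(uint32_t cycles, uint32_t n)
{
    return (double)cycles / ((double)n * (double)ilog2_pow2(n));
}

/* Fractional extra cost of the slower run relative to the faster one. */
double relative_overhead(uint32_t slower, uint32_t faster)
{
    return ((double)slower - (double)faster) / (double)faster;
}
```

With the figures from this thread, cycles_per_nlog2n(466244, 16384) comes out at about 2.03 and cycles_per_nlog2n(530887, 16384) at about 2.31, while relative_overhead(530887, 466244) is about 0.14. In other words, the DDR3 run with L2 cache costs roughly 14% more cycles than the run with all data in L2.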