Hello,
I have recently run into an FFT performance issue.
As the results below show, there is a large gap between the theoretical cycle count (cycles(soll)) and the measured one (cycles(ist)).
The performance is especially poor for an input length of 32768, where the measured count is roughly 7.8 times the theoretical one.
FFT benchmark for 500000000 Hz
DSPLIB SPxSP (1024) cycles(soll)=14464.0000 cycles(ist)=32799.0000 cycles(ratio)=0.4409 time=0.0655 ms
DSPLIB SPxSP (4096) cycles(soll)=69781.0000 cycles(ist)=145969.0000 cycles(ratio)=0.4780 time=0.2919 ms
DSPLIB SPxSP (8192) cycles(soll)=164010.0000 cycles(ist)=439627.0000 cycles(ratio)=0.3730 time=0.8792 ms
DSPLIB SPxSP (16384) cycles(soll)=327850.0000 cycles(ist)=1020660.0000 cycles(ratio)=0.3212 time=2.0413 ms
DSPLIB SPxSP (32768) cycles(soll)=753855.0000 cycles(ist)=5856574.0000 cycles(ratio)=0.1287 time=11.7131 ms
DSPLIB cfftr2(32768) cycles(soll)=983082.0000 cycles(ist)=3171030.0000 cycles(ratio)=0.3100 time=6.3420 ms
I'm using a DM814x, and the DSPF_sp_fftSPxSP function comes from dsplib674x.h.
Program code and data are placed in external memory.
The MAR registers are configured, and the cache seems to be working as well.
The cfg and bld files are attached.
Could you please help me with this? What else should I check?