Hello,
I’ve implemented the Real FFT method described here:
http://processors.wiki.ti.com/index.php/Efficient_FFT_Computation_of_Real_Input
and the results match the standard-size complex FFT, as expected. However, the execution time is way out of line for the real FFT approach, specifically in the FFT_Split operation. For instance, in a test with 128 samples, the 64-point FFT takes 2 us, while FFT_Split takes 28 us!
Here are my function calls:
DSPF_sp_fftSPxSP( ( FFT_SIZE >> 1 ), buffer_fft_in, twiddle, buffer_fft_tmp, bit_rev, fft_radix, FFT_OFFSET, ( FFT_SIZE >> 1 ) );
FFT_Split( ( FFT_SIZE >> 1 ), buffer_fft_tmp, a, b, buffer_fft_out );
and related memory locations are:
11812c00 _a
11812e00 _b
11813c30 _bit_rev
11813820 _buffer_fft_in
11813410 _buffer_fft_out
11813000 _buffer_fft_tmp
They seem properly aligned, per my use of the DATA_ALIGN pragma.
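As a quick sanity check on the alignment, the addresses from the listing can be tested against a power-of-two boundary. The sketch below assumes the 8-byte (double-word) alignment that DSPLIB documents for its FFT buffers; the helper names `is_aligned` and `all_aligned` are hypothetical, and the table holds only the six symbols listed in this post (the twiddle table does not appear in the listing).

```c
#include <stdint.h>

/* Addresses copied from the map listing in this post. */
static const uint32_t map_addrs[] = {
    0x11812c00u, /* _a */
    0x11812e00u, /* _b */
    0x11813c30u, /* _bit_rev */
    0x11813820u, /* _buffer_fft_in */
    0x11813410u, /* _buffer_fft_out */
    0x11813000u  /* _buffer_fft_tmp */
};
enum { MAP_ADDR_COUNT = sizeof map_addrs / sizeof map_addrs[0] };

/* Returns 1 if addr is a multiple of align; align must be a power of two. */
static int is_aligned(uint32_t addr, uint32_t align)
{
    return (addr & (align - 1u)) == 0u;
}

/* Returns 1 only if every listed address is aligned to the given boundary. */
static int all_aligned(uint32_t align)
{
    int i;
    for (i = 0; i < MAP_ADDR_COUNT; i++) {
        if (!is_aligned(map_addrs[i], align)) {
            return 0;
        }
    }
    return 1;
}
```

All six addresses pass for 8-byte and 16-byte boundaries; `_buffer_fft_out` at 0x11813410 is the first one that fails a 32-byte check.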
Can someone suggest what I should look at first to explain the time taken by the split?
Thanks,
Robert