C6678 FFT time

Hello, I am a beginner with C66x DSPs and I am quite interested in them!

My question concerns a project I am debugging under the path "Texas Instruments\dsplib_c66x_3_0_7"; the project is named DSPF_sp_fftSPxSP_66_LE_ELF.

I want to run a 1024-point single-precision floating-point FFT. I call t1 = clock() before the FFT and t2 = clock() after it, which gives t = t2 - t1 = 378748 cycles. Since the C6678 runs at 1.25 GHz, I calculate the FFT time as about 303 us, which seems far too long. According to data published by AVNET, a 2048-point, radix-4, single-precision floating-point FFT on a C66x at 1.25 GHz takes 14 us. Why is my result so much slower? I am using the simulator; does that matter?

Thanks in advance!
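
The cycle-to-time arithmetic used in this thread can be sketched in C as follows. This is only a minimal illustration: it assumes the difference of two clock() values is a CPU cycle count (as the thread treats it) and that the core runs at 1.25 GHz. The helper name cycles_to_us is hypothetical and is not part of DSPLIB.

```c
#include <stdint.h>

/* Convert a CPU cycle count to microseconds.
 * cycles:  measured cycle count (assumed to come from t2 - t1)
 * core_hz: core clock in Hz, e.g. 1.25e9 for a 1.25 GHz C6678 (assumption)
 */
static double cycles_to_us(uint64_t cycles, double core_hz)
{
    return (double)cycles / core_hz * 1e6;
}
```

Calling cycles_to_us(378748, 1.25e9) gives roughly 303 us, which matches the figure in the question.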

  • Jie,

    Be sure that you are using the Cycle Approximate simulator.  The Functional Simulator will not give cycle accurate results. 

    Are you building the Debug Version or the Release Version of the project?  (Debug is the default).  The Debug version is not optimized.

    I ran this on C6678 hardware with all of the data in internal memory.  The only thing I changed in the project was to set N = 1024 (it was set to 256) in order to get a 1024-point FFT.

    When I ran the debug version, I got a similar result as you did.  379393 cycles for optimized C. 

    When I built the release version, the results were significantly better.  36729 cycles for natural C, and 12828 cycles for the optimized C.  The latter corresponds to 10.26 us. 

    If you are using the Cycle Approximate simulator and the release version of the project, I would expect you to see similar results.

    Regards,

    Dan
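
A measurement like the ones above is often taken by timing an empty clock() pair first and subtracting that overhead from the measured interval. The sketch below is a generic illustration under stated assumptions, not the DSPLIB test driver itself: measure_ticks, kernel_fn and count_call are hypothetical names, and count_call is only a placeholder for the FFT call being timed. On a host compiler, clock() reports processor time in CLOCKS_PER_SEC units rather than cycles, so the absolute numbers are only meaningful on the target or simulator.

```c
#include <time.h>

/* Hypothetical signature for the code under test. */
typedef void (*kernel_fn)(void *ctx);

/* Time one call of fn(ctx) and subtract the cost of the clock() calls. */
static clock_t measure_ticks(kernel_fn fn, void *ctx)
{
    clock_t start, stop, overhead, elapsed;

    /* Cost of an empty start/stop pair. */
    start = clock();
    stop = clock();
    overhead = stop - start;

    /* Timed region: exactly one call of the kernel. */
    start = clock();
    fn(ctx);
    stop = clock();
    elapsed = stop - start;

    /* Clamp at zero in case the kernel is cheaper than the overhead. */
    return (elapsed > overhead) ? (elapsed - overhead) : 0;
}

/* Placeholder kernel: counts how many times it was called. */
static void count_call(void *ctx)
{
    int *calls = (int *)ctx;
    *calls += 1;
}
```

In a real test, the placeholder kernel would wrap the FFT call, and the project would be built in the Release configuration so that the compiler optimizations are enabled.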

  • Hello jie wang,

    I have the same question about the execution time of the FFT functions provided by the TI DSPLIB for C66x. When testing the fixed-point FFT function DSP_fft16x16, I find that a 2048-point call takes 7395 cycles, corresponding to 5.92 us, which is much longer than the 4.46 us quoted in a manual offered by TI & AVNET. I built the test project as a release version and ran it on the C6678 Device Cycle Approximate Simulator, and I don't know why my measured time differs so much from the reference figure.

    Also, judging from your writing, I guess you are from China too, right? If so, how can I contact you for further discussion?

    Regards,

    bitbad

  • Hello Dan,

    I have a similar question about the FFT execution time on the C6678 simulator. As you suggested, I use the C6678 Device Cycle Approximate Simulator and build the FFT test project as a release version. However, a 2048-point call of the fixed-point FFT function DSP_fft16x16 takes 7395 cycles, corresponding to 5.9 us, which is longer than the 4.46 us quoted in a manual offered by TI & AVNET. I don't know whether some optimization options need to be set to reduce the time of the DSP_fft16x16 call. Can you give me some suggestions on how to reach the efficiency stated in the manual, or is there a benchmark of the DSPLIB FFT functions that I can refer to?

    Many thanks,

    bitbad

  • Perhaps this is an issue with the simulator. Jie Wang, are you still experiencing this?

  • Hello DanRinkes,

    After switching to release mode and the Cycle Approximate simulator, my result is 15.2 us for the optimized C version of a 2048-point, radix-4, floating-point FFT, using DSPF_sp_fftSPxSP from dsplib_c66x.

    This is longer than the 14 us quoted in a manual offered by TI & AVNET.