Time cost and Memory config of C6678 FFT

Hello

I'm new to the TMS320C6678, and reading the FFT discussions on the TI E2E Community has left me with more questions, especially about the difference in cycle counts between the cycle-approximate simulator and the manuals provided by TI and AVNET.

Because our research and design work is urgent, I need to evaluate the performance of the TMS320C6678.

Suppose a 32768-point floating-point FFT is implemented with the DSPLIB function on a single core. How should I configure the data memory, and how many cycles will it take?

And for a 64K-point floating-point FFT implemented with the DSPLIB function on a single core, what memory configuration should I use, and how many cycles will it take?

Thanks for your suggestion!

  • As far as I know, we always benchmark the routines on the simulator with flat, zero-wait-state memory.

    (These are, for example, the numbers in the table that ships with the release in \\dsplib_c66x_3_1_1_1\docs\DSPLib_c66xTest_Report.html.)

    You will understand that we cannot make assumptions about data in external memory: your system may differ from ours, and you may use a 16-bit external memory, a lower clock rate, and so on. We also do not know what else you keep in L2, so we cannot be sure that the data fits there.

    So what if the FFT size is too large for the L1 cache? (I assume this is really your question.)

    My suggestion: get an EVM ($400), put the data where you want it (L2, MSM memory, or DDR), and measure the performance.

    I would add that we have example code that runs a 1M-point FFT on 1, 2, 4, or 8 cores (floating point, single precision, complex) and moves the data in and out of L2 using the EDMA in such a way that each core works on one 1K-point block at a time (4 * 2 * 1K = 8K bytes, which fits in the L1 cache). The algorithm is described in the following paper:

     

    High-Performance Parallel FFT Algorithms for the HITACHI SR8000
    Daisuke Takahashi
    Information Technology Center, University of Tokyo
    2-11-16 Yayoi, Bunkyo-ku, Tokyo 113-8658, Japan

     

    Ran
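
As a rough sizing aid for the memory-placement question in this thread, the sketch below estimates the working set of a complex single-precision FFT and reports the smallest on-chip memory level that could hold it. It is not TI code. The capacities (32 KB L1D and 512 KB L2 per core, 4 MB shared MSM) are the nominal C6678 figures; the usable amount will be smaller if part of a memory is configured as cache or already holds other data. The working-set model (input, output, and twiddle table, each 2*N floats) is an assumption, and all function and enum names are hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>

/* Nominal C6678 capacities (assumption: nothing else occupies them). */
#define L1D_BYTES ((size_t)32 * 1024)
#define L2_BYTES  ((size_t)512 * 1024)
#define MSM_BYTES ((size_t)4 * 1024 * 1024)

/* Hypothetical labels for where a buffer could be placed. */
typedef enum { MEM_L1D, MEM_L2, MEM_MSM, MEM_DDR } mem_level;

static const char *const mem_names[] = { "L1D", "L2", "MSM", "DDR" };

/* Bytes for one complex single-precision array of n points
   (4 bytes per float, 2 floats per point). */
static size_t complex_sp_bytes(size_t n)
{
    assert(n > 0);
    return n * 2u * sizeof(float);
}

/* Assumed working set: input + output + twiddle table,
   each modeled as 2*n floats. */
static size_t fft_working_set_bytes(size_t n)
{
    return 3u * complex_sp_bytes(n);
}

/* Smallest memory level whose nominal capacity covers `bytes`. */
static mem_level smallest_fit(size_t bytes)
{
    if (bytes <= L1D_BYTES) return MEM_L1D;
    if (bytes <= L2_BYTES)  return MEM_L2;
    if (bytes <= MSM_BYTES) return MEM_MSM;
    return MEM_DDR;
}

/* Print a one-line summary for an n-point FFT. */
static void report_fft_footprint(size_t n)
{
    size_t ws = fft_working_set_bytes(n);
    printf("N=%zu: one array %zu bytes, working set %zu bytes -> %s\n",
           n, complex_sp_bytes(n), ws, mem_names[smallest_fit(ws)]);
}
```

Under these assumptions, a 1K-point block needs 8 KB per array and 24 KB in total, so it stays within L1D, which matches the 1K blocking described above. A 32K-point FFT needs 768 KB and a 64K-point FFT needs 1.5 MB, so neither working set fits in a single core's 512 KB L2, and both would have to live in MSM or DDR (or be processed in blocks). Measured cycle counts on an EVM remain the only reliable answer to the performance question.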