A few days ago, I ran an experiment on the C6678. The following is what I did.
1. I disabled the cache and tested the FFT efficiency on L1D, namely the efficiency of L1D SRAM.
2. I enabled the L1D cache and tested the FFT efficiency on LL2.
From the results, I can see that the efficiency of LL2 is close to that of L1D SRAM, and that the FFT efficiency on L1D SRAM is strongly influenced by where the input, output and twiddle factors are placed.
Question 1: Why is the efficiency of LL2 close to that of L1D SRAM? In theory L1D SRAM should be faster than LL2, so what is the bottleneck?
Question 2: Why is the FFT efficiency on L1D SRAM influenced so much by where the input, output and twiddle factors are placed?
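
One possible angle on Question 2 is bank conflicts, so here is a minimal sketch of how buffer addresses could map to L1D banks. It assumes, without confirmation from my tests, that L1D is split into 8 interleaved banks of 4 bytes each. The bank count, bank width, function names and the example addresses are all illustrative assumptions, not measured values.

```c
#include <stdint.h>

/* ASSUMPTION: 8 interleaved banks, each 4 bytes wide.
 * Check the CorePac documentation before relying on these numbers. */
#define L1D_NUM_BANKS        8u
#define L1D_BANK_WIDTH_BYTES 4u

/* Hypothetical helper: bank index that a byte address falls into. */
static unsigned l1d_bank(uint32_t addr)
{
    return (addr / L1D_BANK_WIDTH_BYTES) % L1D_NUM_BANKS;
}

/* Hypothetical helper: returns 1 when two addresses land in the same bank
 * but in different words, which is the case where two parallel accesses
 * could conflict under the assumed banking scheme; returns 0 otherwise. */
static int l1d_banks_collide(uint32_t a, uint32_t b)
{
    int different_word = (a / L1D_BANK_WIDTH_BYTES) != (b / L1D_BANK_WIDTH_BYTES);
    return different_word && (l1d_bank(a) == l1d_bank(b));
}
```

For example, with a hypothetical input buffer at 0x00F00000 and a twiddle table at 0x00F02000, `l1d_banks_collide` returns 1, while moving the twiddle table by 4 bytes to 0x00F02004 makes it return 0. If the real hardware behaves like this model, it could explain why the placement of the three buffers matters so much.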