I am testing the performance of the C66x core on the 66AK2H14. The claimed capability is around 19.2 GFLOPS, which I assume relies on the SIMD operations. Obviously this is a best-case marketing figure, but I am only getting around 3 GFLOPS in both the DSP library FFT functions and my own test loops. So my questions are:
1) Does the C66x DSP library use any of the vector (SIMD) instructions? It doesn't seem to.
2) Besides turning on all optimizations and options in the C compiler, is there anything that would get the compiler to actually use these SIMD instructions, or do they have to be coded in assembly?
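
For concreteness, here is a minimal sketch of the kind of test loop meant above. It assumes a plain single-precision multiply-add kernel; the names `saxpy_test` and `gflops` are illustrative and not taken from the original test code. The only hint given to the compiler is standard C `restrict`; no TI-specific pragmas or intrinsics are used.

```c
#include <stddef.h>

/* Hypothetical test kernel: y[i] = a * x[i] + y[i], i.e. 2 flops per element.
 * restrict tells the compiler that x and y never overlap, which it generally
 * needs to know before it can combine several iterations into wide operations. */
void saxpy_test(float a, const float *restrict x, float *restrict y, size_t n)
{
    for (size_t i = 0; i < n; ++i) {
        y[i] = a * x[i] + y[i];
    }
}

/* Convert a timed run into GFLOPS, assuming 2 flops per element per pass.
 * How `seconds` is measured is left to the caller (an assumption here). */
double gflops(size_t n, size_t passes, double seconds)
{
    return (2.0 * (double)n * (double)passes) / (seconds * 1e9);
}
```

With this accounting, one million elements processed for 1000 passes in 2 seconds comes out to 1 GFLOPS, so the roughly 3 GFLOPS quoted above would correspond to the same work finishing in about two thirds of a second.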
Thanks.