I use the DSP_fir_sym module from DSPlib in my project.
At first I used the implementation in DSP_fir_sym_cn.c, which is included in the dsplib_c66x_3_1_1_1 package.
Next I tried the optimized module from the library dsplib.a66, and the results are disappointing: the code with the non-optimized module runs almost 2 times faster!
I also tried moving the optimized code into my own C source. The resulting performance is the same as with the object library.
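For reference, the comparison I am making can be sketched roughly as below. This is a minimal, portable sketch under stated assumptions, not the TI source: `fir_sym_ref`, `fir_fn` and `time_fir` are hypothetical names, the coefficient layout (nh+1 stored coefficients for a filter of 2*nh+1 taps) is my assumption about a symmetric FIR, and `clock()` stands in for whatever cycle counter is used on the target.

```c
#include <assert.h>
#include <stdint.h>
#include <time.h>

/* Hypothetical reference symmetric FIR (illustration only, not TI code).
 * Assumed layout: the filter has 2*nh+1 taps, h[] holds the first nh+1
 * of them, and x[] holds at least nr + 2*nh input samples. */
void fir_sym_ref(const int16_t *x, const int16_t *h, int16_t *r,
                 int nh, int nr, int s)
{
    assert(nh >= 0 && nr >= 0 && s >= 0);

    /* Rounding constant added before the right shift by s. */
    int64_t round = (s > 0) ? ((int64_t)1 << (s - 1)) : 0;

    for (int j = 0; j < nr; j++) {
        int64_t acc = round;
        /* Samples that share a coefficient are summed first,
         * so each symmetric pair costs one multiply. */
        for (int i = 0; i < nh; i++)
            acc += (int64_t)(x[j + i] + x[j + 2 * nh - i]) * h[i];
        acc += (int64_t)x[j + nh] * h[nh];   /* center tap */
        r[j] = (int16_t)(acc >> s);
    }
}

/* Common prototype so different implementations can be timed the same way. */
typedef void (*fir_fn)(const int16_t *, const int16_t *, int16_t *,
                       int, int, int);

/* Runs fn reps times and returns the elapsed CPU time in seconds. */
double time_fir(fir_fn fn, const int16_t *x, const int16_t *h, int16_t *r,
                int nh, int nr, int s, int reps)
{
    clock_t t0 = clock();
    for (int k = 0; k < reps; k++)
        fn(x, h, r, nh, nr, s);
    return (double)(clock() - t0) / CLOCKS_PER_SEC;
}
```

Each implementation is passed to `time_fir` with the same buffers and the same `reps`, and the two returned times are compared. Timing the library routine this way assumes its prototype is compatible with `fir_fn`.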
The project is built using CCS 5.4 and BIOS 6.35; the hardware platform is TMDSEVM6670.
All data resides in L2SRAM; L1P and L1D are configured as cache. C optimization is on (-o3).
I get the same results with the function DSP_mat_trans when the data resides in DDR3 memory.
What could be the reason?