Tool/software: TI C/C++ Compiler
I implemented the gain function below on the C6657 EVM.
However, the optimization does not seem to perform as well as I expected.
Question 1)
In the .asm file (kept by the -k option), is it correct to read "ii" as the number of cycles per loop iteration
and "Loop Unroll Multiple" as the number of times the loop was unrolled?
If "ii = 20" and "Loop Unroll Multiple = 2x", does that mean one iteration of the original loop costs 10 cycles (20 / 2)?
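To make the arithmetic in Question 1 concrete, here is a minimal sketch of how I am reading the numbers. The helper name cyclesPerSample is hypothetical (it is not something the compiler emits), and the formula is exactly the assumption I am asking about, not a confirmed rule.

```cpp
// Hypothetical helper: encodes only the assumption
// "cycles per sample = ii / loop unroll multiple".
double cyclesPerSample(int ii, int unrollMultiple)
{
    return static_cast<double>(ii) / unrollMultiple;
}
```

With the values from the .asm output, cyclesPerSample(20, 2) would be the cost I attribute to case A and cyclesPerSample(4, 8) the cost I attribute to case D.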
Question 2)
If the answer to Question 1 is yes, then as the gain function below shows,
the optimization result changes completely depending on how the loop is written, from 10 cycles to 0.5 cycles per audio sample.
I would like to write it as "case A", but the optimization does not perform well (10 cycles/sample).
"Case D" is the best, but it adds the overhead of copying the "pfData" array.
In my experience coding for the C6747, "A" produced high-performance code.
Would it be possible to optimize "case A" by using a #pragma or a compiler optimization option?
void CCompressor::Calc( float pfData[256] )
{
    int i;
    float pfTemp[256];
    float fPreGain = m_fPreGain; // member variable
    memcpy( pfTemp, pfData, sizeof(float)*SHIFT_SIZE );
    // gain function
    for (i = 0; i < 256; i++){
        pfData[i] *= m_fPreGain;    // A ii = 20, Loop Unroll = 2x --> 10 cycle/sample
        // pfData[i] *= fPreGain;   // B ii = 8, Loop Unroll = 8x --> 1 cycle/sample
        // pfTemp[i] *= m_fPreGain; // C ii = 2, Loop Unroll = 2x --> 1 cycle/sample
        // pfTemp[i] *= fPreGain;   // D ii = 4, Loop Unroll = 8x --> 0.5 cycle/sample
    }
    // another processing
    ;
    ;
}
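One variation I have not measured yet, sketched under assumptions: if case A is slow because the compiler must assume that a store to pfData[i] could modify m_fPreGain, and therefore reloads the member on every iteration, then passing the gain by value into a small helper should have the same effect as case B while keeping the call site simple. The helper name applyGain and its count parameter are hypothetical, and the aliasing explanation is my guess, not something the .asm output confirms.

```cpp
#include <cstddef>

// Hypothetical helper (not part of the original class).
// 'gain' is passed by value, so it is read from the member once at the
// call site and cannot be changed by the stores to data[] inside the loop.
void applyGain(float* data, float gain, std::size_t count)
{
    for (std::size_t i = 0; i < count; ++i) {
        data[i] *= gain;
    }
}
```

Inside Calc() the call would be applyGain(pfData, m_fPreGain, 256). This only targets the difference between cases A and B; it does not explain the remaining gap between cases B and D.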