The compiler generates surprisingly slow code for the example below.
Given this struct and function definition:
typedef struct _S32_FP
{
    S32 value;
    S32 fractionBits;
} S32_FP;

static inline S32_FP S32_FP_SCALE_UP(S32_FP a)
{
    S32_FP result;
    /* _norm is the C6000 intrinsic that returns the number of redundant sign bits */
    S32 shiftLeft = _norm(a.value);
    result.fractionBits = a.fractionBits + shiftLeft;
    result.value = a.value << shiftLeft;
    return result;
}
This code is much slower:
response.denominator = S32_FP_SCALE_UP(response.denominator);
response.numerator = S32_FP_SCALE_UP(response.numerator);
...than this equivalent, manually inlined, version:
const S32 shiftLeftDenominator = _norm(response.denominator.value);
response.denominator.fractionBits += shiftLeftDenominator;
response.denominator.value <<= shiftLeftDenominator;
const S32 shiftLeftNumerator = _norm(response.numerator.value);
response.numerator.fractionBits += shiftLeftNumerator;
response.numerator.value <<= shiftLeftNumerator;
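To make the "equivalent" claim concrete, here is a minimal host-side sketch that checks both versions produce the same results. It is only an illustration under stated assumptions: S32 is taken to be a 32-bit signed integer, norm_standin is a hypothetical portable replacement for the _norm intrinsic (counting redundant sign bits), and shl, scale_up_in_place and versions_agree are helper names invented for this example. It says nothing about the code cl6x actually generates.

```c
#include <assert.h>
#include <stdint.h>

typedef int32_t S32; /* assumption: S32 is a 32-bit signed integer */

typedef struct _S32_FP
{
    S32 value;
    S32 fractionBits;
} S32_FP;

/* Hypothetical host-side stand-in for the C6000 _norm intrinsic:
   returns the number of redundant sign bits in x (0..31). */
static S32 norm_standin(S32 x)
{
    /* Fold negative inputs onto non-negative ones, then count the
       zero bits directly below the sign bit. */
    uint32_t u = (x < 0) ? ~(uint32_t)x : (uint32_t)x;
    S32 n = 0;
    int bit;
    for (bit = 30; bit >= 0 && ((u >> bit) & 1u) == 0u; --bit)
        ++n;
    return n;
}

/* Shift through uint32_t so negative values do not hit undefined
   behaviour in standard C; the conversion back assumes two's complement. */
static S32 shl(S32 v, S32 s)
{
    return (S32)((uint32_t)v << s);
}

/* Function version, as in the post but using the stand-ins above. */
static inline S32_FP S32_FP_SCALE_UP(S32_FP a)
{
    S32_FP result;
    S32 shiftLeft = norm_standin(a.value);
    result.fractionBits = a.fractionBits + shiftLeft;
    result.value = shl(a.value, shiftLeft);
    return result;
}

/* Manually inlined version, updating the struct in place. */
static void scale_up_in_place(S32_FP *p)
{
    const S32 shiftLeft = norm_standin(p->value);
    p->fractionBits += shiftLeft;
    p->value = shl(p->value, shiftLeft);
}

/* Returns 1 if both versions agree on a few sample inputs, 0 otherwise. */
static int versions_agree(void)
{
    static const S32 samples[] = {
        0, 1, -1, 3, -3, 12345, -12345, 0x40000000, INT32_MAX, INT32_MIN
    };
    const int count = (int)(sizeof samples / sizeof samples[0]);
    int i;
    for (i = 0; i < count; ++i)
    {
        S32_FP viaCall = { samples[i], 8 };
        S32_FP inPlace = { samples[i], 8 };
        viaCall = S32_FP_SCALE_UP(viaCall); /* struct passed and returned by value */
        scale_up_in_place(&inPlace);        /* fields updated directly */
        if (viaCall.value != inPlace.value ||
            viaCall.fractionBits != inPlace.fractionBits)
            return 0;
    }
    return 1;
}
```

Calling versions_agree() from a small test main is enough to run the check. Since the results match, any speed difference on the target comes from how the compiler handles the by-value struct copies, not from the arithmetic itself.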
The compiler appears to have trouble optimizing the struct assignments.
The compiler options used are:
"C:\ti\ccsv5\tools\compiler\c6000/bin/cl6x" -fr <???> -c -mv64+ --diag_warning=225 --interrupt_threshold=70000 --abi=coffabi -O3 --auto_inline=10000