Compiler/TMS320F28377D: Intrinsic wrapped in inline function prevents optimisation

IainRist

Part Number: TMS320F28377D

Tool/software: TI C/C++ Compiler

For portability reasons I have wrapped some intrinsics in inline functions, e.g. static inline int16_t min_i16(int16_t a, int16_t b) { return __min(a, b); }. If I code a tight loop using this inline function, the compiler does not produce optimal code.

The code (main.c):

#include <stdint.h>

int16_t array_min_explicit( int16_t *a, int num )
{
    int16_t m = *a++;
#pragma UNROLL(1)
    do
    {
        m = __min( m, *a++ );
    } while ( num-- );

    return m;
}

inline int16_t min_i16( int16_t a, int16_t b ) { return __min( a, b ); }

int16_t array_min_inline( int16_t *a, int num )
{
    int16_t m = *a++;
#pragma UNROLL(1)
    do
    {
        m = min_i16( m, *a++ );
    } while ( num-- );

    return m;
}
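A side note on the loop shape shared by both functions: because the condition of the do loop is while ( num-- ), the body runs num + 1 times, and counting the initial load, num + 2 elements are read from the array. The sketch below is a host-side model only, meant to make that count easy to check. The names ref_min_i16 and array_min_model are hypothetical, and the ternary is a stand-in for the __min intrinsic, which is assumed to be unavailable off-target.

```c
#include <stdint.h>

/* Hypothetical host-side stand-in for the TI __min intrinsic. */
static int16_t ref_min_i16(int16_t a, int16_t b)
{
    return (a < b) ? a : b;
}

/* Same loop shape as array_min_explicit, plus a count of elements read. */
static int16_t array_min_model(const int16_t *a, int num, int *elements_read)
{
    const int16_t *start = a;
    int16_t m = *a++;              /* first element is read before the loop */

    do
    {
        m = ref_min_i16(m, *a++);  /* one more element per iteration */
    } while (num--);               /* post-decrement: body runs num + 1 times */

    *elements_read = (int)(a - start);
    return m;
}
```

With a five-element array and num = 3, the model reads all five elements. With num = 0 it still reads two.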

The compiler command line:

"C:/ti/ccsv6/tools/compiler/ti-cgt-c2000_17.9.0.STS/bin/cl2000" -v28 -ml -mt --cla_support=cla1 --float_support=fpu32 --tmu_support=tmu0 --vcu_support=vcu2 -O4 --opt_for_speed=5 --fp_mode=relaxed --include_path="C:/ti/ccsv6/tools/compiler/ti-cgt-c2000_17.9.0.STS/include" --symdebug:none --diag_warning=225 --diag_wrap=off --display_error_number -k --preproc_with_compile --preproc_dependency="main.d" "../main.c"

The assembly output of interest:

_array_min_inline:
        MOVZ      AR6,AL
        MOV       AL,*XAR4++
$C$L1:
        MOV       AH,*XAR4++
        MIN       AL,AH
        BANZ      $C$L1,AR6--
        ; branchcc occurs
        LRETR

_array_min_explicit:
        MOVZ      AR5,AL
        MOV       AL,*XAR4++
        RPT       AR5
||      MIN       AL,*XAR4++
        LRETR

The difference between the two listings is that the explicit version uses a fast RPT loop, while the inline version falls back to a slower BANZ loop. Am I doing something wrong? The compiler documentation (SPRU514O, section 2.11, first line) suggests that inline functions are fully expanded prior to any optimisation.

over 7 years ago

George Mock over 7 years ago

Thank you for informing us about this issue and for submitting a complete test case. I can reproduce it. I filed CODEGEN-4445 in the SDOWP system to have this addressed. The entry is not a bug report against the compiler, but a request for a performance improvement. You are welcome to follow it with the SDOWP link below in my signature.

Thanks and regards,

-George
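Until CODEGEN-4445 is addressed, one possible interim approach is to do the portability wrapping with a macro rather than an inline function, so that on the TI compiler the loop body contains the __min intrinsic directly, as in array_min_explicit. This is only a sketch and is not confirmed anywhere in this thread: whether it actually recovers the RPT loop has not been tested here. The names MIN_I16 and array_min_macro are hypothetical, the check on __TMS320C2000__ assumes that is the macro the C2000 compiler predefines, and the fallback branch is an assumed plain-C substitute for other compilers.

```c
#include <stdint.h>

#if defined(__TMS320C2000__)
/* Assumed TI C2000 build: expand straight to the intrinsic. */
#define MIN_I16(a, b) ((int16_t)__min((a), (b)))
#else
/* Hypothetical fallback for other compilers.
   Note: it evaluates its arguments more than once. */
#define MIN_I16(a, b) ((int16_t)(((a) < (b)) ? (a) : (b)))
#endif

int16_t array_min_macro(int16_t *a, int num)
{
    int16_t m = *a++;

#if defined(__TMS320C2000__)
#pragma UNROLL(1)
#endif
    do
    {
        int16_t next = *a++;   /* read once, so the fallback macro stays safe */
        m = MIN_I16(m, next);
    } while (num--);

    return m;
}
```

The temporary next is there only because the fallback macro repeats its arguments. On the TI side the macro expands to a single __min call, so the temporary should not be needed there.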
