TL;DR: Why doesn't the TI compiler (e.g., cl2000 v22.6.1.LTS) automatically map standard math functions such as atan2f, sinf, and cosf, included via <cmath> or <math.h>, to the corresponding TMU instructions (e.g., ATANPUF32, SINPUF32, COSPUF32) when TMU support is explicitly enabled (--tmu_support=tmu1) and aggressive optimization is requested (-O4, --opt_for_speed=5, etc.)?
In the following illustrative case:
#include <cmath>
float test(float x, float y) {
float a = atan2(x, y);
float b = sin(a);
float c = cos(a);
return a + b + c;
}
The compiled output shows that the standard atan2f, sinf, and cosf functions are still called as external symbols, even though TMU and FPU64 support are enabled:
LCR #||atan2f||
...
LCR #||sinf||
...
LCR #||cosf||
Only when I manually override the standard functions with inline wrappers that explicitly call the intrinsics (__atan2, __sin, __cos) does the compiler generate efficient TMU-based code:
inline float atan2(float y, float x) { return __atan2(y, x); }
inline float sin(float x) { return __sin(x); }
inline float cos(float x) { return __cos(x); }
This results in the expected use of hardware instructions:
ATANPUF32 R0H,R0H
...
SINPUF32 R1H,R1H
COSPUF32 R2H,R2H
Given that TMU support is enabled, one would expect the compiler and standard library headers to detect the target architecture and transparently map these high-level math functions to their optimized, hardware-accelerated counterparts.
The build command is the following:
/opt/ti/ti-cgt-c2000_22.6.1.LTS/bin/cl2000 -c \
    -D=USE_20MHZ_XTAL -D=_FLASH -D=CPU1 -D=__TMS320C28XX__ \
    --issue_remarks --abi=eabi --tmu_support=tmu1 \
    --float_support=fpu64 --gen_opt_info=2 \
    -v28 -ml -O4 -op=3 --c_src_interlist --auto_inline \
    --verbose_diagnostics --advice:performance=all \
    --opt_for_speed=5 --preproc_with_compile --keep_asm \
    -I/opt/ti/ti-cgt-c2000_22.6.1.LTS/include -z main.obj -o exec.out
Shouldn’t the TI standard C library (<math.h>) and its C++ wrapper (<cmath>) integrate target-specific logic (e.g., via conditional macros or inline definitions) to redirect standard math functions to hardware intrinsics when they are available? Why is this redirection left entirely to the user? Is there a compiler flag or a TI-provided configuration that enables automatic mapping to intrinsics in TMU-enabled builds?
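To make the "conditional macros or inline definitions" idea concrete, here is a minimal sketch of the kind of redirection meant. It assumes the compiler predefines __TMS320C28XX_TMU__ whenever --tmu_support is given (my reading of the compiler guide, not verified against the shipped headers). The fastmath namespace and the angle_sum function are hypothetical names used only for illustration. On any non-TMU build the wrappers simply fall back to the standard library, so the same source compiles on a host toolchain.

```cpp
#include <cmath>

// Hypothetical namespace, not part of the TI headers.
namespace fastmath {

#if defined(__TMS320C28XX_TMU__)
// Assumption: this macro is predefined when --tmu_support is enabled.
// Forward to the compiler intrinsics so that TMU instructions are emitted.
inline float atan2(float y, float x) { return __atan2(y, x); }
inline float sin(float x)            { return __sin(x); }
inline float cos(float x)            { return __cos(x); }
#else
// No TMU: fall back to the ordinary run-time library routines.
inline float atan2(float y, float x) { return std::atan2(y, x); }
inline float sin(float x)            { return std::sin(x); }
inline float cos(float x)            { return std::cos(x); }
#endif

} // namespace fastmath

// Same computation as the test function above, routed through the wrappers.
float angle_sum(float y, float x) {
    float a = fastmath::atan2(y, x);
    float b = fastmath::sin(a);
    float c = fastmath::cos(a);
    return a + b + c;
}
```

Keeping the wrappers in their own namespace avoids redefining the standard names, and the argument order (y, x) matches the standard atan2, so callers get the same result on either path up to the accuracy difference of the TMU instructions.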