I am benchmarking MATHLIB and my results are inconsistent with those stated in MATHLIB_c66x_TestReport.html.
Test Conditions:
- C6000 Code Generation Tools 7.3.1
- C6608 Device Cycle Approximate Simulator, Little Endian
- Project mathlib_c66x_3_0_2_0\packages\ti\mathlib\src\sinsp\c66\sindp_66_LE_ELF imported into CCS 5.03 for double-precision evaluation.
In my first test, every profiled cycle count displayed as 0 when using clock(). I am not sure what is wrong; perhaps some initialization is missing. I then used the TSC (time stamp counter) for profiling instead. The printed results are below.
[TMS320C66x_0] RTS: 217 cycles
[TMS320C66x_0] ASM: 180 cycles
[TMS320C66x_0] C: 317 cycles
[TMS320C66x_0] Inline: 297 cycles
[TMS320C66x_0] Vector: 306 cycles
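A TSC-based measurement of this kind can be sketched roughly as follows. This is a minimal sketch under stated assumptions, not the code that produced the numbers above: it assumes the TI compiler's c6x.h header (which declares the TSCL control register) and the predefined _TMS320C6X macro, and it falls back to clock() ticks when built off-target. The helper names tsc_start, tsc_read, tsc_elapsed, tsc_overhead, profile_per_call and square_stub are hypothetical and only for illustration.

```c
#include <assert.h>
#include <stddef.h>

#ifdef _TMS320C6X
#include <c6x.h>   /* TI header declaring the TSCL/TSCH control registers */
/* The counter is halted after reset; a write to TSCL starts it. */
static void tsc_start(void) { TSCL = 0; }
static unsigned int tsc_read(void) { return TSCL; }
#else
#include <time.h>
/* Host fallback (assumption): clock() ticks stand in for CPU cycles
   so the sketch also builds off-target. */
static void tsc_start(void) { }
static unsigned int tsc_read(void) { return (unsigned int)clock(); }
#endif

/* Unsigned subtraction stays correct across a single counter wrap. */
static unsigned int tsc_elapsed(unsigned int start, unsigned int stop)
{
    return stop - start;
}

/* Cost of two back-to-back counter reads, subtracted from each measurement. */
static unsigned int tsc_overhead(void)
{
    unsigned int t0 = tsc_read();
    unsigned int t1 = tsc_read();
    return tsc_elapsed(t0, t1);
}

/* Placeholder workload (assumption): swap in the routine being benchmarked. */
static double square_stub(double x) { return x * x; }

/* Hypothetical helper: average cycles per call of fn over n inputs. */
static unsigned int profile_per_call(double (*fn)(double),
                                     const double *in, double *out,
                                     unsigned int n)
{
    unsigned int overhead, t0, t1, total, i;

    assert(fn != NULL && in != NULL && out != NULL);
    if (n == 0)
        return 0;

    tsc_start();
    overhead = tsc_overhead();

    t0 = tsc_read();
    for (i = 0; i < n; i++)
        out[i] = fn(in[i]);
    t1 = tsc_read();

    total = tsc_elapsed(t0, t1);
    total = (total > overhead) ? total - overhead : 0;
    return total / n;   /* loop overhead is included in this average */
}
```

Calling profile_per_call(square_stub, in, out, n) with the function under test in place of square_stub returns an average per call. The loop overhead is part of that average, so a figure measured this way is not directly comparable to one measured around a single call.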
The result in MATHLIB_c66x_TestReport.html:
Problems
1. In my test, "ASM" is the fastest; however, "Inline" is the fastest in MATHLIB_c66x_TestReport.html.
2. "Vector" is very slow: 306 cycles, versus 217 for "RTS" and 180 for "ASM".
3. MATHLIB_c66x_TestReport.html states that "TCI6608 Device Functional Simulator, Little Endian" was used for testing. Why not use the Device Cycle Approximate Simulator since
I also ran the evaluation on an EVM and got similar results.
Can anyone help? Thanks.
Boll


