Hi Folks,
I am starting to get my feet wet with developing code for the 674x DSP core. I am using the TI816x device and, as such, I am working with the xdc tools and cgt packages that are included with the ezsdk. I have been working through the RTSC tutorial and trying to convert the samples to create ELF binaries (as required by the syslink/sysbios system on the TI816x device). In the process I have found that the same "Hello World" sample app in lesson 5 takes a different number of cycles to execute in the simulator depending on whether it is compiled to COFF or to ELF (with the ELF taking more cycles). This is without whole-program optimization. Here are the results (ELF build first, then COFF):
$ /home/bj/ti-ezsdk_dm816x-evm_5_01_00_77/xdctools_3_20_08_88/packages/ti/platforms/sim64Pxx/Linux/kelvin prog.xe674
simulating Joule FP ISA
Hello World
Simulation done:
Total cycles: 4391
Core cycles (excl. stalls): 4390 ( 99.98%)
Nop cycles: 1977 ( 45.02%)
Stall cycles and overlapped stall cycles
Total stall cycles: 1 ( 0.02%)
XP : 1 ( 0.02%)
[snip]
$ /home/bj/ti-ezsdk_dm816x-evm_5_01_00_77/xdctools_3_20_08_88/packages/ti/platforms/sim64Pxx/Linux/kelvin prog.x674
simulating Joule FP ISA
Hello World
Simulation done:
Total cycles: 3480
Core cycles (excl. stalls): 3455 ( 99.28%)
Nop cycles: 1471 ( 42.27%)
Stall cycles and overlapped stall cycles
Total stall cycles: 25 ( 0.72%)
XP : 25 ( 0.72%)
[snip]
That is a pretty significant overhead (roughly 26%: 4391 vs. 3480 total cycles) just for a different executable format. Does anyone have any idea why this would be? Would whole-program optimization make this go away? (I am still trying to figure out how to turn that on.)
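For reference, a quick sketch of the arithmetic behind that overhead figure, using only the totals printed by the two simulator runs above. The variable and function names are just illustrative, and I am assuming prog.xe674 is the ELF build and prog.x674 the COFF build, which matches the ELF run being the slower one.

```python
# Cycle counts copied from the two simulator runs above.
# Assumption: prog.xe674 is the ELF build, prog.x674 is the COFF build.
elf_total, elf_nops = 4391, 1977
coff_total, coff_nops = 3480, 1471


def overhead_percent(slow, fast):
    """Relative increase of `slow` over `fast`, in percent."""
    return (slow - fast) / fast * 100.0


extra_cycles = elf_total - coff_total   # additional cycles in the ELF run
extra_nops = elf_nops - coff_nops       # how many of those are NOP cycles

print(f"extra cycles in ELF run: {extra_cycles}")                                  # 911
print(f"overhead vs. COFF: {overhead_percent(elf_total, coff_total):.1f}%")        # 26.2%
print(f"extra NOP cycles: {extra_nops}")                                           # 506
print(f"NOP share of the extra cycles: {extra_nops / extra_cycles * 100.0:.1f}%")  # 55.5%
```

So a bit more than half of the extra cycles show up as NOP cycles, at least going by these two runs.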
Are there any suggestions as to how to make this difference go away?
TIA, B.J.