I'm using the Tesla C64x+ DSP on OMAP4 and I can't get the pipelined loop optimization described in spra666.pdf, so performance is very poor. If my package.bld includes:
libAttrs: {
    copts: "-on2 -o3 -mt -s -al -mw -mv6400+",
    defs: ""
}
I start to see SPLOOP instructions in the generated assembly, as well as "Loop will be splooped" in the LST file corresponding to my code. But the linker then fails, as it is confused between Tesla and -mv6400+.
If I leave out this last option (-mv6400+), there is no pipelined loop optimization in the code. I can link and execute on the device, but performance is poor. Here is the full thread where the problem comes from:
http://e2e.ti.com/support/dsp/tms320c6000_high_performance_dsps/f/112/p/345768/1211817.aspx
Question: Is there a magic option to enable SPLOOP for the Tesla C64x+ DSP?