Tool/software: TI C/C++ Compiler
Hi,
I am running into a problem executing benchmarks on the AM57 DSP. The code compiles and runs correctly, but I seem to be getting only a fraction of the performance I expected. I suspect my compiler is not set up properly to optimize my code. The competing solution on an A9 runs much faster, which in theory should not be the case.
Unfortunately, I am not sure which compiler switches to flip or where to find them.
Can you help? I am hoping to get 32 MACs per cycle, but it appears that is not happening.
Best Regards, Blake