Hello,
I'm currently using the OMAP3530 with CCS version 4.2.1 and DSP/BIOS 5.41.04.18. I'm running an algorithm on the resident C64x+ core and I'm encountering a puzzling problem involving the L1D and L1P caches. All of my data structures are in internal memory, split between L1DSRAM and IRAM, and the code is in DDR. However, when I move the code into IRAM, the algorithm's performance, measured in cycles needed to complete, gets noticeably worse. Why would I take a performance hit by moving the code into internal memory? I would have expected performance to improve.
Thanks,
Len