Hello,
I've written an assembly function for the C64x+ core on the TMS320C6474 EVM board, but its execution time is not what I expected. Counting the cycles of the function by hand gives a theoretical execution time of 19 ms, yet I measure 37 ms on the board. The only explanation I have found for the difference is memory stalls, especially since the function works on data located in DDR and L2. I'm wondering whether these results are normal, given that the cache is enabled. Are memory stalls the only factor that could cause such a difference?
Note:
- The data is carefully aligned and laid out so that the algorithm processes adjacent data (no far jumps in DDR or L2).
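
To put numbers on the gap, here is a minimal sketch in C that converts the two times into cycle counts and shows how many cycles the hand count does not account for. The 1 GHz CPU clock used in the usage note is an assumption (substitute the frequency the board is actually configured for), and the function names are hypothetical helpers, not part of any TI library.

```c
#include <stdint.h>

/* Convert a duration in milliseconds to CPU cycles at the given clock.
 * cpu_hz is an assumption supplied by the caller (e.g. 1000000000 for 1 GHz). */
static uint64_t ms_to_cycles(uint64_t time_ms, uint64_t cpu_hz)
{
    return time_ms * (cpu_hz / 1000u);
}

/* Cycles spent beyond the theoretical count (stalls or other overhead).
 * Returns 0 if the measured time is not longer than the expected time. */
static uint64_t extra_cycles(uint64_t expected_ms, uint64_t measured_ms,
                             uint64_t cpu_hz)
{
    if (measured_ms <= expected_ms) {
        return 0;
    }
    return ms_to_cycles(measured_ms, cpu_hz) - ms_to_cycles(expected_ms, cpu_hz);
}

/* Share of the measured time that the theoretical count does not explain,
 * as a whole percentage (rounded down). */
static unsigned overhead_percent(uint64_t expected_ms, uint64_t measured_ms)
{
    if (measured_ms == 0 || measured_ms <= expected_ms) {
        return 0;
    }
    return (unsigned)(((measured_ms - expected_ms) * 100u) / measured_ms);
}
```

With the figures above and the assumed 1 GHz clock, `extra_cycles(19, 37, 1000000000)` gives 18,000,000 cycles, and `overhead_percent(19, 37)` gives 48, so roughly half of the measured time is unaccounted for by the hand count.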