We were benchmarking our code on the TI AM2634 control card when we found that the first time it executes, it runs much slower (it takes more than 2x the CPU cycles). To illustrate the problem, we time 64 NOP instructions with the CPU cycle counter, as shown below. The code sits in a 1 ms timer ISR (the only ISR in the system), so it runs periodically, and all code runs from RAM on one of the AM2634 cores. The first time the ISR executes, CycleNum = 150. The second time, CycleNum drops to a lower number. Only from the 3rd or 4th execution onward does CycleNum reach 64, and it then stays constant at 64.
What is the reason for this behavior? Is it expected? Is it related to CPU instruction caching, or did we miss something in our configuration of the CPU core? Please explain. This is a big problem for us because our control loop must fit within a very tight execution time budget, and code that runs more than 2x slower the first few times is not acceptable.
Below is the code we use to produce the issue:
#include <stdint.h>
#include <kernel/dpl/CycleCounterP.h> /* header path as in the TI MCU+ SDK; adjust if your SDK differs */

uint32_t CycleNum;
uint32_t t1, t2, t3;

void isr_1ms_task(void)
{
    CycleCounterP_reset();
    t1 = CycleCounterP_getCount32();

    /* Workload under test: 64 NOP instructions */
    __asm(" NOP");__asm(" NOP");__asm(" NOP");__asm(" NOP");__asm(" NOP");__asm(" NOP");__asm(" NOP");__asm(" NOP");__asm(" NOP");__asm(" NOP");__asm(" NOP");__asm(" NOP");__asm(" NOP");__asm(" NOP");__asm(" NOP");__asm(" NOP");
    __asm(" NOP");__asm(" NOP");__asm(" NOP");__asm(" NOP");__asm(" NOP");__asm(" NOP");__asm(" NOP");__asm(" NOP");__asm(" NOP");__asm(" NOP");__asm(" NOP");__asm(" NOP");__asm(" NOP");__asm(" NOP");__asm(" NOP");__asm(" NOP");
    __asm(" NOP");__asm(" NOP");__asm(" NOP");__asm(" NOP");__asm(" NOP");__asm(" NOP");__asm(" NOP");__asm(" NOP");__asm(" NOP");__asm(" NOP");__asm(" NOP");__asm(" NOP");__asm(" NOP");__asm(" NOP");__asm(" NOP");__asm(" NOP");
    __asm(" NOP");__asm(" NOP");__asm(" NOP");__asm(" NOP");__asm(" NOP");__asm(" NOP");__asm(" NOP");__asm(" NOP");__asm(" NOP");__asm(" NOP");__asm(" NOP");__asm(" NOP");__asm(" NOP");__asm(" NOP");__asm(" NOP");__asm(" NOP");

    t2 = CycleCounterP_getCount32();
    t3 = CycleCounterP_getCount32();

    /* (t3 - t2) estimates the cost of one counter read and is subtracted out */
    CycleNum = t2 - t1 - (t3 - t2) + 1;
}
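
To make the arithmetic behind CycleNum easier to check, here is a minimal host-side sketch of the same three-read measurement. It assumes a simulated counter in which every read costs a fixed number of cycles; the names sim_now, sim_get_count, sim_run, measure and the SIM_READ_COST value are hypothetical and are not part of the TI SDK. The + 1 correction from our ISR is specific to the target hardware and is deliberately not modeled here.

```c
#include <stdint.h>

/* Hypothetical stand-in for the hardware cycle counter, so the arithmetic
 * can be checked on a host PC. Not part of the TI SDK. */
#define SIM_READ_COST 7u /* assumed cost of one counter read, in cycles */

static uint32_t sim_now = 0u;

/* Returns the current simulated count, then charges the read cost. */
static uint32_t sim_get_count(void)
{
    uint32_t value = sim_now;
    sim_now += SIM_READ_COST;
    return value;
}

/* Simulates a block of code that takes `cycles` cycles to execute. */
static void sim_run(uint32_t cycles)
{
    sim_now += cycles;
}

/* Same structure as the ISR: two reads around the workload, then a third
 * back-to-back read whose delta (t3 - t2) estimates the read overhead. */
static uint32_t measure(uint32_t workload_cycles)
{
    uint32_t t1 = sim_get_count();
    sim_run(workload_cycles);
    uint32_t t2 = sim_get_count();
    uint32_t t3 = sim_get_count();

    /* Unsigned subtraction stays correct across a 32-bit counter wrap. */
    return (t2 - t1) - (t3 - t2);
}
```

Under this model the read overhead cancels exactly, so measure() returns the workload length regardless of SIM_READ_COST. That suggests the extra cycles we see in the first few ISR runs are not produced by the subtraction itself, though the model says nothing about caches or other hardware effects.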