LC4357 CPU performance

Hi

I am measuring the CPU execution time of my motor control application. I have two interrupts: the ADC EOC interrupt and the RTI0 1 ms timer interrupt. I use the PMU for the time measurement.
The ADC EOC interrupt fires periodically in sync with my PWM (say 30 kHz) and preempts the 1 ms interrupt.
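
To make the measurement setup concrete, here is a minimal sketch of how the per-ISR cycle counts could be reduced to a running minimum and maximum. The names `isr_timing_t`, `isr_timing_record` and `cycles_to_us` are hypothetical, the `start`/`end` values stand for whatever raw PMU cycle-counter readings are taken at ISR entry and exit, and the 300 MHz CPU clock in the comment is an assumption, not something stated in this thread.

```c
#include <stdint.h>

/* Running min/max of ISR execution time, in CPU cycles (hypothetical helper). */
typedef struct {
    uint32_t min_cycles;
    uint32_t max_cycles;
    uint32_t samples;
} isr_timing_t;

void isr_timing_reset(isr_timing_t *t)
{
    t->min_cycles = UINT32_MAX;
    t->max_cycles = 0u;
    t->samples    = 0u;
}

/* start and end are raw cycle-counter readings taken at ISR entry and exit.
 * Unsigned subtraction stays correct across a single 32-bit counter wrap. */
void isr_timing_record(isr_timing_t *t, uint32_t start, uint32_t end)
{
    uint32_t elapsed = end - start;

    if (elapsed < t->min_cycles) {
        t->min_cycles = elapsed;
    }
    if (elapsed > t->max_cycles) {
        t->max_cycles = elapsed;
    }
    t->samples++;
}

/* Convert cycles to microseconds for a CPU clock given in MHz
 * (for example 300 for an assumed 300 MHz core clock). */
double cycles_to_us(uint32_t cycles, uint32_t cpu_mhz)
{
    return (double)cycles / (double)cpu_mhz;
}
```

Recording every ISR call this way, and resetting the statistics right after each 1 ms interrupt, would separate the "first call after the timer ISR" cost from the steady-state cost.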

Caching is enabled. MPU settings taken from default LC4357 (with FreeRTOS) HALCoGen 4.01.

What I do observe:
The first ISR calculation in my ADC EOC interrupt takes longer than the following ones. The second execution is a bit faster, and the third and fourth are faster still. Then the time reaches a minimum. Let's say the longest was 26 µs and the minimum is 21 µs (for the same calculations).

Then, after 1 ms, my timer interrupt occurs. After the timer ISR has run, the ADC EOC calculations take longer again, so the pattern described above restarts every 1 ms. If I change the 1 ms to 2 ms, the pattern repeats every 2 ms. So it has something to do with this other task.

The very first calculations (about 4 ISR calls after PWM enable) take even longer (starting at 31 µs instead of 26 µs).


All code is executed from RAM.

For my application I have to design for the worst case (longest duration), so the very first calculation defines my maximum PWM frequency.
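
As a rough worked example of that constraint, the sketch below relates a worst-case ISR time to the highest usable ISR rate and to the CPU load at a given rate. The function names and the `max_load` parameter (the fraction of each PWM period the ISR is allowed to consume) are assumptions made for illustration.

```c
/* Highest ISR rate in Hz such that the worst-case ISR time uses at most
 * max_load (0..1] of each period. Returns 0 for invalid arguments. */
double max_isr_rate_hz(double worst_case_us, double max_load)
{
    if (worst_case_us <= 0.0 || max_load <= 0.0 || max_load > 1.0) {
        return 0.0;
    }
    return max_load * 1.0e6 / worst_case_us;
}

/* CPU load (0..1) of an ISR that fires at rate_hz and takes isr_us per call. */
double isr_load(double isr_us, double rate_hz)
{
    return isr_us * 1.0e-6 * rate_hz;
}
```

With the numbers from this post, a 31 µs worst case at 30 kHz comes to about 93% load, while the 21 µs steady state would be about 63%. Even with `max_load` set to 1.0, 31 µs caps the rate at roughly 32.26 kHz, which leaves no time for the 1 ms task.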

Comparison with other targets:

  • 28335: This code executes fully deterministically (every ISR call takes the same amount of time)
  • LS1227: The code execution time varies a little
  • LC4357: The code execution time varies extremely.

Questions:
What can cause this effect? Is it a wrong configuration, or is this the normal effect of the cache?
How can I get rid of this?

What is the best configuration for best CPU performance?

Thank you for clarification!

Roger

  • Roger,

    This effect is caused by the cache.
    ARM Cortex-R cache architecture does not provide a cache locking mechanism.

    On the first run of this ISR, you will have cache misses, and the code will then be cached.
    On subsequent calls to this ISR, the code may still be in the cache, making execution faster.
    But the cache controller may evict part of your ISR (to cache something else). As a result, a later call may take longer to execute.

    I don't see how to minimize this effect (aside from disabling the cache).

  • Also note that the CPU's speculative prefetch is another factor in the variation of execution time. Even with the TCM-based R4 devices, it is common to see some execution jitter that depends on the results of the last few dozen branches.

    The C28x device in comparison does not have branch speculation or cache; this improves determinism but reduces performance.  Each of these devices is built with different compromises and you need to consider what is most critical to your application.

    Regards,

    Karl

  • Thank you Jean-Marc and Karl for this information!

    OK, so I have to deal with those facts...

    Would it be possible to put the code of my fast ISR in a "normal cacheable" memory area and all the rest of the code in a non-cacheable area? The slower code would then run much slower, but the ISR code would not be evicted from the cache periodically (as long as it is no more than 32 KB).

    Could this be successful? Can I configure this on the MPU tab sheet of HALCoGen? (in conjunction with linker cmd file)

    Roger
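
The eviction pattern described in this thread, and the idea of keeping the slow code out of the cache, can be illustrated with a toy model. This is only a sketch under strong assumptions: a tiny direct-mapped instruction cache with made-up sizes, line-aligned straight-line code regions, and no data cache, prefetch or branch prediction. The real Cortex-R caches are set-associative and behave differently in detail, so the model shows the mechanism, not actual timings. All names here are hypothetical.

```c
#include <stdbool.h>
#include <stdint.h>

#define LINE_BYTES 32u
#define NUM_LINES  64u   /* toy cache: 64 lines x 32 bytes = 2 KB */

typedef struct {
    bool     valid[NUM_LINES];
    uint32_t tag[NUM_LINES];
} toy_cache_t;

void cache_reset(toy_cache_t *c)
{
    for (uint32_t i = 0u; i < NUM_LINES; i++) {
        c->valid[i] = false;
        c->tag[i]   = 0u;
    }
}

/* Fetch one address. Returns true on a miss and fills the line,
 * replacing whatever was stored at that index before. */
bool cache_fetch(toy_cache_t *c, uint32_t addr)
{
    uint32_t line  = addr / LINE_BYTES;
    uint32_t index = line % NUM_LINES;
    uint32_t tag   = line / NUM_LINES;

    if (c->valid[index] && c->tag[index] == tag) {
        return false;
    }
    c->valid[index] = true;
    c->tag[index]   = tag;
    return true;
}

/* Execute a straight-line, line-aligned code region once and return the
 * number of line fetches that went to memory. A non-cacheable region
 * always goes to memory and never allocates a cache line. */
uint32_t run_region(toy_cache_t *c, uint32_t base, uint32_t size, bool cacheable)
{
    uint32_t misses = 0u;

    for (uint32_t addr = base; addr < base + size; addr += LINE_BYTES) {
        if (!cacheable) {
            misses++;
        } else if (cache_fetch(c, addr)) {
            misses++;
        }
    }
    return misses;
}
```

In this model, a 1 KB "ISR" region misses on every line the first time and on none the second time. Running a 2 KB cacheable "background" region that maps onto the same indices makes the next ISR call miss on every line again, which matches the pattern that restarts after each 1 ms interrupt. If the background region is marked non-cacheable instead, it pays a memory access for every line but the ISR lines stay resident. Whether the same holds on the real device depends on the actual MPU and linker configuration.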