Dear TI team,
MCU+ SDK 08.05 re-enabled support for nested interrupts. The release notes for 08.05 list MCUSDK-1016 as fixed:
MCUSDK-1016 | Semaphore does not function as expected when "post" call is present in multiple ISRs at different priorities | DPL | 7.3.0 onwards | AM64x, AM243x | Fixed |
Since SDK 08.00, nested interrupts have been disabled as a workaround for MCUSDK-1016:
MCUSDK-1016 | Semaphore does not function as expected when "post" call is present in multiple ISRs at different priorities | DPL | 7.3.0 onwards | AM64x, AM243x | Interrupt nesting should be disabled. SDK disables interrupt nesting by default. |
Unfortunately, there still seem to be issues with nested interrupts in MCU+ SDK 08.05. Our EtherCAT master (running on the AM64x R5F with MCU+ SDK) experienced timeouts because the cyclic task didn't run at the expected intervals, and our EtherCAT slave application inexplicably dropped back to SAFEOP.
We created some minimal test cases to debug the issue. Two ISRs that posted two separate semaphores to unblock two tasks didn't cause any problems, even if one ISR preempted the other. However, an interrupt that preempted the FreeRTOS tick interrupt caused severe problems. The test program that shows this consists of:
- 1 task executing a while loop with only a vTaskDelay(1)
- 1 task executing a while loop that blocks on a semaphore and then checks how long it's been (ClockP_getTimeUsec()) since the last run
- if it's been < 80% / > 120% of the expected period a warning is output
- 1 TimerP timer configured via SysConfig for cyclic operation with a 997µs (to make it drift versus the 1000µs tick period) cycle that posts the semaphore from its ISR
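For reference, the period check in the second task boils down to something like the following. This is a simplified, host-compilable sketch rather than our actual test code: `cycle_monitor_t` and `cycle_out_of_tolerance` are names made up for illustration, and on the target `now_us` would be the value of ClockP_getTimeUsec() taken right after the semaphore pend returns.

```c
#include <stdbool.h>
#include <stdint.h>

/* Illustrative type, not part of the MCU+ SDK. */
typedef struct {
    uint64_t last_us;   /* timestamp of the previous wake-up */
    bool     primed;    /* false until the first wake-up has been seen */
} cycle_monitor_t;

/* Returns true if the time since the previous wake-up is < 80% or > 120%
 * of expected_us. The first call only stores the timestamp.
 * Integer math is used so no floating point is needed in the task. */
static bool cycle_out_of_tolerance(cycle_monitor_t *m, uint64_t now_us,
                                   uint64_t expected_us)
{
    bool bad = false;

    if (m->primed) {
        uint64_t elapsed_us = now_us - m->last_us;
        bad = (elapsed_us * 100u < expected_us * 80u) ||
              (elapsed_us * 100u > expected_us * 120u);
    }
    m->last_us = now_us;
    m->primed = true;
    return bad;
}
```

With expected_us = 997, a wake-up that arrives about one full cycle late (elapsed time near 2 x 997 µs) trips the > 120% branch, and the unusually short interval that follows it trips the < 80% branch.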
With this test program I've seen two different failures:
- Often FreeRTOS would crash because it tried to schedule a task with task handle NULL, eventually causing an undefined instruction exception
- Sometimes the task blocking on the semaphore would wake up ~1 cycle (990-1005µs) too late, despite the processor being otherwise idle
We've implemented a trace feature within the MCU+ SDK that allows us to trace execution of HWIs, task switches and so on, and found that the issues always occurred when the tick interrupt got preempted by our TimerP interrupt.
The vPortTimerTickHandler in TI's R5F port calls into xTaskIncrementTick without disabling interrupts. Within xTaskIncrementTick, a lot of code that is obviously not intended to be preempted executes without entering a critical section. FreeRTOS seems underdocumented, and I couldn't find any explicit specification that states which functions must only be called with interrupts disabled, but e.g. the Cortex-M3/M4 ports, including the one for the M4 in MCU+ SDK, disable interrupts around xTaskIncrementTick.
After adding portDISABLE_INTERRUPTS() / portENABLE_INTERRUPTS() calls in vPortTimerTickHandler (mcu_plus_sdk_am64x_08_05_00_24\source\kernel\freertos\portable\TI_ARM_CLANG\ARM_CR5F\port.c) we haven't seen any issues in either the test program or our EtherCAT master application. Tests with our EtherCAT slave application are pending, as are further tests with more complex master applications.
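To make the shape of the change concrete, here is a rough sketch of the idea. It is not a copy of TI's port.c: the handler body is reduced to the essentials, and the macros, the xTaskIncrementTick stub and the `yield_required` flag are host-side stand-ins (assumptions) so the snippet compiles on a PC and shows the ordering only. In the real port the macros and xTaskIncrementTick come from the FreeRTOS headers.

```c
#include <stdbool.h>

/* --- Host-side stand-ins (assumptions, NOT the SDK's definitions) --- */
static bool irq_masked      = false;  /* models the CPU interrupt mask */
static bool tick_ran_masked = false;  /* mask state seen by the tick update */
static int  yield_required  = 0;      /* stands in for the port's yield flag */

#define portDISABLE_INTERRUPTS()  (irq_masked = true)
#define portENABLE_INTERRUPTS()   (irq_masked = false)

/* Stub for the kernel function: nonzero means a context switch is needed.
 * Here it only records whether it ran with interrupts masked. */
static int xTaskIncrementTick(void)
{
    tick_ran_masked = irq_masked;
    return 1;
}

/* Sketch of the patched tick handler: the tick update runs with
 * interrupts masked, so a nested ISR cannot preempt it. */
void vPortTimerTickHandler(void)
{
    portDISABLE_INTERRUPTS();
    if (xTaskIncrementTick() != 0) {
        yield_required = 1;
    }
    portENABLE_INTERRUPTS();
}
```

Note that the unconditional portENABLE_INTERRUPTS() at the end assumes interrupts were enabled when the handler was entered, which is the case when nesting is active. Whether that assumption always holds in TI's port is part of what we'd like to have confirmed.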
- Is this a known issue and is there a fix from TI available?
- Are there any known (stability) issues with MCU+ SDK 08.05?
- Could someone from TI with knowledge about how the R5F port is intended to work verify our findings and confirm whether our proposed fix is a) necessary and b) sufficient?
Best Regards,
Dominic