Part Number: PROCESSOR-SDK-AM65X
Other Parts Discussed in Thread: SYSBIOS
Dear TI team,
I've been investigating an issue with our own application that looks very similar to another issue posted before (https://e2e.ti.com/support/processors/f/791/t/787929), but that thread unfortunately ended without a solution since the problem "disappeared".
The symptoms closely match what the original post describes: our application has a thread that repeatedly calls Task_sleep(1), and eventually that thread "dies" because, according to ROV, its timeout is a little over 4 billion (actually a small negative number displayed as an unsigned value). In our case two interrupts are involved as well: the timer interrupt and a PRU interrupt (from the EtherCAT slave firmware).
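To illustrate why a missed timeout reads as "4 billion +", here is a minimal sketch of 32-bit unsigned tick arithmetic. The helper remaining_ticks() and the tick values are hypothetical and only model the wraparound; this is not SYS/BIOS code.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper (not a SYS/BIOS API): ticks left until expiry,
 * computed as an unsigned 32-bit difference, so it wraps modulo 2^32. */
static uint32_t remaining_ticks(uint32_t expiry_tick, uint32_t now_tick)
{
    return expiry_tick - now_tick;
}

/* Checks the two cases of interest with made-up tick values. */
void wraparound_self_check(void)
{
    /* Expiry is one tick ahead of "now": one tick remains. */
    assert(remaining_ticks(1001u, 1000u) == 1u);

    /* "Now" has already passed the expiry by one tick: the difference
     * is -1, which an unsigned 32-bit value shows as 4294967295. */
    assert(remaining_ticks(1001u, 1002u) == 0xFFFFFFFFu);
}
```

Once the tick counter has moved past the expiry value without the timeout being serviced, the remaining time only reaches zero again after the counter wraps around, which is why the thread appears dead.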
What happens is that the timer interrupt is triggered, and the interrupt processing gets within ti_sysbios_knl_Clock_doTick__I() up to the point where (&ti_sysbios_knl_Clock_Module__state__V)->ticks has been incremented but ->swiCount has not yet. At this point the timer interrupt is preempted by a PRU interrupt, which runs to completion. When the PRU interrupt handler returns, a NEW timer interrupt is taken, i.e. the core does not resume processing of the original timer interrupt, but starts executing from ti_sysbios_family_arm_v7r_keystone3_Hwi_dispatchIRQ__I(). That interrupt processing also moves through ti_sysbios_knl_Clock_doTick__I(), increments ->ticks once more, and then triggers the clock SWI. At that point ->ticks has been incremented twice, but ->swiCount has only been incremented once, causing the timeout for our Task_sleep(1) to be missed. If I manually "correct" the corresponding Clock object's ->currTimeout, processing resumes just fine.
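The sequence above can be sketched as a small host-side model. Everything here is a simplification for illustration: ClockModel, tick_first_half(), tick_second_half() and clock_swi() are hypothetical names, and the model assumes that the clock SWI replays the last swiCount tick values and fires a timeout only on an exact match with currTimeout. It is not the SYS/BIOS source.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-in for the Clock module state (not the SYS/BIOS type). */
typedef struct {
    uint32_t ticks;    /* advanced in the first half of the tick handler  */
    uint32_t swiCount; /* advanced in the second half, consumed by the SWI */
} ClockModel;

static void tick_first_half(ClockModel *c)  { c->ticks++; }
static void tick_second_half(ClockModel *c) { c->swiCount++; }

/* ASSUMPTION: the SWI walks the last swiCount tick values and fires a
 * timeout only when one of them equals currTimeout exactly. */
static bool clock_swi(ClockModel *c, uint32_t currTimeout)
{
    uint32_t count = c->swiCount;
    uint32_t tick = c->ticks - count;
    bool fired = false;

    c->swiCount = 0;
    while (count-- > 0) {
        tick++;
        if (tick == currTimeout) {
            fired = true;
        }
    }
    return fired;
}

/* Normal case: one complete timer interrupt, then the SWI runs. */
static bool sleep_one_tick_normal(void)
{
    ClockModel c = { 1000u, 0u };
    uint32_t currTimeout = c.ticks + 1u; /* Task_sleep(1) expires at 1001 */

    tick_first_half(&c);
    tick_second_half(&c);
    return clock_swi(&c, currTimeout);
}

/* Observed case: the first timer interrupt is preempted after ticks++
 * and never resumed; a second timer interrupt then runs completely. */
static bool sleep_one_tick_interrupted(void)
{
    ClockModel c = { 1000u, 0u };
    uint32_t currTimeout = c.ticks + 1u; /* expires at 1001 */

    tick_first_half(&c);  /* ticks = 1001, then preempted by the PRU IRQ */
    tick_first_half(&c);  /* new timer interrupt: ticks = 1002           */
    tick_second_half(&c); /* swiCount = 1                                */
    return clock_swi(&c, currTimeout); /* only tick 1002 is examined     */
}

void clock_model_self_check(void)
{
    assert(sleep_one_tick_normal());
    assert(!sleep_one_tick_interrupted());
}
```

Under these assumptions the SWI in the interrupted case only ever examines tick 1002, so the timeout at tick 1001 is skipped, which would match the need to "correct" ->currTimeout by hand.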
The real issue is apparently that the timer interrupt, once preempted by the PRU interrupt, never gets to finish processing; instead the timer interrupt is taken again from the start.
The VIM chapter in the TRM (revision D) has a short paragraph on interrupt prioritization that seems to suggest that the VIM keeps track of preempted interrupts:
If the CPU switches this interrupt to active (by reading the
VIM_FIQVEC/VIM_IRQVEC register), then the currently active interrupt will be pushed onto a stack. When
an interrupt is cleared by reading the VIM_FIQVEC/VIM_IRQVEC register, if there are any interrupts on
the stack, the first entry is popped off and put back into the VIM_ACTFIQ/VIM_ACTIRQ register, so that
software may continue where it left off.
Unfortunately the TRM is otherwise rather sparse on the details of this "stack", e.g. it doesn't say anywhere how deep that stack is.
I'm currently using SYS-BIOS 6.75.02 from Processor SDK 06.00, but I also gave SYS-BIOS 6.76.02 (the latest download on the website) a try, unfortunately with the same results.
Regards,
Dominic