RTOS/AM6548: Potential TI RTOS bug with R5F core with nested interrupts

Part Number: AM6548
Other Parts Discussed in Thread: SYSBIOS

Tool/software: TI-RTOS

Hello Experts,

I have an issue when triggering a cyclic interrupt from PRU to R5F core.

The cycle time of this interrupt is 125µs (plus 2 PRU cycles).

Because the TIRTOS Clock interrupt runs with a 1ms cycle time (by default using the dmtimer peripheral), the clock interrupt and my PRU->R5F interrupt slowly drift relative to each other.

So every race condition you can think of will occur sooner or later. I can see it on the oscilloscope using some GPIOs of the AM65_IDK (TIMER_IO0/1 at J19).
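As a rough check of how fast the two interrupts slide past each other, here is a minimal sketch of the arithmetic. It assumes a 225 MHz PRU clock (so 125µs is 28125 cycles), exactly 2 extra cycles per period, and an otherwise ideal 1ms tick; the macro and function names are made up for illustration.

```c
#include <stdint.h>

/* Assumed values: 225 MHz PRU clock, 125 us = 28125 cycles, plus 2 extra cycles. */
#define PRU_CYCLES_PER_IRQ   (28125u + 2u)
#define PRU_CYCLES_PER_TICK  225000u   /* 1 ms expressed in 225 MHz PRU cycles */
#define IRQS_PER_TICK        8u

/* PRU cycles by which 8 interrupt periods overshoot one 1 ms clock tick. */
uint32_t drift_cycles_per_tick(void)
{
    return IRQS_PER_TICK * PRU_CYCLES_PER_IRQ - PRU_CYCLES_PER_TICK;
}

/* Number of 1 ms ticks until the relative phase has swept across one
 * full interrupt period (rounded down). */
uint32_t ticks_per_full_sweep(void)
{
    return PRU_CYCLES_PER_IRQ / drift_cycles_per_tick();
}
```

Under these assumptions the drift is 16 PRU cycles (about 71 ns) per clock tick, so every relative phase between the two interrupts is reached roughly once every 1.76 s.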

The R5F VIM seems to support interrupt priorities from 0 (highest) to 15 (lowest):

  • for my cyclic interrupt I use a TIRTOS HWI with the priority 5
  • the TIRTOS clock timer has priority 15 (I guess)

In addition, I have 2 TIRTOS tasks which just call Task_sleep(1) and toggle some GPIOs, so that I can see whether they are alive.

With this interrupt prioritization, the tasks die after a few seconds.

Then, when starting ROV, I can see that they suddenly do a Task_sleep(4294949643) which is 0xFFFFBB0B instead of Task_sleep(1). 
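The odd value is easier to interpret when viewed as a signed number. The following is a minimal sketch, assuming the timeout is held in a 32-bit unsigned field; the helper names are hypothetical and not part of SYS/BIOS, and whether this is what actually happens inside the kernel is not established here.

```c
#include <stdint.h>

/* Interpret a 32-bit unsigned tick count as a two's-complement signed value. */
int64_t as_signed_ticks(uint32_t ticks)
{
    return (ticks & 0x80000000u) ? (int64_t)ticks - 4294967296LL
                                 : (int64_t)ticks;
}

/* Unsigned "ticks remaining" computation (hypothetical).
 * It wraps to a huge value if 'now' is already past 'expiry'. */
uint32_t ticks_remaining(uint32_t expiry, uint32_t now)
{
    return expiry - now;
}
```

Read this way, 0xFFFFBB0B is -17653, which is the kind of result an unsigned subtraction gives when the current tick count is already 17653 ticks past the expected expiry.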

My assumption is that the RTOS/HAL interrupt handler has a bug that causes some memory regions to be overwritten.

Proof: When I change the priority of my interrupt from 5 to 15, everything works well!

My source code for R5F interrupt creation is as follows:

// This registers an ISR to an event from PRU via SYSBIOS functionality.
// It's required, that the component MAIN2MCU_LVL_INTRTR0 is already configured.
void registerISROnPRUEventViaSysbios()
{
    Hwi_Handle hwi0;
    Hwi_Params hwiParams;
    Error_Block eb;

    Error_init(&eb);
    Hwi_Params_init(&hwiParams);
    hwiParams.arg = 0;
    hwiParams.enableInt = true;
    hwiParams.priority = 5;   // R5F-VIM has priorities from 0 (highest) to 15 (lowest)
                              // 2019-03-29 it works only with priority 15
                              // all other priorities lead to a system hang
                              // is there a conflict with the timer interrupt?
    hwiParams.maskSetting = ti_sysbios_interfaces_IHwi_MaskingOption_ALL; // disable all other interrupts

    hwi0 = Hwi_create(CSL_MCU0_INTR_MAIN2MCU_LVL_INTR0_OUTL_2, pruIrqHandler, &hwiParams, &eb);
    if (hwi0 == NULL) {
        UART_printf("ERROR: Hwi create failed\n");
    }
}

And my R5F interrupt function is:

// Interrupt Handler PRU to R5F
static void pruIrqHandler(uintptr_t foobar)
{
    // clear the interrupt source in ICSSG2 on INTC
    PRUICSS_pruClearEvent(s_PruHandle,PRU0_ARM_EVENT);

    // clear R5F interrupt
    // Not needed, the R5 Interrupt is cleared in calling function Hwi_dispatchIRQC()
    //Osal_ClearInterrupt(0, CSL_MCU0_INTR_MAIN2MCU_LVL_INTR0_OUTL_2);

    g_u32PruR5fIrqCnt++;

    //GPIO_write(HAL_GPIO_TIMER_IO0,0);
    GPIOPinWrite_v0(CSL_GPIO1_BASE, 88, 0U);
}

In the PRU I just use:

// Define the delay between groups of frames
//   200 MHz: 125us is 25000 * 5ns
//   225 MHz: 125us is 28125 * 4.444ns
#define DELAY_BETWEEN_FRAME_GROUPS	28125

    while(1) {

        // Wait for 125us
        while (PRU0_CTRL.CYCLE < DELAY_BETWEEN_FRAME_GROUPS);
        PRU0_CTRL.CYCLE = 0;

        gpio1PinSet(88);  // set TIMER_IO0 (pin 88)

        // trigger ARM interrupt
        // set bit0 for Host Int0
        // set bit5 for strobe/enable of bit0...3
        __R31 = 0x21;   // trigger host interrupt

        sharedMemPtr->u32Debug1++;
    }

Of course, I configured the interrupt router according to the E2E entry https://e2e.ti.com/support/processors/f/791/p/784653/2904469

What do you think?

Could this be a bug in the TIRTOS or R5F/HAL interrupt ecosystem?

Best regards,

  Rüdiger

  • It's behaving as though interrupts have been disabled for more than a Clock_tickPeriod (i.e. more than 1ms).

    This can cause the timeout value in the Clock objects used by Task_sleep() to get improperly evaluated.

    Have you tried different maskSetting values? Does Hwi_MaskingOption_LOWER or Hwi_MaskingOption_SELF behave better?

    I'm just trying to rule out some possibilities.

    Alan

  • Another thing you might try independent of the maskSetting experiments is explicitly setting the Clock.tickMode to Clock.TickMode_PERIODIC in your .cfg file:

    Clock.tickMode = Clock.TickMode_PERIODIC;

    If this corrects the problem, it adds support to the idea that interrupts are being disabled for a long period of time.

    If you let the application run more after the condition occurs, can you confirm that the Clock 'ticks' value is increasing in the Clock module ROV view? If not, then a Clock timer interrupt was not properly serviced.

    Alan
  • Hello Alan,

    I do not disable interrupts for 1ms; I only have a skeleton application with almost no application functionality.

    But I have now checked the HWI drivers in more detail:

    For this MCU, the HWI maskSetting is not supported at all:
    see 
    bios_6_75_01_05\packages\ti\sysbios\family\arm\v7r\keystone3\Hwi.c

    in contrast with the TMS570 implementation in
    bios_6_75_01_05\packages\ti\sysbios\family\arm\v7r\vim\Hwi.c

    Your proposal of setting TickMode_PERIODIC will have no effect, because this is already the default value on this platform.

    The issue I found is no longer much of a problem for me, since I now synchronize the Clock tick with my own cyclic interrupt (125us).
    To do that, I call Clock_tick() on every 8th cycle of my cyclic interrupt, and I call Clock_tickStop() before enabling my interrupt.
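
A minimal sketch of the divide-by-8 logic described above, not the actual handler code: `TickDivider` and `tickDividerStep` are names invented for illustration, and the SYS/BIOS call is shown only as a comment so that the snippet builds without the BIOS headers.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical helper: counts interrupts and reports every Nth one. */
typedef struct {
    uint32_t count;    /* interrupts seen since the last tick */
    uint32_t divisor;  /* e.g. 8 for 125 us interrupts and a 1 ms tick */
} TickDivider;

/* Returns true on every divisor-th call, false otherwise. */
bool tickDividerStep(TickDivider *d)
{
    d->count++;
    if (d->count >= d->divisor) {
        d->count = 0;
        return true;
    }
    return false;
}

/*
 * Intended use inside the 125 us interrupt handler
 * (SYS/BIOS call left as a comment):
 *
 *     if (tickDividerStep(&divider)) {
 *         Clock_tick();
 *     }
 */
```

With a divisor of 8, the Clock module advances once per 8 PRU interrupts, so the tick period follows the PRU period instead of a separate timer.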

    But I fear that other customers will run into the problem that Task_sleep() stops working as soon as they have any HWIs running!
    Moreover, because this seems to be a race condition, it could occur extremely rarely and be very hard to analyze.

    Best regards,
    Ruediger

  • Hi Ruediger,

    Can you give us a sample project that shows this? We will dig into it more. We don't know of any problems in this area, but would like to rule out a kernel issue.

    Todd
  • I incorrectly assumed from your Hwi configuration that you were using a device which used the v7r.vim.Hwi module, which does support the maskSetting parameter.

    If your code will be calling Clock_tick(), the formal procedure for configuring this behavior is to set Clock.tickSource to Clock.TickSource_USER in your .cfg file:

        Clock.tickSource = Clock.TickSource_USER;

    This eliminates the need to call Clock_tickStop() and also frees up the timer that would otherwise be automatically allocated to provide the periodic interrupt.

    Are you saying that you worked around the Task_sleep() problem you were having by calling Clock_tick() yourself?

    Alan

  • Todd and Alan,

    thank you for your answers.

    Yes, this workaround seems to be stable. If I used TickSource_USER, I would need to start the PRU and its interrupt very early; otherwise SYSBIOS would not start correctly at all (I tried this and it failed).

    Todd, to send you my example project, do I need to post it here as a ZIP file attachment?

    Regards,

      Ruediger

  • I am sorry, I cannot reproduce this issue anymore, even when I undo my recent changes.
    It could be that I had an issue in either the linker file or the SYSBIOS cfg file.

    Sorry for the false alarm.

      Ruediger

  • Thanks for following up with us. If it pops up again and this thread gets locked, just start a new one and reference this one.

    Todd