RTOS/TI-RTOS-MCU: RTOS latency TM4C1294NCPDT

Part Number: TI-RTOS-MCU


Tool/software: TI-RTOS

Hi,

I am using the MCU's SPI to read from a slave device. The slave signals that it is ready to provide data by triggering my external interrupt. In the interrupt I post a semaphore to unblock the SPI transmission. It worked fine when I tested it at low bandwidth. In the final setup the slave triggers the interrupt at 31.3kHz, and on each trigger I need to read two 16-bit frames. At this frequency the delay between the slave's trigger signal and the start of the SPI task becomes too large. In the external interrupt routine I just toggle a pin (for reference only) and post the SPI semaphore. In the scope screenshot below, the blue signal is the reference signal, which is toggled once the code enters the drdy() routine; the green one is the slave's trigger (its falling edge calls drdy()). There is a ~4us delay. The delay is even bigger when you look at the violet waveform, which is one of the SPI signals (the slave enable signal).

I am using two tasks and one Hwi. One task has no while() loop and just runs at the beginning to configure the slaves (DAC_conf()). The other task, masterTaskFxn(), is the SPI transmission task. I can see in ROV that the first task is terminated once it finishes the initialization. After that, only idle(), masterTaskFxn(), and the Hwi seem to be active. Is that significant delay a result of latency introduced by the RTOS scheduler itself, or can I do better?

void hardware_init(void)
{

//    SysCtlPeripheralEnable(SYSCTL_PERIPH_GPIOB);
    //
    // Auxiliary SPI pins
    //
    GPIOPinTypeGPIOOutput(GPIO_PORTB_BASE, GPIO_PIN_5);  // PB5 as output
    GPIOPinWrite(GPIO_PORTB_BASE, GPIO_PIN_5,32);  // drive PB5 high (32 == GPIO_PIN_5)
    GPIOPinTypeGPIOOutput(GPIO_PORTB_BASE, GPIO_PIN_3);  // PB3 as output
    GPIOPinWrite(GPIO_PORTB_BASE, GPIO_PIN_3,8);  // drive PB3 high (8 == GPIO_PIN_3)
    GPIOPinTypeGPIOOutput(GPIO_PORTB_BASE, GPIO_PIN_2);  // PB2 as output
    GPIOPinWrite(GPIO_PORTB_BASE, GPIO_PIN_2,4);  // drive PB2 high (4 == GPIO_PIN_2)
}
/*
 *  ======== GPIO/DRDY HWI ========
 */
void drdy(unsigned int index)
//void drdy(void)
{
//    Semaphore_post(TCP_semaphore);
    GPIO_toggle(Board_TEST); // toggle the reference pin observed on the scope
    Semaphore_post(SPI_semaphore);

}

/*
 *  ======== SPI measurement transmission task ========
 */
Void masterTaskFxn (UArg arg0, UArg arg1)
{
    SPI_Handle masterSpi;
    SPI_Transaction masterTransaction;
    SPI_Params spiParams;
    bool transferOK;

    SPI_Params_init(&spiParams);
    spiParams.transferMode = SPI_MODE_BLOCKING;
    spiParams.transferTimeout = SPI_WAIT_FOREVER;
    spiParams.transferCallbackFxn = NULL;
    spiParams.mode = SPI_MASTER;
    spiParams.bitRate = 10000000;
    spiParams.dataSize = 16;
    spiParams.frameFormat=SPI_POL1_PHA1;

    Semaphore_pend(SPI_semaphore, BIOS_WAIT_FOREVER);

    /* Initialize SPI handle as default master */
//    masterSpi = SPI_open(0, &spiParams);
    masterSpi = SPI_open(Board_SPI0, &spiParams);
    masterTxBuffer2[0]=0b0001001000000000;
    masterTxBuffer2[1]=0b0000000000000000;
    masterTxBuffer[0]=0b00010010;

    /* Enable interrupts */
    GPIO_enableInt(Board_BUTTON0);

while(1)
{
    Semaphore_pend(SPI_semaphore, BIOS_WAIT_FOREVER);
    /* Initialize master SPI transaction structure */
    ctr1=ctr1+1;
    masterTransaction.count = 1; //Numbers of frames for a transaction
    masterTransaction.txBuf = (Ptr)masterTxBuffer;
    masterTransaction.rxBuf = (Ptr)masterRxBuffer;
    /* Initiate SPI transfer */
    transferOK = SPI_transfer(masterSpi, &masterTransaction);

//    if(transferOK) {
//        /* Print contents of master receive buffer */
//        System_printf("Master: %s\n", masterRxBuffer);
//    }
//    else {
//        System_printf("Unsuccessful master SPI transfer");
//    }
}

    /* Deinitialize SPI */
    SPI_close(masterSpi);

    System_printf("Done\n");

    System_flush();
}


/*
 *  ======== Config task ========
 */
Void DAC_conf (UArg arg0, UArg arg1)
{

    SPI_Handle masterSpi_dac;
    SPI_Transaction masterTransaction_dac;
    SPI_Transaction masterTransaction_s2p;
    SPI_Transaction masterTransaction_adc;
    SPI_Params spiParams_dac;
    bool transferOK;

    SPI_Params_init(&spiParams_dac);
    spiParams_dac.transferMode = SPI_MODE_BLOCKING;
    spiParams_dac.transferTimeout = SPI_WAIT_FOREVER;
    spiParams_dac.transferCallbackFxn = NULL;
    spiParams_dac.mode = SPI_MASTER;
    spiParams_dac.bitRate = 5000000;
    spiParams_dac.dataSize = 16;
    spiParams_dac.frameFormat=SPI_POL1_PHA1;

//            Semaphore_pend(SPI_semaphore_DAC, BIOS_WAIT_FOREVER);
            /* Initialize SPI handle as default master */
            masterSpi_dac = SPI_open(0, &spiParams_dac);
            //    masterSpi = SPI_open(Board_SPI0, NULL);

            /* Initialize master SPI transaction structure */
            masterTransaction_dac.count = 1; //Numbers of frames for a transaction
            masterTransaction_dac.txBuf = (Ptr)DAC_Buffer;
            masterTransaction_dac.rxBuf = (Ptr)masterRxBuffer;

            GPIOPinWrite(GPIO_PORTB_BASE,GPIO_PIN_5,0);  // drive PB5 low for the transfer
            /* Initiate SPI transfer */
            transferOK = SPI_transfer(masterSpi_dac, &masterTransaction_dac);
            GPIOPinWrite(GPIO_PORTB_BASE,GPIO_PIN_5,32); // drive PB5 high again
            DAC=1;

//            Semaphore_pend(SPI_semaphore_DAC, BIOS_WAIT_FOREVER);
            /* Initialize master SPI transaction structure */
            masterTransaction_s2p.count = 1; //Numbers of frames for a transaction
            masterTransaction_s2p.txBuf = (Ptr)S2P_Buffer;
            masterTransaction_s2p.rxBuf = (Ptr)masterRxBuffer;

            GPIOPinWrite(GPIO_PORTB_BASE,GPIO_PIN_3,0);  // drive PB3 low for the transfer
            /* Initiate SPI transfer */
            transferOK = SPI_transfer(masterSpi_dac, &masterTransaction_s2p);
            GPIOPinWrite(GPIO_PORTB_BASE,GPIO_PIN_3,8); // drive PB3 high again
            DAC=2;

//            Semaphore_pend(SPI_semaphore_DAC, BIOS_WAIT_FOREVER);
            GPIOPinConfigure(GPIO_PD2_SSI2FSS);
            GPIOPinTypeSSI(GPIO_PORTD_BASE, GPIO_PIN_0 | GPIO_PIN_1 |
                                            GPIO_PIN_2 | GPIO_PIN_3);

            masterTxBuffer[0]=0b0100000100000001;
            /* Initialize master SPI transaction structure */
            masterTransaction_adc.count = 1; //Numbers of frames for a transaction
            masterTransaction_adc.txBuf = (Ptr)masterTxBuffer;
            masterTransaction_adc.rxBuf = (Ptr)masterRxBuffer;
            transferOK = SPI_transfer(masterSpi_dac, &masterTransaction_adc);
            DAC=3;

//            Semaphore_pend(SPI_semaphore_DAC, BIOS_WAIT_FOREVER);
            masterTxBuffer[0]=0b00001000;
            /* Initialize master SPI transaction structure */
            masterTransaction_adc.count = 1; //Numbers of frames for a transaction
            masterTransaction_adc.txBuf = (Ptr)masterTxBuffer;
            masterTransaction_adc.rxBuf = (Ptr)masterRxBuffer;
            transferOK = SPI_transfer(masterSpi_dac, &masterTransaction_adc);
            DAC=4;

    /* Deinitialize SPI */
    SPI_close(masterSpi_dac);
    /* Open new SPI connection that will send the measurements in a loop */
    Semaphore_post(SPI_semaphore);
    /* Enable interrupts */
    GPIO_enableInt(Board_BUTTON0);
//    GPIOIntEnable(GPIO_PORTB_BASE, GPIO_PIN_4);

    if(transferOK) {
        /* Print contents of master receive buffer */
        System_printf("Configuration done \n");
    }
    else {
        System_printf("Unsuccessful master SPI transfer");
    }
    System_flush();
}


/*
 *  ======== main ========
 */
int main(void)
{

    /* Construct BIOS objects */
    Task_Params taskParams2;
    Task_Params taskParams3;

    /* Call board init functions */
    Board_initGeneral();
    Board_initGPIO();
    Board_initSPI();
    SysCtlPeripheralEnable(SYSCTL_PERIPH_GPIOD);
    GPIOPinTypeGPIOOutput(GPIO_PORTD_BASE, GPIO_PIN_2);  // PD2 as output
    GPIOPinWrite(GPIO_PORTD_BASE, GPIO_PIN_2,4);  // drive PD2 high (4 == GPIO_PIN_2)

    hardware_init();

    /* install Button callback */
    GPIO_setCallback(Board_BUTTON0, drdy);

//    /* Enable interrupts */
//    GPIO_enableInt(Board_BUTTON0);


/* Construct master threads */
    Task_Params_init(&taskParams2);
    taskParams2.priority = 2;
    taskParams2.stackSize = TASKSTACKSIZE;
    taskParams2.stack = &task0Stack;
    Task_construct(&task0Struct, (Task_FuncPtr)masterTaskFxn, &taskParams2, NULL);
/* Construct master DAC Task threads */
    Task_Params_init(&taskParams3);
    taskParams3.priority = 1;
    taskParams3.stackSize = TASKSTACKSIZE;
    taskParams3.stack = &task1Stack;
    Task_construct(&task1Struct, (Task_FuncPtr)DAC_conf, &taskParams3, NULL);


    /* Start BIOS */
    BIOS_start();

    return (0);
}

  • Hi Lukasz,

    System_printf data is stored in an internal buffer. System_flush then writes this data to the CIO buffer. When there is a '\n' in the data, the target gets halted so CCS can read the data. This usually disrupts real-time operation, and it gets worse if the string to print is long. Please try removing the System_flush calls, use Tools->ROV->SysMin to look at the System_printf output instead, and see if this makes a difference.
  • Hi Charles,

    I have commented out all System_flush() instances. Unfortunately, it didn't improve things.
  • In a non-RTOS configuration the reference toggle takes ~500-600ns after the slave signals its status. With the RTOS I am getting 4-6us. Should reading 32 bits at a ~32kHz rate be targeted as a non-RTOS solution instead?
  • If you are asking, "Does operating w/out the RTOS "Reduce Interrupt Latency"" - then I would - w/out any doubt - reply, "Yes!"
    From memory - "Entry into the ISR occurs w/in 14 System Clocks" when the MCU operates, "Free from the RTOS."
  • Thank you.

    My question is more: based on your experience, does the application I am working on seem practical for an RTOS?
  • I'm not that experienced w/your application -  (almost) never would firm/I employ other than a,  "Vendor Agnostic" IDE or RTOS.      Devoting great time/effort to "so limited a source implementation" is outside the interest/comfort of our clients, investors, & our team.      (Note that ARM Cortex is "rich" w/M0, M3, M4 & M7 - there are many vendors - I cannot fathom how, "One & only one" can be (willingly),  "Locked onto!")

    I'm unsure if you've properly summarized - and "weighted" - the entirety of your application - so that such "practicality" may be judged.     As you've learned - little is "free" in this field - any advantages provided by an RTOS come at (some) operational cost.     In your case - the interrupt response time is degraded - you are the best judge of such impact...

    Rather than,  "my judgement" - would not,  "your testing" - prove "Best for your measured determination?"      Would not your, "Replacement of the RTOS" with the MCU's "SysTick" - potentially enable a "less intrusive - less penalizing, Real-Time System" - with vastly improved interrupt response?      Such - at minimum - warrants (even) a brief (measured) investigation - does it not?

    The Cortex M3 (past LM3S, here) and M4 (past LX4F & now TM4C) may achieve an interrupt latency of 12 cycles - but that (only) with,  "zero wait state memory systems."    

    Other advantages of the ARM Cortex are the "Nesting of Interrupts" - "Prioritization" - and (even) "Pre-emption" - which promotes an interrupt to "top dog" - ensuring that it is serviced w/minimal "jitter" - independent of the MCU's current state...   (i.e. completely halts any "interrupt in progress" - completes its service & exits...only then allowing the "interrupted ISR" to continue...)

    Small World Dept: (if your school is MIT - decade + past  - firm/I provided design & product for the, "Plasma Fusion Gyrotron" - employed both @ MIT & Princeton.)

  • "There is ~4us delay."

    that's 300 - 500 instructions, give or take. a little bit long but not bad, considering the rtos overhead.

    what I typically do is to set up a test run to get a sense of how much that overhead is. Set up a blinky as one task and a few empty loops. see how fast you can blink. then run that same blinking code all naked and compare the speed loss.

  • As follow to your post of 17 Feb, 18:13 in which:

    Lukasz Huchel said:
    In case of a Non-RTOS configuration the reference toggle needs ~500-600ns once slave signalizes its status.   With RTOS I am getting 4-6us.

    When the MCU is operated at a clock rate (and w/in a memory region) which enables,  "0 wait state" - the interrupt latency is 12 clocks.     (in the case of the '123 - running 40MHz SysClock  - 300nS interrupt latency results.)      It is thus suspected that your - "Non-RTOS" operation - is experiencing "wait states."

    That delta (300nS vs. 4-6µS ... peaking @ 20:1) may or may not prove significant - such is for you & your application to decide.

    Is it (just) the RTOS which is to blame?     And - if so - have you explored (other) RTOS implementations - to discover how they compare?     That's (proper) investigative effort - often required by tech - is it not?

    Another's advice - "Employ blinky" - iirc avoids interrupts - thus may not render an,  "Apples to apples comparison."     (i.e. it provides "some" - but less than "full/proper" - comparative insight...)

  • Hi Lukasz,

    Take a look at the SYS/BIOS release notes in your TI-RTOS. You'll find a link to timing benchmarks. We document several fundamental timing scenarios. For example:

    [Image: excerpt of the SYS/BIOS timing benchmark table]

    The SYS/BIOS User Guide also has a section in it with more details about what each row means.

    I'd make sure you are using the nonInstrumented TI-RTOS Driver library and the custom BIOS library without logging enabled. I generally leave asserts on in the kernel until the very end. The overhead for assert checking in the kernel is not much and it's worth being able to find pilot errors faster (e.g. passing an invalid parameter into an API).

    You can still have an interrupt in the application that is not managed by the kernel (we call it a zero-latency interrupt since the kernel adds zero latency to it). The caveat is that such an ISR cannot call into the kernel if the call could change the scheduler state (e.g. it cannot call Semaphore_post).

    Todd

  • While this chart is helpful - both references (1 & 2) are omitted.

    ARM lists "Interrupt Latency" at 12 cycles - but that (only) when "0 wait state memory" is "in play."    It is suspected that the (missing) reference (2) would "echo" that "0 wait state memory" requirement.    And that this chart presents (only) performance attained via, "0 wait state memory."

    It is noted that 135/12 represents better than an "11 times" Latency Penalty - brought on by this RTOS...

  • Please refer to the documentation I pointed to for a full description. I did not include the full page of the table and footnotes to minimize the length of the thread.