MSPM0G1107: FreeRTOS context switching latency

Part Number: MSPM0G1107
Other Parts Discussed in Thread: MSPM0G3507, SYSCONFIG, LP-MSPM0G3507

Tool/software:

We're running some benchmarks and trying to determine why the task context switch seems relatively slow when waking a task from an ISR (using a semaphore or FreeRTOS task notification). It takes about 25 us with the CPU clock running at 80 MHz, which comes out to about 2000 CPU cycles. We were expecting the latency to be much lower. The task is set to the highest priority. We started with the mspm0_sdk_2_01_00_03\examples\rtos\LP_MSPM0G3507\kernel\posix_demo example and reconfigured it for an MSPM0G1107.

Here's the entirety of the UART task and ISR code:

/*
* Includes
*/
// Project-specific
#include "ti_msp_dl_config.h"
// Standard C library
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>
// RTOS header files
#include <FreeRTOS.h>
#include <portmacro.h>
#include <semphr.h>
// TI
#include <ti/drivers/dpl/HwiP.h>

Here's a logic analyzer capture showing the latency between the received byte and the transmitted byte.

In main.c, the task stack size was increased and priority set to max:

diff --git a/main.c b/main.c
index 0f392b2..b1c096a 100644
--- a/main.c
+++ b/main.c
@@ -49,7 +49,7 @@
extern void *RS845_thread(void *arg0);
/* Stack size in bytes */
-#define THREADSTACKSIZE 256
+#define THREADSTACKSIZE 1024
/* Set up the hardware ready to run this demo */
static void prvSetupHardware(void);
@@ -73,8 +73,8 @@ int main(void)
pthread_attr_init(&attrs);
/* Set priority, detach state, and stack size attributes */
- priParam.sched_priority = 1;
- retc = pthread_attr_setschedparam(&attrs, &priParam);
+ priParam.sched_priority = configMAX_PRIORITIES - 1;
+ retc = pthread_attr_setschedparam(&attrs, &priParam);

SysConfig setup:

/**
* These arguments were used when this file was generated. They will be automatically applied on subsequent loads
* via the GUI or CLI. Run CLI with '--help' for additional information on how to override these arguments.
* @cliArgs --device "MSPM0G110X" --part "Default" --package "VQFN-32(RHB)" --product "mspm0_sdk@2.01.00.03"
* @v2CliArgs --device "MSPM0G1107" --package "VQFN-32(RHB)" --product "mspm0_sdk@2.01.00.03"
* @versions {"tool":"1.21.1+3772"}
*/
/**
* Import the modules used in this configuration.
*/
const GPIO = scripting.addModule("/ti/driverlib/GPIO", {}, false);
const GPIO1 = GPIO.addInstance();
const SYSCTL = scripting.addModule("/ti/driverlib/SYSCTL");
const TIMER = scripting.addModule("/ti/driverlib/TIMER", {}, false);
const TIMER1 = TIMER.addInstance();
const UART = scripting.addModule("/ti/driverlib/UART", {}, false);
const UART1 = UART.addInstance();
const ProjectConfig = scripting.addModule("/ti/project_config/ProjectConfig");
/**

The FreeRTOSConfig.h file is mostly unmodified from the example. Only the CPU clock was adjusted:

diff --git a/FreeRTOSConfig.h b/FreeRTOSConfig.h
index 20e4040..b5e2ac5 100644
--- a/FreeRTOSConfig.h
+++ b/FreeRTOSConfig.h
@@ -70,7 +70,7 @@
#define configUSE_16_BIT_TICKS 0 /* Only for 8 and 16-bit hardware. */
/* Constants that describe the hardware and memory usage. */
-#define configCPU_CLOCK_HZ ((unsigned long) 32000000)
+#define configCPU_CLOCK_HZ ((unsigned long) 80000000)
/* Smallest stack size allowed in words */
#define configMINIMAL_STACK_SIZE ((unsigned short) 128)
#define configMAX_TASK_NAME_LEN (12)

  • With the CPU running at 80 MHz, the time from the UART interrupt trigger to ISR execution should not exceed 1 us. You can check this by toggling a GPIO at the end of the UART ISR.

    Could you provide a simple demo based on the LP-MSPM0G3507 that reproduces this issue? I can test it on my side.

  • I uploaded a minimal test example to GitHub here: https://github.com/derrick-senva/mspm0_uart_latency. I used the LP-MSPM0G3507 dev kit (early revision with the 48 MHz crystal) and based it off mspm0_sdk_2_01_00_03\examples\rtos\LP_MSPM0G3507\kernel\posix_demo.

    The program simply responds to bytes transmitted from the integrated XDS backchannel UART. I used PuTTY to spam ASCII characters to the MSPM0. The clock tree is configured for the CPU to run at 80 MHz (from the 48 MHz crystal). UART0 is configured for 1 Mbaud, no parity, 1 stop bit, and assigned to pins PA10 and PA11. The single POSIX thread with maximum priority waits on a semaphore to be given from the UART's receive ISR and then immediately transmits a single ASCII character. I made sure to use xSemaphoreGiveFromISR from the UART ISR. I also assigned the interrupt an NVIC priority of 1 so that it doesn't exceed configLIBRARY_MAX_SYSCALL_INTERRUPT_PRIORITY.

    The latency captured by a logic analyzer measured about 25 us between the stop bit from the XDS UART and the start bit of the MSPM0 UART's response. This is consistent with my previously posted capture. The latency seems to be caused by something in the FreeRTOS implementation in this example project. I speculated it might be related to power modes being toggled during the context switch but couldn't find any evidence of that, so now I'm out of ideas as to what's causing this large delay.

    I also tried running from a bare metal example without the RTOS as you suggested. I started with the project from mspm0_sdk_2_01_00_03\examples\nortos\LP_MSPM0G3507\driverlib\uart_echo_interrupts_standby. I made some simple modifications to mimic the program above, but instead of waiting on a semaphore, I simply polled a variable that was toggled by the ISR. In this scenario, I measured a latency of about 1.5 us which is much closer to what I'd expect.
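
    The polling approach described above can be sketched roughly as follows. This is only an illustration under assumptions, not the code from the modified example: the names gRxFlag, uart_rx_isr_body, and poll_rx_flag are hypothetical, and a real handler would also read the received byte from the UART.

```c
#include <stdbool.h>

/* Hypothetical flag shared between the ISR and the main loop.
 * 'volatile' forces the compiler to re-read it on every poll. */
static volatile bool gRxFlag = false;

/* Hypothetical body of the UART receive ISR: just record the event. */
void uart_rx_isr_body(void)
{
    gRxFlag = true;
}

/* Called repeatedly from the main loop. Returns true once per recorded
 * event and clears the flag so the next byte can be detected. */
bool poll_rx_flag(void)
{
    if (gRxFlag) {
        gRxFlag = false;
        return true;
    }
    return false;
}
```

    With no scheduler involved, the response time is essentially the interrupt entry plus one pass through the polling loop, which fits the roughly 1.5 us measured here.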

  • Hi Derrick,

    I can reproduce your issue on my side: the delay is about 25 us. With the optimization level set to fast, it improves to about 18 us.

    I think that is still a large number compared with the 1.5 us you expected. Another option is to create FreeRTOS tasks instead of pthreads, because tasks support the notify feature, which is about 45% faster than a semaphore. I did not find an equivalent in pthread. If you find something similar to the task notify feature, please let me know.

  • I switched to a FreeRTOS task and notification and it reduced latency from 25 to 23 us (from around 2000 to 1840 CPU cycles). Branch is here: https://github.com/derrick-senva/mspm0_uart_latency/tree/task_notification. That's a decent improvement in terms of raw cycles, but it still seems like an issue with the FreeRTOS port/implementation for the MSPM0. Either that or I simply don't have the correct configuration/setup for fast context switching.

    I was searching around for other people's experience with context switching times and it does seem like most people are getting better results. Specifically, the official FreeRTOS FAQ here: https://www.freertos.org/Why-FreeRTOS/FAQs/Memory-usage-boot-times-context#what-is-the-context-switch-time. They cite a much better switching time using a Cortex-M3 on Keil. Although it's not a direct one-to-one comparison, I would assume the M0 architecture is close enough. They measured 84 CPU cycles which is a huge disparity. Is there someone with deep knowledge on the implementation that can advise us on how to reduce the latency, even if it involves modifying the low-level port code? Our application specifically requires low latency and multi-tasking.
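
    For reference, the microsecond-to-cycle conversions used in this thread can be reproduced with a small helper. This is a minimal sketch; the function name latency_us_to_cycles is hypothetical and is not part of the SDK or FreeRTOS.

```c
#include <stdint.h>

/* Convert a measured latency in microseconds to CPU cycles
 * for a core clock given in Hz. */
static inline uint32_t latency_us_to_cycles(uint32_t latency_us, uint32_t cpu_hz)
{
    /* 64-bit intermediate so latency_us * cpu_hz cannot overflow. */
    return (uint32_t)(((uint64_t)latency_us * cpu_hz) / 1000000u);
}
```

    At 80 MHz, 25 us corresponds to 2000 cycles and 23 us to 1840 cycles, which is roughly 22 to 24 times the 84 cycles cited in the FAQ.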

  • Let me check with our tools team to see if they have any comments.

  • Here is the response from our FreeRTOS expert for your reference:

    Reasons:

    • The optimization level of both the application and the FreeRTOS project.
    • After the main task blocks on the semaphore take, the scheduler runs the idle task and then puts the device into a low-power sleep mode. When a UART RX interrupt occurs, the scheduler runs a wake-up sequence and then executes the ISR. The added time is not the "hardware" wake-up time of the device; it comes from extra code that runs to reinitialize the timers following a wake-up. During this time the clock source also gets switched. This is the way that FreeRTOS works. It is not specific to TI.

    Solution:

    • To disable the low-power mode in FreeRTOS, I set #define configUSE_TICKLESS_IDLE 0 in FreeRTOSConfig.h.
    • I changed the optimization level to fast in both the application and the FreeRTOS project.
    • With these changes, I achieve a context switching time of 4.2 us, which is pretty decent.
    • I have attached the projects for your reference.
  • Thanks for the detailed response. I don't see any attachments for the project you mentioned(?), but I can attempt the changes you listed and report back here afterwards.

  • Looks like I need your reference project to see what else I have configured wrong.

    I changed the optimization level of both projects to fast and also set #define configUSE_TICKLESS_IDLE 0, but somehow it made the context switching time worse. It appears to only wake the task on my configured 1 ms RTOS tick interval. You can see below that the M0 TX (response) is always spaced around 1 ms intervals.

  • Derrick,

    If I look at the example code from your original posting, it looks like you have omitted the call to portYIELD_FROM_ISR() which is required in order to force a context switch. When you have configUSE_TICKLESS_IDLE 1, I believe the scheduler is forced to run on every wakeup from low power mode, but when you disable it, you are no longer waking up from low power mode and forcing the scheduler to run.

    The behavior you describe is consistent with omitting this call. See the following example:

    https://freertos.org/Documentation/02-Kernel/04-API-references/10-Semaphore-and-Mutexes/17-xSemaphoreGiveFromISR

    Thanks,

    Stuart
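
    The pattern referred to above can be sketched as follows. This is a hedged illustration, not the code from the linked page or from the original project. So that it compiles on its own, it uses host-side stand-ins for BaseType_t, SemaphoreHandle_t, xSemaphoreGiveFromISR, and portYIELD_FROM_ISR. On target these come from FreeRTOS.h and semphr.h. The names gYieldRequests, gRxSemaphore, and uart_rx_isr_with_yield are hypothetical.

```c
#include <stddef.h>

/* ---- Host-side stand-ins (hypothetical) so this sketch compiles alone.
 * On target, delete this section and include <FreeRTOS.h> and <semphr.h>. */
typedef long BaseType_t;
typedef void *SemaphoreHandle_t;
#define pdFALSE ((BaseType_t)0)
#define pdTRUE  ((BaseType_t)1)

static int gYieldRequests = 0; /* counts context-switch requests */

/* Stand-in: pretends a higher-priority task was always waiting. */
static BaseType_t xSemaphoreGiveFromISR(SemaphoreHandle_t sem,
                                        BaseType_t *pxHigherPriorityTaskWoken)
{
    (void)sem;
    if (pxHigherPriorityTaskWoken != NULL) {
        *pxHigherPriorityTaskWoken = pdTRUE;
    }
    return pdTRUE;
}

/* Stand-in: the real macro requests a context switch when x is non-zero. */
#define portYIELD_FROM_ISR(x) \
    do { if ((x) != pdFALSE) { gYieldRequests++; } } while (0)
/* ---- End of stand-ins ---- */

static SemaphoreHandle_t gRxSemaphore; /* hypothetical; created elsewhere */

/* Hypothetical body of the UART receive ISR. */
void uart_rx_isr_with_yield(void)
{
    BaseType_t xHigherPriorityTaskWoken = pdFALSE;

    /* Unblock the task that is waiting on the semaphore. */
    xSemaphoreGiveFromISR(gRxSemaphore, &xHigherPriorityTaskWoken);

    /* Request a context switch on ISR exit if a higher-priority task was
     * unblocked. Without this call the unblocked task typically does not
     * run until the scheduler next runs, for example on the next tick. */
    portYIELD_FROM_ISR(xHigherPriorityTaskWoken);
}
```

    Leaving out the final call would match the roughly 1 ms spacing reported earlier in the thread once tickless idle was disabled.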

  • I don't see any attachments for the project you mentioned(?)

    Please refer to this:

     workspace_v12.7_.zip