Synchronization in SysBios SMP-mode with ARMv7's relaxed memory model

Hi,

I'm currently using SysBios (6.37.02.27) in SMP mode on an ARMv7 processor. Because two tasks (or one task and an ISR) can actually run in parallel in SMP mode, the memory model becomes relevant when synchronizing two concurrent control flows (two tasks, or a task and an ISR). As ARMv7 has a very relaxed memory model [2], explicit ordering instructions ("dmb" or "dsb") are required [3] in order to guarantee correctness. There is a great deal of literature on this topic [1], [3], [4], both from academia and from practical engineering (notably the Linux kernel).

However, the SysBios manual does not state ANYTHING about this topic. For example, does the use of a Semaphore include the necessary memory barriers? How about the Event module? Does the following example work?

int global_var_result = 0;

void task1(...)
{
    /* produce result */
    [...]

    /* store result */
    global_var_result = ...;

    /* set event for task2, see below */
    Event_post(...);

    [...]
}

void task2(...)
{
    /* block until task1 posts the event */
    Event_pend(...);

    /* read result */
    ... = global_var_result;
}

or does it have to be changed to

void task1(...)
{
    /* produce result */
    [...]

    /* store result */
    global_var_result = ...;

    smp_wmb(); /* <- enforce write ordering! */

    /* set event for task2, see below */
    Event_post(...);

    [...]
}

void task2(...)
{
    /* block until task1 posts the event */
    Event_pend(...);

    smp_rmb(); /* <- enforce read ordering! */

    /* read result */
    ... = global_var_result;
}

Note: smp_wmb() and smp_rmb() are the names used in the Linux kernel.

My questions are:

1. What guarantees are included in the following SysBios synchronization primitives: Semaphore, Mutex, Gate* (all Gate-type APIs), Event?

2. What guarantees are not included in the following SysBios synchronization primitives: Semaphore, Mutex, Gate* (all Gate-type APIs), Event?

3. Are there any built-in SysBios primitives (like "smp_rmb" in the Linux kernel) that can be used, or do I have to code this in assembly language myself?

4. Is there something similar to the "spinlock" primitive in the Linux kernel that enables mutual exclusion between both tasks and ISRs? If not, how can this be achieved in SMP-enabled SysBios?

Thanks for your answers. Best Regards,

Matthias

[1] http://www.cl.cam.ac.uk/~pes20/weakmemory/#PA

[2] http://www.cl.cam.ac.uk/~pes20/ppc-supplemental/test7.pdf

[3] https://git.kernel.org/cgit/linux/kernel/git/stable/linux-stable.git/tree/Documentation/memory-barriers.txt?id=refs/tags/v4.0.5

[4] preshing.com/.../memory-barriers-are-like-source-control-operations

  • Hi Matthias,

    I am in the process of looking into your questions and will get back to you on them shortly.

    Steve
  • Hi Matthias,

    Since you are using SYS/BIOS 6.37 SMP support, I am guessing you are running an SMP app on a Cortex-M3/M4 device?

    Matthias Rosenfelder said:

    My questions are:

    1. What guarantees are included in the following SysBios synchronization primitives: Semaphore, Mutex, Gate* (all Gate-type APIs), Event?

    2. What guarantees are not included in the following SysBios synchronization primitives: Semaphore, Mutex, Gate* (all Gate-type APIs), Event?

    The Cortex-M devices inherently guarantee the ordering of loads and stores. They also do not include aggressive speculative execution logic (as many Cortex-A devices do). Therefore, unlike on Cortex-A devices, it is possible to reduce the use of barriers on Cortex-M devices (see http://infocenter.arm.com/help/index.jsp?topic=/com.arm.doc.dai0321a/BIHGJICF.html).

    Our current implementation of Semaphore, Mutex, Gate* and Event does not include barriers. I believe our design ensures correct operation on these devices without them.

    More recently (as of SYS/BIOS 6.41), we added SMP support for Cortex-A15. There we did include the necessary barriers in our code, because that hardware requires barriers to ensure correct operation.

    Matthias Rosenfelder said:

    3. Are there any built-in SysBios primitives (like "smp_rmb" in Linux Kernel) that can be used or do I have to code this in assembly language myself?

    At present, we do not have any built-in barrier instruction primitives in SYS/BIOS. We may add them in future releases.
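
    Until such primitives exist, one way to get them without hand-written assembly is C11 <stdatomic.h>. The following is only a sketch under assumptions: the toolchain must support C11 atomics (not every TI compiler version may), smp_wmb()/smp_rmb() are hypothetical helpers named after the Linux kernel macros and are not SYS/BIOS APIs, and a plain atomic flag stands in for the Event. A release fence is somewhat stronger than Linux's store-only smp_wmb(), which is harmless here.

```c
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical helpers named after the Linux kernel macros.
 * They are NOT part of SYS/BIOS. On ARMv7 a compiler would
 * typically lower these fences to a DMB instruction. */
static inline void smp_wmb(void)
{
    atomic_thread_fence(memory_order_release);
}

static inline void smp_rmb(void)
{
    atomic_thread_fence(memory_order_acquire);
}

static int global_var_result;     /* payload written by the producer */
static atomic_bool result_ready;  /* stands in for the Event in this sketch */

/* Producer side: store the result, then publish the flag. */
void producer_publish(int value)
{
    global_var_result = value;
    smp_wmb();  /* payload store is ordered before the flag store */
    atomic_store_explicit(&result_ready, true, memory_order_relaxed);
}

/* Consumer side: returns true and fills *out once the result is visible. */
bool consumer_try_read(int *out)
{
    if (!atomic_load_explicit(&result_ready, memory_order_relaxed)) {
        return false;
    }
    smp_rmb();  /* flag load is ordered before the payload load */
    *out = global_var_result;
    return true;
}
```

    The fences also act as compiler barriers, so the same source stays correct if the code is later moved to a core that does reorder memory accesses.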

    Matthias Rosenfelder said:

    4. Is there something similar to the "spinlock" primitive in the Linux kernel that enables mutual exclusion between both tasks and ISRs? If not, how can this be achieved in SMP-enabled SysBios?

    One way of protecting a data structure shared between a Task and a Hwi is to wrap the accesses in the Task with Hwi_disable()/Hwi_restore(key). Hwi_disable() internally acquires an inter-core lock and so effectively serves as a spinlock. The Hwi code itself does not need a Hwi_disable()/Hwi_restore(key) block, because the Hwi dispatcher acquires the inter-core lock before calling the Hwi function. A Hwi_disable() may still be required if interrupt nesting needs to be temporarily disabled within the Hwi function.
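
    As a rough sketch of that pattern: the SharedStats structure and the two function names below are made up for illustration, and the #else branch contains host-side stand-ins so the fragment compiles without SYS/BIOS (they provide no real protection). Building with the hypothetical USE_SYSBIOS define is assumed to pull in the real Hwi module instead.

```c
#include <stdint.h>

#ifdef USE_SYSBIOS
#include <xdc/std.h>
#include <ti/sysbios/hal/Hwi.h>
#else
/* Host-side stand-ins (assumption, for illustration only).
 * They merely track a "disabled" state and protect nothing. */
typedef unsigned int UInt;
typedef uintptr_t UArg;
static UInt hwi_disabled;  /* 1 while "interrupts" are disabled */

static UInt Hwi_disable(void)
{
    UInt key = hwi_disabled;  /* remember the previous state */
    hwi_disabled = 1;
    return key;
}

static void Hwi_restore(UInt key)
{
    hwi_disabled = key;       /* put the previous state back */
}
#endif

/* Hypothetical data shared between a Task and a Hwi. */
typedef struct {
    unsigned int updates;
    unsigned int count;
} SharedStats;

static SharedStats shared;

/* Task context: bracket the access with Hwi_disable()/Hwi_restore(). */
void task_record_samples(unsigned int n)
{
    UInt key = Hwi_disable();
    shared.count += n;
    shared.updates++;
    Hwi_restore(key);
}

/* Hwi context: per the explanation above, the dispatcher already holds
 * the inter-core lock, so no extra Hwi_disable() block is used here. */
void hwi_on_sample(UArg arg)
{
    (void)arg;
    shared.count++;
    shared.updates++;
}
```

    Restoring the saved key instead of unconditionally enabling interrupts keeps the section safe to call from code that already runs with interrupts disabled.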

    Hope this helps.

    Best,

    Ashish

  • Hi Ashish,


    thank you for your answer. Yes, this helps a lot!

    Since you are using SYS/BIOS 6.37 SMP support, I am guessing you are running an SMP app on a Cortex-M3/M4 device?

    For this project the answer is yes. However, I personally don't like the idea that the correctness of my code is guaranteed only by certain CPU microarchitecture implementation details. After all, software implementation is about re-usability of code. Other projects come and go (at least in the company I'm working for), and I can only hope that nobody will use this RTOS layer code I'm currently writing on non-Cortex-M devices.

    Nevertheless, taking the quote from the ARM website that DMBs are redundant (but still recommended) into consideration, I believe we can continue our development here. Using Hwi_disable() / Hwi_restore() as a spinlock is a nice way to solve the mutual exclusion problem between all possible concurrent control flows (if I understood this correctly).

    Still, a final question. You wrote:

    A Hwi_disable may be required if interrupt nesting needs to be temporarily disabled within the Hwi function.

    So ISRs can be interrupted by other, higher-priority ISRs in SysBios!? What about the stack used for those (two) ISRs? Do they share one, or does every ISR have its own stack?

    Is this transparent? I.e. is there anything the ISR code needs to take into consideration if it is just using private data (not accessed by any other ISR) and nobody else "talks" to the device the ISR is servicing?

    Thanks a lot for your help!

    Best regards,

    Matthias

  • Hi Matthias,

    Matthias Rosenfelder said:

    Nevertheless, taking the quote from the ARM website that DMBs are redundant (but still recommended) into consideration, I believe we can continue our development here. Using Hwi_disable() / Hwi_restore() as a spinlock is a nice way to solve the mutual exclusion problem between all possible concurrent control flows (if I understood this correctly).

    That is correct, you can use Hwi_disable()/Hwi_restore() to resolve mutual exclusion problems.

    Matthias Rosenfelder said:

    Still, a final question. You wrote:

    A Hwi_disable may be required if interrupt nesting needs to be temporarily disabled within the Hwi function.

    So ISRs can be interrupted by other, higher-priority ISRs in SysBios!? What about the stack used for those (two) ISRs? Do they share one, or does every ISR have its own stack?

    Yes, auto nesting support is enabled by default (see cdoc), which means that an ISR can be pre-empted by a higher-priority ISR. This can be disabled by setting Hwi.dispatcherAutoNestingSupport to false in the app's cfg file. All nested ISRs being serviced on a given core share the same ISR stack. It is possible to configure which core services each interrupt (see cdoc), and there is a separate ISR stack per core.

    Matthias Rosenfelder said:

    Is this transparent? I.e. is there anything the ISR code needs to take into consideration if it is just using private data (not accessed by any other ISR) and nobody else "talks" to the device the ISR is servicing?

    Yes, it should be transparent in that case.

    Best,

    Ashish

  • Ashish Kapania said:

    That's a very helpful application note (AN321), in particular this page: http://infocenter.arm.com/help/topic/com.arm.doc.dai0321a/BIHGJBBF.html

    I hadn't noticed the subtle distinction between the ARMv7-M architectural memory model (permits reordering) and the Cortex-M implementation of the architecture (no reordering).

  • Robert Cowsill said:

    That's a very helpful application note (AN321), in particular this page: http://infocenter.arm.com/help/topic/com.arm.doc.dai0321a/BIHGJBBF.html

    I hadn't noticed the subtle distinction between the ARMv7-M architectural memory model (permits reordering) and the Cortex-M implementation of the architecture (no reordering).

    Robert,

    please note that although some microarchitectures (like Cortex-M) don't reorder memory accesses at runtime, in certain circumstances you still need to prevent compiler reordering in order to ensure correctness on SMP systems. Here is an example of a bug where compiler reordering was not prevented:

    https://e2e.ti.com/support/embedded/tirtos/f/355/t/446574

    Thus, you would either need to code those sections in assembly to keep the compiler from reordering instructions, or have something in C like the "barrier()" primitive from the Linux kernel.
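
    A compiler-only barrier of that kind can be sketched in C11 roughly as follows. This assumes a toolchain with <stdatomic.h>; compiler_barrier(), publish() and try_consume() are hypothetical names, not SYS/BIOS or Linux APIs. With GCC-style compilers, an empty inline asm statement with a "memory" clobber is the more traditional spelling of the same idea.

```c
#include <stdatomic.h>

/* Hypothetical helper in the spirit of the Linux kernel's barrier().
 * atomic_signal_fence() constrains only the compiler: GCC and Clang
 * emit no instruction for it, but will not move memory accesses
 * across it. It does NOT order accesses on hardware that reorders. */
static inline void compiler_barrier(void)
{
    atomic_signal_fence(memory_order_seq_cst);
}

static int payload;         /* data handed from one context to another */
static volatile int ready;  /* flag polled by the consumer */

/* Writer: the payload store must not be moved after the flag store. */
void publish(int value)
{
    payload = value;
    compiler_barrier();
    ready = 1;
}

/* Reader: the payload load must not be moved before the flag load. */
int try_consume(int *out)
{
    if (!ready) {
        return 0;
    }
    compiler_barrier();
    *out = payload;
    return 1;
}
```

    On a core that does reorder memory accesses, this is not sufficient on its own and a hardware barrier such as DMB would be needed as well.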

    In summary, with SysBios going from single-CPU to SMP, you are leaving the wonderfully simple world of serial programming and entering the nasty, dark, error-prone and non-intuitive world of parallel programming. Welcome! ;-)