This thread has been locked.

If you have a related question, please click the "Ask a related question" button in the top right corner. The newly created question will be automatically linked to this question.

Avoid race condition between ISR and main

Other Parts Discussed in Thread: MSP430F5529

Hi all,

This is my first post here and also I'm a noob when it comes to microcontrollers and I need some help.

Please look at the following code example:

volatile int x = 0;

int main(void)
{
    HAL_initPorts();
    HAL_initClocks();
    __enable_interrupt();
    while(1)
    {
        if (x)
        {
            //do some work
            x = 0;
        }
    }
}

void some_isr(void)
{
    x = 1;
}

Now from my understanding I can see a potential race condition can occur during the following scenario:

1) some_isr was triggered and set x = 1

2) main continued its work and if (x) resulted in a true value

3) right before the line "x = 0" in main the ISR was triggered again setting x = 1 again

4) main continued its work and set x = 0 not "knowing" the ISR was triggered again and it will not enter the "if (x)" condition on the next iteration

So how can this be avoided? In higher level programming languages I guess I would use some kind of a mutex or auto reset event to prevent this race condition.

Would disabling interrupts from within the ISR and then re-enabling them in main be a possible solution? If so is this recommended or can it have side-effects?

Thanks

  • Hi Itay,

    Itay Zafrir said:
    Would disabling interrupts from within the ISR and then re-enabling them in main be a possible solution? If so is this recommended or can it have side-effects?

    Normally interrupts are disabled during the actual execution of the ISR anyway, but you're talking I think about if you can disable interrupts for after the ISR so that they are only re-enabled in main after the if statement has finished its execution and modification of x, correct?

    A typical example of doing something like this for MSP430 might look like this:

    volatile int x = 0;
    
    int main(void)
    {
        HAL_initPorts();
        HAL_initClocks();
        __enable_interrupt();
        while(1)
        {
            __bis_SR_register(LPM3_bits + GIE); //go to LPM and enable interrupts
            __no_operation();
    
            if (x)
            {
                //do some work
                x = 0;
            }
        }
    }
    
    void some_isr(void)
    {
        x = 1;
        __bic_SR_register_on_exit(LPM3_bits + GIE); //wake from LPM and disable further interrupts
        __no_operation();
    }

    Now this example uses low power modes - in the while loop you go  to LPM and enable interrupts (using __bis_SR_register() ) to set the LPM and global interrupt enable GIE bit. When the ISR fires, it uses the __bic_SR_register_on_exit() function to clear the LPM and global interrupt enable bit after the ISR exits. This means that the flow will be:

    • go to LPM and enable interrupt
    • when interrupt occurs, enter ISR, set x = 1
    • wake from LPM with interrupts disabled
    • proceed through while loop operations and through if statement, x = 0
      • (interrupts still disabled so any more interrupts occurring will just set their interrupt flags but not actually go to ISR)
    • loop back to top of while(1), where we again go to LPM and enable interrupts again
      • At this point any interrupt flags that were set while interrupts were disabled will be serviced/cause a call to the ISR)

    Does this make sense? This is a very typical style of program flow on MSP430, similar to what you might see in a lot of code examples.

    Potential consequence for this is that you are globally disabling all interrupts. If you have another interrupt in your program that you want to be able to still interrupt during the while loop, instead of clearing GIE at the end of the ISR, you'd want to just include another line to clear the specific interrupt enable bit for some_isr (let's call it some_ie), and re-enable that some_ie bit after the end of the if statement.

    Regards,

    Katie

  • Additionally to Katie’s advice.

    In your, I guess simplified example, this situation can happen. One easy solution is to increment/decrement the ISR counter;

    volatile int x = 0;
    
    int main(void)
    {
    	HAL_initPorts();
    	HAL_initClocks();
    	__enable_interrupt();
    	while(1)
    	{
    		while (x != 0)
    		{
    			//do some work
    			x--;
    		}
    	}
    }
    
    void some_isr(void)
    {
    	if (x < 0xFFFF) {x++;}
    }
    

  • Hi Katie and thanks for your quick and insightful reply. It makes perfect sense.

    "Normally interrupts are disabled during the actual execution of the ISR anyway, but you're talking I think about if you can disable interrupts for after the ISR so that they are only re-enabled in main after the if statement has finished its execution and modification of x, correct?"

    Yes this is exactly what I meant.

    Now I just want to clarify a couple points:

    1)

    "At this point any interrupt flags that were set while interrupts were disabled will be serviced/cause a call to the ISR"

    I just want to make sure I understand this correctly, so when interrupts are disabled and lets say an interrupt does occur, it's ISR will be fired once interrupts are re-enables right? Meaning I will not "miss" any interrupts?

    2) If i choose NOT to enter LPM this shouldn't have any effect on the general pattern you posted, am I correct?

    3) Now I'd like to turn the above into a more of a real example. I'm using the TI's USB developers stack. And lets say that "some_isr" is actually     a USBHID_handleDataReceived/USBCDC_handleDataReceived function callback (which from what I see it's called from within the USB_UBM_VECTOR ISR). So my question are:

    1) Should I disable interrupts on exit from within the callback function (USBHID_handleDataReceived) even though it's not a "proper" ISR but a function called from within an ISR?

    2) The ISR which calls USBHID_handleDataReceived also calls __bic_SR_register_on_exit(LPM3_bits); if a certain condition is met. So if the answer to 1 is YES then what effect can multiple calls to __bic_SR_register_on_exit() have? Is this valid or will it cause problems?

    3) Specifically to the USB example; According to the USB API reference I can see that it's possible to disable/enable specific USB events thus enabling/disabling the USBHID_handleDataReceived callback. Would this be preferred over disabling interrupts? Wouldn't this cause data loss?

    Thanks very much for your help.

  • Thanks!
    I can see how this can work as well. But I actually wanted to know/understand the pattern Katie pointed out.
  • Itay Zafrir said:

    1)

    "At this point any interrupt flags that were set while interrupts were disabled will be serviced/cause a call to the ISR"

    I just want to make sure I understand this correctly, so when interrupts are disabled and lets say an interrupt does occur, it's ISR will be fired once interrupts are re-enables right? Meaning I will not "miss" any interrupts?

    You could still "miss" an interrupt if multiple of the same interrupt came in before you serviced them - they aren't pushed onto some stack or anything. Meaning, while you have interrupts disabled, if you have the interrupt come in twice, the ISR when you re-enable interrupts will only be serviced once. This is because all it it is going off of is the IFG, so if the IFG is already set and another interrupt comes in, you can't know. This is why if you have some time critical interrupts that need to continue being serviced, you wouldn't want to globally disable all interrupts, instead maybe just disabling the specific interrupt for your particular critical code section (the one for some_isr). Or you do something more like Leo's suggestion - there are definitely multiple ways to handle this and it kind of depends on the nature of your project which way will make the most sense for you.

    Please also note that if you re-enable interrupts and multiple different interrupts have come in while they were disabled, the ISRs are going to get serviced in the priority order not the order they came in - you can find this list of interrupt priority in the datasheet - on msp430f5529 for example www.ti.com/lit/gpn/msp430f5529 it is on p. 21 Table 4 - the right column shows the priority, with reset and NMIs for faults as highest priorities.

    Itay Zafrir said:
    2) If i choose NOT to enter LPM this shouldn't have any effect on the general pattern you posted, am I correct?

    The only difference this makes in the above example is that if you go to LPM you are not executing code in the while loop - you are basically waiting for the interrupt and at that point will wake. Without going to LPM you'll just be looping constantly, but doing nothing while x = 0 because you aren't entering the if statement. So not much difference (other than power consumption), unless you have other code going on outside that if-statement. The LPM is useful if you specifically want to wait for an interrupt and don't have other things to handle, you can just go to LPM and wait instead of wasting power spinning in an effectively empty loop. So depends on your application and what matters to you, what you'd want to do.

    A lot of typical MSP430 applications are trying to be as low power as possible and so they spend most of their time in LPM, and have a while loop with a switch statement or series of if's in it - whenever the part wakes after servicing an ISR, the switch/conditionals check variables to see if you want to go to a new part of your software state machine, does any appropriate handling, then goes back to sleep.

    An app note you may want to look at is www.ti.com/lit/pdf/slaa294 MSP430 Software Coding Techniques. It has a flow chart in section 2 and discussion of this type of interrupt-driven code architecture.

    Itay Zafrir said:

    3) Now I'd like to turn the above into a more of a real example. I'm using the TI's USB developers stack. And lets say that "some_isr" is actually     a USBHID_handleDataReceived/USBCDC_handleDataReceived function callback (which from what I see it's called from within the USB_UBM_VECTOR ISR). So my question are:

    1) Should I disable interrupts on exit from within the callback function (USBHID_handleDataReceived) even though it's not a "proper" ISR but a function called from within and ISR?

    2) The ISR which calls USBHID_handleDataReceived also calls __bic_SR_register_on_exit(LPM3_bits); if a certain condition is met. So if the answer to 1 is YES then what effect can multiple calls to __bic_SR_register_on_exit() have? Is this valid or will it cause problems?

    3) Specifically to the USB example; According to the USB API reference I can see that it's possible to disable/enable specific USB events thus enabling/disabling the USBHID_handleDataReceived callback. Would this be preferred over disabling interrupts? Wouldn't this cause data loss?

    I am not 100% sure about multiple __bic_SR_register_on_exit calls in the same ISR, but I'd probably try to avoid it because I'm not sure if it could cause problems. The USB example you bring up is definitely a more difficult one because of its more unusual interrupt handling. One thing that might make it easier would be to instead move your interrupt disabling to right before your critical code section, just to eliminate the ISR confusion here - put it like:

        while(1)
        {
            //other code handling
    
            __disable_interrupt();
            if (x)
            {
                //do some work
                x = 0;
            }
            __enable_interrupt();
        }
    

    This is probably good just to keep interrupts only disabled for the smallest window of time possible anyway, if you have other things going on in your system.

    For your point #3, in the case of the USB stack I believe it would be recommended to disable a particular event rather than disabling the USB interrupt itself in this case. so instead of __disable_interrupt you might want to use USB_setEnabledEvents() instead to enable/disable the event. Or you can add some sort of event counter into your event handling more similar to what Leo suggested if you are worried about missing something - but I have a feeling you could have problems anyway if you have 2 events come in (e.g. receiving USB packets) in that you don't want your received data to get overwritten. Best to keep everything as short and efficient as possible to avoid this.

    Hope this helps,

    Katie

  • Thanks a million Katie.
  • multiple __bic_SR_register_on_exit calls are essentially the same as multiple x&=~y calls - they modify the stack.stored status register and clear a bit that was already clear. However, this intrinsic can only be called in the ISR itself, not in a function that is called by the ISR. Because only while compiling the ISR, the compiler knows where on the stack is the value to modify.

    If you want to be sure you always have the number of main loops matching the number of interrupt events, then the counter approach is the one to go. The decrement of the counter variable (as long as it is char or int) is atomic, the volatile ensure sit is properly handled by main even if the ISR modifies it int he background.
    If you only want to ensure that after an interrupt (or multiple ones), the whole main loop is executed at least once (e.g. after a button press), then disabling and re-enabling interrupts around the if check and flag reset is sufficient. It does, however, depend on your application and whether there are other ISRs that might need to be active all the time.
  • My rule of thumb is to never use disenable and re-enable interrupts.

    Only if it’s expressly needed, for example when a –very short- critical operation starts which absolutely not may be interrupted or if the following code can generate an unwanted interrupt himself and needs to be cleared first, otherwise it’s not done.

    And certainly not for a Button operations which is that slow that it easy can be handled without the use of interrupts or changing its enabling, only the exit from LPM and start a Main cycle is here useful.

  • I second that. Usually, main and ISRs should be written in a way that they synchronize fine.
    Exceptions are when both use a long or float variable (as altering this is not a monolithic operation), or right before entering LPM, so the interrupt won't happen when the IE bit is set but LPM hasn't been entered yet. And while writing to flash.

**Attention** This is a public forum