Help on dealing with USCI39 on a 5xx MSP430

Patrick Allison

Other Parts Discussed in Thread: MSP430F5335

Hi:

I'm really struggling to understand how to deal with USCI39 on a 5xx MSP430. Are there any more technical details available anywhere on this errata, or are there any TI engineers more familiar with this errata? (Specifically what "unpredictable code execution" means?)

I've been observing situations where, when attempting to return from an interrupt handler (which sets GIE, obviously), the MSP430 apparently jumps to a completely wrong location. Setting a breakpoint at the 'RETI' instruction and attempting to single-step afterwards results in the FET returning an "Unknown State" error. SR has been set to the correct value (the word at the previous top of stack) and SP has been incremented by 4, but PC is something else entirely. That sounds like the "unpredictable code execution" bit in USCI39, although it's much more reproducible than that errata would imply.

So attempting to work around this, I ran into a stumbling block. Specifically, the errata says that you should

"Disable the UCSTTIFG, UCSTPIFG and UCNACKIFG before the GIE is set. After GIE is
set, the local interrupt enable flags can be set again."

However, there are two cases where this is impossible: first, when the uC is deciding whether or not to go to sleep, and second, in an interrupt handler. In the first case, it needs to:

1) Disable interrupts
2) Check whatever condition it needs to in order to decide whether or not to go to sleep
3) Set GIE and some of the LPM bits (e.g. GIE + CPUOFF) atomically

After step 3, it obviously can't re-enable the local interrupt flags, because it's no longer running.
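The three steps can be modelled on a host to show where the problem sits. This is only a sketch: `sr_model`, `disable_interrupts_model`, `set_sr_bits_model` and `sleep_if_idle` are hypothetical names, and a plain variable stands in for SR. On the device, steps 1 and 3 would be a DINT and a single write of GIE plus the LPM bits to SR.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define GIE_BIT     0x0008u   /* general interrupt enable bit in SR */
#define CPUOFF_BIT  0x0010u   /* CPU off bit in SR (one of the LPM bits) */

/* Host-side stand-in for the status register (assumption, not device code). */
static uint16_t sr_model;

/* Hypothetical helpers standing in for DINT and "BIS #bits, SR". */
static void disable_interrupts_model(void) { sr_model &= (uint16_t)~GIE_BIT; }
static void set_sr_bits_model(uint16_t bits) { sr_model |= bits; }

/* Returns true if the model "went to sleep". */
static bool sleep_if_idle(bool work_pending)
{
    disable_interrupts_model();                 /* step 1 */
    assert((sr_model & GIE_BIT) == 0);          /* check runs with GIE clear */

    if (work_pending) {                         /* step 2 */
        set_sr_bits_model(GIE_BIT);             /* stay awake, interrupts back on */
        return false;
    }

    set_sr_bits_model(GIE_BIT | CPUOFF_BIT);    /* step 3: one write sets both */
    /* On the device the CPU halts here, so no code can run to restore
     * UCBxIE until an interrupt has already been taken. */
    return true;
}
```

Because GIE and CPUOFF are set by the same write, there is no instruction slot between "GIE set" and "CPU stopped" in which the local enable bits could be restored.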

I can't think of any way out of this, other than "no low power mode." Is there something I'm missing here?

In the second case, obviously RETI pops off SR, reenabling interrupts, and then the return PC. So I guess you could do something like test UCBxIE for (UCSTTIE/UCSTPIE/UCNACKIE), preserve them, clear UCBxIE, reenable interrupts, and then restore UCBxIE before returning from interrupt. That is,

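tst.b  &UCBxIE      ; any UCBxIE bits set?
jz     do_reti      ; none enabled: return directly
push.b &UCBxIE      ; save the enabled bits
clr.b  &UCBxIE      ; mask all USCI interrupts
eint                ; set GIE while they are masked
nop
pop.b  &UCBxIE      ; restore the saved enable bits
do_reti:
reti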

for every interrupt handler. I think this is safe, because although interrupts could pop up between "eint nop pop.b", presumably none of them would modify UCBxIE since they would see it as cleared.

over 10 years ago

0 Jens-Michael Gross over 10 years ago

Guru 227245 points

The solution is simple. For the erratum to trigger, it is necessary that the status IFGs are already set, their IE bits are set and then immediately after setting GIE the IFG bits are cleared by hardware.
The minimum time between setting a status flag and clearing it is known by the minimum bus timings. The critical time window is one MCLK cycle after the EINT/LPM entry.
So if you clear (or handle) all set IFG status interrupts before entering LPM/enabling GIE, they can't be set again and then cleared within this time, unless your MCLK is slower than the I2C baudrate.
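As a rough illustration of the margin described above, the following C sketch compares MCLK with the I2C bit rate. The helper names and the example frequencies are assumptions for illustration. It only encodes the rule of thumb that MCLK should be faster than the bus bit rate, and does not model the actual minimum bus timings.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* True when one MCLK cycle is shorter than one I2C bit time,
 * i.e. when MCLK is faster than the I2C bit rate. */
static bool mclk_faster_than_i2c(uint32_t mclk_hz, uint32_t i2c_bit_rate_hz)
{
    assert(mclk_hz > 0 && i2c_bit_rate_hz > 0);
    return mclk_hz > i2c_bit_rate_hz;
}

/* MCLK cycles that fit into one I2C bit time (integer division, rounds down). */
static uint32_t mclk_cycles_per_i2c_bit(uint32_t mclk_hz, uint32_t i2c_bit_rate_hz)
{
    assert(i2c_bit_rate_hz > 0);
    return mclk_hz / i2c_bit_rate_hz;
}
```

For example, an 8 MHz MCLK with a 400 kHz bus gives 20 MCLK cycles per bit time, while a 32768 Hz MCLK with a 100 kHz bus fails the check.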

0 Patrick Allison over 10 years ago in reply to Jens-Michael Gross

Prodigy 250 points

Hmm - since UCBxIV acts like a priority-encoded UCBxIFG, it seems like there's a pretty straightforward way to do it.

If, at the end of every ISR (including all the I2C code paths that are combined in the I2C ISR), you branch to the I2C ISR, and that ISR uses the "add UCBxIV to PC" trick, then having "reti" at slot 0 will do it automatically. That way, the time between UCBxIFG not being set and sleep is minimal: probably something like 9-10 clock cycles, which means any reasonable MCLK should be fine. (It screws up interrupt priorities, obviously, but whatever.)

Then for entering LPM, you could really sleaze it and do:
1) disable interrupt
2) check to see if we can sleep
3) push #(address of instruction after LPM)
4) push LPM + GIE
5) branch to I2C ISR just like the normal interrupts do

Then the 'reti' at the end of the I2C ISR restores the stack pointer and automatically enters LPM and sets GIE. Decently large overhead for entering LPM, but at least it's safe.

0 Jens-Michael Gross over 10 years ago in reply to Patrick Allison

Guru 227245 points

The problem happens if the CPU starts granting an interrupt (waking CPU, storing return address etc). At the end of this process, it fetches the ISR address from the interrupt vector table. If during these few clock cycles, where you can't do anything in code, the interrupt has been cleared by hardware, the CPU doesn't know which interrupt vector to fetch. Probably reading the first vector of the table, or whatever. At least this is what I think is happening.

Putting the switch over xxxIV in a do/while(1) loop and using case 0 for the reti is a common practice to save some CPU time. Yes, it spoils interrupt priorities a bit. But then, you are in this ISR only because no other higher priority interrupt was pending. And you should keep your ISR code as short as possible anyway.
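The loop-over-xxxIV pattern can be sketched as a host-side model. Everything here is an assumption for illustration: `pending` stands in for the flag register, `read_iv_model` imitates a vector register that returns the highest-priority pending source and clears it, and the vector numbering (2 * (n + 1) for flag n) is a simplification rather than the real UCBxIV encoding.

```c
#include <assert.h>
#include <stdint.h>

/* Host-side model (not device code): bit n of `pending` stands for one
 * interrupt flag, with bit 0 having the highest priority. */
static volatile uint8_t pending;
static unsigned handled_count;   /* flags serviced, for illustration */

/* Models a read of an xxxIV register: returns 0 if nothing is pending,
 * otherwise 2 * (n + 1) for the highest-priority flag n, which it clears. */
static uint16_t read_iv_model(void)
{
    for (unsigned n = 0; n < 8; n++) {
        if (pending & (1u << n)) {
            pending &= (uint8_t)~(1u << n);
            return (uint16_t)(2u * (n + 1u));
        }
    }
    return 0;
}

/* ISR body: keep dispatching until the vector register reads 0. */
static void isr_body_model(void)
{
    for (;;) {
        switch (read_iv_model()) {
        case 0:
            return;             /* nothing left: on the device, RETI follows */
        default:
            handled_count++;    /* placeholder for the per-flag handler */
            break;
        }
    }
}

/* Usage: three flags pending, so three handler passes before returning. */
static void isr_demo(void)
{
    pending = 0x15u;
    handled_count = 0;
    isr_body_model();
    assert(handled_count == 3 && pending == 0);
}
```

The point of the loop is that flags raised while the ISR is running are serviced in the same invocation, so the ISR only returns once the vector register reads 0.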

Your approach of a forced fake entry into ISR isn't bad. However, messing with the stack may interfere with compiler code optimization, so it needs to be done with inline assembly.
However, it alone doesn't save you. If the ISR exits, it may be that a different ISR is executed and during this, the USCI interrupt is set. Now the other ISR exits and right when the CPU tries to call the USCI ISR, the hardware clears the interrupt and you're hosed again. To make it bullet-proof, it seems to be getting really complex.
Luckily, I never had to deal with these interrupts at all, as in all my projects, the MSP was the (single) master :)

0 Greg Dunn over 10 years ago in reply to Jens-Michael Gross

Intellectual 475 points

I seem to be running into some of the same issues with the MSP430F5335. Have you had luck with any of the previously posted approaches in solving the USCI39 problem in your application?

I am currently not using low power mode but will be as I complete my project. I am however having some strange behavior like a "calla #FUNCTION" jumping to 2 bytes past the "FUNCTION" target address and a "popm.a #4,R11" instruction either executing twice or being skipped. Both of these of course cause very bad things to happen. These strange behaviors always seem to occur after my portENABLE_INTERRUPTS macro executes. I am using FreeRTOS in my application and the portENABLE_INTERRUPTS macro is defined in the FreeRTOS port modules. I created a custom version of portENABLE_INTERRUPTS to deal with USCI39:

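#define portENABLE_INTERRUPTS() do { \
    UCB1IE &= ~(UCNACKIE+UCSTPIE+UCSTTIE); /* mask the I2C status interrupts */ \
    _NOP(); \
    _EINT();                               /* set GIE while they are masked */ \
    UCB1IE |= (UCNACKIE+UCSTPIE+UCSTTIE);  /* unmask them once GIE is set */ \
    _NOP(); _NOP(); _NOP(); _NOP();        /* four extra NOPs, found empirically */ \
} while (0)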

The 4 NOPS at the end seemed to be required to avoid the strange behaviors described above. Adding the exact code as shown in the workaround for USCI39 alone didn't seem to resolve the strange behavior. The portENABLE_INTERRUPTS macro is always called anytime the application needs to enable interrupts.

Have you seen any of the above mentioned strange behaviors in your application?

Do you need to do special handling of the I2C interrupt enable flags on exit of every interrupt where the GIE bit may be set? USCI39 specifies that it applies when "The GIE is set by software (e.g. EINT)". Does this mean that a reti would not be a problem?

Any help would be greatly appreciated.

Thanks!

0 Jens-Michael Gross over 10 years ago in reply to Greg Dunn

Guru 227245 points

To answer the last question first: a RETI is a little bit different from an EINT or DINT. The latter two are actually instructions that alter the status register. They take two CPU cycles: one for reading the instruction and one for setting or clearing the flag (by using the constant generator, as GIE is #8, this is a pure register operation). But during the second cycle, the CPU already fetches the next instruction, while enabling interrupts may also cause an interrupt to be granted at the same time. For clearing GIE, this may cause an interrupt to still occur after the instruction following the DINT. For setting GIE, it may cause other problems, such as a debugger triggering a breakpoint right after entering LPM while the CPU is in sleep mode and no waking interrupt has occurred yet. Or USCI39.

In case of USCI39, the problem is that the CPU may set GIE while a USCI interrupt is still pending. But while the CPU is granting the interrupt, the I2C hardware clears the interrupt itself (e.g. UCSTPIFG is cleared because a start has been detected, which clears the pending stop interrupt).

My guess is that the enabling (and therefore granting) of the interrupt happens at exactly the same time as the USCI performs its interrupt logic update, causing a race condition. The question is: what is 'unpredictable code execution'? Personally, I'd expect perhaps an ISR call where there is no more interrupt pending (which is perhaps unpredictable as it may happen or not, but no problem for a properly written ISR).

But maybe the interleaving I described before may get messed-up, causing a return address pushed on stack that is a word short of what it should be - or the execution of a fetched instruction being skipped/replaced by the ISR.
Anyway, one NOP should be sufficient, I don't know why you need 4 NOPs.

Personally, I didn't have the problem at all. I don't use these interrupts. I never had to implement an I2C slave in my projects, and none of the projects so far required a background I2C handling (they can't continue with the code anyway until the data was received, so polling is fine). So USCI39 didn't hit me yet.

P.S.: don't use '+' for combining bits. '|' is the proper bit operator. While an arithmetic operation usually yields the same result, that is not always the case, especially when dealing with predefined bit combinations. Adding a bit twice does something completely different than setting the bit with a bitwise OR (it sets the next higher bit instead, if not already set, etc.). Best case, it is just bad style; worst case, it is a hard-to-track source of bugs.
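A small C sketch of the difference, using made-up flag values (`FLAG_A`, `FLAG_B` and `FLAG_AB` are hypothetical and not taken from any device header). It shows what happens when a predefined combination already contains the bit being added.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical bit values for illustration only. */
#define FLAG_A   0x01u
#define FLAG_B   0x02u
#define FLAG_AB  (FLAG_A | FLAG_B)   /* a predefined combination */

/* Bitwise OR: repeating a bit is harmless. */
static inline uint8_t combine_or(void)  { return (uint8_t)(FLAG_AB | FLAG_A); }

/* Addition: the repeated bit carries into the next position. */
static inline uint8_t combine_add(void) { return (uint8_t)(FLAG_AB + FLAG_A); }

/* The two results differ as soon as a bit appears twice. */
static void show_difference(void)
{
    assert(combine_or()  == 0x03u);   /* both bits still set */
    assert(combine_add() == 0x04u);   /* both bits lost, bit 2 set instead */
}
```

With OR the result stays 0x03, whereas the addition yields 0x04, which clears both intended bits and sets an unrelated one.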

0 Patrick Allison over 10 years ago in reply to Jens-Michael Gross

Prodigy 250 points

Without knowing what the actual problem is, I don't see how you can think RETI wouldn't be a problem too. EINT modifies the status register, but so does RETI - it pops it off the stack.

In the end for development I abandoned low power mode, and then for production switched to an MSP430 with an eUSCI, which seems far less broken. It's unfortunate that the documentation for the USCI39 bug is so poor, but I just couldn't trust that I understood how to work around it.

0 Jens-Michael Gross over 10 years ago in reply to Patrick Allison

Guru 227245 points

RETI is an internal multi-cycle operation. After fetching the status register back from the stack, it also does some more things, like merging bits 16-19 into the return address, before it fetches the next instruction. During this, the interrupt handling mechanism is disabled, so the conflicting race condition of enabling and granting interrupts and clearing them can't happen.
This was already necessary to not mess-up the stack when an NMI occurs during a RETI, or cause a stack overflow by nesting interrupts.
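The return-address step mentioned above can be sketched in C. This assumes the interrupt stack frame of the extended (20-bit) CPU as I understand it from the family user's guide: one word holds PC bits 15:0, and the saved SR word carries the upper four PC bits in its top nibble with SR in the lower twelve bits. The function names are hypothetical, and this only models the bit layout, not the cycle-level behaviour of RETI.

```c
#include <assert.h>
#include <stdint.h>

/* Rebuild the 20-bit return address from the two words of the interrupt
 * stack frame. Assumed layout: pc_low holds the lower 16 PC bits, and
 * sr_word holds the upper 4 PC bits in its top nibble. */
static uint32_t frame_return_pc(uint16_t sr_word, uint16_t pc_low)
{
    uint32_t pc = ((uint32_t)(sr_word >> 12) << 16) | pc_low;
    assert(pc <= 0xFFFFFu);   /* result is a 20-bit address */
    return pc;
}

/* Status register bits restored by RETI in the same assumed layout. */
static uint16_t frame_restored_sr(uint16_t sr_word)
{
    return (uint16_t)(sr_word & 0x0FFFu);
}
```

For a saved SR word of 0x1008 and a low PC word of 0x2346, this gives a return address of 0x12346 and a restored SR of 0x0008 (GIE set).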

0 Greg Dunn over 10 years ago in reply to Patrick Allison

Intellectual 475 points

We are at the point where we have about 5 different boards using the MSP430F5335 and would really like to not have to start over and rev all of these boards to use a different part if possible. I definitely understand your solution, however. We have gotten the TI systems team involved and are hoping to get a firm answer back as to whether the RETI instruction can actually trigger USCI39 or not. I have added some additional high-frequency, externally triggered interrupts that occur asynchronously to MCLK to my application to try to duplicate any problem that the RETI instruction may trigger. I have been running a test on 4 boards for the past week and haven't been able to duplicate anything so far. At this point, it seems that the published USCI39 workaround plus 4 additional NOPs corrects the problem. If you would like to review any more detail on the exact issues that I have seen, you can search for the "MSP430F5335 Crashing" thread. Thank you very much for your comments!
