MSP430FR5989: Unexpected Timer B behavior where an interrupt is lost

Part Number: MSP430FR5989

Hi,

I am seeing unexpected Timer B behavior that causes a watchdog (WDT) reset. The microcontroller I am using is an MSP430FR5989.

CONTEXT:
The microcontroller periodically calls a routine that counts the number of SMCLK ticks that fit in one ACLK period.
This routine first disables interrupts and "pauses" a constantly running DMA channel by putting Timer A (sourced by ACLK), which is the module that triggers the DMA, into capture mode.
This prevents the timer from setting CCIFG, since the CM field of its CCTL register is set to No Capture.

Once the DMA is paused, the system watchdog is started with a timeout of about 2ms, which is more than enough for
this routine's execution time (ACLK freq = 32kHz -> ACLK period ~ 30us).
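
As a quick check of that margin, here is a minimal sketch of the arithmetic. It assumes ACLK is exactly 32768 Hz (the post only says "32kHz") and that WDTIS = 111b selects the divide-by-64 watchdog interval; `cycles_to_us` and the two macros are hypothetical names used only for illustration.

```c
#include <assert.h>
#include <stdint.h>

/* Assumption: ACLK comes from a 32768 Hz watch crystal ("32kHz" in the post). */
#define ACLK_HZ 32768u

/* Assumption: WDTIS = 111b (WDTIS2 | WDTIS1 | WDTIS0) means watchdog clock / 64. */
#define WDT_DIVIDER_WDTIS_7 64u

/* Convert a number of clock cycles to microseconds, rounding down. */
static uint32_t cycles_to_us(uint32_t cycles, uint32_t clock_hz)
{
    assert(clock_hz != 0u);
    return (uint32_t)(((uint64_t)cycles * 1000000u) / clock_hz);
}
```

Under these assumptions, `cycles_to_us(WDT_DIVIDER_WDTIS_7, ACLK_HZ)` comes to roughly 1953 us and `cycles_to_us(1u, ACLK_HZ)` to roughly 30 us, so the watchdog window is 64 ACLK periods long while the measurement itself should need only two or three.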

The SMCLK tick-counting code is then executed. It is implemented as follows:

__no_init static volatile UCHAR WDTResetCause;
UINT16 CountTicks( void )
{
	UINT16 startTA0R, stopTA0R, deltaTA0R;
	istate_t local_istate = __get_interrupt_state();
	__disable_interrupt();

	TA1CTL_bit.TAIFG = false;
	TA1CCTL0_bit.CAP = true;

	WDTResetCause = 0xAA;
	WDTCTL = WDTPW | WDTCNTCL | WDTSSEL0 | WDTIS2 | WDTIS1 | WDTIS0;

	TB0CCR1 = TB0R + 1;
	TB0CCTL1_bit.CCIFG = false;

	while ( ! TB0CCTL1_bit.CCIFG ) {
	}

	startTA0R = TA0R;
	TB0CCR1 += 1;
	TB0CCTL1_bit.CCIFG = false;

	while ( ! TB0CCTL1_bit.CCIFG ) {
	}

	stopTA0R = TA0R;
	deltaTA0R = stopTA0R - startTA0R;

	WDTCTL = WDT_STOP;
	WDTResetCause = 0x00;

	TA1CCTL0_bit.CAP = false;
	__set_interrupt_state(local_istate);
	
	return deltaTA0R;
}

Timer A1 configuration:

	TA1CTL = TASSEL__ACLK		
	         | ID__1			
	         | MC__CONTINUOUS	
	         | TACLR;
	         
	TA1CCTL0 = !CAP		
	           | OUTMOD_0	
	           | !CCIE;
	
	TA1EX0 = TAIDEX_0;

Timer B0 configuration:

	TB0CTL = TBSSEL__ACLK		
	         | ID__1			
	         | MC__CONTINUOUS	
	         | TBCLGRP_0		
	         | CNTL_0			
	         | TBCLR;

	TB0CCTL1 = !CAP	
	           | OUTMOD_0
	           | CLLD_0
	           | !CCIE;	
	
	TB0EX0 = TBIDEX_0;

DMA Configuration:

	DMA0CTL_bit.DMAEN = false;				

	DMACTL4_bit.DMARMWDIS = true;		

	DMACTL0 = ( DMACTL0 & ~DMA0TSEL_31 )
	          | DMA0TSEL__TA1CCR0;			

	DMA0CTL |= !DMADT0						
	           |  !DMADSTINCR0 | !DMADSTINCR1
	           |  DMASRCINCR0 | DMASRCINCR1	
	           |  !DMASRCBYTE				
	           |  !DMADSTBYTE				
	           |  !DMALEVEL					
	           |  !DMAIFG					
	           |  !DMAIE					
	           |  !DMAABORT					
	           |  !DMAREQ;					

ERROR DESCRIPTION:

A couple of our devices that use this microcontroller (MSP430FR5989) have been reset by the watchdog started in this routine. This implies that either the Timer B interrupt
flag was lost, or the code between the initial write to the timer's compare register and its increment took longer than 30us (1 ACLK tick). Either way, the program would have to wait nearly 2 seconds
for the timer to wrap around, which triggers a watchdog reset.
In addition, I have read the microcontroller's errata, looking for a possible explanation of this lost interrupt flag, and I found that, per erratum DMA7,
a module interrupt can be lost if a DMA request starts executing during a read-modify-write access to that module's register containing the interrupt flag.

FYI, the devices in which the failure occurred have only this routine in common; the rest of their firmware and their hardware are completely different and independent. Also,
they had been running without problems for a long time (1.5 years now).
Many other devices run the exact same firmware as these without showing this problem, so this is not really a recurrent issue.

I tried to reproduce the error by recreating the conditions described in erratum DMA7, but I had no luck.

QUESTIONS:
Could any of the theories described above (particularly erratum DMA7) be the cause of this error? If so, how could I reproduce it?
Is there anything else I could try?

ADDITIONAL INFORMATION:
DCO freq = 8MHz
MCLK and SMCLK freq = 4MHz
ACLK freq = 32kHz
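
For reference, a small sketch of the count the routine should return with these clocks. It assumes TA0 is clocked from SMCLK without a divider (its configuration is not shown here) and that ACLK is exactly 32768 Hz; `expected_ticks` and the macros are hypothetical names.

```c
#include <assert.h>
#include <stdint.h>

/* Values from the post; the exact 32768 Hz figure for ACLK is an assumption. */
#define SMCLK_HZ 4000000u
#define ACLK_HZ  32768u

/* Expected number of fast-clock ticks in one slow-clock period, rounded to nearest. */
static uint32_t expected_ticks(uint32_t fast_hz, uint32_t slow_hz)
{
    assert(slow_hz != 0u);
    return (fast_hz + slow_hz / 2u) / slow_hz;
}
```

With the frequencies above, `expected_ticks(SMCLK_HZ, ACLK_HZ)` is about 122, so a healthy run of the routine should return a delta close to that value.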

__no_init static volatile UCHAR WDTResetCause;
UINT16 CountTicks( void )
{
	UINT16 startTA0R, stopTA0R, deltaTA0R;
	istate_t local_istate = __get_interrupt_state();
	__disable_interrupt();

	TA1CTL_bit.TAIFG = false;
	TA1CCTL0_bit.CAP = true;

	WDTResetCause = 0xAA;
	WDTCTL = WDTPW | WDTCNTCL | WDTSSEL0 | WDTIS2 | WDTIS1 | WDTIS0;

	TB0CCR1 = TB0R + 2;
	TB0CCTL1_bit.CCIFG = false;

	while ( ! TB0CCTL1_bit.CCIFG ) {
	}

	startTA0R = TA0R;
	TB0CCR1 += 1;
	TB0CCTL1_bit.CCIFG = false;

	while ( ! TB0CCTL1_bit.CCIFG ) {
	}

	stopTA0R = TA0R;
	deltaTA0R = stopTA0R - startTA0R;

	WDTCTL = WDT_STOP;
	WDTResetCause = 0x00;

	TA1CCTL0_bit.CAP = false;
	__set_interrupt_state(local_istate);
	
	return deltaTA0R;
}

  • >  TB0CCR1 = TB0R + 1;
    > TB0CCTL1_bit.CCIFG = false;

    I'm pretty sure there's a race here. If ACLK happens to tick between these two statements, you'll lose the IFG and have to wait for the timer to cycle (2s). The similar sequence that follows has (most of) the full 30us to run, but this one is subject to (unknown) initial conditions.

    I think that just re-ordering these two statements doesn't help, since TB0R has to "count to" CCR1 to trigger the compare match. I suggest you increment by +2 rather than +1, since (within limits) you don't really care when the first trigger happens, only the delta between the first and second.

  • Hi Bruce,

    Thanks for your quick response.

    Yes, I agree, there's a race in the code that I uploaded here. However, the actual code uses TB0CCR1 = TB0R + 2. I must have made a mistake when transcribing the code.

    I edited my post.
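
The race described in the first reply can be illustrated with a small host-side model. This is only a sketch: `timer_model_t`, `timer_tick` and `ticks_until_flag` are hypothetical names, the model latches the flag only when the counter counts up to the compare value, and it ignores all real-hardware synchronization details.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Toy model of a free-running 16-bit timer with one compare channel. */
typedef struct {
    uint16_t counter;  /* stands in for TB0R */
    uint16_t compare;  /* stands in for TB0CCR1 */
    bool     flag;     /* stands in for CCIFG */
} timer_model_t;

/* One timer clock: count up (wrapping at 16 bits) and latch the flag on a match. */
static void timer_tick(timer_model_t *t)
{
    t->counter = (uint16_t)(t->counter + 1u);
    if (t->counter == t->compare) {
        t->flag = true;
    }
}

/*
 * Arm the compare `margin` counts ahead, optionally let one clock edge slip in
 * before the flag is cleared, then return how many ticks the polling loop waits.
 */
static uint32_t ticks_until_flag(uint16_t margin, bool tick_before_clear)
{
    timer_model_t t = { .counter = 1000u, .compare = 0u, .flag = false };
    uint32_t waited = 0u;

    assert(margin >= 1u);
    t.compare = (uint16_t)(t.counter + margin);  /* TB0CCR1 = TB0R + margin */
    if (tick_before_clear) {
        timer_tick(&t);                          /* a clock edge wins the race */
    }
    t.flag = false;                              /* CCIFG = false */

    while (!t.flag) {                            /* while (!CCIFG) {} */
        timer_tick(&t);
        waited++;
    }
    return waited;
}
```

In this model, a margin of 1 with a tick landing between the two statements makes the loop wait 65536 ticks (a full wrap, about 2 s at 32768 Hz), while a margin of 2 waits only 1 or 2 ticks regardless of whether that tick occurs.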
