MSP430FR5739: Data corruption when reading timer capture register

Greg Fundyler

Part Number: MSP430FR5739
Other Parts Discussed in Thread: MSP-EXP430FR5739

Hi,

I am trying to do timer capture on TB1. This timer is running off of SMCLK (24 MHz DCOCLK) and should be synchronous with the CPU. I am using an asynchronous external signal to do the capture on CCR0 from CCI0A on P2.2. The external signal can be very fast, on the order of 50 kHz and up.

I have been researching the SCS synchronization mode and when/how it should be used. (Refer to this and this.) In my case, since the timer is using the same clock source as the CPU, I can read TB1R directly in software without needing to do any tricks (e.g. software-triggered capture or consecutive reads of TB1R). For my CCR0 external capture, it seems that I should be using SCS to avoid asynchronous captures while TB1R is (potentially) transitioning. So my timer is configured as follows:

TB1CTL = TBSSEL_2 | MC_2;               // use SMCLK, continuous mode
TB1CCTL0 = CM_3 | CCIS_0 | SCS | CAP;   // both edges, CCI0A, synchronize, capture mode

Indeed I have had no trouble reading TB1R anytime. But if I read TB1CCR0 in a tight loop with no regard for CCIFG, I will sometimes get data corruption. The following code will fail after around 10,000 - 200,000 iterations:

void wait_for_halt(void) {
    uint16_t timer_current, timer_captured, timer_delta;
    do {
        timer_captured = TB1CCR0;
        timer_current = TB1R;
        timer_delta = timer_current - timer_captured;
    } while(timer_delta < 300);
}

But this always works by testing CCIFG:

void wait_for_halt(void) {
    uint16_t timer_current, timer_captured, timer_delta, timer_flag;
    do {
        TB1CCTL0 &= ~(CCIFG | COV);
        timer_captured = TB1CCR0;       // this read does NOT clear CCIFG
        timer_flag = TB1CCTL0 & CCIFG;  // timer_captured is invalid if CCIFG appeared during read
        timer_current = TB1R;
        timer_delta = timer_current - timer_captured;
    } while(timer_flag != 0 || timer_delta < 300);
}

And this works by reading the register twice:

void wait_for_halt(void) {
    uint16_t timer_current, timer_captured, timer_redundant, timer_delta;
    do {
        timer_captured = TB1CCR0;
        timer_redundant = TB1CCR0;
        timer_current = TB1R;
        timer_delta = timer_current - timer_captured;
    } while(timer_captured != timer_redundant || timer_delta < 300);
}
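The double-read idea above can be factored into a small helper. What follows is only a sketch that runs on a host PC: `reg_read_fn`, `read_stable_u16`, and the `fake_ccr_read` stand-in are hypothetical names, not part of any TI header. On the target, the read function would simply return TB1CCR0.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical abstraction: one raw read of the capture register.
 * On the target this would just return TB1CCR0. */
typedef uint16_t (*reg_read_fn)(void);

/* Read until two consecutive reads agree, or give up after max_reads reads.
 * Returns true and stores the agreed value in *out on success. */
static bool read_stable_u16(reg_read_fn rd, unsigned max_reads, uint16_t *out)
{
    assert(rd != NULL && out != NULL);
    if (max_reads < 2) {
        return false;               /* need at least one pair to compare */
    }
    uint16_t prev = rd();
    for (unsigned n = 1; n < max_reads; n++) {
        uint16_t cur = rd();
        if (cur == prev) {
            *out = cur;
            return true;
        }
        prev = cur;                 /* compare the next read against this one */
    }
    return false;
}

/* Host-side stand-in for the register (illustrative only): replays a fixed
 * sequence in which the first read has bit 9 flipped. */
static const uint16_t fake_seq[] = { 0x1234u ^ 0x0200u, 0x1234u, 0x1234u };
static size_t fake_idx;

static uint16_t fake_ccr_read(void)
{
    uint16_t v = fake_seq[fake_idx];
    if (fake_idx + 1 < sizeof fake_seq / sizeof fake_seq[0]) {
        fake_idx++;                 /* stay on the last value once reached */
    }
    return v;
}
```

With the replayed sequence, the first pair of reads disagrees and the second pair agrees, so the helper settles on 0x1234 after three reads. The retry limit keeps the loop bounded if the input is so fast that consecutive reads never match.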

My question is: Why do I need to do this? I presume that none of these registers are buffered. That would explain why you need to bend over backwards to read TB1R if using an external clock to drive the timer. But why would I have this issue with TB1CCR0, especially given that the SCS synchronization mode is enabled? I would think that this would cause the register to update synchronously with the timer and CPU; meaning I could just read it anytime. Or is the synchronization causing the register to sometimes update at the exact moment that my read is taking place?

I have experimented with SCS disabled. As expected, this fails regularly and none of the implementations above help.

On a side note, I understand that this example could be implemented by simply watching for transitions on the input and tracking the time in software. But I need to be able to detect a lack of signal immediately upon entry to the function.

over 6 years ago

0 Greg Fundyler over 6 years ago

Intellectual 545 points

It might be worth mentioning that I'm using TI compiler v16.9.1.LTS with -O2 --opt_for_speed=5. But the optimization settings appear to have no effect on this problem as it was present with -Ooff as well.

0 Wei.Jetim Zhao over 6 years ago

TI__Genius 13215 points

Hi Greg,

I'm still checking your issue on my side. But I would like to give you some of my comments so far.

1. In your first code example, why do you read TB1CCR0 repeatedly even when no capture has taken place? According to the device user's guide, TB1CCR0 is loaded with TB1R when a capture occurs.
2. Also, how are you notified of a TB1CCR0 capture overflow if you just read the register repeatedly?

I will check the TB1CCR0 register read/write mechanism with our design team and then reply to you. In the meantime, I would suggest triggering your function from the Timer_B CCIFG, which indicates that a capture has been performed.

0 Greg Fundyler over 6 years ago in reply to Wei.Jetim Zhao

Intellectual 545 points

Hi Wei,

Thank you for looking into my issue. I appreciate your time/effort.

1. I don't check for CCIFG because this is an extra step. I don't care whether or not there has been a new capture. What I am trying to check for is a lack of signal for 300+ timer counts. I do this by calculating (TB1R - TB1CCR0) and waiting for it to be >= 300.

2. I don't care whether there has been a COV condition as long as when I read TB1CCR0 I get the most recent capture, and not an old one. This appears to work as expected.
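The (TB1R - TB1CCR0) calculation in point 1 stays correct across a counter rollover as long as the result is kept at 16 bits. A minimal sketch, with `elapsed_counts` as an illustrative name: on the MSP430 `int` is 16 bits wide, but on a host with a wider `int` the explicit cast is what restores the modulo-65536 behavior.

```c
#include <assert.h>
#include <stdint.h>

/* Counts elapsed from `captured` to `current` on a free-running 16-bit timer.
 * Both operands are promoted to int before the subtraction; the cast back to
 * uint16_t makes the result wrap modulo 65536 on any host. */
static uint16_t elapsed_counts(uint16_t current, uint16_t captured)
{
    return (uint16_t)(current - captured);
}

/* Worked examples, including a rollover between the two values. */
static void elapsed_counts_examples(void)
{
    assert(elapsed_counts(1000u, 700u) == 300u);
    assert(elapsed_counts(5u, 65530u) == 11u);   /* 6 counts to wrap, then 5 */
}
```

The same arithmetic shows why one corrupted bit in the captured value is enough to end the loop early: the difference jumps by the weight of that bit, far past a threshold of 300.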

Regarding triggering on CCIFG: The signal on CCI0A can be very fast. It could be a few MHz. I am afraid that if I trigger on CCIFG, that by the time I read TB1CCR0 my read can coincide with another capture and cause data corruption. Using this corrupt read to do the above calculation could result in a value greater than 300, causing it to exit the loop prematurely. (This is what happens in the first code example.)

I have tested the 2nd and 3rd code examples with signals between 50 kHz and 21 MHz. Both versions remain in the loop until the signal drops to about 40 kHz.
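The 40 kHz figure follows from the threshold of 300 counts. As a rough worked example, assuming a 50% duty cycle (my assumption) and capture on both edges (CM_3), the gap between captures is half a signal period, so the loop exits once the signal frequency falls below timer_hz / (2 * threshold). The function name is illustrative.

```c
#include <assert.h>
#include <stdint.h>

/* Signal frequency (Hz) below which the gap between captured edges reaches
 * threshold_counts timer ticks. Assumes both edges are captured and a 50%
 * duty cycle, so consecutive captures are half a signal period apart. */
static uint32_t halt_threshold_hz(uint32_t timer_hz, uint16_t threshold_counts)
{
    assert(threshold_counts != 0);
    return timer_hz / (2u * (uint32_t)threshold_counts);
}
```

With a 24 MHz timer clock and a threshold of 300 this gives 40000 Hz. The same threshold with a 1 MHz timer clock gives about 1666 Hz, so the threshold has to be rescaled whenever the clock changes.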

Thank you,
Greg Fundyler

0 Wei.Jetim Zhao over 6 years ago in reply to Greg Fundyler

TI__Genius 13215 points

Hi Greg,

Thanks for your feedback.

I'm sorry, I have not yet had time to check your issue with our design team because I am busy with some urgent cases, so I can't give you an answer today.

I will look into your issue tomorrow and report back. Please forgive my late reply.

0 Wei.Jetim Zhao over 6 years ago in reply to Greg Fundyler

TI__Genius 13215 points

Hi Greg Fundyler,

Thanks for your patience. I discussed the TB1CCR0 register read/write mechanism with our design owner today. He is still checking the implementation inside the silicon, because the timer is a mature peripheral that was designed a long time ago.

I apologize again: I will be out of office this Friday, so I will get back to you early next week. I will follow up with our design team as soon as I am back next Monday.

0 Greg Fundyler over 6 years ago in reply to Wei.Jetim Zhao

Intellectual 545 points

Thank you. I look forward to hearing from you next week.

0 Danny F over 6 years ago

Genius 3850 points

"I will sometimes get data corruption. "

what does "data corruption" mean and how do you know that you are getting it?

"The following code will fail after around 10,000 - 200,000 iterations:"

what does "fail" mean?

I think it helps your readers if you lay out what you are trying to do with your code, what you are expecting and what you are getting.

0 Greg Fundyler over 6 years ago in reply to Danny F

Intellectual 545 points

Hi Danny,

"I will sometimes get data corruption. " means that the value I read from the register is neither the last capture, nor the capture before it, but rather an apparently random value. I get the correct value if I simply read the register again.

Now, I say "apparently random" but it isn't really random. I have seen individual bits flip. For example, two consecutive reads of the register will have a difference of exactly 512 (bit 9). My conclusion is that I am sometimes reading the register while it is changing. With SCS enabled and SMCLK as the clock source, I would think that should not happen.
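One way to check the "individual bits flip" observation is to XOR the two reads instead of subtracting them, since an arithmetic difference of 512 does not by itself prove that only bit 9 changed. A small host-side sketch; all function names are illustrative.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Bits that differ between two reads of the same register. */
static uint16_t differing_bits(uint16_t a, uint16_t b)
{
    return (uint16_t)(a ^ b);
}

/* True when exactly one bit is set in mask. */
static bool is_single_bit(uint16_t mask)
{
    return mask != 0 && (mask & (uint16_t)(mask - 1u)) == 0;
}

/* Index (0..15) of the lowest set bit; mask must be nonzero. */
static unsigned lowest_set_bit(uint16_t mask)
{
    assert(mask != 0);
    unsigned i = 0;
    while ((mask & 1u) == 0) {
        mask >>= 1;
        i++;
    }
    return i;
}
```

For two reads of 0x1234 and 0x1034, the differing mask is 0x0200, a single bit at index 9. A mask of 0x0600 (1536) is not a single bit; it covers the two adjacent bits 9 and 10.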

"The following code will fail after around 10,000 - 200,000 iterations:" means the loop exits prematurely after some number of iterations, even though the true value of (TB1R - TB1CCR0) has not exceeded 300. (I have seen this calculation come back in the 22,000 range.) This does not happen in the 2nd and 3rd code example.

What I am trying to do is detect a lack of signal on an input pin, defined as "no transitions for 300 SMCLK cycles." It should exit the loop when the signal drops below ~ 50 kHz. However, what I'm trying to do isn't the point of the post. I don't need code help. I need help understanding the underlying mechanism of the timer peripheral so I can have a better feel for how I should and should not use it. Maybe that information will help others as well.

0 Bruce McKenney47378 over 6 years ago

Guru 91630 points

Have you tried it without the debugger running?

I wrapped this code and put it on an FR5739 Launchpad (Rev 1.1, pretty ancient). With the debugger running, I see frequent failures. If I disable the debugger, it stops happening. My scope has now been waiting for about 15 minutes (maybe 35+million loops) and it hasn't triggered.

I don't know the details, but I'm pretty sure the debugger sticks its nose into your execution periodically, which could cause these glitches.

Just to make sure that I've maintained the spirit of your experiment:

#include <msp430.h> 
#include <stdint.h>
void wait_for_halt(void) {
    uint16_t timer_current, timer_captured, timer_delta;
    do {
        timer_captured = TB1CCR0;
        timer_current = TB1R;
        timer_delta = timer_current - timer_captured;
    } while(timer_delta < 300);
    return;
}

#define LED8    BIT7               // P3.7
#define LED7    BIT6               // P3.6
#define LED6    BIT5               // P3.5
#define LED5    BIT4               // P3.4
#define HZ      1000000UL
int
main(void)
{
	WDTCTL = WDTPW | WDTHOLD;	// stop watchdog timer
	P3OUT &= ~(LED5|LED6|LED7|LED8);
	P3DIR |=  (LED5|LED6|LED7|LED8);

	TB1CTL = TBSSEL_2 | MC_2;               // use SMCLK, continuous mode
	TB1CCTL0 = CM_3 | CCIS_0 | SCS | CAP;   // both edges, CCI0A, synchronize, capture mode
	P2SEL0 |= BIT2;              // P2.2 as TB1.0 (CCI0A)
    P2SEL1 |= BIT2;              // per table 6-42

    P1SEL0 |= BIT2;             // P1.2 as TA1.1
    P1SEL1 &= ~BIT2;            // per Table 6-39
    P1DIR  |= BIT2;
    TA1CCTL1 = OUTMOD_7;        // Reset/Set
#define CAP_HZ  50000UL
    TA1CCR1 = HZ/CAP_HZ/2;
    TA1CCR0 = HZ/CAP_HZ-1;      // 50kHz
    TA1CTL = TASSEL_2 | ID_0 | MC_1 | TACLR;    // SMCLK/1, Up [,Clear]
    //  Now jumper P1.2 to P2.2 to get a signal

	while (1)
	{
	    wait_for_halt();
	    P3OUT ^= LED8;
	}
	/*NOTREACHED*/
	return 0;
}

0 Greg Fundyler over 6 years ago in reply to Bruce McKenney47378

Intellectual 545 points

Interesting! No, admittedly I have not tried it without the debugger running. It had not occurred to me that it could be interfering. I guess the debugger intermittently halts the CPU to do its thing. Injecting that kind of randomness into the timing could make weird stuff happen. But removing that randomness doesn't necessarily mean the problem is solved. It is possible that making a benign change, such as incrementing a global variable each time through the loop, can bring the problem back. I've seen that happen too.

Would you mind running a couple experiments while you have all of this set up, please? Try a few different frequencies between 50 kHz and 1 MHz. And also try generating the signal from something external. You may experience different behavior with an asynchronous clock source.

Also, it looks like you are running MCLK at 1 MHz. Mine is 24 MHz. That should not really matter a whole lot, but it does change the numbers (e.g. 300 doesn't mean 40 kHz anymore).

Thank you for taking the time to reproduce my experiment. This is very informative. But I really should have provided a complete source file to make it easier. My apologies.

I will experiment some more when I am back in the lab next week.

0 Bruce McKenney47378 over 6 years ago in reply to Greg Fundyler

Guru 91630 points

I fooled with this a little more and indeed it does seem to fail at higher clock/trigger rates. I'm still not completely convinced of a "supernatural" explanation, but I'm running out of "natural" ones.

It seems to fail pretty repeatably (though not always the same way) with MCLK >= 8MHz / trigger rate >= 500kHz. The fact that (as you mentioned) adding even unrelated code causes the symptom to change is suspicious and, more to the point, makes diagnosis difficult. I moved it to TB0 and even fooled with the FRAM controller but it made no evident difference.

I put a reasonable facsimile on an FR5969 (Launchpad) and it ran flawlessly at MCLK=16MHz/trigger=1MHz, with or without the debugger. This suggests that there's nothing obviously awry about the code. It also suggests that whatever it is is limited to the FR57 generation.

I should mention that the MCU on my FR5739 Launchpad is an "X" (pre-release) device, but my FR5969 LP has an "M" (honest and for true) device.

I think I'm going to leave it there and see if Wei's Verilog Guy has any suggestions.

Full disclosure: Here's what I was working with:

#include <msp430.h> 
#include <stdint.h>
#define STATS   0               // Don't collect stats (disturbs the results?)
#define SMCLK_MON   1           // Put SMCLK out on P3.4
#define NO_AUTO 0               // Let FRCTL control the wait states
#define CAP_HZ    1000000UL
#define HZ       24000000UL
#if (HZ != 1000000UL) && (HZ != 8000000UL) && (HZ != 24000000UL)
#error "Fix HZ"                 // Typo Alert!
#endif
#if STATS
uint16_t fail_curr, fail_cap, fail_delta;
#endif // STATS
void wait_for_halt(void) {
    uint16_t timer_current, timer_captured, timer_delta;
    do {
        timer_captured = TB0CCR0;
        timer_current = TB0R;
        timer_delta = timer_current - timer_captured;
    } while(timer_delta < 400); //(2*HZ/CAP_HZ)); // 300);
#if STATS
    fail_cap = timer_captured;
    fail_curr = timer_current;
    fail_delta = timer_delta;
#endif // STATS
    return;
}

void
clk_init(void)
{
#if NO_NAUTO        // Clear NAUTO and explicitly set wait states
    // Wait states (NAUTO=0) per UG Table 5-1
#if HZ == 1000000UL
    FRCTL0 = FWPW | NACCESS_0 | NPRECHG_0;  // was 0/0
#elif HZ == 8000000UL
    FRCTL0 = FWPW | NACCESS_0 | NPRECHG_0;  // was 0/0
#elif HZ == 24000000UL
    FRCTL0 = FWPW | NACCESS_2 | NPRECHG_1;  // was 2/1
#else
#error "clk_init: Fix Hz"
#endif
#endif // NO_AUTO

    //  At startup DCO=8MHz, but with clock dividers of /8.
#if HZ == 1000000UL
    // Already there
#else // Need to set something
    CSCTL0_H = 0xA5;
#if HZ == 24000000UL
    CSCTL1 |= DCORSEL;                      // In addition to DCOFSEL_3
#endif // HZ=24M
    CSCTL3 = DIVA_0 + DIVS_0 + DIVM_0;        // set all dividers to 0
#endif  // HZ==1M
#if SMCLK_MON
    P3SEL0 |= BIT4;                 // SMCLK out (yeah there's an LED there too)
    P3SEL1 |= BIT4;                 // per table 6-47
    P3DIR  |= BIT4;
#endif
    return;
}
#define LED8    BIT7               // P3.7
#define LED7    BIT6               // P3.6
#define LED6    BIT5               // P3.5
#define LED5    BIT4               // P3.4
int
main(void)
{
	WDTCTL = WDTPW | WDTHOLD;	// stop watchdog timer
	P3OUT &= ~(LED5|LED6|LED7|LED8);
	P3DIR |=  (LED5|LED6|LED7|LED8);
	clk_init();

	TB0CTL = TBSSEL_2 | MC_2;               // use SMCLK, continuous mode
	TB0CCTL0 = CM_3 | CCIS_0 | SCS | CAP;   // both edges, CCI0A, synchronize, capture mode
	P2SEL0 |= BIT1;              // P2.1 as TB0.0 (CCI0A)
    P2SEL1 |= BIT1;              // per table 6-42

    P1SEL0 |= BIT2;             // P1.2 as TA1.1
    P1SEL1 &= ~BIT2;            // per Table 6-39
    P1DIR  |= BIT2;
    TA1CCTL1 = OUTMOD_7;        // Reset/Set
    TA1CCR1 = HZ/CAP_HZ/2;
    TA1CCR0 = HZ/CAP_HZ-1;      // period of HZ/CAP_HZ counts (about CAP_HZ)
    TA1CTL = TASSEL_2 | ID_0 | MC_1 | TACLR;    // SMCLK/1, Up [,Clear]
    //  Now jumper P1.2 to P2.1 to get a signal

	while (1)
	{
	    wait_for_halt();
	    P3OUT ^= LED8;
	}
	/*NOTREACHED*/
	return 0;
}

0 Greg Fundyler over 6 years ago in reply to Bruce McKenney47378

Intellectual 545 points

Thank you for doing this detective work. Interesting that it doesn't manifest on all devices. Indeed, I'm looking forward to getting some answers from the chip devs.

0 Danny F over 6 years ago in reply to Greg Fundyler

Genius 3850 points

"My conclusion is that I am sometimes reading the register while it is changing."

unless your register is multiples of 16 bits, any operation on it is atomic.

again, I think you should go back and ask yourself what you are trying to do and how reality differs from your expectation. when that happens, usually the fault is with your expectation.

"What I am trying to do is detect a lack of signal on an input pin"

many ways to do that, depending on your definition of "lack of a signal". if you define a signal to be a full logic transition, the simplest would be to use it to reset a free-running counter - that can be done via external interrupt, or a counter, or an input capture (like what you are trying to do). or resetting a loop counter, as is often done.

there are analog ways of doing things as well.

0 Greg Fundyler over 6 years ago in reply to Danny F

Intellectual 545 points

I think you are misunderstanding what I'm trying to convey. Yes, the register read is an atomic operation. But that does not preclude timing violations in the underlying hardware of the chip.

Are you familiar with the issue of trying to read TB1R when the timer is driven by an asynchronous external clock? You can get garbage because things change asynchronously. Individual bits within the register can be read incorrectly (i.e. the data corruption I was referencing). I believe a similar phenomenon is taking place here.

Yes, you are correct in that there are lots of ways to implement this type of lack-of-signal detection. Most of them are not appropriate for my application because the signal is of such high frequency. A software loop could literally miss all of the high pulses, for example, depending on how the timing of the loop lines up with the timing of the signal.

But once again, all of that is irrelevant. I discovered what I believe to be errata or at a minimum misunderstood behavior of the MSP430 chip. That's what is being explored here.

0 Greg Fundyler over 6 years ago in reply to Bruce McKenney47378

Intellectual 545 points

I thought of something that should aid diagnosis. Convert the wait_for_halt function to the third version of the code from my original post. First off, confirm that it does in fact operate flawlessly. Then change the condition to:

} while(timer_captured == timer_redundant || timer_delta < 300);

When it exits the loop, compare timer_captured vs timer_redundant. I expect you will see individual bits flip between the two. Furthermore, (timer_current - timer_redundant) should produce the correct answer of < 300.

I will try this myself in a few hours when I have access to hardware.

0 Wei.Jetim Zhao over 6 years ago in reply to Greg Fundyler

TI__Genius 13215 points

Hi Greg,

Thanks for your patience.

After checking with our design team, I can come back with a simple conclusion about how the timer capture function works.
1. The timer counter value TAxR is copied into the TAxCCRn register when a capture event is triggered. TAxCCRn is updated again whenever another capture event is triggered, regardless of whether CCIFG has been cleared.
2. If COV is set by an overflow, it stays set until it is cleared. TAxCCRn is still updated whenever a new capture event is triggered.
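The two points above can be restated as a tiny software model. This is only a sketch of the described behavior, not TI's implementation. The `unread` field and the rule that COV is set when a capture arrives before the previous one was read reflect my reading of the user's guide and should be treated as an assumption.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Simplified model of one capture channel (illustrative, not the silicon). */
struct capture_channel {
    uint16_t ccr;     /* last captured counter value */
    bool ccifg;       /* set by every capture, cleared only by software */
    bool cov;         /* sticky overflow flag, cleared only by software */
    bool unread;      /* model-only: latest capture has not been read yet */
};

/* A capture event always overwrites ccr, whether or not ccifg was cleared. */
static void capture_event(struct capture_channel *ch, uint16_t counter)
{
    assert(ch != NULL);
    if (ch->unread) {
        ch->cov = true;           /* previous capture was never read */
    }
    ch->ccr = counter;
    ch->ccifg = true;
    ch->unread = true;
}

/* Reading ccr leaves ccifg and cov untouched. */
static uint16_t read_ccr(struct capture_channel *ch)
{
    assert(ch != NULL);
    ch->unread = false;
    return ch->ccr;
}
```

In this model, two captures without a read in between leave the second value in `ccr` and set `cov`; later captures keep updating `ccr` while `cov` stays set until software clears it.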

I read through the discussion between Bruce, Danny, and you over the weekend. Many thanks to Bruce McKenney47378 and Danny F for doing the detective work.

But let's focus on your original issue. Could you share the complete original code that reproduces the issue so that we are all on the same page?

Let me summarize your original issue: bit 9 of TAxCCRn is sometimes corrupted between two consecutive reads of the register. Is that right?

Please let me know what I can do to help with your issue. I can reproduce it with your code if needed, and our design team is also happy to run further simulations.

0 Greg Fundyler over 6 years ago in reply to Wei.Jetim Zhao

Intellectual 545 points

Thank you for confirming the behavior of the CCR registers. Those two points are in line with my observations and expectations.

I have the following test case based on Bruce's code (thank you so much Bruce):

#include <msp430.h>
#include <stdint.h>
#define SMCLK_MON   1           // Put SMCLK out on P3.4
#define NO_AUTO 0               // Let FRCTL control the wait states
#define CAP_HZ     100050UL
#define HZ       24000000UL
#if (HZ != 1000000UL) && (HZ != 8000000UL) && (HZ != 24000000UL)
#error "Fix HZ"                 // Typo Alert!
#endif

uint16_t last, captured, redundant, current, delta;

void wait_for_halt_TB0(void) {
    do {
        last = captured;
        captured = TB0CCR0;
        redundant = TB0CCR0;
        current = TB0R;
        delta = current - captured;
    } while(captured == redundant || delta < 400);
}

void wait_for_halt_TB1(void) {
    do {
        last = captured;
        captured = TB1CCR0;
        redundant = TB1CCR0;
        current = TB1R;
        delta = current - captured;
    } while(captured == redundant || delta < 400);
}

void
clk_init(void)
{
#if NO_NAUTO        // Clear NAUTO and explicitly set wait states
    // Wait states (NAUTO=0) per UG Table 5-1
#if HZ == 1000000UL
    FRCTL0 = FWPW | NACCESS_0 | NPRECHG_0;  // was 0/0
#elif HZ == 8000000UL
    FRCTL0 = FWPW | NACCESS_0 | NPRECHG_0;  // was 0/0
#elif HZ == 24000000UL
    FRCTL0 = FWPW | NACCESS_2 | NPRECHG_1;  // was 2/1
#else
#error "clk_init: Fix Hz"
#endif
#endif // NO_AUTO

    //  At startup DCO=8MHz, but with clock dividers of /8.
#if HZ == 1000000UL
    // Already there
#else // Need to set something
    CSCTL0_H = 0xA5;
#if HZ == 24000000UL
    CSCTL1 |= DCORSEL;                      // In addition to DCOFSEL_3
#endif // HZ=24M
    CSCTL3 = DIVA_0 + DIVS_0 + DIVM_0;        // set all dividers to 0
#endif  // HZ==1M
#if SMCLK_MON
    P3SEL0 |= BIT4;                 // SMCLK out (yeah there's an LED there too)
    P3SEL1 |= BIT4;                 // per table 6-47
    P3DIR  |= BIT4;
#endif
    return;
}
#define LED8    BIT7               // P3.7
#define LED7    BIT6               // P3.6
#define LED6    BIT5               // P3.5
#define LED5    BIT4               // P3.4
int
main(void)
{
    WDTCTL = WDTPW | WDTHOLD;   // stop watchdog timer
    P3OUT &= ~(LED5|LED6|LED7|LED8);
    P3DIR |=  (LED5|LED6|LED7|LED8);
    clk_init();

    TB0CTL = TBSSEL_2 | MC_2;               // use SMCLK, continuous mode
    TB0CCTL0 = CM_3 | CCIS_0 | SCS | CAP;   // both edges, CCI0A, synchronize, capture mode
    P2SEL0 |= BIT1;              // P2.1 as TB0.0 (CCI0A)
    P2SEL1 |= BIT1;              // per table 6-42

    TB1CTL = TBSSEL_2 | MC_2;               // use SMCLK, continuous mode
    TB1CCTL0 = CM_3 | CCIS_0 | SCS | CAP;   // both edges, CCI0A, synchronize, capture mode
    P2SEL0 |= BIT2;              // P2.2 as TB1.0 (CCI0A)
    P2SEL1 |= BIT2;

    P1SEL0 |= BIT2;             // P1.2 as TA1.1
    P1SEL1 &= ~BIT2;            // per Table 6-39
    P1DIR  |= BIT2;
    TA1CCTL1 = OUTMOD_7;        // Reset/Set
    TA1CCR1 = HZ/CAP_HZ/2;
    TA1CCR0 = HZ/CAP_HZ-1;      // period of HZ/CAP_HZ counts (about CAP_HZ)
    TA1CTL = TASSEL_2 | ID_0 | MC_1 | TACLR;    // SMCLK/1, Up [,Clear]
    //  Now jumper P1.2 to P2.1 to get a signal on TB0
    //  Or jumper P1.2 to P2.2 to get a signal on TB1

    while (1)
    {
        wait_for_halt_TB0();
        P3OUT ^= LED8;
    }
    /*NOTREACHED*/
    return 0;
}

I had to pick a weird capture frequency (CAP_HZ) to make this problem manifest. 1 MHz and 100 kHz never exited the loop. This ties back to adding extra code, or otherwise changing the timing, having an effect on the behavior.

Each iteration through the loop should adhere to this rule: last <= captured <= redundant < current (adjust for integer rollovers)
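The "adjust for integer rollovers" part of this rule can be handled by measuring every value as an offset from `last`. A minimal sketch, valid only while the true span from `last` to `current` is under 65536 counts; `samples_in_order` is an illustrative name.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Checks last <= captured <= redundant < current on a wrapping 16-bit
 * timeline by comparing offsets from `last` instead of raw values. */
static bool samples_in_order(uint16_t last, uint16_t captured,
                             uint16_t redundant, uint16_t current)
{
    uint16_t c = (uint16_t)(captured - last);
    uint16_t r = (uint16_t)(redundant - last);
    uint16_t n = (uint16_t)(current - last);
    return c <= r && r < n;
}

/* Worked examples: a clean pass across a rollover, then a bit-14 glitch. */
static void samples_in_order_examples(void)
{
    assert(samples_in_order(65000u, 65500u, 100u, 200u));
    assert(!samples_in_order(1000u, (uint16_t)(1000u + 16384u), 1120u, 1200u));
}
```

In the second example `captured` sits 16384 counts ahead of `last` while `redundant` is only 120 ahead, so the ordering check fails and the iteration is flagged as inconsistent.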

This is what I am seeing with this particular implementation and these numbers on TB0:

The difference between captured and last appears to be 16384 every time. Interesting. I tried tweaking CAP_HZ and even feeding a clock to it externally, but it looks like it is 16384 no matter what. FWIW, even the (wrong) delta value appears to be fairly repeatable +/- 1 for a given set of conditions (capture frequency, internal/external source).

I tried again with TB1. It fails much less predictably than TB0. Here is an example of the 512 I mentioned previously, as well as 1536 (two adjacent bits).

I hope this provides enough info to be useful. Thanks again.

0 Wei.Jetim Zhao over 6 years ago in reply to Greg Fundyler

TI__Genius 13215 points

Hi Greg,

I will try to catch up on your tests (there seem to be a lot of them in this thread) and run them on my FR5739 target board. I will keep you updated with the results on my side.

0 Bruce McKenney47378 over 6 years ago in reply to Greg Fundyler

Guru 91630 points

I just noticed a typo:
> #if NO_NAUTO // Clear NAUTO and explicitly set wait states
should be
> #if NO_AUTO // Clear NAUTO and explicitly set wait states

It was correct at one time, so I suspect a slip-of-the-keyboard sometime after I gave up on that avenue.

0 Greg Fundyler over 6 years ago in reply to Bruce McKenney47378

Intellectual 545 points

No big deal! Any thoughts about the results I posted?

0 Wei.Jetim Zhao over 6 years ago in reply to Greg Fundyler

TI__Genius 13215 points

Hi Greg,

I would like to give you an update from my side.

1. I ran your code on my MSP-EXP430FR5739, but it never reached the P3OUT ^= LED8; line in about 5 minutes.

2. I modified wait_for_halt_TB0(void) as below:

void wait_for_halt_TB0(void) {
    do {
        P3OUT |= BIT4;
        last = captured;
        P3OUT |= BIT5;
        captured = TB0CCR0;
        P3OUT |= BIT6;
        redundant = TB0CCR0;
        P3OUT &= ~(BIT4|BIT5|BIT6);
        current = TB0R;
        delta = current - captured;
    } while(captured == redundant);
}

It looks like I found the root cause of why you get different values for redundant, captured, and last in a single loop iteration. Please see the scope capture of the pins below.

You can see that the frequency of your while loop is about 362 kHz, while the capture event is at 100 kHz. P3.4/5/6 mark the start of the three lines of your original code. The capture event (P1.2) can easily land within the execution of these lines. That means the timer can receive a new capture event while you are copying values to last, captured, and redundant. I think that is why you get different values for these three variables in a single loop iteration.

0 Greg Fundyler over 6 years ago in reply to Wei.Jetim Zhao

Intellectual 545 points

Let's confirm a few things about your setup.

1. Please return to the code I provided.
2. Did you jumper P1.2 to P2.1? Please make sure that you have a signal on P2.1 and that TB0CCR0 is really being updated regularly while you are in the loop.
3. Try changing CAP_HZ slightly to see if that causes the loop to exit. It seems this might behave differently from one chip to another.
4. If you still cannot reproduce the exit from the loop, try another timer. Connect your jumper between P1.2 and P2.2 and call wait_for_halt_TB1 instead.

I would like to reiterate that the problem is not that last, captured, and redundant are different. This is expected because the capture input signal is asynchronous.

As you indicated, the loop runs at 362 kHz while the capture input is 100 kHz. So you should expect to see no more than one change within the loop. Sometimes zero changes. Never two changes. Correct?
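The "no more than one change" expectation can be checked with simple arithmetic. As a sketch, assuming evenly spaced capture events and a constant loop period (both idealizations of mine), the number of events that can fall inside one loop pass is bounded by ceil(event_hz / loop_hz). With both edges captured, a 100 kHz square wave produces 200,000 events per second.

```c
#include <assert.h>
#include <stdint.h>

/* Upper bound on evenly spaced capture events that can fall within one loop
 * pass of constant duration: ceil(event_hz / loop_hz). */
static uint32_t max_events_per_pass(uint32_t event_hz, uint32_t loop_hz)
{
    assert(loop_hz != 0);
    return (event_hz + loop_hz - 1u) / loop_hz;
}
```

For 200,000 events per second and a 362 kHz loop the bound is 1, which matches the expectation that `last`, `captured`, and `redundant` cannot all differ in one pass. A 500 kHz input (1,000,000 events per second) would raise the bound to 3.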

That is not what happens in my example (on my MSP-EXP430FR5739). After the loop exits, I see that last, captured, and redundant are all different from one another AND the value of captured is different from the value of last by a single bit (0x4000, bit 14). Please look again:

0 Wei.Jetim Zhao over 6 years ago in reply to Greg Fundyler

TI__Genius 13215 points

Hi Greg,

Thanks for your quick feedback. To answer your first two questions quickly: yes, I jumpered P1.2 to P2.1 as your code comments describe.

For questions 3 and 4, I will take time to do them once I get away from another urgent case. I will try to do it this afternoon (Friday in China), but it is likely to be next Monday because I will be busy with several meetings today.

Thank you again for your patience, as we have already been discussing this for a week. With input from our design expert we have clarified some things, but it seems we are still on the way to the root cause. Please feel free to reply any time you have more results; I will check your feedback every day.

0 Greg Fundyler over 6 years ago in reply to Wei.Jetim Zhao

Intellectual 545 points

Hi Wei,

Thank you for everything you have done so far. I don't believe there is anything else I can contribute at the moment. The code and results should be adequate to understand and reproduce the issue. I think it is up to you guys to take it from here. This may not happen the same on every chip. But if both Bruce and I were able to observe this behavior, I believe you and your colleagues should be able to replicate it as well. Please keep trying on other boards, other timers, etc. Let me know if there is anything specific I can assist with.

Thank you,
Greg

0 Wei.Jetim Zhao over 6 years ago in reply to Greg Fundyler

TI__Genius 13215 points

Hi Greg,

Sure, I will continue working on your issue and keep you updated on the results. Thanks again for your patience.

0 Wei.Jetim Zhao over 6 years ago in reply to Greg Fundyler

TI__Genius 13215 points

Hi Greg,

I would like to give you a quick update.

I have reproduced your issue on TB1 and also found something strange. I will post the details tomorrow after some deeper debugging.

0 Greg Fundyler over 6 years ago in reply to Wei.Jetim Zhao

Intellectual 545 points

Great! Thank you. I look forward to hearing about your results.

0 Wei.Jetim Zhao over 6 years ago in reply to Greg Fundyler

TI__Genius 13215 points

Hi Greg,

Thanks for your patience.

I ran many more tests today, making small changes to the code such as inserting NOPs in the loop. It seems that the period of the loop affects the issue, which is consistent with your observations about CAP_HZ. The issue can now be reproduced reliably on my board.

void wait_for_halt_TB1(void) {
    do {
        last = captured;
        P3OUT ^= BIT4;
        __no_operation();
//        __no_operation();
//        __no_operation();
//        __no_operation();
//        __no_operation();
//        __no_operation();
        captured = TB1CCR0;
        redundant = TB1CCR0;
        current = TB1R;
        delta = current - captured;
//    } while(captured == redundant || delta < 400);
    } while(captured == redundant || captured == last);
}

I need to involve our development team to run a silicon-level simulation of this code and examine TB0CCR0's behavior in depth. The simulation will take several days because our development team needs to set up the device environment for your case. I would like to move this discussion offline to email and come back to E2E when we have the final root cause. Could you send me an email at wei-zhao@ti.com?

0 Bruce McKenney47378 over 6 years ago in reply to Wei.Jetim Zhao

Guru 91630 points

I'm only an Interested Bystander, but I would like to hear how it turns out.

0 Wei.Jetim Zhao over 6 years ago in reply to Bruce McKenney47378

TI__Genius 13215 points

Hi Bruce,

Thanks again for your investigation of this issue. I will post the root cause and solution on this thread once the work is done.

0 Wei.Jetim Zhao over 6 years ago in reply to Greg Fundyler

TI__Genius 13215 points

Hi Greg,

Thanks for the discussion offline. I would like to give a summary here about our discussion.

The root cause of last, captured, and redundant all differing from one another is that the software reads a wrong value from TB1CCR0 while the internal logic is loading the TB1R value into the TB1CCR0 register after a capture event.

Please find the test code I used below. P3.5 and P3.6 indicate when the software is reading TB1CCR0 into captured and redundant.

The test result is below. Every failed loop iteration occurred only when the trigger event arrived while the software was reading TB1CCR0 into captured (line 62).

The UG says that “The timer value is copied into the TBxCCRn register”. In the internal logic, this “copy” action takes a finite time. During this period the value of TB1CCR0 is not stable, so any read during a capture event can return an unpredictable CCR value.

The settling time of the CCRx register value on real silicon can depend on the core voltage, silicon process, temperature, and so on. You should therefore avoid reading the CCRx register during a capture event. The suggested way is to handle the capture event in the timer interrupt service routine triggered by CCIFG. This is the normal method.

But for your case, my suggestion is to use compare mode (section 12.2.4.2, Compare Mode, of the UG). In this mode, you can load your timeout count into TBxCLn and then start the timer and let it run by itself (meanwhile, the CPU can run other tasks). Configure the signal input pin for the GPIO interrupt function. If a transition occurs, the GPIO interrupt is triggered and you can clear the timer counter (TBxR) in the GPIO interrupt service routine. If you don’t clear TBxR before it counts up to the TBxCLn value, CCIFG is set and triggers a timer interrupt.
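The compare-mode scheme described above can be modeled on a host to see how it behaves. This is a sketch under stated assumptions, not device code: the struct and function names are hypothetical, `timeout_tick` stands in for one timer clock, and `timeout_edge` stands in for what the GPIO interrupt service routine would do.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Host-side model of the suggested scheme; all names are illustrative. */
struct signal_timeout {
    uint16_t counter;   /* stands in for TBxR */
    uint16_t limit;     /* stands in for the compare value in TBxCLn */
    bool expired;       /* stands in for CCIFG raising the timer interrupt */
};

/* One timer clock: count up and flag a timeout on reaching the limit. */
static void timeout_tick(struct signal_timeout *t)
{
    assert(t != NULL);
    t->counter++;
    if (t->counter == t->limit) {
        t->expired = true;
    }
}

/* What the GPIO edge ISR would do: restart the count from zero. */
static void timeout_edge(struct signal_timeout *t)
{
    assert(t != NULL);
    t->counter = 0;
}
```

With a limit of 300, an edge arriving after 299 ticks restarts the count and no timeout is flagged; only 300 consecutive ticks without an edge set `expired`. Note that this moves the work into an interrupt per input edge, which the model does not account for.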

Also, your preferred approach, which reads the register twice and only uses the value if the two reads match (basically the 3rd code example in your original post), should work well for your case. Just be careful: it could fail when you approach a frequency so high that consecutive reads yield different answers every time.

Additionally, regarding the reads of TB1R you mentioned, please note that TB1R increments on a rising edge and CCR0 latches TB1R on a falling edge. So if the CPU read instruction executes on falling edges, it will always read TB1R correctly when the timer and CPU use synchronous clock sources.

0 Bruce McKenney47378 over 6 years ago in reply to Wei.Jetim Zhao

Guru 91630 points

Thanks for the summary.

Is this particular to the FR57 generation?
