This thread has been locked.

If you have a related question, please click the "Ask a related question" button in the top right corner. The newly created question will be automatically linked to this question.

Mcu reset after timer overflow interrupt

Other Parts Discussed in Thread: MSP430FR5969

Hi all ! 

I'm having some trouble trying to make the timer of wolverine work. In fact, everytime TA1R reachs 65536, instead of getting back to 0, it makes the cpu resetting. I'm in continuous mode and here's my code (actually it's a part of the Contiki OS code) : 

 

  TA1CTL = TASSEL_1 | TACLR | ID_0;
  TA1CCTL1 = CCIE;
  TA1CCR1 = INTERVAL;
  TA1CTL |= MC_2;

And the interrupt handler : 

#pragma vector = TIMER1_A1_VECTOR
__interrupt void Timer1_A1_ISR(void)
{
  if(TA1IV == 2) {

  while((TA1CTL & MC1) && TA1CCR1 - TA1R == 1);

    do {
      TA1CCR1 += INTERVAL;

      ++count;

      if(count % CLOCK_CONF_SECOND == 0) {
        ++seconds;
      }

    } while((TA1CCR1 - TA1R) > INTERVAL);

    last_tar = TA1R;

  }


}

Thank you for your help, 

Best regards, 

Laurent.

  • Hi Laurent, 

    is the watchdog enabled?

    multi-flag interrupts require that the sourcing flag be cleared and I don't see source flag clearing in your interrupt handler.  If the flag is not cleared, your code will exit the handler only to re-enter immediately which will, eventually, result in WDT overflow and will reset the MCU.  

    add code to clear the flag and re-check.  Let us know what happens.

    edit-disregard

  • Michael S said:

    multi-flag interrupts require that the sourcing flag be cleared and I don't see source flag clearing in your interrupt handler.  If the flag is not cleared, your code will exit the handler only to re-enter immediately which will, eventually, result in WDT overflow and will reset the MCU.

    The code fragment above reads from TA1IV, which automatically causes the highest pending interrupt flag to be cleared (Section of 11.2.6.2 the MSP430FR57xx Family User's Guide).

    Laurent, which specific wolverine chip are you working on? Also, have you tried dropping those code fragments from Contiki into a minimal test program to see if they're the true source of the problem?

  • Thank you both for your answer, 

    So I tried to follow your advices with a minimal test program and it works like a charm. So I checked in my Contiki code and it seems, as you pointed out, that the watchdog was still enable, so now my interrupt handler is : 

    #pragma vector = TIMER1_A1_VECTOR
    __interrupt void Timer1_A1_ISR(void)
    {
    
    watchdog_stop();
    
    // Same as before 
    
    watchdog_start();
    
    }

    .. And it seems to work now. However i'm not sure to understand why I have to stop the watchdog during the interrupt handling, therefore I'm not sure to do the right move here. If you could explain to me what is going on behind this code, it would be very helpfull. 

    By the way, i'm working on the wolverine launchpad, the cpu is MSP430FR5969

    Regards, 

    Laurent.

  • Hey Laurent,

    I will strongly recommend you to go through WDT (section 12 )of Msp430FR5969 family User Guide. By the standard definition: WDT is intended to perform a controlled system restart in case of a software  problem. In your case, it seems WDT is enabled for some specific reason. (May be or maybe not)

    but any way as Michael.S pointed out, you have to clear the source interrupt flags before you leave any ISR.  for example lets say you have a port interrupt from button ( lets say P1.1). When ever you press a button, bit1 of register P1IFG is set. This is the source interrupt. Then the execution of code enter in to ISR and starts executiong lines in ISR. before it leaves the ISR you have clear BIT1 of P1IFG register (Source interrupt). if not, the code thinks that the corresponding action of this interrupt haven't taken place and re enter the same ISR which it had served earlier. To avoid this you have to clear the source flag.

    When it comes to (disabling  and enabling WDT, from the standard definition) you usually have a value in WDT register.  At the tick of this value in WDT timer, system resets /restarts. In your case you are just avoiding system reset or system restart when the execution is in ISR. I still think,  reset are happening in your your code at certain intervals. Check WDT register to find out this. 

    Usually the standard practice in MSP430 MCU is , at the very beginning of the code itself you disable WDT and go further. One usually deploys WDT only when there is system struck or software  [problem.

  • Further to sri-sri's reply, I strongly recommend you investigate why the timer ISR is taking so long to process that the watchdog fires.

    According to the user guide you're already doing what you need to in order to clear the CCIFG flag in TA1CCTL1. That implies that a single execution of the interrupt is taking a very long time. 

  • Thank you all for your answers, it really helped me to understand how thoses mechanisms work.  I will now go through the WDT part of the Msp430FR5969 family User Guide as sri-sri recommended.

    Regards, 

    Laurent.

  • The code inside the ISR take a very long time to execute. And it is not clear what the purpose it achieves. Seems to me it is a very bad scheme to use the timer. 

  • Timer interrupts are shared and you should not read offset vector if you plan to add the other ISR later.
    There are 5 jumps in the table, the last one is the overflow and no need to re-jump on the last one.
    You should never spend more than a few uSec inside the routine,
    but instead wake up main routine and pass along a flag. 

    // TIMER1_A1_ISR       ; IRS Compare1_1, 1_2  and 1_Over //
      add.w   &TA1IV,PC    ; Add Timer_A offset vector, clears ISR flag
      reti      ; CCR0 - no source
      jmp     TA1_1_Routine   
      jmp     TA1_2_Routine   
      reti      ; CCR3
      reti      ; CCR4
    TA_over       ; last spot here and such no jump is needed
      add.b #1,&mycounter   ; keep track on how many overflow
      reti

  • Tony,

    Please read what is inside the ISR of the original post, it is totally beyond my comprehension. How can you do the equivalent in main(); and what does that accomplishes?

    -- OCY.

  • You don’t have to stop the watchdog during ISR execution. But if you don’t stop it at all, you have to trigger it in time. Which your code probably didn’t do.
    Instead of stopping and re-starting it, you could as well trigger it.
    However, as the others have pointed out, it seems that your ISR is taking a long time to execute. As a general rule, you shouldn’t ever do a while of undetermined length (waiting for a hardware register or external event) inside an ISR. Never! ISRs are no threads for parallel execution. They are short, fast functions that react on an event and do the absolutely necessary things that must be done immediately. All the rest should be signaled to main and done there.
    In your case, you enter the ISR when TA1R == TA1CCR1 (or somewhat after this moment, as interrupt s may have been disabled or you might have been in a different ISR that isn’t interrupted). Then you wait until TAIR is one tock before TA1CCR1 (which is almost one complete timer cycle later which you completely waste inside the ISR). Obviously, this will exceed the watchdog timer period (which is half of a timer cycle). Then you loop and increment count and do a very time-consuming division (% is a division too, just returning the remainder instead of the result) to decide whether to increment the second. And this as long as the difference of TA1CCR1and TAR1 indicates that you missed some interrupts. Well, this is actually a good idea.
    But has a racing condition (what if TAIR has incremented to another interrupt while your code is still running? Then you count one cycle too much because you then have another interrupt pending right away in which you you’ll also increment counter)

     For the division, I suggest using a local variable static int scount, which is incremented to 1000, and when 1000 is reached, it is set to 0 and seconds is incremented. This is way faster.
    And the first while loop, well, remove it.

     

    Sri-sri: Not ‘in case of a software problem’. Usually, the watchdog is used by software in a ways that it will never trigger in case of a software problem – or trigger immediately during development. It is not meant to fix buggy software, but to detect a crashed CPU due to any reason such as brownout or ESD transients (or radiation). It can be abused for software supervising (and is usually misused for this), but that’s not its original purpose.

  • It is not a matter of doing those things inside the ISR or set a flag and do them in the main(). The matter is those things are total nonsense.

  • Thank you JMG. 

    Ohh god, for a long time, I am literally abusing the WDT . I should improve my coding skills. Thank you.

  • sri-sri: Many people use the watchdog for software bug ‘fixing’. But in most cases I’ve seen, I noticed that the way the watchdog is used is more or less useless for this purpose. To keep it from barking, it is triggered in this loop and that function and everywhere. So at the end, the software will fall into an undefined state but still keeps the watchdog triggered.
    To prove my point: why is the watchdog active by default? If it were for software supervising, it would be okay to activate it in main if wanted. The reason is that if the CPU crashed after an ESD, and is reset by the WDT, the cause for the crash may persist and only a watchdog that’s on by default (even during code initialization) will revive it if it crashes immediately again.

     You can use the watchdog for software supervising. I’ve done it in out access point. There runs a specific watchdog thread. Each thread registers itself by the watchdog and tells how often it has to report back in future. The watchdog thread (and only the watchdog thread) triggers the watchdog, but only does so if all registered threads have reported back in time. If one thread is overdue, this is written down as an error and the watchdog threads exits. And since nobody is triggering the watchdog anymore, the CPU will reset. If thread switching fails, or any one of the threads locks-up or the CPU crashes, the watchdog will still do its job and reset the system.
    Instead of a separate thread, a similar mechanism can be implemented inside a timer interrupt. It’s just important that it doesn’t simply trigger the watchdog but checks whether everything is still okay before it does.

  • I took a look at the Contiki source to try to figure out what it's doing in that timer ISR. The original source has some comments that were removed from the original post.

    ISR(TIMER1_A1, timera1)
    {
      ENERGEST_ON(ENERGEST_TYPE_IRQ);
    
      if(TA1IV == 2) {
    
        /* HW timer bug fix: Interrupt handler called before TR==CCR.
         * Occurrs when timer state is toggled between STOP and CONT. */
        while(TA1CTL & MC1 && TA1CCR1 - TA1R == 1);
    
        /* Make sure interrupt time is future */
        do {
          /*      TACTL &= ~MC1;*/
          TA1CCR1 += INTERVAL;
          /*      TACTL |= MC1;*/
          ++count;
    
          /* Make sure the CLOCK_CONF_SECOND is a power of two, to ensure
             that the modulo operation below becomes a logical and and not
             an expensive divide. Algorithm from Wikipedia:
             http://en.wikipedia.org/wiki/Power_of_two */
    #if (CLOCK_CONF_SECOND & (CLOCK_CONF_SECOND - 1)) != 0
    #error CLOCK_CONF_SECOND must be a power of two (i.e., 1, 2, 4, 8, 16, 32, 64, ...).
    #error Change CLOCK_CONF_SECOND in contiki-conf.h.
    #endif
          if(count % CLOCK_CONF_SECOND == 0) {
            ++seconds;
            energest_flush();
          }
        } while((TA1CCR1 - TA1R) > INTERVAL);
    
        last_tar = TA1R;
    
        if(etimer_pending() &&
           (etimer_next_expiration_time() - count - 1) > MAX_TICKS) {
          etimer_request_poll();
          LPM4_EXIT;
        }
    
      }
      /*  if(process_nevents() >= 0) {
        LPM4_EXIT;
        }*/
    
      ENERGEST_OFF(ENERGEST_TYPE_IRQ);
    }

    The setup in Contiki is based around a 32kHz crystal as the source for ACLK. INTERVAL is set to 1000 and CLOCK_CONF_SECOND is 32. That means the expensive divide is actually just a bitwise and*. The code above was taken from the platform/exp5438 folder, so presumably it is intended to run on a msp430f5438.

    The strange busy wait (line 9) is allegedly to work around a hardware bug where the interrupt fires when TA1R is one less than TA1CCR1 (ie the ISR is called one tock before it is due). The loop only executes while TA1R == (TA1CCR1-1). That said, I can't find the erratum that this claims to be a workaround for...

    The OP said they were trying to get this working on Wolverine, so I guess the next question is "which one?" (part number). What clock source is being used for ACLK, and what are INTERVAL and CLOCK_CONF_SECOND set to?

    * As long as the compiler is doing a good job. Fortunately this particular optimisation is trivial enough that compilers manage to apply it pretty reliably.

  • That example is no good
    Use switch (a jump table for assembler) as reading the TA1IV clears it, so only one shoot at it.
    I don't know about this bug he claim, only thing I could think of is when you set the timer up,
    clear the TA1V register before you enable GIE.
    You say you want a overflow IRQ, but your example mention CCR1.
    There is no capture register needed for overflow-IRQ as it's only happens in continues mode and is triggered when TA1R rolls around to 0 but TAIE have to be set in TA1CTL
    The MCU will reset if you do not handle the TA1IV table correctly and the overflow routine jumps in to unknown memory.
    Do you have the TA1 Reset Vector set up correctly to start with?

    #pragma vector=TIMER1_A1_VECTOR
    __interrupt void Timer1_A1(void)
    {
      switch (__even_in_range(TA1IV, 10))        // Efficient switch-implementation
      {
        case  2: TA1CCR1 += 16;                  // Add Offset to TACCR1
                 break;

        case 10: P1OUT ^= 0x01;                 // Timer1_A3 overflow
                 break;

  • I guess the real question for Laurent to answer is: are you trying to port Contiki to wolverine, or using the Contiki source as an example to help you implement something else?

    Like Tony said in his last message, that source from Contiki isn't a good example of how to use TimerA in general.

  • Robert, you’re right. The while isn’t as dangerous as it seemed to me on first glance. In fact it is indeed delaying the update of TA1CCR1^so it doesn’t happen when the timer is one tick before the current CCR1 setting. However, it doesn’t make much sense.
    The problem is that you cannot write a new value y to CCRx when the counter is already at y-1. You won’t get an interrupt then. If you just got into the ISR due to CCRx, you don’t need to wait for TxR to be no longer CCRx-1. The interrupt was already triggered at TxR==CCRx. So it if is CCRx-1 now, you had a full timer cycle latency. And it makes no difference if you wait now.

    And if TAR is already at (CCRx+INTERVAL-1), then you’re too late and won’t get an interrupt. And won’t get one if you wait another timer tick.
    So this line is completely useless.

    There’s another erratum that applies on some MSPs when the timer is stopped and restarted while being reset in the meantime. It may cause a bogus overflow interrupt. However, you’re already inside the ISR. And it has nothing to do with the current CCR1 value. (and the while will just wait and not skip the interrupt)
    So honestly, I don’t know why this line exists. Maybe someone misunderstood the erratum description and tried to be clever by not using the proposed workaround.

     Regarding the division, well, I’ve seen a compiler not doing the “/” -> “>>” optimization even though the divider was constant with a power of 2. Also, in this case, it isn’t a division but a modulo operation. And while it is the same for a real calculated division, it can’t be easily simulated by a shift. As it means saving the bits that are shifted-out, then shifting them to right-justified position. I don’t know whether a compiler is smart enough for this. And even if, it is way more complex, larger and slower than a simple shift “division”. Again, it could be optimized by generating a mask and throwing it over the original value (clearing all bits from the divider bit position up), but then again, I doubt a compiler is smart enough for this very special and uncommon case. At least it cannot be taken for granted.

     Besides this, the approach requires the “timer tick” being a rather odd value. Personally, I’d prefer a tick of (exactly) 1ms, rather than something like 1/32s or such. That’s what I use in my own projects. For delays as well as for the timestamp.

     But once more back to the original problem (re-reading the initial posting): The reset reportedly occurs when the timer overflows. I fthis observation is correct, then his could be caused by CCR0.CCIE being set somewhere else in the code, but no TIMER1_A0 ISR being defined. Since the default for CCR0 is 0, this would cause a CCR0 interrupt when the timer overflows, and since this interrupt has priority, the CPU will jump into the void, causing a PUC. And on a PUC, the timer registers are NOT reset to default, so CCR0.CCIE would remain set.

    Quick check: define a TIMER1_A0 ISR and set s breakpoint on it.

**Attention** This is a public forum