RTOS/CC2640R2F: Why does my watchdog sometimes reset the device directly without executing its callback?

Part Number: CC2640R2F
Other Parts Discussed in Thread: CC2640

Tool/software: TI-RTOS

My development environment is as follows:

- IAR 7.80.3

- simplelink_cc2640r2_sdk_1_35_00_33

- Based on the simpleBLEperipheral project

I have enabled the watchdog and it works fine.

To store some important configuration, my watchdog invokes a callback function when it times out.

The application then calls SystemReset() after storing the configuration.

I find that sometimes the callback function is not executed and the CC2640R2F restarts directly.

Do you have any idea?

  • Hi David,
    Could you explain in more detail what is executed upon watchdog overflow?
  • Hi Joakim,

    The following is my code for reference.

    static void SimpleBLEPeripheral_init(void)
    {
      ...
    
      Board_initWatchdog();
      
      Watchdog_Params_init(&params);
      params.resetMode = Watchdog_RESET_OFF;
      params.callbackFxn = SimpleBLEPeripheral_watchDog_cb;
      hWdt = Watchdog_open(Board_WATCHDOG0, &params);
      if (hWdt == NULL) {
        // Error opening watchdog
        for (;;){}
      }
      // set timeout period to 1000 ms 
      wdtTicks = Watchdog_convertMsToTicks(hWdt, 1000);
      Watchdog_setReload(hWdt, wdtTicks);
    
      ...
    
    }
    
    
    static void SimpleBLEPeripheral_processAppMsg(sbpEvt_t *pMsg)
    {
      ...
    
        case SBC_SS_RST_EVT:
          {
            RecoveryStatusStruct mStruct;
            
            if(osal_snv_read(0x83,sizeof(RecoveryStatusStruct),&mStruct)!=SUCCESS)
              mStruct.rstCounter = 0; // init reset counter
            
            // Store tag
            mStruct.rstTag = 0xA5;
            
            // Store oled is on or off
            if(BSP_OLED_Status()) // isOledOn
              mStruct.isOledOff = 0;
            else
              mStruct.isOledOff = 1;
            
            // Store timeticks
            mStruct.utcTimes = (uint32_t)UTC_getClock();
            
            // Store reset counter
            if(mStruct.rstCounter<127)
              mStruct.rstCounter++;
            else
              mStruct.rstCounter = 0;
    
    #ifdef BK_USE_UART
            char txBuf[30] = {0};
            tiny_sprintf(txBuf, "Pre(hex):%02X,%08X,%02X,%01X\r\n", mStruct.rstTag, mStruct.utcTimes, mStruct.rstCounter, mStruct.isOledOff);
            logPrinter_send((uint8_t*)txBuf, strlen(txBuf));
    #endif // BK_USE_UART
            
            if(osal_snv_write(0x83,sizeof(RecoveryStatusStruct),&mStruct)==SUCCESS)
            {
    #ifdef BK_USE_UART
              char txBuf0[30] = {0};
              tiny_sprintf(txBuf0, "Pre(hex):Success to write.\r\n");
              logPrinter_send((uint8_t*)txBuf0, strlen(txBuf0));
    #endif // BK_USE_UART
              BSP_SYSTEM_RESET();
            }
            else
            {
    #ifdef BK_USE_UART
              char txBuf1[30] = {0};
              tiny_sprintf(txBuf1, "Pre(hex):Fail to write.\r\n");
              logPrinter_send((uint8_t*)txBuf1, strlen(txBuf1));
    #endif // BK_USE_UART
            }
    
            Watchdog_clear(hWdt);
            SimpleBLEPeripheral_enqueueMsg(SBC_SS_RST_EVT, 0x00);
          }
          break;
    
        default:
          // Do nothing.
          break;
      }
    }
    
    
    void SimpleBLEPeripheral_watchDog_cb(uintptr_t handle)
    {
      Watchdog_clear(hWdt);
      SimpleBLEPeripheral_enqueueMsg(SBC_SS_RST_EVT, 0x00);
    }

    Thanks.

  • Hi,

    I'm not sure that you can call any RTOS or stack APIs in the watchdog callback function. Also since the watchdog interrupt is a non-maskable interrupt (NMI) it could occur at times that the regular code thinks it's protected and atomic.

    If I were to guess, I'd say that something in the WDT NMI is causing a CPU fault, which would (per the Arm docs) lead to a CPU lockup and (according to the TI reference swcu117) a reset.

    Luckily, you can read some registers on boot to detect what caused the reset. It's a bit of a trick; basically we must do two things:

    * Disable Warm-reset-to-pin-reset, so that WDT/LOCKUP are tracked as reset sources
    * Read out the Warmreset status bits before the regular reset functions clear the result

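    #include <ti/devices/cc26x0r2/inc/hw_memmap.h>  // PRCM_BASE
    #include <ti/devices/cc26x0r2/inc/hw_types.h>   // HWREG
    #include <ti/devices/cc26x0r2/inc/hw_prcm.h>
    
    // Excluded from C runtime initialization so the value written in the reset
    // hook is not overwritten at startup. With IAR, use __no_init instead.
    volatile uint32_t warmreset_stat __attribute__((noinit));
    
    // Reset hook: runs before the regular reset functions clear the status bits.
    void extra_reset_fxn(void)
    {
        warmreset_stat = HWREG(PRCM_BASE + PRCM_O_WARMRESET);
    }
    
    void *mainThread(void *arg0)
    {
        Watchdog_Handle watchdogHandle;
        Watchdog_Params params;
        uint32_t        reloadValue;
    
        // Disable warm-reset-to-pin-reset so WDT/LOCKUP are tracked as reset sources.
        HWREG(PRCM_BASE + PRCM_O_WARMRESET) &= ~PRCM_WARMRESET_WR_TO_PINRESET_M;
        ...
    }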

    Also in the RTOS config (.cfg) file, at the very top, this reset function must be added as the first to execute:

    var Reset = xdc.useModule('xdc.runtime.Reset');
    Reset.fxns[Reset.fxns.length++] = '&extra_reset_fxn';

    It would be interesting to see, after a reset, what the value of warmreset_stat is. 0x01 would indicate WDT, 0x02 LOCKUP.

    Best regards,
    Aslak
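
The status values mentioned above (0x01 for WDT, 0x02 for LOCKUP) could be decoded with a small helper like the one below. This is only a sketch: the mask names and the `WarmResetInfo` type are hypothetical, and the 0x04 position for WR_TO_PINRESET is an assumption based on the values reported later in this thread. On target, the definitions in hw_prcm.h would be used instead.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical mask names; bit positions taken from this thread. */
#define WARMRESET_WDT_STAT        0x00000001u  /* watchdog caused the warm reset */
#define WARMRESET_LOCKUP_STAT     0x00000002u  /* CPU lockup caused the warm reset */
#define WARMRESET_WR_TO_PINRESET  0x00000004u  /* warm reset is converted to pin reset (assumed bit) */

typedef struct {
    bool wdt;
    bool lockup;
    bool wrToPinReset;
} WarmResetInfo;

/* Decode a raw WARMRESET value captured early at boot. */
static void decodeWarmReset(uint32_t raw, WarmResetInfo *out)
{
    assert(out != NULL);
    out->wdt          = (raw & WARMRESET_WDT_STAT) != 0u;
    out->lockup       = (raw & WARMRESET_LOCKUP_STAT) != 0u;
    out->wrToPinReset = (raw & WARMRESET_WR_TO_PINRESET) != 0u;
}
```

Passing the saved `warmreset_stat` to `decodeWarmReset()` after boot would then show at a glance whether the watchdog or a lockup was recorded.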

  • Hi Aslak,

    I have added the code you mentioned to my workspace as shown below.

    @simple_peripheral.c
    
    static void SimpleBLEPeripheral_init(void)
    {
      ...
      HWREG(PRCM_BASE + PRCM_O_WARMRESET) &= ~PRCM_WARMRESET_WR_TO_PINRESET_M;
      ...
    }
    
    #include <ti/devices/cc26x0r2/inc/hw_prcm.h>
    volatile uint32_t warmreset_stat __attribute__((noinit));
    void extra_reset_fxn()
    {
      warmreset_stat = HWREG(PRCM_BASE + PRCM_O_WARMRESET);
    }
    
    @source/ti/blestack/common/cc26xx/kernel/cc2640/config/cc2640_r2_csdk.cfg
    
    /* Debug */
    var Reset = xdc.useModule('xdc.runtime.Reset');
    Reset.fxns[Reset.fxns.length++] = '&extra_reset_fxn';
    ...
    // var Reset = xdc.useModule('xdc.runtime.Reset');

    But now the board no longer resets as before. I have to power-cycle it, as if it were stuck in the reset process.

    Then I commented out the "Disable Warm-reset-to-pin-reset" line and also read another register in the self-defined reset function (extra_reset_fxn()). The code is below:

    @simple_peripheral.c
    
    #include <ti/devices/cc26x0r2/inc/hw_prcm.h>
    #include <ti/devices/cc26x0r2/inc/hw_aon_sysctl.h>
    
    __no_init volatile uint32_t warmreset_stat;
    __no_init volatile uint32_t reset_ctrl;
    void extra_reset_fxn()
    {
      warmreset_stat = HWREG(PRCM_BASE + PRCM_O_WARMRESET);
      reset_ctrl     = HWREG(AON_SYSCTL_BASE + AON_SYSCTL_O_RESETCTL);
    }
    
    static void SimpleBLEPeripheral_init(void)
    {
      ...
    //HWREG(PRCM_BASE + PRCM_O_WARMRESET) &= ~PRCM_WARMRESET_WR_TO_PINRESET_M;
      ...
    }

    I get the following log message.

    ---------------------Power On--------------------
    17:29:35, [[Device Initial]]
    17:29:35, Reg     : WARMRESET ,RESETCTL
    17:29:35, Reg(hex): 0x00000004,0x00003AF2 --> RESET_SRC:1h = Reset pin
    17:29:35, snv_read Fail!!
    17:30:16, Pre:Success to snv_write.
    ---------------------Reboot--------------------
    17:30:16, [[Device Initial]]
    17:30:16, Reg(hex):0x00000004,0x00003AFC --> RESET_SRC:6h = Software reset via SYSRESET register
    17:30:16, Rst:Success to snv_read.
    17:31:03, Pre:Success to snv_write.
    ---------------------Reboot--------------------
    ...
    ---------------------Reboot--------------------
    17:38:21, [[Device Initial]]
    17:38:21, Reg(hex):0x00000004,0x00003AFC --> RESET_SRC:6h = Software reset via SYSRESET register
    17:38:21, Rst:Success to snv_read.
                                             --> [Error] Didn't execute the code of storing the current status
    ---------------------Reboot--------------------
    17:39:04, [[Device Initial]]
    17:39:04, Reg(hex):0x00000004,0x00003AFE --> RESET_SRC:7h = Software reset via PRCM warm reset request
    17:39:04, Rst:Success to snv_read.
    17:39:43, Pre:Success to snv_write.
    ---------------------Reboot--------------------
    17:39:43, [[Device Initial]]
    17:39:43, Reg(hex):0x00000004,0x00003AFC --> RESET_SRC:6h = Software reset via SYSRESET register
    17:39:43, Rst:Success to snv_read.
    17:40:30, Pre:Success to snv_write.
    ---------------------Reboot--------------------
    ...
    

    Why did the value of the WARMRESET register not change after a reset by my watchdog?
    Is it because the resetMode of my watchdog is Watchdog_RESET_ON?

    Based on the above information, can we say that something in the WDT NMI is causing a CPU fault and a reset?
    If so, what can I do to prevent this error?

  • Hi,

    Yes, I should have explained better. The reason we need to disable WR_TO_PIN is that if we do not, the reset cause is lost.

    But as you figured out, you can only do one "unplanned" reset this way, and then you have to press the reset button on the launchpad or power-cycle. This is because a warm reset (lockup/WDT) needs to be converted to an emulated pin reset in order for it to be a proper system reset.

    Unfortunately since you commented out that line, the status doesn't tell us anything useful.

    Best regards,
    Aslak
  • Hi Aslak,

    To check that I can read the correct value of the RESETCTL register,

    I changed the reset mode of the watchdog to "Watchdog_RESET_ON" and disabled "Warm-reset-to-pin-reset".

    I let the watchdog time out. The device then halts somewhere.

    After a power-cycle, the value I read from the RESETCTL register is 0x04.

    Why is the value of the RESETCTL register 0x04 after a power-cycle? I think it should be 0 by default (ref: swcu117.pdf, section 6.8.2.4.37).

    Thanks.

  • Hi David,

    Yes. The RESET_SRC field of RESETCTL should be 0x00 if you power-cycle and 0x01 if you do a pin reset. 0x04 would indicate a VDDR brown-out reset. If this is your own board, it's possible that you somehow get a brown-out during your power-cycle, depending on your power source and layout. I don't know too much about this.

    Best regards,
    Aslak
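
The RESET_SRC values quoted in this thread can be extracted from a raw RESETCTL reading with a shift and mask. The sketch below assumes RESET_SRC sits in bits 3:1, which is consistent with the logged values (0x3AF2 gives 1h, 0x3AFC gives 6h, 0x3AFE gives 7h). The macro and function names are hypothetical, and only the sources named in this thread are spelled out.

```c
#include <assert.h>
#include <stdint.h>

/* Assumed field layout: RESET_SRC in bits 3:1 of RESETCTL. */
#define RESETCTL_RESET_SRC_SHIFT 1u
#define RESETCTL_RESET_SRC_MASK  0x7u

/* Extract the 3-bit RESET_SRC field from a raw RESETCTL value. */
static uint32_t resetSource(uint32_t resetctl)
{
    return (resetctl >> RESETCTL_RESET_SRC_SHIFT) & RESETCTL_RESET_SRC_MASK;
}

/* Map a RESET_SRC value to the descriptions used in this thread. */
static const char *resetSourceName(uint32_t src)
{
    assert(src <= RESETCTL_RESET_SRC_MASK);
    switch (src) {
    case 0u: return "Power-on reset";
    case 1u: return "Reset pin";
    case 4u: return "VDDR brown-out";
    case 6u: return "Software reset via SYSRESET register";
    case 7u: return "Software reset via PRCM warm reset request";
    default: return "Other (see swcu117)";
    }
}
```

For example, `resetSourceName(resetSource(reset_ctrl))` could be printed in the boot log next to the raw register value.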
  • Hi Aslak,

    Thank you so much for your help.

    I'm sorry, I wrote the wrong register name in my last post.
    I actually meant the WARMRESET register, not the RESETCTL register.

    So my question is the following:

    To check that I can read the correct value of the WARMRESET register,

    I changed the reset mode of the watchdog to "Watchdog_RESET_ON" and disabled "Warm-reset-to-pin-reset".

    I let the watchdog time out. The device then halts somewhere.

    After a power-cycle, the value I read from the WARMRESET register is 0x04.

    Why is the value of the WARMRESET register 0x04 after a power-cycle? I think it should be 0 by default (ref: swcu117.pdf, section 6.8.2.4.37).

    Thanks.

  • Hi,

    Luckily that is easier to answer. It has to be because you read out the value of the register _after_ the setup.c :: trimDevice() reset/init function has executed and done a read/modify/write that clears the status bits and enables WR_TO_PIN.

    Did you add the cfg file lines at the top of the file? Most importantly before the Boot module.

    Best regards,
    Aslak
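
The read-order effect described above can be modelled on a host machine. The sketch below is not the actual trimDevice() code. It only imitates the described behaviour (status bits cleared, WR_TO_PINRESET set) on a fake register variable, and all names are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

#define WDT_STAT        0x1u
#define LOCKUP_STAT     0x2u
#define WR_TO_PINRESET  0x4u

/* Stand-in for the hardware register so the sketch runs on a host. */
static uint32_t fakeWarmReset;

/* Models the read/modify/write described above; not the real init code. */
static void modelInitRmw(void)
{
    uint32_t v = fakeWarmReset;
    v &= ~(WDT_STAT | LOCKUP_STAT);  /* status bits are cleared */
    v |= WR_TO_PINRESET;             /* WR_TO_PINRESET is enabled */
    fakeWarmReset = v;
}

/* Returns what a reset hook would capture, depending on whether it
 * reads the register before or after the init read/modify/write. */
static uint32_t captureStatus(uint32_t regAtBoot, int readBeforeInit)
{
    uint32_t captured;

    assert((regAtBoot & ~(WDT_STAT | LOCKUP_STAT | WR_TO_PINRESET)) == 0u);
    fakeWarmReset = regAtBoot;
    if (readBeforeInit) {
        captured = fakeWarmReset;  /* hook runs first: status still visible */
        modelInitRmw();
    } else {
        modelInitRmw();            /* init runs first: status already gone */
        captured = fakeWarmReset;
    }
    return captured;
}
```

With a watchdog status of 0x01 at boot, reading first yields 0x01, while reading after the modelled init yields 0x04, which matches the value reported in the question.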
  • Hi Aslak,

    Yes, I have added the following code at the top of the cfg file, cc2640_r2_csdk.cfg.

     ...
     ******************************************************************************
     Release Name: simplelink_cc2640r2_sdk_1_30_00_25
     Release Date: 2017-03-02 20:08:35
     *****************************************************************************/
    
    var Reset = xdc.useModule('xdc.runtime.Reset');
    Reset.fxns[Reset.fxns.length++] = '&extra_reset_fxn';
    
    /* ================ ROM configuration ================ */
    /*
     * To use BIOS in flash, comment out the code block below.
     */
    if (typeof NO_ROM == 'undefined' || (typeof NO_ROM != 'undefined' && NO_ROM == 0))
    {
    ...

    Is this the correct place to add the code?

  • Hi,

    Yes, this seems like a decent place. Personally I put it at the top of the application's app_ble.cfg, but same difference.

    This should result in a generated file <Output Folder>/configPkg/package/cfg/app_ble_pem3.c which should have a function xdc_runtime_Startup_reset__I(void) where extra_reset_fxn() should be the first one called. You should be able to find this by breaking in the extra_reset_fxn and navigating the call stack.

    Best regards,
    Aslak
  • Hi Aslak,

    Because the project is short on time, I cannot keep investigating the cause of this problem for now, and I must find another way to solve it.

    I put the results of the experiment in my Evernote (link). I hope they will be useful to somebody.

    Assuming that this abnormal restart is caused by CPU LOCKUP, is there any way to solve or avoid this problem?

    Or is there any other way to keep some important parameters before the watchdog reboots the system?

  • Hi David,

    My guidance would be that when you are in a WDT interrupt, it is because something is potentially seriously wrong. For that reason you should probably not rely on RTOS and stack API calls working. The safest thing to do would be to use only driverlib functions. In addition, you are running from a Hwi context with a stack pointer different from the normal one, and you don't necessarily know how much stack you have available.

    You don't really know whether the system is in the middle of an SNV operation when the watchdog fires, so writing to SNV from there is probably not very safe, or at least not bullet-proof.

    You could probably use the AUX RAM (2 kB) as an area to store crash information, and then read from it on boot. This would be mostly memory access, and should work regardless. See the User's Guide ( dev.ti.com/.../node ) --> Application --> Memory Management --> Using the AUX RAM as RAM for how to configure the linker to be aware of this area. You'd probably need to tell the linker not to initialize the sections you place there. Alternatively, just use pointer access to this memory region.

    Unfortunately this WDT / crash dump scenario isn't something we have documented very well.

    Best regards,
    Aslak
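
A minimal sketch of the crash-record idea suggested above, assuming the chosen RAM region keeps its contents across the watchdog reset. Everything here is hypothetical: the record layout (fields borrowed from the RecoveryStatusStruct usage earlier in the thread), the magic value, and the function names. A static variable stands in for the AUX RAM area so the sketch runs on a host. On target, `crashRec` would instead point into the uninitialized region.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define CRASH_MAGIC 0xC0DEFA11u  /* arbitrary marker, hypothetical */

typedef struct {
    uint32_t magic;       /* CRASH_MAGIC when the record is complete */
    uint32_t utcTime;
    uint8_t  rstCounter;
    uint8_t  isOledOff;
    uint32_t check;       /* simple XOR over the fields above */
} CrashRecord;

/* Host stand-in for the no-init memory area. */
static CrashRecord crashArea;
static volatile CrashRecord *const crashRec = &crashArea;

static uint32_t crashCheck(uint32_t utcTime, uint8_t rstCounter, uint8_t isOledOff)
{
    return CRASH_MAGIC ^ utcTime ^ ((uint32_t)rstCounter << 8) ^ (uint32_t)isOledOff;
}

/* Intended for the watchdog callback: plain memory stores only. */
static void crashSave(uint32_t utcTime, uint8_t rstCounter, uint8_t isOledOff)
{
    crashRec->utcTime    = utcTime;
    crashRec->rstCounter = rstCounter;
    crashRec->isOledOff  = isOledOff;
    crashRec->check      = crashCheck(utcTime, rstCounter, isOledOff);
    crashRec->magic      = CRASH_MAGIC;  /* written last so a partial record is rejected */
}

/* Intended for boot: copies a valid record to 'out' and invalidates it. */
static bool crashLoad(CrashRecord *out)
{
    assert(out != NULL);

    uint32_t utcTime    = crashRec->utcTime;
    uint8_t  rstCounter = crashRec->rstCounter;
    uint8_t  isOledOff  = crashRec->isOledOff;

    if (crashRec->magic != CRASH_MAGIC ||
        crashRec->check != crashCheck(utcTime, rstCounter, isOledOff)) {
        return false;  /* no record, or a corrupted one */
    }
    out->magic      = CRASH_MAGIC;
    out->utcTime    = utcTime;
    out->rstCounter = rstCounter;
    out->isOledOff  = isOledOff;
    out->check      = crashRec->check;
    crashRec->magic = 0u;  /* consume the record */
    return true;
}
```

The application could call `crashLoad()` once at startup and, if it returns true, write the recovered values to SNV from normal task context, where the stack APIs are safe to use. How the values are gathered inside the callback without RTOS calls is left open here.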