The MCU's "WATCHDOG" - KEY CONSIDERATIONS to ENHANCE PROGRAM ROBUSTNESS BY DETECTING CODE EXECUTION FAULTS and/or (even) 'EXTERNAL DEVICE' HANGS or ERRORS!

cb1_mobile

Recently a forum poster - and several w/in my firm's staff - have 'Run Afoul' of the MCU's Watchdog. That cannot be good - the Watchdog (WDG), when used carefully & correctly, provides great value.

This post aims to present:

an abbreviated summary of what is believed to be the 'key guidance/direction' - supplied (both) w/in the 'MCU Manual' as well as the 'Peripheral Driver's User Guide' (PDL).
a series of (hopefully) 'thought provoking questions' - which guide users FAR beyond the (usual) 'Cut n Paste' (i.e. effort lite) 'solutions.'
do note as well - the 'API - WDG Source Code' - which provides a 'deep dive' into how many/most of these functions 'really work.'
and - located at the 'end' - several issues (likely) requiring 'Vendor Inside Knowledge & Comment' (if they'd be 'so kind').

Follows Staff & my sense of the MCU Manual's 'Key WDG Factoids:'

A watchdog timer can generate a non-maskable interrupt (NMI), a regular interrupt or a reset when a time-out value is reached. The watchdog timer is used to regain control when a system has failed due to a software error or due to the failure of an external device to respond in the expected way.

The Watchdog Timer can be configured to generate an interrupt to the controller on its first time-out, and to generate a reset signal on its second time-out.

The Watchdog Timer module generates the first time-out signal when the 32-bit counter reaches the zero state after being enabled; enabling the counter also enables the watchdog timer interrupt.
If the timer counts down to its zero state again before the first time-out interrupt is cleared, and the reset signal has been enabled by setting the RESEN bit in the WDTCTL register, the Watchdog timer asserts its reset signal to the system. If the interrupt is cleared before the 32-bit counter reaches its second time-out, the 32-bit counter is loaded with the value in the WDTLOAD register, and counting resumes from that value.

Writing to WDTLOAD does not clear an active interrupt. An interrupt must be specifically cleared by writing to the Watchdog Interrupt Clear (WDTICR) register. The Watchdog module interrupt and reset generation can be enabled or disabled as required. When the interrupt is re-enabled, the 32-bit counter is preloaded with the load register value - and not its last state. The watchdog timer is disabled by default out of reset. To achieve maximum watchdog protection of the device, the watchdog timer can be enabled at the start of the reset vector.

And now - that from the PDL:

A watchdog timer module’s function is to prevent system hangs. The watchdog timer module consists of a 32-bit down counter, a programmable load register, interrupt generation logic, and a locking register. Once the watchdog timer has been configured, the lock register can be written to prevent the timer configuration from being inadvertently altered.

The watchdog timer can be configured to generate an interrupt to the processor after its first timeout, and to generate a reset signal after its second timeout. The watchdog timer module generates the first timeout signal when the 32-bit counter reaches the zero state after being enabled; enabling the counter also enables the watchdog timer interrupt. After the first timeout event, the 32-bit counter is reloaded with the value of the watchdog timer load register, and the timer resumes counting down from that value. If the timer counts down to its zero state again before the first timeout interrupt is cleared, and the reset signal has been enabled, the watchdog timer asserts its reset signal to the system. If the interrupt is cleared before the 32-bit counter reaches its second timeout, the 32-bit counter is loaded with the value in the load register, and counting resumes from that value. If the load register is written with a new value while the watchdog timer counter is counting, then the counter is loaded with the new value and continues counting.
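To make the 'two time-out' sequence above easier to follow, here is a small host-side C model of it. This is only a sketch built from the manual and PDL text quoted above - it is not vendor code. The type `wdg_model_t` and the functions `wdg_enable()`, `wdg_tick()`, `wdg_int_clear()` and `wdg_write_load()` are made-up names, and the model assumes the counter starts from the load value when the WDG is enabled.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical software model of the two-stage watchdog described above. */
typedef struct {
    uint32_t load;           /* models WDTLOAD */
    uint32_t count;          /* models the 32-bit down counter */
    bool     enabled;
    bool     reset_enabled;  /* models the RESEN bit */
    bool     int_pending;    /* first time-out seen, not yet cleared */
    bool     reset_asserted; /* second time-out with RESEN set */
} wdg_model_t;

/* Enable the model; ASSUMPTION: the counter starts from the load value. */
static void wdg_enable(wdg_model_t *w, uint32_t load, bool reset_enabled)
{
    assert(load != 0u);
    w->load = load;
    w->count = load;
    w->enabled = true;
    w->reset_enabled = reset_enabled;
    w->int_pending = false;
    w->reset_asserted = false;
}

/* Advance the model by one watchdog clock. */
static void wdg_tick(wdg_model_t *w)
{
    if (!w->enabled || w->reset_asserted) {
        return;
    }
    if (--w->count != 0u) {
        return;
    }
    /* Counter reached zero: this is a time-out. */
    if (w->int_pending && w->reset_enabled) {
        w->reset_asserted = true;   /* second time-out, interrupt never cleared */
        return;
    }
    w->int_pending = true;          /* first time-out raises the interrupt */
    w->count = w->load;             /* counter reloads and keeps counting */
}

/* Models a write to WDTICR: clears the interrupt and reloads the counter. */
static void wdg_int_clear(wdg_model_t *w)
{
    w->int_pending = false;
    w->count = w->load;
}

/* Models a write to WDTLOAD: restarts the count, leaves the interrupt pending. */
static void wdg_write_load(wdg_model_t *w, uint32_t value)
{
    assert(value != 0u);
    w->load = value;
    w->count = value;
}
```

In this model, rewriting the load register after the first time-out only postpones the reset by one more period, because the pending interrupt is still there; only `wdg_int_clear()` returns the WDG to its 'first time-out' state.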

WatchdogIntClear() Because there is a write buffer in the Cortex-M processor, it may take several clock cycles before the interrupt source is actually cleared. Therefore, it is recommended that the interrupt source be cleared early in the interrupt handler (as opposed to the very last action) to avoid returning from the interrupt handler before the interrupt source is actually cleared. Failure to do so may result in the interrupt handler being immediately reentered (because the interrupt controller still sees the interrupt source asserted). This function has no effect if the watchdog timer has been locked.

With the above as 'reference' - these, 'WDG DESIGN STRATEGIES' aim to, 'Boost User Recognition - then Understanding' - thus Success!

'Tasks Required' to implement a successful Watchdog: (these should be considered - and resolved - by those employing the WDG)

How is the WDG 'Timeout Period' to be determined - then set - by the MCU User?
What is the 'proper method' to 'Clear the WDG Interrupt?'
Identify the, 'Time Periods of most consequence' - when employing the MCU's WDG.
Where should the, 'Code which services the WDG Interrupt' be placed w/in the user's program? Why?
Is it 'necessary' to do other than simply 'Reload the WDG Load Register' - to prevent the (potentially) unwanted MCU Reset?
What if anything can be done to detect & record an 'external device's Hang and/or Fault' - which delays (even blocks) the 'Clearing of WDG Interrupt?'
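One possible way to approach the last two questions is sketched below: each monitored activity 'checks in', and the WDG interrupt is only cleared when every activity has done so. This is a host-side illustration, not a recommended or vendor-endorsed design. The task names, `wdg_checkin()`, `wdg_service()` and `g_last_missing` are all hypothetical. On the target, the handler would call the PDL's `WatchdogIntClear()` only when `wdg_service()` returns true, and `g_last_missing` is assumed to live somewhere that survives the reset.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical activities to monitor; TASK_SENSOR_BUS stands in for
 * code that waits on an external device. */
enum { TASK_CONTROL_LOOP, TASK_SENSOR_BUS, TASK_COUNT };

static volatile uint32_t g_checkins;     /* bit n set: task n reported in */
static volatile uint32_t g_last_missing; /* tasks that failed to report in */

/* Called by each task once it has completed a healthy pass. */
static void wdg_checkin(unsigned task)
{
    assert(task < TASK_COUNT);
    g_checkins |= (1u << task);
}

/* Body of the WDG interrupt handler.
 * Returns true when the interrupt should be cleared (all tasks healthy).
 * Returns false otherwise; leaving the interrupt pending lets the second
 * time-out reset the MCU, and the missing tasks are recorded first. */
static bool wdg_service(void)
{
    const uint32_t all = (1u << TASK_COUNT) - 1u;
    uint32_t seen = g_checkins;

    g_checkins = 0u;                 /* start a fresh observation window */
    if (seen == all) {
        return true;
    }
    g_last_missing = all & ~seen;    /* record who went silent */
    return false;
}
```

The point of the design is that a hung external device (here the sensor bus) stops its task from checking in, so the fault is both detected and named before the reset arrives. Simply reloading the counter from the main loop would hide it.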

And to the Vendor:

Kindly note the last sentence - w/in the last (PDL) paragraph (WatchdogIntClear() ). "This function (i.e. WatchdogIntClear() ) has no effect if the watchdog timer has been locked." Can this be true? Does this not mean that the WDG 'Cannot be Cleared' - if & when 'Locked?' That cannot be true - or more likely - that language is poor & misleading.

Another point of note - NOWHERE could Staff nor I find the 'Direct Statement of the 'Initial Value' Loaded into the WDG - prior to (any, especially the first) WDG Interrupt!' Our only 'Means of Discovery' came from, 'Review of the '(WDTLOAD) Register' - which defaults to, '0xFFFF.FFFF' upon Reset. (which yields, '53.7 Seconds @ 80MHz' & '35.8 Seconds @ 120MHz.') While never expressly stated - it appears that the, 'Content of the 'WDTLOAD' Register' is employed - as soon as the WDG is enabled. Is that so? Thank you - your 'inside knowledge' is appreciated...
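The arithmetic behind those figures can be checked with a few lines of C. This is a minimal sketch which assumes the WDG counter is clocked at the system clock rate and that the first time-out arrives after 'WDTLOAD' counts. The helper names `wdg_timeout_seconds()` and `wdg_load_for_ms()` are made up for illustration and are not part of the PDL.

```c
#include <assert.h>
#include <stdint.h>

/* Seconds until the first time-out for a given load value.
 * ASSUMPTION: the watchdog counts at clock_hz (the system clock). */
static double wdg_timeout_seconds(uint32_t load, uint32_t clock_hz)
{
    assert(clock_hz != 0u);
    return (double)load / (double)clock_hz;
}

/* Load value for a desired first time-out in milliseconds.
 * Saturates at 0xFFFFFFFF, the largest value a 32-bit register holds. */
static uint32_t wdg_load_for_ms(uint32_t timeout_ms, uint32_t clock_hz)
{
    uint64_t ticks = ((uint64_t)clock_hz * timeout_ms) / 1000u;
    return (ticks > UINT32_MAX) ? UINT32_MAX : (uint32_t)ticks;
}
```

With the reset default of 0xFFFFFFFF this gives roughly 53.7 seconds at 80 MHz and 35.8 seconds at 120 MHz, matching the figures above. A 500 ms time-out at 80 MHz works out to a load value of 40,000,000.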

over 5 years ago

Genatco over 5 years ago


cb1_mobile said:
Kindly note the last sentence - w/in the last (PDL) paragraph (WatchdogIntClear() ). "This function (i.e. WatchdogIntClear() ) has no effect if the watchdog timer has been locked." Can this be true? Does this not mean that the WDG 'Cannot be Cleared' - if & when 'Locked?' That cannot be true - or more likely - that language is poor & misleading.

No - it seems not true after all.

Register 4: Watchdog Interrupt Clear (WDTICR), offset 0x00C
This register is the interrupt clear register. A write of any value to this register clears the Watchdog
interrupt and reloads the 32-bit counter from the WDTLOAD register. Write to this register when a
watchdog time-out interrupt has occurred to properly service the Watchdog. Value for a read or
reset is indeterminate.
Note: Locking the watchdog registers by using the WDTLOCK register does not affect the WDTICR
register and allows interrupts to always be serviced. Thus, a write at any time of the WDTICR
register clears the WDTMIS register and reloads the 32-bit counter from the WDTLOAD
register. The WDTICR register should only be written when interrupts have triggered and
need to be serviced.

However, locking the WDG does block write access to the other control registers.
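The difference can be pictured with a small host-side C model. This is a sketch only: `wdg_regs_t` and the three write helpers are hypothetical names, the unlock key 0x1ACCE551 is the value the datasheet gives for WDTLOCK, and a blocked write is assumed to be silently ignored.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define WDT_UNLOCK_KEY 0x1ACCE551u  /* WDTLOCK unlock value per the datasheet */

/* Hypothetical model of the registers relevant to locking. */
typedef struct {
    uint32_t load;        /* models WDTLOAD */
    uint32_t count;       /* models the down counter */
    bool     locked;      /* models the WDTLOCK state */
    bool     int_pending; /* models WDTMIS */
} wdg_regs_t;

/* Writing the key unlocks; writing anything else locks. */
static void wdg_regs_write_lock(wdg_regs_t *w, uint32_t value)
{
    w->locked = (value != WDT_UNLOCK_KEY);
}

/* WDTLOAD write: ASSUMED to be ignored while locked.
 * Returns true if the write took effect. */
static bool wdg_regs_write_load(wdg_regs_t *w, uint32_t value)
{
    assert(value != 0u);
    if (w->locked) {
        return false;
    }
    w->load = value;
    w->count = value;
    return true;
}

/* WDTICR write: not affected by the lock, per the register note above. */
static void wdg_regs_write_icr(wdg_regs_t *w)
{
    w->int_pending = false;
    w->count = w->load;
}
```

So in this model a locked WDG can still be serviced from its interrupt handler, while a stray write cannot stretch the time-out period.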
