ResetISR

Akira Mitsui

Hello to all.

I am having a reset issue in my code, without success in finding the cause of the reset.

I am using a TM4C123FH6PM device.

I debugged the code and put breakpoints at every point where it performs a software reset (ROM_SysCtlReset). I also put breakpoints at FaultISR, NmiSR and IntDefaultISR to check whether it enters any of them.

Then I found out that the code goes directly to the reset handler:

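void ResetISR(void)
{
    /* Enable full access to coprocessors CP10 and CP11 (the FPU). */
    HWREG(NVIC_CPAC) = ((HWREG(NVIC_CPAC) &
                         ~(NVIC_CPAC_CP10_M | NVIC_CPAC_CP11_M)) |
                        NVIC_CPAC_CP10_FULL | NVIC_CPAC_CP11_FULL);

    /* Hand over to the IAR C runtime startup. */
    __iar_program_start();
}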

At the time the code reaches the reset handler, I tried checking the registers but didn't find any clue: PC points to the reset command line, LR is 0xFFFFFFFF, SP points to 0x2000779C which in memory is 0x00, 0x00, 0x00, 0x00 and from there the memory just shows 0xFF. The NVIC_FAULT_STAT (0xE000.ED28) just shows 0x00000000.
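For anyone reading that fault status word by hand, here is a minimal sketch, assuming the standard Cortex-M4 layout of the register at 0xE000ED28, of how it splits into its three sub-fields. The type and function names are hypothetical and are not part of TivaWare. A value of 0x00000000, as reported above, means no bit is set in any of the three fields.

```c
#include <stdint.h>

/* Sub-fields of the Cortex-M4 Configurable Fault Status Register,
 * the 32-bit word at 0xE000ED28. Hypothetical helper type. */
typedef struct {
    uint8_t  mmfsr;  /* bits 7:0   MemManage fault status */
    uint8_t  bfsr;   /* bits 15:8  BusFault status        */
    uint16_t ufsr;   /* bits 31:16 UsageFault status      */
} fault_status_t;

/* Split a raw fault status word into its three sub-fields. */
fault_status_t split_fault_status(uint32_t cfsr)
{
    fault_status_t fs;
    fs.mmfsr = (uint8_t)(cfsr & 0xFFu);
    fs.bfsr  = (uint8_t)((cfsr >> 8) & 0xFFu);
    fs.ufsr  = (uint16_t)((cfsr >> 16) & 0xFFFFu);
    return fs;
}

/* Returns 1 if any fault bit is latched, 0 otherwise. */
int any_fault_recorded(uint32_t cfsr)
{
    return cfsr != 0u;
}
```

On the target, the raw word would come from a read of the register itself; the sketch only takes the value as an argument so it can be tried on a PC.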

I'll appreciate any help with this. Thanks in advance.

over 9 years ago

0 cb1_mobile over 9 years ago

Guru 117855 points

Akira Mitsui said:
I am having a reset issue in my code, without success in finding the cause of the reset.

Indeed you've provided that info which you believe "necessary" - yet can that same (limited) info qualify as "sufficient?" (I think not)

May I suggest the following strategy to amplify understanding?

Usually one's code is represented by a collection of "functions" called sequentially from w/in "main." That's your case - is it not?
It is doubtful that each/every one of those functions lands you w/in your Reset ISR. That's true - is it not?
Armed w/the results of the above - you (then) may focus properly upon the "offending" function.

Now we know little of your board! I don't recognize that MCU's suffix as the one placed upon LPads. Is your board custom? Has it ever worked? You (surely) built "more than one" so that dreaded, "Single Board Anomaly" will not torment - did you not? Such "extra detail" moves your post (further) up the, "Sufficient Info" scale.

No description of your power design - its adequacy - and any/everything attached to your board appears. Should the MCU's power be glitched - reset is a predictable result. We've seen instances where user code causes GPIO and/or bus contention - often that's sufficient to reset the MCU - the info w/in your post does not enable such detection.

It is assumed your board design meets the published guidelines from this vendor. (they are quite good - imo)

Self diagnosis (alone) - at the physician's office or here - may not yield the depth of info required...

0 f. m. over 9 years ago

Guru 11940 points

Brown-out ?

Watchdog reset ?

Uncaptured exception ?

0 Akira Mitsui over 9 years ago in reply to cb1_mobile

Prodigy 210 points

Thank you for the quick reply.
The program runs for a random time and then it reset itself. It enters the ResetISR() function by itself. I am trying to find what is the cause of this but can't find any clue.

0 cb1_mobile over 9 years ago in reply to f. m.

Guru 117855 points

Greetings Frank,

Indeed - all are possible - none reported as, "Tested & Cleared" from "usual suspect" list... (pop up one post above yours - how closely we overlapped!)

0 cb1_mobile over 9 years ago in reply to Akira Mitsui

Guru 117855 points

Reread my posting - detailed procedure for assisting resides w/in. (Net "in/out" I respond in pieces so, "All is not lost.")

0 Akira Mitsui over 9 years ago in reply to cb1_mobile

Prodigy 210 points

Usually one's code is represented by a collection of "functions" called sequentially from w/in "main." That's your case - is it not?
Yes, that is the case.
It is a custom board. We have never run a long enough test to know whether this problem is recent or whether the board has "ever worked". We do have more than one board and we are testing two at the moment; one of them reset itself after 4 days, and the other after 7 days.
Sorry for my ignorance, but I didn't find any good explanation of the ResetISR function. Is it possible for the MCU to end up in this handler because of a bug in the code, or is that doubtful/impossible? Do you think a hardware problem is more probable?
About the GPIO pins: the ones that are not used are set as outputs. What exactly did you mean by bus contention?
Thanks again.

0 cb1_mobile over 9 years ago in reply to Akira Mitsui

Guru 117855 points

Once again you've provided much "necessary" information - but (still) insufficient info.

Four to Seven days of operating success increases the "degree of difficulty" in resolving your issue. Often that frequency of failure can be vastly increased - which greatly speeds & eases - the identification of the issue.

As your board is self-made - and we've no idea of your/firm's experience/capability - the design & execution of the build are "suspect." As stated previously - vendor here has a comprehensive, "Board design guide" - surely you'll find links to it (up top - among "Blogs, Groups, Videos") or perhaps NOT - yet "know" that the key design guide is "hidden away" somewhere! (it is not unusual to employ Google to find such vendor, key docs!) Note: italics aimed toward esteemed vendor - not to you - "forever" we outsiders have begged for, "Key/Critical/Essential MCU data" to be removed from, "Top Secret" (buried, hidden, obscured AND inconsistent) repository! Maddening AND Self-Destructive! (and persistent!)

Power always is critical - any surge or software fault which connects two MCU outputs may draw sufficient current to cause the MCU to reset. That reset clears the "short or near-short circuit" and the MCU continues. Bus contention occurs when one or several pins - from (usually) different devices - are interconnected and each side is set as "output." (each output then is placed in a potentially damaging condition - enough current may flow to reset the MCU - often one (or both) components are damaged during this unwanted event.)
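As a design-review aid, the definition above (both sides of a shared line configured as outputs) can be written as a bitmask check. This is only a sketch to run on a PC against your own pin tables: the function and parameter names are hypothetical, and the MCU has no way to read the other device's direction settings by itself.

```c
#include <stdint.h>

/* Each argument is a bitmask with one bit per line.
 * mcu_outputs:  lines the MCU drives as outputs
 * peer_outputs: lines the other device drives as outputs
 * shared_lines: lines that are physically connected between the two
 * Returns the lines where both sides drive at once (contention). */
uint32_t contention_mask(uint32_t mcu_outputs,
                         uint32_t peer_outputs,
                         uint32_t shared_lines)
{
    return mcu_outputs & peer_outputs & shared_lines;
}
```

A non-zero result names the lines worth checking on the schematic and in the GPIO initialization code.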

IMO - you must eliminate any/all power supply design & implementation weaknesses first - then systematically exercise each/every software function - ideally at a far higher rate - so that you can, "CAUSE the ISR Fault SOONER rather than LATER!"

0 Akira Mitsui over 9 years ago in reply to cb1_mobile

Prodigy 210 points

I see.
In that case, I will search for the board design guide and check the board.
Also, I will examine the possibility of an MCU output short circuit and bus contention. By the way, I was thinking about the possibility of a stack overflow; could that cause the program to enter the ResetISR() function?

0 f. m. over 9 years ago in reply to cb1_mobile

Guru 11940 points

Besides cb1's points, I would not rule out a 'rare' performance overload condition. An unexpectedly high interrupt cadence might bring your system down, perhaps with a 'simple' stack overflow. Such things can go unnoticed for a long time. And a stack overflow usually adds an accidental component to the resulting hard fault.

Stress-testing your board in two ways might provide more insight. The first - test your hardware for extended periods (hours) at elevated and decreased temperatures, monitoring that your software runs properly. Second - test your software with as many external input events and combinations thereof (whatever this is in your case) as possible, even if they seem 'impossible' during normal operation. Whilst the discovered issues might not be relevant for the product, it increases knowledge and confidence in your device.
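One common way to see how close the stack gets to overflowing during such a stress test is to "paint" it with a known pattern and later count how much of the pattern survives. The following is a minimal sketch of that idea, not a drop-in for any particular toolchain: the fill value and function names are assumptions, and on the target the region bounds would have to come from your linker configuration. Only the part of the stack below the current stack pointer may be painted.

```c
#include <stddef.h>
#include <stdint.h>

#define STACK_FILL 0xDEADBEEFu  /* arbitrary pattern, chosen for illustration */

/* Fill a (currently unused) stack region with the pattern. */
void stack_paint(uint32_t *base, size_t words)
{
    for (size_t i = 0; i < words; i++) {
        base[i] = STACK_FILL;
    }
}

/* Count the words, starting at the lowest address, that still hold the
 * pattern. The Cortex-M stack grows downward, so untouched words sit at
 * the low end of the region. A result of 0 suggests the stack has been
 * used all the way down, i.e. it overflowed or came very close. */
size_t stack_unused_words(const uint32_t *base, size_t words)
{
    size_t n = 0;
    while (n < words && base[n] == STACK_FILL) {
        n++;
    }
    return n;
}
```

Reading the unused count periodically, or after a long run, shows how much headroom is left under the heaviest interrupt load seen so far.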

0 cb1_mobile over 9 years ago in reply to f. m.

Guru 117855 points

Bravo f.m. - all extremely pertinent & insightful - not earlier presented!

When you note "seems impossible" may I add, "Especially when such seems impossible!"

Too often - "impossible" describes our ability to predict & fully account! Reality - in the guise of "Murphy" - has no such shortcoming...

0 Akira Mitsui over 9 years ago in reply to f. m.

Prodigy 210 points

Interesting.

Well, I will test and analyze the board and the program further with that advice, hoping to resolve this problem.
I appreciate the help and time from both of you.
Thank you.

0 cb1 over 9 years ago in reply to Akira Mitsui

Guru 47900 points

May I suggest that you heed poster f.m.'s advice re: Do everything in your power to, "Cause those failures more quickly!" Even tests which "PASS" for 168 hours (7 days) may well fail upon the start of Day 8! (My firm experienced just that - AND failures after multiple weeks of continuous, otherwise successful operation.)

You may recall a major auto firm's struggle w/"Unintended Acceleration." Great periods of testing were performed - yet "Live - in the court-room" an unusual sequence of operations were launched - and these ALWAYS caused the failure - and Judge, Jury, Attorneys (for both sides) and witnesses were ALL STUNNED! That failure was undoubtedly PROVED - right there - right then - and the auto-firm was (properly) fined severely!

Duration of testing thus - is poor substitute for, "Cleverness and Aptness of Testing!" Neither quick nor easy! And (both) very Necessary!

0 Petrei over 9 years ago in reply to Akira Mitsui

Guru 26105 points

Hi,

It is not quite clear from your description - when you start debugging, does the debugger stop immediately at ResetISR or after some running period? If the first case, did you configure your IDE to stop at main at startup or at ResetISR? Did you set a breakpoint at main and watch whether it is reached?

0 cb1 over 9 years ago in reply to Petrei

Guru 47900 points

Hi Petrei,

His report is long & a bit scrambled - yet he did state his boards ran (successfully) for 4 & 7 days. I would not expect the debugger to be attached during those runs...

0 Robert Adsett over 9 years ago

Guru 27665 points

First, ResetISR will leave little to no trace in the registers as to its source except for the ResetCause register. There is a TivaWare function to read it: SysCtlResetCauseGet. You should read that immediately on reset and save it/display it. I would then clear it so each reset event is delineated as distinctly as possible.

Second as cb1 suggests do as much as you can to increase the frequency of the resets.

One of the things I do is log occurrences of resets and their source. It can be just a count of occurrences or it could contain more information.
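A minimal sketch of such a log, kept as one counter per cause bit. The bit values below are assumed to match the SYSCTL_CAUSE_* masks in TivaWare's driverlib/sysctl.h, so check them against your copy of the header; the struct and function names are hypothetical. On the target, the raw value would come from SysCtlResetCauseGet() and then be cleared with SysCtlResetCauseClear(); here it is passed in as an argument so the logic can be tried on a PC.

```c
#include <stdint.h>

/* Assumed to match SYSCTL_CAUSE_* in driverlib/sysctl.h; verify locally. */
#define CAUSE_EXT   0x00000001u  /* external reset pin   */
#define CAUSE_POR   0x00000002u  /* power-on reset       */
#define CAUSE_BOR   0x00000004u  /* brown-out reset      */
#define CAUSE_WDOG0 0x00000008u  /* watchdog 0 reset     */
#define CAUSE_SW    0x00000010u  /* software reset       */
#define CAUSE_WDOG1 0x00000020u  /* watchdog 1 reset     */

#define NUM_CAUSES 6u

/* Hypothetical log: count[n] counts resets with cause bit n set. */
typedef struct {
    uint32_t count[NUM_CAUSES];
    uint32_t last_raw;           /* raw cause word of the latest reset */
} reset_log_t;

/* Record one reset event in the log. */
void reset_log_record(reset_log_t *rl, uint32_t raw_cause)
{
    rl->last_raw = raw_cause;
    for (uint32_t bit = 0; bit < NUM_CAUSES; bit++) {
        if (raw_cause & (1u << bit)) {
            rl->count[bit]++;
        }
    }
}

/* Name for a single cause bit; anything else is reported as such. */
const char *reset_cause_name(uint32_t single_cause)
{
    switch (single_cause) {
    case CAUSE_EXT:   return "external";
    case CAUSE_POR:   return "power-on";
    case CAUSE_BOR:   return "brown-out";
    case CAUSE_WDOG0: return "watchdog 0";
    case CAUSE_SW:    return "software";
    case CAUSE_WDOG1: return "watchdog 1";
    default:          return "unknown or multiple";
    }
}
```

For the counts to be useful, the log has to live somewhere the startup code does not zero, for example a `__no_init` variable in IAR or non-volatile storage; which of these survives a given kind of reset on your board is something to confirm by test.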

Robert

0 Akira Mitsui over 9 years ago in reply to Robert Adsett

Prodigy 210 points

Thanks to all who replied.

Petrei, I am using IAR Workbench. I have breakpoints on every line that calls SysCtlReset(), and also on FaultISR(), NmiSR() and ResetISR(). The ResetISR function is actually reached as soon as I start debugging, because it prepares the system for the C(++) program (so I have read). Execution then reaches main and continues in an infinite loop where tasks are processed according to flags set by different timer interrupts; the watchdog timer is also reinitialized every cycle.

cb1, yes, the debugger was attached during those 4/7 days because we need to fix the problem urgently. After that period the program stopped at the ResetISR() function again, without stopping at any other breakpoint first and without any clue in the registers, and the program reboots, starting again from main.

I already increased the interrupt frequency in order to check for a stack overflow and to increase the frequency of the resets; so far it is not resetting.

0 Robert Adsett over 9 years ago in reply to Akira Mitsui

Guru 27665 points

Akira Mitsui said:
After that period the program stopped at the ResetISR() function again, without stopping at any other breakpoint first and without any clue in the registers, and the program reboots, starting again from main.

Check the reset cause register. It will tell you why you reset.

Robert

0 cb1 over 9 years ago in reply to Robert Adsett

Guru 47900 points

While poster reports, "W/out stopping @ any breakpoint" - should the "Event" be sufficient - might that "ResetCause Register" be corrupted?

0 Akira Mitsui over 9 years ago in reply to Robert Adsett

Prodigy 210 points

I just wrote the code to read the cause of the reset with the SysCtlResetCauseGet function at the initialization of the program and save it; after that I clear it in order to distinguish between events, as you suggested. Now I am just waiting for it to happen to see what I get from it.
Thank you.

0 Robert Adsett over 9 years ago in reply to cb1

Guru 27665 points

Very unlikely, I think. If you clear the register after reading it on startup, then if only a single bit is set you can be sure it's the cause. Only if multiple bits are set will there be any question.
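The "single bit set" test can be made explicit with a small check; this is only a sketch, and the function name is hypothetical. It takes the raw cause word as read at startup and returns whether exactly one bit is set.

```c
#include <stdint.h>

/* Returns 1 if exactly one cause bit is set, 0 if none or several are.
 * (x & (x - 1)) clears the lowest set bit, so the result is zero only
 * when x had at most one bit set; the x != 0 test excludes "none". */
int reset_cause_is_unambiguous(uint32_t raw_cause)
{
    return raw_cause != 0u && (raw_cause & (raw_cause - 1u)) == 0u;
}
```

When it returns 0 with a non-zero cause word, the saved value is still worth logging as-is, since the combination of bits may itself be a hint.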

Robert

0 Robert Adsett over 9 years ago in reply to Robert Adsett

Guru 27665 points

I had to use this myself to track down EMI caused resets.

Robert

0 cb1 over 9 years ago in reply to Akira Mitsui

Guru 47900 points

Akira Mitsui said:
I already increased the interrupt frequency in order to check for a stack overflow

While conventional wisdom urges you to INCREASE Stack Size - might it prove insightful to DECREASE it? Then note "IF & to what DEGREE" the issue persists.

While we know, love & believe in IAR (paid, multi-seat) attaching any such debug probe while a system is undergoing "Serious Test" adds, "intrusion variables" which may confound your test & results! Many serious test facilities "disallow" debug probe attachment while tests are active...
