One of the things that frustrated me most when getting started with TivaWare (StellarisWare at the time) was ending up in FaultISR() because I hadn't enabled a peripheral before trying to use it, or because of some similar mistake, and then having to single-step through my source to find where it went wrong. The standard advice (1) seems to be to decipher the values in the NVIC_FAULTSTAT and NVIC_FAULTADDR registers, then manually decode the interrupt stack to determine the address of the instruction after the one that caused the fault (2). I have done that, but 99% of the time I can find the problem with much less work using a modified FaultISR() like this:
static void
FaultISR(void)
{
    volatile int i = 1;
    while(i)
    {
    }
}
With this implementation, when you pause the debugger and find that you are in FaultISR(), you can usually find the cause as follows:
- Use the debugger's Variables view to change the value of i to 0 so it will exit the loop.
- Click the C step-into button (usually twice is enough) until the call stack display changes to show main() second from the bottom.
- Click on the second-from-the-top item in the call stack. It should show your source code.
- Look at the instruction before the one indicated. It is likely the one that caused the fault.
I suggest that TI change the default implementation of FaultISR() to something like that. The version I use also has a lot of comments in it; I'll include it below (3).
Even better, perhaps the debugger could be changed so that it can decode the stack trace from within an interrupt service routine. I realize that the stack frame pushed on entry to an ISR is different from the one pushed for a function call, so there would need to be some way to decide which decoding method to use. If it isn't practical to do that automatically in the general case, I could imagine a way for the user to provide a hint (maybe using a checkbox) that a particular stack frame belongs to an ISR. Perhaps the debugger could maintain a list of symbol names that had been marked that way so it would know to treat them as ISRs the next time as well. That list could default to containing known ISRs like FaultISR(), NmiISR() and ResetISR(), or could be initialized with the entries from g_pfnVectors[]. That way new users would immediately be able to see where they went wrong; my guess is that this would eliminate many of the questions about FaultISR() on this forum. It should be possible to decode the stack even if there are nested ISRs (higher-priority interrupts happening while processing a lower-priority one) and ISRs that call functions.
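On the question of how a debugger could tell an ISR frame from a function frame: on Cortex-M3/M4 cores the link register holds a special EXC_RETURN code (not a code address) while a handler runs, which gives one possible test. The sketch below only illustrates that idea. The function names are mine (hypothetical) and are not part of TivaWare or of any debugger.

```c
#include <stdbool.h>
#include <stdint.h>

// An EXC_RETURN code has its top four bits set. Ordinary return addresses
// point into code memory, so they never look like this.
static bool
IsExcReturn(uint32_t ui32LR)
{
    return (ui32LR & 0xF0000000u) == 0xF0000000u;
}

// Bit 2 selects the stack that holds the exception frame:
// 0 = main stack (MSP), 1 = process stack (PSP).
static bool
FrameIsOnProcessStack(uint32_t ui32ExcReturn)
{
    return (ui32ExcReturn & 0x4u) != 0;
}

// Bit 3 is clear when the exception preempted another handler, which is the
// nested-ISR case; it is set when the exception interrupted thread-mode code.
static bool
PreemptedAnotherHandler(uint32_t ui32ExcReturn)
{
    return (ui32ExcReturn & 0x8u) == 0;
}

// Bit 4 is clear when the frame also carries floating-point state, which
// makes the frame longer than the basic eight words.
static bool
FrameHasFloatState(uint32_t ui32ExcReturn)
{
    return (ui32ExcReturn & 0x10u) == 0;
}
```

For example, 0xFFFFFFF9 means "basic frame on the main stack, interrupted thread mode", while 0xFFFFFFF1 indicates that one handler preempted another.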
I should note that if something overwrites the stack (stack is too small, buffer overflow, etc.), the call stack will be corrupt and none of these methods will help. There are some source code comments about detecting stack overflows below (3, again).
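One common way to find out after the fact whether the stack has run out of room is "stack painting": fill the stack region with a known pattern at startup and later count how much of the pattern survives. This is a generic technique, not something from TivaWare, and the names and pattern value below are my own assumptions. The sketch works on any word buffer so it can be tried on a PC; on a target you would pass the real stack's base address and size from your linker or startup file.

```c
#include <stddef.h>
#include <stdint.h>

#define STACK_FILL_PATTERN 0xDEADBEEFu  // arbitrary marker value (assumption)

// Fill a stack region with the marker. On a target this would have to run
// early in startup, before the region is in use.
static void
StackPaint(uint32_t *pui32Base, size_t wordCount)
{
    for(size_t i = 0; i < wordCount; i++)
    {
        pui32Base[i] = STACK_FILL_PATTERN;
    }
}

// Count the words, starting at the lowest address, that still hold the
// marker. The Cortex-M stack grows downward, so this is the headroom that
// has never been used. A result of 0 suggests the stack reached (or passed)
// the bottom of its region.
static size_t
StackUnusedWords(const uint32_t *pui32Base, size_t wordCount)
{
    size_t n = 0;
    while((n < wordCount) && (pui32Base[n] == STACK_FILL_PATTERN))
    {
        n++;
    }
    return n;
}
```

This only reports the high-water mark when you check it. It will not stop the program at the moment of overflow the way a watchpoint does, and it can be fooled if the program happens to store the marker value itself.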
Steve
(2) - SPMA043 describes how to manually decode the stack trace. I'll try to include an example in a follow-up post. /cfs-file/__key/communityserver-discussions-components-files/908/0842.spma043_2D00_Diagnosing-Software-Faults-in-Stellaris_AE00_-Microcontrollers.pdf
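Until I post a worked example, here is a rough sketch of what the manual decoding amounts to. On exception entry a Cortex-M3/M4 pushes eight words (R0-R3, R12, LR, PC, xPSR) when no floating-point state is stacked, so the interesting values sit at fixed offsets from the stack pointer the handler started with. The type and function names are hypothetical, and the sketch assumes the basic eight-word frame.

```c
#include <stdint.h>

// The basic exception frame, lowest address first.
typedef struct
{
    uint32_t ui32R0;
    uint32_t ui32R1;
    uint32_t ui32R2;
    uint32_t ui32R3;
    uint32_t ui32R12;
    uint32_t ui32LR;    // LR of the interrupted code
    uint32_t ui32PC;    // where the interrupted code would resume
    uint32_t ui32xPSR;  // status register of the interrupted code
} tExceptionFrame;

// Copy the eight stacked words into a struct. pui32SP must point at the
// first word of the frame, i.e. the stack pointer value on handler entry.
static void
DecodeExceptionFrame(const uint32_t *pui32SP, tExceptionFrame *psFrame)
{
    psFrame->ui32R0   = pui32SP[0];
    psFrame->ui32R1   = pui32SP[1];
    psFrame->ui32R2   = pui32SP[2];
    psFrame->ui32R3   = pui32SP[3];
    psFrame->ui32R12  = pui32SP[4];
    psFrame->ui32LR   = pui32SP[5];
    psFrame->ui32PC   = pui32SP[6];
    psFrame->ui32xPSR = pui32SP[7];
}
```

In a debugger the same thing can be done by hand: open a memory view at the handler's entry stack pointer and read the seventh word (offset 0x18) for the stacked PC, then look that address up in the disassembly.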
(3) - My full version of FaultISR(). Most of the differences from what I put above are comments, and some won't apply to others. A comment could be added about uninitialized peripherals being a common cause of ending up in FaultISR().
//*****************************************************************************
//
// This is the code that gets called when the processor receives a fault
// interrupt. It prints any queued debug messages (including tracepoints)
// then enters an infinite loop, preserving the system state for examination
// by a debugger.
//
// If you have trouble figuring out why we get here, check to see if there is a
// way to see which vector was used. Could also make multiple ISRs and make
// each suspect vector point to a different one. Update: it looks like FaultISR(),
// unlike IntDefaultHandler(), is pointed to by only one vector. Might still
// be able to look at register values and determine what triggered the "hard fault".
// - Also consider setting a global to different values at various places in the
//   code so you can check its value here and see which of those places set it last.
//   Perhaps use a macro that sets a "checkpoint" (pointer to the filename or
//   maybe function name and an int to the line number). Search for
//   "ktowyawesctlcic". Todo.
//
//*****************************************************************************
static void
FaultISR(void)
{
    // Print messages before going into infinite loop. If the COP watchdog is
    // enabled, this printing probably would have happened in a bit anyway when
    // watchdogTimeoutISR() got called.
    extern volatile uint32_t sysTickMillisecondCount; // defined in sysTick2.c
    blockWhilePrintAllDebugMessages( sysTickMillisecondCount, "FaultISR" );

    //
    // Enter an infinite loop.
    //
    // There are a couple of ways to (sometimes) determine which code was running
    // when the fault occurred:
    // - Exit this loop and single-step out of this ISR to the calling code.
    //   To trace back out of this ISR, use debugger to change the value
    //   of i to 0, then click the "Assembly Step Into" button several times
    //   (usually 4 to 6) or try the C step-into button.
    // - See "Debugging - tracing how got into ISR.docx" for how to inspect
    //   the stack by hand and modify the PC to make the debugger show the calling
    //   code.
    //
    // If unable to trace back out, it may be because the stack got hammered.
    // - It could be that the stack is too small and is overflowing.
    //   - There is a process for checking the available stack space documented
    //     in the firmware release procedure.
    //   - You can set a watchpoint on __stack as described in
    //     http://processors.wiki.ti.com/index.php/Watchpoints_for_Stellaris_in_CCS
    //     to get the debugger to stop at the point the stack overflows.
    // - It could be that the stack frame is being overwritten even if the stack
    //   itself is not overflowing.
    //   - Can hammer the stack frame without overflowing the stack. For
    //     example, could have a local array on the stack and overflow it.
    // - The stack pointer itself could get changed to an invalid location.
    //
    // Other possible ways to get clues about the cause:
    // - Look at contents of stack for clues (like strings).
    // - Enable IF_DEBUG_LOCKUPS_USING_TRACEPOINTS_MAIN_LOOP and similar code
    //   to help track down where the buffer overflow is occurring. Search for
    //   "ktowyawesctlcic".
    // - Acquire a "reverse debugger" (perhaps using debug hardware with trace)
    //   so can look back to the point it all went wrong.
    //
    volatile int i = 1;
    while(i)
    {
    }
}
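As for the header comment about looking at register values to see what triggered the fault: the fault status register is a set of cause bits, and a small lookup table turns the value you read in the debugger into something readable. The sketch below uses the Cortex-M3/M4 bit positions as I understand them; the table and function names are hypothetical, and the value is passed in as a parameter so the code can be tried on a PC. On these parts, touching a peripheral that has not been enabled is, I believe, reported as a bus fault, so that group is the first place I would look.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

typedef struct
{
    uint32_t ui32Mask;
    const char *pcText;
} tFaultBit;

// Cause bits of the fault status register (address-valid bits excluded).
static const tFaultBit g_psFaultBits[] =
{
    { 1u << 0,  "instruction access violation" },
    { 1u << 1,  "data access violation" },
    { 1u << 3,  "memory management fault on unstacking" },
    { 1u << 4,  "memory management fault on stacking" },
    { 1u << 8,  "instruction bus error" },
    { 1u << 9,  "precise data bus error" },
    { 1u << 10, "imprecise data bus error" },
    { 1u << 11, "bus fault on unstacking" },
    { 1u << 12, "bus fault on stacking" },
    { 1u << 16, "undefined instruction" },
    { 1u << 17, "invalid state" },
    { 1u << 18, "invalid PC load" },
    { 1u << 19, "no coprocessor" },
    { 1u << 24, "unaligned access" },
    { 1u << 25, "divide by zero" },
};

// Return a description of the lowest-numbered cause bit that is set.
static const char *
FaultStatusCause(uint32_t ui32FaultStat)
{
    for(size_t i = 0; i < sizeof(g_psFaultBits) / sizeof(g_psFaultBits[0]); i++)
    {
        if(ui32FaultStat & g_psFaultBits[i].ui32Mask)
        {
            return g_psFaultBits[i].pcText;
        }
    }
    return "no fault cause bits set";
}

// Bit 15 says whether the bus fault address register holds a valid address.
static bool
FaultAddrIsValid(uint32_t ui32FaultStat)
{
    return (ui32FaultStat & (1u << 15)) != 0;
}
```

A value of 0x00008200, for instance, would decode as a precise data bus error with a valid fault address, which means the address register is worth reading.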