LAUNCHXL2-570LC43: Debugging call stack using CCS after exception vector is encountered

Part Number: LAUNCHXL2-570LC43

I have a failure I need to track down in bare-metal software running on an XL2-570LC43.  The code continuously transmits on CAN using interrupts, and after some 2 hours of running it hits a failure that lands in one of the low exception vectors (reset, undef, prefetch).  CCS does not show the call stack in this case.  In those 2+ hours, millions of messages have been sent, so it's not clear that this is a bug in the CAN portion of the software.  It could be something like an unhandled interrupt.  The failure is not tied to one vector either: reset, undef, and prefetch occur at different times.

resetEntry:
        b   _bl_c_int00
undefEntry:
        b   undefEntry
svcEntry:
        b   svcEntry
prefetchEntry:
        b   prefetchEntry

Does anyone already have a good way of doing this, or should I go figure out the ARM (big-endian) stack frame and write my own debug code?

Basically, what I need to do is what was done towards the end of this thread...

https://e2e.ti.com/support/microcontrollers/arm-based-microcontrollers-group/arm-based-microcontrollers/f/arm-based-microcontrollers-forum/688822/tms570ls1224-bootloader-application-jump-leads-to-undefentry-after-interrups-are-enabled

It'd be so nice if CCS had facilities to do this, in the same way the normal stack frame is displayed for analysis.  Maybe with aid of target-side code.

Thank you very much for any help.

  • It'd be so nice if CCS had facilities to do this, in the same way the normal stack frame is displayed for analysis. 

    While CCS currently doesn't have any built-in support for this, with Cortex-M4F devices I have previously used a custom GEL script to unwind the stack and get CCS to display the context of the exception. See CCS/TM4C1294KCPDT: How do I get the stack unwound in exception handlers?

    Some changes to the GEL script would be required to handle the differences in the exception stack frame between a Cortex-M4F and a Cortex-R5F, but it could form a starting point.

    At least with a GEL script it just runs on the PC in CCS, rather than needing target-side code.

  • Hi,

    I need serious help on debugging.  A couple of days ago, I was seeing the problem above as described, with a combination of "spontaneous" resets, undefEntry, or prefetchEntry errors, with running times as long as 8 or 10 hours before the error is encountered.

    As of yesterday, the problem has morphed into one of spontaneous resets or dataEntry errors that can happen anywhere from 10 minutes to hours into a run.  This may be "better or worse", depending on whether the ability to run longer is desirable or whether hitting an error more quickly allows more experimentation and analysis.  When the dataEntry error occurs, the return address in R14_ABT is not (necessarily) the same each time (as somewhat expected).  I believe R14_ABT - 0x8 is the faulting PC, based on some forum article.  I'm not even sure whether chasing the dataEntry error would lead me to the actual cause, but it's a start, and it's happening consistently/often enough.
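For reference, the "R14_ABT - 0x8" rule can be written out as a small host-side helper. This is only a sketch: the exception labels are hypothetical names chosen here, and the offsets are the ARMv7-R link register adjustments for code running in ARM (32-bit instruction) state, which is an assumption about this build.

```python
# Bytes to subtract from the banked link register (R14_ABT, R14_UND, ...)
# to get the address of the instruction that took the exception.
# Assumes ARM (not Thumb) state. The labels are hypothetical names.
LR_OFFSET_ARM_STATE = {
    "data_abort": 8,      # dataEntry
    "prefetch_abort": 4,  # prefetchEntry
    "undef": 4,           # undefEntry
}

def faulting_pc(exception, banked_lr):
    """Return the address of the faulting instruction as a 32-bit value."""
    return (banked_lr - LR_OFFSET_ARM_STATE[exception]) & 0xFFFFFFFF

# Value of the form reported in this thread: R14_ABT = 0x112cc
print(hex(faulting_pc("data_abort", 0x112CC)))
```

The same helper covers the undefEntry and prefetchEntry cases seen earlier, using R14_UND and R14_ABT respectively.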

    For one thing, I have no idea how the symptoms have changed so much (from undefEntry and prefetchEntry, to dataEntry, and having running times of 6+ hours to minutes), since I haven't changed my source code in any real way.  In some sense, the seeming "stability" has vaporized overnight.

    In any case, I don't know what steps to take to decipher the dataEntry fault.  I'm not quite certain what processor mode I'm in (I think User, or maybe Interrupt).  I've looked at other data abort forum articles and the Technical Reference Manual, but have not gotten a solid handle on what steps need to be performed.  Several of the articles allude to debugging steps, but seem to have reached solutions without really having to use the results of the debugging to get to fixes.  One thing that I'm struggling with is the number of registers and which ones do exactly what (sometimes I'm not even sure where to find the register mentioned in a particular article).
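On the processor mode question: once the core is sitting in dataEntry it is in Abort mode, and the mode it was in before the fault is recorded in the low five bits of the saved status register. A minimal decoding sketch follows. The register name SPSR_ABT is assumed by analogy with R14_ABT, and the example values are made up for illustration.

```python
# Mode field M[4:0] of CPSR/SPSR on ARMv7-R.
MODES = {
    0x10: "User", 0x11: "FIQ", 0x12: "IRQ", 0x13: "Supervisor",
    0x17: "Abort", 0x1B: "Undefined", 0x1F: "System",
}

def decode_psr(psr):
    """Split a CPSR/SPSR value into the fields most useful after a fault."""
    return {
        "mode": MODES.get(psr & 0x1F, "reserved"),
        "thumb": bool(psr & (1 << 5)),            # T bit
        "fiq_masked": bool(psr & (1 << 6)),       # F bit
        "irq_masked": bool(psr & (1 << 7)),       # I bit
        "big_endian_data": bool(psr & (1 << 9)),  # E bit
    }

# Hypothetical SPSR_ABT values, not taken from the failing target.
print(decode_psr(0x0000021F))  # System mode, interrupts enabled
print(decode_psr(0x00000292))  # IRQ mode, IRQ masked
```

Reading the saved status register this way shows whether the abort was taken from the main loop or from inside an interrupt handler.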

    Thank you very, very, much in helping me figure out the necessary debugging steps with the Cortex-R5F, or directing me to the right information.

  • So, in one failure case, I see R14_ABT to be 0x112cc, which leads to a faulting PC of 0x112c4.  Looking at the assembler, it seems r0 is being loaded with 0 (the value shown in the core registers) from [r13] (r13 being the stack pointer).  The value of r13 is 0x8002800 (and that location contains 0), which is supposedly my Undefined Stack Base (I doubled the size of my stacks from the defaults, just in case I was hitting some stack overflow issue).  The C code is supposed to be playing with some CAN message boxes, AFAICT.  I don't know what the Undefined Stack is used for, or even if I'm looking at the right r13.

    I don't know how I got here.  Any help appreciated highly.

    000112b0: E59DC00C ldr r12, [r13, #0xc]
    000112b4: E35C0008 cmp r12, #8
    000112b8: 3AFFFFEE blo $C$L107
    2603 node->IF1NO = (uint8) messageBox;
    $C$L108:
    000112bc: E59D0000 ldr r0, [r13]
    000112c0: E5DDC007 ldrb r12, [r13, #7]
    000112c4: E5C0C103 strb r12, [r0, #0x103]
    2605 success = 1U;
    000112c8: E3A0C001 mov r12, #1
    000112cc: E58DC010 str r12, [r13, #0x10]
    2614 return success;
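To see which address the instruction at 0x112c4 actually touches, the opcode word from the listing can be decoded by hand. The sketch below handles only ARM-state LDR/STR/LDRB/STRB with a 12-bit immediate offset, which is all this listing needs. The r0 value of 0 is the one reported above and is assumed to be the value at the time of the abort.

```python
def decode_ldr_str_imm(word):
    """Decode an ARM-state single data transfer with an immediate offset."""
    if (word >> 28) == 0xF or ((word >> 25) & 0x7) != 0b010:
        raise ValueError("not a conditional immediate-offset LDR/STR")
    return {
        "pre_indexed": bool(word & (1 << 24)),  # P bit
        "add": bool(word & (1 << 23)),          # U bit
        "byte": bool(word & (1 << 22)),         # B bit
        "load": bool(word & (1 << 20)),         # L bit
        "rn": (word >> 16) & 0xF,               # base register
        "rd": (word >> 12) & 0xF,               # data register
        "offset": word & 0xFFF,
    }

def effective_address(fields, regs):
    """Address accessed, given {register number: value}."""
    base = regs[fields["rn"]]
    if not fields["pre_indexed"]:
        return base  # post-indexed forms access the unmodified base
    delta = fields["offset"] if fields["add"] else -fields["offset"]
    return (base + delta) & 0xFFFFFFFF

strb = decode_ldr_str_imm(0xE5C0C103)  # word at 0x112c4 in the listing
print(strb["load"], strb["byte"], strb["rn"], hex(strb["offset"]))
print(hex(effective_address(strb, {0: 0x0})))  # r0 = 0 as reported
```

If r0 really was 0, the byte store targets address 0x103, which would suggest that the `node` pointer loaded from [r13] was null when `node->IF1NO` was written.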

  • Actually, 0x8002800 is the Undefined Stack Base, so it seems the code is reading beyond the stack to get the 0 value.  Don't know why r13 holds that value.
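One way to sanity-check an r13 value is to compare it against every mode's stack region rather than only one. The sketch below assumes full-descending stacks (the initial SP is the highest address of the region) laid out back to back. The region addresses are hypothetical and would need to be replaced with the values from the project's startup code.

```python
# Hypothetical layout: {mode: (lowest address, initial SP)}.
# Not taken from the actual project configuration.
HYPOTHETICAL_STACKS = {
    "abort": (0x08002600, 0x08002800),
    "undef": (0x08002800, 0x08002A00),
}

def describe_sp(sp, stacks):
    """List every stack region that a given SP value is consistent with."""
    notes = []
    for mode, (low, high) in stacks.items():
        if sp == high:
            notes.append(f"{mode}: empty, SP at its initial value")
        elif low <= sp < high:
            notes.append(f"{mode}: {high - sp} bytes in use")
    return notes

for note in describe_sp(0x08002800, HYPOTHETICAL_STACKS):
    print(note)
```

With adjacent regions, a boundary address is ambiguous: under this assumed layout, 0x08002800 is both the untouched initial SP of one mode and the completely full stack of its neighbour, so it is worth checking which banked r13 the debugger is displaying.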

  • I tried to take the GEL project and port it for the XL2-570LC43, but can't figure out how it's supposed to work.  With a few tweaks, I was able to build and load it, but don't know how the code is supposed to run (if it is running).  Most/all my previous projects will start-up stopped at main() in HL_sys_main.c, but this one does not (don't know where to put breakpoint, if needed).

    I probably need to tweak more things.

    Thanks.

  • Hi Peter,

    Have you resolved your issue?

  • I did "solve" my problem (the one that seemed to require tracing the stack across exception vectors), in a totally different way.  I would still really love to get the GEL project, which presumably does the stack trace across exception vectors without target code support, going on the TMS570-LC43.  This would help me debug future oddball problems, and maybe also understand better how this particular ARM processor handles contexts and stacks/registers.  Right now, I'm still often guessing how the processor does things (using experience transferred from older-generation processors that are simpler).

    Do you think getting the GEL project ported is a viable goal, or should I abandon this desire?  In reality, most cases of (catastrophic) exception vectors are due to faulty memory scribbling that happened some time earlier, and stack tracing won't help that much anyway.

    Thank you very much.

  • Hi Peter,

    I don't know how to use a GEL project to trace stack usage. Do you know of any MCU or MPU devices that use this approach to trace the stack?

  • Hi, According to Chester, the GEL facility should be able to trace the stack, since it's just some "macro" language that deciphers the machine state in the same way a human being would interpret what the compiler/architecture constructed.  I'm not that familiar with ARM architecture (at this point), but am slowly learning how to do this manually.  I will be posing some questions on ARM in another thread.  From what I can tell, when an exception happens, ARM will enter a different machine mode, and leave the faulting mode intact (with its registers and stack), so it should be even easier to do stack traces.  The CCS debugger itself is already quite "powerful" in providing the user visibility into the machine state, but right now the user still needs to connect the dots.  Presumably, what the GEL facility does (in helping connect the dots) could have been built into CCS (in future releases?).
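The "connecting the dots" step described above can be sketched on the host side. Because each exception mode has its own banked r13 and r14, the faulting mode's stack pointer and link register are still intact after a data abort, and the saved status register says which bank to read. Register names other than R14_ABT, and all values except R14_ABT, are assumptions made for this illustration.

```python
# Which register bank each mode uses for r13/r14 (System shares User's).
BANK_SUFFIX = {
    0x10: "USR", 0x1F: "USR", 0x11: "FIQ", 0x12: "IRQ",
    0x13: "SVC", 0x17: "ABT", 0x1B: "UND",
}

def context_before_data_abort(regs):
    """regs maps debugger register names to values read in dataEntry."""
    suffix = BANK_SUFFIX[regs["SPSR_ABT"] & 0x1F]
    return {
        "bank": suffix,
        "pc": (regs["R14_ABT"] - 8) & 0xFFFFFFFF,  # faulting instruction
        "sp": regs[f"R13_{suffix}"],               # faulting mode's stack
        "lr": regs[f"R14_{suffix}"],               # its return address
    }

# Hypothetical register dump; only R14_ABT matches a value from this thread.
dump = {
    "SPSR_ABT": 0x0000021F, "R14_ABT": 0x000112CC,
    "R13_USR": 0x08001F00, "R14_USR": 0x00011300,
}
ctx = context_before_data_abort(dump)
print({k: (hex(v) if isinstance(v, int) else v) for k, v in ctx.items()})
```

Feeding the recovered pc, sp, and lr into a normal frame walk is essentially what a stack-unwinding script would have to do.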

    I've used GDB much more, and have used its "macro" facility to decipher C++ constructs.  There are lots of incredibly obscure things in C++ that a typical C++ programmer would not be able to understand and have to take on faith as black boxes.  The GDB macro programs someone had written really helped me figure out some oddball things (something with iterators, etc.).  Again, the macro programs could have been built into GDB, but the ability to do this via add-ons is sufficiently nice.

    In general, software development tools (IDEs, debuggers, etc.) have come a long way in helping the programmer in recent decades.  I've used Visual Studio Code in recent years and have been quite impressed by how much support it offers, especially with its plethora of add-on extensions.  I've also used DDD (a graphical front end for GDB) in recent years, and its support for multi-thread analysis is quite impressive.

    All in all, these tools make life a lot easier for the simple-minded programmer, especially in deciphering repetitive things (such as C to assembler translation), but the programmer still needs to know how to connect the dots at various levels.

    Thanks.