DM6437 Exception Crash Debugging

Hello:

I am slave booting the DM6437 over HPI and running some algorithms on the DSP. It runs great for a while, then crashes without any hint as to why.

I have connected the JTAG emulator and loaded symbols (though this is a release build, so it wasn't compiled with debug symbols), and I can run the program. However, the crash still doesn't tell me anything.

Here are my questions:

In short, what is the best way to find out why this system is crashing?

1) Should I compile in release mode with symbols? Can I strip the symbols from the image that I load over HPI? How? Can I then load only the symbols through the JTAG and let it run as usual?

2) The call stack is always empty. Is there a way to get a bigger call stack or make sure that it doesn't get cleared?

3) If I enable Debug Exceptions via the debugger, will this catch anything?

4) Should I see anything in the "Message Logs"?

5) Other suggestions are welcome.

Thanks,

--B

  • Hi B,

    Which version of BIOS are you using?

    Bandeg said:
    I have connected the JTAG emulator and loaded symbols (though this is a release build, so it wasn't compiled with debug symbols), and I can run the program. However, the crash still doesn't tell me anything.

    In this scenario, do you see any output in the CCS standard out window? What about in the BIOS system logs? (I need to know your BIOS version to help you better with that, but in short, check the ROV tool.)

    Where is the program when the crash occurs? (You may need to check the disassembly window to see that.)

    Please also see these related forum posts if you haven't already, they may be helpful to you:

    http://e2e.ti.com/support/dsp/davinci_digital_media_processors/f/99/t/6310.aspx

    http://e2e.ti.com/support/dsp/davinci_digital_media_processors/f/99/t/41385.aspx

    Steve

  • Okay. Below are some logs from the "Execution Graph Details." 

    According to http://www.ti.com/lit/ug/spru732j/spru732j.pdf, "Internal exception: IERR=0x1" is an instruction fetch exception? Any idea what this means? How can I debug this further using CCS?

    Also, the program stopped at: C$L1 in assembly.  What does this symbol refer to?

    Thanks,

    -B

    937255216   PRD: tick count = 151100910 (0x09019dee)
    937255217   SWI: post  KNL_swi (TSK scheduler) (0x8543e2ac)
    937255218   SWI: begin KNL_swi (TSK scheduler) (0x8543e2ac)
    937255219   TSK: ready dynamic TSK (0x81022a5c)
    937255220   TSK: running dynamic TSK (0x81022a5c)
    937255221   SWI: end   KNL_swi (TSK scheduler) (0x8543e2ac) state = done
    937255222   SEM: post <unknown handle> (0x810227bc) count = 0
    937255223   SWI: begin KNL_swi (TSK scheduler) (0x8543e2ac)
    937255224   TSK: blocked dynamic TSK (0x81022a5c) on <unknown handle> SEM
    937255225   TSK: running dynamic TSK (0x8102ec5c)
    937255226   SWI: end   KNL_swi (TSK scheduler) (0x8543e2ac) state = done
    937255227   EXC_exceptionHandler: EFR=0x2
    937255228     NRP=0x5eeaff84
    937255229     mode=supervisor
    937255230   Internal exception: IERR=0x1
    937255231     Instruction fetch exception
    937255232   SYS abort called with message 'Run-time exception detected, aborting ...'
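
The EFR and IERR values in the log above can be decoded bit by bit. The following is a minimal host-side sketch, not TI code: the function names are made up for illustration, and the bit positions are my reading of the exception register descriptions in SPRU732, so check them against your copy of the guide. Only two of the assignments are confirmed by the log itself (EFR bit 1 is an internal exception, IERR bit 0 is an instruction fetch exception).

```c
#include <assert.h>
#include <stdint.h>
#include <stdio.h>

/* EFR flag bits (assumed from SPRU732; verify before relying on them). */
#define EFR_SXF (1u << 0)   /* software exception */
#define EFR_IXF (1u << 1)   /* internal exception, cause is in IERR */
#define EFR_EXF (1u << 30)  /* external exception */
#define EFR_NXF (1u << 31)  /* NMI exception */

/* IERR cause names indexed by bit number (same caveat as above). */
static const char *const ierr_names[] = {
    "instruction fetch exception",  /* bit 0 */
    "fetch packet exception",       /* bit 1 */
    "execute packet exception",     /* bit 2 */
    "opcode exception",             /* bit 3 */
    "resource conflict exception",  /* bit 4 */
    "resource access exception",    /* bit 5 */
    "privilege exception",          /* bit 6 */
    "SPLOOP buffer exception",      /* bit 7 */
    "missed stall exception",       /* bit 8 */
};
#define IERR_NBITS ((int)(sizeof ierr_names / sizeof ierr_names[0]))

/* Nonzero if EFR reports an internal exception. */
int efr_is_internal(uint32_t efr)
{
    return (efr & EFR_IXF) != 0;
}

/* Index of the lowest set cause bit in IERR, or -1 if none is set. */
int ierr_lowest_bit(uint32_t ierr)
{
    for (int bit = 0; bit < IERR_NBITS; bit++) {
        if (ierr & (1u << bit)) {
            return bit;
        }
    }
    return -1;
}

/* Name of an IERR cause bit; the bit number must be in range. */
const char *ierr_name(int bit)
{
    assert(bit >= 0 && bit < IERR_NBITS);
    return ierr_names[bit];
}

/* Print every flag set in EFR and, for internal exceptions, in IERR. */
void print_exception(uint32_t efr, uint32_t ierr)
{
    printf("EFR=0x%08lx%s%s%s%s\n", (unsigned long)efr,
           (efr & EFR_SXF) ? " software" : "",
           (efr & EFR_IXF) ? " internal" : "",
           (efr & EFR_EXF) ? " external" : "",
           (efr & EFR_NXF) ? " NMI" : "");
    if (efr_is_internal(efr)) {
        for (int bit = 0; bit < IERR_NBITS; bit++) {
            if (ierr & (1u << bit)) {
                printf("  IERR bit %d: %s\n", bit, ierr_name(bit));
            }
        }
    }
}
```

Calling print_exception(0x2, 0x1) with the values from the log reports an internal exception with IERR bit 0 set, which matches the "Instruction fetch exception" line printed by the BIOS exception handler.
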
  • You are getting an exception because some code somewhere is doing something illegal.

    At this point you need to try to trace backwards in order to gain clues as to what line(s) of code are causing the exception.  The information you've found will help.

    Bandeg said:
    Also, the program stopped at: C$L1 in assembly.  What does this symbol refer to?

    C$L1 is probably just a label within some function and you can use the disassembly window to find which function that is.  If you click the mouse on C$L1 in the disassembly window, so that the cursor is in the window, you can then use the keyboard arrows or keyboard page up/down to navigate in the disassembly window (for whatever reason, clicking with the mouse on the window scroll bar doesn't work well in the disassembly window, so please use the keyboard).

    Once you have the cursor in the window at C$L1, scroll or page up until you see a symbol for a function name (you may have to move up for a while and there may be a lot of other labels).

    What function symbol do you see first?  This will be the function that you are in at "C$L1".

    Bandeg said:
    937255228     NRP=0x5eeaff84

    Can you enter this address into the disassembly window? It probably points at or near the offending instruction.

    Bandeg said:
    937255225   TSK: running dynamic TSK (0x8102ec5c)
    937255226   SWI: end   KNL_swi (TSK scheduler) (0x8543e2ac) state = done
    937255227   EXC_exceptionHandler: EFR=0x2

    Looking at the output, the task at 0x8102ec5c (the last "TSK: running" entry quoted above) ran last. Can you check the code of that task? It may contain the offending code.

    You should also check this forum post if you haven't already, as the problem is similar.

    Steve

    I doubt that the NRP address is correct. The .text region is at 0x85000000.

    The dynamic task runs the heart of our code, and it is large. I have been using LOG_printf() to narrow it down.

    Can the Unified Breakpoint Manager (UBM) help me here?

    http://processors.wiki.ti.com/index.php/DSP_BIOS_Debugging_Tips

    http://processors.wiki.ti.com/images/0/0d/Ubm-ext-02.pdf

    I am using CCSv3.3 and I'd like to set up a new Watchpoint action in the UBM window (page 9 of the PDF above). The window only shows me "Software Breakpoint" and "Hardware Breakpoint" as options. How do I enable the Watchpoint option so that I can watch the stack?

    Regards, B 
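
The observation that NRP lies outside .text can be turned into a mechanical check. The sketch below is a hypothetical host-side helper, not part of DSP/BIOS or CCS: the CodeRegion type and addr_in_code() are invented names, and the region lengths have to come from your own linker map file because the thread gives only the .text base address. An NRP outside every code region usually suggests the CPU branched through a corrupted function pointer or return address, which is consistent with an instruction fetch exception, though it does not prove it.

```c
#include <assert.h>
#include <stdint.h>

/* One executable memory region taken from the linker map (hypothetical type). */
typedef struct {
    uint32_t base;  /* first address of the region */
    uint32_t len;   /* length in bytes */
} CodeRegion;

/* Returns 1 if addr falls inside any of the n regions, 0 otherwise. */
int addr_in_code(uint32_t addr, const CodeRegion *regions, int n)
{
    assert(n == 0 || regions != 0);
    for (int i = 0; i < n; i++) {
        /* Unsigned subtraction wraps, so one compare covers both bounds. */
        if ((uint32_t)(addr - regions[i].base) < regions[i].len) {
            return 1;
        }
    }
    return 0;
}
```

With a region starting at 0x85000000, addr_in_code(0x5eeaff84, ...) returns 0 for any plausible length, so the NRP value in the log does not point at code from this image.
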

  • Hi B,

    Can you open the KOV tool to get some insight into the crash?  Do you see any stack overflow happening in your TSKs?

    Please refer to the DSP/BIOS Debugging Tips wiki page for some more tips on what to do when experiencing random crashes like this.

    Steve
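
The stack overflow check suggested above rests on a simple idea: fill the stack with a known pattern, then later count how much of the pattern is still intact. The sketch below illustrates that idea on the host and is not the KOV implementation. The fill value 0xBEEFBEEF and both function names are assumptions for illustration, and the scan direction assumes a stack that grows toward lower addresses, as it does on C6000 devices.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Assumed fill pattern; use whatever your kernel or startup code writes. */
#define STACK_FILL 0xBEEFBEEFu

/* Paint the whole stack buffer with the fill pattern. */
void stack_fill(uint32_t *stack, size_t words)
{
    assert(stack != NULL);
    for (size_t i = 0; i < words; i++) {
        stack[i] = STACK_FILL;
    }
}

/*
 * Count untouched words, scanning up from the lowest address.
 * The stack grows downward, so the low end is the last part to be used.
 * A result of 0 means the stack was used to its limit and has probably
 * overflowed into whatever sits below it.
 */
size_t stack_unused_words(const uint32_t *stack, size_t words)
{
    assert(stack != NULL);
    size_t unused = 0;
    while (unused < words && stack[unused] == STACK_FILL) {
        unused++;
    }
    return unused;
}
```

On the target, the same scan could run periodically over each task's stack buffer, with the result logged so that a shrinking margin shows up before the crash. A word that happens to hold the fill value is counted as unused, so treat the figure as an estimate.
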