DSP crash on C6474?

We are seeing random crashes on core 0 of a C6474 DSP running DSP/BIOS. Here is the output from the system message log:

127862971   PRD: tick count = 3165766 (0x00304e46)
127862972   SWI: post  KNL_swi (TSK scheduler) (0x852fc0)
127862973   SWI: begin KNL_swi (TSK scheduler) (0x852fc0)
127862974   SWI: end   KNL_swi (TSK scheduler) (0x852fc0) state = done
127862975   CLK: current time = 3165767 (0x00304e47)
127862976   PRD: tick count = 3165767 (0x00304e47)
127862977   SWI: post  KNL_swi (TSK scheduler) (0x852fc0)
127862978   SWI: begin KNL_swi (TSK scheduler) (0x852fc0)
127862979   SWI: end   KNL_swi (TSK scheduler) (0x852fc0) state = done
127862980   EXC_exceptionHandler: EFR=0x2
127862981     NRP=0x81fe40
127862982     mode=supervisor
127862983   Internal exception: IERR=0x180
127862984     Loop buffer exception
127862985     Missed stall
127862986   SYS abort called with message 'Run-time exception detected, aborting ...'

We checked the program memory at NRP=0x81fe40 and found that this address lies within the “DSP_fft32x32” function of the TI DSP library. We are using DSPLIB v2.1 for the C64x+ core. Do you know of any solution or workaround?

-Tom-
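
As a cross-check on the log above, the EFR and IERR values can be decoded by hand. The sketch below is a small host-side C helper that maps the set bits to names. The bit positions are my reading of the C64x+ CPU and Instruction Set Reference Guide and should be verified there; the names decode_efr, decode_ierr and flag_t are hypothetical and are not part of DSP/BIOS.

```c
#include <assert.h>
#include <stdint.h>
#include <stdio.h>
#include <string.h>

/* One flag bit and a human-readable name for it. */
typedef struct {
    uint32_t mask;
    const char *name;
} flag_t;

/* Assumed EFR layout (verify against the CPU reference guide). */
static const flag_t EFR_FLAGS[] = {
    { 1u << 0,  "software exception (SXF)" },
    { 1u << 1,  "internal exception (IXF)" },
    { 1u << 30, "external exception (EXF)" },
    { 1u << 31, "NMI (NXF)" },
};

/* Assumed IERR layout (verify against the CPU reference guide). */
static const flag_t IERR_FLAGS[] = {
    { 1u << 0, "instruction fetch (IFX)" },
    { 1u << 1, "fetch packet (FPX)" },
    { 1u << 2, "execute packet (EPX)" },
    { 1u << 3, "opcode (OPX)" },
    { 1u << 4, "resource conflict (RCX)" },
    { 1u << 5, "resource access (RAX)" },
    { 1u << 6, "privilege (PRX)" },
    { 1u << 7, "SPLOOP buffer (LBX)" },
    { 1u << 8, "missed stall (MSX)" },
};

/* Write the names of all set flags into out, separated by ", ". */
static void decode_flags(uint32_t value, const flag_t *table, size_t count,
                         char *out, size_t out_size)
{
    assert(out != NULL && out_size > 0);
    out[0] = '\0';
    for (size_t i = 0; i < count; i++) {
        if (value & table[i].mask) {
            size_t used = strlen(out);
            snprintf(out + used, out_size - used, "%s%s",
                     used ? ", " : "", table[i].name);
        }
    }
}

void decode_efr(uint32_t efr, char *out, size_t out_size)
{
    decode_flags(efr, EFR_FLAGS, sizeof EFR_FLAGS / sizeof EFR_FLAGS[0],
                 out, out_size);
}

void decode_ierr(uint32_t ierr, char *out, size_t out_size)
{
    decode_flags(ierr, IERR_FLAGS, sizeof IERR_FLAGS / sizeof IERR_FLAGS[0],
                 out, out_size);
}
```

With this table, EFR=0x2 decodes to the internal exception flag and IERR=0x180 decodes to the SPLOOP buffer and missed stall flags, which agrees with the text the exception handler printed in the log.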

  • Tom,

    DSPLIB is well tested, so a code error inside it is unlikely, though not impossible. Your best bet is to look for something outside the library that could lead to the problem.

    That said, it is pretty hard to figure out how a loop buffer exception would occur in well-tested code or how something outside the function would cause this kind of an error.

    Still, it would be good to look closely at the arguments you used in the call to the DSPLIB function.

    Does the crash occur with the exact same logs and the exact same NRP every time?

    Could an erroneous write be occurring from the PRD or SWI that were running just before the error?

    Regards,
    RandyP
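
One way to act on the suggestion to look closely at the call arguments is to validate them just before the call. The sketch below is a hypothetical helper (fft32x32_args_ok is not part of DSPLIB). The constraints it checks are assumptions about what DSP_fft32x32 expects: nx a power of two and at least 16, all three buffers double-word (8-byte) aligned, and input and output not overlapping. Please confirm each of these against the DSPLIB documentation for your version before relying on it.

```c
#include <assert.h>   /* for assert() at the call site */
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* True if n is a positive power of two. */
static bool is_power_of_two(int n)
{
    return n > 0 && (n & (n - 1)) == 0;
}

/* True if p sits on an 8-byte (double-word) boundary. */
static bool is_aligned8(const void *p)
{
    return ((uintptr_t)p & 7u) == 0;
}

/*
 * Hypothetical pre-call check. w = twiddle table, x = input, y = output,
 * nx = number of complex points. All constraints here are assumptions.
 */
bool fft32x32_args_ok(const int *w, int nx, const int *x, const int *y)
{
    if (w == NULL || x == NULL || y == NULL)
        return false;
    if (!is_power_of_two(nx) || nx < 16)
        return false;
    if (!is_aligned8(w) || !is_aligned8(x) || !is_aligned8(y))
        return false;

    /* x and y each hold nx complex values, i.e. 2 * nx ints. */
    const uintptr_t bytes = (uintptr_t)2 * (uintptr_t)nx * sizeof(int);
    const uintptr_t xb = (uintptr_t)x;
    const uintptr_t yb = (uintptr_t)y;
    if (xb < yb + bytes && yb < xb + bytes)
        return false;   /* input and output overlap */

    return true;
}
```

In a debug build this could be used as assert(fft32x32_args_ok(w, nx, x, y)); immediately before the DSPLIB call, so that a bad pointer or size is caught at the call site rather than hours later as an exception.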

  • I searched for this topic on the internet and found a couple of similar cases. Here are some links:

    http://e2e.ti.com/support/dsp/tms320c6000_high_performance_dsps/f/115/p/12342/176847.aspx
    http://e2e.ti.com/support/dsp/tms320c6000_high_performance_dsps/f/112/t/85262.aspx

    But there are no official answers or solutions for this issue. The crash happens randomly, usually after several hours of running. The PRD and SWI are all from DSP/BIOS, so we have no control over them. We will check whether NRP points to the same location each time.

    -Tom-

  • Tom,

    Sorry, I did not notice that the SWIs were all for KNL_swi from DSP/BIOS.

    If I were sitting in your lab, here is what I would do:

    - run to the crash several times, confirming that the NRP address is always the same. If it is not, then something else may need to be considered.

    - load the code, run to main

    - set a breakpoint at the interrupt vector for NMI. This is where the DSP will go once the exception occurs, but this will be before any of the register context has been "corrupted" by running the EXC_handler to print out the messages.

    - in the Disassembly Window, go to the program address that is in the NRP and see what has been going on.

    - look at registers to try to figure out if this was the first pass through the SPLOOP (confirm you are in or at the end of an SPLOOP) or at the mid-point or the end.

    - the return address may have been moved from the default B3 to another register for safe-keeping, but try to find where the value that was in B3 at the entry to the DSPLIB function was placed and make a note of it.

    - look at the address that was in B3. It will be the return address, so you can trace back to where the DSPLIB function was called. This should get you to the right place in your source code by tracking the address from the linker map file.

    Please let us know what you find.

    Regards,
    RandyP
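
The last step in the list above, tracing an address such as the saved B3 value or the NRP back to a function through the linker map file, amounts to finding the symbol with the greatest start address that does not exceed the address. A minimal host-side sketch in C, assuming the map's symbols have already been copied into a table sorted by address; symbol_t, find_symbol and every entry in EXAMPLE_MAP are made up for illustration and do not come from the poster's build:

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* One entry from a linker map: symbol start address and name. */
typedef struct {
    uint32_t addr;
    const char *name;
} symbol_t;

/* Made-up example entries, sorted by ascending address. */
static const symbol_t EXAMPLE_MAP[] = {
    { 0x0081f000u, "process_frame" },   /* hypothetical caller */
    { 0x0081fc00u, "DSP_fft32x32" },    /* start address is invented */
    { 0x00820400u, "next_function" },   /* hypothetical */
};
enum { EXAMPLE_MAP_COUNT = sizeof EXAMPLE_MAP / sizeof EXAMPLE_MAP[0] };

/*
 * Binary search for the symbol containing addr: the entry with the
 * greatest start address <= addr. Returns NULL if addr lies below the
 * first symbol. The table must be sorted by ascending address.
 */
const symbol_t *find_symbol(const symbol_t *table, size_t count, uint32_t addr)
{
    assert(table != NULL || count == 0);
    const symbol_t *best = NULL;
    size_t lo = 0, hi = count;
    while (lo < hi) {
        size_t mid = lo + (hi - lo) / 2;
        if (table[mid].addr <= addr) {
            best = &table[mid];   /* candidate; look for a later one */
            lo = mid + 1;
        } else {
            hi = mid;
        }
    }
    return best;
}
```

The lookup does not know where a function ends, so an address that falls in padding or data after the last function in a section will still be reported as belonging to the preceding symbol; the map file's section sizes are the place to check that.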

  • RandyP,

    Just wanted to keep you informed.

    We ran the code through the weekend and couldn’t reproduce this SPLOOP issue. I will do more testing and try your suggestions. Once I find anything, I will let you know.

    -Tom-