Illegal Interrupt on 2812

Problem: During a commanded reboot of the processor, the DSP executes an unexpected, un-enabled, illegal interrupt.

Scenario: After the DSP boots up and has been running for some time, we command a ‘reboot’ to restart the system. During this operation, there is a low probability (about 1 in 500) that the DSP will execute an unexpected, un-enabled interrupt that takes it to “this should never happen” code.

Detailed description:

1)            The DSP is booted up and configured to run in normal mode:

a)            External pin (xnmi_xint13) is used to generate interrupts for normal operation.

(1)    External pin (xnmi_xint13) is connected to a 20 kHz signal.

(2)    External pin (xnmi_xint13) is configured as an input to generate the maskable interrupt INT13.

(3)    External pin (xnmi_xint13) is NOT configured as an external non-maskable interrupt.

b)           The watchdog timer is configured to be active.

2)            The DSP executing operational code is commanded to re-boot; the operational code:

a)            Disables interrupts (INTM).

b)           Forces a watchdog reset by writing an illegal value to watchdog control register.

c)            The watchdog holds the DSP in reset for a short period of time.

d)           The reset re-initializes registers to their initial state.

3)            The DSP wakes up from the (re)boot and starts to execute the environment code.

a)            Watchdog is disabled

b)           RAM resident sections of the bootloader image are copied from FLASH into RAM

c)            The rest of RAM is set to zero

d)           Stack area is set up

4)            The DSP wakes up from the (re)boot and starts to execute the code I’ve attached.

a)            EALLOW is enabled

b)           Minimum DSP initialization is completed

(1)    Flash registers

(2)    DSP clocks

(3)    CPU interrupt registers

(4)    Watchdog timer (disabled)

(5)    A subset of I/O pins are configured

c)            Bootloader program (3F4000h – 3F7F80h in flash) is CRC verified

d)           Sections of internal RAM are pattern tested

e)            External bus parameters are configured for normal operation

f)             The rest of the DSP is initialized / configured

(1)    Configure SPI registers

(2)    Configure real-time-clock registers

g)           An external FPGA is loaded with an image from an external FLASH:

(1)    Clocks are re-configured to a slower rate to load the FPGA

(a)          The clock change waits for 327,675 clocks to settle out

(b)          While waiting for the clock change to settle out, an illegal interrupt occurs

·    the this-should-never-happen interrupt occurs very early in the wait period

·    the this-should-never-happen interrupt loops forever
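
Step 2b above can be sketched in C as follows. This is a minimal illustration, not the code attached to this post. The bit layout of the watchdog control register (WDDIS in bit 6, a check field in bits 5:3 that must be written as 101, the prescaler in bits 2:0) is my reading of the F281x system control guide and should be verified against it; the helper names are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

/* Assumed F281x WDCR bit layout; verify against the device reference guide. */
#define WDCR_WDDIS       0x0040u  /* bit 6: 1 = watchdog disabled          */
#define WDCR_WDCHK_MASK  0x0038u  /* bits 5:3: check field                 */
#define WDCR_WDCHK_VALID 0x0028u  /* check field must be written as 101b   */
#define WDCR_WDPS_MASK   0x0007u  /* bits 2:0: watchdog prescaler          */

/* Build a legal WDCR value (hypothetical helper, not a TI API). */
static uint16_t wdcr_legal_value(int disable, unsigned prescale)
{
    assert(prescale <= WDCR_WDPS_MASK);
    return (uint16_t)(WDCR_WDCHK_VALID |
                      (disable ? WDCR_WDDIS : 0u) |
                      (prescale & WDCR_WDPS_MASK));
}

/* Nonzero if writing `value` would violate the check field, which is the
 * "illegal value" that forces a reset while the watchdog is enabled. */
static int wdcr_write_forces_reset(uint16_t value)
{
    return (value & WDCR_WDCHK_MASK) != WDCR_WDCHK_VALID;
}
```

Under these assumptions, `wdcr_legal_value(1, 0)` gives the disable value used in step 3a, while any value with a different check field (for example 0x0000) is the kind of write that step 2b relies on.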

 

  • I think first we should find out exactly what interrupt is occurring.  When you use the term 'illegal', are you suggesting an 'illegal instruction trap'?  Since your problem occurs so rarely, it's tempting to suspect something electrical/external rather than software.  Can you test it with interrupts disabled to determine if it's a non-maskable event?

  • Interrupts are disabled when this event occurs (it is after a watchdog reset). From the data we have, the failure occurs after we have reconfigured the DSP clock to run at a slower rate (needed to program an off-chip device). While we are waiting for the clock change to settle, the interrupt occurs. This code is executed from RAM.

  • When you take an illegal instruction trap, register information is automatically pushed on the stack. The order is:

    T:ST0

    ACC

    P

    AR1:AR0

    DP:ST1

    DBGSTAT:IER

    Return Addr

    empty  < SP points here (even or odd address)

    So, if you access the Return Addr value pushed on the stack at the Illegal Trap, you can find out where the illegal operation occurred.

    Cheers,

    Alex T.
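
Assuming the frame layout listed above, the return address can be pulled out of a stack dump with a small host-side helper like the sketch below. It assumes that each listed entry occupies two 16-bit words, that the low word sits at the lower address, that the stack pointer was even when the trap was taken, and that `sp` is the stack pointer value on entry to the trap handler (before the handler pushes anything of its own). The function name and the dump-as-array interface are hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define C28X_PC_MASK      0x3FFFFFUL /* the C28x program counter is 22 bits wide */
#define TRAP_FRAME_WORDS  14u        /* 7 register pairs, 2 words each           */

/* mem: dump of data memory as 16-bit words, indexed by word address offset.
 * sp:  index of the first empty word above the automatically saved frame.
 * Returns the return address saved as the last pair of the frame. */
static uint32_t trap_return_address(const uint16_t *mem, size_t sp)
{
    assert(mem != NULL && sp >= TRAP_FRAME_WORDS);
    uint32_t low  = mem[sp - 2];   /* assumed: low word at the lower address */
    uint32_t high = mem[sp - 1];
    return ((high << 16) | low) & C28X_PC_MASK;
}
```

The decoded address can then be looked up in the linker map file to see which instruction was being executed when the trap was taken.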

  • This problem has been narrowed down to a function that throttles back the internal clock from 150MHz to 60MHz.  Does anyone have a recommended (.asm) procedure for changing the clock frequency?  The NMI that is occurring happens within one or two decrements in a 'for' loop, just after 'Sys.LowSpeedClock = 1'
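
For readers following along, the two speeds map onto PLL divisor settings roughly as in the sketch below. It assumes the F281x relationship CLKIN = OSCCLK × DIV / 2 for DIV = 1..10 (with DIV = 0 bypassing the PLL and giving OSCCLK / 2) and a 30 MHz oscillator. The oscillator frequency is not stated in this thread, so treat it as a placeholder; the function name is hypothetical.

```c
#include <assert.h>
#include <stdint.h>

/* CPU input clock for a given PLLCR divisor setting, under the assumed
 * F281x relationship. oscclk_hz is an assumption, not a value from the thread. */
static uint32_t pll_clkin_hz(uint32_t oscclk_hz, unsigned div)
{
    assert(div <= 10u);                  /* assumed valid range of the DIV field */
    if (div == 0u) {
        return oscclk_hz / 2u;           /* PLL bypassed */
    }
    return (uint32_t)(((uint64_t)oscclk_hz * div) / 2u);
}
```

With a 30 MHz oscillator this gives DIV = 10 for 150 MHz and DIV = 4 for 60 MHz, so the problematic transition would be a write that changes the divisor from 10 to 4.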

  • This problem still turns up from time to time, and it's important to get a handle on it because the 2812 is used in shipping product. I notice that the code we're using does some housekeeping between setting the PLL divisor and waiting for the PLL to settle. This should be relatively benign -- it's just setting peripheral clock enables -- but my personal preference would be to set the register and then immediately wait on an RPT/NOP instruction for the PLL to settle.

    Suggestions? Observations? Anyone?

    (PS -- Another line we're looking at is whether a slight rail dip as the processor changes speed is responsible for the processor malfunction.)

  • Martin,

    It might be helpful if you could provide the code in question for us to look at.  (I believe sending just the problem function should be good enough for us to potentially provide some feedback)


    Thank you,
    Brett

  • Martin,

    The following from SPRAAS1B (just search TI.com for this app note) should address your problem:

    The ISR is likely caused by executing code while the PLL is switching, as you mentioned.  You will need to add a 131k-cycle delay immediately after you set the PLLCR so that the CPU does not fetch any instructions while the PLL is settling.  I think this and the power glitch you are observing are consistent as well.

    Let us know if this solves the problem, and is acceptable.

    Matthew

    "Each time you write to the PLLCR register to configure the PLL multiplier, the PLL will take 131,072 cycles to lock. While the PLL is in the process of locking, the device frequency may experience a large swing at the start and end of the locking process. These two potentially abrupt frequency transitions may cause power rail fluctuations. Careful design of the power-supply is needed in order to prevent these transitions from impacting the device operation. Once the PLLCR register is written to, it is recommended that a tight loop be executed until the PLL relocks to the new frequency. Once the PLLCR is written to, writing to the PLLCR again, even with the same multiplier will cause these frequency swings and potential power supply swings."

  • Matt:

    I received this feedback:

    "The TI documentation is vague in this area. From SPRAAS1B referenced in the E2E thread: “Careful design of the power-supply is needed in order to prevent these transitions from impacting the device operation”.

    Our preliminary investigation into the power supply rails (I sent this data in the original email) seems to indicate that we are within tolerance. Increasing or decreasing the rail stiffness in an effort to change the frequency of failure hasn’t made a difference. This issue only seems to occur when we go from 150 MHz to 60 MHz and not when we transition from 60 MHz to 150 MHz. Thermals also seem to have an impact, although it is not consistent from unit to unit.

     

    I would like to understand if there are other contributing factors."

  • John and Martin,

    I should have been more specific in the recommendation; adding a 131k-cycle for loop with an asm " nop" should take care of this.  The problem occurs as you switch down in frequency: the device goes from 150 MHz to CLKIN/2 before it relocks at 60 MHz.  That jump causes an upward voltage spike because the current demand decreases so quickly.  To remedy this, we need to get the CPU cycling on something that was already fetched before the jump.

    Let me know if that fixes the problem on the suspect boards.


    Matt
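
A minimal C sketch of that recommendation follows: write the divisor, then spin immediately with nothing in between. It is not the customer's code or a TI-provided routine. The function name is hypothetical, the register is passed in as a pointer so the same code can be exercised on a host, and the 131,072 figure is the lock time quoted from the app note. Each pass through the loop costs more than one CPU cycle, so the wait is at least the quoted lock time rather than exactly that long.

```c
#include <assert.h>
#include <stdint.h>

#define PLL_LOCK_CYCLES 131072UL   /* lock time quoted from SPRAAS1B */

/* Write the PLL divisor, then do nothing but spin until the lock time has
 * certainly elapsed. Returns the number of loop passes (useful for testing).
 * pllcr: pointer to the PLLCR register (or to a stand-in variable on a host).
 * div:   new divisor, assumed to be in the range 0..10. */
static uint32_t pll_set_div_and_wait(volatile uint16_t *pllcr, uint16_t div)
{
    assert(pllcr != 0 && div <= 10u);

    volatile uint32_t remaining = PLL_LOCK_CYCLES; /* volatile: keep the loop */
    uint32_t passes = 0;

    *pllcr = div;                  /* no housekeeping between this write... */
    while (remaining != 0u) {      /* ...and this tight wait loop           */
        --remaining;
        ++passes;
    }
    return passes;
}
```

On the target, the pointer would refer to the device's PLLCR register (for example `&SysCtrlRegs.PLLCR.all` if TI's DSP281x header files are in use, which is an assumption here), and any peripheral clock setup would move to after the function returns.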

  • Hi Matt,

    My name is Tai Trinh, and I work in Failure Analysis at AMC. I would like to follow up on this issue and learn from it to improve our future circuit designs.

    Regarding this PLL loop issue: why did the designer have no information about it in the data sheet?

    The TI documentation is vague in this area. From SPRAAS1B referenced in the E2E thread: “Careful design of the power-supply is needed in order to prevent these transitions from impacting the device operation”.

    Please help me find where TI recommends how to set limits on the rail stiffness, or how to change the rail stiffness in order to affect the frequency of failure.

    Thanks

    Tai Trinh

  • Hi Matt,

    Please disregard the above request; I have all the information I need from your application notes. Would you please remove the above information?

    Thanks

    Tai Trinh