TMS320F28335: CPU lock-up issue between custom bootloader and application

Part Number: TMS320F28335
Other Parts Discussed in Thread: SYSBIOS

Hi,

We have a custom bootloader that we use to side-load our application. It has worked well for years, but we now have a few instances where the CPU locks up on the jump from the custom bootloader to our 'application' logic.

Sequence of events:

1) Power-up / Processor boot

2) Custom bootloader logic runs, checks application memory, toggles a debug GPIO, and jumps to application

3) Application logic memory starts with the SYSBIOS boot.a28FP used with our application (c_int00)

<random lock-ups occur here>

4) 'main' method runs where we toggle a debug GPIO, configure modules, and start SYSBIOS

Using the debug GPIO, we know that the issue occurs between the jump to application and our 'main' method.

We have tried updating the Illegal ISR hook to toggle a GPIO, but it does not appear to be called: I never see that GPIO toggle.

As the CPU lock up is a random occurrence, I'd like some tips on locations in our logic and settings that may be causing this.

  • Wesley,

    I've assigned your post to one of our experts in C2000, but due to the US Holiday today please expect a reply tomorrow 1/17.

    Best,

    Matthew

  • The fact that this has worked well for years and that the failure is random makes it a challenge to debug. Have you looked into the possibility of a disturbance from an external source, such as EMI? Did anything change? For example, seemingly unrelated things like a component on the board or even the manufacturing facility? Is this being seen on boards that worked well for years, or are these new boards?

  • Hi Hareesh, I agree that there is likely some external source but our electrical team has scoped multiple test points on our boards with no luck. Reset related pins, clock pins, etc. I'm looking for ways to either (1) find and eliminate any potential software flaws related to this issue in our application that may have become more frequent due to the manufacturing changes or (2) determine how/where an exterior source is causing the lock-up so that I can provide that information to our electrical team and then determine if some form of software workaround can be made.

    We have changed both firmware and electronics. Running older firmware on the newer electronics indicates that the issue is likely caused, or made more prominent, by electrical changes in the newer boards. We can't go back to older boards as there is a supply chain issue. That said, this lock-up is seen only rarely on power-up and is nearly always corrected by a single power cycle.

  • Are you able to connect the debug probe after the lock up occurs to examine where it's getting stuck and what the state of the device is? There's some advice in this video on how to connect a device without a reset/reload that is helpful for debugging these kinds of boot issues.

    In your Boot module settings do you disable the watchdog or is it left enabled while c_int00 runs?

    Whitney

  • Hi Whitney, I haven't tried to connect the debug probe, but I should be able to rework the bootloader and application to allow attaching it. I'll check into that video. Thanks.

    The custom bootloader that jumps to the application sets "SysCtrlRegs.WDCR = 0x0068;" to disable the watchdog.

    Looking at the application's SYSBIOS configuration, we have "Boot.disableWatchdog = false;" set.
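For reference, the watchdog settings mentioned above can be decoded with a small helper. This is a minimal sketch, assuming the WDCR bit layout from the F2833x system control reference guide (WDDIS in bit 6, WDCHK in bits 5:3 and always written as 101b, WDPS in bits 2:0). The helper name `wdcr_value` and the macro names are hypothetical, not part of TI's headers. With `Boot.disableWatchdog = false`, the SYS/BIOS boot code is presumably not asked to disable the watchdog itself, so the watchdog would stay in whatever state the bootloader left it.

```c
#include <stdint.h>

/* Assumed WDCR layout: bit 6 = WDDIS, bits 5:3 = WDCHK, bits 2:0 = WDPS.
 * These macro names are illustrative only. */
#define WDCR_WDDIS      (1u << 6)    /* 1 = watchdog disabled */
#define WDCR_WDCHK_KEY  (0x5u << 3)  /* WDCHK must be written as 101b */

/* Hypothetical helper: build the 16-bit value to write to WDCR.
 * disable  : non-zero to disable the watchdog, zero to enable it
 * prescale : WDPS field, only the low 3 bits are used */
static uint16_t wdcr_value(int disable, unsigned prescale)
{
    uint16_t v = (uint16_t)(WDCR_WDCHK_KEY | (prescale & 0x7u));
    if (disable) {
        v |= (uint16_t)WDCR_WDDIS;
    }
    return v;
}
```

Under these assumptions, `wdcr_value(1, 0)` gives the bootloader's 0x0068, and `wdcr_value(0, 0)` gives 0x0028, which would enable the watchdog with the default prescaler. On the target, the write to `SysCtrlRegs.WDCR` would sit between EALLOW and EDIS.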

  • So by turning off the CSM and reloading the bootloader and application, I verified that I was able to connect via the debug probe. No issues there.

    Next I performed numerous power cycles to get the lock-up to occur. Once it did, I attempted to connect via the debug probe. I was unable to do so.

    I got the message: "Error connecting to target: (Error -1156 @ 0x800) Device may be operating in low-power mode?"

    When choosing 'No' to retry without wake-up, the same pop-up occurred. Then choosing 'Yes' showed "Unable to connect to the target."

  • Wesley,

    1. If it is OK with you, could you share the schematics with me privately? You can do this by initiating a "friendship request" with me. 
    2. Do you use HALT mode in your code? If so, is there a possibility that your device was forced into HALT mode inadvertently? If your design uses an external quartz crystal, you could probe to check if the oscillator is still oscillating.
    3. Do you leave the watchdog enabled in your application? If you do, I would expect it to pull your device out of this condition, unless the external disturbance was severe enough to "freeze/lock" the device.
  • Hi Hareesh,

    1. I'll have to check with my leadership. Will get back to you on this.

    2. No, we do not use HALT mode. I checked our bootloader and application for any use of the LPMCR0 register in the system control memory and found none. I would also expect the failed debugger attach, where the wake-up attempt did not bring the CPU back, to eliminate this as a possibility.

    3. We made an update now to potentially band-aid this issue by turning the WD on in the custom bootloader prior to jumping to our application (entrance to c_int00). This isn't a 100% solution for us though for multiple reasons but it seems to work. Is it possible to have c_int00 configure/start the WD? 

    For reference, c_int00 is currently from bios_6_83_00_18, specifically "bios_6_83_00_18\packages\ti\targets\rts2800\lib\boot.a28FP" and the boot_cg.o28FP.

  • Thank you Whitney.

    I was able to move our WD initialize to the reset function and it does eliminate the problem and doesn't require us to update our custom bootloader which was our main concern with our fix.

    We also do some other setup within this function. I will use some GPIO to see if our core issue is related to actions performed there.

  • Okay, good, glad you have a workaround. Let us know how GPIO experiments go.

    Whitney

  • Hi Whitney,

    Unfortunately not as well as I'd like. I'm having trouble setting and clearing the GPIO in the reset function. 

    In the reset function, with EALLOW at the beginning and EDIS at the end, I set each pin's mux register to GPIO, set the direction to output, and set the pins high initially, but my scope shows no change on the pins. When I move the set action to the main method after startup, I see the GPIO go high. Is something preventing GPIO output at this point in the startup, or am I missing some configuration?
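The sequence described above (mux to GPIO, drive high, direction to output) can be sketched against a stand-in register block so the bit arithmetic can be checked off-target. This is only an illustration under stated assumptions: the struct `gpio_port_a_t` and the function `debug_pin_init_high` are hypothetical and are not TI's `GpioCtrlRegs`/`GpioDataRegs` definitions. It assumes the F2833x convention of two mux bits per pin in GPAMUX1 for GPIO0 to GPIO15, with 00b selecting GPIO. It does not attempt to explain why the pins stay low in the reset function.

```c
#include <stdint.h>

/* Hypothetical stand-in for the port A registers used here.
 * On the real device these live in GpioCtrlRegs and GpioDataRegs. */
typedef struct {
    volatile uint32_t mux1; /* models GPAMUX1: 2 bits per pin, GPIO0..15 */
    volatile uint32_t dir;  /* models GPADIR: 1 = output */
    volatile uint32_t set;  /* models GPASET: write 1 to drive latch high */
} gpio_port_a_t;

/* Configure one pin (0..15) as a GPIO output that starts high.
 * On target, the mux and direction writes are EALLOW-protected and
 * would need EALLOW before and EDIS after; the set register is not. */
static void debug_pin_init_high(gpio_port_a_t *port, unsigned pin)
{
    uint32_t bit = (uint32_t)1u << pin;

    port->mux1 &= ~((uint32_t)0x3u << (2u * pin)); /* 00b = GPIO function */
    port->set = bit;                               /* request a high level */
    port->dir |= bit;                              /* make the pin an output */
}
```

Because the reset function runs before c_int00 has initialized C variables, a routine like this would have to avoid initialized globals and rely only on direct register writes, as noted in the reply below.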

  • The same code from your reset function works when you move it to main()? I can't think of any device initializations that would affect the ability to use the GPIOs. I guess since you're running before c_int00, you need to think about the initialization of variables, but if you're just doing simple register writes, I wouldn't think that'd be a problem.

    You could put a while(1) in the code (after disabling the watchdog) to halt it so you can connect to the device and debug to see if you can figure out why the GPIOs aren't working.

    Whitney

  • Hi Whitney, thanks for the help. Since we have the workaround we're moving in a different direction and will investigate further on our next iteration. If we need further help, I'll open a 'related question'. Thanks again.