RM48L952 nERROR pin randomly asserted low during power up - How to Debug?

stomp


Hi,

In our application we have found that at random (maybe 1 in 100 power-on resets) the RM48L952 locks up with the nERROR pin held low.

Because it is so intermittent, it's naturally very hard to debug.

Does anyone have any tips? Is there a way, through CCS using the XDS100V2 debugging hardware, to attach to the MCU and view some key registers?

Thanks

Stomp!.

over 10 years ago

0 Zhaohong Zhang over 10 years ago

TI__Mastermind 22715 points

After the error occurs, you can connect CCS to the CPU to see what caused the error. The first thing to check is the ESM status registers.

Thanks and regards,

Zhaohong

0 stomp over 10 years ago in reply to Zhaohong Zhang

Genius 4675 points

Hi,
Thanks for your advice. It took a little digging to work out how to debug the target without loading or resetting, but I followed this procedure: http://processors.wiki.ti.com/index.php/MSP430_-_Connecting_to_a_running_target

Once I connect to the inoperative MCU, the debugger is at the reset entry, and any stepping of the target just loops back to the reset entry (Program Counter = 0x00).

For the ESM registers, (assuming they are valid) I have:
STAT1 = 0x80000000
STAT2 = 0x00000004
STAT3 = 0x00000000

The question is now, where in the documentation do I find the group mappings?

I have found the group mappings in the document SPNS174. In the STAT2 register (Group 2, which causes an interrupt), bit 2 is set (0x00000004), which I assume is "CCMR4 - Compare" as per SPNS174 page 101.

From SPNU503B, section 9.3.1 (Lockstep Mode), this status bit relates to the CPU compare error. So, assuming the ESM status registers are correct, what I am seeing is a CPU compare error that occurs randomly at power up.
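To make the register decoding above less error-prone, here is a minimal C sketch that lists which bit positions are set in an ESM status word. It assumes that bit n of a status register corresponds to channel n of that group, counting from zero; the function name `esm_set_channels` is hypothetical and not part of any TI library. The channel names themselves still have to be looked up in the datasheet table.

```c
#include <stdint.h>

/* Hypothetical helper: collect the bit positions set in an ESM status word.
 * Assumption: bit n of the status register maps to channel n of the group.
 * channels[] must have room for 32 entries; positions are stored lowest first.
 * Returns the number of set bits found. */
int esm_set_channels(uint32_t status, int channels[32])
{
    int count = 0;
    int bit;

    for (bit = 0; bit < 32; ++bit) {
        if (status & (UINT32_C(1) << bit)) {
            channels[count++] = bit;
        }
    }
    return count;
}
```

With the values read above, 0x80000000 gives a single entry, position 31, and 0x00000004 gives a single entry, position 2 (the third bit when counting from one).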

Looking into the start-up code there is an errata for silicon revision A. The errata DEVICE#140 does not appear in Silicon B or C errata documents. Although I have boards with both Rev A and Rev B silicon, the issue appears on both silicon revisions.

Any suggestions please?

Thanks
Stomp!

0 Zhaohong Zhang over 10 years ago in reply to stomp

TI__Mastermind 22715 points

It seems that you did not initialize the CPU registers at the beginning of boot up.

You can check by the following steps.

(1) Load your code from CCS.
(2) Do system and CPU reset.
(3) Step through the execution from address 0x0. The CPU should first jump to _c_int00. Then it should run a function to initialize the CPU registers before doing anything else. Check whether this function is missing.

Thanks and regards,

Zhaohong

0 stomp over 10 years ago in reply to Zhaohong Zhang

Genius 4675 points

Hi,
Thanks for your quick reply.

The fault we are seeing is random, about 1 in every 100 power-on resets. Our system is powered from a TPS65300 PMIC. We have just scoped the power-up reset sequence: nPORRST is released about 100ms after the power supplies are stable, and nERROR is released (from low to high) when the MCU starts, about 110ms after nPORRST is released.

I DO init the registers correctly after reset. I am currently looking into several different possibilities:

1. Fault caused by the init of the MPU after the stack pointers but before the Errata DEVICE#140 workaround. Per: http://e2e.ti.com/support/microcontrollers/hercules/f/312/t/188446
2. VIM channel 0 non-maskable interrupt being set by the ESM hardware while the VIM is not yet initialized.
3. Some other strange silicon issue, such as DEVICE#140 being present in RevB silicon.

I'll keep looking into it, but thanks for your help thus far.
Stomp!.

0 Zhaohong Zhang over 10 years ago in reply to stomp

TI__Mastermind 22715 points

Glad to learn that you are making progress. We can also help you further if you can share your CCS project. A method we commonly use in debugging this type of error is to insert a branch-to-itself instruction into the code to see if the CPU can reach it successfully. In C code it is "asm (" b #-8");" in ARM mode and "asm (" b #-4");" in Thumb mode.
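As a rough sketch of how such a halt marker could be wrapped for repeated use, the C fragment below hides the two instruction variants behind one macro. Several things here are assumptions and not from this thread: the macro `HALT_HERE`, the build switch `HALT_THUMB`, the variable `g_last_checkpoint`, and the function `checkpoint` are all hypothetical names, and the `__TI_ARM__` check assumes the TI ARM compiler is in use. The checkpoint variable is an extra, and it only works once RAM and the stack are usable, so it is not suitable for the very first instructions after reset.

```c
#include <stdint.h>

/* Hypothetical wrapper around the branch-to-self instruction.
 * Assumption: __TI_ARM__ is defined when building with the TI ARM compiler.
 * HALT_THUMB is a hypothetical switch to define for Thumb-mode builds. */
#if defined(__TI_ARM__)
  #if defined(HALT_THUMB)
    #define HALT_HERE() asm(" b #-4")   /* Thumb mode: branch to self */
  #else
    #define HALT_HERE() asm(" b #-8")   /* ARM mode: branch to self */
  #endif
#else
  #define HALT_HERE() ((void)0)         /* other compilers: do nothing */
#endif

/* Last checkpoint id reached; can be inspected from the debugger. */
volatile uint32_t g_last_checkpoint = 0u;

/* Record the checkpoint id, then optionally spin at this point. */
void checkpoint(uint32_t id, int halt)
{
    g_last_checkpoint = id;
    if (halt) {
        HALT_HERE();
    }
}
```

For example, calling `checkpoint(3u, 1)` right after the MPU init and then connecting to the running target would show whether the CPU got that far before nERROR was asserted.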

Thanks,

Zhaohong

0 stomp over 10 years ago in reply to Zhaohong Zhang

Genius 4675 points

Hi,

We are now clear of the fault on Silicon RevC hardware, but still have it on Silicon RevA, despite the workaround. (This only affects our development units, not field production boards, which are all RevC.) We've now done over 1000 cold starts without a fault on RevC. The RevA board only faulted once with the MPU init function moved.

To summarize our experience:

1. We discovered how to connect to a running target.
2. We were able to observe ESM bits being set indicating a CPU core mismatch error, indicated by a low level on the nERROR pin.
3. We can observe this after power-on reset as per errata DEVICE#140 on RevA, despite the workaround.
4. We can cause the error almost every time by having the MPU init function before the workaround for DEVICE#140 on RevA.

So in the end I think we have solved this problem.
Thanks again for pointing us in the right direction. As for our RevA development boards, they are now in the trash.

Kind Regards
Stomp!
