TM4C1237H6PM: Lock-Up at power brown out event

Steven Li

Part Number: TM4C1237H6PM

In one of our applications, the microcontroller (MCU) we use, the TM4C1237H6PM, entered some kind of lockup mode (or possibly an internal reset loop) when a supply brown-out occurred. This is reproducible with one specific s/w revision but not with earlier revisions, which suggests it is s/w related. With the MCU locked up, we cannot reflash the unit through the JTAG port; we get the error message “Cannot access the DAP”. The MCU core voltage VDDC is normal. We saw one JTAG port signal toggling, which indicates that the MCU is not completely dead. The brown-out did not cause any over-voltage stress on the unit, so we believe the MCU is not physically damaged. Our questions are:

How can we get the MCU out of the lockup mode, or at least reflash it somehow? We don’t want to replace the MCU on the locked-up units.
What is wrong with the s/w that caused the MCU to enter the lockup mode, and how do we fix it?

over 3 years ago

0 Charles Tsai over 3 years ago

TI Guru 191906 points

Hi,

Can you identify the major differences between the current and prior versions of your firmware? Normally, you can lock out JTAG access if you 1) put the device into some type of deep-sleep or hibernate mode without a mechanism to wake up, 2) repurpose the JTAG pins as GPIO, or 3) configure the BOOTCFG register to disable JTAG access permanently for security purposes.

Reset is treated by the processor as the highest-priority exception. If you have constant reset events, it is also possible to lock out JTAG access. Please refer to section 5.3 of this app note to unlock the device: https://www.ti.com/lit/pdf/spma075. Depending on the debug probe you use, you can unlock the device with LM Flash Programmer or the dbgjtag.exe tool.

0 Steven Li over 3 years ago

Prodigy 175 points

Thanks for the response. We tried to unlock the device following the instructions in the app note you sent, using the dbgjtag.exe tool. We did not succeed in unlocking the microcontroller. Any thoughts on something else we can try to unlock the MC?

The s/w team is working to identify the differences in the s/w revision where this issue occurred.

I also have a question: how can I tell whether the device is really damaged or is actually in deep-sleep mode or some other lockup mode?

0 Charles Tsai over 3 years ago in reply to Steven Li

TI Guru 191906 points

Hi Steven,

What is the output of dbgjtag.exe? Does it report on screen that the unlock succeeded, or does it report a problem? dbgjtag.exe and LM Flash Programmer are the only tools that can unlock the device. If they won't unlock it, then I suspect the device is bricked or maybe even damaged. Can you do a JTAG scan test? What does it show? See the example below using an XDS200 debug probe. The same test connection feature is available for the XDS110 and XDS100.

Please also check in your firmware whether power was interrupted during non-volatile register commits. See below.

Please also check whether the newer version of your firmware performs any EEPROM operations. There are several errata pertaining to the EEPROM that may leave the device non-functional if power is lost in the middle of an operation.

0 Steven Li over 3 years ago in reply to Charles Tsai

Prodigy 175 points

Hello Charles,

dbgjtag.exe ran exactly as described in the app note, with no message reporting either success or an error. To confirm, we ran dbgjtag.exe on a good board with the same setup, and it was successful, so we know that dbgjtag.exe and our setup work. We don’t have a setup here to try LM Flash Programmer.

We could not run the JTAG scan test. I got the error message “Error connecting to the target: (Error -1170 @ 0x0) Unable to access the DAP”.

The firmware does access the EEPROM. If power is lost in the middle of an EEPROM access and the MC becomes non-functional, does that mean it is unrecoverable, as in our case (locked up or bricked)?

Another clue. We’ve been testing this design for years and have always turned off the power while the MC was in various operating modes. We never saw this issue until recently, when we started using a powered USB hub. The hub sometimes cycles power on and off several times when this board is plugged into it. The power is off for about 500ms, then on for about 800ms. We now have more than 5 units in this lockup mode. Previously we thought the lockup was related to a specific s/w revision. Today we found this is not true. We tried to reproduce the failure mode with that s/w revision (same h/w). We could not reproduce the failure. Now we’re confused.

Regards,

Steven

0 Charles Tsai over 3 years ago in reply to Steven Li

TI Guru 191906 points

Steven Li said:
Previously we thought the lockup was related to a specific s/w revision. Today we found this is not true. We tried to reproduce the failure mode with that s/w revision (same h/w). We could not reproduce the failure. Now we’re confused.

I'm a bit confused here. You said 'with that s/w revision (same h/w)'. What is 'that' revision? Is it the older revision that has been working in the past? The reason I ask is that you later said you could not reproduce the failure, which I thought was expected for the older s/w revision. If my understanding is not correct, please clarify.

Steven Li said:
The firmware does access the EEPROM. If power is lost in the middle of an EEPROM access and the MC becomes non-functional, does that mean it is unrecoverable, as in our case (locked up or bricked)?

Yes, it is possible for the device to become non-recoverable when an EEPROM operation is interrupted by a power loss.

Steven Li said:
Another clue. We’ve been testing this design for years and have always turned off the power while the MC was in various operating modes. We never saw this issue until recently, when we started using a powered USB hub. The hub sometimes cycles power on and off several times when this board is plugged into it. The power is off for about 500ms, then on for about 800ms.

Perhaps in your old setup the power was never turned off in the middle of an EEPROM operation. The new setup with the powered USB hub may turn off power at more random times than the old setup.
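
To make this reasoning concrete, here is a rough sketch (not measured data) of how the chance of a power cut landing inside an EEPROM write grows with the number of cuts. It assumes the cuts fall at uniformly random times relative to the firmware's activity. The write duration, the write interval, and the function names are hypothetical and chosen only for illustration.

```python
def hit_probability_per_cut(write_ms, interval_ms):
    """Fraction of run time spent writing EEPROM, which equals the chance
    that one uniformly random power cut lands inside a write."""
    if not 0 <= write_ms <= interval_ms:
        raise ValueError("write_ms must be between 0 and interval_ms")
    return write_ms / interval_ms


def hit_probability_after(cuts, p_single):
    """Chance that at least one of `cuts` independent power cuts hits a write."""
    return 1.0 - (1.0 - p_single) ** cuts


# Hypothetical firmware behavior: one 5 ms EEPROM write every 1000 ms.
p_single = hit_probability_per_cut(write_ms=5.0, interval_ms=1000.0)

for cuts in (1, 50, 200, 1000):
    print(f"{cuts:5d} cuts -> {hit_probability_after(cuts, p_single):.3f}")
```

If the power is always cut at the same point in the firmware's activity, as can happen in a manual bench test, the random-timing assumption does not hold and the cut may never coincide with a write.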

0 Steven Li over 3 years ago in reply to Charles Tsai

Prodigy 175 points

By "that revision", I meant the new s/w revision -- the revision that caused multiple h/w units to lock up. Now we cannot reproduce the failure with the new s/w revision or with the old s/w revisions. In short, we cannot reproduce the failure mode.

We are now working on a new test setup for the purpose of reproducing the failure mode. Our secondary priority is to determine the real root cause. Another priority is to unlock the MC, since we don't have many test samples. We cannot purchase this microcontroller anywhere; the part is out of stock everywhere.

Thanks,

Steven

0 Charles Tsai over 3 years ago in reply to Steven Li

TI Guru 191906 points

Steven Li said:
By "that revision", I meant the new s/w revision -- the revision that caused multiple h/w units to lock up. Now we cannot reproduce the failure with the new s/w revision or with the old s/w revisions. In short, we cannot reproduce the failure mode.

Hi Steven,

I suppose you could not reproduce the failure on different chips/boards running the new s/w revision, correct? However, the one that failed is still unrecoverable, correct? I'm not sure what happened to this failed chip. Is it bricked due to one of the possible failure modes (e.g. EEPROM or non-volatile register programming interrupted by a power loss or reset), or due to some other unknown failure mode? Can you do a current test? If the failed chip draws high current, then it is possible that there is a short. You can also do a resistance test on each pin.

0 Steven Li over 3 years ago in reply to Charles Tsai

Prodigy 175 points

That is correct. We could not reproduce the failure on different boards with the same new s/w revision. We could not unlock the failed ones. The current draw on all the failed boards is still within the normal range. The supply voltages are all normal. VDDC at pins 25 and 56 is normal (1.2V). We see a clock-like signal of about 13Hz on ICDI_TDO, from the MC at PC3 (pin 49). We have an external crystal circuit at OSC1/OSC0 where we measured no clock signal. None of the GPIO pins function normally. We monitored the supply voltages and the reset signal during the USB hub power brown-out test and did not capture any over-voltage event.

0 Charles Tsai over 3 years ago in reply to Steven Li

TI Guru 191906 points

Hi Steven,

Steven Li said:
We could not reproduce the failure on different boards with the same new s/w revision. We could not unlock the failed ones.

I wish I knew the root cause, but I don't at the moment. I take it that you have two different board designs: one design running the software does not fail, but the other design locks up. What is the difference between the two designs? That said, I can hardly see how the lockup would be board related.

Steven Li said:
We have an external crystal circuit at OSC1/OSC0 where we measured no clock signal. None of the GPIO pins function normally.

If there is no clock on OSC0/OSC1, then there is no source clock unless your code is running off the internal PIOSC, or your code has put the device into some type of hibernate or deep-sleep mode where the oscillator is turned off. If the latter is the case, your application must have a wakeup mechanism. If there is no clock, the processor debug logic cannot synchronize with the JTAG clock.

0 Steven Li over 3 years ago in reply to Charles Tsai

Prodigy 175 points

Hi Charles,

Thanks for the response. Please note that we've now reproduced the failure with the old s/w revision too. We're now trying to reproduce the failure on our other board designs; we have several circuit boards that use the same MC. I've forwarded your comments to our s/w team, and hopefully they will have the resources to work on this issue. It is becoming a higher priority on our side.

BTW, sounds like the latest silicon revision -- rev 7 -- may not have this kind of failure mode. If this is true, we really want to have the rev 7 parts and get it tested to prove it. Once proved, at least we've one resolution -- change our design to rev 7 parts. Currently we cannot purchase the part anywhere. Any possibility we can get several samples from TI? I think the rev 7 silicon P/N is TM4C1237H6PMI7.

Thanks,

Steven

0 Charles Tsai over 3 years ago in reply to Steven Li

TI Guru 191906 points

Hi Steven,

Steven Li said:
Please note that we've now reproduced the failure with the old s/w revision too.

So far my understanding is that both the old and new s/w will produce the failures on the newest board but not another older board. Is this a correct understanding? What is the difference between the two boards? Is there any test setup difference when testing these two boards?

Steven Li said:
Any possibility we can get several samples from TI? I think the rev 7 silicon P/N is TM4C1237H6PMI7.

Sorry, please understand this forum is only for technical discussion. I have no information concerning samples. Please contact the local TI sales office for questions regarding samples.

I'm also on vacation. Please expect delay on my response.

0 Steven Li over 3 years ago in reply to Charles Tsai

Prodigy 175 points

Hi Charles,

You are correct. The test condition is switching on/off the supply power. There are circuit differences between the two circuit boards. We'll start working on finding out if the circuit differences are the cause after the holiday break.

Happy Holiday!

Steven

0 Steven Li over 3 years ago in reply to Charles Tsai

Prodigy 175 points

Hi Charles,

We have now determined that it is s/w related. Our s/w team is working on a resolution. I'm still struggling with how to unlock the microcontrollers. I found this when searching the TI website:

The error can be caused by invalid code on the subcore that causes it to reset itself continuously.

If this error is originated in software, it can potentially be recovered by accessing the DAP directly and trying to either reset the offending core, lock it or erase its flash memory via a GEL script (some microcontrollers have pre-loaded routines to allow that).

Are there any further instructions for trying the approach mentioned above, i.e. accessing the DAP directly via a GEL script? If there are any other methods we can try, please advise.

Thank you!

Steven

0 Charles Tsai over 3 years ago in reply to Steven Li

TI Guru 191906 points

Hi Steve,

If you have a periodic reset event with a very short period, it is possible that the debugger is unable to connect to the target. For example, if the device is released from reset at t1 and another reset event occurs at t2, the debugger must finish connecting to the target before t2. The debugger takes some time to connect, and if that time extends beyond t2, the next reset has already happened before the connection completes. This can become a vicious cycle.

What debug probe do you have? Some higher-performance debug probes may connect faster. If you have such a probe, like a J-Link, you can try connecting to the target repeatedly; perhaps there is a chance of connecting after multiple tries. You can also try this with your existing debug probe. I also suggest that you hold the nRST pin low and release it at the same moment you start connecting the debugger to the target. In other words, you want to start connecting right after t1. However, if the duration between t1 and t2 is too short, this may not help, as the total connection time may still exceed it.
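
The timing argument around t1 and t2 can be sketched numerically. This is only an illustration: the window, period, and connect times below are hypothetical, the function names are made up for this sketch, and a real probe does not have a single fixed "connect time".

```python
def fits_in_window(window_ms, connect_ms, start_delay_ms=0.0):
    """True if a connect attempt that starts start_delay_ms after reset release
    (t1) completes before the next reset (t2 = t1 + window_ms)."""
    return start_delay_ms + connect_ms <= window_ms


def random_attempt_success(window_ms, period_ms, connect_ms):
    """Chance that an attempt started at a uniformly random time within one
    reset period both begins and finishes inside the out-of-reset window."""
    usable_ms = max(0.0, window_ms - connect_ms)
    return usable_ms / period_ms


# Hypothetical numbers: device out of reset for 80 ms of every 160 ms.
WINDOW_MS, PERIOD_MS = 80.0, 160.0

for connect_ms in (20.0, 60.0, 100.0):
    synced = fits_in_window(WINDOW_MS, connect_ms)
    chance = random_attempt_success(WINDOW_MS, PERIOD_MS, connect_ms)
    print(f"connect {connect_ms:5.1f} ms: synced={synced}, random={chance:.3f}")
```

Under these assumptions, starting the attempt right at t1 helps whenever the connect time is shorter than the window, repeated unsynchronized attempts succeed only occasionally, and neither approach works once the connect time exceeds the window.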

As for using GEL to access DAP, I have no experience. Please open a new thread for this so I can move the new post to our CCS expert. I don't want to clutter with different topics in the same thread. Thanks

0 Steven Li over 3 years ago in reply to Charles Tsai

Prodigy 175 points

We tried many times, and will try more per your advice. We use an XDS200 probe.

One thing we found is that, on a locked MC, the signal at the ICDI_TDO port (pin 49) goes low for about 77ms, then goes high for the next 77ms -- like a 13Hz clock signal. On a good MC, this signal stays high. What does that mean? Does it mean that internally the MC holds reset for 77ms and then releases it for 77ms? If so, we're not sure whether 77ms is too short for the XDS200 probe to do anything.
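
As a quick sanity check on these numbers, the frequency can be worked out from the timings. This is a sketch only, and it assumes the 77ms figures are the low and high durations of one cycle. Under that assumption the full period is 154ms, which is roughly 6.5Hz; a 13Hz figure would instead correspond to a 77ms full period. The function name is made up for this illustration.

```python
def square_wave_frequency_hz(low_ms, high_ms):
    """Frequency of a square wave given its low and high durations in ms."""
    period_s = (low_ms + high_ms) / 1000.0
    return 1.0 / period_s


# Case 1: 77 ms low followed by 77 ms high (full period of 154 ms).
f_low_plus_high = square_wave_frequency_hz(77.0, 77.0)

# Case 2: 77 ms taken as the full period instead.
f_period_77 = 1000.0 / 77.0

print(f"77 ms low + 77 ms high: {f_low_plus_high:.2f} Hz")
print(f"77 ms full period:      {f_period_77:.2f} Hz")
```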

Thanks!

0 Charles Tsai over 3 years ago in reply to Steven Li

TI Guru 191906 points

Hi Steven,

Since you already have code in these failed devices, do you have any functional pins that might also show a reset pattern repeating at the same 77ms interval?

0 Steven Li over 3 years ago in reply to Charles Tsai

Prodigy 175 points

Good idea! I checked several signal pins but not all of them. I'll get it done the first thing next week. Thanks!

0 Chester Gillon over 3 years ago in reply to Steven Li

Guru 92251 points

Steven Li said:
One thing we found is that, on a locked MC, the signal at the ICDI_TDO port (pin 49) goes low for about 77ms, then goes high for the next 77ms -- like a 13Hz clock signal.

Section 5.2.2.1 Reset Sources of the TM4C1237H6PM datasheet contains:

Note: If the device fails the initialization phase, it toggles the TDO output pin as an indication the device is not executing. This feature is provided for debug purposes

Based on the thread "TM4C1231H6PZ: tm4e1231h6zrb no-functional state", if you are using a rev 6 part, I think this means that the device has become non-functional due to erratum MEM#04:

0 Steven Li over 3 years ago in reply to Chester Gillon

Prodigy 175 points

This is very helpful. Thanks!

We did not expect that such an EEPROM-write-related failure would not be recoverable by a power cycle. Another thing I’ll check with our s/w team is what they did differently in the earlier code. With the earlier code, the failure is not reproducible.

BTW, regarding my earlier question: in the “device fails the initialization phase” mode mentioned above, is there any way to get the device out of this failure mode and back to normal use? We are running out of usable units for our ongoing project, which has a very tight schedule.

Thank you!

0 Charles Tsai over 3 years ago in reply to Steven Li

TI Guru 191906 points

Hi Steven,

Thanks to Chester for digging up the relevant posts and erratum MEM#04. As I mentioned before, if EEPROM operations are interrupted by a power loss or any type of reset event, the device may become unrecoverable even after a reset, per the errata. Device unlock is the only method that I think has a slim chance of recovering the device, but you have already tried it without success.

0 Steven Li over 3 years ago in reply to Charles Tsai

Prodigy 175 points

In our early development phase, we studied erratum MEM#04 when we experienced the failure mode it describes: “A reset will not recover the device”. But back then the device was recoverable when we cycled the power, so that failure mode was accepted. Now the failure mode is different: neither a reset nor a power cycle recovers the device. That means a permanent loss of the device, the same as a h/w failure with loss of all device functions. This is not acceptable in our applications. I think the errata should make this clear to users.

I'll communicate with our s/w team about this failure mode. We may have to prove further that this is the true failure mode. Once that is proved, it seems we have only two options, given that a h/w change is not an option for us:

1 - Change our s/w not to write to the EEPROM.

2 - Change our design to a silicon rev 7 device and prove that the failure mode is not reproducible.

Please advise if there are other options.

To verify silicon rev 7, we need support from TI. Please advise if you have the contact information of related team or person at TI.

Thanks!

0 Charles Tsai over 3 years ago in reply to Steven Li

TI Guru 191906 points

Hi Steven,

For rev 7 samples or questions related to purchasing, please contact the local TI sales office. Sorry, we only handle technical discussion on e2e.
