TM4C129x low 3V3 short

Other Parts Discussed in Thread: TM4C1294KCPDT, ADS8598S, TUSB422, EK-TM4C1294XL

We have had two TM4C129X development boards and a prototype board of our own design fail with what seems to be a short on the 3V3 rail by the processor. The dev kit uses the TM4C129XMCZAD processor and our prototype uses the TM4C1294KCPDT variant of the processor.

In each case the systems have been powered from USB ports. In the case of our prototype board I have a VBUS current monitor on the device and know that it was usually drawing about 40mA, which agrees well with the datasheet and the work we were doing with the processor. We have a third dev kit which has been running with heavy use for a number of months without failure. We have been using our prototype board heavily for about a month. All of the devices worked correctly for several weeks to a few months before failing.

Our worry is that there is a failure mode with these processors that we are not aware of and that will impact customer units. We haven't been able to determine any proximate cause. ESD is a possibility, but seems unlikely in a populated system and more unlikely given that the failure mode seems to be the same in each case. Could it be a 3V3 supply related issue? That seems unlikely in the context of a development board.

So, is this a known issue?

If so, is there standard mitigation for the issue or do we change our design to use a different processor family (probably from a different manufacturer)?

  • Hello Peter,

    This is not a known issue/device flaw, but it is certainly a concerning set of failures. I tend to agree that ESD seems unlikely given the three total failures, two of them on dev boards.

    Are there any potential sources for transient voltages on the system at all?

    From the USB side, can you share schematics of that portion?

    Have you ever been able to recover operation of the device after failure?

    Have you tried any ABBA swaps with the failed device and a working one?
  • Hello Peter,

    Peter Jaquiery said:
    and a prototype board of our own design fail with what seems to be a short on the 3V3 rail by the processor

    Have you checked the VBUS pin (PB1) with an ohmmeter? What is the resistance? We had a similar issue with a VDD short (KCPDT). We discovered the 3V3 LDO ground pin was not making good contact with its PCB pad, and somehow it shorted VDD on two MCUs within weeks. Tapping the top of the LDO regulator with a bamboo stick, we could watch the VDD output go low (2.9V) and the MCU would POR. Several earlier static DMM checks of the LDO had indicated exactly 3V3. Seemingly the same short can occur if the MCU has a poor ground connection on one side and VDD floats; the entire VDD rail is destroyed. Anyway, after removing/replacing the 3V3 LDO, and first cleaning up the ground pad, all has been well for several months. Talk about bad luck :(

  • Hi

    our device is bus powered so we haven't connected PB1 to VBUS. If the code is running we must have power connected, and when the power is removed the code doesn't run ;).

    In our case, and on three different systems (two Tiva dev kits from TI and one prototype board of our own design) using two different variants of the TM4C129 family of processors and different 3V3 regulators we see the 3V3 rail pulled down to about 0.1V. In each case the regulator is current limiting to between 300mA and 600mA (inferred from device specs, not measured). It is quite unlikely to be a dry joint on each system or a common failure mode of the regulators.

  • Hi Ralph,

    thanks for the reply. We investigated sending a unit back to TI for Failure Analysis, but in the first go-around there was a degree of miscommunication with Mouser (where we got the dev kits), complicated by getting a warranty replacement and keeping a TI field application engineer in the loop. With a failure on our own board we are trying to pick this option up again, but have been told we need to see at least three failures in a production run of 200. I'm sure 3 failures in a smaller run would qualify, but it's not clear we could make that case in our current situation.

    In answer to your questions:

    1/ Are there any potential sources for transient voltages on the system at all?

    With the dev kits we had an external pretotype board connected to the dev kit using the booster pack headers. We were powering the pretotype board with a bench supply.

    The prototype board was entirely USB powered from a computer USB 3 port.

    2/ From the USB side, can you share schematics of that portion?

    On the dev kit there are a couple of transient suppression diodes and nothing else between the connector and the processor for the data lines. On our board there are a couple of transient suppression diodes and an EM suppression transformer on the data lines.

    The dev kit sources power from the debug USB connector. A TPS62177DQC regulator is used to generate 3.3V_MAIN, which powers most of the dev kit's 3V3. 3.3V_MAIN is then run through a 1 Ohm resistor with a shunt across it to enable current measurement for the processor.
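
    As an aside, the 1 Ohm value makes the processor current trivial to read off the shunt voltage (measuring across the resistor itself); a quick worked check using the numbers already in this thread:

    ```latex
    % Processor current from the drop across the 1 Ohm sense resistor:
    I_{CPU} = \frac{V_{shunt}}{1\,\Omega}
    ```

    so the roughly 40mA normal draw reads as about 40mV, while a regulator current limiting at 300mA to 600mA reads as 0.3V to 0.6V across the resistor.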

    On our board the 3V3 is derived from VBUS using an LM3670MF.

    If you think actual schematics would be helpful let me know and I'll post them as images.

    3/ Have you ever been able to recover operation of the device after failure?

    No, but we haven't tried swapping out processors.

    4/ Have you tried any ABBA swaps with the failed device and a working one?

    No. We are investigating ways we could do that with our prototype board, but we don't have suitable rework tools in our office to let us swap the processors out.

    The regulator seems to be fine. In all cases disconnecting the load from the regulator restores it to normal 3V3 operation. On at least one dev kit I disconnected just the processor by lifting the 1 Ohm current sense resistor and confirmed the regulator was operating correctly under normal load.
  • Hi Peter,

    Peter Jaquiery said:
    It is quite unlikely to be a dry joint on each system or a common failure mode of the regulators.

    However you have different scenarios in play, each with its own Murphy's law! One stratagem is to divide and conquer the most obvious; best to forget the test bed and make the new custom PCB work.

    Are you using the ADC and/or have you tied VDD to VDDA? If so, perhaps lift the VDDA pin from its pad to check its ohmic value. Also check all GND pins on each side of the MCU and visually inspect the pins at 10X magnification or better. If your eyesight is as good as mine you may have to close one eye and flip one lens up anyway.

    Did you power the failed LaunchPads via the OTG port, the ICDI port or the booster pack? The target must be powered simultaneously with any booster pack sub-PCBs. Leaving 3V3 power on the booster pack and unplugging the USB cable (JP1 installed) may lead to a failed MCU on the EVM.

  • Occam and Murphy I'm sure would have a great bar fight. My money is on Occam.
    We are not using the ADC on the prototype. We did in the context of the dev kit (not using a launch pad). The fault in all cases only manifested after an extended period and in no case has it cleared. It is unlikely in the extreme to be a dry joint or solder bridge related issue.
    In both dev kit failures the boards were powered through the ICDI port. The prototype board is powered using VBUS from the USB port.

  • Peter Jaquiery said:
    The fault in all cases only manifested after an extended period and in no case has it cleared. It is unlikely in the extreme to be a dry joint or solder bridge related issue.

    Yet you expect the forum to know the length of your burn-in period and other surrounding details, such as how many times the custom PCB was powered up/down prior to the VDD failure. Pointing fingers at the MCU is easier than finding the root cause of the failures. One clue to look for is shorted GPIO pins, easily detectable via a DMM set to diode check; there is no mention of this being done. Have you attempted to discover the root cause of how VDD is/was being shorted?

    Does the LDO have a proper capacitor on the VBUS pin to stop surging when it is plugged into a power source? Have you attempted to use another power source for the +5V VBUS, or have you stayed with the same setup in which all the failures occurred? Does your USB cable have EMI/RFI filter ferrites on either end? Many cheaper cables omit any such filter.

    How does the dev kit differ from the LaunchPad EVM used to evaluate the MCU? Perhaps there is some difference in the PCB layout or secondary circuit connections? Solder paste residue (balls) is often very tiny and must be ruled out by visual inspection under high magnification! Did you install the failed MCUs yourself or have a PCB house assemble them?

  • This is a small progress report. We have managed to pull the processor off the prototype board and the 3V3 fault cleared, which strongly suggests a processor fault. We also swapped in a processor from another board, but it doesn't seem to have survived the transplant - the board doesn't come up as a USB device and we can't talk to the processor with the debugger. The 3V3 is still good though.

  • Hi Peter,

    Hmm, that makes me think that maybe there is a bad solder joint somewhere. You may want to ohm out the connections for power and ground and the JTAG to make sure they are connected well. Also for sanity, double check with a microscope for any possible solder bridges... that pesky occurrence has caused me enough problems to double or triple check for them when the part isn't working right after a swap.

  • Hi Ralph,

    dry joints or bridges are possible for sure, although not obvious with a quick look. I'm not too worried about reviving the board unless you think it important for diagnostic purposes. I'm pretty busy and need to focus my effort so I'd rather not do "optional" work.

    We have a couple more boards being used for firmware development and are keeping them powered pretty much 24/7. That should help tell us something about reliability.
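
    If one of them does die it would help to know exactly when. A minimal heartbeat sketch we could run (hypothetical, not our actual firmware; assumes TivaWare's UARTStdio is configured on UART0 and a 1 Hz timer interrupt increments the counter):

    ```c
    /* Hypothetical soak-test heartbeat: print uptime once a minute so a
     * logging PC can timestamp the moment a board's 3V3 rail collapses. */
    #include <stdint.h>
    #include "utils/uartstdio.h"

    volatile uint32_t g_seconds;    /* incremented by a 1 Hz timer ISR */

    void SoakHeartbeat(void)
    {
        uint32_t last = 0;
        for (;;)
        {
            if ((g_seconds - last) >= 60)
            {
                last = g_seconds;
                UARTprintf("alive: %u s\n", last);
            }
        }
    }
    ```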

  • Murphy suggests one cannot see solder bridge issues with the naked eye or an incandescent light source; a simple solution is to purchase a low-cost 10x LED magnifier lamp. If the replacement MCU was known good, the procedure of transplanting it has likely caused further issues, e.g. rework/reflow produced shiny pads? One sure method to reveal bridge issues: never power the 3V3 LDO without first verifying the ohmic resistance from VDD to GND.

    Perhaps Occam's razor is not so sharp after all ;-) 

  • Hello Peter,

    Given this is a one-off failure in terms of your own boards, the ABBA test was to try and get some insight into whether the board could be recovered with a new MCU, and thus whether the issue was purely damage to the MCU.

    The biggest element that sparked my concern is that the issue also occurred on the dev kit.

    All that said, if you are testing a few more boards to try and get to this extended-period failure then that seems to be more productive. If the failure occurs on these additional boards then you should be able to provide multiple samples for FA, and maybe that can lead us to where the issue is if they can ID the exact pin(s) involved as well as what mode of failure was observed. I would not worry as much about the 200 board minimum... if there are 3 out of 50, it should be able to be pushed through.

    From a DK board standpoint, I'm still not quite sure what could have happened though - was there anything connected to the DK beyond the USB cables?

  • Hi Ralph,

    When I stop beating my head against the wall with my current firmware issue I'll run a meter around the JTAG and supply pins, and check the clock on the transplanted chip. My fear though is that the chip may have been cooked in the process of removing it from its previous board.

    Yes, the dev kit element is the compelling cause of concern for us too.

    I didn't expect to need a run of 200 if we could reproduce the issue on 2 or more devices in a run of 10! :-D In this context (given the history with the dev kits) 1 out of 10 is a pretty good batting average, especially as at this point only about half the boards have had power on them for any significant time.

    In both dev kit failures we had external pretotype boards connected to booster pack headers. In one case to J29 and in the other case to J30. In both cases we were using an ADC input and a few GPIO pins. In one case we also used I2C. In both cases the external electronics were powered by a current limited bench power supply. Our prototype board doesn't use the processor's ADCs.

    Peter

  • Hello Peter,

    I think the history with dev kits actually works against you in a way here from an FA standpoint, in that we haven't had such issues with them, so while your add-ons sound innocent enough, it's hard to think that the device is failing because of a silicon flaw and not some external condition such as ESD. Or at least, that's probably how the FA teams would view it... Though I don't recall ESD fails having hurt a dev kit before either... I've been exchanging emails with your FAE too and he is also unsure what can be done to FA an EVM device.

    Do you think there is any chance of an over voltage happening on any of the pins tied in via the booster pack header? It sounds like current isn't an issue given the supply used. Without knowing more about the components/circuit, though, I'm not sure what to suggest looking into. I imagine the ADCs are talking at a 3V logic level with the TM4C so that should be fine, but is there anything else that could cause an issue?

    Just trying to come up with any plausible area to intelligently investigate around.

  • Hi Ralph,

    thanks for your continued support with this!

    I can't rule out an over voltage on booster pack pins, but at worst only one or two pins could possibly be affected and they would be different pins in each case. It's kind of the same argument as ESD - possible, but the same failure mode (effective short on 3V3) seems unlikely.

    An over voltage issue with our prototype board is less likely and ought to be consistent across boards, so if it were that I'd expect them all to fail, and probably immediately after power is applied.

    For the prototype board we are using an ADS8598S over SPI and an MCP23017 I2C I/O expander. We also have USB and Ethernet in the mix, along with a TUSB422 and the usual USB-C over voltage protection and mux for the SuperSpeed USB-C lines. The ADC is on a second board plugged into the processor board and may not have been connected at the time of the processor failure.

    I don't see any smoking guns here!

  • Peter Jaquiery said:
    My fear though is that the chip may have been cooked in the process of removing it from its previous board.

    I applaud you on even being able to get that MCU (128-pin TQFP? ball grid?) up off its pads. At least with the EK-TM4C1294XL it seems impossible without afterburners, there being so much copper clad under the MCU pads. If the temperature rises above 160°C it might easily damage the replacement MCU. For that reason I'm out, done with the EK XL for good.

    The easiest way to remove an MCU without destroying the FR4 PCB, especially the multilayer type, is a Dremel tool and a narrow carbide cutoff wheel. Skimming the tops of the pins (under 5x magnification) cuts right through the copper in minutes, and a razor knife easily finishes the cuts. Follow up with a flat-tip soldering iron (235-245°C) to quickly remove the remaining pin stubs from the pads without lifting pads/traces off the FR4. Having several spare MCUs on hand makes life so much easier!

    The DC supply is highly suspect in the EVM failures. If it powers up prior to the MCU, that alone can stress, if not eventually short out, GPIO pins and possibly the VDD rail. Brown-outs can lead to similar damage if battery backup does not exist on the test bench and all powered systems.

  • Hello Peter,

    After crosschecking the errata sheet for possible ideas I actually came across something that is relevant and is not implemented properly on our DK board: GPIO#09: http://www.ti.com/lit/er/spmz850g/spmz850g.pdf

    Can you check if your custom boards may be susceptible to this as well? If there are not 100 ohm resistors placed between the VBUS and ID pins on the USB connector and USB0VBUS (PB1)/USB0ID (PB0) on the MCU, then that could have affected both the DK and your custom hardware.
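
    For context, the workaround in the errata is those external series resistors; there is no software setting involved. Below is an illustrative sketch (not taken from the DK sources) of the usual TivaWare call that puts PB0/PB1 into their USB0ID/USB0VBUS analog roles, which is how the DK uses these pins:

    ```c
    /* Typical TivaWare setup handing PB0/PB1 to the USB controller as
     * USB0ID/USB0VBUS. GPIO#09 concerns transients coupling onto these
     * pins; the mitigation is the series resistors in hardware, not
     * anything configured here. */
    #include <stdint.h>
    #include <stdbool.h>
    #include "inc/hw_memmap.h"
    #include "driverlib/gpio.h"
    #include "driverlib/sysctl.h"

    void ConfigUsbIdVbusPins(void)
    {
        SysCtlPeripheralEnable(SYSCTL_PERIPH_GPIOB);
        GPIOPinTypeUSBAnalog(GPIO_PORTB_BASE, GPIO_PIN_0 | GPIO_PIN_1);
    }
    ```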

  • Hi Ralph,

    TI support through Ken (the FAE before Eddie) suggested that fairly early on, but I ruled it out as pretty unlikely. The first and most compelling argument is that the fault is hard: once it happens it doesn't clear. That is inconsistent with a noise-triggered fault. We are not using PB0 or PB1 on either of the pretotype boards we used with the dev kit, so they could only be affected on the dev kit when connected to an active USB port. Once the fault happens it doesn't matter if USB is connected or not.

    On our prototype card PB0/PB1 are used for I2C, so they are configured as open drain outputs with 10k pull-ups; GPIO#09 doesn't apply.
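
    For reference the pin setup is nothing exotic. A minimal sketch of what we do (assuming TivaWare with the TM4C1294 part define set so pin_map.h exposes the I2C5 mux on PB0/PB1; the 10k pull-ups to 3V3 are external parts):

    ```c
    /* PB0/PB1 as I2C5 SCL/SDA on a TM4C1294 part. SDA is open drain and
     * both lines idle high via the external 10k pull-ups; the pins never
     * take their USB0ID/USB0VBUS functions. */
    #include <stdint.h>
    #include <stdbool.h>
    #include "inc/hw_memmap.h"
    #include "driverlib/gpio.h"
    #include "driverlib/i2c.h"
    #include "driverlib/pin_map.h"
    #include "driverlib/sysctl.h"

    void ConfigI2C5(uint32_t sysClockHz)
    {
        SysCtlPeripheralEnable(SYSCTL_PERIPH_GPIOB);
        SysCtlPeripheralEnable(SYSCTL_PERIPH_I2C5);
        while (!SysCtlPeripheralReady(SYSCTL_PERIPH_I2C5))
        {
        }

        GPIOPinConfigure(GPIO_PB0_I2C5SCL);              /* PB0 -> SCL */
        GPIOPinConfigure(GPIO_PB1_I2C5SDA);              /* PB1 -> SDA */
        GPIOPinTypeI2CSCL(GPIO_PORTB_BASE, GPIO_PIN_0);
        GPIOPinTypeI2C(GPIO_PORTB_BASE, GPIO_PIN_1);     /* open drain */

        /* 100 kHz standard mode ('false' selects 100 kHz). */
        I2CMasterInitExpClk(I2C5_BASE, sysClockHz, false);
    }
    ```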

    Cheers,

    Peter

  • Hello Peter,

    I'd have to check whether the I2C pull-ups would prevent the errata from applying, because the errata still states: "It can occur when the pin is in input or output mode or with any pin multiplexing options." The fix is supposed to be a low-pass filter and I am not sure how the I2C pull-ups compare.
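
    To put rough numbers on the filter (illustrative component values only; I don't have the recommended parts in front of me): a 100 ohm series resistor with, say, 0.1 uF to ground gives a corner of

    ```latex
    f_c = \frac{1}{2 \pi R C}
        = \frac{1}{2 \pi \cdot 100\,\Omega \cdot 0.1\,\mu\mathrm{F}}
        \approx 16\,\mathrm{kHz}
    ```

    so fast plug-event transients would be heavily attenuated, which a 10k pull-up on its own does not do.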

    Furthermore, we've had many reports of GPIO#09 causing permanent device damage, so while it's a noise-triggered fault, once triggered it won't go away until power down and can damage device internals. I was checking some E2E posts and have even seen multiple EVMs destroyed this way. The typical observation in that case is that the TM4C heats up very noticeably, so it wouldn't be hard to double check on the EVM if that's still an option (I couldn't recall which board you were trying the ABBA swap on), because maybe we are debugging two separate cases where the EVM failed due to this issue and your custom board is plagued by something different.

  • Hi Ralph,

    my reading of the errata implies the low resistance path is from off chip through PB0 or PB1 to chip ground, which would require a low impedance VDD source connected to one or both pins. The errata also implies a large voltage swing (10% to 90% of VDD, i.e. roughly 0.33V to 2.97V on a 3.3V rail) is required to trigger the fault. It's hard to see how that could be generated in the case of the prototype board. A USB plug event could do it in the case of the EVM, especially as there is no filtering on the VBUS.

    On the dev kit I measure 230mA at 0.7V into the processor, which is a high current but fairly low power, and the chip is not getting noticeably warm. Power is from the USB debugger port and nothing is connected to the EVM USB port.
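
    For the record, that dissipation works out to

    ```latex
    P \approx 0.7\,\mathrm{V} \times 0.23\,\mathrm{A} \approx 0.16\,\mathrm{W}
    ```

    which is consistent with the package not feeling warm to the touch.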

    The ABBA was on our prototype board.

    FYI, we are continuing development of the product and hope to get prototype devices into the hands of customers in a few months so we can get some field testing done. If we get no more failures then our confidence increases and we are happy. If we get another failure then we almost have enough for FA. It's a gamble, but what isn't? The alternative is adding six months to spin up some alternative with no better guarantees at the end.

    Cheers,

    Peter

  • BP101 said:
    Have you checked the VBUS pin PB1 with ohm meter, what is the resistance?

    My finding was that any DMM reading below several hundred kilohms indicates PB1 has suffered some kind of internal damage. Likewise, a diode check will indicate a very low drop with the probes in one direction. Perhaps the GPIO#09 errata has made its way into other classes of MCU? Taking no chances, I added 1k series resistors into VBUS/ID (PB1/PB0), which seems to help diminish the frequency of sudden client disconnects. In the case of GPIO#09, perhaps leaving PB0/PB1 as floating CMOS pins eventually leads to unexpected internal demise.

  • Hi Peter,

    For the EVM, did you ever have it plugged into the EVM USB port? If so, I'd repeat that test with that plugged in - the USB port is where the PB0/PB1 connections route to, so checking with the debug port wouldn't have any bearing on the test.

    Regarding your plans moving forward, that sounds good. If you do get another failure then I think what can be done is to go over a few general tests to try to isolate the failure on that board, and if that leads nowhere I think I may be able to get an FA going based on the total of 4 unit failures plus a showing of sufficient investigation.

    Aside from the test I mentioned above, is there anything else we need to discuss on this thread right away? If not, I'd like to close the thread for now, but you'll be able to reopen it with a reply within 30 days or make a new one if it auto locks due to inactivity.

  • Hi Ralph.

    I'm not sure which test you would like repeated. Our normal operating mode with the EVMs is to power and debug them using the debug USB port. However the system we are developing is a USB device so we essentially always had the OTG USB port connected in device mode. I quite likely unplugged and replugged the OTG port at times to test device connect and disconnect. I have no recollection of failure immediately after plugging a USB cable into the EVM OTG port. I'm pretty sure I'd have noticed that.

    Thanks for the offer to help with getting an FA going should we get another failure.

    I'm happy to close this thread for now. Without another failure it looks like we've pretty much squeezed all the juice we can.

  • Hi Peter,

    I see, that was my misunderstanding then. In the original post you said "In each case the systems have been powered from USB ports," referring to the EVM and the custom board, so based on that I thought you were powering the EVM with the USB plug for the device itself and not for debug, which would expose it to the chance of that issue more often. Given you didn't see issues with it connected like that though, I agree it is probably not relevant.

    And alright, sounds good regarding the thread. I suppose this is a rare case where it would be a good thing for all parties if I don't hear from you again then haha. Best of luck!