Hello everyone,
I have a product with the TUSB4041I chip and some problems appeared recently. I will try to describe the symptoms in a compact way.
- The product was designed within the last year and about 20 units have been built. Every unit works initially.
- After a couple of months of operation, 4 units have stopped being recognized by some USB hosts. Once a unit stops working with a particular host, we have never seen it start working with that host again.
- If a bad unit is connected via an external USB 2.0 hub (host-powered), it works just fine, even with a host that does not recognize it when connected directly.
- We tried different cables.
We always use computers running Ubuntu and we look at the dmesg output. When a unit that has stopped working is connected, dmesg stays completely quiet, as if the USB data lines were not connected at all.
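For anyone who wants to repeat this check without reading dmesg by hand, here is a minimal Python sketch that lists what the Linux kernel has enumerated by reading sysfs. The function name is only an illustration, and the vendor/product ID to look for depends on how the hub is configured, so it should be read with lsusb on a working unit rather than assumed.

```python
from pathlib import Path


def list_usb_devices(sysfs_root="/sys/bus/usb/devices"):
    """Return {device_name: (idVendor, idProduct)} for every enumerated USB device.

    Interface entries (e.g. "1-1:1.0") have no idVendor file and are skipped.
    """
    devices = {}
    root = Path(sysfs_root)
    if not root.is_dir():
        return devices
    for entry in sorted(root.iterdir()):
        vid_file = entry / "idVendor"
        pid_file = entry / "idProduct"
        if vid_file.is_file() and pid_file.is_file():
            devices[entry.name] = (
                vid_file.read_text().strip(),
                pid_file.read_text().strip(),
            )
    return devices


if __name__ == "__main__":
    # Run once before and once after plugging in the unit and compare the output.
    for name, (vid, pid) in list_usb_devices().items():
        print(f"{name}: {vid}:{pid}")
```

If the list is identical before and after plugging in the unit, the host never saw a connect event, which matches the silent dmesg.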
The reset pin is connected to an RC circuit with a time constant of about 20 ms. Manually pulling the reset pin low after the device has failed to be recognized does not help. Power cycling without disconnecting the USB lines does not help either.
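To put a number on the reset delay, here is a rough Python sketch of how long the RC node takes to cross the input threshold, using V(t) = Vdd * (1 - exp(-t / tau)). The 3.3 V supply and the 0.7 * Vdd threshold are assumptions for illustration only; the real threshold and the minimum reset time have to come from the datasheet.

```python
import math


def rc_time_to_threshold(tau_s, v_supply, v_threshold):
    """Time for an RC-charged node to rise from 0 V to v_threshold.

    Uses V(t) = v_supply * (1 - exp(-t / tau_s)).
    """
    if not 0 < v_threshold < v_supply:
        raise ValueError("threshold must lie between 0 and the supply voltage")
    return -tau_s * math.log(1.0 - v_threshold / v_supply)


if __name__ == "__main__":
    tau = 20e-3      # time constant of our RC circuit, about 20 ms
    vdd = 3.3        # assumed supply voltage
    vih = 0.7 * vdd  # assumed input-high threshold, NOT taken from the datasheet
    delay_ms = rc_time_to_threshold(tau, vdd, vih) * 1e3
    print(f"reset released after about {delay_ms:.1f} ms")
```

With these assumed values the delay comes out slightly longer than one time constant, so the RC circuit holds reset low for a little over 20 ms after power-up.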
After some troubleshooting I noticed that the clock oscillator behaves strangely when the device is not recognized, so I tried to tackle the problem there. One of the attached scope screenshots shows the strange behavior of the clock, and the other shows a cleanly running clock when the same device is connected to the same host via a USB hub. The signal is measured on one of the crystal pins. Note the different time scales.
The crystal used is ABM3B-24.000MHZ-10-1-U-T. I tried the following steps to see if I can make it work:
- Remove the 18p caps
- Put the caps back, remove the 1M resistor
- Put the 1M back, add a 1k resistor in series with the XO pin
After each of these steps the device still worked fine via an external hub, but did not work without one (i.e. the steps had no effect at all; they also did not make things worse).
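As a sanity check on the 18p caps, the load the crystal sees can be estimated with the usual formula C_L = (C1 * C2) / (C1 + C2) + C_stray. The Python sketch below does that; the stray capacitance values are assumptions, not measurements from our board, and the result should be compared against the load capacitance specified in the crystal datasheet.

```python
def effective_load_capacitance(c1_pf, c2_pf, c_stray_pf):
    """Load seen by the crystal: C1 and C2 in series, plus stray capacitance in parallel."""
    return (c1_pf * c2_pf) / (c1_pf + c2_pf) + c_stray_pf


if __name__ == "__main__":
    for stray in (2.0, 3.0, 5.0):  # assumed stray values in pF, not measured
        cl = effective_load_capacitance(18.0, 18.0, stray)
        print(f"stray {stray:.0f} pF -> load {cl:.1f} pF")
```

Two 18 pF caps in series contribute 9 pF, so the total load is 9 pF plus whatever the pins and traces add.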
Later I read that other people had issues with the TUSB4041 being stuck in suspended mode, and that in that mode the clock might be shut down. Is this what is happening? Can the clock look so weird when shut down?
I also read that the soldering of the chip's thermal pad is critical in some way. Our PCB footprint was created according to the datasheet, but we don't know what stencil thickness our assembly house uses, and we have no easy access to an X-ray. Is the solder issue purely thermal? I cannot see how the solder quality on the thermal pad could affect signal integrity as long as there is a reasonable amount of solder and a solid connection. In our application the chip is additionally attached to a heatsink from the top, which should improve cooling. The PCB with the TUSB4041 on it is inside a passively cooled enclosure where the temperature can reach 50-60 deg C, but I would not expect that to damage the chip permanently. If the chip overheated, I would expect it to stop working, at which point the device would shut down due to the lost connection and cool down, so extremely high temperatures should not be possible.
So to summarize: all our devices initially work fine, then after some time some of them permanently stop being recognized by some PCs, while still working fine with other PCs or when connected via a hub.
The strange thing is that everything initially works and only stops after a few weeks or months of operation.
I already looked at the following threads:
https://e2e.ti.com/support/interface/f/138/t/730153?TUSB4041I-How-to-check-behavior-
https://e2e.ti.com/support/interface/f/138/t/548848#pi239031350=2&pi320995=2
My plan now is to try an oscillator instead of a crystal, or even just a crystal from a different manufacturer, but it would be great to understand the problem. I would appreciate any hints about what to do. We have customers whose operations are on hold while they wait for us to find a reliable solution. Swapping a PCB for another one is not a good solution, because the replacement could stop working in the same way at any time.