TUSB7320: failure during burn-in testing of connected USB device

Part Number: TUSB7320

Can any electrically-valid behavior of a connected USB2 device cause a permanent failure of the TUSB7320 USB Host Controller?

One of our PCI Express card designs uses the TUSB7320, without a configuration EEPROM installed, with a single Cypress FX2LP connected (integrated on the same PCA).

While we were running a burn-in test over the holiday weekend, the USB peripheral stopped responding: CyUSB3.sys no longer saw the device, it did not enumerate in Device Manager, Linux `lsusb` took 20 seconds to display the devices, and the details of the USBHC included multiple error messages. All operations on the PCIe portion of the TUSB7320 appear to work fine; only the USBHC and everything downstream of it are symptomatic.

These behaviors remained after cold-booting, moving the PCIe card to a radically different motherboard, etc.

I wonder: can our USB / Cypress FX2LP brick the TUSB7320 USBHC-side circuit?  We have hundreds if not thousands of these cards in the field and this is the first known instance of this problem.

Note: if we move the failed PCIe card to an OLD motherboard it operates correctly. Perhaps the TUSB7320 "bricked" in a manner that causes it to fail on modern-generation PCIe buses? Our testing continues.

  • What kind of burn-in test is this? At what temperature, and for how many hours?

  • The burn-in was written to decrease the time between failures seen in a customer's application.

    The customer's application evokes the symptom we're fixing after a random interval of anywhere from 15 minutes to 15 hours or more, even though it only issues two control transfers at roughly 1 Hz.

    The burn-in program is a Windows 32-bit program that calls our AIOUSB.dll as often as possible, issuing a vendor-specific USB control transfer (endpoint 0) read.  It achieves more than one thousand control transfers per second on this motherboard/PCIe/TUSB7320/FX2LP device/environment.

    Using my "production" firmware the burn-in program evokes the symptom in seconds.

    I fixed 3 errors in Cypress' FX2 EZ-USB SDK source code and the burn-in ran for a few hours before tripping the symptom.

    I switched from Ezusb.lib's IRQ-based I2C read/write functions to non-IRQ implementations and the symptom no longer occurs.

    However, one of the several cards running the burn-in test over the holiday weekend "vanished" during testing: it stopped responding to USB requests, even after rebooting, cold booting, and booting the PC from a Live USB Ubuntu image. It still failed when moved to a very different PC/motherboard, yet it works in an ancient PC.

    Other instances of the same card / model / design continue to operate in all PCs as expected, and do not throw the symptom under test (after the firmware updates).

    My question is: could "valid but bad FX2 firmware", or in fact *any* behavior of *any* working, connected USB peripheral, brick the TUSB7320?
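The burn-in loop described in this reply can be sketched roughly as follows. This is a minimal illustration, not the actual test program: the real tool is a 32-bit Windows program that calls AIOUSB.dll, whose API is not shown in this thread, so the transfer itself is passed in as a callable. The names `run_burn_in` and `read_once` are hypothetical, and the sketch assumes the transfer function reports failures by raising `OSError`.

```python
import time
from typing import Callable, Dict


def run_burn_in(read_once: Callable[[], bytes],
                duration_s: float,
                expected_len: int) -> Dict[str, float]:
    """Call read_once() back to back for duration_s seconds and tally results.

    read_once is assumed to wrap one vendor-specific control read on
    endpoint 0 (bmRequestType 0xC0: device-to-host, vendor, device recipient)
    and to raise OSError when the transfer fails.
    """
    stats: Dict[str, float] = {"transfers": 0, "errors": 0, "short_reads": 0}
    start = time.monotonic()
    deadline = start + duration_s
    while time.monotonic() < deadline:
        try:
            data = read_once()
        except OSError:
            # Count the failure and keep hammering the device.
            stats["errors"] += 1
        else:
            if len(data) != expected_len:
                stats["short_reads"] += 1
        stats["transfers"] += 1
    elapsed = time.monotonic() - start
    # Achieved transfer rate, for comparison with the >1000/s quoted above.
    stats["rate_hz"] = stats["transfers"] / elapsed if elapsed > 0 else 0.0
    return stats


if __name__ == "__main__":
    # Stand-in for real hardware: always returns two zero bytes.
    def fake_read() -> bytes:
        return bytes(2)

    print(run_burn_in(fake_read, duration_s=0.05, expected_len=2))
```

Counting errors instead of stopping at the first one keeps the load on the host controller constant, which matters when the goal is to shorten the time to failure.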

  • Does this burn-in test cause the system or the card to reboot (power off/on)?

    Did this card fail only once?

    Regards

    Brian

  • The burn-in test does not reboot the card, nor the PC, during the run; the original symptom, now fixed via firmware updates, involves a reboot but that's by design.

    The failed card failed once ... permanently.  We still have not recovered normal operation with that card, which is what prompts me to ask the question, here.

  • John:

    Can you replace the bad unit with a good unit and retest? If that confirms the unit is bad, please send the bad unit to us for further analysis.

    Regards

    Brian

  • We have been able to replace the bad *PCIe card* and the replacements work; we cannot replace the TUSB7320 chip on the bad PCIe card; nor do I think we could remove it without destroying it.

    I'm checking to determine if we can send the failed PCIe card, and a working card, for further analysis.  Please confirm I've understood your request correctly.

  • If you cannot remove the bad part from the board, just send the card to us.

    I just want to confirm that your card is OK with respect to my previous suggestion.

    Regards

    Brian

  • We've continued testing, both the failed card and four other units.  Only the failed card experiences symptoms in hundreds of hours of burn-in, so we're going to assume there is a subtle manufacturing error, presumably in the soldering of the chip.

    Thank you, Brian, for your help.