TUSB8020B: Intermittent SuperSpeed link losses

Part Number: TUSB8020B
Other Parts Discussed in Thread: TUSB8041, TUSB8044AEVM, TUSB522PEVM

Dear Support Team,

We have been using the TUSB8020B in our product for a couple of years and have not experienced problems with link stability. So far, we have mostly equipped our systems with Intel NUC series PCs, but recently we started to experiment with different solutions due to unsatisfactory system-level data latencies with newer families of NUCs and/or Windows and/or drivers. We have achieved satisfactory throughput and latencies using a separate add-on USB 3 PCIe card. However, when experimenting with various onboard USB 3 ports of a couple of motherboards, we have experienced serious problems with link stability between the PC and the hub. The intensity of the problems seems to depend on the motherboard port, but there is no clear pattern, i.e., Gen2 ports are neither better nor worse than Gen1 ports, and even two neighboring ports connected to the same controller may behave very differently. The frequency of the problems is not constant over time, either. There are periods of flawless connection lasting a couple of minutes, and then re-plugging the device causes the problems to return.

The problem looks as follows. When our TUSB8020B hub is connected to the PC, it is correctly recognized (as USB 2.0 and USB 3.0 devices) and appears in USB TreeView as expected. Then, when our device is connected to the hub and powered up, both of its controllers (an independent Super Speed Cypress FX3 and a Full Speed FTDI FT230X) are also correctly recognized, but shortly after, the Super Speed device cyclically disappears from the system and then reappears with a period of a few seconds, which can be seen in USB TreeView (and heard as the standard "ding-dong dong-ding ding-dong" from the system). The Full Speed controller remains properly connected.

We have captured the waveforms on the SS_DNx (purple/3), SS_UP (cyan/2), SSTXx_UP (blue/4) and SSRXx_UP (yellow/1) pins of the TUSB8020B. Unfortunately, we do not have access to a USB 3 protocol analyzer. We were also unable to capture SSRX and SSTX at the same time for an interesting reason: with two oscilloscope probes (10 MOhm ~3 pF) touching an SSRX and an SSTX line simultaneously (no matter P or M), the link instability disappeared!

The following pattern repeats: SS_UP goes high and SS_DNx goes high as well. After 300 to 2000 milliseconds (the time varies widely) SS_UP goes low, SS_DN remains high for about 2.5 seconds, then goes low (for about 100 milliseconds). While SS_UP is low, the waveform on SSTXx_UP looks like link negotiation.

Before you ask:

1. The voltage on USB_VBUS is stable at 0.5 V and does not change, even during the link loss event.

2. The voltage on USB_R1 is stable at GND (actually, about 10 mV) and does not change, even during the link loss event.

3. The voltage on TEST is stable at GND (actually, about 10 mV) and does not change, even during the link loss event.

4. The power pad of TUSB8020B is well-connected to signal GND (see picture). The quality of the solder joints has been verified.

5. The temperature of TUSB8020B is about 44 deg C.

We underscore that our design worked properly with Intel NUC PCs for a couple of years. We have Super Speed transmission lines in our boards on both sides of TUSB8020B (approx. 10 cm in length), the lines are carefully laid out and our PCBs are manufactured with impedance control for symmetric lines. We have board revisions from two manufacturers and both experience similar problems (the older revision seems to lose the link less frequently, however).

We know that NUC mainboards are small and therefore have relatively short Super Speed transmission lines, and this is also the case with the add-on PCIe card, while normal PC motherboards are much bigger and often have much longer transmission lines. We have seen complaints about unreliable USB 3 ports on various mainboards. Our most tested mainboard has signal redrivers in the transmission lines, and we have seen complaints about redriver-equipped ports as well. We also know from our own experience that the Cypress FX3 performs poorly with redrivers in the transmission lines (but here we have the TUSB8020B in the middle).

We suspect that the problem is poor quality of the link between the hub and the host on the PC MB, not necessarily on our side (worked perfectly with NUC), maybe due to a very large distance between the sockets and the MB chipset.

- Why, then, does the problem occur only with our device on the downstream side of the TUSB8020B? We have tried other devices: hard disks, video grabbers -- none experienced similar problems.

- When our device is connected directly to the same MB port, it works properly (no link loss). Why does adding a hub in between induce the problem?

- Can this be related to the short Pending HP Timer of TUSB8020B?

We would be grateful for any suggestions and help with interpreting the oscilloscope waveforms.

Best regards,

Michal

  • Hi Michal:

    Where is the TUSB8020B located? Is it on the PC MB, and where is the redriver? Is the Cypress FX3 the USB 3 device connected to the TUSB8020B downstream port?

    Can you draw a block diagram?

    Regards

    Brian

  • Hi Brian,

    Here is the diagram, I hope it is clear:

    We do not experience the described problems with Intel NUC computer nor with the add-on PCIe USB card with ASM3142 (with the newest firmware uploaded to ensure chipset compatibility). Interestingly, the onboard ASM3042 USB host (firmware version unknown) exhibits the problematic behavior in a degree similar to other onboard ports connected directly to Intel chipset.

    Regards.

    Michal

  • Hi Michal:

    It would be better to look at the traffic with a USB analyzer. Since you don't have one, would it be possible to ship one PC MB to us?

    We suspect that the USB redriver is behaving abnormally.

    For the ASM3042, it could be a firmware issue.

    Regards

    Brian

    We know that an analyzer would be the best solution, but we don't have one yet.
    We are not sure that sending this particular motherboard would help. On the other test board (different manufacturer and chipset, same generation of processors) the same problems occur, only with slightly less intensity. In fact, one only has to browse the Internet to find an absurdly large number of posts on discussion forums about malfunctioning USB3 ports on most new motherboards. Apparently, the technology has been pushed too far and has become unreliable.
    To sum up - the problem does not apply only to this motherboard, but rather, to a greater or lesser extent, to all new motherboards. We would just like to know its root cause. We can temporarily or permanently solve the problem by using an expansion card, but we would prefer to know the precise reason for the current malfunction. The device of which the TUSB8020B is a part is a medical device, so this is important to us.

    Now the questions:

    1. As you have aptly noted, we are most puzzled by the good performance on the card with the ASM3142, and the frequent problems with the ASM3042 placed directly on the MB right next to the USB port.
    Are you suggesting a problem with the ASM3042's firmware, or do you have any other reports of such USB3 link disconnection errors for this or a similar host that have been fixed with a firmware update?

    2. Can you tell from the attached waveforms which device initiates the link breakup - the TUSB8020B, the host, or the camera with the FX3?

    3. Is the link breakup caused by the quality of the USB signal, by excessive signal propagation delay, or by some incompatibility between the components? Consequently, could switching to the newer TUSB8041/43/44 chips help?

    Regards.

    Michal

  • Hi Michal: 

    For Q1: I don't have any reports of USB3 link disconnection errors; I have only seen other people report that they fixed the issue by updating the firmware.

    For Q2: It's hard to get link information from your waveform. Can you zoom in on the SSTX and SSRX waveforms? SSRX seems to receive an LFPS signal followed by a polling signal and then goes idle, but SSTX just keeps sending a data signal.

    For Q3: I am not too worried about signal quality, since the signal after the USB redriver or the add-on card should be pretty clean and should compensate for the 20 cm loss from the MB. My only concern is the 1.5 m cable from the camera to the TUSB8020B downstream port. Is it a standard USB3 cable or one you built yourselves? A loss of 20 dB is allowed between USB host and device at 5 Gbps. If it is your own cable, do you know its loss?

    It could be an incompatibility issue as well, since the TUSB8020B is a USB 3.0 hub and all new MBs have USB 3.1 or USB 3.2 hosts.

    Do you have links to the reports of those new MB issues? I would like to read them and understand more, to see if I can reproduce the issue in our lab.

    Regards

    Brian

  • Hi Brian,

    Here are the zoomed waveforms:

    Regarding the cable, its datasheet states ~2 dB/m attenuation at 5 Gb/s, it comes from a verified supplier, and it is carefully finished in our lab: both plugs are soldered under a microscope and in our opinion they look better than in most off-the-shelf cables that we have tested. Anyway, we have tried various custom and off-the-shelf cables of different lengths -- no difference. When the setup works (on a "good" PC USB port), it even tolerates an additional short extension cable between the hub and the camera head. And when it fails, it fails with any cable. Replacing the camera head does not change anything. Replacing the cable on the PC side does not change anything, either: on a "good" USB port, a 1.8 m cable works; on a "bad" port, even a 0.3 m cable fails.
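
    As a rough cross-check of the figures quoted in this thread (about 2 dB/m cable attenuation at 5 Gb/s, and the 20 dB host-to-device loss figure mentioned by Brian), the cable's contribution can be sketched as below. This is only a back-of-the-envelope estimate: the function name and the simple linear model are assumptions made for illustration, and connector, solder-joint and PCB trace losses are ignored.

```python
# Rough cable-loss estimate from the figures quoted in this thread.
# Assumption: loss scales linearly with length; connectors and PCB traces are ignored.
CABLE_LOSS_DB_PER_M = 2.0   # cable datasheet value at 5 Gb/s (from this post)
LOSS_FIGURE_DB = 20.0       # host-to-device loss figure mentioned by Brian


def cable_loss_db(length_m: float) -> float:
    """Return the estimated cable attenuation in dB for a given length in meters."""
    return CABLE_LOSS_DB_PER_M * length_m


# Cable lengths mentioned in the thread.
for length_m in (0.3, 1.5, 1.8):
    loss = cable_loss_db(length_m)
    print(f"{length_m:.1f} m cable: ~{loss:.1f} dB of the {LOSS_FIGURE_DB:.0f} dB figure")
```

    Under these assumptions, even the longest cable listed here accounts for only a few dB, which fits the observation that changing the cable length makes no difference.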

    Regarding ASM3042 firmware, we have been informed by ASRock support that there is no newer version.

    And finally some examples of reported USB3 disconnecting and reconnecting (but nothing specific to TI chips):

    https://www.reddit.com/r/ASUS/comments/rco3lk/gaming_tuf_z690_usb_ports_disconnecting/ (one of our test PCs also has a TUF Gaming MB),

    https://www.reddit.com/r/gigabyte/comments/1696101/usb_devices_keep_disconnecting_and_reconnecting/

    https://www.dell.com/community/en/conversations/inspiron/usb-keeps-connecting-disconnecting/647f9f59f4ccf8a8de423e75

    https://answers.microsoft.com/en-us/windows/forum/all/usb-composite-device-keeps-disconnecting-and/670173c4-da82-4a90-97c8-875cb209e1fc

    There have been lots of similar reports recently; we searched for issues similar to ours and saw so many that we gave up searching further. The typical advice is to reinstall drivers, change system power settings, etc., usually with no effect:

    https://www.makeuseof.com/how-to-fix-usb-device-disconnecting-reconnecting-windows-10/

    https://www.minitool.com/data-recovery/usb-keeps-disconnecting.html

    Regards,

    Michal

  • Hi Brian,

    We were in a bit of a hurry last time and didn't send the waveforms in the right order.

    The colors of the waveforms indicate the same signals as in the first series of waveforms.

    The first image is a general overview of the whole reconnection cycle, the second is the zoomed moment of loss of the SuperSpeed link on the host side, and the third and fourth images are the zoomed moments of loss of the SuperSpeed link on the camera side (and then the moment of reconnection). The last two images show the same moment twice because it was difficult for us to record all four signals together; in the last one, the SSTXx_UP probe (blue) is also connected. In the first and third images, the probe is not connected to SSTXx_UP (just noise in the waveform).

    Regards,

    Michal

  • Hi  Michal:

    Thanks for the waveforms and the links to those issues.

    Is the yellow waveform a single-ended or a differential signal? It seems to be around 800 mV to 1000 mV.

    Regards

    Brian

  • Hi Brian,

    Yes, all signals are single-ended, with the probes set to 10x (10 MOhm || 10 pF). We do not have an HF differential probe.

    Regards,

    Michal

  • This is what an SS signal looks like: LFPS polling followed by USB CP0 data.

  • From your waveform, it looks like a CP0 pattern, but SS_UP is still low, so the link should not be in the SS state yet.

    Regards

    Brian

  • Hi Brian,

    Thank you for your answer. We are conducting further experiments using EVMs with TUSB8020B and TUSB522 redriver. We will share the results tomorrow.

    Regards,

    Michal

  • Thanks for the update.

    Regards

    Brian

  • Hi Brian,

    Sorry for the delay in responding, we were doing a lot of tests and measurements.

    We carefully analyzed with an oscilloscope all the waveforms from the host side and from the device side with enough time resolution to be able to distinguish all types of LFPS signals.

    The first picture shows a general view of the 3 cycles of connecting and disconnecting of the device with our interpretation of the signals.

    In the next pictures, magnified fragments of the waveforms are shown. First comes the LFPS.Polling:

    It is followed by multiple U1-U0 transitions, which end at a random moment with a timeout and a transition to SS.Inactive shown in the next picture.

    We have verified experimentally that, while in RX.Detect, the HUB tests the receiver connection every 20 ms. In the picture above, the gaps are about 12 ms, which clearly points to SS.Inactive. The HUB can only exit SS.Inactive via physical disconnection of the link or via WarmReset. The Host sends WarmReset after a 2-second timeout, as shown below in a magnified picture.
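
    The reasoning above (about 20 ms between receiver-detection pulses in RX.Detect, about 12 ms in SS.Inactive) can be sketched as a small classifier over pulse timestamps exported from an oscilloscope. The two periods are the values measured in this thread, not values quoted from the USB specification; the function names, the 2 ms tolerance and the example timestamps are hypothetical.

```python
from collections import Counter

# Pulse periods as measured in this thread (not quoted from the USB specification).
RX_DETECT_PERIOD_MS = 20.0
SS_INACTIVE_PERIOD_MS = 12.0


def classify_gap(gap_ms: float, tolerance_ms: float = 2.0) -> str:
    """Label one gap between two receiver-detection pulses."""
    if abs(gap_ms - SS_INACTIVE_PERIOD_MS) <= tolerance_ms:
        return "SS.Inactive"
    if abs(gap_ms - RX_DETECT_PERIOD_MS) <= tolerance_ms:
        return "RX.Detect"
    return "unknown"


def classify_pulse_train(pulse_times_ms: list[float], tolerance_ms: float = 2.0) -> str:
    """Return the most common label over all consecutive gaps in a pulse train."""
    gaps = [later - earlier for earlier, later in zip(pulse_times_ms, pulse_times_ms[1:])]
    if not gaps:
        return "unknown"
    labels = Counter(classify_gap(gap, tolerance_ms) for gap in gaps)
    return labels.most_common(1)[0][0]


# Hypothetical timestamps (ms) of detection pulses read off a capture.
print(classify_pulse_train([0.0, 12.1, 24.0, 36.2]))
```

    The tolerance is an arbitrary choice; it only needs to be small enough to keep the 12 ms and 20 ms cases apart.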

    There is just one failed (probably timed out) attempt to transition from U1 to U0, marked with a red exclamation mark in the graph below. All other operations visible in the waveforms are correct and result either from the previous operations or from the error mentioned above.

    The time between the transition to SS.Inactive and the failed U1 to U0 transition is always between 4.5 ms and 6 ms, suggesting that either Ux_EXIT_TIMER or tRecoveryIdleTimeout is exceeded. Most likely, it is the HUB which is responsible for the disconnection, and it happens during Recovery.Active, Recovery.Configuration or Recovery.Idle. This is further supported by the fact that the error only occurs with the camera equipped with the Cypress FX3. We have tested a lot of other devices (mass storage and frame grabbers) and the disconnection never happened. Our interpretation of the waveforms allows us to attribute the disconnections to the unusual behavior of the FX3, which repeatedly, in 1 ms periods, switches between U1 and U0. This is not disallowed, but it forces frequent Recovery states. When we disabled the support for U1/U2 in the HUB (register 5 bit 5, u1u2Disable), the errors disappeared because there is no Recovery after a U1 to U0 transition.

    This configuration change eliminates the problem but does not fix its root cause which remains unknown. In a medical device, we need to know the origin of the problem. It would be the simplest to attribute the problem to a poor signal quality in the SSRX_UP lines, but...:

    1. After disabling U1/U2 support in the HUB, i.e., with link disconnection eliminated, we performed long hours of data transmission stability tests – the camera sent over 50 million frames at 3200 frames per second without any problems.
    2. We have found an early prototype of our hub PCB, which had the USB SS traces less carefully laid out and had a different type of socket at the Host side (A-type, now we have Micro-B-type). This board does not exhibit the problem with link disconnection with any host, and perfectly tolerates the frequent transitions between U0 and U1.
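
    To put the figures from item 1 into perspective, the duration of such a stability run follows directly from the frame count and the frame rate. The sketch below uses only the two numbers given above; the function name is hypothetical.

```python
# Figures from item 1 above.
FRAMES_SENT = 50_000_000   # "over 50 million frames", so this is a lower bound
FRAME_RATE_FPS = 3200


def streaming_hours(frames: int, fps: float) -> float:
    """Return the continuous streaming time in hours needed to send `frames` at `fps`."""
    return frames / fps / 3600.0


print(f"Streaming time: at least {streaming_hours(FRAMES_SENT, FRAME_RATE_FPS):.1f} hours")
```

    This corresponds to roughly 4.3 hours of uninterrupted transfer, assuming the frame rate was constant throughout.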

    It is then hard to believe that the signal quality is poor when we can transfer huge amounts of data without any problems, and a similar (but less carefully designed) board with similar length of USB SS traces does not exhibit the issue.

    We would like to ask whether our interpretation of the waveforms is correct, and we would be very grateful for any suggestions on what the problem during Recovery could be.

    We have also some additional questions:

    1. As we understand it, disabling U1/U2 only concerns transitions initiated by a device or a host. When the HUB itself initiates a transition because its DS ports are in the Inactive state, it can still go into U2. Is that right?
    2. Our waveforms show that in the SS.Inactive state, the test pulses from the HUB are negative peaks with 0.6 V amplitude. Is this correct? From the USB3 documentation and examples found in the Web, it should be positive pulses rather than negative ones.
    3. Is TX Common Voltage in U1 state actually lower by about 500 mV than in U2?

    Regards,

    Michal

  • Hi Michal:

    Thanks for the detailed information on debugging the issue:

    1: If the downstream port device only supports U1, then it will not go into U2.

    2: I have seen this before; the RX detect signal is a positive pulse. But I will confirm again in our lab.

    3: In U1 and U2, the TX common-mode voltage should not change much.

    We saw another U1/U2 issue recently which can also be solved by disabling U1/U2. Can you try 09h:20 and 05h:12, which change the U1/U2 timeout?

    Regards

    Brian

  • Hi Brian,

    Thank you for sharing the additional information. We have set register 09h to 20h (reserved bit 5) and register 05h to 12h (reserved bit 1) -- I guess you meant hexadecimal values -- and it helped, too. The effect is similar to setting register 05h to 30h. The camera no longer sends the repeated U0-U1 transition requests, and the hub does not send them to the host. The link seems to be stable (after relatively short testing). When your new solution is applied, the communication between the hub and the host just after connecting resembles the communication with default configuration, three power state changes can be seen. With 05h:30h, there were no power state changes visible.

    Could you share more information on what these bits actually change? We would prefer to know what we are fixing and why, particularly since this is a medical device and we have to put some design information into our quality management documents. And which of the two solutions would you recommend?

    Regarding question 1, I am afraid there was some misunderstanding. We have asked about a situation when downstream devices are all inactive/disconnected -- can the hub go into U2 then, even with u1u2disable bit set? (We do not have a problem with it, just prefer to know for sure to have a well-documented design.)

    Regards,

    Michal

  • Hi Michal:

    For 09h, bit 5 is used to enable the custom U1/U2 feature.

    For 05h, bit 1 sets the U1/U2 timeout value; when this bit is 1, the timeout will be FF. With a timeout of FF, the hub will not initiate U1/U2 but will accept U1/U2 requests from the link partner.
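
    To make the register values discussed in this thread easier to compare, the sketch below lists which bits are set in each of them. Only the bit meanings stated in this thread are used; the constant names are hypothetical, and the datasheet remains the authority on the register map.

```python
# Bit positions as described in this thread; the constant names are hypothetical.
REG_05H_U1U2_DISABLE_BIT = 5   # "u1u2Disable" (Michal's first workaround)
REG_05H_U1U2_TIMEOUT_BIT = 1   # when set, the U1/U2 timeout becomes FF (Brian)
REG_09H_CUSTOM_U1U2_BIT = 5    # enables the custom U1/U2 feature (Brian)


def set_bits(value: int) -> list[int]:
    """Return the positions of the bits that are 1 in an 8-bit register value."""
    return [bit for bit in range(8) if value & (1 << bit)]


# Register values tried in the thread.
for name, value in (("05h = 30h", 0x30), ("05h = 12h", 0x12), ("09h = 20h", 0x20)):
    print(f"{name}: bits set = {set_bits(value)}")
```

    Both values written to register 05h also have bit 4 set. The thread does not say what that bit does, so it is presumably left at whatever value the register held before; the two workarounds differ only in bit 5 versus bit 1.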

    Regards

    Brian

  • Hi Michal:

    Any update? Can you tell me which particular motherboards show the issue? We may get one here to test as well.

    Regards

    Brian

  • Hi Brian,

    We tested the TUF GAMING B660M-PLUS from ASUS and the Z790 LiveMixer from ASRock.

    On all boards we checked only the ports on the rear panel. On the TUF GAMING motherboard half of the ports do not work, without any logical explanation - one of a pair of theoretically identical ports works, and the other does not. On this board the situation is stable - a given port either always works or never works. On the LiveMixer board the situation is worse. Of the 8 USB3 ports on the rear panel, none works most of the time, but occasionally three of them work properly for several hours (including the only one that is connected to the ASM3042 controller). We also had a case in which all 8 ports worked for half a day without any changes on our part - the day before none worked, the test system was turned off, and the next day it was turned on and the previously non-working ports worked. An absurd situation.
    By comparison, on a dedicated card with the ASM3142 chipset, the problem never appeared during several days of testing.

    We ran more thorough tests on one of the selected ports on the TUF GAMING board (USB3 type A port placed next to type C), on which none of our current production boards ever worked.

    1. We tested the TUSB8044AEVM - no problems.
    2. We tested TUSB8020BEVM - no problems.
    3. We inserted a TUSB522PEVM redriver between the production board and the computer - problems disappeared.
    4. We tested a very old prototype of our production board (same power supply, slightly different USB3 path layout, different USB3 sockets) - no problems.
    5. We tested an older production version (from a demonstration copy of our hardware) - problems appeared only occasionally.

    As you can see, the situation is strange, and apparently there are some problems when transferring TS1 or TS2. Because an early prototype version works, we are practically sure that the power supply is not the problem. Because the situation does not change when the cable length is varied from 0.2 to 3 meters, we do not suspect excessive signal attenuation. The problem is more complicated and we now have a hypothesis: the latest production version was made with ISOLA IS400 laminate. Perhaps there is too much dispersive signal noise. We do not know which laminate the prototype was made with. What we do know is that the first production version (which works OK almost all the time) was made using IT180A laminate, which is much better at USB3 frequencies.

    We are currently redesigning the board for a laminate that we will be sure is not a problem.

    Regards,
    Michal

  • Hi Michal:

    Thanks for all the information. How long will it take to build the new board?

    Regards

    Brian

  • Hi Brian,

    We additionally replaced the socket on the host side (now we use the same B socket as on the EVMs) and added extra filter stages for the main supply voltage and the 1V1 voltage. To make sure that it is not the PCB laminate that is causing the problems, the next PCB will be made with Isola I-Speed prepreg. It went into production today. The build time is about two weeks. We will let you know the results.

    Regards,

    Michal

  • Hi Michal:

        Thanks for your update.

    Regards

    Brian