Linux/TUSB3410: Bulk Transfer error with USB3.0 xHCI hosts

Part Number: TUSB3410

Tool/software: Linux

Hi all,

I have tested our USB-to-serial products, which use the TUSB3410. Packets are sometimes lost when I test with USB3.0 hosts (xHCI driver), but everything works normally with USB2.0 hosts.

The problem is very easy to reproduce when transferring a large amount of data (8 KBytes) with USB3.0 hosts.

Both Linux kernel 3.11 and kernel 4.13 have the same problem. Furthermore, I captured the traffic with a USB protocol analyzer: it shows that the USB host sends the entire data to the device without packet loss, but the device drops a packet when it sends the data out. So I wonder whether this is a compatibility issue between the TUSB3410 and USB3.0 hosts.

Best regards,

CT

  • CT,
    It doesn't matter whether it is a USB2.0 host or a USB3.0 host: the TUSB3410 always connects at full speed, so your USB3.0 host will just use its USB2.0 "portion" to enumerate and operate the TUSB3410.

    Can you check the xHCI driver on the host controller?
  • Hello CT,

    What USB3.0 host are you using? It could be a problem with your xHCI host.

    Regards,
    Roberto
  • Hi Roberto and Peter,

    I ran `lsusb -t`, and it shows that the USB host controller driver is xhci_hcd.

    # lsusb -t
    /:  Bus 03.Port 1: Dev 1, Class=root_hub, Driver=xhci_hcd/4p, 480M
        |__ Port 3: Dev 3, If 0, Class=Vendor Specific Class, Driver=mxu11x0, 12M

    If the problem is in the xHCI host, how do we explain that the CATC (USB protocol analyzer) sees the whole data coming from the USB host, while the USB device sends the data out with packet loss?

  • Can you provide the CATC trace?

    Regards,
    JMMN
  • Hi JMMN,

    Here are two CATC files which include XHCI and EHCI traces.

    In the XHCI trace, 8192 bytes were sent from the USB3.0 host to the USB device, but the first 64 bytes of the 8192 were not sent out from the USB device.

    The EHCI trace is almost the same as the XHCI trace, but the result is normal.

    uport11x0_usb3_ubuntu17_xhci.usb

    uport11x0_usb3_ubuntu17_ehci.usb

  • Hi CT,

    The XHCI starts the bulk transfer to endpoint 1 with a DATA Toggle bit of 1 instead of 0. The TUSB3410 is likely trashing the first packet due to data toggle error. Please check for an updated driver.

    From the USB 2.0 Specification:

    5.8.5 Bulk Transfer Data Sequences
    Bulk transactions use data toggle bits that are toggled only upon successful transaction completion to
    preserve synchronization between transmitter and receiver when transactions are retried due to errors. Bulk
    transactions are initialized to DATA0 when the endpoint is configured by an appropriate control transfer.
    The host will also start the first bulk transaction with DATA0.
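
To illustrate the data toggle rule quoted above, here is a minimal user-space sketch in C (a model, not driver code). It assumes a 64-byte maximum packet size, as on a full-speed bulk endpoint, and assumes the receiver ACKs but discards a packet whose toggle does not match what it expects, treating it as a retransmission. The names `device_ep`, `device_receive`, and `bulk_out` are hypothetical.

```c
#include <assert.h>
#include <stddef.h>

#define MAX_PACKET 64  /* assumed full-speed bulk max packet size */

/* Hypothetical model of one bulk OUT endpoint on the device side. */
struct device_ep {
    unsigned expected_toggle;  /* 0 = DATA0, 1 = DATA1 */
    size_t bytes_accepted;     /* payload the device actually kept */
};

/*
 * Deliver one packet to the device. A packet whose toggle matches is
 * accepted and the device advances its toggle. A mismatched packet is
 * treated as a retransmission: it is ACKed, but its payload is discarded.
 * Returns 1 (ACK) in both cases in this model.
 */
static int device_receive(struct device_ep *ep, unsigned pid_toggle, size_t len)
{
    assert(pid_toggle <= 1u);
    if (pid_toggle == ep->expected_toggle) {
        ep->bytes_accepted += len;
        ep->expected_toggle ^= 1u;
    }
    return 1;
}

/*
 * Send `total` bytes, with the host starting at `host_toggle`, to a
 * device endpoint that has just been initialized to DATA0.
 * Returns the number of bytes the device kept.
 */
static size_t bulk_out(unsigned host_toggle, size_t total)
{
    struct device_ep ep = { 0, 0 };
    size_t sent = 0;

    while (sent < total) {
        size_t len = total - sent < MAX_PACKET ? total - sent : MAX_PACKET;

        if (device_receive(&ep, host_toggle, len))
            host_toggle ^= 1u;  /* the host toggles after every ACK */
        sent += len;
    }
    return ep.bytes_accepted;
}
```

Under these assumptions, `bulk_out(1, 8192)` returns 8128: only the first 64-byte packet is discarded, which is consistent with the XHCI trace, while `bulk_out(0, 8192)` returns 8192.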

    Regards,
    JMMN
  • Hi JMMN,

    Thanks for your reply.

    I think you are right! I've done several tests after reading your post, and the result is just as you said: the first packet is lost whenever its Data Toggle bit is 1.

    However, I noticed that there is already a workaround in the `open()` function of the driver that resets the data toggle:

    	/* reset the data toggle on the bulk endpoints to work around bug in
    	 * host controllers where things get out of sync some times */
    	usb_clear_halt(dev, port->write_urb->pipe);
    	usb_clear_halt(dev, port->read_urb->pipe);

    Why does this still happen if we have already applied this workaround?

    Which drivers should I update? XHCI driver? ti_usb_3410 driver?

    Regards,

    CT

  • CT,
    If eHCI is working, then this points to an issue with the xHCI driver and/or the xHCI host controller itself using the wrong toggle. As you said, the ti_usb_3410 already has a workaround to try to fix the data toggle, but it appears it's not working on the particular xHCI host you are using. I can only suggest updating your xHCI driver or trying a different xHCI host controller.
    Regards,
    Brian
  • I found that XHCI doesn't really reset the endpoint...

    static void xhci_endpoint_reset(struct usb_hcd *hcd,
    		struct usb_host_endpoint *ep)
    {
    	struct xhci_hcd *xhci;
    
    	xhci = hcd_to_xhci(hcd);
    
    	/*
    	 * We might need to implement the config ep cmd in xhci 4.8.1 note:
    	 * The Reset Endpoint Command may only be issued to endpoints in the
    	 * Halted state. If software wishes reset the Data Toggle or Sequence
    	 * Number of an endpoint that isn't in the Halted state, then software
    	 * may issue a Configure Endpoint Command with the Drop and Add bits set
    	 * for the target endpoint. that is in the Stopped state.
    	 */
    
    	/* For now just print debug to follow the situation */
    	xhci_dbg(xhci, "Endpoint 0x%x ep reset callback called\n",
    		 ep->desc.bEndpointAddress);
    }

    This is unlike ehci_endpoint_reset(), which calls usb_settoggle() to reset the Data Toggle.

    Does anyone know an alternative way to do the reset in software?

  • In kernel v3.11, that function actually calls xhci_queue_reset_ep(), which issues a Reset Endpoint TRB that should reset the data toggle.  I'm not sure why this was removed in the later kernel versions.  It seems like a change that is incomplete.

    /* Deal with stalled endpoints.  The core should have sent the control message
     * to clear the halt condition.  However, we need to make the xHCI hardware
     * reset its sequence number, since a device will expect a sequence number of
     * zero after the halt condition is cleared.
     * Context: in_interrupt
     */
    void xhci_endpoint_reset(struct usb_hcd *hcd,
    		struct usb_host_endpoint *ep)
    {
    	struct xhci_hcd *xhci;
    	struct usb_device *udev;
    	unsigned int ep_index;
    	unsigned long flags;
    	int ret;
    	struct xhci_virt_ep *virt_ep;
    
    	xhci = hcd_to_xhci(hcd);
    	udev = (struct usb_device *) ep->hcpriv;
    	/* Called with a root hub endpoint (or an endpoint that wasn't added
    	 * with xhci_add_endpoint()
    	 */
    	if (!ep->hcpriv)
    		return;
    	ep_index = xhci_get_endpoint_index(&ep->desc);
    	virt_ep = &xhci->devs[udev->slot_id]->eps[ep_index];
    	if (!virt_ep->stopped_td) {
    		xhci_dbg(xhci, "Endpoint 0x%x not halted, refusing to reset.\n",
    				ep->desc.bEndpointAddress);
    		return;
    	}
    	if (usb_endpoint_xfer_control(&ep->desc)) {
    		xhci_dbg(xhci, "Control endpoint stall already handled.\n");
    		return;
    	}
    
    	xhci_dbg(xhci, "Queueing reset endpoint command\n");
    	spin_lock_irqsave(&xhci->lock, flags);
    	ret = xhci_queue_reset_ep(xhci, udev->slot_id, ep_index);
    	/*
    	 * Can't change the ring dequeue pointer until it's transitioned to the
    	 * stopped state, which is only upon a successful reset endpoint
    	 * command.  Better hope that last command worked!
    	 */
    	if (!ret) {
    		xhci_cleanup_stalled_ring(xhci, udev, ep_index);
    		kfree(virt_ep->stopped_td);
    		xhci_ring_cmd_db(xhci);
    	}
    	virt_ep->stopped_td = NULL;
    	virt_ep->stopped_trb = NULL;
    	virt_ep->stopped_stream = 0;
    	spin_unlock_irqrestore(&xhci->lock, flags);
    
    	if (ret)
    		xhci_warn(xhci, "FIXME allocate a new ring segment\n");
    }
    
  • I think that even in kernel v3.11 or older versions, we still can't reset the data toggle in software, because the xhci driver refuses to reset the endpoint when it is not in the halted state.

  • CT,
    Yes, you are correct. The xHCI host will reject the reset EP command if the EP is not in halted state according to the xHCI 0.96 specification. My only other suggestion would be to try "xhci_queue_stop_endpoint()" before resetting the endpoint. Seems the Linux community may be working on fixing this already: www.spinics.net/.../msg166890.html
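
The state dependency discussed here can be sketched as a small user-space state machine in C. This is only a simplified model built from the driver comment quoted earlier in the thread (Reset Endpoint is accepted only in the Halted state, and Configure Endpoint with Drop and Add set is the suggested way to reset the toggle of a Stopped endpoint). The type and function names are hypothetical, and the resulting Running state after Configure Endpoint is an assumption of the model, so real controller behavior should be checked against the xHCI specification.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Simplified subset of endpoint states; names are illustrative only. */
enum ep_state { EP_RUNNING, EP_STOPPED, EP_HALTED };

struct ep_model {
    enum ep_state state;
    unsigned toggle;  /* host-side data toggle: 0 = DATA0, 1 = DATA1 */
};

/* Reset Endpoint: accepted only in the Halted state, otherwise rejected
 * with no change to the endpoint. */
static bool cmd_reset_endpoint(struct ep_model *ep)
{
    assert(ep != NULL);
    if (ep->state != EP_HALTED)
        return false;
    ep->state = EP_STOPPED;
    ep->toggle = 0;
    return true;
}

/* Stop Endpoint: Running -> Stopped; the toggle is left untouched. */
static bool cmd_stop_endpoint(struct ep_model *ep)
{
    assert(ep != NULL);
    if (ep->state != EP_RUNNING)
        return false;
    ep->state = EP_STOPPED;
    return true;
}

/* Configure Endpoint with Drop and Add set for a Stopped endpoint:
 * modeled as re-initializing the endpoint, including its toggle. */
static bool cmd_configure_drop_add(struct ep_model *ep)
{
    assert(ep != NULL);
    if (ep->state != EP_STOPPED)
        return false;
    ep->toggle = 0;
    ep->state = EP_RUNNING;  /* assumption: a re-added endpoint starts running */
    return true;
}
```

In this model, stopping a running endpoint is not enough for Reset Endpoint to be accepted, because the endpoint ends up Stopped rather than Halted. The toggle only returns to DATA0 through the Configure Endpoint path.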
  • Thank you, Brian.

    It seems we can't do anything except wait for the Linux community.
    By the way, I would like to know whether anyone else using the TUSB3410 has the same problem.
    Also, why does our USB product work well with xHCI on Windows? (I've analyzed the USB sniffer trace: on Windows, our USB product accepts the first packet whether it is DATA0 or DATA1.)

    Regards,
    CT
  • CT,

    This is the first time I've heard of this issue.  Are you testing with the same PC on Windows as you are for Linux?  I want to make sure we are comparing results with the same xHCI host controller and device.  Can you send the traces for both Windows and Linux under the same test condition?   There must be some difference as our device shouldn't behave differently if the packet streams are the same.

    Regards,

    Brian 

  • Hi Brian,

    Here are the traces with the same USB host (and the same motherboard) on Win10 and Ubuntu:

    win10_xhci_bulk_out_256_bytes.usb

    ubuntu_xhci_bulk_out_256_bytes.usb

    I tried to test under the same test condition, but the device drivers and the test programs are different between Windows and Linux after all.

    These two traces show that I sent 256 bytes to the USB device and that in both cases the data toggle started with DATA1, but the packet was dropped only by the USB device attached to Ubuntu.

    Regards,

    CT

  • CT,

    I'm not able to make a good comparison of the traces.  In the Ubuntu trace, I see a CLEAR FEATURE ENDPT HALT which would reset toggle to zero and then first transaction to that endpoint is the Bulk OUT transfer with toggle = 1 which is incorrect. 

    For the Windows trace, I'm not seeing any CLEAR FEATURE ENDPT HALT, so using a data toggle of 1 may be okay there (a previous bulk OUT may have used toggle = 0, but I could not see it in the trace).  Is Windows never issuing CLEAR FEATURE ENDPT HALT, or was it not captured?

    Can you capture the Windows trace showing the CLEAR FEATURE ENDPT HALT and then the first bulk OUT transfer to the device after that?

    Regards,
    Brian

  • Hi Brian,

    You are right, I've captured some traces from device initialization to port opened, and there is no CLEAR_FEATURE ENDPOINT_HALT on Windows.

    It is very strange that the Windows device driver doesn't need CLEAR FEATURE but the Linux driver does.

    I know that endpoint_reset, which is called in usb_clear_halt(), differs between XHCI and EHCI on Linux, so the data toggle is not set to zero on XHCI.

    But when we compare Linux with Windows, we just don't know why the Linux device driver needs this workaround while the Windows driver seems not to need it for the same USB device.

    Regards,

    CT

  • Hi CT,

    That's the problem. In usb_clear_halt(), Linux sends the CLEAR FEATURE to the device, which causes the device to reset its toggle to zero, but usb_reset_endpoint() does not actually reset the xHCI host's toggle. You can probably work around this issue by commenting out the usb_clear_halt() calls in ti_open() in ti_usb_3410_5052.c. I'm not sure of the history of that reset either, so it's possible that change could cause issues on some eHCI hosts. The proper fix is to get the xHCI driver maintainers to fix the reset function so it actually performs a true reset instead of only printing debug.
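
The mismatch described above can be sketched as a small user-space model in C (not kernel code). It assumes that clearing the halt always returns the device to DATA0, and that the host side follows only when the host controller driver really resets its toggle: `true` models the EHCI-like behavior and `false` models the debug-print-only xHCI callback shown earlier in the thread. The names `pipe_model`, `send_packet`, `clear_halt`, and `reopen_and_send` are hypothetical.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical two-sided model of one bulk OUT pipe. */
struct pipe_model {
    unsigned host_toggle;    /* toggle the host will send next */
    unsigned device_toggle;  /* toggle the device expects next */
    bool hc_resets_toggle;   /* does the host driver reset its own toggle? */
};

/* Send one packet. Returns false if the device discarded the payload. */
static bool send_packet(struct pipe_model *p)
{
    bool accepted;

    assert(p != NULL);
    accepted = (p->host_toggle == p->device_toggle);
    if (accepted)
        p->device_toggle ^= 1u;
    p->host_toggle ^= 1u;  /* the device ACKs either way, so the host advances */
    return accepted;
}

/* Model of clearing the halt: CLEAR_FEATURE(ENDPOINT_HALT) to the device,
 * followed by the host-side endpoint reset. */
static void clear_halt(struct pipe_model *p)
{
    assert(p != NULL);
    p->device_toggle = 0;      /* the device always returns to DATA0 */
    if (p->hc_resets_toggle)
        p->host_toggle = 0;    /* only a real host-side reset does this */
}

/* An earlier session sent `prior_packets`, then the port is reopened and
 * `n` packets are sent. Returns how many of those `n` were dropped. */
static unsigned reopen_and_send(bool hc_resets_toggle,
                                unsigned prior_packets, unsigned n)
{
    struct pipe_model p = { 0, 0, hc_resets_toggle };
    unsigned dropped = 0;

    for (unsigned i = 0; i < prior_packets; i++)
        send_packet(&p);
    clear_halt(&p);
    for (unsigned i = 0; i < n; i++)
        if (!send_packet(&p))
            dropped++;
    return dropped;
}
```

In this model, `reopen_and_send(false, 1, 4)` returns 1 and `reopen_and_send(false, 2, 4)` returns 0, which would explain why the loss shows up only some of the time: it depends on where earlier traffic left the host toggle. With `hc_resets_toggle` set to `true`, nothing is dropped in either case.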

    Regards,
    Brian

  • Hi Brian,

    Thanks for explaining usb_clear_halt() and CLEAR FEATURE. You made it clearer for me.

    I've tried commenting out usb_clear_halt() in open(), but that makes bulk transfers abnormal even on eHCI hosts. Is it possible to fix the Linux device driver without using usb_clear_halt()? I think that if the Windows driver can work well without CLEAR FEATURE, Linux should too.

    Regards,
    CT
  • CT,

    I agree that if Windows can work, then Linux should as well. That said, I don't know enough about the Linux driver stack, or about the root cause of why it requires usb_clear_halt(), to propose a solution. I tried to e-mail the driver developer but did not get any response. Sorry I could not be more helpful.

    Regards,
    Brian
  • Hi all,

    I have a workaround, and so far it works fine for me. Just call usb_set_interface() in port open().

    According to USB 2.0 Spec. 9.1.1.5: 

    9.1.1.5 Configured
    Before a USB device’s function may be used, the device must be configured. From the device’s
    perspective, configuration involves correctly processing a SetConfiguration() request with a non-zero
    configuration value. Configuring a device or changing an alternate setting causes all of the status and
    configuration values associated with endpoints in the affected interfaces to be set to their default values.
    This includes setting the data toggle of any endpoint using data toggles to the value DATA0.

    Although xHCI refuses to reset the endpoint when usb_clear_halt() is called, we can still use usb_set_interface() to initialize the toggle sequence.

    I put the following code at the beginning of open():

    	/* reset the data toggle on the bulk endpoints to work around bug in
    	 * host controllers where things get out of sync some times */
    	usb_clear_halt(dev, port->write_urb->pipe);
    	usb_clear_halt(dev, port->read_urb->pipe);
    	/* SET_INTERFACE returns the toggles to DATA0; the second argument is
    	 * the interface number, not the iInterface string descriptor index */
    	usb_set_interface(dev,
    			dev->config->interface[0]->cur_altsetting->desc.bInterfaceNumber,
    			dev->config->interface[0]->cur_altsetting->desc.bAlternateSetting);

    I hope this helps anyone who runs into a similar issue.

    Regards,

    CT