AM3517 USB RNDIS problem?

I am using an AM3517 on a LogicPD module to communicate with a PC application via RNDIS.  On both Windows XP and Windows 7, USB communication locks up after several minutes.  The PC application and the AM3517 communicate using a query-response protocol: the PC sends a query, and the AM3517 processes it and responds.  In two Wireshark captures of the traffic, the final packet of a response from the AM3517 never appears and a host receive call times out.  After that, pings fail in both directions between the computer and the AM3517.  After unplugging and replugging the device, I can ping the AM3517 from the computer again.

I am using TI SDK 05.02.00.00.  I modified one line of the kernel source to disable high speed, because hardware problems prevented the AM3517 from communicating with some PCs.  To do this, I commented out the "| MUSB_POWER_HSENAB" term on line 908 of ../drivers/usb/musb/musb_core.c.  RNDIS ran extremely slowly after I disabled high speed, so I also disabled DMA transfers by setting "CONFIG_MUSB_PIO_ONLY=y" in the kernel configuration file.

This problem was not seen on a board where the module could communicate at high speed with DMA enabled.  Long-term testing of USB communication was done in that configuration.

Any ideas on how to diagnose the problem?  A fix that allows DMA to be re-enabled would also be useful.

  • Hi all,

    I gathered some more information over the past two weeks. 

    For background, the USB driver keeps a list of pending requests, made up of struct musb_request entries.  To transmit a message over USB, the musb_gadget_queue function is called to add it to the queue.  If it is the only message in the queue, this function starts transmitting its first part (64 bytes at full speed).  An interrupt arrives when that transmit is done, and the CPU then transmits the next part of the message.  This continues until the message has been sent completely and is dropped from the request queue.

    It looks like, after a while, the software does not see a tx interrupt that should continue the transmission of a request, which leaves the transmit side stuck.  I sent pings to the ARM in quick succession with the -f option, and I discovered this by inserting pr_debug statements in various places in am35x_musb_interrupt and other low-level driver functions.  When the error occurs, TXPKTRDY in the TXCSR register goes to 1 after the last transmit buffer is written and then back down to 0, but the interrupt never shows up.  Is this a known bug in the Linux kernel or the AM3517 silicon?  Is there a fairly easy workaround if the problem can't be solved?
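To make the transmit flow above concrete, here is a small user-space model in C. It is only a sketch and not driver code: `model_request`, `model_load_next_packet`, and `model_send` are hypothetical names, and the 64-byte packet size assumes a full-speed bulk endpoint. It shows why a single lost tx interrupt is enough to stall a request partway through.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define FS_MAX_PACKET 64u   /* assumed max packet size at full speed */

/* Hypothetical stand-in for one queued request (not struct musb_request). */
struct model_request {
    size_t length;   /* total bytes to send */
    size_t actual;   /* bytes handed to the FIFO so far */
};

/* Load the next packet into the (imaginary) FIFO.
 * Returns the number of bytes loaded, which is 0 once everything is loaded. */
size_t model_load_next_packet(struct model_request *req)
{
    size_t remaining = req->length - req->actual;
    size_t chunk = remaining < FS_MAX_PACKET ? remaining : FS_MAX_PACKET;

    req->actual += chunk;
    return chunk;
}

/* Send one request, packet by packet, driven by tx interrupts.
 * irq_lost_after is the number of tx interrupts delivered before one is
 * lost; a negative value means no interrupt is ever lost.
 * Returns true if the request completes, false if it stalls. */
bool model_send(struct model_request *req, int irq_lost_after)
{
    int irqs_seen = 0;

    assert(req->actual == 0);

    /* Queue time: the request is at the head, so the first packet goes out. */
    model_load_next_packet(req);

    for (;;) {
        /* Without the interrupt, nothing loads the next packet. */
        if (irq_lost_after >= 0 && irqs_seen == irq_lost_after)
            return false;
        irqs_seen++;

        /* tx interrupt: either the request is done or more data is loaded. */
        if (req->actual == req->length)
            return true;
        model_load_next_packet(req);
    }
}
```

In this model a 200-byte request with no lost interrupts completes, while `model_send(&req, 2)` stops with 192 bytes loaded and never finishes, which matches the stuck transmit side described above.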

  • I found one solution that appears to work.  I use a timer and reset it every time a transmit message is queued in musb_gadget_queue for endpoint 1.  When the timer goes off, it checks whether any transmit messages are left and sends them.  I did not handle the case where two transmit interrupts in a row are missed, so the fix may need to be more substantial.  Also, it only covers USB endpoint 1.  The patch for this fix is below:

    diff -Naur ./linux-2.6.37-psp04.02.00.07_original/drivers/usb/musb/musb_gadget.c ./linux-2.6.37-psp04.02.00.07/drivers/usb/musb/musb_gadget.c
    --- ./linux-2.6.37-psp04.02.00.07_original/drivers/usb/musb/musb_gadget.c    2011-05-09 16:15:00.000000000 -0400
    +++ ./linux-2.6.37-psp04.02.00.07/drivers/usb/musb/musb_gadget.c    2012-08-01 13:10:32.143886194 -0400
    @@ -1125,6 +1125,31 @@
             rxstate(musb, req);
     }
     
    +/*
    + * The software uses this timer to detect when a transmit interrupt is missed and perform the
    + * transmit.
    + */
    +static struct timer_list m_txtimer;
    +const int EPNUM_TX_TIMER = 1;
    +
    +/*
    + * The purpose of this function is to detect when a transmit interrupt is missed and
    + * perform the transmit, so that the USB transmit end does not lock up for the user.
    + */
    +static void txtimer_timeout(unsigned long musb_ptr)
    +{
    +    struct musb * musb = (struct musb *)musb_ptr;
    +    unsigned long        lockflags;
    +
    +    spin_lock_irqsave(&musb->lock, lockflags);
    +
    +    pr_debug("txtimer_timeout detects possible tx error\n");
    +
    +    musb_g_tx(musb, EPNUM_TX_TIMER);
    +
    +    spin_unlock_irqrestore(&musb->lock, lockflags);
    +}
    +
     static int musb_gadget_queue(struct usb_ep *ep, struct usb_request *req,
                 gfp_t gfp_flags)
     {
    @@ -1191,6 +1216,18 @@
         /* add request to the list */
         list_add_tail(&(request->request.list), &(musb_ep->req_list));
     
    +    if(request->tx && musb_ep == &musb->endpoints[EPNUM_TX_TIMER].ep_in)
    +    {
    +        //(re)start the timer; it goes off 10 ms after the most recent tx request
    +        mod_timer(&m_txtimer,jiffies+msecs_to_jiffies(10));
    +    }
    +    //TODO:  ignore printing a warning for now...
    +    //else if(request->tx)
    +    //{
    +        //warn the user if using a different endpoint than 1
    +        //WARN_ON_ONCE(musb_ep != &musb->endpoints[EPNUM_TX_TIMER].ep_in);
    +    //}
    +
         /* it this is the head of the queue, start i/o ... */
         if (!musb_ep->busy && &request->request.list == musb_ep->req_list.next)
             musb_ep_restart(musb, request);
    @@ -1701,6 +1738,9 @@
         musb->is_active = 0;
         musb_platform_try_idle(musb, 0);
     
    +    //set up the transmit timer here...
    +    setup_timer(&m_txtimer,txtimer_timeout,(unsigned long)musb);
    +
         status = device_register(&musb->g.dev);
         if (status != 0) {
             put_device(&musb->g.dev);
    @@ -1714,6 +1754,12 @@
         if (musb != the_gadget)
             return;
     
    +    //delete active timers here
    +    if(timer_pending(&m_txtimer))
    +    {
    +        del_timer(&m_txtimer);
    +    }
    +
         device_unregister(&musb->g.dev);
         the_gadget = NULL;
     }

    Here's the patch that makes the USB run at full-speed:

    diff -Naur ./linux-2.6.37-psp04.02.00.07_original/drivers/usb/musb/musb_core.c ./linux-2.6.37-psp04.02.00.07/drivers/usb/musb/musb_core.c
    --- ./linux-2.6.37-psp04.02.00.07_original/drivers/usb/musb/musb_core.c    2011-05-09 16:15:00.000000000 -0400
    +++ ./linux-2.6.37-psp04.02.00.07/drivers/usb/musb/musb_core.c    2012-08-01 13:18:56.044385309 -0400
    @@ -977,7 +977,7 @@
         /* put into basic highspeed mode and start session */
         musb_writeb(regs, MUSB_POWER, MUSB_POWER_ISOUPDATE
                             | MUSB_POWER_SOFTCONN
    -                        | MUSB_POWER_HSENAB
    +    //JTD hack:  remove high speed enable                    | MUSB_POWER_HSENAB
                             /* ENSUSPEND wedges tusb */
                             /* | MUSB_POWER_ENSUSPEND */
                             );

    If anyone has any information on why the transmit interrupt sometimes does not appear, please let me know.  I believe I saw the TXPKTRDY bit in the TXCSR register go to 1 after the last transmit packet was written and then return to 0, but I did not see the tx interrupt bit for endpoint 1 go high in EP_INTR_SRC_MASKED_REG.
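On the open point about two missed transmit interrupts in a row: one possible direction is to re-arm the timer from the timeout handler while requests are still queued. The user-space model below is a rough sketch of that idea, not a tested driver change. `wd_simulate` and its tick-based timing are hypothetical. In the driver, the equivalent would presumably be another mod_timer call from txtimer_timeout while the endpoint's request list is not empty.

```c
#include <assert.h>
#include <stdbool.h>

#define WATCHDOG_TICKS 10   /* mirrors the 10 ms timeout used in the patch */

/* Simulate sending `packets` packets, one tick per packet in the normal case.
 * The completion interrupts of the first `lost` packets are dropped, so only
 * the watchdog timer can move those packets along.
 * With rearm == false the timer is armed once at queue time (as in the patch);
 * with rearm == true it is re-armed after every serviced packet while data
 * remains. Returns the number of packets still unsent after max_ticks. */
int wd_simulate(int packets, int lost, bool rearm, int max_ticks)
{
    int pending = packets;
    int expires = WATCHDOG_TICKS;   /* armed at tick 0; -1 means disarmed */

    assert(packets >= 0 && lost >= 0);

    for (int now = 1; now <= max_ticks && pending > 0; now++) {
        bool serviced = false;

        if (lost > 0) {
            /* Interrupt was dropped: wait for the watchdog, if it is armed. */
            if (expires >= 0 && now >= expires) {
                lost--;
                expires = -1;
                serviced = true;
            }
        } else {
            /* Normal case: the tx interrupt arrives and is handled. */
            serviced = true;
        }

        if (serviced) {
            pending--;
            if (rearm && pending > 0)
                expires = now + WATCHDOG_TICKS;
        }
    }
    return pending;
}
```

In this model, `wd_simulate(4, 2, false, 100)` leaves 3 packets unsent because the one-shot timer has already fired when the second interrupt goes missing, while `wd_simulate(4, 2, true, 100)` sends all 4.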

  • I dug into it further and found the true source of the "missed" tx interrupt.  In the AM3517 Silicon Errata document, advisory 1.1.20 states that reads of the POWER or FADDR registers can clear the INTRTX register.  This is exactly what was happening.  The POWER read happens at the beginning of the musb_interrupt function and adds no functionality at that point.  Below is my patch for this bug.  I am not sure why the bug did not show up in high-speed mode, but it is certainly a bug.

    Note that this patch does not deal with any unnecessary reads of FADDR, INTRTXE, INTRRXE, or INTRUSBE that may exist (I think FADDR is read directly, even though software should already know its value).  Reads of the endpoint FIFOs are done correctly for the AM3517 (in am35x_musb_fifo_read).

    Also, do not apply the "JTD hack" hunk at line 977 of this patch if you want high-speed USB.

    --- ./linux-2.6.37-psp04.02.00.07_original/drivers/usb/musb/musb_core.c    2011-05-09 16:15:00.000000000 -0400
    +++ ./linux-2.6.37-psp04.02.00.07/drivers/usb/musb/musb_core.c    2012-08-01 18:03:52.504460106 -0400
    @@ -488,11 +488,11 @@
      */
     
     static irqreturn_t musb_stage0_irq(struct musb *musb, u8 int_usb,
    -                u8 devctl, u8 power)
    +                u8 devctl)
     {
         irqreturn_t handled = IRQ_NONE;
     
    -    DBG(3, "<== Power=%02x, DevCtl=%02x, int_usb=0x%x\n", power, devctl,
    +    DBG(3, "DevCtl=%02x, int_usb=0x%x\n", devctl,
             int_usb);
     
         /* in host mode, the peripheral may issue remote wakeup.
    @@ -506,6 +506,7 @@
             if (devctl & MUSB_DEVCTL_HM) {
     #ifdef CONFIG_USB_MUSB_HDRC_HCD
                 void __iomem *mbase = musb->mregs;
    +            u8 power;
     
                 switch (musb->xceiv->state) {
                 case OTG_STATE_A_SUSPEND:
    @@ -513,6 +514,11 @@
                      * will stop RESUME signaling
                      */
     
    +                //read power register here...
    +                //because of a AM3517 silicon errata if POWER is read it can clear INTRTX!
    +                //See Advisory 1.1.20 of AM35x ARM Microprocessor Silicon 1.1,1.0 Errata
    +                power = musb_readb(musb->mregs, MUSB_POWER);
    +
                     if (power & MUSB_POWER_SUSPENDM) {
                         /* spurious */
                         musb->int_usb &= ~MUSB_INTR_SUSPEND;
    @@ -682,8 +688,8 @@
     
     #endif
         if (int_usb & MUSB_INTR_SUSPEND) {
    -        DBG(1, "SUSPEND (%s) devctl %02x power %02x\n",
    -                otg_state_string(musb), devctl, power);
    +        DBG(1, "SUSPEND (%s) devctl %02x\n",
    +                otg_state_string(musb), devctl);
             handled = IRQ_HANDLED;
     
             switch (musb->xceiv->state) {
    @@ -977,7 +983,7 @@
         /* put into basic highspeed mode and start session */
         musb_writeb(regs, MUSB_POWER, MUSB_POWER_ISOUPDATE
                             | MUSB_POWER_SOFTCONN
    -                        | MUSB_POWER_HSENAB
    +    //JTD hack:  remove high speed enable                    | MUSB_POWER_HSENAB
                             /* ENSUSPEND wedges tusb */
                             /* | MUSB_POWER_ENSUSPEND */
                             );
    @@ -1627,12 +1633,11 @@
     irqreturn_t musb_interrupt(struct musb *musb)
     {
         irqreturn_t    retval = IRQ_NONE;
    -    u8        devctl, power;
    +    u8        devctl;
         int        ep_num;
         u32        reg;
     
         devctl = musb_readb(musb->mregs, MUSB_DEVCTL);
    -    power = musb_readb(musb->mregs, MUSB_POWER);
     
         DBG(4, "** IRQ %s usb%04x tx%04x rx%04x\n",
             (devctl & MUSB_DEVCTL_HM) ? "host" : "peripheral",
    @@ -1651,7 +1656,7 @@
          */
         if (musb->int_usb)
             retval |= musb_stage0_irq(musb, musb->int_usb,
    -                devctl, power);
    +                devctl);
     
         /* "stage 1" is handling endpoint irqs */
     

  • Also, I was able to use DMA again after removing the POWER register read.

  • Sorry, it appears that USB still runs slowly with DMA enabled.  When I connected to a Windows XP machine, pings took about half a second and a user application communicating over USB ran slowly.  DMA also appeared to slow down some pings when connected to a Linux machine (ping <address>).  I am not sure what is causing this problem.  With DMA disabled on USB, I did not see the problem with either the Linux machine or the Windows XP machine.
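Returning to the errata finding earlier in the thread: for FADDR, where software already knows the value it wrote, one way to avoid the register read is to keep a shadow copy. The sketch below models that in user-space C. `fake_musb` and all function names are hypothetical, and the read side effect is modelled only as the thread describes it (a read of FADDR can clear INTRTX). A shadow copy only helps for values that hardware never changes on its own, so it would also have to be updated wherever the controller resets the address itself.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical model of the affected registers; not the real MUSB layout. */
struct fake_musb {
    uint8_t  faddr;         /* function address register */
    uint16_t intrtx;        /* pending tx interrupt bits, cleared when read */
    uint8_t  faddr_shadow;  /* software copy, updated on every write */
};

/* Direct register read. Models the erratum as described in the thread:
 * reading FADDR can clear INTRTX as a side effect. */
uint8_t hw_read_faddr(struct fake_musb *hw)
{
    hw->intrtx = 0;
    return hw->faddr;
}

/* INTRTX is read-to-clear: return the pending bits and clear them. */
uint16_t hw_read_intrtx(struct fake_musb *hw)
{
    uint16_t pending = hw->intrtx;

    hw->intrtx = 0;
    return pending;
}

/* Write the address and remember it, so later code never has to read it. */
void set_faddr(struct fake_musb *hw, uint8_t addr)
{
    assert(addr < 128);   /* USB device addresses are 7 bits wide */
    hw->faddr = addr;
    hw->faddr_shadow = addr;
}

/* Return the remembered address without touching the hardware register. */
uint8_t get_faddr_cached(const struct fake_musb *hw)
{
    return hw->faddr_shadow;
}

/* Interrupt handler that reads FADDR first: the pending tx bits are lost. */
uint16_t isr_reads_faddr(struct fake_musb *hw)
{
    (void)hw_read_faddr(hw);
    return hw_read_intrtx(hw);
}

/* Interrupt handler that uses the shadow copy: the pending bits survive. */
uint16_t isr_uses_shadow(struct fake_musb *hw)
{
    (void)get_faddr_cached(hw);
    return hw_read_intrtx(hw);
}
```

With a tx interrupt pending for endpoint 1 (bit 1 set in `intrtx`), `isr_reads_faddr` returns 0 and the interrupt is lost, while `isr_uses_shadow` returns the pending bit as expected.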