Linux/PROCESSOR-SDK-AM335X: USB gadget issue with RT kernel

Part Number: PROCESSOR-SDK-AM335X

Tool/software: Linux

Hello,

We activated the 'Fully Preemptible Kernel (RT)' mode in our Linux config, and now the [ksoftirqd/0] process constantly shows high CPU load and blocks the system.

We use the Linux USB gadget functionality, and the problem is easier to reproduce when a USB connection is present.

Our System:

CPU: TI AM335X

repository: git://git.ti.com/processor-sdk/processor-sdk-linux.git

branch: processor-sdk-linux-rt-03.03.00

Are you aware of any known problems with preemption and ksoftirqd/0?

What could cause ksoftirqd/0 to show such a high CPU load?

Thanks in advance

Regards,

Christian Fisahn

  • Christian,

    Christian Fisahn said:
    We activated the 'Fully Preemptible Kernel (RT)' mode in our Linux config, and now the [ksoftirqd/0] process constantly shows high CPU load and blocks the system.

    Do you mean the issue didn't happen before you switched to "Fully Preemptible Kernel"?

    Christian Fisahn said:
    We use the Linux USB gadget functionality, and the problem is easier to reproduce when a USB connection is present.

    What USB gadget function do you use?

    Christian Fisahn said:
    What could cause ksoftirqd/0 to show such a high CPU load?

    Please examine /proc/interrupts to see which interrupt has a high rate.


  • Hello,


    There was a copy-paste mistake in the first post: "ksoftirqd/0" should be "ktimersoftd/0".

    Do you mean the issue didn't happen before you switched to "Fully Preemptible Kernel"?

    Yes, this statement is correct.

    What USB gadget function do you use?

    g_multi -> Serial Emulation, RNDIS and bulk device Emulation.

    Please examine /proc/interrupts to see which interrupt has a high rate.

    • With a USB connection plugged in to a PC

    Mem: 62192K used, 189572K free, 4K shrd, 12K buff, 41320K cached

    CPU:  0.0% usr 93.4% sys  0.0% nic  6.5% idle  0.0% io  0.0% irq  0.0% sirq

    Load average: 0.61 0.35 0.17 1/100 202

      PID  PPID USER     STAT   VSZ %VSZ CPU %CPU COMMAND

        4     2 root     SW       0  0.0   0 86.0 [ktimersoftd/0]

     

    cat /proc/interrupts | grep -i time; sleep 10; cat /proc/interrupts | grep time

     

    16:   16456125      INTC  68 Level     gp_timer

    16:   16957202      INTC  68 Level     gp_timer

     

    >>  (16957202-16456125)/10 = ~50107 interrupts per second

    • Without a USB connection to a PC

    cat /proc/interrupts | grep -i time; sleep 10; cat /proc/interrupts | grep time

     

    16:       5045      INTC  68 Level     gp_timer

    16:       6048      INTC  68 Level     gp_timer

     

    >>  (6048-5045)/10 = ~100 interrupts per second (matches CONFIG_HZ_100)

     

    In normal operation, when a USB connection is plugged in, the ktimersoftd CPU load rises to 50-80% but then drops back to zero.

    But sometimes it stays at 100%.

     

    Thanks

     

    Regards,

     

    Christian
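The two interrupt-rate calculations in the reply above can be reproduced with a few lines of C. This is only a minimal sketch: the helper names `parse_irq_count` and `irq_rate` are hypothetical, and it assumes a single-CPU `/proc/interrupts` layout like the one quoted in the thread (IRQ number, a colon, then one counter column).

```c
#include <assert.h>
#include <stdio.h>

/*
 * Parse the first counter from one /proc/interrupts line, e.g.
 * "16:   16456125      INTC  68 Level     gp_timer".
 * Assumption: the line starts with a numeric IRQ followed by ':' and
 * a single per-CPU count. Returns 0 on success, -1 otherwise.
 */
static int parse_irq_count(const char *line, unsigned long *count)
{
    unsigned int irq;

    return sscanf(line, " %u: %lu", &irq, count) == 2 ? 0 : -1;
}

/*
 * Average interrupts per second between two samples taken
 * interval_s seconds apart (integer division, so the result is
 * truncated rather than rounded).
 */
static unsigned long irq_rate(unsigned long before, unsigned long after,
                              unsigned int interval_s)
{
    assert(interval_s > 0 && after >= before);
    return (after - before) / interval_s;
}
```

Feeding in the two `gp_timer` samples taken with USB plugged in (16456125 and 16957202, 10 seconds apart) gives the rate of roughly 50107 per second quoted in the thread, while the unplugged samples (5045 and 6048) give about 100 per second.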


  • Christian,

    Thanks for the details. Let me first try to reproduce it and check what is happening. I will keep you posted.
  • Christian,

    By the way, can you please add 'usbcore.autosuspend=-1' in u-boot bootargs to see if the issue still happens?

  • Christian,

    I think I am able to see the same issue you reported: the hrtimer in the CPPI DMA driver fires too frequently, and I get more than 30K gp_timer interrupts per second.
    I will work on fixing the driver bug. Meanwhile, you can disable CPPI DMA in the kernel menuconfig to avoid this issue until I have a solution.
    By the way, "usbcore.autosuspend=-1" doesn't solve the problem.
  • Hello Bin Liu,

    We also have a procedure to reproduce it:
    1. Establish a connection over USB gadget Serial to the device e.g. with minicom
    2. Do some periodic print outs on this session (e.g. top)
    3. Establish a parallel connection over USB gadget RNDIS to the device e.g. Telnet
    4. Watch the processes with top
    5. Close the USB gadget serial connection (step 1) without stopping the periodic printout.
    6. The ktimersoftd process will show the high CPU load.
    7. On reconnecting the USB gadget serial connection, the ktimersoftd process returns to its normal state.

    I will test with CPPI DMA disabled in the kernel menuconfig.

    thanks

    Regards,

    Christian
  • Hello,

    Disabling DMA mode (that is, enabling PIO mode) resolves the problem described in the last post.
    We will keep watching for the sporadic case where the issue occurs when plugging in the USB connection.

    So we look forward to your fix.

    thanks

    Regards,

    Christian
  • Christian,

    In my observation of the g_multi use case, the issue happens when the USB CPPI DMA tries to transmit a 1-byte packet but gets stuck. The driver uses a timer to keep polling for the transmit completion, which causes the timer interrupt storm.

    I have added a work item to my backlog to investigate why the 1-byte transfer fails. Meanwhile, please use the following kernel patch as the solution. It makes transfers with a packet size <= 8 bytes bypass CPPI DMA, which is more efficient anyway.

    diff --git a/drivers/usb/musb/musb_gadget.c b/drivers/usb/musb/musb_gadget.c
    index 86066328e480..10c28f15d536 100644
    --- a/drivers/usb/musb/musb_gadget.c
    +++ b/drivers/usb/musb/musb_gadget.c
    @@ -301,7 +301,7 @@ static void txstate(struct musb *musb, struct musb_request *req)
                    request_size = min_t(size_t, request->length - request->actual,
                                            musb_ep->dma->max_len);
     
    -               use_dma = (request->dma != DMA_ADDR_INVALID && request_size);
    +               use_dma = (request->dma != DMA_ADDR_INVALID && request_size > 8);
     
                    /* MUSB_TXCSR_P_ISO is still set correctly */
     
    
    
  • Thanks for the fast response and patch.

    We will apply your quick solution and look forward to the final fix.

  • Christian,

    I am unable to replicate the issue with mainline kernel v4.16-rc6, so I have dropped the work item from my backlog. The patch above will be the final fix, unless we find that it causes other issues.