Because of the holidays, TI E2E™ design support forum responses will be delayed from Dec. 25 through Jan. 2. Thank you for your patience.

This thread has been locked.

If you have a related question, please click the "Ask a related question" button in the top right corner. The newly created question will be automatically linked to this question.

AM3356: kworker becomes high cpu usage and some software stop with USB connected LTE modem

Part Number: AM3356

Tool/software:

We are facing a problem which the cpu usage of kworker becomes high (about 70%)
and some software stop working.
At this time, musb_urb_dequeque() in usdb_dhrc and rx_submit() in usbnet have high cpu usage in perf tool.

When we repeats following steps with USB connected LTE modem, the kworker's problem occurs once every 20 to 200 times.

And the problem is recovered if musb driver is reloaded as bellow.

--------------------
$ rmmod musb_dsps
$ rmmod musb_hdrc
$ modprobe musb_dsps
--------------------

Please tell me the cause of this problem and how to solve or prevent it.

- Here are the steps:
1. Set Modem power on.
2. Wait for serial ports to appears.
(/dev/ttyACM[0-3] appears about 10 sec after power on.)
3. Wait for a ready-to-use message from MODEM.
(^SYSTART messages is transmitted from MODEM when MODEM becomes ready.)
4. Send following AT commands to /dev/ttyACM0 to activate network.
---------------------------------------------------------
[2024-07-26 13:44:41.806] ATE1
[2024-07-26 13:44:41.806] OK
[2024-07-26 13:44:41.951] AT+CMEE=2
[2024-07-26 13:44:41.951] OK
[2024-07-26 13:44:42.088] AT^SGPSC="NMEA/Output","on"
[2024-07-26 13:44:42.088] ^SGPSC: "Nmea/Output","on"
[2024-07-26 13:44:42.088] OK
[2024-07-26 13:44:42.230] AT^SGPSC="Engine","1"
[2024-07-26 13:44:42.230] ^SGPSC: "Engine","1"
[2024-07-26 13:44:42.230] OK
[2024-07-26 13:44:42.359] AT+CPIN?
[2024-07-26 13:44:42.359] +CPIN: READY
[2024-07-26 13:44:42.359] OK
...
[2024-07-26 13:44:43.272] # make_line_active
[2024-07-26 13:44:43.481] AT+COPS?
[2024-07-26 13:44:43.481] +COPS: 0,0,"NTT DOCOMO",7
[2024-07-26 13:44:43.498] OK
[2024-07-26 13:44:43.608] AT+CEREG?
[2024-07-26 13:44:43.637] +CIEV: prov,1,"Commercial-DCM"
[2024-07-26 13:44:43.637] +CEREG: 0,1
[2024-07-26 13:44:43.637] OK

[2024-07-26 13:44:43.766] AT^SWWAN?
[2024-07-26 13:44:43.766] OK

[2024-07-26 13:44:43.878] AT+CREG?
[2024-07-26 13:44:43.878] +CREG: 0,1
[2024-07-26 13:44:43.893] OK

[2024-07-26 13:44:44.006] AT^SWWAN=1,1
[2024-07-26 13:44:44.075] OK
---------------------------------------------------------
5. Receive GPS NMEA sentences on /det/ttyS1 for about 50 seconds.
6. Set Modem power off.

7. repeat from step 1.

- Board: Custom board

- LTE modem interface: USB1 in host mode.
LTE network interface name is eth0.

- device tree:

Fullscreen
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
&am33xx_pinmux {
...
usb1_pins_default: usb1_pins_default {
pinctrl-single,pins = <
AM33XX_IOPAD(0xa34, PIN_OUTPUT | MUX_MODE0) /* (F15) USB1_DRVVBUS.USB1_DRVVBUS */
>;
};
usb1_pins_sleep: usb1_pins_sleep {
pinctrl-single,pins = <
AM33XX_IOPAD(0xa34, PIN_OUTPUT | MUX_MODE7) /* (F15) USB1_DRVVBUS.gpio3_13 */
>;
};
}
&usb {
status = "okay";
};
&usb_ctrl_mod {
status = "okay";
};
XXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXX

- result of 'top' command when the problem occurs:

Fullscreen
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
top - 23:22:36 up 13:10, 2 users, load average: 10.90, 10.05, 8.71
Tasks: 89 total, 2 running, 87 sleeping, 0 stopped, 0 zombie
%Cpu(s): 14.7 us, 76.5 sy, 0.0 ni, 0.0 id, 0.0 wa, 0.0 hi, 8.8 si, 0.0 st
MiB Mem : 237.2 total, 111.5 free, 34.0 used, 91.7 buff/cache
MiB Swap: 0.0 total, 0.0 free, 0.0 used. 192.8 avail Mem
PID USER PR NI VIRT RES SHR S %CPU %MEM TIME+ COMMAND
5762 root 20 0 0 0 0 R 64.0 0.0 341:42.00 kworker/0:3+events
10 root 20 0 0 0 0 S 16.0 0.0 75:42.61 ksoftirqd/0
6798 root 20 0 3608 1784 1508 R 16.0 0.7 0:00.11 top
1 root 20 0 24728 5428 3864 S 0.0 2.2 1:58.37 systemd
2 root 20 0 0 0 0 S 0.0 0.0 0:00.08 kthreadd
3 root 0 -20 0 0 0 I 0.0 0.0 0:00.00 rcu_gp
4 root 0 -20 0 0 0 I 0.0 0.0 0:00.00 rcu_par_gp
8 root 0 -20 0 0 0 I 0.0 0.0 0:00.00 mm_percpu_wq
9 root 20 0 0 0 0 S 0.0 0.0 0:00.00 rcu_tasks_kthre
11 root 20 0 0 0 0 I 0.0 0.0 0:19.47 rcu_preempt
12 root 20 0 0 0 0 S 0.0 0.0 0:00.56 kdevtmpfs
13 root 0 -20 0 0 0 I 0.0 0.0 0:00.00 netns
XXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXX

- result of 'perf' command when the problem occurs:

This shows musb_urb_dequeque() in usdb_dhrc and rx_submit() in usbnet have high cpu usage.

Fullscreen
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
-----------------------------------------------------------------
Samples: 12K of event 'cycles', Event count (approx.): 354629522
Children Self Command Shared Object Symbol
+ 76.97% 0.00% kworker/0:6+eve [kernel.kallsyms] [k] ret_from_fork
+ 76.97% 0.00% kworker/0:6+eve [kernel.kallsyms] [k] kthread
+ 76.97% 0.00% kworker/0:6+eve [kernel.kallsyms] [k] worker_thread
+ 76.97% 0.00% kworker/0:6+eve [kernel.kallsyms] [k] process_one_work
+ 76.69% 0.00% kworker/0:6+eve [usbnet] [k] usbnet_deferred_kevent
+ 76.69% 0.00% kworker/0:6+eve [usbnet] [k] __handle_link_change.pa
+ 76.63% 1.00% kworker/0:6+eve [usbnet] [k] unlink_urbs.constprop.0
+ 72.65% 0.96% kworker/0:6+eve [kernel.kallsyms] [k] usb_hcd_unlink_urb
+ 71.03% 27.84% kworker/0:6+eve [musb_hdrc] [k] musb_urb_dequeue
+ 30.17% 0.25% kworker/0:6+eve [kernel.kallsyms] [k] __softirqentry_text_sta
+ 30.17% 0.00% kworker/0:6+eve [kernel.kallsyms] [k] __irq_svc
+ 30.17% 0.00% kworker/0:6+eve [kernel.kallsyms] [k] __handle_domain_irq
+ 30.17% 0.00% kworker/0:6+eve [kernel.kallsyms] [k] irq_exit
+ 28.49% 0.08% kworker/0:6+eve [kernel.kallsyms] [k] tasklet_action_common.c
+ 28.41% 0.13% kworker/0:6+eve [usbnet] [k] usbnet_bh
+ 28.28% 0.19% kworker/0:6+eve [usbnet] [k] rx_alloc_submit
+ 28.08% 27.81% kworker/0:6+eve [usbnet] [k] rx_submit
+ 13.04% 0.15% kworker/0:6+eve [kernel.kallsyms] [k] preempt_schedule_common
XXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXX

- journal log around the problem:

Fullscreen
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
Jul 25 16:24:42 am335x-evm kernel: usb 1-1: new high-speed USB device number 28 using musb-hdrc
Jul 25 16:24:42 am335x-evm kernel: usb 1-1: New USB device found, idVendor=1e2d, idProduct=0069, bcdDevice= 0.00
Jul 25 16:24:42 am335x-evm kernel: usb 1-1: New USB device strings: Mfr=4, Product=3, SerialNumber=5
Jul 25 16:24:42 am335x-evm kernel: usb 1-1: Product: PLSx3
Jul 25 16:24:42 am335x-evm kernel: usb 1-1: Manufacturer: Cinterion Wireless Modules
Jul 25 16:24:42 am335x-evm kernel: usb 1-1: SerialNumber: 4a722271
Jul 25 16:24:42 am335x-evm kernel: cdc_acm 1-1:1.0: ttyACM0: USB ACM device
Jul 25 16:24:42 am335x-evm kernel: cdc_acm 1-1:1.2: ttyACM1: USB ACM device
Jul 25 16:24:42 am335x-evm kernel: cdc_acm 1-1:1.4: ttyACM2: USB ACM device
Jul 25 16:24:42 am335x-evm kernel: cdc_acm 1-1:1.6: ttyACM3: USB ACM device
Jul 25 16:24:42 am335x-evm connmand[715]: eth0 {create} index 284 type 1 <ETHER>
Jul 25 16:24:42 am335x-evm connmand[715]: eth0 {update} flags 4098 <DOWN>
Jul 25 16:24:42 am335x-evm connmand[715]: eth0 {newlink} index 284 address 00:A0:C6:22:27:10 mtu 1500
Jul 25 16:24:42 am335x-evm systemd-networkd[718]: eth0: Link UP
Jul 25 16:24:42 am335x-evm connmand[715]: eth0 {newlink} index 284 operstate 2 <DOWN>
Jul 25 16:24:42 am335x-evm systemd-networkd[718]: eth0: Gained carrier
Jul 25 16:24:42 am335x-evm connmand[715]: Adding interface eth0 [ ethernet ]
Jul 25 16:24:42 am335x-evm kernel: cdc_ether 1-1:1.8 eth0: register 'cdc_ether' at usb-musb-hdrc.1-1, CDC Ethernet Device, 00:a0:c6:22:27:10
Jul 25 16:24:42 am335x-evm connmand[715]: eth0 {update} flags 102467 <UP,RUNNING,LOWER_UP>
Jul 25 16:24:42 am335x-evm connmand[715]: eth0 {newlink} index 284 address 00:A0:C6:22:27:10 mtu 1500
Jul 25 16:24:42 am335x-evm connmand[715]: eth0 {newlink} index 284 operstate 0 <UNKNOWN>
XXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXX

Thanks

  • Hi,

    Which version of the kernel do you use on AM3356?

  • Hi Bin san,

    I forgot to describe important informations.

    kernel version is 5.10.100+
    URL: git://git.ti.com/ti-linux-kernel/ti-linux-kernel.git
    tag: 08.02.00.006
    branch: ti-linux-5.10.y

    SDK version is 08_02_00_24

    Best regards,
    Katsuyama

  • Hi Bin-san,

    I attach kernel config.linux-config.zip

  • Hi Katsuyama-san,

    Let me look into how to debug the problem.

      4554 # CONFIG_MUSB_PIO_ONLY is not set
      4555 CONFIG_USB_TI_CPPI41_DMA=y

    Your kernel has MUSB DMA enabled. Can you please disable it in kernel menuconfig to use MUSB PIO mode to see if the issue still happens?

  • Hi Bin san,

    Than you for your advice.
    I tried disabling MUSB DMA with CONFIG_UMSB_PIO_ONLY=y.
    But, unfotunately, there was no effect.

    We are in a very cornered situation.
    Are there any other possible solutions?

    Best regards,
    Katsuyama

  • Hi Bin san,

    The following log remains in journal.log both before and after disabling MUSB DMA:

    Fullscreen
    1
    am335x-evm systemd-udevd[117]: 47400000.dma-controller: Process '/sbin/modprobe of:Ndma-controllerT(null)Cti,am3359-cppi41' failed with exit code 1.
    XXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXX

    Is there any relationship?


    Best regards,
    Katsuyama

  • Hi Katsuyama-san,

    Bin is out of office until the end of next week. Please expect a delayed response.

  • Hi Katsuyama-san,

    Sorry for my late response.

    The following log remains in journal.log both before and after disabling MUSB DMA:

    I am not sure what causes this message, it could be something in the root filesystem. But I won't worry about it, since the MUSB DMA driver should be built in kernel when it is enabled (CONFIG_USB_TI_CPPI41_DMA=y).

  • Hi Bin san,

    Thank you for your reply.

    I got following trace log when the problem occurred.

    Fullscreen
    1
    2
    3
    4
    5
    6
    7
    8
    9
    10
    11
    12
    13
    14
    15
    16
    17
    18
    19
    20
    21
    Aug 11 06:52:24 am335x-evm kernel: ------------[ cut here ]------------
    Aug 11 06:52:24 am335x-evm kernel: WARNING: CPU: 0 PID: 22742 at drivers/usb/musb/musb_host.c:113 musb_h_tx_flush_fifo+0x13c/0x140 [musb_hdrc]
    Aug 11 06:52:24 am335x-evm kernel: musb-hdrc musb-hdrc.1: Could not flush host TX2 fifo: csr: 2403
    Aug 11 06:52:24 am335x-evm kernel: Modules linked in: musb_dsps musb_hdrc cdc_ether usbnet cdc_acm iptable_nat xt_MASQUERADE nf_nat nf_conntrack nf_defrag_ipv6 libcrc32c nf_defrag_ipv4 ip_tables x_tables 8021q g_serial wl18xx wlcore mac80211 libarc4 cfg80211 omap_aes_driver omap_sham omap_crypto crypto_engine pm33xx ti_emif_sram c_can_platform c_can can_dev wlcore_sdio phy_am335x phy_generic wkup_m3_ipc phy_am335x_control omap_wdt wkup_m3_rproc uio_module_drv(O) uio cryptodev(O) [last unloaded: musb_hdrc]
    Aug 11 06:52:24 am335x-evm kernel: CPU: 0 PID: 22742 Comm: modem_status.sh Tainted: G O 5.10.100+ #1
    Aug 11 06:52:24 am335x-evm kernel: Hardware name: Generic AM33XX (Flattened Device Tree)
    Aug 11 06:52:24 am335x-evm kernel: [<c010d584>] (unwind_backtrace) from [<c010a12c>] (show_stack+0x10/0x14)
    Aug 11 06:52:24 am335x-evm kernel: [<c010a12c>] (show_stack) from [<c0a44b54>] (__warn+0xd4/0xec)
    Aug 11 06:52:24 am335x-evm kernel: [<c0a44b54>] (__warn) from [<c0a44c04>] (warn_slowpath_fmt+0x98/0xc8)
    Aug 11 06:52:24 am335x-evm kernel: [<c0a44c04>] (warn_slowpath_fmt) from [<bf24d254>] (musb_h_tx_flush_fifo+0x13c/0x140 [musb_hdrc])
    Aug 11 06:52:24 am335x-evm kernel: [<bf24d254>] (musb_h_tx_flush_fifo [musb_hdrc]) from [<bf24e6dc>] (musb_cleanup_urb+0x110/0x1fc [musb_hdrc])
    Aug 11 06:52:24 am335x-evm kernel: [<bf24e6dc>] (musb_cleanup_urb [musb_hdrc]) from [<bf24e8e4>] (musb_urb_dequeue+0x11c/0x150 [musb_hdrc])
    Aug 11 06:52:24 am335x-evm kernel: [<bf24e8e4>] (musb_urb_dequeue [musb_hdrc]) from [<c07570d4>] (usb_hcd_unlink_urb+0xac/0xc4)
    Aug 11 06:52:24 am335x-evm kernel: [<c07570d4>] (usb_hcd_unlink_urb) from [<c0758360>] (usb_poison_urb+0x60/0xdc)
    Aug 11 06:52:24 am335x-evm kernel: [<c0758360>] (usb_poison_urb) from [<bf03ce94>] (acm_poison_urbs+0x24/0x5c [cdc_acm])
    Aug 11 06:52:24 am335x-evm kernel: [<bf03ce94>] (acm_poison_urbs [cdc_acm]) from [<bf03d158>] (acm_port_shutdown+0x14/0xb8 [cdc_acm])
    Aug 11 06:52:24 am335x-evm kernel: [<bf03d158>] (acm_port_shutdown [cdc_acm]) from [<c05997c8>] (tty_port_shutdown+0x74/0xa0)
    Aug 11 06:52:24 am335x-evm kernel: [<c05997c8>] (tty_port_shutdown) from [<c0599d9c>] (tty_port_close+0x3c/0x7c)
    Aug 11 06:52:24 am335x-evm kernel: [<c0599d9c>] (tty_port_close) from [<c05916a0>] (tty_release+0xf0/0x4b0)
    Aug 11 06:52:24 am335x-evm kernel: [<c05916a0>] (tty_release) from [<c024536c>] (__fput+0x78/0x270)
    Aug 11 06:52:24 am335x-evm kernel: [<c024536c>] (__fput) from [<c0142a88>] (task_work_run+0x98/0xcc)
    XXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXX

    And I found your following patches for kernel 4.2.0 in 2016.

    patchwork.kernel.org/.../

    (1/2) marc.info/
    (2/2) marc.info/

    Will these solve our problem in kernel 5.10?
    If so, does this mean I don't need to disable MUSB DMA?

    Best regards,
    Katsuyama

  • Hi Katsuyama-san,

    I got following trace log when the problem occurred.

    This kernel WARNING should be harmless, it can be ignored.

    Have you tested with MUSB CPPI41DMA disabled to see if the issue still happens?

  • Hi Bin san,

    Thank you for your reply.

    This kernel WARNING should be harmless, it can be ignored.

    I see. Thank you for your answer.

    Have you tested with MUSB CPPI41DMA disabled to see if the issue still happens?

    Yes, I could have a good result, but not perfect.
    The combination with reloading musb_hdrc & musb_dsps before powering on the modem
    seems perfect.

    I checked the source code of the driver.
    The '(1/2) patch' has already been applied in kernel 5.10.
    So, I applied '(2/2) patch' and 'retries=10000' one and test with enabling MUSB DMA.
    But, the result was NG.

    Some people in my company and my customer don't want to disable MSUB DMA and
    reload the musb driver. And they expected a good result with these patches.

    So, can I get your comment on the following?
    1. These patches are valid or invalid?
    2. The mechanism which this problem occurs with MUSB DMA.
    3. Cons of disabling MUSB DMA.
    (I understand that transfer speeds will be slower.

    Best regards,
    Katsuyama

  • Hi Katsuyama-san,

    1. These patches are valid or invalid?

    As you said, the first patch (1/2) is already in the kernel, so it is valid. I will review the second patch and get back to you.

    2. The mechanism which this problem occurs with MUSB DMA.

    First of all, without further debug into the issue, I don't know if the issue is related to the MUSB DMA. That is why I asked you to test with the DMA disabled to see if the issue still happens or not.

    3. Cons of disabling MUSB DMA.
    (I understand that transfer speeds will be slower.

    Did you compared the transfer speed with and without MUSB DMA in your use case? What is the difference?