Tool/software:
We are facing a problem which the cpu usage of kworker becomes high (about 70%)
and some software stop working.
At this time, musb_urb_dequeque() in usdb_dhrc and rx_submit() in usbnet have high cpu usage in perf tool.
When we repeats following steps with USB connected LTE modem, the kworker's problem occurs once every 20 to 200 times.
And the problem is recovered if musb driver is reloaded as bellow.
--------------------
$ rmmod musb_dsps
$ rmmod musb_hdrc
$ modprobe musb_dsps
--------------------
Please tell me the cause of this problem and how to solve or prevent it.
- Here are the steps:
1. Set Modem power on.
2. Wait for serial ports to appears.
(/dev/ttyACM[0-3] appears about 10 sec after power on.)
3. Wait for a ready-to-use message from MODEM.
(^SYSTART messages is transmitted from MODEM when MODEM becomes ready.)
4. Send following AT commands to /dev/ttyACM0 to activate network.
---------------------------------------------------------
[2024-07-26 13:44:41.806] ATE1
[2024-07-26 13:44:41.806] OK
[2024-07-26 13:44:41.951] AT+CMEE=2
[2024-07-26 13:44:41.951] OK
[2024-07-26 13:44:42.088] AT^SGPSC="NMEA/Output","on"
[2024-07-26 13:44:42.088] ^SGPSC: "Nmea/Output","on"
[2024-07-26 13:44:42.088] OK
[2024-07-26 13:44:42.230] AT^SGPSC="Engine","1"
[2024-07-26 13:44:42.230] ^SGPSC: "Engine","1"
[2024-07-26 13:44:42.230] OK
[2024-07-26 13:44:42.359] AT+CPIN?
[2024-07-26 13:44:42.359] +CPIN: READY
[2024-07-26 13:44:42.359] OK
...
[2024-07-26 13:44:43.272] # make_line_active
[2024-07-26 13:44:43.481] AT+COPS?
[2024-07-26 13:44:43.481] +COPS: 0,0,"NTT DOCOMO",7
[2024-07-26 13:44:43.498] OK
[2024-07-26 13:44:43.608] AT+CEREG?
[2024-07-26 13:44:43.637] +CIEV: prov,1,"Commercial-DCM"
[2024-07-26 13:44:43.637] +CEREG: 0,1
[2024-07-26 13:44:43.637] OK
[2024-07-26 13:44:43.766] AT^SWWAN?
[2024-07-26 13:44:43.766] OK
[2024-07-26 13:44:43.878] AT+CREG?
[2024-07-26 13:44:43.878] +CREG: 0,1
[2024-07-26 13:44:43.893] OK
[2024-07-26 13:44:44.006] AT^SWWAN=1,1
[2024-07-26 13:44:44.075] OK
---------------------------------------------------------
5. Receive GPS NMEA sentences on /det/ttyS1 for about 50 seconds.
6. Set Modem power off.
7. repeat from step 1.
- Board: Custom board
- LTE modem interface: USB1 in host mode.
LTE network interface name is eth0.
- device tree:
&am33xx_pinmux { ... usb1_pins_default: usb1_pins_default { pinctrl-single,pins = < AM33XX_IOPAD(0xa34, PIN_OUTPUT | MUX_MODE0) /* (F15) USB1_DRVVBUS.USB1_DRVVBUS */ >; }; usb1_pins_sleep: usb1_pins_sleep { pinctrl-single,pins = < AM33XX_IOPAD(0xa34, PIN_OUTPUT | MUX_MODE7) /* (F15) USB1_DRVVBUS.gpio3_13 */ >; }; } &usb { status = "okay"; }; &usb_ctrl_mod { status = "okay"; }; &usb1_phy { status = "okay"; }; &usb1 { pinctrl-names = "default", "sleep"; pinctrl-0 = <&usb1_pins_default>; pinctrl-1 = <&usb1_pins_sleep>; status = "okay"; dr_mode = "host"; };
- result of 'top' command when the problem occurs:
top - 23:22:36 up 13:10, 2 users, load average: 10.90, 10.05, 8.71
Tasks: 89 total, 2 running, 87 sleeping, 0 stopped, 0 zombie
%Cpu(s): 14.7 us, 76.5 sy, 0.0 ni, 0.0 id, 0.0 wa, 0.0 hi, 8.8 si, 0.0 st
MiB Mem : 237.2 total, 111.5 free, 34.0 used, 91.7 buff/cache
MiB Swap: 0.0 total, 0.0 free, 0.0 used. 192.8 avail Mem
PID USER PR NI VIRT RES SHR S %CPU %MEM TIME+ COMMAND
5762 root 20 0 0 0 0 R 64.0 0.0 341:42.00 kworker/0:3+events
10 root 20 0 0 0 0 S 16.0 0.0 75:42.61 ksoftirqd/0
6798 root 20 0 3608 1784 1508 R 16.0 0.7 0:00.11 top
1 root 20 0 24728 5428 3864 S 0.0 2.2 1:58.37 systemd
2 root 20 0 0 0 0 S 0.0 0.0 0:00.08 kthreadd
3 root 0 -20 0 0 0 I 0.0 0.0 0:00.00 rcu_gp
4 root 0 -20 0 0 0 I 0.0 0.0 0:00.00 rcu_par_gp
8 root 0 -20 0 0 0 I 0.0 0.0 0:00.00 mm_percpu_wq
9 root 20 0 0 0 0 S 0.0 0.0 0:00.00 rcu_tasks_kthre
11 root 20 0 0 0 0 I 0.0 0.0 0:19.47 rcu_preempt
12 root 20 0 0 0 0 S 0.0 0.0 0:00.56 kdevtmpfs
13 root 0 -20 0 0 0 I 0.0 0.0 0:00.00 netns
- result of 'perf' command when the problem occurs:
This shows musb_urb_dequeque() in usdb_dhrc and rx_submit() in usbnet have high cpu usage.
----------------------------------------------------------------- Samples: 12K of event 'cycles', Event count (approx.): 354629522 Children Self Command Shared Object Symbol + 76.97% 0.00% kworker/0:6+eve [kernel.kallsyms] [k] ret_from_fork + 76.97% 0.00% kworker/0:6+eve [kernel.kallsyms] [k] kthread + 76.97% 0.00% kworker/0:6+eve [kernel.kallsyms] [k] worker_thread + 76.97% 0.00% kworker/0:6+eve [kernel.kallsyms] [k] process_one_work + 76.69% 0.00% kworker/0:6+eve [usbnet] [k] usbnet_deferred_kevent + 76.69% 0.00% kworker/0:6+eve [usbnet] [k] __handle_link_change.pa + 76.63% 1.00% kworker/0:6+eve [usbnet] [k] unlink_urbs.constprop.0 + 72.65% 0.96% kworker/0:6+eve [kernel.kallsyms] [k] usb_hcd_unlink_urb + 71.03% 27.84% kworker/0:6+eve [musb_hdrc] [k] musb_urb_dequeue + 30.17% 0.25% kworker/0:6+eve [kernel.kallsyms] [k] __softirqentry_text_sta + 30.17% 0.00% kworker/0:6+eve [kernel.kallsyms] [k] __irq_svc + 30.17% 0.00% kworker/0:6+eve [kernel.kallsyms] [k] __handle_domain_irq + 30.17% 0.00% kworker/0:6+eve [kernel.kallsyms] [k] irq_exit + 28.49% 0.08% kworker/0:6+eve [kernel.kallsyms] [k] tasklet_action_common.c + 28.41% 0.13% kworker/0:6+eve [usbnet] [k] usbnet_bh + 28.28% 0.19% kworker/0:6+eve [usbnet] [k] rx_alloc_submit + 28.08% 27.81% kworker/0:6+eve [usbnet] [k] rx_submit + 13.04% 0.15% kworker/0:6+eve [kernel.kallsyms] [k] preempt_schedule_common + 12.94% 0.25% kworker/0:6+eve [kernel.kallsyms] [k] __sched_text_start + 12.69% 12.68% kworker/0:6+eve [kernel.kallsyms] [k] finish_task_switch + 5.13% 0.00% ksoftirqd/0 [kernel.kallsyms] [k] ret_from_fork + 5.13% 0.00% ksoftirqd/0 [kernel.kallsyms] [k] kthread + 5.13% 0.40% ksoftirqd/0 [kernel.kallsyms] [k] smpboot_thread_fn + 4.52% 0.27% ksoftirqd/0 [kernel.kallsyms] [k] schedule -----------------------------------------------------------------
- journal log around the problem:
Jul 25 16:24:42 am335x-evm kernel: usb 1-1: new high-speed USB device number 28 using musb-hdrc Jul 25 16:24:42 am335x-evm kernel: usb 1-1: New USB device found, idVendor=1e2d, idProduct=0069, bcdDevice= 0.00 Jul 25 16:24:42 am335x-evm kernel: usb 1-1: New USB device strings: Mfr=4, Product=3, SerialNumber=5 Jul 25 16:24:42 am335x-evm kernel: usb 1-1: Product: PLSx3 Jul 25 16:24:42 am335x-evm kernel: usb 1-1: Manufacturer: Cinterion Wireless Modules Jul 25 16:24:42 am335x-evm kernel: usb 1-1: SerialNumber: 4a722271 Jul 25 16:24:42 am335x-evm kernel: cdc_acm 1-1:1.0: ttyACM0: USB ACM device Jul 25 16:24:42 am335x-evm kernel: cdc_acm 1-1:1.2: ttyACM1: USB ACM device Jul 25 16:24:42 am335x-evm kernel: cdc_acm 1-1:1.4: ttyACM2: USB ACM device Jul 25 16:24:42 am335x-evm kernel: cdc_acm 1-1:1.6: ttyACM3: USB ACM device Jul 25 16:24:42 am335x-evm connmand[715]: eth0 {create} index 284 type 1 <ETHER> Jul 25 16:24:42 am335x-evm connmand[715]: eth0 {update} flags 4098 <DOWN> Jul 25 16:24:42 am335x-evm connmand[715]: eth0 {newlink} index 284 address 00:A0:C6:22:27:10 mtu 1500 Jul 25 16:24:42 am335x-evm systemd-networkd[718]: eth0: Link UP Jul 25 16:24:42 am335x-evm connmand[715]: eth0 {newlink} index 284 operstate 2 <DOWN> Jul 25 16:24:42 am335x-evm systemd-networkd[718]: eth0: Gained carrier Jul 25 16:24:42 am335x-evm connmand[715]: Adding interface eth0 [ ethernet ] Jul 25 16:24:42 am335x-evm kernel: cdc_ether 1-1:1.8 eth0: register 'cdc_ether' at usb-musb-hdrc.1-1, CDC Ethernet Device, 00:a0:c6:22:27:10 Jul 25 16:24:42 am335x-evm connmand[715]: eth0 {update} flags 102467 <UP,RUNNING,LOWER_UP> Jul 25 16:24:42 am335x-evm connmand[715]: eth0 {newlink} index 284 address 00:A0:C6:22:27:10 mtu 1500 Jul 25 16:24:42 am335x-evm connmand[715]: eth0 {newlink} index 284 operstate 0 <UNKNOWN> Jul 25 16:24:42 am335x-evm connmand[715]: eth0 {add} route fe80:: gw :: scope 0 <UNIVERSE> Jul 25 16:24:43 am335x-evm connmand[715]: eth0 {RX} 0 packets 0 bytes Jul 25 16:24:43 am335x-evm connmand[715]: eth0 {TX} 2 packets 700 bytes Jul 25 16:24:43 am335x-evm connmand[715]: eth0 {update} flags 36867 <UP> Jul 25 16:24:43 am335x-evm systemd-networkd[718]: eth0: Lost carrier Jul 25 16:24:43 am335x-evm connmand[715]: eth0 {newlink} index 284 address 00:A0:C6:22:27:10 mtu 1500 Jul 25 16:24:43 am335x-evm connmand[715]: eth0 {newlink} index 284 operstate 2 <DOWN> Jul 25 16:24:43 am335x-evm connmand[715]: eth0 {del} route fe80:: gw :: scope 0 <UNIVERSE> Jul 25 16:24:43 am335x-evm systemd-udevd[6709]: 1-1:1.1: Process '/sbin/modprobe usb:v1E2Dp0069d0000dcEFdsc02dp01ic0Aisc00ip00in01' failed with exit code 1. Jul 25 16:24:43 am335x-evm systemd-udevd[6711]: 1-1:1.3: Process '/sbin/modprobe usb:v1E2Dp0069d0000dcEFdsc02dp01ic0Aisc00ip00in03' failed with exit code 1. Jul 25 16:24:43 am335x-evm systemd-udevd[6716]: 1-1:1.7: Process '/sbin/modprobe usb:v1E2Dp0069d0000dcEFdsc02dp01ic0Aisc00ip00in07' failed with exit code 1. Jul 25 16:24:43 am335x-evm systemd-udevd[6713]: 1-1:1.5: Process '/sbin/modprobe usb:v1E2Dp0069d0000dcEFdsc02dp01ic0Aisc00ip00in05' failed with exit code 1. Jul 25 16:24:43 am335x-evm systemd-udevd[6718]: 1-1:1.9: Process '/sbin/modprobe usb:v1E2Dp0069d0000dcEFdsc02dp01ic0Aisc00ip00in09' failed with exit code 1. Jul 25 16:24:43 am335x-evm systemd-udevd[6717]: Using default interface naming scheme 'v243'. Jul 25 16:24:43 am335x-evm systemd-udevd[6717]: ethtool: autonegotiation is unset or enabled, the speed and duplex are not writable. Jul 25 16:24:44 am335x-evm systemd-networkd[718]: eth0: IPv6 successfully enabled Jul 25 16:24:44 am335x-evm connmand[715]: eth0 {RX} 0 packets 0 bytes Jul 25 16:24:44 am335x-evm connmand[715]: eth0 {TX} 2 packets 700 bytes Jul 25 16:24:44 am335x-evm connmand[715]: eth0 {newlink} index 284 address 00:A0:C6:22:27:10 mtu 1500 Jul 25 16:24:44 am335x-evm connmand[715]: eth0 {newlink} index 284 operstate 2 <DOWN>
Thanks