AM3352: NMI watchdog: BUG: soft lockup .....

Part Number: AM3352

Hello everybody,

We have developed a custom board based on the AM3352, running kernel 4.1.33.

At random intervals, sometimes after a couple of days and sometimes after only a few hours, the CPU is reported as stuck for more than 20 seconds.

The kernel prints the trace below on the console and, after several consecutive occurrences, the system appears to return to normal operation.

---------------------------

root@arm:~# [31280.144378] NMI watchdog: BUG: soft lockup - CPU#0 stuck for 23s! [swapper:0]
[31280.151591] Modules linked in: ppp_deflate bsd_comp ppp_async crc_ccitt ppp_generic slhc binfmt_misc ipt_MASQUERADE nf_nat_masquerade_ipv4 iptable_nat nf_nat_ipv4 nf_nat ipt_REJECT nf_reject_ipv4 xt_tcpudp nf_conntrack_ipv4 nf_defrag_ipv4 xt_conntrack nf_conntrack iptable_filter iptable_mangle ip_tables x_tables ti_am335x_adc kfifo_buf industrialio omap_sham omap_aes tilcdc omap_rng rng_core drm_kms_helper evdev rtc_m41t80 ti_am335x_tsc ti_am335x_tscadc leds_gpio uio_pdrv_genirq uio option usb_wwan usbserial usb_f_acm u_serial usb_f_rndis g_multi usb_f_mass_storage u_ether libcomposite
[31280.204326] CPU: 0 PID: 0 Comm: swapper Tainted: G W 4.1.33_APS_vdm-bone24 #22
[31280.212729] Hardware name: Generic AM33XX (Flattened Device Tree)
[31280.218868] task: c09ff330 ti: c09f8000 task.ti: c09f8000
[31280.224338] PC is at arch_cpu_idle+0x28/0x2c
[31280.228646] LR is at arch_cpu_idle+0x27/0x2c
[31280.232955] pc : [<c000ee00>] lr : [<c000edff>] psr: 400d0033
[31280.232955] sp : c09f9f98 ip : 70617773 fp : c09fb104
[31280.244501] r10: c0a96270 r9 : 00000001 r8 : c0aac228
[31280.249763] r7 : 00000000 r6 : 00000000 r5 : c09fb0f8 r4 : c09f8000
[31280.256331] r3 : c001b361 r2 : 00000000 r1 : 00000000 r0 : 00000001
[31280.262906] Flags: nZcv IRQs on FIQs on Mode SVC_32 ISA Thumb Segment kernel
[31280.270438] Control: 50c5387d Table: 8f4b0019 DAC: 00000015
[31280.276228] CPU: 0 PID: 0 Comm: swapper Tainted: G W 4.1.33_APS_vdm-bone24 #22
[31280.284626] Hardware name: Generic AM33XX (Flattened Device Tree)
[31280.290817] [<c0012e21>] (unwind_backtrace) from [<c0010fd5>] (show_stack+0x11/0x14)
[31280.298638] [<c0010fd5>] (show_stack) from [<c008de03>] (watchdog_timer_fn+0x10b/0x130)
[31280.306730] [<c008de03>] (watchdog_timer_fn) from [<c0061c31>] (__run_hrtimer+0x45/0x11c)
[31280.314981] [<c0061c31>] (__run_hrtimer) from [<c0061f1b>] (hrtimer_interrupt+0xbb/0x1bc)
[31280.323232] [<c0061f1b>] (hrtimer_interrupt) from [<c002102b>] (omap2_gp_timer_interrupt+0x23/0x28)
[31280.332364] [<c002102b>] (omap2_gp_timer_interrupt) from [<c005a385>] (handle_irq_event_percpu+0x6d/0x130)
[31280.342095] [<c005a385>] (handle_irq_event_percpu) from [<c005a469>] (handle_irq_event+0x21/0x2c)
[31280.351042] [<c005a469>] (handle_irq_event) from [<c005be0d>] (handle_level_irq+0x55/0x90)
[31280.359375] [<c005be0d>] (handle_level_irq) from [<c0059d6b>] (generic_handle_irq+0x23/0x2c)
[31280.367880] [<c0059d6b>] (generic_handle_irq) from [<c0059f03>] (__handle_domain_irq+0x3b/0x80)
[31280.376664] [<c0059f03>] (__handle_domain_irq) from [<c05e7c9b>] (__irq_svc+0x3b/0x5c)
[31280.384654] [<c05e7c9b>] (__irq_svc) from [<c000ee00>] (arch_cpu_idle+0x28/0x2c)
[31280.392123] [<c000ee00>] (arch_cpu_idle) from [<c004a465>] (cpu_startup_entry+0x17d/0x1c8)
[31280.400468] [<c004a465>] (cpu_startup_entry) from [<c0988a4f>] (start_kernel+0x337/0x340)

---------------------------
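To see how often the lockup is reported and how long each stall lasts, the console log can be scanned for the "soft lockup" lines. The following is only a minimal sketch in Python: the function name `find_soft_lockups` is hypothetical, and the pattern assumes the exact message format shown in the trace above.

```python
import re

# Matches report lines in the format seen in the trace above, e.g.
# [31280.144378] NMI watchdog: BUG: soft lockup - CPU#0 stuck for 23s! [swapper:0]
SOFT_LOCKUP_RE = re.compile(
    r"\[\s*(?P<ts>\d+\.\d+)\]\s+NMI watchdog: BUG: soft lockup - "
    r"CPU#(?P<cpu>\d+) stuck for (?P<secs>\d+)s! "
    r"\[(?P<comm>[^:\]]+):(?P<pid>\d+)\]"
)


def find_soft_lockups(log_text):
    """Return one dict per soft lockup report found in log_text."""
    events = []
    for m in SOFT_LOCKUP_RE.finditer(log_text):
        events.append({
            "uptime_s": float(m.group("ts")),  # kernel timestamp (seconds since boot)
            "cpu": int(m.group("cpu")),        # CPU number reported as stuck
            "stuck_s": int(m.group("secs")),   # reported stall duration in seconds
            "comm": m.group("comm"),           # task name running at the time
            "pid": int(m.group("pid")),        # PID of that task
        })
    return events


if __name__ == "__main__":
    sample = ("root@arm:~# [31280.144378] NMI watchdog: BUG: soft lockup - "
              "CPU#0 stuck for 23s! [swapper:0]")
    for event in find_soft_lockups(sample):
        print(event)
```

Running it over a saved console capture gives a list of timestamps that can be compared against other activity on the board at those moments.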

The CPU load average always remains very low, and memory usage never reaches 50%:

Mem: 105120K used, 142784K free, 0K shrd, 5313216K buff, 5313260K cached
CPU: 0% usr 2% sys 0% nic 97% idle 0% io 0% irq 0% sirq
Load average: 0.01 0.02 0.00 1/70 4156

Could you please advise me on how to debug and resolve this lockup?

Thank you,

Filippo