Part Number: PROCESSOR-SDK-AM335X
Hello,
I am supporting a product based on the AM335x processor. It runs Linux built from the AM335x Processor SDK version 4.02.00.09 (kernel 4.9.59), and I have observed an occasional kernel panic, as follows:
[1283423.745953] Unhandled fault: external abort on non-linefetch (0x1008) at 0xe00f9030
[1283423.754222] pgd = c0004000
[1283423.757255] [e00f9030] *pgd=9d824811, *pte=47401653, *ppte=47401453
[1283423.764067] Internal error: : 1008 [#1] PREEMPT ARM
[1283423.769384] Modules linked in: omap_serial_tpi(O) rnet(O) power_freq(O) actnet(O) ionet(O) arcnet_sohard(O) ti_am335x_adc(O) ti_am335x_tscadc(O) kfifo_buf industrialio bridge stp llc iptable_filter ip_tables cryptodev(O) [last unloaded: power_freq]
[1283423.792834] CPU: 0 PID: 24111 Comm: kworker/0:1 Tainted: G O 4.9.59-alc #1
[1283423.801441] Hardware name: Generic AM33XX (Flattened Device Tree)
[1283423.808058] Workqueue: pm pm_runtime_work
[1283423.812463] task: ddeda000 task.stack: d1fd8000
[1283423.817421] PC is at musb_default_readl+0x10/0x18
[1283423.822562] LR is at dsps_interrupt+0x50/0x30c
[1283423.827425] pc : [<c0524660>] lr : [<c0531668>] psr: 80030193
[1283423.827425] sp : d1fd9c48 ip : d1fd9c58 fp : d1fd9c54
[1283423.839871] r10: dd9d4710 r9 : c0c981e4 r8 : e00f9000
[1283423.845551] r7 : d1fd9cd4 r6 : ddcc6010 r5 : c085060c r4 : ddaf3740
[1283423.852601] r3 : 00010002 r2 : c0524650 r1 : e00f9030 r0 : e00f9000
[1283423.859657] Flags: Nzcv IRQs off FIQs on Mode SVC_32 ISA ARM Segment none
[1283423.867441] Control: 10c5387d Table: 9c0c0019 DAC: 00000051
[1283423.873667] Process kworker/0:1 (pid: 24111, stack limit = 0xd1fd8210)
Whenever this panic occurs, the PC is at musb_default_readl+0x10, as shown above. Tracing the dump shows that the fault occurs in the __raw_readl() call, which is reached indirectly through the musb_readl() call in the dsps_interrupt() function in musb_dsps.c. Debugging with ftrace indicates that the panic occurs when the USB peripheral is in a suspended (clock disabled) state at the time the ISR is called and the ISR attempts to read the USB status register via musb_readl(). This is consistent with the "Workqueue: pm pm_runtime_work" line in the dump.
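To narrow down which register the ISR was reading, the fault address can be compared with the mapped register base in the dump. The sketch below is only that arithmetic; it assumes r0 (0xe00f9000, also in r8) holds the ioremapped base passed to musb_default_readl(), and the helper name fault_offset() is hypothetical.

```c
#include <stdint.h>

/* Values copied from the panic log above. */
#define FAULT_ADDR  0xe00f9030u  /* "Unhandled fault ... at 0xe00f9030", also r1 */
#define MAPPED_BASE 0xe00f9000u  /* r0 / r8; assumed to be the ioremapped register base */

/*
 * Hypothetical helper: offset of the faulting access from the mapped base.
 * The result can be looked up against the wrapper register offsets
 * defined in musb_dsps.c.
 */
static uint32_t fault_offset(uint32_t fault_addr, uint32_t base)
{
    return fault_addr - base;
}
```

Calling fault_offset(FAULT_ADDR, MAPPED_BASE) yields 0x30, so the aborted read was at offset 0x30 from the mapped base.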
Disabling kernel runtime power management for the USB peripheral via /sys/bus/platform/drivers/musb-hdrc/musb-hdrc.X/power/control prevents the panic. However, if possible, I would like to understand and address the root cause, namely why the interrupt is raised and the ISR runs while the device is suspended, rather than rely on disabling power management for the device.
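For anyone wanting to apply the same workaround from a program rather than a shell, here is a minimal sketch. It relies on the standard runtime PM sysfs interface, where writing "on" to power/control keeps the device active and "auto" re-enables runtime PM. The function name set_runtime_pm_policy() is hypothetical, and the caller supplies the full path, since the musb-hdrc.X instance name depends on the system.

```c
#include <stdio.h>
#include <string.h>

/*
 * Hypothetical helper: write a runtime PM policy to a device's
 * power/control sysfs attribute.
 *   "on"   - keep the device active (runtime PM disabled)
 *   "auto" - allow the kernel to runtime-suspend the device
 * Returns 0 on success, -1 on an invalid policy or I/O error.
 */
static int set_runtime_pm_policy(const char *control_path, const char *policy)
{
    FILE *f;
    int rc = 0;

    /* Reject anything other than the two values this helper supports. */
    if (strcmp(policy, "on") != 0 && strcmp(policy, "auto") != 0)
        return -1;

    f = fopen(control_path, "w");
    if (!f)
        return -1;

    if (fputs(policy, f) == EOF)
        rc = -1;
    /* sysfs write errors may only surface when the stream is flushed. */
    if (fclose(f) == EOF)
        rc = -1;

    return rc;
}
```

It would be called with the device's power/control path and "on" (as root). The setting does not persist across reboots, so it has to be reapplied at startup.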
Any help or suggestions are appreciated.