Hello,
If an IRQ request fails in the am65-cpsw-nuss driver, it causes a NULL pointer exception in the cleanup chain. The root cause of the kernel panic is a memset in the free method.
The actual crash happens in the free_netdev method:
https://git.ti.com/cgit/ti-linux-kernel/ti-linux-kernel/tree/net/core/dev.c?h=ti-linux-6.1.y#n10714
The crash is in the call to list_for_each_entry_safe, and I believe it is caused by the napi_tx list being overwritten with zeros by the memset.
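To illustrate the suspected mechanism, here is a minimal user-space sketch, not driver code. It assumes the zeroed structure embeds a list node that is still linked into the netdev's NAPI list. The names fake_tx_chn, list_add_node and walk_list_checked are hypothetical, and struct list_head is a simplified stand-in for the kernel type. The walker reports a NULL link instead of dereferencing it, which is the point where the kernel would oops.

```c
#include <stddef.h>
#include <string.h>

/* Simplified user-space stand-in for the kernel's struct list_head. */
struct list_head {
	struct list_head *next, *prev;
};

/* Same idea as INIT_LIST_HEAD(): an empty list points back at itself. */
static void init_list_head(struct list_head *head)
{
	head->next = head;
	head->prev = head;
}

/* Insert node right after head (same idea as the kernel's list_add()). */
static void list_add_node(struct list_head *node, struct list_head *head)
{
	node->next = head->next;
	node->prev = head;
	head->next->prev = node;
	head->next = node;
}

/* Hypothetical stand-in for a TX channel that embeds a linked list node. */
struct fake_tx_chn {
	int id;
	struct list_head dev_list;
};

/*
 * Walk the list and count nodes. Returns -1 if a NULL link is found,
 * which is where an unchecked kernel list walk would fault.
 */
static int walk_list_checked(const struct list_head *head)
{
	int n = 0;

	for (const struct list_head *pos = head->next; pos != head; pos = pos->next) {
		if (!pos)
			return -1;
		n++;
	}
	return n;
}
```

With one channel linked into the list, walk_list_checked() returns 1. After memset(&chn, 0, sizeof(chn)) the head still points at the zeroed node, whose next pointer is now NULL, so the walk returns -1. A zeroed node is not the same as an unlinked one.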
It is easy to reproduce by faking an IRQ request failure in am65_cpsw_nuss_ndev_add_tx_napi: just add "ret = -ENOMEM" or "ret = -EINVAL" after the call to devm_request_irq.
What is the reason for the memset? Can it be safely removed?
[ 2.562201] am65-cpsw-nuss c000000.ethernet: failure requesting tx0 irq 43, -12
[ 2.569525] am65-cpsw-nuss c000000.ethernet: Failed to add tx NAPI -12
[ 2.578000] Unable to handle kernel NULL pointer dereference at virtual address 0000000000000000
[ 2.586786] Mem abort info:
[ 2.589587] ESR = 0x0000000096000004
[ 2.593333] EC = 0x25: DABT (current EL), IL = 32 bits
[ 2.598636] SET = 0, FnV = 0
[ 2.601684] EA = 0, S1PTW = 0
[ 2.604819] FSC = 0x04: level 0 translation fault
[ 2.609689] Data abort info:
[ 2.612562] ISV = 0, ISS = 0x00000004
[ 2.616394] CM = 0, WnR = 0
[ 2.619351] [0000000000000000] user address but active_mm is swapper
[ 2.625695] Internal error: Oops: 0000000096000004 [#1] PREEMPT SMP
[ 2.631947] Modules linked in:
[ 2.634995] CPU: 0 PID: 9 Comm: kworker/u4:0 Not tainted 6.1.69-ti-g4a7ab3a0163e #1
[ 2.642636] Hardware name: Schneider Electric Automation Server Premium 3 (DT)
[ 2.649841] Workqueue: events_unbound deferred_probe_work_func
[ 2.655676] pstate: 60000005 (nZCv daif -PAN -UAO -TCO -DIT -SSBS BTYPE=--)
[ 2.662622] pc : free_netdev+0xc0/0x1b0
[ 2.666452] lr : free_netdev+0xc0/0x1b0
[ 2.670276] sp : ffff80000983bb10
[ 2.673579] x29: ffff80000983bb10 x28: 0000000000000000 x27: 0000000000000000
[ 2.680704] x26: ffff00000000c000 x25: ffff00000000900d x24: 0000000000000048
[ 2.687827] x23: 0000000000000022 x22: ffff80000983bb58 x21: ffff0000028fb050
[ 2.694950] x20: ffff0000028fb000 x19: fffffffffffffea8 x18: 0000000000000008
[ 2.702075] x17: 202c333420717269 x16: 0000000000000008 x15: 0000000000000001
[ 2.709197] x14: 0000000000000026 x13: 0000000000000399 x12: 0000000000000001
[ 2.716322] x11: 0000000000000000 x10: 00000000000009a0 x9 : ffff80000983b910
[ 2.723445] x8 : ffff0000000d7a00 x7 : ffff00005fbb7340 x6 : ffff8000094dad30
[ 2.730568] x5 : 0000000000000000 x4 : 0000000000000000 x3 : ffff800009566bf8
[ 2.737692] x2 : 0000000000000000 x1 : 0000000000000000 x0 : ffff0000000d7000
[ 2.744817] Call trace:
[ 2.747252] free_netdev+0xc0/0x1b0
[ 2.750731] devm_free_netdev+0x14/0x20
[ 2.754561] devres_release_all+0xa8/0x110
[ 2.758649] device_unbind_cleanup+0x18/0x70
[ 2.762909] really_probe+0x21c/0x2dc
[ 2.766561] __driver_probe_device+0x78/0x114
[ 2.770909] driver_probe_device+0xd8/0x15c
[ 2.775081] __device_attach_driver+0xb8/0x134
[ 2.779514] bus_for_each_drv+0x80/0xdc
[ 2.783339] __device_attach+0xa0/0x1a0
[ 2.787164] device_initial_probe+0x14/0x20
[ 2.791336] bus_probe_device+0x98/0xa0
[ 2.795161] deferred_probe_work_func+0x88/0xc0
[ 2.799680] process_one_work+0x1b0/0x320
[ 2.803685] worker_thread+0x220/0x430
[ 2.807428] kthread+0x104/0x10c
[ 2.810647] ret_from_fork+0x10/0x20
[ 2.814219] Code: 97fff8a5 9400631c 35fffee0 97d6f36a (f940ae62)
[ 2.820298] ---[ end trace 0000000000000000 ]---
