Kernel panic does not force reboot on SDK7 3.12 kernel?

Hi guys

I have a headless embedded system running Debian Wheezy with the TI SDK7 3.12 kernel on an AM335x processor.

I have (or so I thought) set it up so that on kernel panic or oops it restarts after 5 seconds:


sysctl -w kernel.panic=5
sysctl -w kernel.panic_on_oops=1
sysctl -w vm.panic_on_oom=1

These are set in rcS.

Also I have the following in /etc/sysctl.conf:


kernel.panic = 5
kernel.panic_on_oops=1
vm.panic_on_oom=1


But to my surprise, I came back to a dead system. I plugged in my debugger, looked at the console log, and found it stuck in this state:



[ 326.179980] CPU: 0 PID: 4987 Comm: ifconfig Not tainted 3.12.10-svn14 #5
[ 326.187033] Backtrace:
[ 326.189621] <c00179fc> (dump_backtrace+0x0/0x10c) from <c0017b98> (show_stack+0x18/0x1c)
[ 326.198505] r6:c087b838 r5:c087b838 r4:c087bb20 r3:00000000
[ 326.204501] <c0017b80> (show_stack+0x0/0x1c) from <c05d5ee4> (dump_stack+0x20/0x28)
[ 326.212948] <c05d5ec4> (dump_stack+0x0/0x28) from <c0094894> (rcu_check_callbacks+0x2a0/0x6f8)
[ 326.222402] <c00945f4> (rcu_check_callbacks+0x0/0x6f8) from <c0051b60> (update_process_times+0x44/0x70)
[ 326.232678] <c0051b1c> (update_process_times+0x0/0x70) from <c00849fc> (tick_sched_handle+0x50/0x5c)
[ 326.242660] r7:ddf17c80 r6:c087b340 r5:00020fcd r4:9bcc8cae
[ 326.248657] <c00849ac> (tick_sched_handle+0x0/0x5c) from <c0084bbc> (tick_sched_timer+0x48/0x78)
[ 326.258287] <c0084b74> (tick_sched_timer+0x0/0x78) from <c0065f8c> (__run_hrtimer.isra.22+0x60/0xfc)
[ 326.268276] r7:00000000 r6:c0084b74 r5:c0879e00 r4:c087b340
[ 326.274272] <c0065f2c> (__run_hrtimer.isra.22+0x0/0xfc) from <c0066770> (hrtimer_interrupt+0x108/0x2f0)
[ 326.284535] r6:c0879e00 r5:00020fcd r4:9bcc8919 r3:00020fcd
[ 326.290533] <c0066668> (hrtimer_interrupt+0x0/0x2f0) from <c002b678> (omap2_gp_timer_interrupt+0x2c/0x3c)
[ 326.300984] <c002b64c> (omap2_gp_timer_interrupt+0x0/0x3c) from <c00766d4> (handle_irq_event_percpu+0x54/0x1b8)
[ 326.311982] <c0076680> (handle_irq_event_percpu+0x0/0x1b8) from <c0076890> (handle_irq_event+0x58/0x80)
[ 326.322249] <c0076838> (handle_irq_event+0x0/0x80) from <c0078eb4> (handle_level_irq+0x90/0x108)
[ 326.331871] r5:00000054 r4:dd806cc0
[ 326.335658] <c0078e24> (handle_level_irq+0x0/0x108) from <c0075f70> (generic_handle_irq+0x28/0x38)
[ 326.345464] r4:00000054 r3:c0078e24
[ 326.349250] <c0075f48> (generic_handle_irq+0x0/0x38) from <c00156c0> (handle_IRQ+0x38/0x8c)
[ 326.358415] r4:c0882f44 r3:00000110
[ 326.362200] <c0015688> (handle_IRQ+0x0/0x8c) from <c00087cc> (omap3_intc_handle_irq+0x68/0x7c)
[ 326.371633] r6:c08b6970 r5:ddf17c80 r4:fa200000 r3:00000080
[ 326.377628] <c0008764> (omap3_intc_handle_irq+0x0/0x7c) from <c05da580> (__irq_svc+0x40/0x74)
[ 326.386968] Exception stack(0xddf17c80 to 0xddf17cc8)
[ 326.392297] 7c80: 00000000 c08b7a40 00000000 00000100 00000202 00000054 c08b7a84 c08b7a80
[ 326.400914] 7ca0: ddf16000 ddf16000 00000001 ddf17d14 ddf17cc8 ddf17cc8 c004b300 c004b314
[ 326.409523] 7cc0: 20000113 ffffffff
[ 326.413197] r7:ddf17cb4 r6:ffffffff r5:20000113 r4:c004b314
[ 326.419193] <c004b288> (__do_softirq+0x0/0x1c4) from <c004b4ec> (do_softirq+0x54/0x60)
[ 326.427906] <c004b498> (do_softirq+0x0/0x60) from <c004b79c> (irq_exit+0xac/0xf4)
[ 326.436149] r4:ddf16000 r3:00000000
[ 326.439936] <c004b6f0> (irq_exit+0x0/0xf4) from <c00156c4> (handle_IRQ+0x3c/0x8c)
[ 326.448186] r4:c0882f44 r3:00000110
[ 326.451974] <c0015688> (handle_IRQ+0x0/0x8c) from <c00087cc> (omap3_intc_handle_irq+0x68/0x7c)
[ 326.461407] r6:c08b6970 r5:ddf17d88 r4:fa200000 r3:00000080
[ 326.467399] <c0008764> (omap3_intc_handle_irq+0x0/0x7c) from <c05da580> (__irq_svc+0x40/0x74)
[ 326.476749] Exception stack(0xddf17d88 to 0xddf17dd0)
[ 326.482072] 7d80: ddd6ec40 00000000 fa1cc000 c03a7c00 ddd6e800 ddd6ec40
[ 326.490689] 7da0: 00000001 00040081 00008914 ddf16000 00000001 ddf17de4 ddf17d98 ddf17dd0
[ 326.499300] 7dc0: c03a7c38 c03a7580 60000013 ffffffff
[ 326.504616] r7:ddf17dbc r6:ffffffff r5:60000013 r4:c03a7580
[ 326.510612] <c03a74e0> (c_can_close+0x0/0x114) from <c04f2bc0> (__dev_close_many+0x90/0xd8)
[ 326.519771] r5:ddf17e00 r4:ddd6e800
[ 326.523555] <c04f2b30> (__dev_close_many+0x0/0xd8) from <c04f2c38> (__dev_close+0x30/0x48)
[ 326.532621] r5:000000c0 r4:ddd6e800
[ 326.536408] <c04f2c08> (__dev_close+0x0/0x48) from <c04f70ac> (__dev_change_flags+0x90/0x140)
[ 326.545761] <c04f701c> (__dev_change_flags+0x0/0x140) from <c04f71ec> (dev_change_flags+0x18/0x50)
[ 326.555567] r7:00000000 r6:00000000 r5:00040081 r4:ddd6e800
[ 326.561562] <c04f71d4> (dev_change_flags+0x0/0x50) from <c0567634> (devinet_ioctl+0x620/0x6d8)
[ 326.571001] r6:00000000 r5:ddcdfe8c r4:00000000 r3:00008914
[ 326.576993] <c0567014> (devinet_ioctl+0x0/0x6d8) from <c056862c> (inet_ioctl+0x1b4/0x1c8)
[ 326.585981] <c0568478> (inet_ioctl+0x0/0x1c8) from <c04e16a0> (sock_ioctl+0x70/0x29c)
[ 326.594603] <c04e1630> (sock_ioctl+0x0/0x29c) from <c00f0c30> (do_vfs_ioctl+0x84/0x5c4)
[ 326.603395] r6:00008914 r5:ddb1b8c0 r4:00000000 r3:c04e1630
[ 326.609390] <c00f0bac> (do_vfs_ioctl+0x0/0x5c4) from <c00f11e4> (SyS_ioctl+0x74/0x84)
[ 326.618018] <c00f1170> (SyS_ioctl+0x0/0x84) from <c00147c0> (ret_fast_syscall+0x0/0x30)

and it didn't restart.

Any ideas why it didn't restart, and how I can ensure it does restart in situations like this?

  • Have you tried passing the "panic" parameter in bootargs?
    panic=5
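A minimal sketch of how one might confirm that the parameter actually reached the kernel, assuming a POSIX shell on the target. The helper name `panic_arg_from_cmdline` is made up for illustration; how `panic=5` gets appended to bootargs depends on your U-Boot environment and is not shown here.

```shell
#!/bin/sh
# Print the value of the last panic= parameter in a kernel command line string.
# Prints nothing if the parameter is absent. (Hypothetical helper name.)
panic_arg_from_cmdline() {
    found=""
    for arg in $1; do
        case "$arg" in
            # Matches only "panic=...", not e.g. "panic_on_oops=..."
            panic=*) found="${arg#panic=}" ;;
        esac
    done
    [ -n "$found" ] && printf '%s\n' "$found"
    return 0
}

# On the target, inspect the command line the running kernel booted with.
if [ -r /proc/cmdline ]; then
    value=$(panic_arg_from_cmdline "$(cat /proc/cmdline)")
    echo "panic= on kernel command line: ${value:-<not set>}"
fi
```

If the value is reported as not set, the bootargs change did not take effect and the kernel falls back to whatever sysctl sets later during boot.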
  • Hi,

    If you want to use "sysctl", you have to load the "sysctl.conf" file into the running OS. Check the current value first, then apply the file:

    cat /proc/sys/kernel/panic

    sysctl -p /etc/sysctl.conf

    After this, you can check the result through procfs:

    cat /proc/sys/kernel/panic

    You can also set the value directly:

    echo 5 > /proc/sys/kernel/panic
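The checks above can be rolled into one small script that compares the running values against the three settings from the original post. This is only a sketch: the function name `check_sysctl` is an assumption for illustration, and the expected values (5, 1, 1) are simply the ones quoted in the question.

```shell
#!/bin/sh
# Compare one procfs value against the expected value and report the result.
# $1 = file under /proc/sys, $2 = expected value. (Hypothetical helper name.)
check_sysctl() {
    if [ ! -r "$1" ]; then
        echo "MISSING $1"
        return 1
    fi
    actual=$(cat "$1")
    if [ "$actual" = "$2" ]; then
        echo "OK $1 = $actual"
    else
        echo "MISMATCH $1 = $actual (expected $2)"
        return 1
    fi
}

# Settings from the original post; "|| true" keeps going after a mismatch.
check_sysctl /proc/sys/kernel/panic 5 || true
check_sysctl /proc/sys/kernel/panic_on_oops 1 || true
check_sysctl /proc/sys/vm/panic_on_oom 1 || true
```

Running this late in boot (after rcS and sysctl.conf have been processed) shows whether something resets the values afterwards.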
  • Thanks guys!

    I had another unit that failed with the same problem. Upon closer inspection, I found a line that I missed copying the first time...which is a huge clue as to what happened:

    [   17.375650] INFO: rcu_preempt self-detected stall on CPU { 0}  (t=54852456 jiffies g=5901 c=5900 q=35)

    [   17.385483] CPU: 0 PID: 2988 Comm: ifconfig Not tainted 3.12.10-svn14 #5
    [   17.392539] Backtrace:
    [   17.395129] [<c00179fc>] (dump_backtrace+0x0/0x10c) from [<c0017b98>] (show_stack+0x18/0x1c)
    [   17.404023]  r6:c087b838 r5:c087b838 r4:c087bb20 r3:00000000
    [   17.410018] [<c0017b80>] (show_stack+0x0/0x1c) from [<c05d5ee4>] (dump_stack+0x20/0x28)
    [   17.418467] [<c05d5ec4>] (dump_stack+0x0/0x28) from [<c0094894>] (rcu_check_callbacks+0x2a0/0x6f8)
    [   17.427922] [<c00945f4>] (rcu_check_callbacks+0x0/0x6f8) from [<c0051b60>] (update_process_times+0x44/0x70)
    [   17.438192] [<c0051b1c>] (update_process_times+0x0/0x70) from [<c00849fc>] (tick_sched_handle+0x50/0x5c)
    [   17.448176]  r7:dc0e3c80 r6:c087b340 r5:0000f987 r4:0b91cbf7
    [   17.454173] [<c00849ac>] (tick_sched_handle+0x0/0x5c) from [<c0084bbc>] (tick_sched_timer+0x48/0x78)
    [   17.463856] [<c0084b74>] (tick_sched_timer+0x0/0x78) from [<c0065f8c>] (__run_hrtimer.isra.22+0x60/0xfc)
    [   17.473887]  r7:00000000 r6:c0084b74 r5:c0879e00 r4:c087b340
    [   17.479906] [<c0065f2c>] (__run_hrtimer.isra.22+0x0/0xfc) from [<c0066770>] (hrtimer_interrupt+0x108/0x2f0)
    [   17.490221]  r6:c0879e00 r5:0000f987 r4:0b91c863 r3:0000f987
    [   17.496244] [<c0066668>] (hrtimer_interrupt+0x0/0x2f0) from [<c002b678>] (omap2_gp_timer_interrupt+0x2c/0x3c)
    [   17.506747] [<c002b64c>] (omap2_gp_timer_interrupt+0x0/0x3c) from [<c00766d4>] (handle_irq_event_percpu+0x54/0x1b8)
    [   17.517799] [<c0076680>] (handle_irq_event_percpu+0x0/0x1b8) from [<c0076890>] (handle_irq_event+0x58/0x80)
    [   17.528116] [<c0076838>] (handle_irq_event+0x0/0x80) from [<c0078eb4>] (handle_level_irq+0x90/0x108)
    [   17.537778]  r5:00000054 r4:dd806cc0
    [   17.541580] [<c0078e24>] (handle_level_irq+0x0/0x108) from [<c0075f70>] (generic_handle_irq+0x28/0x38)
    [   17.551426]  r4:00000054 r3:c0078e24
    [   17.555229] [<c0075f48>] (generic_handle_irq+0x0/0x38) from [<c00156c0>] (handle_IRQ+0x38/0x8c)
    [   17.564438]  r4:c0882f44 r3:00000110
    [   17.568242] [<c0015688>] (handle_IRQ+0x0/0x8c) from [<c00087cc>] (omap3_intc_handle_irq+0x68/0x7c)
    [   17.577675]  r6:c08b6970 r5:dc0e3c80 r4:fa200000 r3:00000080
    [   17.583669] [<c0008764>] (omap3_intc_handle_irq+0x0/0x7c) from [<c05da580>] (__irq_svc+0x40/0x74)
    [   17.593023] Exception stack(0xdc0e3c80 to 0xdc0e3cc8)
    [   17.598353] 3c80: 00000000 c08b7a40 00000000 00000100 00000202 00000054 c08b7a84 c08b7a80
    [   17.606980] 3ca0: dc0e2000 dc0e2000 00000001 dc0e3d14 dc0e3cc8 dc0e3cc8 c004b300 c004b314
    [   17.615594] 3cc0: 20000113 ffffffff
    [   17.619265]  r7:dc0e3cb4 r6:ffffffff r5:20000113 r4:c004b314
    [   17.625262] [<c004b288>] (__do_softirq+0x0/0x1c4) from [<c004b4ec>] (do_softirq+0x54/0x60)
    [   17.633977] [<c004b498>] (do_softirq+0x0/0x60) from [<c004b79c>] (irq_exit+0xac/0xf4)
    [   17.642228]  r4:dc0e2000 r3:00000000
    [   17.646016] [<c004b6f0>] (irq_exit+0x0/0xf4) from [<c00156c4>] (handle_IRQ+0x3c/0x8c)
    [   17.654267]  r4:c0882f44 r3:00000110
    [   17.658054] [<c0015688>] (handle_IRQ+0x0/0x8c) from [<c00087cc>] (omap3_intc_handle_irq+0x68/0x7c)
    [   17.667488]  r6:c08b6970 r5:dc0e3d88 r4:fa200000 r3:00000080
    [   17.673481] [<c0008764>] (omap3_intc_handle_irq+0x0/0x7c) from [<c05da580>] (__irq_svc+0x40/0x74)
    [   17.682827] Exception stack(0xdc0e3d88 to 0xdc0e3dd0)
    [   17.688152] 3d80:                   ddd5ec40 00000000 fa1cc000 c03a7c00 ddd5e800 ddd5ec40
    [   17.696771] 3da0: 00000001 00040081 00008914 dc0e2000 00000001 dc0e3de4 dc0e3d98 dc0e3dd0
    [   17.705386] 3dc0: c03a7c38 c03a7580 60000013 ffffffff
    [   17.710708]  r7:dc0e3dbc r6:ffffffff r5:60000013 r4:c03a7580
    [   17.716705] [<c03a74e0>] (c_can_close+0x0/0x114) from [<c04f2bc0>] (__dev_close_many+0x90/0xd8)
    [   17.725866]  r5:dc0e3e00 r4:ddd5e800
    [   17.729651] [<c04f2b30>] (__dev_close_many+0x0/0xd8) from [<c04f2c38>] (__dev_close+0x30/0x48)
    [   17.738720]  r5:000000c0 r4:ddd5e800
    [   17.742506] [<c04f2c08>] (__dev_close+0x0/0x48) from [<c04f70ac>] (__dev_change_flags+0x90/0x140)
    [   17.751862] [<c04f701c>] (__dev_change_flags+0x0/0x140) from [<c04f71ec>] (dev_change_flags+0x18/0x50)
    [   17.761663]  r7:00000000 r6:00000000 r5:00040081 r4:ddd5e800
    [   17.767660] [<c04f71d4>] (dev_change_flags+0x0/0x50) from [<c0567634>] (devinet_ioctl+0x620/0x6d8)
    [   17.777096]  r6:00000000 r5:ddcdfe8c r4:00000000 r3:00008914
    [   17.783091] [<c0567014>] (devinet_ioctl+0x0/0x6d8) from [<c056862c>] (inet_ioctl+0x1b4/0x1c8)
    [   17.792087] [<c0568478>] (inet_ioctl+0x0/0x1c8) from [<c04e16a0>] (sock_ioctl+0x70/0x29c)
    [   17.800710] [<c04e1630>] (sock_ioctl+0x0/0x29c) from [<c00f0c30>] (do_vfs_ioctl+0x84/0x5c4)
    [   17.809505]  r6:00008914 r5:dc140b40 r4:00000000 r3:c04e1630
    [   17.815502] [<c00f0bac>] (do_vfs_ioctl+0x0/0x5c4) from [<c00f11e4>] (SyS_ioctl+0x74/0x84)
    [   17.824125] [<c00f1170>] (SyS_ioctl+0x0/0x84) from [<c00147c0>] (ret_fast_syscall+0x0/0x30)

    The first line:

    [   17.375650] INFO: rcu_preempt self-detected stall on CPU { 0}  (t=54852456 jiffies g=5901 c=5900 q=35)

    shows that it was an RCU stall detection that caused the lockup. I read a lot about it, and there is a way to suppress this. I'm not 100% sure whether this is a must-have feature, what disabling it would do to the system, or what issues it could cause.

    Here is what I found that suppresses it:

    echo 1 > /sys/module/rcutree/parameters/rcu_cpu_stall_suppress

    Can someone help me understand this feature, why I might need it turned on, and what I could run into if I suppress it?

    Also, if someone knows how to force a system reboot on RCU stall detection and could help me, that would be great!

    Thanks!
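Some context that may help with the questions above, offered as a hedged sketch rather than a definitive answer. The stall message is logged at INFO level; it is neither an oops nor a panic, which would explain why kernel.panic and kernel.panic_on_oops never triggered a reboot. As far as I understand, rcu_cpu_stall_suppress only silences the warning and does nothing about the underlying stall, which the backtrace suggests is a CPU stuck in c_can_close. Later mainline kernels have a kernel.panic_on_rcu_stall sysctl; I would not expect it on 3.12, so a hardware watchdog is probably the more realistic way to recover from a hang like this. The script below only reports which of these knobs exist on the running kernel; the helper name `show_knob` is made up for illustration.

```shell
#!/bin/sh
# Print "path = value" for a readable file, or note that it is absent.
# (Hypothetical helper name.)
show_knob() {
    if [ -r "$1" ]; then
        echo "$1 = $(cat "$1")"
    else
        echo "$1 (not present on this kernel)"
    fi
}

# 3.12 exposes the stall parameters under rcutree (as in the post above);
# newer kernels are assumed to expose them under rcupdate instead.
for mod in rcutree rcupdate; do
    show_knob "/sys/module/$mod/parameters/rcu_cpu_stall_suppress"
    show_knob "/sys/module/$mod/parameters/rcu_cpu_stall_timeout"
done

# Only expected on later kernels; likely absent on 3.12.
show_knob /proc/sys/kernel/panic_on_rcu_stall
```

If panic_on_rcu_stall turns out to be present, setting it to 1 together with kernel.panic=5 should turn a detected stall into a panic followed by a reboot.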