AM335x Kernel Panic on heavy network traffic

Hi,

I'm using the linux-3.2.0-psp04.06.00.11 SDK.

When the AM335x is under heavy network traffic (receiving 13 Mbytes per second and sending 13 Mbytes per second), the kernel sometimes panics. Any help would be appreciated.

Here is the panic log:

Unable to handle kernel NULL pointer dereference at virtual address 00000084
pgd = db184000
[00000084] *pgd=9b166831, *pte=00000000, *ppte=00000000
Internal error: Oops: 817 [#1]
Modules linked in:
CPU: 0 Not tainted (3.2.0 #31)
PC is at __skb_recv_datagram+0x120/0x244
LR is at udp_recvmsg+0x70/0x314
pc : [<c0252f3c>] lr : [<c02983ec>] psr: 600e0093
sp : dc7abd40 ip : c029837c fp : dc3f7a80
r10: dc7abd98 r9 : 00000000 r8 : c0253088
r7 : dc7aa000 r6 : 00000040 r5 : dc3f7abc r4 : dc7c9a40
r3 : 00000000 r2 : 200e0013 r1 : 00000000 r0 : 00000080
Flags: nZCv IRQs off FIQs on Mode SVC_32 ISA ARM Segment kernel
Control: 10c5387d Table: 9b184019 DAC: 00000015
Process NetworkTest (pid: 429, stack limit = 0xdc7aa2e8)
Stack: (0xdc7abd40 to 0xdc7ac000)
bd40: 00000008 dc7abd9c 00000000 c02532b0 00000000 dc7abeec 00000584 00000584
bd60: 00000010 c0481024 dc7abef4 dc3f7a80 00000040 00000000 000005dc c0491000
bd80: c0491000 c02983ec dd1f5c80 0000052e 00000000 dc7abf10 00000000 0000052e
bda0: dd23da80 c0481024 dc7abef4 00000000 00000000 00000001 c0330058 dc7abdf8
bdc0: dc7abef4 c02a00e0 00000040 00000000 dc7abddc 00000001 00000001 00000010
bde0: c02a00a0 ffffffff 00000000 c0248408 00000040 00000006 c0468d5c 0000005e
be00: 00000040 000005dc dc8f1300 c0465538 00000000 dc7abef4 c0468d5c c005cd38
be20: c0475374 c005a6ec 0000018c c000ec14 00000004 0000005e dc7abe58 c0008590
be40: 00000000 00000001 ffffffff 00000000 00000000 00000000 00000000 00000000
be60: db8519c0 c0013b9c 00000000 00000000 00000001 db27cbe0 dc7abdf8 c024e92c
be80: db8aa840 c01e45b4 c046770c c005b7ec c046770c 0000002b 00000000 c005b854
bea0: 00000040 600e0113 c04901ac 00000010 bec0c7ac dc7abf10 bec0c7d0 000005dc
bec0: dc8f1300 00000040 dc7abf10 bec0c7d0 dc7aa000 00400000 00392ef4 c024a28c
bee0: c048718c fffffff7 00000001 bec0c7d0 000005dc dc7abf10 00000080 dc7abeec
bf00: 00000001 00000000 00000000 c0487100 1e270002 8301a8c0 00000000 00000000
bf20: 0000005d 00000000 fa200000 800e001f c0475374 0000005d 00000000 c000ec18
bf40: 00000004 0000005d dc7abf68 c0008590 4862b92c c000de40 400e0013 ffffffff
bf60: dc7abf9c c000d980 00000019 bec0c7d0 000005dc 00000000 4ce2bcc8 bec0c7ac
bf80: 4ca7ef98 00000124 800e001f 0006e3d5 4cec37c8 bec0c7ac 4ca7ef98 00000124
bfa0: c000dee8 c000dd40 4cec37c8 bec0c7ac 0000001f bec0c7d0 000005dc 00000000
bfc0: 4cec37c8 bec0c7ac 4ca7ef98 00000124 000005dc 4cec8798 4cec37c8 00392ef4
bfe0: 00000000 bec0c790 4862b91c 4862b92c 800e001f 0000001f 78969215 6a58c6d7
[<c0252f3c>] (__skb_recv_datagram+0x120/0x244) from [<c02983ec>] (udp_recvmsg+0x70/0x314)
[<c02983ec>] (udp_recvmsg+0x70/0x314) from [<c02a00e0>] (inet_recvmsg+0x40/0x54)
[<c02a00e0>] (inet_recvmsg+0x40/0x54) from [<c0248408>] (sock_recvmsg+0x9c/0xc0)
[<c0248408>] (sock_recvmsg+0x9c/0xc0) from [<c024a28c>] (sys_recvfrom+0x84/0xdc)
[<c024a28c>] (sys_recvfrom+0x84/0xdc) from [<c000dd40>] (ret_fast_syscall+0x0/0x30)
Code: e58b3044 e8940009 e5841000 e5841004 (e5803004)
---[ end trace c5ceed4d6fe8b90f ]---
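
A note on reading the oops above: the word in parentheses on the `Code:` line is the instruction that faulted. The sketch below uses a hypothetical helper, `decode_single_transfer` (not part of any kernel tool), to decode it and recompute the accessed address from the register dump. It assumes the ARM (A32) single data transfer encoding and only handles the immediate-offset form of LDR/STR.

```python
def decode_single_transfer(word):
    """Decode an A32 LDR/STR with a 12-bit immediate offset (simplified)."""
    # Bits 27:26 must be 0b01; bit 25 set would mean a register offset.
    if ((word >> 26) & 0b11) != 0b01 or ((word >> 25) & 1):
        raise ValueError("not an immediate-offset LDR/STR")
    load = (word >> 20) & 1       # L bit: 1 = load, 0 = store
    byte = (word >> 22) & 1       # B bit: byte access
    up = (word >> 23) & 1         # U bit: add (1) or subtract (0) the offset
    pre = (word >> 24) & 1        # P bit: offset applied before the access
    imm = word & 0xFFF
    return {
        "op": ("ldr" if load else "str") + ("b" if byte else ""),
        "rd": (word >> 12) & 0xF,
        "rn": (word >> 16) & 0xF,
        "offset": imm if up else -imm,
        "pre": bool(pre),
    }


def accessed_address(insn, regs):
    """Address touched by the instruction, given a {reg_number: value} dump."""
    base = regs[insn["rn"]]
    if insn["pre"]:
        base += insn["offset"]
    return base & 0xFFFFFFFF


# Values copied from the first oops: Code: ... (e5803004), r0 = 00000080, r3 = 00000000
insn = decode_single_transfer(0xE5803004)
regs = {0: 0x00000080, 3: 0x00000000}
fault = accessed_address(insn, regs)
print(f"{insn['op']} r{insn['rd']}, [r{insn['rn']}, #{insn['offset']}] -> {fault:#010x}")
```

Under these assumptions the instruction is a store of r3 to r0 + 4. With r0 = 00000080 from the dump, that is address 0x84, which agrees with the reported fault address 00000084.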

  • Hi Simon,

    Could you post some details about the hardware environment: processor model, speed, Ethernet interface, speed and PHY used?

  • Hi Biser,

    Thank you for your reply, and sorry for the late response.

    Here is our hardware environment:

    Processor: AM3354BZCZ100 running at 1000 MHz

    Network: we use RGMII to connect directly to an RTL8370 network switch. The link speed is always 1 Gbit/s.

    I've traced the panic back through the source code and found that it occurs in the function __skb_unlink.

    skb->next and skb->prev are both NULL, but list->qlen still shows 50 packets.

    It looks as if the skb was overwritten by something. I don't know whether it is the cpsw driver or the DMA.

     

    Thank you 

    Simon
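
To make the `__skb_unlink` observation above concrete, here is a small Python model of a circular doubly linked queue with a length counter. It is only a sketch: the class and function names (`Skb`, `SkbQueue`, `queue_tail`, `unlink`) are made up for illustration, and the unlink steps follow the order I believe the 3.2 kernel uses (decrement `qlen`, read `next`/`prev`, clear them on the skb, then patch the neighbours). It shows why an skb whose `next` and `prev` are already NULL cannot be unlinked, and how the counter drifts from the real list when that happens.

```python
class Skb:
    def __init__(self, name):
        self.name = name
        self.next = None
        self.prev = None


class SkbQueue:
    """Sentinel head of a circular doubly linked list, plus a length counter."""
    def __init__(self):
        self.next = self
        self.prev = self
        self.qlen = 0


def queue_tail(q, skb):
    """Append skb at the tail of q."""
    skb.next = q
    skb.prev = q.prev
    q.prev.next = skb
    q.prev = skb
    q.qlen += 1


def unlink(q, skb):
    """Model of the unlink steps; fails if skb.next is None (NULL)."""
    q.qlen -= 1
    nxt, prv = skb.next, skb.prev
    skb.next = skb.prev = None
    nxt.prev = prv      # kernel: next->prev = prev, a store to NULL + offset
    prv.next = nxt


def linked_count(q):
    """Count the nodes actually reachable from the head."""
    n, node = 0, q.next
    while node is not q:
        n, node = n + 1, node.next
    return n


q = SkbQueue()
skbs = [Skb(f"skb{i}") for i in range(3)]
for s in skbs:
    queue_tail(q, s)

unlink(q, skbs[0])              # normal unlink: pointers on skb0 are cleared
second_unlink_failed = False
try:
    unlink(q, skbs[0])          # skb0 now has next == prev == None
except AttributeError:
    second_unlink_failed = True

print(second_unlink_failed, q.qlen, linked_count(q))
```

In this model, an skb with both pointers cleared looks exactly like one that has already been unlinked once. That is only one way to arrive at the state described above; memory being overwritten by the driver or by DMA, as suspected in the post, would produce the same picture, so the sketch does not identify the cause.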

  • Hi Simon,

    I have asked for help on this.

  • Hi Biser,

    I got several more panics; sometimes it takes only a few minutes to hit one.

    Panic 1:

    [ 3703.015584] Unable to handle kernel paging request at virtual address 3e0029d8
    [ 3703.023189] pgd = da404000
    [ 3703.026013] [3e0029d8] *pgd=00000000
    [ 3703.029759] Internal error: Oops: 5 [#1]
    [ 3703.033855] Modules linked in:
    [ 3703.037047] CPU: 0 Not tainted (3.2.0-kn8 #606)
    [ 3703.042160] PC is at cpsw_rx_handler+0x8/0x16c
    [ 3703.046804] LR is at cpdma_chan_process+0x150/0x1e4
    [ 3703.051901] pc : [<c0211f24>] lr : [<c020fc28>] psr: 200f0113
    [ 3703.051905] sp : da3f9ebc ip : 0000001f fp : c04b4db0
    [ 3703.063902] r10: dc5cd380 r9 : 3e0029c4 r8 : 000005f2
    [ 3703.069359] r7 : 00000001 r6 : 3e0029c4 r5 : ffc07f60 r4 : dc5d8940
    [ 3703.076180] r3 : c0211f1c r2 : 00010000 r1 : 000005f2 r0 : 3e0029c4
    [ 3703.083002] Flags: nzCv IRQs on FIQs on Mode SVC_32 ISA ARM Segment kernel
    [ 3703.090640] Control: 10c5387d Table: 9a404019 DAC: 00000015
    [ 3703.096644] Process KN8140 (pid: 497, stack limit = 0xda3f82e8)
    [ 3703.102828] Stack: (0xda3f9ebc to 0xda3fa000)
    [ 3703.107380] 9ea0: dc5d8940
    [ 3703.115930] 9ec0: ffc07f60 00010000 00000001 c020fc28 c0502e3c 00000040 da3f9ec8 dc5dcbb0
    [ 3703.124479] 9ee0: 00000040 0000012c 00000040 c04ef180 c04cd710 000530e2 c04ef188 c02140fc
    [ 3703.133029] 9f00: 00000001 dc5dcbb0 0000012c 00000040 c04ef180 c0295bec c04ef180 c050af28
    [ 3703.141578] 9f20: 00000000 00000001 c0500b8c 0000000c da3f8000 c04cc678 da3f8008 00000100
    [ 3703.150127] 9f40: 00000003 c0034318 da3f9f84 c000d9c0 c04d0eac c0500b00 0000000a c04cc884
    [ 3703.158676] 9f60: 3d3b1424 c04de468 0000005d 00000000 fa200000 3d3b1424 00646568 0063c494
    [ 3703.167226] 9f80: 00000006 c000ec58 00000004 0000005d da3f9fb0 c0008590 4cab4f5c 485726b0
    [ 3703.175775] 9fa0: 200f001f ffffffff 7c8b5756 c000db3c 4cab4f5c 4cab5388 0000054c 40f82435
    [ 3703.184325] 9fc0: 0015ff00 5e0040b0 5553c3c9 7c8b5756 3d3b1424 00646568 0063c494 00000006
    [ 3703.192875] 9fe0: 0040f940 beffef74 0086830f 485726b0 200f001f ffffffff ddbfdcbf ffdffeff
    [ 3703.201438] [<c0211f24>] (cpsw_rx_handler+0x8/0x16c) from [<da3f9ec8>] (0xda3f9ec8)
    [ 3703.209460] Code: e3a00001 eaffffe6 e92d40f0 e1a06000 (e5904014)
    [ 3703.223530] ---[ end trace bc7256a29d66915f ]---
    [ 3703.228361] Kernel panic - not syncing: Fatal exception in interrupt
    [ 3703.235023] [<c001336c>] (unwind_backtrace+0x0/0xec) from [<c034c814>] (panic+0x68/0x184)
    [ 3703.243594] [<c034c814>] (panic+0x68/0x184) from [<c00115d0>] (die+0x2d4/0x35c)
    [ 3703.251245] [<c00115d0>] (die+0x2d4/0x35c) from [<c034c694>] (__do_kernel_fault.part.5+0x64/0x74)
    [ 3703.260536] [<c034c694>] (__do_kernel_fault.part.5+0x64/0x74) from [<c00143d8>] (do_bad_area+0x0/0x9c)
    [ 3703.270273] [<c00143d8>] (do_bad_area+0x0/0x9c) from [<da3f9e70>] (0xda3f9e70)

    Panic 2:

    [ 3470.046384] Unable to handle kernel NULL pointer dereference at virtual address 00000004
    [ 3470.054858] pgd = dafa0000
    [ 3470.057681] [00000004] *pgd=9af85831, *pte=00000000, *ppte=00000000
    [ 3470.064244] Internal error: Oops: 817 [#1]
    [ 3470.068521] Modules linked in:
    [ 3470.071714] CPU: 0 Not tainted (3.2.0-kn8 #593)
    [ 3470.076823] PC is at __skb_recv_datagram+0x120/0x244
    [ 3470.082019] LR is at udp_recvmsg+0x70/0x314
    [ 3470.086389] pc : [<c028f65c>] lr : [<c02df1fc>] psr: 600e0093
    [ 3470.086393] sp : daf97d30 ip : c02df18c fp : dd7fdcc0
    [ 3470.098415] r10: daf97d88 r9 : 00000000 r8 : c028f7a8
    [ 3470.103883] r7 : daf96000 r6 : 00000040 r5 : dd7fdcfc r4 : db235d80
    [ 3470.110717] r3 : 00000000 r2 : 200e0013 r1 : 00000000 r0 : 00000000
    [ 3470.117554] Flags: nZCv IRQs off FIQs on Mode SVC_32 ISA ARM Segment kernel
    [ 3470.125299] Control: 10c5387d Table: 9afa0019 DAC: 00000015
    [ 3470.131315] Process KN8140 (pid: 498, stack limit = 0xdaf962e8)
    [ 3470.137512] Stack: (0xdaf97d30 to 0xdaf98000)
    [ 3470.142074] 7d20: 00000008 daf97d8c 00000000 c028f9d0
    [ 3470.150642] 7d40: dc5ce340 daf97ed8 00000000 00000020 dbad6200 c04eba54 daf97ee0 dd7fdcc0
    [ 3470.159210] 7d60: 00000040 00000000 00000640 c050adc0 c050adc0 c02df1fc c04b4db0 00000574
    [ 3470.167777] 7d80: 00000000 daf97efc 00000000 00000574 ffd8eba0 c04eba54 daf97ee0 00000000
    [ 3470.176344] 7da0: 00000000 00000001 c0380c04 daf97de8 daf97ee0 c02e7058 00000040 00000000
    [ 3470.184912] 7dc0: daf97dcc a00e0113 00000000 00000010 c02e7018 ffffffff 00000000 c02848f4
    [ 3470.193479] 7de0: 00000040 c0029604 000000c3 dbd32480 00000040 00000640 dc8f8b40 c0288d30
    [ 3470.202045] 7e00: 00000000 daf97ee0 00000000 dc6cf680 dd144480 c02df5b4 00000003 dc6cf680
    [ 3470.210613] 7e20: 00000011 c02e19f4 00000006 c04b327c 00000000 00000001 ffffffff 00000000
    [ 3470.219179] 7e40: 00000000 00000000 00000000 00000000 dc56e100 c04b2600 00000000 00000000
    [ 3470.227746] 7e60: dc5d88c0 c0013668 daf97de8 dc5d88c0 dc5ce340 c020fdf0 00000000 00000020
    [ 3470.236313] 7e80: dd380080 00000042 ffffffff dc5dc800 000006fe dc6cf680 00000590 00000010
    [ 3470.244880] 7ea0: beffdf68 00000040 4ce98348 c04b2048 00000640 dc8f8b40 00000040 4ce98348
    [ 3470.253447] 7ec0: beffdf68 4ce9938d daf97efc c02868ac fffffff7 00000001 4ce9938d 00000640
    [ 3470.262015] 7ee0: daf97efc 00000080 daf97ed8 00000001 00000000 00000000 00000004 1c270002
    [ 3470.270582] 7f00: 8301a8c0 00000000 00000000 00000040 c04ef2c0 c02959b4 c04ef2c0 c050b068
    [ 3470.279150] 7f20: 00000000 00000001 c0500ccc 0000000c daf96000 c04cc7b8 daf96008 00000100
    [ 3470.287717] 7f40: 00000001 c0034330 00000000 c0008590 c04d0fec c0500c40 0000000a c04cc9c4
    [ 3470.296284] 7f60: 52cc321c c04de5a8 0000005d 00000000 fa200000 52cc321c c04de5a8 f4c18d40
    [ 3470.304851] 7f80: 00000000 4ce98348 beffdf68 beffdf68 00000124 c000df28 daf96000 00400000
    [ 3470.313418] 7fa0: 00552f54 c000dd80 4ce98348 beffdf68 0000001e 4ce9938d 00000640 00000000
    [ 3470.321984] 7fc0: 4ce98348 beffdf68 beffdf68 00000124 4ce98348 00000640 4ce9938d 00552f54
    [ 3470.330552] 7fe0: 00000000 beffdf50 4862b91c 4862b92c 800e001f 0000001e 00000000 00000000
    [ 3470.339131] [<c028f65c>] (__skb_recv_datagram+0x120/0x244) from [<c02df1fc>] (udp_recvmsg+0x70/0x314)
    [ 3470.348777] [<c02df1fc>] (udp_recvmsg+0x70/0x314) from [<c02e7058>] (inet_recvmsg+0x40/0x54)
    [ 3470.357606] [<c02e7058>] (inet_recvmsg+0x40/0x54) from [<c02848f4>] (sock_recvmsg+0x9c/0xc0)
    [ 3470.366432] [<c02848f4>] (sock_recvmsg+0x9c/0xc0) from [<c02868ac>] (sys_recvfrom+0x9c/0x108)
    [ 3470.375355] [<c02868ac>] (sys_recvfrom+0x9c/0x108) from [<c000dd80>] (ret_fast_syscall+0x0/0x30)
    [ 3470.384544] Code: e58b3044 e8940009 e5841000 e5841004 (e5803004)
    [ 3470.406106] ---[ end trace 85286f5d1256bf92 ]---

    Thank you

    Simon.
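
The logs in this thread show two different codes after `Internal error: Oops:`, namely 817 and 5. As a reading aid, the sketch below decodes them, assuming the number is the ARMv7 data fault status value in the short-descriptor format, printed in hex (that is my understanding of how the 3.2 ARM fault handler reports it). The helper name `decode_fsr` and the table are illustrative, and only the two status values seen here are given names.

```python
# Fault status encodings (FS[4] is bit 10, FS[3:0] are bits 3:0).
# Only the two values that occur in the logs above are listed.
FAULT_STATUS = {
    0b00101: "translation fault, section (first level)",
    0b00111: "translation fault, page (second level)",
}


def decode_fsr(fsr):
    """Split a short-descriptor fault status value into its fields."""
    status = (((fsr >> 10) & 1) << 4) | (fsr & 0xF)
    return {
        "write": bool(fsr & (1 << 11)),   # WnR bit: the access was a write
        "domain": (fsr >> 4) & 0xF,
        "status": FAULT_STATUS.get(status, f"other (0b{status:05b})"),
    }


for code in (0x817, 0x5):
    print(f"Oops: {code:x} ->", decode_fsr(code))
```

Under these assumptions, 817 is a write that hit a page-level translation fault, which fits the `*pte=00000000` lines and the faulting store in `__skb_recv_datagram`. Code 5 is a read that hit a section-level translation fault, which fits `*pgd=00000000` and the faulting load in `cpsw_rx_handler`.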