CAN kernel crash after 300 seconds (INITIAL_JIFFIES)

Hi,

I am using the CAN driver interface in our processes. Everything works fine (sending and receiving messages), but after 300 seconds of uptime the kernel crashes unexpectedly. I noticed that the definition of "INITIAL_JIFFIES" uses the value 300, and my suspicion is that the crash is related to it. The crash only happens when I am using the CAN driver.

Has anyone else seen this problem, or could anyone give me a hand? Is there a bug in the driver/kernel, or did I miss something?

The log looks like this:

[  299.985748] Unable to handle kernel paging request at virtual address 56fe37ec
[  299.993743] Unable to handle kernel paging request at virtual address 73007289
[  300.001251] Unable to handle kernel paging request at virtual address 73007289
[  300.008789] Unable to handle kernel paging request at virtual address 73007289
[  300.016326] Unable to handle kernel paging request at virtual address 73007289

Kind regards,

Jenthe

  • Hi Jenthe,

    Please post the whole crash log including the stack dump. Thank you.

    Best regards,
    Miroslav

  • Hi Miroslav, this is the complete log.

    The "Unable to handle kernel paging request" lines always appear. The backtrace (starting at "[  300.441711] Backtrace:") is only shown some of the time.

    [  299.895721] Unable to handle kernel paging request at virtual address 56fe37ec

    [  299.903259] Unable to handle kernel paging request at virtual address 410d70e4

    [  299.910797] Unable to handle kernel paging request at virtual address 410d70e4

    [  299.918304] Unable to handle kernel paging request at virtual address 410d70e4

    [  299.925842] Unable to handle kernel paging request at virtual address 410d70e4

    [  299.933380] Unable to handle kernel paging request at virtual address 410d70e4

    [  299.940887] Unable to handle kernel paging request at virtual address 410d70e4

    [  299.948425] Unable to handle kernel paging request at virtual address 410d70e4

    [  299.955963] Unable to handle kernel paging request at virtual address 410d70e4

    [  299.963500] Unable to handle kernel paging request at virtual address 410d70e4

    [  299.971008] Unable to handle kernel paging request at virtual address 410d70e4

    [  299.978546] Unable to handle kernel paging request at virtual address 410d70e4

    [  299.986083] Unable to handle kernel paging request at virtual address 410d70e4

    [  299.993591] Unable to handle kernel paging request at virtual address 410d70e4

    [  300.001129] Unable to handle kernel paging request at virtual address 410d70e4

    [  300.008666] Unable to handle kernel paging request at virtual address 410d70e4

    [  300.016174] Unable to handle kernel paging request at virtual address 410d70e4

    [  300.023712] Unable to handle kernel paging request at virtual address 410d70e4

    [  300.031250] Unable to handle kernel paging request at virtual address 410d70e4

    [  300.038757] Unable to handle kernel paging request at virtual address 410d70e4

    [  300.046295] Unable to handle kernel paging request at virtual address 410d70e4

    [  300.053833] Unable to handle kernel paging request at virtual address 019d9c89

    [  300.061340] Unable to handle kernel paging request at virtual address 019d9c89

    [  300.068878] Unable to handle kernel paging request at virtual address 019d9c89

    [  300.076416] Unable to handle kernel paging request at virtual address 019d9c89

    [  300.083923] Unable to handle kernel paging request at virtual address 019d9c89

    [  300.091461] Unable to handle kernel paging request at virtual address 019d9c89

    [  300.098999] Unable to handle kernel paging request at virtual address 019d9c89

    [  300.106506] Unable to handle kernel paging request at virtual address 019d9c89

    [  300.114044] Unable to handle kernel paging request at virtual address 019d9c89

    [  300.121582] Unable to handle kernel paging request at virtual address 019d9c89

    [  300.129089] Unable to handle kernel paging request at virtual address 019d9c89

    [  300.136627] Unable to handle kernel paging request at virtual address 019d9c89

    [  300.144165] Unable to handle kernel paging request at virtual address 019d9c89

    [  300.151672] Unable to handle kernel paging request at virtual address 019d9c89

    [  300.159210] Unable to handle kernel paging request at virtual address 019d9c89

    [  300.166748] Unable to handle kernel paging request at virtual address 019d9c89

    [  300.174255] Unable to handle kernel paging request at virtual address 019d9c89

    [  300.181793] Unable to handle kernel paging request at virtual address 019d9c89

    [  300.189331] Unable to handle kernel paging request at virtual address 019d9c89

    [  300.196838] Unable to handle kernel paging request at virtual address 019d9c89

    [  300.204376] Unable to handle kernel paging request at virtual address 75625f89

    [  300.211914] Unable to handle kernel paging request at virtual address 75625f89

    [  300.219421] Unable to handle kernel paging request at virtual address 75625f89

    [  300.226959] Unable to handle kernel paging request at virtual address 75625f89

    [  300.234497] Unable to handle kernel paging request at virtual address 75625f89

    [  300.242004] Unable to handle kernel paging request at virtual address 75625f89

    [  300.249542] Unable to handle kernel paging request at virtual address 75625f89

    [  300.257080] Unable to handle kernel paging request at virtual address 75625f89

    [  300.264587] Unable to handle kernel paging request at virtual address 75625f89

    [  300.272125] Unable to handle kernel paging request at virtual address 75625f89

    [  300.279663] Unable to handle kernel paging request at virtual address 75625f89

    [  300.287170] Unable to handle kernel paging request at virtual address 75625f89

    [  300.294708] Unable to handle kernel paging request at virtual address 75625f89

    [  300.302246] Unable to handle kernel paging request at virtual address 75625f89

    [  300.309753] Unable to handle kernel paging request at virtual address 75625f89

    [  300.317291] Unable to handle kernel paging request at virtual address 75625f89

    [  300.324829] Unable to handle kernel paging request at virtual address 75625f89

    [  300.332336] Unable to handle kernel paging request at virtual address 75625f89

    [  300.339874] Unable to handle kernel paging request at virtual address 75625f89

    [  300.347412] Unable to handle kernel paging request at virtual address 75625f89

    [  300.354919] pgd = c0004000

    [  300.357757] [75625f89] *pgd=00000000

    [  300.361480] Internal error: Oops: 5 [#1]

    [  300.365570] Modules linked in: drvIRDA(O) drvPario(O)

    [  300.370849] CPU: 0    Tainted: G           O  (3.2.0 #3)

    [  300.376403] PC is at show_pte+0x24/0xc0

    [  300.380401] LR is at __do_kernel_fault+0x5c/0x8c

    [  300.385223] pc : [<c0019c40>]    lr : [<c0019d38>]    psr: 20000193

    [  300.385223] sp : c03fc158  ip : c03fc178  fp : c03fc174

    [  300.397186] r10: c04099cd  r9 : ffffffff  r8 : 75625f65

    [  300.402618] r7 : 00000005  r6 : 000003ab  r5 : 75625f65  r4 : 75625f89

    [  300.409423] r3 : c04649a8  r2 : 00000001  r1 : 75625f89  r0 : c03a9078

    [  300.416229] Flags: nzCv  IRQs off  FIQs on  Mode SVC_32  ISA ARM  Segment user

    [  300.423767] Control: 10c5387d  Table: 864e4019  DAC: 00000015

    [  300.429748] Process ocal_loœÀ" (pid: 1778384895, stack limit = 0xc03fa2f0)

    [  300.437194] Stack: (0xc03fc158 to 0x5f638e79)

    [  300.441711] Backtrace:

    [  300.444274] [<c0019c1c>] (show_pte+0x0/0xc0) from [<c0019d38>] (__do_kernel_fault+0x5c/0x8c)

    [  300.453094]  r6:00000000 r5:75625f89 r4:c03fc2a8 r3:c04649a8

    [  300.459014] [<c0019cdc>] (__do_kernel_fault+0x0/0x8c) from [<c0019ebc>] (do_page_fault+0x154/0x1f0)

    [  300.468444]  r8:00000005 r7:00000005 r6:75625f89 r5:c03fc2a8 r4:75625f65

    [  300.475280] r3:c03fc2a8

    [  300.477996] [<c0019d68>] (do_page_fault+0x0/0x1f0) from [<c001a064>] (do_translation_fault+0xa0/0xa8)

    [  300.487640] [<c0019fc4>] (do_translation_fault+0x0/0xa8) from [<c00083a0>] (do_DataAbort+0x38/0xa0)

    [  300.497070]  r7:00000005 r6:c044b2fc r5:75625f89 r4:00000005

    [  300.503021] [<c0008368>] (do_DataAbort+0x0/0xa0) from [<c0013898>] (__dabt_svc+0x38/0x60)

    [  300.511535] Exception stack(0xc03fc2a8 to 0xc03fc2f0)

    [  300.516815] c2a0:                   c03a9078 75625f89 00000001 c04649a8 75625f89 75625f65

    [  300.525360] c2c0: 000003ab 00000005 75625f65 ffffffff c04099cd c03fc30c c03fc310 c03fc2f0

    [  300.533905] c2e0: c0019d38 c0019c40 20000193 ffffffff

    [  300.539154]  r8:75625f65 r7:c03fc2dc r6:ffffffff r5:20000193 r4:c0019c40

    [  300.546173] [<c0019c1c>] (show_pte+0x0/0xc0) from [<c0019d38>] (__do_kernel_fault+0x5c/0x8c)

    [  300.554992]  r6:00000000 r5:75625f89 r4:c03fc440 r3:c04649a8

    [  300.560913] [<c0019cdc>] (__do_kernel_fault+0x0/0x8c) from [<c0019ebc>] (do_page_fault+0x154/0x1f0)

    [  300.570343]  r8:00000005 r7:00000005 r6:75625f89 r5:c03fc440 r4:75625f65

    [  300.577178] r3:c03fc440

    [  300.579925] [<c0019d68>] (do_page_fault+0x0/0x1f0) from [<c001a064>] (do_translation_fault+0xa0/0xa8)

    [  300.589538] [<c0019fc4>] (do_translation_fault+0x0/0xa8) from [<c00083a0>] (do_DataAbort+0x38/0xa0)

    [  300.598999]  r7:00000005 r6:c044b2fc r5:75625f89 r4:00000005

    [  300.604919] [<c0008368>] (do_DataAbort+0x0/0xa0) from [<c0013898>] (__dabt_svc+0x38/0x60)

    [  300.613464] Exception stack(0xc03fc440 to 0xc03fc488)

    [  300.618713] c440: c03a9078 75625f89 00000001 c04649a8 75625f89 75625f65 000003ab 00000005

    [  300.627258] c460: 75625f65 ffffffff c04099cd c03fc4a4 c03fc4a8 c03fc488 c0019d38 c0019c40

    [  300.635803] c480: 20000193 ffffffff

    [  300.639434]  r8:75625f65 r7:c03fc474 r6:ffffffff r5:20000193 r4:c0019c40

    [  300.646453] [<c0019c1c>] (show_pte+0x0/0xc0) from [<c0019d38>] (__do_kernel_fa

  • Hi Jenthe,

    I'd also like to ask which hardware platform and which SDK/PSP version you are using.

    I can see that your kernel is reported as Tainted. The "O" flag in your log indicates that an out-of-tree module has been loaded (drvIRDA and drvPario are both marked "(O)"), so you should check whether those modules are involved. Refer to this document for information about tainted kernels and for debugging techniques for Oops messages.

    The INITIAL_JIFFIES value is by default set to -300*HZ, which means that the jiffies value will wrap (overflow) about 5 minutes after boot:

    /*
     * Have the 32 bit jiffies value wrap 5 minutes after boot
     * so jiffies wrap bugs show up earlier.
     */
    #define INITIAL_JIFFIES ((unsigned long)(unsigned int) (-300*HZ))

    You can try changing the INITIAL_JIFFIES value to 0 to check whether the issue still occurs once 5 minutes have passed after boot. If I remember correctly, on 32-bit systems a jiffies counter that starts at 0 only overflows after more than 400 days (2^32/HZ seconds, which is about 497 days at HZ=100).

    Best regards,
    Miroslav

  • Hi Miroslav,

    I am investigating the problem. When I change the INITIAL_JIFFIES offset from -300 to -200, the error occurs after 200 seconds instead.

    The hardware is an AM335x processor. We are using Gnu45 in an Ubuntu environment.

    Kind regards,

    Jenthe

  • Hi Jenthe,

    Yes, the processor should be an AM335x, as this is the AM335x forum, but which platform are you using? Is it a TI-supported board (AM335x GP EVM, AM335x Starter Kit, BeagleBone etc.) or a custom board you have developed?

    Also, regarding the software running on the board, I haven't heard of Gnu45. I suppose your host machine (your PC) is running Ubuntu, but I'm interested in what is running on your board. Which Software Development Kit (SDK) / Platform Support Package (PSP) did you download, and which one are you using? Have you made any modifications to the software? If yes, please describe exactly what these changes involve.

    Did you try setting the INITIAL_JIFFIES value to 0? This is a hack, but it should allow you to use your CAN communication for longer than 5 minutes until a proper bugfix is found.

    Best regards,
    Miroslav