AM35x Linux RT OOPS

Other Parts Discussed in Thread: AM3505, AM3517

Hi All,

I'm working on a custom design based on the AM3505. We're using the 2.6.33.7-rt29 kernel with the RT extension from the Arago project git repository.

I'm facing several instability issues during the boot stage, likely due to preemption.

At first we suspected hardware issues with the DDR timings, but I have now built a common kernel that also runs on the AM3517 EVM board, and I see random boot crashes on that board as well.

The test setup is a board that reboots continuously. The kernel oopses are really random and seem related to the preemption model: playing with the kernel config and debug options leads to unclear results.

I tried merging the RT30 fixes related to memory management, with no luck.

Do you have a repo or a good pointer to a stable RT release for AM35x devices?

Thanks a lot,

E
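For a reboot-loop test like the one described above, it can help to capture the serial console to a file and tally which function each oops lands in, so that "really random" can be checked against actual counts. The following is only a minimal sketch: it assumes the console output has been saved as plain text, and it relies on the `Internal error: Oops:` and `PC is at` line formats that appear in the trace posted later in this thread. The function name `tally_oopses` is hypothetical.

```python
import re
from collections import Counter

# Line formats taken from the ARM oops trace posted later in this thread.
OOPS_RE = re.compile(r"Internal error: Oops: ([0-9a-fA-F]+) \[#(\d+)\]")
PC_RE = re.compile(r"PC is at (\S+?)\+0x[0-9a-fA-F]+/0x[0-9a-fA-F]+")


def tally_oopses(log_text):
    """Return (number of oopses, Counter of faulting functions) for a console log."""
    oops_count = 0
    by_symbol = Counter()
    for line in log_text.splitlines():
        if OOPS_RE.search(line):
            oops_count += 1
        match = PC_RE.search(line)
        if match:
            by_symbol[match.group(1)] += 1
    return oops_count, by_symbol


if __name__ == "__main__":
    sample = (
        "UBI: data offset: 4096\n"
        "Internal error: Oops: 817 [#1] PREEMPT\n"
        "PC is at free_block+0x68/0x12c\n"
    )
    count, symbols = tally_oopses(sample)
    print(count, symbols.most_common())
```

If most crashes cluster in one or two functions, that points to a specific subsystem; a flat spread across unrelated functions would fit general memory corruption better.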

  • Hi,

    it's me again.

    I forgot to mention that I had to backport the serial-omap driver to replace the 8250 driver. I still have to investigate in depth whether this is the root cause of the issues (nothing points to it at the moment).

    So I'd like feedback from anyone running the 2.6.33.7-rt29 kernel with no stability issues or boot crashes on an AM35xx-based target.

    Thanks again,

    E

  • Hi Enrico,

    I think basing a design on RT_PREEMPT in the current state of affairs with respect to kernel version is asking for trouble, especially with such an old version.

    There are several options for Xenomai kernels on AM3x, and that is a stable base, with 3.8 not too far off in the future; the one I'm using on a BeagleBone is based on 3.2.21, which isn't device tree yet but is solid.

    Check the Xenomai mailing list and the linuxcnc-developers archives.

    regards

    Michael

  • Hi,

    this is an example of the crashes I'm currently seeing with this system:

    ------------------------------------

    UBIFS: mounted UBI device 0, volume 0, name "rootfs"
    UBIFS: file system size: 14403840 bytes (14066 KiB, 13 MiB, 110 LEBs)
    UBIFS: journal size: 2356992 bytes (2301 KiB, 2 MiB, 18 LEBs)
    UBIFS: media format: w4/r0 (latest is w4/r0)
    UBIFS: default compressor: zlib
    UBIFS: reserved for root: 0 bytes (0 KiB)
    VFS: Mounted root (ubifs filesystem) on device 0:12.
    devtmpfs: mounted
    Freeing init memory: 172K
    Starting logging: OK
    Initializing random number generator... done.
    UBI: attaching mtd8 to ubi1
    UBI: physical eraseblock size: 131072 bytes (128 KiB)
    UBI: logical eraseblock size: 126976 bytes
    UBI: smallest flash I/O unit: 2048
    UBI: VID header offset: 2048 (aligned 2048)
    UBI: data offset: 4096
    Unable to handle kernel paging request at virtual address 00050008
    pgd = c0004000
    [00050008] *pgd=00000000
    Internal error: Oops: 817 [#1] PREEMPT
    last sysfs file: /sys/class/ubi/version
    Modules linked in:
    CPU: 0 Not tainted (2.6.33.7-rt29 #1)
    PC is at free_block+0x68/0x12c
    LR is at drain_array+0xc0/0x184
    pc : [<c00ad1f4>] lr : [<c00ad378>] psr: 00000113
    sp : cf84ff10 ip : cfb43200 fp : 00000000
    r10: cf84ff44 r9 : c04b37c4 r8 : 00000001
    r7 : cf800388 r6 : cf8051ac r5 : 00000000 r4 : cf800340
    r3 : 00070006 r2 : 00050004 r1 : cf87c440 r0 : cf802240
    Flags: nzcv IRQs on FIQs on Mode SVC_32 ISA ARM Segment kernel
    Control: 10c5387d Table: 8f028019 DAC: 00000017
    Process events/0 (pid: 16, stack limit = 0xcf84e2e8)
    Stack: (0xcf84ff10 to 0xcf850000)
    ff00: cf805180 00000001 cf802264 cf8051a8
    ff20: cf800340 c04b37c4 c04e78d0 c00ad378 cf84ff44 cf01001c c04b3690 cf84ff3c
    ff40: cf84ff3c 00000000 cf800340 cf802240 00000027 00000000 c04a7c98 c00ad78c
    ff60: 00000000 00000000 cf817ac0 cf84e000 c00ad734 cf817ad8 c04b37c8 00000000
    ff80: c04b37c4 c006b014 cf84ffac 00000000 cf82f040 c006e4c0 cf84ff98 cf84ff98
    ffa0: 00000000 cf821f40 c006aec4 cf817ac0 00000000 00000000 00000000 c006e1d4
    ffc0: 00000000 00000000 cf84ffc8 cf84ffc8 cf84ffd0 cf84ffd0 00000000 00000000
    ffe0: cf84ffe0 cf84ffe0 00000000 00000000 00000000 c0034f94 fdf5f76f fefbddc3
    [<c00ad1f4>] (free_block+0x68/0x12c) from [<c00ad378>] (drain_array+0xc0/0x184)
    [<c00ad378>] (drain_array+0xc0/0x184) from [<c00ad78c>] (cache_reap+0x58/0x13c)
    [<c00ad78c>] (cache_reap+0x58/0x13c) from [<c006b014>] (worker_thread+0x150/0x1cc)
    [<c006b014>] (worker_thread+0x150/0x1cc) from [<c006e1d4>] (kthread+0x7c/0x84)
    [<c006e1d4>] (kthread+0x7c/0x84) from [<c0034f94>] (kernel_thread_exit+0x0/0x8)
    Code: eafffffe e593101c e5970004 e891000c (e5823004)
    ---[ end trace b5a3241af1ef2651 ]---

    ------------------------------------

    The same trace often appears while shutting down the system.

    Sometimes the system hangs while rebooting, without any output trace: using a JTAG debugger, I found the kernel in the usual CPU idle task.

    I suspect the EMAC driver, because all the trouble seems related to Ethernet activity.

    Any feedback from other AM35xx Linux 2.6.32/33 users?

    Thanks in advance,

    E
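Two pieces of the oops above can be decoded by hand: the number after `Oops:` (the ARM fault status value, here `817`) and the faulting instruction shown in parentheses on the `Code:` line (`e5823004`). The sketch below does both. It assumes the standard ARM short-descriptor fault status layout and the ARM single data transfer encoding with an immediate offset; the function names are hypothetical and only a subset of fault status codes is listed.

```python
# Subset of ARM short-descriptor fault status encodings (FS[4:0]).
FAULT_STATUS = {
    0b00001: "alignment fault",
    0b00101: "translation fault (section)",
    0b00111: "translation fault (page)",
    0b01001: "domain fault (section)",
    0b01011: "domain fault (page)",
    0b01101: "permission fault (section)",
    0b01111: "permission fault (page)",
}


def decode_fsr(fsr):
    """Split an ARM fault status value into status, domain and write flag."""
    fs = (((fsr >> 10) & 1) << 4) | (fsr & 0xF)
    return {
        "status": FAULT_STATUS.get(fs, "other (0b{:05b})".format(fs)),
        "domain": (fsr >> 4) & 0xF,
        "write": bool((fsr >> 11) & 1),
    }


def decode_ldr_str_imm(insn):
    """Decode an ARM LDR/STR instruction that uses an immediate offset."""
    if ((insn >> 26) & 0b11) != 0b01 or ((insn >> 25) & 1):
        raise ValueError("not an immediate-offset LDR/STR")
    load = (insn >> 20) & 1
    byte = (insn >> 22) & 1
    imm = insn & 0xFFF
    return {
        "op": ("ldr" if load else "str") + ("b" if byte else ""),
        "rd": (insn >> 12) & 0xF,
        "rn": (insn >> 16) & 0xF,
        "offset": imm if (insn >> 23) & 1 else -imm,
        "pre_indexed": bool((insn >> 24) & 1),
    }


def effective_address(decoded, regs):
    """Address accessed by the decoded instruction, given register values."""
    base = regs[decoded["rn"]]
    if decoded["pre_indexed"]:
        base += decoded["offset"]
    return base & 0xFFFFFFFF


if __name__ == "__main__":
    print(decode_fsr(0x817))
    insn = decode_ldr_str_imm(0xE5823004)
    # r2 value taken from the register dump in the trace
    print(insn, hex(effective_address(insn, {2: 0x00050004})))
```

Under these assumptions, `817` decodes as a write that hit a translation fault, and `e5823004` is `str r3, [r2, #4]`. With r2 = 00050004 from the register dump, the store targets 00050008, which matches the reported faulting address. Neither r2 nor r3 (00070006) looks like the kernel addresses seen elsewhere in the dump (c0... and cf...), which would be consistent with, but does not prove, a pointer inside a slab structure having been overwritten.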

  • Hi All,

    with PREEMPT_NONE selected and PREEMPT_SOFTIRQS/HARDIRQS enabled, the system seems stable, so for now we are going ahead with this release.

    A further note for those using this kernel with PREEMPT_RT active: EMAC shutdown seems crappy, and ifdown can lock up the system when the peripheral's interrupts are disabled.

    E
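When comparing stable and unstable builds, it is easy to lose track of which preemption options a given kernel was actually built with. The sketch below reads a kernel `.config` as text and reports the options named in this thread. It assumes the usual `CONFIG_` prefix on those names and the standard `.config` syntax (`CONFIG_X=y` for enabled, `# CONFIG_X is not set` for disabled); the function name `preempt_options` is hypothetical.

```python
# Option names come from this thread; the CONFIG_ prefix is an assumption.
PREEMPT_KEYS = (
    "CONFIG_PREEMPT_NONE",
    "CONFIG_PREEMPT_RT",
    "CONFIG_PREEMPT_SOFTIRQS",
    "CONFIG_PREEMPT_HARDIRQS",
)


def preempt_options(config_text):
    """Return {option: enabled?} for the preemption options in a .config."""
    state = {name: False for name in PREEMPT_KEYS}
    for line in config_text.splitlines():
        line = line.strip()
        # Comment lines such as "# CONFIG_X is not set" mean "disabled".
        if line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        if key in state:
            state[key] = (value == "y")
    return state


if __name__ == "__main__":
    sample = (
        "CONFIG_PREEMPT_NONE=y\n"
        "# CONFIG_PREEMPT_RT is not set\n"
        "CONFIG_PREEMPT_SOFTIRQS=y\n"
        "CONFIG_PREEMPT_HARDIRQS=y\n"
    )
    for name, enabled in preempt_options(sample).items():
        print(name, "y" if enabled else "n")
```

Saving this output next to each captured boot log makes it easier to tell later which configuration a given crash belongs to.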