DRA712: DRA711 board: rebooting the system sometimes causes a kernel crash

Part Number: DRA712
Other Parts Discussed in Thread: DRA71

Hi all experts,

My board's SoC is a DRA711, the kernel version is 4.14.94, and VisionSDK0304 runs on IPU2 during early boot.

I am running a reboot test, rebooting every 6 seconds, and it sometimes causes a kernel crash. The log is as follows:

 

Uncompressing Linux... done, booting the kernel.
1:tttttttttttttttttt start_kernel..123..
[    0.397813] Unhandled fault: asynchronous external abort (0x1211) at 0x00000000
[    0.397816] pgd = c0003000
[    0.397820] [00000000] *pgd=80000080004003, *pmd=00000000
[    0.397833] Internal error: : 1211 [#1] PREEMPT SMP ARM
[    0.397837] Modules linked in:
[    0.397846] CPU: 0 PID: 1 Comm: swapper/0 Tainted: G        W       4.14.94-rt50-03028-g6d246d8-dirty #47
[    0.397848] Hardware name: Generic DRA72X (Flattened Device Tree)
[    0.397852] task: dd068000 task.stack: dd066000
[    0.397864] PC is at regmap_mmio_read32le+0x1c/0x20
[    0.397871] LR is at regmap_mmio_read+0x40/0x5c
[    0.397875] pc : [<c0588da4>]    lr : [<c0588f64>]    psr: 20000013
[    0.397879] sp : dd067ca8  ip : dd067cb8  fp : dd067cb4
[    0.397883] r10: 00000000  r9 : 0000ffff  r8 : 00000000
[    0.397887] r7 : dd3de200  r6 : dd067d0c  r5 : 0000002c  r4 : dd3ddbc0
[    0.397890] r3 : fa44a000  r2 : dd067d0c  r1 : fa44a02c  r0 : 00000000
[    0.397896] Flags: nzCv  IRQs on  FIQs on  Mode SVC_32  ISA ARM  Segment user
[    0.397901] Control: 30c5387d  Table: 80003000  DAC: fffffffd
[    0.397905] Process swapper/0 (pid: 1, stack limit = 0xdd066210)
[    0.397909] Stack: (0xdd067ca8 to 0xdd068000)
[    0.397916] 7ca0:                   dd067cd4 dd067cb8 c0588f64 c0588d94 dd3de200 0000002c
[    0.397922] 7cc0: dd067d0c dd3de200 dd067ce4 dd067cd8 c058366c c0588f30 dd067d04 dd067ce8
[    0.397930] 7ce0: c0584e88 c058365c 00000000 dd3de200 0000002c 0000aaaa dd067d3c dd067d08
[    0.397936] 7d00: c0585208 c0584e2c c0759efc c024ffd0 dd3de200 dd3de200 0000002c 00000000
[    0.397943] 7d20: 0000ffff 00000000 0000aaaa 00000000 dd067d74 dd067d40 c058626c c0585180
[    0.397950] 7d40: 00000000 00000000 dd067d74 00000000 dd1e3410 dd1e3400 dd398510 dfcc515c
[    0.397957] 7d60: dd1e3410 c0c24104 dd067dcc dd067d78 c04b18b4 c0586220 00000000 00000000
[    0.397963] 7d80: 00000000 00000000 dd067dbc dd067d98 c039cb34 c0399000 dd1e3410 00000000
[    0.397970] 7da0: dd1e3418 dd1e3410 ffffffed c0c240bc fffffdfb c0c240bc 00000000 00000000
[    0.397976] 7dc0: dd067dec dd067dd0 c056ef1c c04b17d4 dd1e3410 c0c978dc c0c978f4 00000000
[    0.397983] 7de0: dd067e1c dd067df0 c056d424 c056eed0 00000000 dd1e3410 c0c240bc dd1e3444
[    0.397990] 7e00: 00000000 c0a37834 00000075 c0c3d000 dd067e3c dd067e20 c056d59c c056d218
[    0.397996] 7e20: 00000000 c0c240bc c056d4f0 00000000 dd067e64 dd067e40 c056b74c c056d4fc
[    0.398003] 7e40: dd04486c dd1d6b48 c075a0c0 c0c240bc dd398480 c0c2d050 dd067e74 dd067e68
[    0.398009] 7e60: c056cdd0 c056b704 dd067e9c dd067e78 c056c8c8 c056cdb8 c0939224 dd067e88
[    0.398016] 7e80: c0c240bc 00000000 c0a1d9d4 c0c3d000 dd067eb4 dd067ea0 c056df10 c056c744
[    0.398023] 7ea0: c0c2d050 00000000 dd067ecc dd067eb8 c056ee74 c056de9c ffffe000 00000000
[    0.398029] 7ec0: dd067edc dd067ed0 c0a1d9f0 c056ee38 dd067f4c dd067ee0 c0201860 c0a1d9e0
[    0.398036] 7ee0: dd067f4c dd067ef0 c0247f00 c0a00624 c0919a4c c0919a2c c0919a78 c0922ed8
[    0.398042] 7f00: 00000000 c0919a04 00000006 00000006 c0955554 c09a54d0 dfcff992 dfcff9a5
[    0.398049] 7f20: c0270148 c09a54d0 00000007 c09a54d0 c0a46494 00000007 c0c3d000 c0a37834
[    0.398055] 7f40: dd067f94 dd067f50 c0a00f80 c0201820 00000006 00000006 00000000 c0a00618
[    0.398062] 7f60: 00000000 c0a00618 00000000 00000000 c0754ab0 00000000 00000000 00000000
[    0.398069] 7f80: 00000000 00000000 dd067fac dd067f98 c0754ac0 c0a00d64 00000000 c0754ab0
[    0.398075] 7fa0: 00000000 dd067fb0 c0207aa0 c0754abc 00000000 00000000 00000000 00000000
[    0.398081] 7fc0: 00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000
[    0.398087] 7fe0: 00000000 00000000 00000000 00000000 00000013 00000000 00000000 00000000
[    0.398089] Backtrace:
[    0.398104] [<c0588d88>] (regmap_mmio_read32le) from [<c0588f64>] (regmap_mmio_read+0x40/0x5c)
[    0.398115] [<c0588f24>] (regmap_mmio_read) from [<c058366c>] (_regmap_bus_reg_read+0x1c/0x20)
[    0.398122]  r7:dd3de200 r6:dd067d0c r5:0000002c r4:dd3de200
[    0.398131] [<c0583650>] (_regmap_bus_reg_read) from [<c0584e88>] (_regmap_read+0x68/0xc0)
[    0.398141] [<c0584e20>] (_regmap_read) from [<c0585208>] (_regmap_update_bits+0x94/0xcc)
[    0.398146]  r7:0000aaaa r6:0000002c r5:dd3de200 r4:00000000
[    0.398157] [<c0585174>] (_regmap_update_bits) from [<c058626c>] (regmap_update_bits_base+0x58/0x7c)
[    0.398164]  r10:00000000 r9:0000aaaa r8:00000000 r7:0000ffff r6:00000000 r5:0000002c
[    0.398167]  r4:dd3de200
[    0.398178] [<c0586214>] (regmap_update_bits_base) from [<c04b18b4>] (ti_iodelay_probe+0xec/0x468)
[    0.398185]  r10:c0c24104 r9:dd1e3410 r8:dfcc515c r7:dd398510 r6:dd1e3400 r5:dd1e3410
[    0.398188]  r4:00000000
[    0.398197] [<c04b17c8>] (ti_iodelay_probe) from [<c056ef1c>] (platform_drv_probe+0x58/0xb4)
[    0.398204]  r10:00000000 r9:00000000 r8:c0c240bc r7:fffffdfb r6:c0c240bc r5:ffffffed
[    0.398207]  r4:dd1e3410
[    0.398219] [<c056eec4>] (platform_drv_probe) from [<c056d424>] (driver_probe_device+0x218/0x2e4)
[    0.398224]  r7:00000000 r6:c0c978f4 r5:c0c978dc r4:dd1e3410
[    0.398235] [<c056d20c>] (driver_probe_device) from [<c056d59c>] (__driver_attach+0xac/0xb0)
[    0.398242]  r10:c0c3d000 r9:00000075 r8:c0a37834 r7:00000000 r6:dd1e3444 r5:c0c240bc
[    0.398245]  r4:dd1e3410 r3:00000000
[    0.398256] [<c056d4f0>] (__driver_attach) from [<c056b74c>] (bus_for_each_dev+0x54/0xa4)
[    0.398262]  r7:00000000 r6:c056d4f0 r5:c0c240bc r4:00000000
[    0.398271] [<c056b6f8>] (bus_for_each_dev) from [<c056cdd0>] (driver_attach+0x24/0x28)
[    0.398276]  r6:c0c2d050 r5:dd398480 r4:c0c240bc
[    0.398286] [<c056cdac>] (driver_attach) from [<c056c8c8>] (bus_add_driver+0x190/0x214)
[    0.398296] [<c056c738>] (bus_add_driver) from [<c056df10>] (driver_register+0x80/0xfc)
[    0.398301]  r7:c0c3d000 r6:c0a1d9d4 r5:00000000 r4:c0c240bc
[    0.398310] [<c056de90>] (driver_register) from [<c056ee74>] (__platform_driver_register+0x48/0x50)
[    0.398314]  r5:00000000 r4:c0c2d050
[    0.398326] [<c056ee2c>] (__platform_driver_register) from [<c0a1d9f0>] (ti_iodelay_driver_init+0x1c/0x20)
[    0.398330]  r5:00000000 r4:ffffe000
[    0.398339] [<c0a1d9d4>] (ti_iodelay_driver_init) from [<c0201860>] (do_one_initcall+0x4c/0x170)
[    0.398347] [<c0201814>] (do_one_initcall) from [<c0a00f80>] (kernel_init_freeable+0x228/0x2c4)
[    0.398353]  r8:c0a37834 r7:c0c3d000 r6:00000007 r5:c0a46494 r4:c09a54d0
[    0.398364] [<c0a00d58>] (kernel_init_freeable) from [<c0754ac0>] (kernel_init+0x10/0x114)
[    0.398370]  r10:00000000 r9:00000000 r8:00000000 r7:00000000 r6:00000000 r5:c0754ab0
[    0.398372]  r4:00000000
[    0.398382] [<c0754ab0>] (kernel_init) from [<c0207aa0>] (ret_from_fork+0x14/0x34)
[    0.398386]  r5:c0754ab0 r4:00000000
[    0.398394] Code: e5903000 e0831001 e5910000 f57ff04f (e89da800)
[    1.021105] ---[ end trace 0000000000000002 ]---
[    1.021266] Kernel panic - not syncing: Attempted to kill init! exitcode=0x0000000b
[    1.021266]

Running objdump on the kernel gives the following assembly code for the "regmap_mmio_read32le" function:

c0592228 <regmap_mmio_read32le>:
c0592228: e1a0c00d  mov ip, sp
c059222c: e92dd800  push {fp, ip, lr, pc}
c0592230: e24cb004  sub fp, ip, #4
c0592234: e5903000  ldr r3, [r0]
c0592238: e0831001  add r1, r3, r1
c059223c: e5910000  ldr r0, [r1]
c0592240: f57ff04f  dsb sy
c0592244: e89da800  ldm sp, {fp, sp, pc}

static unsigned int regmap_mmio_read32le(struct regmap_mmio_context *ctx,
                                         unsigned int reg)
{
        return readl(ctx->regs + reg);
}

It is a very simple function, and the assembly code is simple too. Why does register r0 show 00000000 in the dump, when it cannot have been 0 when "ldr r3, [r0]" executed?

Could anyone give me a hint? What causes this problem?

Best regards!
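
One possible reading of the register dump, offered as a sketch rather than a confirmed diagnosis: the oops reports an asynchronous external abort with the PC on the instruction after "dsb sy", so "ldr r0, [r1]" had probably already retired and overwritten r0 with whatever the failed bus read returned. The small model below replays the data flow of the disassembly using the values from the dump (r3 = fa44a000, r5 = 0000002c, r1 = fa44a02c). The struct, the function name, the ctx address, and the "loaded value" of 0 are illustrative assumptions; nothing here touches hardware.

```c
#include <assert.h>
#include <stdint.h>

/* Model of the three ARM core registers that regmap_mmio_read32le touches. */
struct core_regs {
    uint32_t r0;
    uint32_t r1;
    uint32_t r3;
};

/*
 * Replay the data flow of the disassembly, one instruction per line.
 * ctx_addr     : hypothetical address of the regmap_mmio_context (any non-zero value)
 * regs_base    : value of ctx->regs (0xfa44a000 in the dump, seen in r3)
 * reg          : register offset (0x2c in the dump, seen in the caller's r5)
 * loaded_value : assumed result of the bus read (the dump shows r0 = 0)
 */
static struct core_regs replay_read32le(uint32_t ctx_addr, uint32_t regs_base,
                                        uint32_t reg, uint32_t loaded_value)
{
    struct core_regs r = { .r0 = ctx_addr, .r1 = reg, .r3 = 0 };

    assert(ctx_addr != 0);   /* r0 holds a valid ctx pointer on entry */

    r.r3 = regs_base;        /* ldr r3, [r0]   : r3 = ctx->regs           */
    r.r1 = r.r3 + r.r1;      /* add r1, r3, r1 : r1 = ctx->regs + reg     */
    r.r0 = loaded_value;     /* ldr r0, [r1]   : r0 is overwritten here   */
    /* dsb sy: an asynchronous abort from the load above can surface here */
    return r;
}
```

With regs_base = 0xfa44a000 and reg = 0x2c, the model gives r1 = 0xfa44a02c, which matches the dump. Under this reading, r0 = 0 says nothing about the ctx pointer; it is simply the destination register of the aborted read.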

  • Hi,

    Could you try migrating to the latest http://software-dl.ti.com/infotainment/esd/jacinto6/processor-sdk-linux-automotive/latest/index_FDS.html
    and checking if the issue is still present?

    - Keerthy

  • Hi Keerthy,

    Sorry, we have no time to migrate to the latest SDK, as we have made many modifications for our custom board.

    best regards!

  • Hello Fanok,

    I believe the crash is not seen on every reboot iteration, but only on some of them. Is that correct?

    I will get back to you in a couple of days.

    Regards,
    Keerthy

  • hi Keerthy,

    Yes, it happens about once in a dozen reboots, and every time the crash is in ti_iodelay_driver_init.

    best regards!

  • hi Keerthy,

    We found the code that caused the kernel crash. We use early boot and run VisionSDK on the IPU2 core.

    After IPU2 is running, we play audio through MCASP4 using an EDMA channel. If I delete the following code, EDMA does not work with MCASP4 and the kernel panic does not occur.

    DMAXBARConnect(SOC_IRQ_DMARQ_CROSSBAR_REGISTERS_BASE, EDMA, (UInt32)XBAR_INST_DMA_EDMA_DREQ_32, McASP4_DREQ_RX);
    DMAXBARConnect(SOC_IRQ_DMARQ_CROSSBAR_REGISTERS_BASE, EDMA, (UInt32)XBAR_INST_DMA_EDMA_DREQ_33, McASP4_DREQ_TX);

    But I don't know why MCASP4 transferring audio with EDMA sometimes causes the kernel panic.

    If I add a delay of 1 second or more before playing audio, so that the kernel has already finished booting, the system does not crash either.

    Best regards!
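
The fixed delay described above could also be expressed as an explicit gate: the IPU2 side connects the MCASP4 DMA requests only after it has been told that the kernel finished its IODELAY configuration. The sketch below is a hardware-free illustration of that idea. The flag, the notify hook, and the stand-in for the two DMAXBARConnect() calls are all hypothetical names, and how the A15 would actually signal the IPU2 (shared memory, mailbox, or something else) is left open.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical flag: set once the A15 kernel has finished IODELAY setup.
 * On a real system this would come from shared memory or a mailbox message. */
static volatile bool g_iodelay_config_done = false;

/* Counts crossbar connections so the sketch can be checked without hardware. */
static unsigned g_xbar_connect_calls = 0;

/* Stand-in for the two DMAXBARConnect() calls quoted in the post above. */
static void connect_mcasp4_dma_requests(void)
{
    g_xbar_connect_calls += 2;   /* one for McASP4 RX, one for McASP4 TX */
}

/* Hypothetical hook, called by whatever mechanism reports kernel boot progress. */
void notify_iodelay_config_done(void)
{
    g_iodelay_config_done = true;
}

/*
 * Poll the flag up to max_polls times and start audio DMA only once it is set.
 * Returns true if the DMA requests were connected, false if the flag never
 * became set within the polling budget.
 */
bool try_start_mcasp4_audio(unsigned max_polls)
{
    assert(max_polls > 0);

    for (unsigned i = 0; i < max_polls; i++) {
        if (g_iodelay_config_done) {
            connect_mcasp4_dma_requests();
            return true;
        }
        /* A real IPU2 task would sleep or yield here between polls. */
    }
    return false;
}
```

Compared with a fixed 1 second delay, a gate like this does not depend on how long the kernel takes to reach ti_iodelay_probe, at the cost of needing some signalling path between the two cores.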
  • hi Keerthy,

    Has there been any progress?

    best regards!

  • Hi Fanok,

    Do you have additional changes on top of VisionSDK0304? Adding a delay seems to help.
    I do not have the DRA71 board at the moment, so I have not been able to reproduce the issue locally yet.

    With the delay, are you able to work around the issue?

    - Keerthy

  • hi Keerthy,

    I think other SoCs may have this problem too. The case that reproduces it is this: while the kernel on the A15 side is running its initcalls during boot,

    the M4 side uses MCASP with EDMA to play audio.

    Yes, the delay avoids the kernel panic, because by then the kernel has already finished booting.

    best regards!

  • Hi Fanok,

    I am on a DRA71 board now, doing early boot with IPU2. I have tried rebooting more than
    a dozen times. I am on vsdk 3.07. Are you still facing the issue? Do you have a workaround?
    Let me know if you still face the MCASP issue.

    Best Regards,
    Keerthy

  • Hi Keerthy,

    I am still struggling with this issue.

    My colleague found a silicon erratum for the DRA72x (SR 2.0, 1.0) and DRA71x (SR 2.1, 2.0) SoC for Automotive Infotainment.

    It describes this issue as follows:

    i933: Access to IODELAY at Same Time as Other Peripheral on L4_PER2 Can Hang
    CRITICALITY: Medium
    DESCRIPTION:
    If read/write accesses are performed concurrently from one initiator to the IODELAY
    module address space and one initiator to another peripheral address space in the
    L4_PER2 segment of the L4 interconnect then the access to the IODELAY module can
    hang, leading to an overall system hang. The concurrent accesses may be from two
    different initiators, or could be from one initiator capable of issuing multiple transactions
    through the interconnect. In this context, initiator can be a compute core (MPU, DSP,
    IPU, etc.) or a DMA/Master peripheral (EDMA, SDMA, etc.)
    The hang occurs due to a protocol violation on the interconnect OCP bus when
    responses from the IODELAY module and other module on the L4_PER2 segment occur
    on the same cycle.
    The condition which hangs the system can be avoided by performing all IODELAY
    configurations during initial MPU boot, before other initiators are enabled. This approach
    may be acceptable for many peripherals, but may pose limitations for a few peripherals.
    For example, this approach may limit data transfer speeds of an SD Card or other device
    attached to the MMCn interface since IODELAY normally changes when the transfer
    mode is changed during run-time. In this example, the hang may occur if other initiators
    are accessing peripherals on L4_PER2 while IODELAY is changed to support a new SD
    Card or MMC transfer mode.
    The following peripherals are connected to L4_PER2 and should not be accessed while
    IODELAY configuration is modified: UART7, UART8, UART9, MCASP4_DAT,
    MCASP5_DAT, MCASP6_DAT, MCASP7_DAT, MCASP8_DAT, MCASP1_CFG,
    MCASP2_CFG, MCASP3_CFG, MCASP4_CFG, MCASP5_CFG, MCASP6_CFG,
    MCASP7_CFG, MCASP8_CFG, GMAC_SW, PWMSS1, PWMSS2, PWMSS3, ATL,
    MLB, VCP1, VCP2, DCAN2.

    So I think this is a bug in the SoC itself, and it is impossible to fix.

    Best regards!
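
The peripheral list in the erratum can be turned into a simple guard, sketched below: before touching IODELAY, check that no peripheral on L4_PER2 is currently in use by another initiator. The names in the table are copied from the erratum text quoted above. The two functions and the idea of tracking "active" peripherals by name are illustrative assumptions, not part of any TI driver or SDK.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Peripherals on L4_PER2, as listed in erratum i933 quoted above. */
static const char *const l4_per2_peripherals[] = {
    "UART7", "UART8", "UART9",
    "MCASP4_DAT", "MCASP5_DAT", "MCASP6_DAT", "MCASP7_DAT", "MCASP8_DAT",
    "MCASP1_CFG", "MCASP2_CFG", "MCASP3_CFG", "MCASP4_CFG",
    "MCASP5_CFG", "MCASP6_CFG", "MCASP7_CFG", "MCASP8_CFG",
    "GMAC_SW", "PWMSS1", "PWMSS2", "PWMSS3",
    "ATL", "MLB", "VCP1", "VCP2", "DCAN2",
};

/* Returns true if the named peripheral appears in the erratum's L4_PER2 list. */
bool is_on_l4_per2(const char *name)
{
    size_t count = sizeof(l4_per2_peripherals) / sizeof(l4_per2_peripherals[0]);

    assert(name != NULL);
    for (size_t i = 0; i < count; i++) {
        if (strcmp(name, l4_per2_peripherals[i]) == 0)
            return true;
    }
    return false;
}

/*
 * Hypothetical guard: given the peripherals other initiators are currently
 * accessing, report whether an IODELAY access avoids the erratum condition.
 */
bool iodelay_access_is_safe(const char *const *active, size_t n_active)
{
    assert(active != NULL || n_active == 0);
    for (size_t i = 0; i < n_active; i++) {
        if (is_on_l4_per2(active[i]))
            return false;
    }
    return true;
}
```

In the case discussed in this thread, MCASP4 is driven by EDMA from the IPU2 side while the kernel runs ti_iodelay_probe, so a check like this would report the IODELAY access as unsafe.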

  • Fanok,

    Thanks a lot for bringing in the details on this issue. This erratum requires avoiding accesses to MCASP
    while IODELAY is being configured, which also explains why the crash does not occur on every boot. Since the root cause has been identified, can we
    close this thread? Or do you have more to check on this?

    Best Regards,
    Keerthy