AM62A7: System hangs with call traces during Linux boot initialization, caused by a kernel paging request fault and wave5_vdi_readl

Part Number: AM62A7

Tool/software:

Hello TI Expert,

We are seeing a system hang during power on/off testing, caused by a kernel paging request fault and by wave5_vdi_readl.

Issue incidence: 3 / 1553

Our environment is as follows:

TI SDK BSP version: V09_00_02

Linux version: 6.1.80

Is there any patch that resolves these stability issues?

 kernel paging:

[Wed Jun 11 10:32:07.227 2025] [    0.000000] OF: reserved mem: failed to allocate memory for node 'linux,cma': size 576 MiB
[Wed Jun 11 10:32:07.415 2025] [    0.020265] Unable to handle kernel paging request at virtual address 0000000000e00000
[Wed Jun 11 10:32:07.430 2025] [    0.028379] Mem abort info:
[Wed Jun 11 10:32:07.430 2025] [    0.031227]   ESR = 0x0000000096000004
[Wed Jun 11 10:32:07.430 2025] [    0.035062]   EC = 0x25: DABT (current EL), IL = 32 bits
[Wed Jun 11 10:32:07.430 2025] [    0.040493]   SET = 0, FnV = 0
[Wed Jun 11 10:32:07.446 2025] [    0.043608]   EA = 0, S1PTW = 0
[Wed Jun 11 10:32:07.446 2025] [    0.046815]   FSC = 0x04: level 0 translation fault
[Wed Jun 11 10:32:07.446 2025] [    0.051800] Data abort info:
[Wed Jun 11 10:32:07.446 2025] [    0.054736]   ISV = 0, ISS = 0x00000004
[Wed Jun 11 10:32:07.446 2025] [    0.058654]   CM = 0, WnR = 0
[Wed Jun 11 10:32:07.463 2025] [    0.061680] [0000000000e00000] user address but active_mm is swapper
[Wed Jun 11 10:32:07.463 2025] [    0.068176] Internal error: Oops: 0000000096000004 [#1] PREEMPT SMP
[Wed Jun 11 10:32:07.478 2025] [    0.074580] Modules linked in:
[Wed Jun 11 10:32:07.478 2025] [    0.077701] CPU: 0 PID: 1 Comm: swapper/0 Not tainted 6.1.80-ti-g2e423244f8c0 #1
[Wed Jun 11 10:32:07.478 2025] [    0.085261] Hardware name: Texas Instruments AM62A7 SK (DT)
[Wed Jun 11 10:32:07.493 2025] [    0.090953] pstate: 80000005 (Nzcv daif -PAN -UAO -TCO -DIT -SSBS BTYPE=--)
[Wed Jun 11 10:32:07.493 2025] [    0.098069] pc : __pi_strlen+0x14/0x150
[Wed Jun 11 10:32:07.493 2025] [    0.101993] lr : kstrdup+0x28/0x90
[Wed Jun 11 10:32:07.493 2025] [    0.105470] sp : ffff80000948b9f0
[Wed Jun 11 10:32:07.508 2025] [    0.108851] x29: ffff80000948b9f0 x28: 0000000000000000 x27: ffff8000090a00c8
[Wed Jun 11 10:32:07.508 2025] [    0.116149] x26: ffff800008fbd9c0 x25: 0000000000000000 x24: ffff000800cb4700
[Wed Jun 11 10:32:07.524 2025] [    0.123446] x23: 0000000000000000 x22: 0000000000000000 x21: ffff8000081c92b4
[Wed Jun 11 10:32:07.524 2025] [    0.130743] x20: 0000000000000cc0 x19: 0000000000e00000 x18: ffffffffffffffff
[Wed Jun 11 10:32:07.540 2025] [    0.138040] x17: 000000000000001c x16: 00000000cccf7694 x15: ffff000800cb2987
[Wed Jun 11 10:32:07.540 2025] [    0.145338] x14: ffffffffffffffff x13: ffff000800cb2985 x12: 00000000550b1709
[Wed Jun 11 10:32:07.540 2025] [    0.152635] x11: 0000000000000000 x10: 0000000000000079 x9 : 0000000000000001
[Wed Jun 11 10:32:07.556 2025] [    0.159932] x8 : 0101010101010101 x7 : ffff000800108000 x6 : 0000000000000001
[Wed Jun 11 10:32:07.556 2025] [    0.167229] x5 : 0000000000000000 x4 : 0000000000000000 x3 : 00000000000041ed
[Wed Jun 11 10:32:07.572 2025] [    0.174526] x2 : ffff800008be0000 x1 : 0000000000000cc0 x0 : 0000000000e00000
[Wed Jun 11 10:32:07.572 2025] [    0.181824] Call trace:
[Wed Jun 11 10:32:07.572 2025] [    0.184317]  __pi_strlen+0x14/0x150
[Wed Jun 11 10:32:07.590 2025] [    0.187879]  kstrdup_const+0x34/0x40
[Wed Jun 11 10:32:07.590 2025] [    0.191530]  __kernfs_new_node+0x54/0x23c
[Wed Jun 11 10:32:07.590 2025] [    0.195629]  kernfs_new_node+0x68/0x90
[Wed Jun 11 10:32:07.590 2025] [    0.199460]  kernfs_create_dir_ns+0x34/0x94
[Wed Jun 11 10:32:07.606 2025] [    0.203735]  internal_create_group+0x2ac/0x3cc
[Wed Jun 11 10:32:07.606 2025] [    0.208275]  internal_create_groups.part.0+0x4c/0xc0
[Wed Jun 11 10:32:07.606 2025] [    0.213348]  sysfs_create_groups+0x1c/0x30
[Wed Jun 11 10:32:07.606 2025] [    0.217531]  device_add+0x654/0x74c
[Wed Jun 11 10:32:07.621 2025] [    0.221095]  device_create_groups_vargs+0xe0/0x134
[Wed Jun 11 10:32:07.621 2025] [    0.225988]  device_create_with_groups+0x58/0x84
[Wed Jun 11 10:32:07.621 2025] [    0.230705]  vtconsole_class_init+0xa0/0xfc
[Wed Jun 11 10:32:07.635 2025] [    0.234982]  do_one_initcall+0x54/0x1d0
[Wed Jun 11 10:32:07.641 2025] [    0.238900]  kernel_init_freeable+0x218/0x284
[Wed Jun 11 10:32:07.641 2025] [    0.243351]  kernel_init+0x24/0x130
[Wed Jun 11 10:32:07.641 2025] [    0.246917]  ret_from_fork+0x10/0x20
[Wed Jun 11 10:32:07.641 2025] [    0.250572] Code: 92402c04 b200c3e8 f13fc09f 5400088c (a9400c02) 
[Wed Jun 11 10:32:07.656 2025] [    0.256798] ---[ end trace 0000000000000000 ]---
[Wed Jun 11 10:32:07.656 2025] [    0.261525] Kernel panic - not syncing: Attempted to kill init! exitcode=0x0000000b
[Wed Jun 11 10:32:07.672 2025] [    0.269350] SMP: stopping secondary CPUs
[Wed Jun 11 10:32:07.672 2025] [    0.273359] ---[ end Kernel panic - not syncing: Attempted to kill init! exitcode=0x0000000b ]---

wave5_vdi_readl:

[Tue Jun 10 21:55:37.119 2025] [    5.688167] Internal error: synchronous external abort: 0000000096000010 [#1] PREEMPT SMP
[Tue Jun 10 21:55:37.119 2025] [    5.696359] Modules linked in: max96717 crct10dif_ce snd_soc_simple_card snd_soc_simple_card_utils display_connector rtc_ti_k3 k3_j72xx_bandgap e5010_jpeg_enc dwc3_am62 ti_k3_r5_remoteproc tidss drm_dma_helper j721e_csi2rx wave5 drm_kms_helper ti_k3_dsp_remoteproc videobuf2_dma_contig virtio_rpmsg_bus v4l2_mem2mem sa2ul syscopyarea rpmsg_ns sysfillrect videobuf2_memops videobuf2_v4l2 ti_k3_common snd_soc_davinci_mcasp sysimgblt fb_sys_fops snd_soc_ti_udma snd_soc_ti_edma videobuf2_common cdns_dphy_rx snd_soc_ti_sdma max96712_des ox03f10 i2c_atr v4l2_fwnode v4l2_async snd_soc_tlv320aic3x_i2c snd_soc_tlv320aic3x max96717_ser m_can_platform m_can videodev gs mc can_dev optee_rng rng_core cryptodev(O) fuse drm drm_panel_orientation_quirks ipv6
[Tue Jun 10 21:55:37.188 2025] [    5.761892] CPU: 0 PID: 197 Comm: systemd-udevd Tainted: G           O       6.1.80-ti-g2e423244f8c0 #1
[Tue Jun 10 21:55:37.188 2025] [    5.771269] Hardware name: Texas Instruments AM62A7 SK (DT)
[Tue Jun 10 21:55:37.204 2025] [    5.776829] pstate: 000000c5 (nzcv daIF -PAN -UAO -TCO -DIT -SSBS BTYPE=--)
[Tue Jun 10 21:55:37.204 2025] [    5.783776] pc : wave5_vdi_readl+0xc/0x30 [wave5]
[Tue Jun 10 21:55:37.204 2025] [    5.788494] lr : wave5_vpu_timer_callback+0x24/0x8c [wave5]
[Tue Jun 10 21:55:37.220 2025] [    5.794066] sp : ffff800008003ea0
[Tue Jun 10 21:55:37.220 2025] [    5.797368] x29: ffff800008003ea0 x28: 0000000000000002 x27: ffff800000f23194
[Tue Jun 10 21:55:37.220 2025] [    5.804494] x26: 0000000000000001 x25: 00000000000000c0 x24: 00000001529c634d
[Tue Jun 10 21:55:37.239 2025] [    5.811618] x23: ffff00087f93e680 x22: ffff00087f93e6e0 x21: ffff00087f93e680
[Tue Jun 10 21:55:37.239 2025] [    5.818742] x20: ffff00080140b080 x19: ffff00080140b200 x18: 0000000000000000
[Tue Jun 10 21:55:37.255 2025] [    5.825866] x17: ffff8008767fe000 x16: ffff800008000000 x15: ffff800009b33aec
[Tue Jun 10 21:55:37.255 2025] [    5.832991] x14: 0000000000000000 x13: 0000000000000000 x12: 000000007ffff000
[Tue Jun 10 21:55:37.255 2025] [    5.840114] x11: 0000000000000040 x10: ffff000800008170 x9 : ffff000800008168
[Tue Jun 10 21:55:37.270 2025] [    5.847238] x8 : ffff0008004004b8 x7 : 0000000000000000 x6 : 0000000000000000
[Tue Jun 10 21:55:37.270 2025] [    5.854360] x5 : 0000000000000000 x4 : ffff00080140b200 x3 : ffff00087f93ebd0
[Tue Jun 10 21:55:37.289 2025] [    5.861484] x2 : 0000000000000000 x1 : 0000000000000044 x0 : ffff800009ff0044
[Tue Jun 10 21:55:37.289 2025] [    5.868608] Call trace:
[Tue Jun 10 21:55:37.289 2025] [    5.871044]  wave5_vdi_readl+0xc/0x30 [wave5]
[Tue Jun 10 21:55:37.305 2025] [    5.875404]  __hrtimer_run_queues+0x138/0x1b0
[Tue Jun 10 21:55:37.305 2025] [    5.879757]  hrtimer_interrupt+0xe8/0x244
[Tue Jun 10 21:55:37.305 2025] [    5.883756]  arch_timer_handler_phys+0x34/0x44
[Tue Jun 10 21:55:37.305 2025] [    5.888195]  handle_percpu_devid_irq+0x84/0x130
[Tue Jun 10 21:55:37.320 2025] [    5.892717]  generic_handle_domain_irq+0x2c/0x44
[Tue Jun 10 21:55:37.320 2025] [    5.897326]  gic_handle_irq+0x50/0x124
[Tue Jun 10 21:55:37.320 2025] [    5.904984]  do_interrupt_handler+0x80/0x8c
[Tue Jun 10 21:55:37.336 2025] [    5.909158]  el1_interrupt+0x34/0x70
[Tue Jun 10 21:55:37.336 2025] [    5.912728]  el1h_64_irq_handler+0x18/0x2c
[Tue Jun 10 21:55:37.336 2025] [    5.916816]  el1h_64_irq+0x64/0x68
[Tue Jun 10 21:55:37.336 2025] [    5.920207]  __kmem_cache_free+0xfc/0x270
[Tue Jun 10 21:55:37.351 2025] [    5.924209]  kfree+0x5c/0x7c
[Tue Jun 10 21:55:37.351 2025] [    5.927083]  skb_free_head+0x40/0x80
[Tue Jun 10 21:55:37.351 2025] [    5.930653]  skb_release_data+0x104/0x1d0
[Tue Jun 10 21:55:37.351 2025] [    5.934653]  kfree_skb_reason+0x40/0xb0
[Tue Jun 10 21:55:37.368 2025] [    5.938479]  skb_free_datagram+0x18/0x24
[Tue Jun 10 21:55:37.370 2025] [    5.942392]  __unix_dgram_recvmsg+0x18c/0x444
[Tue Jun 10 21:55:37.370 2025] [    5.946738]  unix_dgram_recvmsg+0x40/0x4c
[Tue Jun 10 21:55:37.370 2025] [    5.950737]  ____sys_recvmsg+0x8c/0x210
[Tue Jun 10 21:55:37.385 2025] [    5.954563]  ___sys_recvmsg+0x80/0xf0
[Tue Jun 10 21:55:37.387 2025] [    5.958217]  __sys_recvmsg+0x68/0xcc
[Tue Jun 10 21:55:37.387 2025] [    5.961783]  __arm64_sys_recvmsg+0x24/0x30
[Tue Jun 10 21:55:37.387 2025] [    5.965869]  invoke_syscall+0x48/0x114
[Tue Jun 10 21:55:37.387 2025] [    5.969610]  el0_svc_common.constprop.0+0xd4/0xfc
[Tue Jun 10 21:55:37.403 2025] [    5.974303]  do_el0_svc+0x20/0x30
[Tue Jun 10 21:55:37.403 2025] [    5.977610]  el0_svc+0x28/0xa0
[Tue Jun 10 21:55:37.403 2025] [    5.980658]  el0t_64_sync_handler+0xbc/0x140
[Tue Jun 10 21:55:37.403 2025] [    5.984919]  el0t_64_sync+0x18c/0x190
[Tue Jun 10 21:55:37.403 2025] [    5.988576] Code: d503201f f940ac00 d503233f 8b214000 (b9400000) 
[Tue Jun 10 21:55:37.422 2025] [    5.994657] ---[ end trace 0000000000000000 ]---
[Tue Jun 10 21:55:37.422 2025] [    5.999264] Kernel panic - not syncing: synchronous external abort: Fatal exception in interrupt
[Tue Jun 10 21:55:37.437 2025] [    6.008028] SMP: stopping secondary CPUs
[Tue Jun 10 21:55:37.440 2025] [    6.011945] Kernel Offset: disabled
[Tue Jun 10 21:55:37.440 2025] [    6.015421] CPU features: 0x00000,00800084,0000420b
[Tue Jun 10 21:55:37.440 2025] [    6.020286] Memory Limit: none
[Tue Jun 10 21:55:37.440 2025] [    6.023334] ---[ end Kernel panic - not syncing: synchronous external abort: Fatal exception in interrupt ]---
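
As a reading aid (not part of the original post), the two ESR values in the traces above can be decoded field by field. The sketch below assumes the standard AArch64 ESR_ELx layout for data aborts; the helper name `decode_esr` and the small fault-status table are illustrative and cover only the codes seen in this thread plus the neighbouring translation-fault levels.

```python
# Decode an AArch64 ESR_EL1 value as printed in a kernel oops.
# Assumption: the value describes a data abort (EC 0x24 or 0x25), so the
# low ISS bits are interpreted as WnR and DFSC.

DFSC_NAMES = {
    0x04: "level 0 translation fault",
    0x05: "level 1 translation fault",
    0x06: "level 2 translation fault",
    0x07: "level 3 translation fault",
    0x10: "synchronous external abort, not on translation table walk",
}


def decode_esr(esr: int) -> dict:
    """Split an ESR value into the fields the kernel prints in the oops."""
    iss = esr & 0x1FFFFFF            # ISS: bits [24:0]
    fsc = iss & 0x3F                 # DFSC: bits [5:0] of the ISS
    return {
        "ec": (esr >> 26) & 0x3F,    # exception class: bits [31:26]
        "il": (esr >> 25) & 0x1,     # instruction length bit: bit 25
        "iss": iss,
        "wnr": (iss >> 6) & 0x1,     # 0 = fault on a read, 1 = on a write
        "fsc": fsc,
        "fsc_name": DFSC_NAMES.get(fsc, "not in this table"),
    }


for value in (0x96000004, 0x96000010):
    fields = decode_esr(value)
    print(f"ESR {value:#010x}: EC={fields['ec']:#x} WnR={fields['wnr']} "
          f"FSC={fields['fsc']:#04x} ({fields['fsc_name']})")
```

Both values decode to EC 0x25 (data abort taken at the current exception level) on a read. The first is a translation fault, meaning the address 0xe00000 passed to strlen had no mapping; the second is an external abort, meaning the register read in wave5_vdi_readl was rejected by the bus rather than by the MMU.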

  • Hi Arthur,

    Can you elaborate more on your setup? 

    Were you running any GStreamer pipeline before powering on and off? Are you stressing the system with any other applications?

    Is it possible for you to test this with the released 10.1 SDK code, or to provide details so that we can reproduce the issue?

    Best Regards,

    Suren

  • Hello Suren,

    Correct, we did not run any GStreamer pipeline before powering on and off, but we do run an RPMsg communication service between the A53 and the MCU R5 on every power-on.

    Would running rpmsg ping-pong continuously on every power on/off cycle be a useful check?

    It looks like the 'linux,cma' size is related to the kernel paging issue.

    We will try importing the change from https://e2e.ti.com/support/processors-group/processors/f/processors-forum/1344404/am62a7-sdk-09-01-rare-boot-crash-related-to-wave5_vdi_readl to address the wave5_vdi_readl issue.

  • Hi Arthur,

    Yes please, try to import that patch and let us know how it goes.

    Best Regards,

    Suren

  • Hello Suren,

    We ran 5501 power on/off cycles with the solution from https://e2e.ti.com/support/processors-group/processors/f/processors-forum/1344404/am62a7-sdk-09-01-rare-boot-crash-related-to-wave5_vdi_readl applied.

    One power-on still failed, with the following kernel paging call trace:

     [    0.000000] OF: reserved mem: failed to allocate memory for node 'linux,cma': size 576 MiB
     [    0.020265] Unable to handle kernel paging request at virtual address 0000000000e00000
     [    0.028379] Mem abort info:
     [    0.031227]   ESR = 0x0000000096000004
     [    0.035062]   EC = 0x25: DABT (current EL), IL = 32 bits
     [    0.040493]   SET = 0, FnV = 0
     [    0.043608]   EA = 0, S1PTW = 0
     [    0.046815]   FSC = 0x04: level 0 translation fault
     [    0.051800] Data abort info:
     [    0.054736]   ISV = 0, ISS = 0x00000004
     [    0.058654]   CM = 0, WnR = 0
     [    0.061680] [0000000000e00000] user address but active_mm is swapper
     [    0.068176] Internal error: Oops: 0000000096000004 [#1] PREEMPT SMP
     [    0.074580] Modules linked in:
     [    0.077701] CPU: 0 PID: 1 Comm: swapper/0 Not tainted 6.1.80-ti-g2e423244f8c0 #1
     [    0.085261] Hardware name: Texas Instruments AM62A7 SK (DT)
     [    0.090953] pstate: 80000005 (Nzcv daif -PAN -UAO -TCO -DIT -SSBS BTYPE=--)
     [    0.098069] pc : __pi_strlen+0x14/0x150
     [    0.101993] lr : kstrdup+0x28/0x90
     [    0.105470] sp : ffff80000948b9f0
     [    0.108851] x29: ffff80000948b9f0 x28: 0000000000000000 x27: ffff8000090a00c8
     [    0.116149] x26: ffff800008fbd9c0 x25: 0000000000000000 x24: ffff000800cb4700
     [    0.123446] x23: 0000000000000000 x22: 0000000000000000 x21: ffff8000081c92b4
     [    0.130743] x20: 0000000000000cc0 x19: 0000000000e00000 x18: ffffffffffffffff
     [    0.138040] x17: 000000000000001c x16: 00000000cccf7694 x15: ffff000800cb2987
     [    0.145338] x14: ffffffffffffffff x13: ffff000800cb2985 x12: 00000000550b1709
     [    0.152635] x11: 0000000000000000 x10: 0000000000000079 x9 : 0000000000000001
     [    0.159932] x8 : 0101010101010101 x7 : ffff000800108000 x6 : 0000000000000001
     [    0.167229] x5 : 0000000000000000 x4 : 0000000000000000 x3 : 00000000000041ed
     [    0.174526] x2 : ffff800008be0000 x1 : 0000000000000cc0 x0 : 0000000000e00000
     [    0.181824] Call trace:
     [    0.184317]  __pi_strlen+0x14/0x150
     [    0.187879]  kstrdup_const+0x34/0x40
     [    0.191530]  __kernfs_new_node+0x54/0x23c
     [    0.195629]  kernfs_new_node+0x68/0x90
     [    0.199460]  kernfs_create_dir_ns+0x34/0x94
     [    0.203735]  internal_create_group+0x2ac/0x3cc
     [    0.208275]  internal_create_groups.part.0+0x4c/0xc0
     [    0.213348]  sysfs_create_groups+0x1c/0x30
     [    0.217531]  device_add+0x654/0x74c
     [    0.221095]  device_create_groups_vargs+0xe0/0x134
     [    0.225988]  device_create_with_groups+0x58/0x84
     [    0.230705]  vtconsole_class_init+0xa0/0xfc
     [    0.234982]  do_one_initcall+0x54/0x1d0
     [    0.238900]  kernel_init_freeable+0x218/0x284
     [    0.243351]  kernel_init+0x24/0x130
     [    0.246917]  ret_from_fork+0x10/0x20
     [    0.250572] Code: 92402c04 b200c3e8 f13fc09f 5400088c (a9400c02) 
     [    0.256798] ---[ end trace 0000000000000000 ]---
     [    0.261525] Kernel panic - not syncing: Attempted to kill init! exitcode=0x0000000b
     [    0.269350] SMP: stopping secondary CPUs
     [    0.273359] ---[ end Kernel panic - not syncing: Attempted to kill init! exitcode=0x0000000b ]---

    Why does this issue occur?

  • Hi Arthur,

    Are you still seeing the wave5_vdi_readl error after adding the patch? Also, are you seeing the kernel paging issue once in 5500 runs?

    Best Regards,

    Suren

  • Hello Suren,

    No, I'm not seeing the wave5_vdi_readl error after adding the patch.

    I'm seeing the kernel paging issue once in 5500 runs after adding the patch.

    Please help check this and share any ideas about the cause of the issue.

  • Hi Arthur,

    How many boards have you tested? Do they all have the same failure signature in the kernel log?

     [    0.020265] Unable to handle kernel paging request at virtual address 0000000000e00000
    ...
     [    0.098069] pc : __pi_strlen+0x14/0x150
     [    0.101993] lr : kstrdup+0x28/0x90
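
One way to answer the "same failure signature" question across many captured boot logs is to reduce each log to its first fault line plus the pc and lr symbols, then count the distinct combinations. The sketch below is only an illustration: the function names `failure_signature` and `group_by_signature` are hypothetical, and the regular expressions are written against the two oops formats shown in this thread, not against every message the kernel can print.

```python
import re
from collections import Counter

# First fault line: either the paging-request message or the "Internal error" line.
FAULT_RE = re.compile(
    r"Unable to handle kernel .*? at virtual address [0-9a-f]+"
    r"|Internal error: [^:\n]+: [0-9a-f]+"
)
# Symbol names only; the "+0x14/0x150" offset is dropped so that logs from
# slightly different builds still compare equal.
PC_RE = re.compile(r"\bpc : ([^\s+]+)")
LR_RE = re.compile(r"\blr : ([^\s+]+)")


def failure_signature(log_text):
    """Return (fault line, pc symbol, lr symbol); missing parts are None."""
    fault = FAULT_RE.search(log_text)
    pc = PC_RE.search(log_text)
    lr = LR_RE.search(log_text)
    return (
        fault.group(0) if fault else None,
        pc.group(1) if pc else None,
        lr.group(1) if lr else None,
    )


def group_by_signature(logs):
    """Count how many of the given log texts share each signature."""
    return Counter(failure_signature(text) for text in logs)
```

Feeding the text of each failed boot log into `group_by_signature` would show at a glance whether every board fails at `__pi_strlen` with the same faulting address, or whether the failures are spread over several call sites.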

  • "Looks like 'linux,cma': size is related to the kernel paging issue."

    I don't think the cma error in the kernel log is related to the kernel paging request error, but you can try to apply the following kernel patch to see if the paging issue still happens. The patch should make the cma allocation failure go away.

    diff --git a/arch/arm64/boot/dts/ti/k3-am62a7-sk.dts b/arch/arm64/boot/dts/ti/k3-am62a7-sk.dts
    index 3047586a6d9d..e5d31df198e5 100644
    --- a/arch/arm64/boot/dts/ti/k3-am62a7-sk.dts
    +++ b/arch/arm64/boot/dts/ti/k3-am62a7-sk.dts
    @@ -49,7 +49,7 @@ linux,cma {
                            compatible = "shared-dma-pool";
                            reusable;
                            size = <0x00 0x24000000>;
    -                       alloc-ranges = <0x00 0xc0000000 0x00 0x24000000>;
    +                       alloc-ranges = <0x00 0xc0000000 0x00 0x2400000>;
                            linux,cma-default;
                    };

  • Hello Bin Liu,

    After applying the patch you provided, the issue still occurs.

     [    0.000000] OF: reserved mem: failed to allocate memory for node 'linux,cma': size 576 MiB

    Why do you think alloc-ranges should be reduced to 36 MiB?

  • Hi Arthur,

    36MB is just an arbitrary smaller number; I simply removed a '0' from the original value. I want to see whether the crash still happens when the CMA allocation does not fail, though I doubt the two are related.
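
For reference, the sizes in the patch can be worked out directly from the devicetree cells. The sketch below is only arithmetic, under the assumption that each value is a pair of 32-bit cells (high word, low word) and that `alloc-ranges` is an (address, length) pair; the helper names are illustrative. It does not model the kernel's reserved-memory allocator (alignment, overlap with other reservations), so it says nothing about why the original 576 MiB window failed.

```python
MIB = 1024 * 1024


def cells_to_u64(hi: int, lo: int) -> int:
    """Combine two 32-bit devicetree cells into one 64-bit value."""
    return (hi << 32) | lo


def fits(size: int, range_len: int) -> bool:
    """A region can only be placed in a window at least as long as itself."""
    return size <= range_len


cma_size = cells_to_u64(0x00, 0x24000000)           # size = <0x00 0x24000000>
orig_range_len = cells_to_u64(0x00, 0x24000000)     # original alloc-ranges length
patched_range_len = cells_to_u64(0x00, 0x2400000)   # patched alloc-ranges length

print(f"requested CMA size:     {cma_size // MIB} MiB")
print(f"original window length: {orig_range_len // MIB} MiB, "
      f"fits: {fits(cma_size, orig_range_len)}")
print(f"patched window length:  {patched_range_len // MIB} MiB, "
      f"fits: {fits(cma_size, patched_range_len)}")
```

Because the patch shrinks only the `alloc-ranges` length and leaves `size` at 576 MiB, the requested region is larger than the 36 MiB window, which is consistent with the same "failed to allocate memory" message appearing in the follow-up test. Testing with the allocation actually succeeding would presumably need `size` reduced as well, or the window enlarged.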