J784S4XEVM: Linux kernel panic under heavy RAM usage on custom board

Part Number: J784S4XEVM
Other Parts Discussed in Thread: TDA4VH-Q1

Hello everyone,

We are working with SDK 10.0 on a custom board based on the J784S4 and are seeing Linux kernel panics under heavy RAM and CPU load. Our customer first reported the issue while developing applications on the board, and we can reproduce it with stress-ng using the command stress-ng --vm 5 --vm-bytes 2G --timeout 10m. Based on our observations, we suspect a RAM misconfiguration.

As part of our modifications, we removed two instances and reduced the total RAM from 32 GB to 16 GB. I will provide a patch detailing all the changes we made.

Additionally, we ran the memtester application as part of our diagnostics, but it did not report any errors.
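
For context, here is a rough sketch of how much of the RAM that memtester run exercised. The tested size comes from the attached memtester_board.txt and the total from the 16 GB configuration. Reading "16 GB" as 16 GiB is an assumption, and the variable names are only illustrative.

```python
# Sizes taken from the post; treating "16 GB" as 16 GiB is an assumption.
tested_bytes = 1073741824      # "got 1024MB (1073741824 bytes)" in memtester_board.txt
total_bytes = 16 * 1024**3     # configured RAM after the reduction (assumed GiB)

# Fraction of the configured RAM that one memtester pass locked and tested.
coverage = tested_bytes / total_bytes
print(f"memtester covered {coverage:.2%} of RAM")  # prints: memtester covered 6.25% of RAM
```

Under that assumption, the clean memtester result only covers a small part of the address range, so it does not rule out a problem elsewhere in RAM.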

Could you provide guidance on how to further investigate and resolve this issue? Any insights would be greatly appreciated.

Best regards,

Dušan Stanišić

memtester_board.txt
root@j784s4-evm:~# memtester 1024M 5 
memtester version 4.6.0 (64-bit)
Copyright (C) 2001-2020 Charles Cazabon.
Licensed under the GNU General Public License version 2 (only).

pagesize is 4096
pagesizemask is 0xfffffffffffff000
want 1024MB (1073741824 bytes)
got  1024MB (1073741824 bytes), trying mlock ...locked.
Loop 1/5:
  Stuck Address       : ok         
  Random Value        : ok
  Compare XOR         : ok
  Compare SUB         : ok
  Compare MUL         : ok
  Compare DIV         : ok
  Compare OR          : ok
  Compare AND         : ok
  Sequential Increment: ok
  Solid Bits          : ok         
  Block Sequential    : ok         
  Checkerboard        : ok         
  Bit Spread          : ok         
  Bit Flip            : ok         
  Walking Ones        : ok         
  Walking Zeroes      : ok         

Loop 2/5:
  Stuck Address       : ok         
  Random Value        : ok
  Compare XOR         : ok
  Compare SUB         : ok
  Compare MUL         : ok
  Compare DIV         : ok
  Compare OR          : ok
  Compare AND         : ok
  Sequential Increment: ok
  Solid Bits          : ok         
  Block Sequential    : ok         
  Checkerboard        : ok         
  Bit Spread          : ok         
  Bit Flip            : ok         
  Walking Ones        : ok         
  Walking Zeroes      : ok         

Loop 3/5:
  Stuck Address       : ok         
  Random Value        : ok
  Compare XOR         : ok
  Compare SUB         : ok
  Compare MUL         : ok
  Compare DIV         : ok
  Compare OR          : ok
  Compare AND         : ok
  Sequential Increment: ok
  Solid Bits          : ok         
  Block Sequential    : ok         
  Checkerboard        : ok         
  Bit Spread          : ok         
  Bit Flip            : ok         
  Walking Ones        : ok         
  Walking Zeroes      : ok         

Loop 4/5:
  Stuck Address       : ok         
  Random Value        : ok
  Compare XOR         : ok
  Compare SUB         : ok
  Compare MUL         : ok
  Compare DIV         : ok
  Compare OR          : ok
  Compare AND         : ok
  Sequential Increment: ok
  Solid Bits          : ok         
  Block Sequential    : ok         
  Checkerboard        : ok         
  Bit Spread          : ok         
  Bit Flip            : ok         
  Walking Ones        : ok         
  Walking Zeroes      : ok         

Loop 5/5:
  Stuck Address       : ok         
  Random Value        : ok
  Compare XOR         : ok
  Compare SUB         : ok
  Compare MUL         : ok
  Compare DIV         : ok
  Compare OR          : ok
  Compare AND         : ok
  Sequential Increment: ok
  Solid Bits          : ok         
  Block Sequential    : ok         
  Checkerboard        : ok         
  Bit Spread          : ok         
  Bit Flip            : ok         
  Walking Ones        : ok         
  Walking Zeroes      : ok         

Done.
root@j784s4-evm:~# 

linux_panic.txt
[ 3925.840110] Unable to handle kernel NULL pointer dereference at virtual address 0000000000000008
[ 3925.840410] Unable to handle kernel NULL pointer dereference at virtual address 0000000000000008
[ 3925.840650] Unable to handle kernel NULL pointer dereference at virtual address 0000000000000008
[ 3925.840664] Mem abort info:
[ 3925.840669]   ESR = 0x0000000096000046
[ 3925.840675]   EC = 0x25: DABT (current EL), IL = 32 bits
[ 3925.840681]   SET = 0, FnV = 0
[ 3925.840684]   EA = 0, S1PTW = 0
[ 3925.840686]   FSC = 0x06: level 2 translation fault
[ 3925.840689] Data abort info:
[ 3925.840690]   ISV = 0, ISS = 0x00000046, ISS2 = 0x00000000
[ 3925.840694]   CM = 0, WnR = 1, TnD = 0, TagAccess = 0
[ 3925.840697]   GCS = 0, Overlay = 0, DirtyBit = 0, Xs = 0
[ 3925.840700] user pgtable: 4k pages, 48-bit VAs, pgdp=00000008861f5000
[ 3925.840705] [0000000000000008] pgd=08000009061c5003, p4d=08000009061c5003, pud=08000008881b6003, pmd=0000000000000000
[ 3925.840724] Internal error: Oops: 0000000096000046 [#1] PREEMPT SMP
[ 3925.840734] Modules linked in: ipv6
[ 3925.840751] CPU: 3 PID: 614 Comm: stress-ng-vm Not tainted 6.6.32-ti-01301-gdb8871293143-dirty #1
[ 3925.840761] Hardware name: Texas Instruments J784S4 EVM (DT)
[ 3925.840765] pstate: 00000005 (nzcv daif -PAN -UAO -TCO -DIT -SSBS BTYPE=--)
[ 3925.840772] pc : pgtable_trans_huge_withdraw+0x40/0x68
[ 3925.840819] lr : __split_huge_pmd+0x7f0/0xa78
[ 3925.840831] sp : ffff80008326bbb0
[ 3925.840834] x29: ffff80008326bc20 x28: ffff000809c70000 x27: 0000ffff87c00000
[ 3925.840843] x26: 0000000000000002 x25: ffff000802919b40 x24: 0000000000000000
[ 3925.840851] x23: ffff000800f4df00 x22: ffff800080f5b000 x21: ffff00081ac181f0
[ 3925.840859] x20: ffff00081ac181f0 x19: ffff000800f4df00 x18: 0060000936000fc1
[ 3925.840867] x17: 0000000000000000 x16: ffff800080d13670 x15: 0001400000000000
[ 3925.840875] x14: 0001600000000000 x13: 05f6004000000000 x12: 0000000000000200
[ 3925.840884] x11: 0000000000000020 x10: 05f6000000000000 x9 : ffff80008136f000
[ 3925.840891] x8 : ffff80008326bd58 x7 : 0001000000000000 x6 : 0000000000000003
[ 3925.840899] x5 : 05f6000ffff87c00 x4 : 0000000000000002 x3 : 0000000000000000
[ 3925.840907] x2 : 0000000000000000 x1 : fffffc00206b0600 x0 : fffffc00202aa7c0
[ 3925.840915] Call trace:
[ 3925.840920]  pgtable_trans_huge_withdraw+0x40/0x68
[ 3925.840928]  do_huge_pmd_wp_page+0x1d8/0x36c
[ 3925.840933]  __handle_mm_fault+0x5cc/0xc8c
[ 3925.840946]  handle_mm_fault+0x68/0x280
[ 3925.840951]  do_page_fault+0x140/0x490
[ 3925.840960]  do_mem_abort+0x44/0x94
[ 3925.840963]  el0_da+0x30/0x88
[ 3925.840983]  el0t_64_sync_handler+0xb4/0x12c
[ 3925.840989]  el0t_64_sync+0x190/0x194
[ 3925.841003] Code: d1002042 f9000822 b4000102 a9408803 (f9000462) 
[ 3925.841009] ---[ end trace 0000000000000000 ]---
[ 3925.841014] note: stress-ng-vm[614] exited with preempt_count 1
[ 3925.841130] Unable to handle kernel NULL pointer dereference at virtual address 0000000000000008
[ 3925.841133] Unable to handle kernel NULL pointer dereference at virtual address 0000000000000008
[ 3925.841137] Mem abort info:
[ 3925.841138] Mem abort info:
[ 3925.841138]   ESR = 0x0000000096000046
[ 3925.841139]   ESR = 0x0000000096000006
[ 3925.841141]   EC = 0x25: DABT (current EL), IL = 32 bits
[ 3925.841142]   EC = 0x25: DABT (current EL), IL = 32 bits
[ 3925.841144]   SET = 0, FnV = 0
[ 3925.841145]   SET = 0, FnV = 0
[ 3925.841146]   EA = 0, S1PTW = 0
[ 3925.841147]   EA = 0, S1PTW = 0
[ 3925.841148]   FSC = 0x06: level 2 translation fault
[ 3925.841149]   FSC = 0x06: level 2 translation fault
[ 3925.841150] Data abort info:
[ 3925.841152]   ISV = 0, ISS = 0x00000046, ISS2 = 0x00000000
[ 3925.841153] Data abort info:
[ 3925.841154]   CM = 0, WnR = 1, TnD = 0, TagAccess = 0
[ 3925.841155]   ISV = 0, ISS = 0x00000006, ISS2 = 0x00000000
[ 3925.841156]   GCS = 0, Overlay = 0, DirtyBit = 0, Xs = 0
[ 3925.841158]   CM = 0, WnR = 0, TnD = 0, TagAccess = 0
[ 3925.841160] user pgtable: 4k pages, 48-bit VAs, pgdp=00000008861f5000
[ 3925.841161]   GCS = 0, Overlay = 0, DirtyBit = 0, Xs = 0
[ 3925.841163] [0000000000000008] pgd=08000009061c5003
[ 3925.841164] user pgtable: 4k pages, 48-bit VAs, pgdp=0000000889c0e000
[ 3925.841165] , p4d=08000009061c5003
[ 3925.841167] [0000000000000008] pgd=0800000884304003
[ 3925.841169] , pud=08000008881b6003
[ 3925.841170] , p4d=0800000884304003
[ 3925.841171] , pmd=0000000000000000
[ 3925.841173] , pud=080000088abd9003
[ 3925.841174] 
[ 3925.841175] , pmd=0000000000000000
[ 3925.841175] Internal error: Oops: 0000000096000046 [#2] PREEMPT SMP
[ 3925.841177] 
[ 3925.841178] Modules linked in: ipv6
[ 3925.841184] CPU: 3 PID: 614 Comm: stress-ng-vm Tainted: G      D            6.6.32-ti-01301-gdb8871293143-dirty #1
[ 3925.841189] Hardware name: Texas Instruments J784S4 EVM (DT)
[ 3925.841191] pstate: 600000c5 (nZCv daIF -PAN -UAO -TCO -DIT -SSBS BTYPE=--)
[ 3925.841194] pc : folio_activate_fn+0x70/0x26c
[ 3925.841208] lr : folio_batch_move_lru+0xb4/0x1c0
[ 3925.841214] sp : ffff80008326b3c0
[ 3925.841216] x29: ffff80008326b3c0 x28: 0020000884ee5fc3 x27: ffff80008326b688
[ 3925.841224] x26: ffff800081352080 x25: ffff8000801c4abc x24: ffff8000801c42f0
[ 3925.841232] x23: ffff000b0df54c50 x22: ffff0008001c3000 x21: 0000000000000001
[ 3925.841239] x20: 0000000000000001 x19: fffffc0020120f00 x18: ffff80008326b648
[ 3925.841247] x17: 72646461206c6175 x16: ffff800080d13670 x15: 000000000000000f
[ 3925.841255] x14: 0000000000000000 x13: 1fffe00100042901 x12: ffff80008326b648
[ 3925.841263] x11: 6874697720646574 x10: ffff000800214808 x9 : ffff00080021480c
[ 3925.841270] x8 : 0000000000000000 x7 : 0000000000000012 x6 : ffff800a8ced1000
[ 3925.841277] x5 : ffff000b0df54c50 x4 : 0000000000000000 x3 : 0000000000000000
[ 3925.841284] x2 : 0000000000000000 x1 : dead000000000100 x0 : 0000000000000000
[ 3925.841291] Call trace:
[ 3925.841293]  folio_activate_fn+0x70/0x26c
[ 3925.841298]  folio_batch_move_lru+0xb4/0x1c0
[ 3925.841304]  folio_activate+0xa4/0xec
[ 3925.841309]  folio_mark_accessed+0x78/0x158
[ 3925.841314]  mark_page_accessed+0x20/0x2c
[ 3925.841319]  unmap_page_range+0x4c0/0x958
[ 3925.841326]  unmap_single_vma.isra.0+0x48/0x84
[ 3925.841331]  unmap_vmas+0x64/0x118
[ 3925.841335]  exit_mmap+0xbc/0x26c
[ 3925.841339]  __mmput+0x30/0x148
[ 3925.841346]  mmput+0x50/0x5c
[ 3925.841350]  do_exit+0x28c/0x8c4
[ 3925.841358]  make_task_dead+0x84/0x17c
[ 3925.841362]  arm64_force_sig_fault+0x0/0x70
[ 3925.841370]  die_kernel_fault+0x1bc/0x3a4
[ 3925.841373]  __do_kernel_fault+0x130/0x180
[ 3925.841377]  do_page_fault+0xc0/0x490
[ 3925.841381]  do_translation_fault+0x9c/0xa8
[ 3925.841385]  do_mem_abort+0x44/0x94
[ 3925.841388]  el1_abort+0x40/0x64
[ 3925.841393]  el1h_64_sync_handler+0xa4/0xe4
[ 3925.841398]  el1h_64_sync+0x64/0x68
[ 3925.841402]  pgtable_trans_huge_withdraw+0x40/0x68
[ 3925.841407]  do_huge_pmd_wp_page+0x1d8/0x36c
[ 3925.841412]  __handle_mm_fault+0x5cc/0xc8c
[ 3925.841420]  handle_mm_fault+0x68/0x280
[ 3925.841425]  do_page_fault+0x140/0x490
[ 3925.841428]  do_mem_abort+0x44/0x94
[ 3925.841432]  el0_da+0x30/0x88
[ 3925.841437]  el0t_64_sync_handler+0xb4/0x12c
[ 3925.841442]  el0t_64_sync+0x190/0x194
[ 3925.841448] Code: a9408a63 53134e94 52000294 53082000 (f9000462) 
[ 3925.841451] ---[ end trace 0000000000000000 ]---
[ 3925.841453] note: stress-ng-vm[614] exited with irqs disabled
[ 3925.841453] Internal error: Oops: 0000000096000006 [#3] PREEMPT SMP
[ 3925.841457] note: stress-ng-vm[614] exited with preempt_count 3
[ 3925.841458] Modules linked in:
[ 3925.841459] Fixing recursive fault but reboot is needed!
[ 3925.841460]  ipv6
[ 3925.841465] ------------[ cut here ]------------
[ 3925.841463] CPU: 1 PID: 612 Comm: stress-ng-vm Tainted: G      D            6.6.32-ti-01301-gdb8871293143-dirty #1
[ 3925.841467] Voluntary context switch within RCU read-side critical section!
[ 3925.841469] Hardware name: Texas Instruments J784S4 EVM (DT)
[ 3925.841472] pstate: 80000005 (Nzcv daif -PAN -UAO -TCO -DIT -SSBS BTYPE=--)
[ 3925.841476] pc : pgtable_trans_huge_withdraw+0x24/0x68
[ 3925.841488] lr : __split_huge_pmd+0x7f0/0xa78
[ 3925.841494] sp : ffff800083243bb0
[ 3925.841496] x29: ffff800083243c20 x28: ffff0008067ae3c0 x27: 0000ffff87c00000
[ 3925.841491] WARNING: CPU: 3 PID: 614 at /kernel/rcu/tree_plugin.h:320 rcu_note_context_switch+0x3b0/0x408
[ 3925.841505] x26: 0000000000000002 x25: ffff000802ad2140
[ 3925.841508] Modules linked in:
[ 3925.841509]  x24: 0000000000000000
[ 3925.841512]  ipv6
[ 3925.841513] 
[ 3925.841514] 
[ 3925.841515] x23: ffff0008040550c0 x22: ffff800080f5b000
[ 3925.841518] CPU: 3 PID: 614 Comm: stress-ng-vm Tainted: G      D            6.6.32-ti-01301-gdb8871293143-dirty #1
[ 3925.841520]  x21: ffff0008063ec1f0
[ 3925.841523] Hardware name: Texas Instruments J784S4 EVM (DT)
[ 3925.841523] x20: ffff0008063ec1f0 x19: ffff0008040550c0
[ 3925.841525] pstate: 600000c5 (nZCv daIF -PAN -UAO -TCO -DIT -SSBS BTYPE=--)
[ 3925.841529]  x18: 0060000936000fc1
[ 3925.841529] pc : rcu_note_context_switch+0x3b0/0x408
[ 3925.841532] x17: 0000000000000000 x16: ffff800080d13670 x15: 0001400000000000
[ 3925.841536] lr : rcu_note_context_switch+0x3b0/0x408
[ 3925.841539] x14: 0001600000000000 x13: 05ee004000000000
[ 3925.841541] sp : ffff80008326af20
[ 3925.841543]  x12: 0000000000000200
[ 3925.841544] x29: ffff80008326af20
[ 3925.841545] 
[ 3925.841546]  x28: ffff000809c70000
[ 3925.841546] x11: 0000000000000020
[ 3925.841548]  x27: ffff80008326b688
[ 3925.841549]  x10: 05ee000000000000
[ 3925.841551] 
[ 3925.841552]  x9 : ffff80008136f000
[ 3925.841553] x26: ffff800080e84df8
[ 3925.841554] 
[ 3925.841555]  x25: 0000000000000000
[ 3925.841555] x8 : ffff800083243d58
[ 3925.841557]  x24: ffff80008007d660
[ 3925.841557]  x7 : 0001000000000000
[ 3925.841559] 
[ 3925.841559]  x6 : 0000000000000003
[ 3925.841561] x23: 0000000000000000
[ 3925.841562] 
[ 3925.841563]  x22: ffff000809c70000
[ 3925.841564] x5 : 05ee000ffff87c00
[ 3925.841565]  x21: ffff000809c70000
[ 3925.841566]  x4 : 0000000000000002
[ 3925.841567] 
[ 3925.841568]  x3 : 0000000000000000
[ 3925.841568] x20: ffff800081086e80
[ 3925.841570] 
[ 3925.841570]  x19: ffff000b0df57e80
[ 3925.841571] x2 : 0001000000000000
[ 3925.841572]  x18: fffffffffffeca90
[ 3925.841573]  x1 : fffffc002018fb00
[ 3925.841574] 
[ 3925.841576] x17: 72646461206c6175
[ 3925.841576]  x0 : 0000000000000000
[ 3925.841578]  x16: 7472697620746120 x15: 0000000000000048
[ 3925.841580] Call trace:
[ 3925.841582] 
[ 3925.841584] x14: fffffffffffecad8 x13: 216e6f6974636573
[ 3925.841583]  pgtable_trans_huge_withdraw+0x24/0x68
[ 3925.841587]  x12: 206c616369746972
[ 3925.841590] x11: 6320656469732d64
[ 3925.841589]  do_huge_pmd_wp_page+0x1d8/0x36c
[ 3925.841592]  x10: 6165722055435220 x9 : 206e696874697720
[ 3925.841594]  __handle_mm_fault+0x5cc/0xc8c
[ 3925.841597] x8 : 6863746977732074 x7 : 7865746e6f632079 x6 : 0000000000000000
[ 3925.841601]  handle_mm_fault+0x68/0x280
[ 3925.841605] x5 : ffff000b0df4fb88 x4 : 0000000000000000
[ 3925.841607]  do_page_fault+0x140/0x490
[ 3925.841609]  x3 : 0000000000000027
[ 3925.841611]  do_mem_abort+0x44/0x94
[ 3925.841612] x2 : 0000000000000000 x1 : 0000000000000000 x0 : ffff000809c70000
[ 3925.841615]  el0_da+0x30/0x88
[ 3925.841620] Call trace:
[ 3925.841621]  rcu_note_context_switch+0x3b0/0x408
[ 3925.841622]  el0t_64_sync_handler+0xb4/0x12c
[ 3925.841627]  __schedule+0xa0/0x9c4
[ 3925.841629]  el0t_64_sync+0x190/0x194
[ 3925.841634] Code: b9402820 34000260 f9400820 aa0003e3 (f8408c62) 
[ 3925.841632]  do_task_dead+0x48/0x4c
[ 3925.841640] ---[ end trace 0000000000000000 ]---
[ 3925.841642] note: stress-ng-vm[612] exited with preempt_count 1
[ 3925.841643]  make_task_dead+0x148/0x17c
[ 3925.841647]  arm64_force_sig_fault+0x0/0x70
[ 3925.841652]  die_kernel_fault+0x1bc/0x3a4
[ 3925.841657]  __do_kernel_fault+0x130/0x180
[ 3925.841661]  do_page_fault+0xc0/0x490
[ 3925.841664]  do_translation_fault+0x9c/0xa8
[ 3925.841668]  do_mem_abort+0x44/0x94
[ 3925.841671]  el1_abort+0x40/0x64
[ 3925.841677]  el1h_64_sync_handler+0xa4/0xe4
[ 3925.841683]  el1h_64_sync+0x64/0x68
[ 3925.841688]  folio_activate_fn+0x70/0x26c
[ 3925.841693] ------------[ cut here ]------------
[ 3925.841694]  folio_batch_move_lru+0xb4/0x1c0
[ 3925.841695] WARNING: CPU: 1 PID: 612 at /mm/slab_common.c:994 free_large_kmalloc+0x6c/0xa0
[ 3925.841699]  folio_activate+0xa4/0xec
[ 3925.841703] Modules linked in: ipv6
[ 3925.841703]  folio_mark_accessed+0x78/0x158
[ 3925.841707] 
[ 3925.841708]  mark_page_accessed+0x20/0x2c
[ 3925.841709] CPU: 1 PID: 612 Comm: stress-ng-vm Tainted: G      D            6.6.32-ti-01301-gdb8871293143-dirty #1
[ 3925.841712] Hardware name: Texas Instruments J784S4 EVM (DT)
[ 3925.841714] pstate: 40000005 (nZcv daif -PAN -UAO -TCO -DIT -SSBS BTYPE=--)
[ 3925.841712]  unmap_page_range+0x4c0/0x958
[ 3925.841718] pc : free_large_kmalloc+0x6c/0xa0
[ 3925.841719]  unmap_single_vma.isra.0+0x48/0x84
[ 3925.841722] lr : kfree+0x68/0x6c
[ 3925.841723]  unmap_vmas+0x64/0x118
[ 3925.841725] sp : ffff800083243730
[ 3925.841727] x29: ffff800083243730
[ 3925.841728]  exit_mmap+0xbc/0x26c
[ 3925.841729]  x28: ffff0008067ae3c0 x27: 0000ffff87c00000
[ 3925.841732]  __mmput+0x30/0x148
[ 3925.841733] 
[ 3925.841735] x26: ffff800080e84df8
[ 3925.841735]  mmput+0x50/0x5c
[ 3925.841737]  x25: 0000000000000000 x24: 0000000000000000
[ 3925.841743] x23: ffff800080e88700
[ 3925.841741]  do_exit+0x28c/0x8c4
[ 3925.841745]  x22: 0000000000000001 x21: 0000000000000000
[ 3925.841747]  make_task_dead+0x84/0x17c
[ 3925.841750] 
[ 3925.841751] x20: ffff0008067b3400
[ 3925.841750]  arm64_force_sig_fault+0x0/0x70
[ 3925.841753]  x19: fffffc002019ecc0 x18: ffff800083243718
[ 3925.841755]  die_kernel_fault+0x1bc/0x3a4
[ 3925.841757] 
[ 3925.841759] x17: 72646461206c6175
[ 3925.841760]  __do_kernel_fault+0x130/0x180
[ 3925.841761]  x16: ffff800080d13670 x15: 000000000000000f
[ 3925.841764]  do_page_fault+0xc0/0x490
[ 3925.841765] 
[ 3925.841767] x14: 0000000000000000 x13: 1fffe00100510fc1
[ 3925.841768]  do_translation_fault+0x9c/0xa8
[ 3925.841771]  x12: ffff800083243718
[ 3925.841772]  do_mem_abort+0x44/0x94
[ 3925.841775] x11: 6874697720646574 x10: ffff000802887e08
[ 3925.841776]  el1_abort+0x40/0x64
[ 3925.841779]  x9 : ffff000802887e0c
[ 3925.841782]  el1h_64_sync_handler+0xa4/0xe4
[ 3925.841784] x8 : 0000000000000000 x7 : 0000000000000000
[ 3925.841787]  el1h_64_sync+0x64/0x68
[ 3925.841788]  x6 : 0000000000000003
[ 3925.841791] x5 : 0000000007531da1
[ 3925.841791]  pgtable_trans_huge_withdraw+0x40/0x68
[ 3925.841793]  x4 : 0000000000010364 x3 : ffffffffffffffff
[ 3925.841795]  do_huge_pmd_wp_page+0x1d8/0x36c
[ 3925.841797] 
[ 3925.841799] x2 : fffffc002019ecc0 x1 : ffff0008067b3400
[ 3925.841800]  __handle_mm_fault+0x5cc/0xc8c
[ 3925.841803]  x0 : 0000000000000000
[ 3925.841806] Call trace:
[ 3925.841806]  handle_mm_fault+0x68/0x280
[ 3925.841808]  free_large_kmalloc+0x6c/0xa0
[ 3925.841811]  do_page_fault+0x140/0x490
[ 3925.841812]  kfree+0x68/0x6c
[ 3925.841814]  do_mem_abort+0x44/0x94
[ 3925.841818]  el0_da+0x30/0x88
[ 3925.841815]  __audit_free+0x8c/0x170
[ 3925.841823]  el0t_64_sync_handler+0xb4/0x12c
[ 3925.841824]  do_exit+0x76c/0x8c4
[ 3925.841827]  el0t_64_sync+0x190/0x194
[ 3925.841830]  make_task_dead+0x84/0x17c
[ 3925.841830] ---[ end trace 0000000000000000 ]---
[ 3925.841833]  arm64_force_sig_fault+0x0/0x70
[ 3925.841839]  die_kernel_fault+0x1bc/0x3a4
[ 3925.841842]  __do_kernel_fault+0x130/0x180
[ 3925.841846]  do_page_fault+0xc0/0x490
[ 3925.841850]  do_translation_fault+0x9c/0xa8
[ 3925.841854]  do_mem_abort+0x44/0x94
[ 3925.841857]  el1_abort+0x40/0x64
[ 3925.841863]  el1h_64_sync_handler+0xa4/0xe4
[ 3925.841868]  el1h_64_sync+0x64/0x68
[ 3925.841871]  pgtable_trans_huge_withdraw+0x24/0x68
[ 3925.841877]  do_huge_pmd_wp_page+0x1d8/0x36c
[ 3925.841882]  __handle_mm_fault+0x5cc/0xc8c
[ 3925.841888]  handle_mm_fault+0x68/0x280
[ 3925.841894]  do_page_fault+0x140/0x490
[ 3925.841897]  do_mem_abort+0x44/0x94
[ 3925.841903]  el0_da+0x30/0x88
[ 3925.841909]  el0t_64_sync_handler+0xb4/0x12c
[ 3925.841915]  el0t_64_sync+0x190/0x194
[ 3925.841919] ---[ end trace 0000000000000000 ]---
[ 3925.841922] object pointer: 0x00000000332820ed
[ 3925.842202] Unable to handle kernel NULL pointer dereference at virtual address 0000000000000038
[ 3925.842206] Mem abort info:
[ 3925.842209]   ESR = 0x0000000096000006
[ 3925.842212]   EC = 0x25: DABT (current EL), IL = 32 bits
[ 3925.842215]   SET = 0, FnV = 0
[ 3925.842217]   EA = 0, S1PTW = 0
[ 3925.842220]   FSC = 0x06: level 2 translation fault
[ 3925.842222] Data abort info:
[ 3925.842225]   ISV = 0, ISS = 0x00000006, ISS2 = 0x00000000
[ 3925.842227]   CM = 0, WnR = 0, TnD = 0, TagAccess = 0
[ 3925.842230]   GCS = 0, Overlay = 0, DirtyBit = 0, Xs = 0
[ 3925.842232] user pgtable: 4k pages, 48-bit VAs, pgdp=000000088435b000
[ 3925.842237] [0000000000000038] pgd=0800000884352003, p4d=0800000884352003, pud=0800000884354003, pmd=0000000000000000
[ 3925.842248] Internal error: Oops: 0000000096000006 [#4] PREEMPT SMP
[ 3925.842251] Modules linked in: ipv6
[ 3925.842257] CPU: 2 PID: 157 Comm: systemd-journal Tainted: G      D W          6.6.32-ti-01301-gdb8871293143-dirty #1
[ 3925.842263] Hardware name: Texas Instruments J784S4 EVM (DT)
[ 3925.842265] pstate: 400000c5 (nZcv daIF -PAN -UAO -TCO -DIT -SSBS BTYPE=--)
[ 3925.842270] pc : memcg_slab_post_alloc_hook+0x88/0x204
[ 3925.842278] lr : kmem_cache_alloc+0x12c/0x234
[ 3925.842282] sp : ffff800082453b80
[ 3925.842284] x29: ffff800082453b80 x28: 0000000000000000 x27: 0001000000000000
[ 3925.842292] x26: fffffc0000000000 x25: ffff800081352080 x24: 0000000000000001
[ 3925.842298] x23: ffff800082453bf0 x22: ffff0008041b4880 x21: 0000000000000000
[ 3925.842306] x20: ffff00080004bcc0 x19: ffff00080004bcc0 x18: 0000000000000000
[ 3925.842314] x17: 0000000000000000 x16: 0000000000000000 x15: 0000000000000000
[ 3925.842320] x14: 0000000000000000 x13: 0000000000000000 x12: 0000000000000000
[ 3925.842329] x11: 0000000000000000 x10: 0000000000000000 x9 : 0000000000000001
[ 3925.842337] x8 : 0000000000000001 x7 : ffff0008072ba000 x6 : ffff800a8cebd000
[ 3925.842344] x5 : ffff0008019b63c0 x4 : 0000000000801a65 x3 : 0000000000000001
[ 3925.842352] x2 : ffffffffffffffff x1 : 0000000000000025 x0 : ffff000801a65af0
[ 3925.842359] Call trace:
[ 3925.842362]  memcg_slab_post_alloc_hook+0x88/0x204
[ 3925.842367]  kmem_cache_alloc+0x12c/0x234
[ 3925.842371]  __sigqueue_alloc+0x80/0x120
[ 3925.842379]  __send_signal_locked+0x1c0/0x2d4
[ 3925.842386]  send_signal_locked+0xdc/0x130
[ 3925.842391]  force_sig_info_to_task+0x98/0x158
[ 3925.842396]  force_sig_fault+0x64/0x98
[ 3925.842401]  arm64_notify_die+0x7c/0xcc
[ 3925.842407]  force_signal_inject+0x94/0xe0
[ 3925.842412]  do_el0_undef+0xa4/0x170
[ 3925.842417]  el0_undef+0x2c/0x84
[ 3925.842424]  el0t_64_sync_handler+0xd8/0x12c
[ 3925.842429]  el0t_64_sync+0x190/0x194
[ 3925.842434] Code: 9a82039c f9400381 7215003f 9a9f139c (f9401f81) 
[ 3925.842436] ---[ end trace 0000000000000000 ]---
[ 3925.842440] note: systemd-journal[157] exited with irqs disabled
[ 3925.842442] note: systemd-journal[157] exited with preempt_count 1
[ 3925.847040] Unable to handle kernel NULL pointer dereference at virtual address 0000000000000008
[ 3925.847051] Mem abort info:
[ 3925.847053]   ESR = 0x0000000096000046
[ 3925.847055]   EC = 0x25: DABT (current EL), IL = 32 bits
[ 3925.847059]   SET = 0, FnV = 0
[ 3925.847061]   EA = 0, S1PTW = 0
[ 3925.847063]   FSC = 0x06: level 2 translation fault
[ 3925.847065] Data abort info:
[ 3925.847066]   ISV = 0, ISS = 0x00000046, ISS2 = 0x00000000
[ 3925.847069]   CM = 0, WnR = 1, TnD = 0, TagAccess = 0
[ 3925.847071]   GCS = 0, Overlay = 0, DirtyBit = 0, Xs = 0
[ 3925.847074] user pgtable: 4k pages, 48-bit VAs, pgdp=0000000889daa000
[ 3925.847077] [0000000000000008] pgd=080000088667a003, p4d=080000088667a003, pud=08000008866cf003, pmd=0000000000000000
[ 3925.847089] Internal error: Oops: 0000000096000046 [#5] PREEMPT SMP
[ 3925.847093] Modules linked in: ipv6
[ 3925.847102] CPU: 7 PID: 611 Comm: stress-ng-vm Tainted: G      D W          6.6.32-ti-01301-gdb8871293143-dirty #1
[ 3925.847106] Hardware name: Texas Instruments J784S4 EVM (DT)
[ 3925.847108] pstate: 00000005 (nzcv daif -PAN -UAO -TCO -DIT -SSBS BTYPE=--)
[ 3925.847113] pc : pgtable_trans_huge_withdraw+0x40/0x68
[ 3925.847129] lr : __split_huge_pmd+0x7f0/0xa78
[ 3925.847136] sp : ffff8000827e3bb0
[ 3925.847138] x29: ffff8000827e3c20 x28: ffff000802265580 x27: 0000ffff85a00000
[ 3925.847144] x26: 0000000000000002 x25: ffff00080b7a0e60 x24: 0000000000000000
[ 3925.847150] x23: ffff000801abd580 x22: ffff800080f5b000 x21: ffff000800392168
[ 3925.847156] x20: ffff000800392168 x19: ffff000801abd580 x18: 0060000936000fc1
[ 3925.847161] x17: 0000000000000000 x16: ffff800080d13670 x15: 0001400000000000
[ 3925.847167] x14: 0001600000000000 x13: 05f4004000000000 x12: 0000000000000200
[ 3925.847172] x11: 0000000000000020 x10: 05f4000000000000 x9 : ffff80008136f000
[ 3925.847178] x8 : ffff8000827e3d58 x7 : 0001000000000000 x6 : 0000000000000003
[ 3925.847183] x5 : 05f4000ffff85a00 x4 : 0000000000000002 x3 : 0000000000000000
[ 3925.847188] x2 : fffffc00202a5cc8 x1 : fffffc002000e480 x0 : fffffc0020273300
[ 3925.847194] Call trace:
[ 3925.847196]  pgtable_trans_huge_withdraw+0x40/0x68
[ 3925.847201]  do_huge_pmd_wp_page+0x1d8/0x36c
[ 3925.847206]  __handle_mm_fault+0x5cc/0xc8c
[ 3925.847214]  handle_mm_fault+0x68/0x280
[ 3925.847219]  do_page_fault+0x140/0x490
[ 3925.847225]  do_mem_abort+0x44/0x94
[ 3925.847228]  el0_da+0x30/0x88
[ 3925.847238]  el0t_64_sync_handler+0xb4/0x12c
[ 3925.847243]  el0t_64_sync+0x190/0x194
[ 3925.847249] Code: d1002042 f9000822 b4000102 a9408803 (f9000462) 
[ 3925.847253] ---[ end trace 0000000000000000 ]---
[ 3925.847256] note: stress-ng-vm[611] exited with preempt_count 1
[ 3925.847297] Unable to handle kernel NULL pointer dereference at virtual address 0000000000000038
[ 3925.847299] Mem abort info:
[ 3925.847301]   ESR = 0x0000000096000006
[ 3925.847303]   EC = 0x25: DABT (current EL), IL = 32 bits
[ 3925.847305]   SET = 0, FnV = 0
[ 3925.847306]   EA = 0, S1PTW = 0
[ 3925.847308]   FSC = 0x06: level 2 translation fault
[ 3925.847310] Data abort info:
[ 3925.847312]   ISV = 0, ISS = 0x00000006, ISS2 = 0x00000000
[ 3925.847314]   CM = 0, WnR = 0, TnD = 0, TagAccess = 0
[ 3925.847316]   GCS = 0, Overlay = 0, DirtyBit = 0, Xs = 0
[ 3925.847318] user pgtable: 4k pages, 48-bit VAs, pgdp=0000000889daa000
[ 3925.847320] [0000000000000038] pgd=080000088667a003, p4d=080000088667a003, pud=08000008866cf003, pmd=0000000000000000
[ 3925.847327] Internal error: Oops: 0000000096000006 [#6] PREEMPT SMP
[ 3925.847330] Modules linked in: ipv6
[ 3925.847333] CPU: 7 PID: 611 Comm: stress-ng-vm Tainted: G      D W          6.6.32-ti-01301-gdb8871293143-dirty #1
[ 3925.847337] Hardware name: Texas Instruments J784S4 EVM (DT)
[ 3925.847338] pstate: 40000005 (nZcv daif -PAN -UAO -TCO -DIT -SSBS BTYPE=--)
[ 3925.847342] pc : kmem_cache_free+0x68/0x2c0
[ 3925.847347] lr : __khugepaged_exit+0xd8/0x1a8
[ 3925.847352] sp : ffff8000827e36f0
[ 3925.847354] x29: ffff8000827e36f0 x28: ffff000802265580 x27: 0000ffff85a00000
[ 3925.847360] x26: ffff800080e84df8 x25: 0000000000000000 x24: ffff000802265ca0
[ 3925.847365] x23: ffff000801abd620 x22: ffff80008024c5c4 x21: 0000000000000000
[ 3925.847371] x20: ffff00080b5d4000 x19: ffff0008000f9b40 x18: ffff8000827e3718
[ 3925.847376] x17: 72646461206c6175 x16: ffff800080d13670 x15: 000000000000000f
[ 3925.847382] x14: 0000000000000000 x13: 1fffe00100551921 x12: ffff8000827e3718
[ 3925.847388] x11: 6874697720646574 x10: ffff000802a8c908 x9 : ffff000802a8c90c
[ 3925.847393] x8 : 0000000000000000 x7 : ffff000801abd620 x6 : ffff800a8cf21000
[ 3925.847399] x5 : 0000000000000000 x4 : dead000000000122 x3 : ffff000801abd968
[ 3925.847404] x2 : 0000000000000000 x1 : ffffffffffffffff x0 : 0000000000000094
[ 3925.847409] Call trace:
[ 3925.847411]  kmem_cache_free+0x68/0x2c0
[ 3925.847415]  __khugepaged_exit+0xd8/0x1a8
[ 3925.847419]  __mmput+0x118/0x148
[ 3925.847426]  mmput+0x50/0x5c
[ 3925.847429]  do_exit+0x28c/0x8c4
[ 3925.847435]  make_task_dead+0x84/0x17c
[ 3925.847438]  arm64_force_sig_fault+0x0/0x70
[ 3925.847444]  die_kernel_fault+0x1bc/0x3a4
[ 3925.847448]  __do_kernel_fault+0x130/0x180
[ 3925.847451]  do_page_fault+0xc0/0x490
[ 3925.847454]  do_translation_fault+0x9c/0xa8
[ 3925.847457]  do_mem_abort+0x44/0x94
[ 3925.847460]  el1_abort+0x40/0x64
[ 3925.847465]  el1h_64_sync_handler+0xa4/0xe4
[ 3925.847470]  el1h_64_sync+0x64/0x68
[ 3925.847473]  pgtable_trans_huge_withdraw+0x40/0x68
[ 3925.847478]  do_huge_pmd_wp_page+0x1d8/0x36c
[ 3925.847482]  __handle_mm_fault+0x5cc/0xc8c
[ 3925.847487]  handle_mm_fault+0x68/0x280
[ 3925.847492]  do_page_fault+0x140/0x490
[ 3925.847495]  do_mem_abort+0x44/0x94
[ 3925.847498]  el0_da+0x30/0x88
[ 3925.847503]  el0t_64_sync_handler+0xb4/0x12c
[ 3925.847507]  el0t_64_sync+0x190/0x194
[ 3925.847511] Code: f94002a0 7215001f 9a9f12b5 d503201f (f9401ea1) 
[ 3925.847513] ---[ end trace 0000000000000000 ]---
[ 3925.847515] Fixing recursive fault but reboot is needed!
[ 3925.848960] Mem abort info:
[ 3925.848961]   ESR = 0x0000000096000006
[ 3925.857730] Mem abort info:
[ 3925.866493]   EC = 0x25: DABT (current EL), IL = 32 bits
[ 3925.869272]   ESR = 0x0000000096000046
[ 3925.873009]   SET = 0, FnV = 0
[ 3925.878304]   EC = 0x25: DABT (current EL), IL = 32 bits
[ 3925.881343]   EA = 0, S1PTW = 0
[ 3925.884471]   SET = 0, FnV = 0
[ 3925.889332]   FSC = 0x06: level 2 translation fault
[ 3925.892199]   EA = 0, S1PTW = 0
[ 3925.897679] Data abort info:
[ 3925.897681]   ISV = 0, ISS = 0x00000006, ISS2 = 0x00000000
[ 3925.902716]   FSC = 0x06: level 2 translation fault
[ 3925.908011]   CM = 0, WnR = 0, TnD = 0, TagAccess = 0
[ 3925.914433] Data abort info:
[ 3925.925014]   GCS = 0, Overlay = 0, DirtyBit = 0, Xs = 0
[ 3925.931284]   ISV = 0, ISS = 0x00000046, ISS2 = 0x00000000
[ 3925.934742] user pgtable: 4k pages, 48-bit VAs, pgdp=0000000889c30000
[ 3925.943590]   CM = 0, WnR = 1, TnD = 0, TagAccess = 0
[ 3925.949233] [0000000000000008] pgd=0800000889d0e003
[ 3925.956174]   GCS = 0, Overlay = 0, DirtyBit = 0, Xs = 0
[ 3925.961305] , p4d=0800000889d0e003
[ 3925.965639] user pgtable: 4k pages, 48-bit VAs, pgdp=000000088b7b0000
[ 3925.968941] , pud=0800000888a3b003
[ 3925.976056] [0000000000000008] pgd=080000088a506003
[ 3925.983171] , pmd=0000000000000000
[ 3925.990287] , p4d=080000088a506003
[ 3925.997402] 
[ 3925.997403] Internal error: Oops: 0000000096000006 [#7] PREEMPT SMP
[ 3926.004518] , pud=0800000886634003
[ 3926.011627] Modules linked in: ipv6
[ 3926.018744] , pmd=0000000000000000
[ 3926.025854] CPU: 4 PID: 613 Comm: stress-ng-vm Tainted: G      D W          6.6.32-ti-01301-gdb8871293143-dirty #1
[ 3926.025858] Hardware name: Texas Instruments J784S4 EVM (DT)
[ 3926.025860] pstate: 80000005 (Nzcv daif -PAN -UAO -TCO -DIT -SSBS BTYPE=--)
[ 3926.032975] 
[ 3926.040083] pc : pgtable_trans_huge_withdraw+0x24/0x68
[ 3926.095627] Unable to handle kernel NULL pointer dereference at virtual address 0000000000000038
[ 3926.102819] lr : __split_huge_pmd+0x7f0/0xa78
[ 3926.111587] Mem abort info:
[ 3926.114362] sp : ffff80008325bbb0
[ 3926.117142]   ESR = 0x0000000096000006
[ 3926.120873] x29: ffff80008325bc20
[ 3926.124607]   EC = 0x25: DABT (current EL), IL = 32 bits
[ 3926.129899]  x28: ffff000808344740
[ 3926.135194]   SET = 0, FnV = 0
[ 3926.138233]  x27: 0000ffff85800000
[ 3926.141272]   EA = 0, S1PTW = 0
[ 3926.144397] 
[ 3926.147523]   FSC = 0x06: level 2 translation fault
[ 3926.152381] x26: 0000000000000002
[ 3926.157241] Data abort info:
[ 3926.160106]  x25: ffff0008062725a0
[ 3926.165574]   ISV = 0, ISS = 0x00000006, ISS2 = 0x00000000
[ 3926.168440]  x24: 0000000000000000
[ 3926.173474]   CM = 0, WnR = 0, TnD = 0, TagAccess = 0
[ 3926.178938] 
[ 3926.184233]   GCS = 0, Overlay = 0, DirtyBit = 0, Xs = 0
[ 3926.189264] x23: ffff0008040cda40
[ 3926.195685] user pgtable: 4k pages, 48-bit VAs, pgdp=000000090613f000
[ 3926.200978]  x22: ffff800080f5b000
[ 3926.205839] [0000000000000038] pgd=080000089adc5003
[ 3926.212257]  x21: ffff0008047af160
[ 3926.215644] , p4d=080000089adc5003
[ 3926.220503] 
[ 3926.223890] , pud=080000088a9da003
[ 3926.227274] x20: ffff0008047af160
[ 3926.230661] , pmd=0000000000000000
[ 3926.234046]  x19: ffff0008040cda40
[ 3926.235526] 
[ 3926.238910]  x18: 0060000936000fc1
[ 3928.359668] x17: 0000000000000000 x16: ffff800080d13670 x15: 0001400000000000
[ 3928.366785] x14: 0001600000000000 x13: 05f2004000000000 x12: 0000000000000200
[ 3928.373903] x11: 0000000000000020 x10: 05f2000000000000 x9 : ffff80008136f000
[ 3928.381021] x8 : ffff80008325bd58 x7 : 0001000000000000 x6 : 0000000000000003
[ 3928.388138] x5 : 05f2000ffff85800 x4 : 0000000000000002 x3 : 0000000000000000
[ 3928.395255] x2 : 0001000000000000 x1 : fffffc002011ebc0 x0 : 0000000000000000
[ 3928.402372] Call trace:
[ 3928.404807]  pgtable_trans_huge_withdraw+0x24/0x68
[ 3928.409584]  do_huge_pmd_wp_page+0x1d8/0x36c
[ 3928.413842]  __handle_mm_fault+0x5cc/0xc8c
[ 3928.417926]  handle_mm_fault+0x68/0x280
[ 3928.421749]  do_page_fault+0x140/0x490
[ 3928.425484]  do_mem_abort+0x44/0x94
[ 3928.428958]  el0_da+0x30/0x88
[ 3928.431917]  el0t_64_sync_handler+0xb4/0x12c
[ 3928.436174]  el0t_64_sync+0x190/0x194
[ 3928.439824] Code: b9402820 34000260 f9400820 aa0003e3 (f8408c62) 
[ 3928.445898] ---[ end trace 0000000000000000 ]---
[ 3928.450499] Internal error: Oops: 0000000096000046 [#8] PREEMPT SMP
[ 3928.450511] note: stress-ng-vm[613] exited with preempt_count 1
[ 3928.456749] Modules linked in: ipv6
[ 3928.456753] CPU: 5 PID: 610 Comm: stress-ng-vm Tainted: G      D W          6.6.32-ti-01301-gdb8871293143-dirty #1
[ 3928.476442] Hardware name: Texas Instruments J784S4 EVM (DT)
[ 3928.482082] pstate: 00000005 (nzcv daif -PAN -UAO -TCO -DIT -SSBS BTYPE=--)
[ 3928.489025] pc : pgtable_trans_huge_withdraw+0x40/0x68
[ 3928.494150] lr : __split_huge_pmd+0x7f0/0xa78
[ 3928.498493] sp : ffff80008327bbb0
[ 3928.501793] x29: ffff80008327bc20 x28: ffff0008072b63c0 x27: 0000ffff85600000
[ 3928.508913] x26: 0000000000000002 x25: ffff000806271d20 x24: 0000000000000000
[ 3928.516031] x23: ffff0008042376c0 x22: ffff800080f5b000 x21: ffff000808065158
[ 3928.523149] x20: ffff000808065158 x19: ffff0008042376c0 x18: 0060000936000fc1
[ 3928.530267] x17: 0000000000000000 x16: ffff800080d13670 x15: 0001400000000000
[ 3928.537384] x14: 0001600000000000 x13: 05f0004000000000 x12: 0000000000000200
[ 3928.544503] x11: 0000000000000020 x10: 05f0000000000000 x9 : ffff80008136f000
[ 3928.551621] x8 : ffff80008327bd58 x7 : 0001000000000000 x6 : 0000000000000003
[ 3928.558738] x5 : 05f0000ffff85600 x4 : 0000000000000002 x3 : 0000000000000000
[ 3928.565856] x2 : 0000000000000000 x1 : fffffc0020201940 x0 : fffffc002018e880
[ 3928.572974] Call trace:
[ 3928.575408]  pgtable_trans_huge_withdraw+0x40/0x68
[ 3928.580184]  do_huge_pmd_wp_page+0x1d8/0x36c
[ 3928.584440]  __handle_mm_fault+0x5cc/0xc8c
[ 3928.588524]  handle_mm_fault+0x68/0x280
[ 3928.592347]  do_page_fault+0x140/0x490
[ 3928.596083]  do_mem_abort+0x44/0x94
[ 3928.599558]  el0_da+0x30/0x88
[ 3928.602514]  el0t_64_sync_handler+0xb4/0x12c
[ 3928.606771]  el0t_64_sync+0x190/0x194
[ 3928.610421] Code: d1002042 f9000822 b4000102 a9408803 (f9000462) 
[ 3928.616495] ---[ end trace 0000000000000000 ]---
[ 3928.621097] Internal error: Oops: 0000000096000006 [#9] PREEMPT SMP
[ 3928.621106] note: stress-ng-vm[610] exited with preempt_count 1
[ 3928.627347] Modules linked in: ipv6
[ 3928.633316] Unable to handle kernel NULL pointer dereference at virtual address 0000000000000038
[ 3928.636723] CPU: 0 PID: 0 Comm: swapper/0 Tainted: G      D W          6.6.32-ti-01301-gdb8871293143-dirty #1
[ 3928.636728] Hardware name: Texas Instruments J784S4 EVM (DT)
[ 3928.636729] pstate: 40000005 (nZcv daif -PAN -UAO -TCO -DIT -SSBS BTYPE=--)
[ 3928.636733] pc : kmem_cache_free+0x68/0x2c0
[ 3928.645498] Mem abort info:
[ 3928.655380] lr : kfree_skbmem+0x8c/0xc4
[ 3928.661025]   ESR = 0x0000000096000006
[ 3928.667960] sp : ffff800080003b00
[ 3928.667962] x29: ffff800080003b00
[ 3928.672131]   EC = 0x25: DABT (current EL), IL = 32 bits
[ 3928.674908]  x28: ffff80008134c830 x27: 0000000000000000
[ 3928.674913] x26: 00000000ffffffff x25: ffff000806476400 x24: ffff00080b627116
[ 3928.678738]   SET = 0, FnV = 0
[ 3928.682466] 
[ 3928.682468] x23: ffff000802313c00 x22: ffff800080a06754
[ 3928.685768]   EA = 0, S1PTW = 0
[ 3928.689066]  x21: 0000000000000000
[ 3928.689069] x20: ffff0008062ef400
[ 3928.694363]   FSC = 0x06: level 2 translation fault
[ 3928.699653]  x19: ffff0008000f9480 x18: 0000000000000001
[ 3928.699657] x17: ffff800a8ce95000 x16: ffff800080000000
[ 3928.706773] Data abort info:
[ 3928.709808]  x15: 00000000c0a8ee83
[ 3928.711290]   ISV = 0, ISS = 0x00000006, ISS2 = 0x00000000
[ 3928.716493] 
[ 3928.716494] x14: ffff800080003a50 x13: 0000000000000070
[ 3928.719623]   CM = 0, WnR = 0, TnD = 0, TagAccess = 0
[ 3928.723005]  x12: 0000000000000080
[ 3928.723008] x11: 0000000000000001 x10: ffff800080d05af8 x9 : 0000000000000001
[ 3928.726313]   GCS = 0, Overlay = 0, DirtyBit = 0, Xs = 0
[ 3928.731167] 
[ 3928.731169] x8 : 0000000000000002 x7 : 0000000000000000
[ 3928.736465] user pgtable: 4k pages, 48-bit VAs, pgdp=000000088b7b0000
[ 3928.741667]  x6 : ffff000806397b60
[ 3928.741670] x5 : ffff000806476430
[ 3928.744548] [0000000000000038] pgd=080000088a506003
[ 3928.747921]  x4 : fffffc00202d89f4 x3 : 0000000000000000
[ 3928.747925] x2 : 0000000000000001
[ 3928.753393] , p4d=080000088a506003
[ 3928.754868]  x1 : ffffffffffffffff x0 : 00000000000000af
[ 3928.760080] , pud=0800000886634003
[ 3928.765109] Call trace:
[ 3928.765110]  kmem_cache_free+0x68/0x2c0
[ 3928.765115]  kfree_skbmem+0x8c/0xc4
[ 3928.768503] , pmd=0000000000000000
[ 3928.775611]  kfree_skb_reason+0x50/0xb0
[ 3928.780909] 
[ 3928.782386]  arp_process+0x258/0x798
[ 3928.848164]  arp_rcv+0x110/0x148
[ 3928.851380]  __netif_receive_skb_list_core+0x1ec/0x214
[ 3928.856505]  netif_receive_skb_list_internal+0x1d8/0x2c4
[ 3928.861802]  napi_complete_done+0x68/0x1b0
[ 3928.865885]  am65_cpsw_nuss_rx_poll+0x474/0x994
[ 3928.870402]  __napi_poll+0x38/0x178
[ 3928.873879]  net_rx_action+0x128/0x270
[ 3928.877615]  __do_softirq+0x100/0x26c
[ 3928.881263]  ____do_softirq+0x10/0x1c
[ 3928.884912]  call_on_irq_stack+0x24/0x4c
[ 3928.888821]  do_softirq_own_stack+0x1c/0x2c
[ 3928.892989]  irq_exit_rcu+0xc0/0xdc
[ 3928.896465]  el1_interrupt+0x38/0x68
[ 3928.900028]  el1h_64_irq_handler+0x18/0x24
[ 3928.904112]  el1h_64_irq+0x64/0x68
[ 3928.907499]  default_idle_call+0x28/0x3c
[ 3928.911407]  do_idle+0x20c/0x264
[ 3928.914624]  cpu_startup_entry+0x38/0x3c
[ 3928.918533]  kernel_init+0x0/0x1dc
[ 3928.921922]  arch_post_acpi_subsys_init+0x0/0x8
[ 3928.926439]  start_kernel+0x500/0x608
[ 3928.930087]  __primary_switched+0xbc/0xc4
[ 3928.934088] Code: f94002a0 7215001f 9a9f12b5 d503201f (f9401ea1) 
[ 3928.940164] ---[ end trace 0000000000000000 ]---
[ 3928.944764] Kernel panic - not syncing: Oops: Fatal exception in interrupt
[ 3928.951620] SMP: stopping secondary CPUs
[ 3930.022528] SMP: failed to stop secondary CPUs 0-3,6
[ 3930.027478] Kernel Offset: disabled
[ 3930.030952] CPU features: 0x0,80000200,28020000,1000420b
[ 3930.036247] Memory Limit: none
[ 3930.039290] ---[ end Kernel panic - not syncing: Oops: Fatal exception in interrupt ]---
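
The ESR values in the Oops lines above can be decoded by hand to cross-check the fields the kernel prints. A minimal sketch, assuming the standard AArch64 ESR_ELx layout (EC in bits 31:26, IL in bit 25, ISS in bits 24:0) and the data-abort ISS layout (WnR in bit 6, fault status code in bits 5:0); the function name `decode_esr` is hypothetical:

```python
def decode_esr(esr: int) -> dict:
    """Split an AArch64 ESR_ELx value into the fields printed in the Oops."""
    ec = (esr >> 26) & 0x3F      # exception class
    il = (esr >> 25) & 0x1       # instruction length bit (1 = 32-bit)
    iss = esr & 0x1FFFFFF        # instruction specific syndrome
    wnr = (iss >> 6) & 0x1       # data abort: 1 = write, 0 = read
    fsc = iss & 0x3F             # data abort: fault status code
    return {"EC": ec, "IL": il, "ISS": iss, "WnR": wnr, "FSC": fsc}


# The two distinct ESR values that appear in this log.
for value in (0x96000006, 0x96000046):
    f = decode_esr(value)
    print(f"ESR=0x{value:08x} EC=0x{f['EC']:02x} ISS=0x{f['ISS']:08x} "
          f"WnR={f['WnR']} FSC=0x{f['FSC']:02x}")
```

Both values decode to EC 0x25 and FSC 0x06, which the log labels a level 2 translation fault; they differ only in WnR, so the same kind of fault is hit on both reads and writes.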

From 9b8a4c2e21e14ded4e6b4bac55719297d395da4a Mon Sep 17 00:00:00 2001
From: Dusan <Dusan.Stanisic@rt-rk.com>
Date: Mon, 9 Sep 2024 12:34:07 +0000
Subject: [PATCH 1/7] Modifying U-Boot to reduce RAM memory 32GB->16GB

---
 arch/arm/dts/k3-j784s4-ddr-evm-lp4-4266.dtsi | 2 +-
 arch/arm/dts/k3-j784s4-ddr.dtsi              | 2 ++
 arch/arm/dts/k3-j784s4-evm.dts               | 4 ++--
 3 files changed, 5 insertions(+), 3 deletions(-)

diff --git a/arch/arm/dts/k3-j784s4-ddr-evm-lp4-4266.dtsi b/arch/arm/dts/k3-j784s4-ddr-evm-lp4-4266.dtsi
index 0e16d2f2..b7fb907f 100644
--- a/arch/arm/dts/k3-j784s4-ddr-evm-lp4-4266.dtsi
+++ b/arch/arm/dts/k3-j784s4-ddr-evm-lp4-4266.dtsi
@@ -14,7 +14,7 @@
 #define MULTI_DDR_CFG_INTRLV_SIZE 12
 #define MULTI_DDR_CFG_ECC_ENABLE 0
 #define MULTI_DDR_CFG_HYBRID_SELECT 24
-#define MULTI_DDR_CFG_EMIFS_ACTIVE 15
+#define MULTI_DDR_CFG_EMIFS_ACTIVE 3
 
 #define DDRSS0_CTL_00_DATA 0x00000B00
 #define DDRSS0_CTL_01_DATA 0x00000000
diff --git a/arch/arm/dts/k3-j784s4-ddr.dtsi b/arch/arm/dts/k3-j784s4-ddr.dtsi
index fc74c539..4538521c 100644
--- a/arch/arm/dts/k3-j784s4-ddr.dtsi
+++ b/arch/arm/dts/k3-j784s4-ddr.dtsi
@@ -4446,6 +4446,7 @@
 		};
 
 		memorycontroller2: memorycontroller@29d0000 {
+			status = "disabled";
 			compatible = "ti,j721s2-ddrss";
 			reg = <0x0 0x029d0000 0x0 0x4000>,
 			      <0x0 0x0114000 0x0 0x100>,
@@ -6655,6 +6656,7 @@
 		};
 
 		memorycontroller3: memorycontroller@29f0000 {
+			status = "disabled";
 			compatible = "ti,j721s2-ddrss";
 			reg = <0x0 0x029f0000 0x0 0x4000>,
 			      <0x0 0x0114000 0x0 0x100>,
diff --git a/arch/arm/dts/k3-j784s4-evm.dts b/arch/arm/dts/k3-j784s4-evm.dts
index afd84a6d..e4b80a55 100644
--- a/arch/arm/dts/k3-j784s4-evm.dts
+++ b/arch/arm/dts/k3-j784s4-evm.dts
@@ -34,9 +34,9 @@
 	memory@80000000 {
 		device_type = "memory";
 		bootph-all;
-		/* 32G RAM */
+		/* 16G RAM */
 		reg = <0x00000000 0x80000000 0x00000000 0x80000000>,
-		      <0x00000008 0x80000000 0x00000007 0x80000000>;
+		      <0x00000008 0x80000000 0x00000003 0x80000000>;
 	};
 
 	reserved_memory: reserved-memory {
-- 
2.43.0
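
As a quick sanity check on the numbers in this patch, the sketch below sums the `reg` sizes of the `memory@80000000` node and lists the bits set in `MULTI_DDR_CFG_EMIFS_ACTIVE`. It assumes two address cells and two size cells per region (as the four-cell entries in the patch suggest) and that the EMIF value is a bitmask with one bit per DDR controller; the helper names are hypothetical.

```python
GIB = 1 << 30


def total_memory_bytes(reg_cells):
    """Sum region sizes from <addr_hi addr_lo size_hi size_lo> cell groups."""
    total = 0
    for i in range(0, len(reg_cells), 4):
        _addr_hi, _addr_lo, size_hi, size_lo = reg_cells[i:i + 4]
        total += (size_hi << 32) | size_lo
    return total


def active_emifs(mask):
    """Return the indices of the bits set in the EMIF mask."""
    return [bit for bit in range(mask.bit_length()) if mask & (1 << bit)]


# reg values before and after the patch, copied from the diff above.
old_reg = [0x0, 0x80000000, 0x0, 0x80000000,
           0x8, 0x80000000, 0x7, 0x80000000]
new_reg = [0x0, 0x80000000, 0x0, 0x80000000,
           0x8, 0x80000000, 0x3, 0x80000000]

print(total_memory_bytes(old_reg) // GIB)  # 32
print(total_memory_bytes(new_reg) // GIB)  # 16
print(active_emifs(15))                    # [0, 1, 2, 3]
print(active_emifs(3))                     # [0, 1]
```

Under those assumptions the memory node and the EMIF mask agree with each other (16 GB across controllers 0 and 1), so the values changed by this patch are internally consistent.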

  • Hello Dušan,

    Just to confirm: are you using the TI EVM with the memory reduced to 16GB, and are you seeing the memtester issues?

    SDK 10.0 should not have any issues with DDR consistency.

    Do you see the issue with 32 GB as well?
    Can you share any use case that helps reproduce this issue?

    - Keerthy

  • I realized I made a slight mistake in my previous explanation. We are using a custom board based on the TDA4VH-Q1, and the only difference between our board and the J784S4 EVM is the number of DDR slots. The EVM has four DDR slots, whereas our custom board has two, but both use the same DDR modules. Our DDR configuration follows the adjustments I previously shared, based on Section 1 of the TI forum.

    For reference, we previously tested the EVM board by disabling two DDR slots, but we only ran memtester, which completed successfully. Since we applied the same modifications to our custom board, we are unsure if additional changes are required.

    We also tested multiple stress-ng configurations on the EVM board (32GB RAM), and all tests ran without issues. However, executing certain heavy workloads triggers a kernel panic on our custom board (16GB RAM).

    Regarding the use cases on the custom board, we observed that high-impact tasks lead to kernel panics. Our customer encountered this issue while developing a CMake application with parallel tasks. They also ran the same process on the EVM board, where it completed successfully. We ran the same command on our custom board, and in both cases the system crashed with a kernel panic. The logs indicate memory-related errors, which suggests a possible issue with the DDR configuration.

    Best regards,
    Dušan Stanišić

  • Hello Dušan Stanišić,

    Thanks for the clarification. This points towards the DDR configuration as well. I am looping our DDR expert.

    We will get back to you in a day or two.

    Regards,
    Keerthy

  • Hi Dušan Stanišić,

    A few questions:

    • Are you using the default TI EVM DDR DTSI file or did you generate a new one?
    • Are you observing this issue on multiple boards or just a few?
    • Are you able to monitor the device temperature under the heavy load? Is there any temperature dependency, for example if you cool the device in a temperature chamber while testing at the heavier load?

    A few initial debug steps:

    • Can you try the attached DTSI file below (k3-j784s4-ddr-evm-lp4.dtsi)?
    • Can you try testing with a slower DDR frequency (DTSI file k3-j784s4-ddr-evm-lp4_3200.dtsi below)?
    • As a future option, we could try to test with DDRSS0 only, but we wouldn't be able to test with DDRSS1 only. Thus the conclusions may be limited.

    k3-j784s4-ddr-evm-lp4.dtsi

    k3-j784s4-ddr-evm-lp4_3200.dtsi

    Regards,
    Kevin
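
For the temperature question above, the kernel's generic thermal sysfs interface is one place to look on a Linux target. A minimal sketch, assuming the board's kernel exposes its sensors as `thermal_zone*` directories under `/sys/class/thermal` (whether this SDK build does so is not confirmed in this thread); the function name is hypothetical:

```python
import glob
import os


def read_thermal_zones(base="/sys/class/thermal"):
    """Return {"zone:type": temperature in degrees C} for each thermal zone."""
    readings = {}
    for zone in sorted(glob.glob(os.path.join(base, "thermal_zone*"))):
        try:
            with open(os.path.join(zone, "type")) as f:
                name = f.read().strip()
            with open(os.path.join(zone, "temp")) as f:
                millideg = int(f.read().strip())  # sysfs reports millidegrees C
        except (OSError, ValueError):
            continue  # zone unreadable or not reporting a number
        readings[f"{os.path.basename(zone)}:{name}"] = millideg / 1000.0
    return readings


if __name__ == "__main__":
    for zone, temp_c in read_thermal_zones().items():
        print(f"{zone}: {temp_c:.1f} C")
```

Running this in a loop alongside the stress-ng command would show whether the panics cluster at higher temperatures.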

  • Hello Kevin,

    The first patch you provided is valid, and everything works as intended. After some debugging, I noticed that the DDRSS Register Configuration Tool correctly generated the DDR configuration for RTOS and JTAG. However, for Linux, we didn't initially recognize the same code structure in the dtsi file you attached.

    Upon revisiting my previous post on modifying the DDR configuration, I realized that the provided solution modified only specific sections of the k3-j784s4-ddr-evm-lp4.dtsi file instead of the entire file. This partial modification was the root cause of the issue.

    Best regards,
    Dušan Stanišić
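
Since the root cause was a DTSI file that was only partially updated, one way to catch this class of mistake is to compare every `#define` in the tool-generated file against the file actually built into U-Boot. A minimal sketch, assuming both files use plain `#define NAME VALUE` lines like the ones in the patch above; the function names are hypothetical and the second sample value is made up for illustration:

```python
import re

DEFINE_RE = re.compile(r"^\s*#define\s+(\w+)\s+(\S+)", re.MULTILINE)


def parse_defines(text):
    """Map each `#define NAME VALUE` in a DTSI file's text to its value."""
    return {name: value for name, value in DEFINE_RE.findall(text)}


def diff_defines(reference_text, candidate_text):
    """Return {name: (reference_value, candidate_value)} for every mismatch."""
    ref = parse_defines(reference_text)
    cand = parse_defines(candidate_text)
    return {
        name: (ref.get(name), cand.get(name))
        for name in sorted(ref.keys() | cand.keys())
        if ref.get(name) != cand.get(name)
    }


# Tiny inline samples; in practice, read the two .dtsi files from disk.
generated = """
#define MULTI_DDR_CFG_EMIFS_ACTIVE 3
#define DDRSS0_CTL_00_DATA 0x00000B00
"""
hand_edited = """
#define MULTI_DDR_CFG_EMIFS_ACTIVE 3
#define DDRSS0_CTL_00_DATA 0x00000A00
"""  # 0x00000A00 is an invented value, not taken from any real file

print(diff_defines(generated, hand_edited))
```

Values are compared as strings, so differences in hex letter case or zero padding would also be reported. An empty result means the file in the build matches the generated one define for define.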