PROCESSOR-SDK-AM62X: RAM stress test using memtester tool

Part Number: PROCESSOR-SDK-AM62X
Other Parts Discussed in Thread: SK-AM62B-P1

Hi TI,

Please find below the RAM in my system, as reported by the free command:

# free

               total        used        free      shared  buff/cache   available
Mem:         1970924       62260     1795772       13560      112892     1826180

When I run the test with 512M, it passes without issues.

But when I run the RAM test with 1G, I get a kernel panic and the system halts. Is this expected?

root@am62xx-evm:~# memtester 1G 1 
memtester version 4.5.1 (64-bit)
Copyright (C) 2001-2020 Charles Cazabon.
Licensed under the GNU General Public License version 2 (only).

pagesize is 4096
pagesizemask is 0xfffffffffffff000
want 1024MB (1073741824 bytes)
got  1024MB (1073741824 bytes), trying mlock ...[  173.124184] Unable to handle kernel NULL pointer 8
[  173.132997] Mem abort info:
[  173.135779]   ESR = 0x0000000096000046
[  173.139517]   EC = 0x25: DABT (current EL), IL = 32 bits
[  173.144817]   SET = 0, FnV = 0
[  173.147862]   EA = 0, S1PTW = 0
[  173.150992]   FSC = 0x06: level 2 translation fault
[  173.155858] Data abort info:
[  173.158726]   ISV = 0, ISS = 0x00000046
[  173.162550]   CM = 0, WnR = 1
[  173.165507] user pgtable: 4k pages, 48-bit VAs, pgdp=0000000086563000
[  173.171934] [0000000000000008] pgd=0800000081d9a003, p4d=0800000081d9a003, pud=080000008169e003, 0
[  173.182540] Internal error: Oops: 0000000096000046 [#1] PREEMPT SMP
[  173.188795] Modules linked in: rpmsg_ctrl rpmsg_char dwc3 crct10dif_ce overlay ti_k3_r5_remotepro6
[  173.211086] CPU: 0 PID: 500 Comm: memtester Tainted: G           O       6.1.33-g40c32565ca #1
[  173.219682] Hardware name: Texas Instruments AM625 SK TESSS (DT)
[  173.225675] pstate: 800000c5 (Nzcv daIF -PAN -UAO -TCO -DIT -SSBS BTYPE=--)
[  173.232626] pc : get_page_from_freelist+0x228/0xf10
[  173.237512] lr : get_page_from_freelist+0x1d0/0xf10
[  173.242382] sp : ffff800009ff3840
[  173.245685] x29: ffff800009ff3840 x28: 0000000000000801 x27: ffff8000091e80c0
[  173.252814] x26: ffffffffffffffff x25: 0000000000000000 x24: fffffc000021f900
[  173.259943] x23: 0000000000000000 x22: 0000000000000000 x21: ffff0000772f0718
[  173.267071] x20: ffff0000772f0700 x19: ffff0000772f0700 x18: 000000000004773e
[  173.274199] x17: 0000000000000000 x16: ffff800008ca3330 x15: 0000000000000000
[  173.281328] x14: 0000000000000001 x13: 0000000000000002 x12: 0000000000000010
[  173.288455] x11: fffffc0000d68000 x10: 0000000000000200 x9 : fffffc0000348008
[  173.295584] x8 : 0000000000000000 x7 : 0000000000000001 x6 : 00000000000000c0
[  173.302712] x5 : 00000000ffffffff x4 : 0000000000000000 x3 : dead000000000122
[  173.309840] x2 : 0000000000000000 x1 : fffffc000021f908 x0 : ffff0000772f0718
[  173.316969] Call trace:
[  173.319409]  get_page_from_freelist+0x228/0xf10
[  173.323933]  __alloc_pages+0x134/0xcec
[  173.327676]  __pte_alloc_one.constprop.0+0x28/0x90
[  173.332464]  do_huge_pmd_anonymous_page+0x200/0x7f0
[  173.337336]  __handle_mm_fault+0x420/0xc00
[  173.341426]  handle_mm_fault+0xec/0x280
[  173.345254]  __get_user_pages+0x200/0x3a0
[  173.349261]  populate_vma_page_range+0x58/0x7c
[  173.353695]  __mm_populate+0xb4/0x190
[  173.357348]  do_mlock+0xcc/0x250
[  173.360570]  __arm64_sys_mlock+0x18/0x30
[  173.364485]  invoke_syscall+0x48/0x114
[  173.368232]  el0_svc_common.constprop.0+0xd4/0xfc
[  173.372929]  do_el0_svc+0x30/0xd0
[  173.376238]  el0_svc+0x2c/0x84
[  173.379294]  el0t_64_sync_handler+0xbc/0x140
[  173.383558]  el0t_64_sync+0x18c/0x190
[  173.387220] Code: d2802443 f2fbd5a3 d1002038 a9400022 (f9000440) 
[  173.393301] ---[ end trace 0000000000000000 ]---
[  173.397908] note: memtester[500] exited with irqs disabled
[  173.403496] note: memtester[500] exited with preempt_count 2
[  173.409290] Unable to handle kernel NULL pointer dereference at virtual address 0000000000000038
[  173.418058] Mem abort info:
[  173.420840]   ESR = 0x0000000096000006
[  173.424577]   EC = 0x25: DABT (current EL), IL = 32 bits
[  173.429877]   SET = 0, FnV = 0
[  173.432920]   EA = 0, S1PTW = 0
[  173.436050]   FSC = 0x06: level 2 translation fault
[  173.440914] Data abort info:
[  173.443784]   ISV = 0, ISS = 0x00000006
[  173.447607]   CM = 0, WnR = 0
[  173.450564] user pgtable: 4k pages, 48-bit VAs, pgdp=0000000081ef3000
[  173.456991] [0000000000000038] pgd=0800000081eda003, p4d=0800000081eda003, pud=0800000081ed9003, 0
[  173.467590] Internal error: Oops: 0000000096000006 [#2] PREEMPT SMP
[  173.473842] Modules linked in: rpmsg_ctrl rpmsg_char dwc3 crct10dif_ce overlay ti_k3_r5_remotepro6
[  173.496113] CPU: 0 PID: 150 Comm: systemd-journal Tainted: G      D    O       6.1.33-g40c32565ca1
[  173.505229] Hardware name: Texas Instruments AM625 SK TESSS (DT)
[  173.511220] pstate: 400000c5 (nZcv daIF -PAN -UAO -TCO -DIT -SSBS BTYPE=--)
[  173.518170] pc : kmem_cache_alloc+0x17c/0x3f0
[  173.522521] lr : kmem_cache_alloc+0x80/0x3f0
[  173.526783] sp : ffff80000961bbd0
[  173.530086] x29: ffff80000961bbd0 x28: ffff000001cdd140 x27: ffff8000090ea000
[  173.537215] x26: 0000000000000000 x25: ffff000000f91c80 x24: 0000000000000001
[  173.544343] x23: ffff000000014a00 x22: ffff0000016c2380 x21: ffff000001cdd0a0
[  173.551471] x20: 0000000000000a20 x19: 0000000000000000 x18: ffff80000961bd68
[  173.558599] x17: 0000000000000000 x16: ffff800008ca3330 x15: 0000000000000000
[  173.565728] x14: ffff000001f2a20c x13: 1fffe000003e5441 x12: ffff80000961bd68
[  173.572856] x11: 0000000000000000 x10: ffff000001f2a208 x9 : 0000000000000001
[  173.579984] x8 : 0000000000000001 x7 : ffff80006e33c000 x6 : 0000000000000618
[  173.587112] x5 : 0000000000000919 x4 : ffff0000772f0a60 x3 : 0000000000029700
[  173.594240] x2 : 0000000000000028 x1 : ffffffffffffffff x0 : 0000000000000000
[  173.601368] Call trace:
[  173.603805]  kmem_cache_alloc+0x17c/0x3f0
[  173.607807]  __sigqueue_alloc+0x80/0x130
[  173.611724]  __send_signal_locked.part.0+0x1ec/0x2e0
[  173.616680]  send_signal_locked+0x84/0x14c
[  173.620769]  force_sig_info_to_task+0xec/0x130
[  173.625205]  force_sig_fault+0x50/0x7c
[  173.628946]  arm64_force_sig_fault+0x40/0x70
[  173.633213]  do_page_fault+0x380/0x3d0
[  173.636959]  do_translation_fault+0xac/0xc0
[  173.641135]  do_mem_abort+0x44/0x94
[  173.644618]  el0_da+0x30/0x90
[  173.647582]  el0t_64_sync_handler+0xf8/0x140
[  173.651846]  el0t_64_sync+0x18c/0x190
[  173.655504] Code: 9a810273 f9400260 7217001f 9a9f1273 (f9401e60) 
[  173.661583] ---[ end trace 0000000000000000 ]---
[  173.666187] note: systemd-journal[150] exited with irqs disabled
[  173.672221] note: systemd-journal[150] exited with preempt_count 1

Kindly help me with this, as it is critical for us.

Thanks,

Naresh
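
For readers following the two oops reports, the ESR values can be decoded by hand to cross-check the "Mem abort info" and "Data abort info" lines. The sketch below splits an ESR value into the fields the kernel prints (EC, IL, WnR, FSC), assuming the Armv8-A ESR_ELx layout for data aborts. The function name decode_esr is made up for illustration.

```shell
#!/bin/sh
# Hypothetical helper: split an arm64 ESR value into the fields shown in
# the kernel's "Mem abort info" / "Data abort info" lines.
decode_esr() {
    esr=$(( $1 ))
    ec=$((  (esr >> 26) & 0x3f ))   # exception class, bits [31:26]
    il=$((  (esr >> 25) & 0x1 ))    # instruction length, bit [25]
    wnr=$(( (esr >> 6)  & 0x1 ))    # write-not-read, bit [6] (data aborts)
    fsc=$((  esr        & 0x3f ))   # fault status code, bits [5:0]
    printf 'EC=0x%02x IL=%d WnR=%d FSC=0x%02x\n' "$ec" "$il" "$wnr" "$fsc"
}

decode_esr 0x0000000096000046   # first oops (memtester)
decode_esr 0x0000000096000006   # second oops (systemd-journal)
```

Both values decode to EC=0x25 and FSC=0x06, matching the log. The only difference is WnR: the first fault is a write (to address 0x8) and the second is a read (from address 0x38).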

  • Hi Naresh,

    I just tested this on SK-AM62B-P1 evm with kernel 6.1 but don't see the problem.

    Which Processor SDK version do you use on your board?

  • Hi Bin Liu,

    I am using the AM6231 SoC.

    Thanks,

    Naresh

  • Which Processor SDK version do you use on your board?

  • Hi Bin Liu,

    I am using AM62x Processor SDK version 09.00.00.03.

    May I know whether a kernel halt and hang is expected behaviour when I run the RAM stress test with 1G?

    Thanks,

    Naresh

  • May I know whether a kernel halt and hang is expected behaviour when I run the RAM stress test with 1G?

    No, the kernel NULL pointer issue is not expected in the memtester test. Here is my test on the SK-AM62B-P1 EVM; it took a while to complete for 1G of DDR.

    root@am62xx-evm:~# memtester 1G 1
    memtester version 4.5.1 (64-bit)
    Copyright (C) 2001-2020 Charles Cazabon.
    Licensed under the GNU General Public License version 2 (only).
    
    pagesize is 4096
    pagesizemask is 0xfffffffffffff000
    want 1024MB (1073741824 bytes)
    got  1024MB (1073741824 bytes), trying mlock ...locked.
    Loop 1/1:
      Stuck Address       : ok
      Random Value        : ok
      Compare XOR         : ok
      Compare SUB         : ok
      Compare MUL         : ok
      Compare DIV         : ok
      Compare OR          : ok
      Compare AND         : ok
      Sequential Increment: ok
      Solid Bits          : ok
      Block Sequential    : ok
      Checkerboard        : ok
      Bit Spread          : ok
      Bit Flip            : ok
      Walking Ones        : ok
      Walking Zeroes      : ok
    
    Done.
    root@am62xx-evm:~#
    root@am62xx-evm:~# uname -a
    Linux am62xx-evm 6.1.83-00004-g22245e3f0d4d-dirty #121 SMP PREEMPT Tue Aug 27 10:50:25 CDT 2024 aarch64 aarch64 aarch64 GNU/Linux
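
Before choosing a memtester size, it can help to check the request against the "available" column of free. The sketch below is only an illustration: the helper name is hypothetical, the 20% headroom is an arbitrary assumption, and the live check assumes /proc/meminfo reports MemAvailable in kB. It prints a whole number of MB, since memtester takes an integer size with a B, K, M or G suffix.

```shell
#!/bin/sh
# Hypothetical helper: turn an "available" figure in kB into a memtester
# size in whole MB, leaving headroom_pct percent for the rest of the system.
memtester_size_mb() {
    avail_kb=$1
    headroom_pct=${2:-20}           # 20% headroom is an arbitrary assumption
    echo $(( avail_kb * (100 - headroom_pct) / 100 / 1024 ))
}

# "available" value from the free output in the original question.
memtester_size_mb 1826180

# On the target, use the live value instead (this only prints the command).
avail=$(awk '/^MemAvailable:/ {print $2}' /proc/meminfo 2>/dev/null)
if [ -n "$avail" ]; then
    echo "memtester $(memtester_size_mb "$avail")M 1"
fi
```

For the 1826180 kB reported in the question this gives 1426, so a 1G (1024 MB) request is well inside that budget and the request size alone would not be expected to exhaust memory.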

  • Hi Bin Liu,

    Thank you for your valuable response.

    For me, when I run memtester with 1G, it immediately throws a kernel fault (NULL pointer dereference) and hangs.

    May I know how to resolve the issue?

    Also, how do I enable the watchdog timer in the kernel menuconfig, so that I can set the watchdog timer through the sysfs filesystem?

    Thanks,

    Naresh
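
On the menuconfig question, the watchdog options sit under Device Drivers > Watchdog Timer Support. As a rough way to see what a kernel was built with, the sketch below scans a kernel config for a few mainline watchdog symbols: CONFIG_WATCHDOG, CONFIG_WATCHDOG_SYSFS, CONFIG_SOFT_WATCHDOG and CONFIG_K3_RTI_WATCHDOG (the K3 RTI hardware watchdog driver). The helper name is made up, and whether the SDK kernel provides /proc/config.gz is an assumption (it requires CONFIG_IKCONFIG_PROC).

```shell
#!/bin/sh
# Hypothetical helper: read a kernel .config on stdin and report the state
# of a few watchdog-related symbols (y = built in, m = module, - = not set).
check_wdt_config() {
    cfg=$(cat)
    for sym in CONFIG_WATCHDOG CONFIG_WATCHDOG_SYSFS \
               CONFIG_SOFT_WATCHDOG CONFIG_K3_RTI_WATCHDOG; do
        val=$(printf '%s\n' "$cfg" | sed -n "s/^${sym}=\([ym]\)\$/\1/p")
        echo "${sym}=${val:--}"
    done
}

# Assumption: the running kernel was built with CONFIG_IKCONFIG_PROC.
if [ -r /proc/config.gz ]; then
    zcat /proc/config.gz | check_wdt_config
fi
```

If /proc/config.gz is not present, the same function can be fed the .config file from the kernel build directory instead, for example `check_wdt_config < .config`.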

  • Hi Bin Liu,

    May I know what the total RAM and available memory are here?

    Kindly help me resolve the NULL pointer exception that occurs during the RAM stress test.

    I just enabled the watchdog driver in the kernel.

    Below are my observations for the watchdog timer. I am not able to control the watchdog timer, as it seems to be in an inactive state.

    root@am62xx-evm:~# cd /sys/class/watchdog/
    root@am62xx-evm:/sys/class/watchdog# cd watchdog0
    
    root@am62xx-evm:/sys/class/watchdog/watchdog0# ls        
    bootstatus                      nowayout                        state
    dev                             power                           status
    identity                        pretimeout                      subsystem
    max_timeout                     pretimeout_available_governors  timeout
    min_timeout                     pretimeout_governor             uevent
    
    
    root@am62xx-evm:/sys/class/watchdog/watchdog0# cat state 
    inactive
    root@am62xx-evm:/sys/class/watchdog/watchdog0# cat timeout 
    60
    root@am62xx-evm:/sys/class/watchdog/watchdog0# cat min_timeout 
    1
    root@am62xx-evm:/sys/class/watchdog/watchdog0# cat max_timeout 
    65535
    root@am62xx-evm:/sys/class/watchdog/watchdog0# cat identity 
    Software Watchdog
    root@am62xx-evm:/sys/class/watchdog/watchdog0# cat pretimeout
    0
    root@am62xx-evm:/sys/class/watchdog/watchdog0# cat pretimeout_available_governors 
    panic
    noop
    root@am62xx-evm:/sys/class/watchdog/watchdog0# cat pretimeout_governor 
    panic
    root@am62xx-evm:/sys/class/watchdog/watchdog0# cat bootstatus 
    0
    
    
    root@am62xx-evm:/sys/class/watchdog/watchdog0# sudo systemctl stop watchdog
    Failed to stop watchdog.service: Unit watchdog.service not loaded.
    
    
    root@am62xx-evm:/sys/class/watchdog/watchdog0# sudo systemctl status watchdog
    Unit watchdog.service could not be found.
    
    
    root@am62xx-evm:/sys/class/watchdog/watchdog0# mount | grep sysfs
    sysfs on /sys type sysfs (rw,nosuid,nodev,noexec,relatime)
    
    
    root@am62xx-evm:/sys/class/watchdog/watchdog0# ls
    bootstatus nowayout state
    dev power status
    identity pretimeout subsystem
    max_timeout pretimeout_available_governors timeout
    min_timeout pretimeout_governor uevent
    
    
    root@am62xx-evm:/sys/class/watchdog/watchdog0# echo 1 > /dev/watchdog
    [ 175.793148] watchdog: watchdog0: watchdog did not stop!
    
    
    root@am62xx-evm:/sys/class/watchdog/watchdog0# echo 10 > timeout
    -sh: timeout: Permission denied
    
    
    root@am62xx-evm:/sys/class/watchdog/watchdog0# sudo echo 10 > timeout
    -sh: timeout: Permission denied
    
    
    root@am62xx-evm:/sys/class/watchdog/watchdog0#
    
    root@am62xx-evm:/sys/class/watchdog/watchdog0# dmesg | grep -i watchdog
    [ 175.793148] watchdog: watchdog0: watchdog did not stop!
    
    
    root@am62xx-evm:/sys/class/watchdog/watchdog0#

    Thanks,

    Naresh
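
A few notes on the session above, offered as a sketch rather than a definitive answer. The identity "Software Watchdog" is the string reported by the softdog driver, not a hardware watchdog. In mainline kernels the sysfs timeout attribute is read-only, which would explain the "Permission denied"; the timeout is normally set with the WDIOC_SETTIMEOUT ioctl on /dev/watchdog, or for softdog through its soft_margin module parameter. Opening /dev/watchdog starts the timer, and closing it without first writing the magic character 'V' produces the "watchdog did not stop!" message and leaves the timer running. The shell sketch below keeps the device open, pets it periodically, and writes 'V' before closing. The function name, count and interval are illustrative assumptions, and magic close has no effect when nowayout is set.

```shell
#!/bin/sh
# Hypothetical helper: open a watchdog device, pet it COUNT times at
# INTERVAL-second spacing, then do a "magic close" so the timer can stop.
pet_watchdog() {
    dev=${1:-/dev/watchdog}
    count=${2:-5}
    interval=${3:-1}
    {
        i=0
        while [ "$i" -lt "$count" ]; do
            printf '.' >&3          # any write counts as a keepalive
            sleep "$interval"
            i=$(( i + 1 ))
        done
        printf 'V' >&3              # magic character: allow stop on close
    } 3> "$dev"                     # fd 3 stays open for the whole group
}

# On the target (as root), for example: pet_watchdog /dev/watchdog 10 5
```

The interval between writes must stay below the configured timeout (60 seconds in the listing above), otherwise the watchdog fires while the loop is sleeping.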

  • Hi Naresh,

    May I know what the total RAM and available memory are here?

    root@am62xx-evm:~# free
                   total        used        free      shared  buff/cache   available
    Mem:         1972248      209560     1496344       68780      266344     1621072
    Swap:              0           0           0
    root@am62xx-evm:~# memtester 1G 1
    memtester version 4.5.1 (64-bit)
    Copyright (C) 2001-2020 Charles Cazabon.
    Licensed under the GNU General Public License version 2 (only).
    
    pagesize is 4096
    pagesizemask is 0xfffffffffffff000
    want 1024MB (1073741824 bytes)
    got  1024MB (1073741824 bytes), trying mlock ...locked.
    Loop 1/1:
      Stuck Address       : setting   3^C
    root@am62xx-evm:~# memtester 1.2G 1
    memtester version 4.5.1 (64-bit)
    Copyright (C) 2001-2020 Charles Cazabon.
    Licensed under the GNU General Public License version 2 (only).
    
    pagesize is 4096
    pagesizemask is 0xfffffffffffff000
    
    Usage: memtester [-p physaddrbase [-d device]] <mem>[B|K|M|G] [loops]
    root@am62xx-evm:~# memtester 1200M
    memtester version 4.5.1 (64-bit)
    Copyright (C) 2001-2020 Charles Cazabon.
    Licensed under the GNU General Public License version 2 (only).
    
    pagesize is 4096
    pagesizemask is 0xfffffffffffff000
    want 1200MB (1258291200 bytes)
    got  1200MB (1258291200 bytes), trying mlock ...locked.
    Loop 1:
      Stuck Address       : testing   3^C
    root@am62xx-evm:~# memtester 1500M
    memtester version 4.5.1 (64-bit)
    Copyright (C) 2001-2020 Charles Cazabon.
    Licensed under the GNU General Public License version 2 (only).
    
    pagesize is 4096
    pagesizemask is 0xfffffffffffff000
    want 1500MB (1572864000 bytes)
    got  1500MB (1572864000 bytes), trying mlock ...locked.
    Loop 1:
      Stuck Address       : setting   4^C

    Kindly help me resolve the NULL pointer exception that occurs during the RAM stress test.

    I am unable to think of any reason at this moment...

    Below are my observations for the watchdog timer. I am not able to control the watchdog timer, as it seems to be in an inactive state.

    I am not a watchdog expert. Please create a new e2e thread for this topic, so that the expert can provide support.