AM625-Q1: DDR4 RAM stress test failure while using memtester tool

Part Number: AM625-Q1
Other Parts Discussed in Thread: SK-AM62B-P1

Greetings,

We are stress testing the DDR4 RAM (part number MT40A1G16TB-062EIT:F) on our custom board, which is based on the AM625-Q1 SoC, using the memtester tool in Linux. The Linux kernel version and memory statistics are:

root@am62xx-evm:~# uname -a
Linux am62xx-evm 6.1.836.1.80-******-g55992697949d #1 SMP PREEMPT Mon May 6 12:53:20 UTC 2024 aarch64 aarch64 aarch64 GNU/Linux
root@am62xx-evm:~# free
               total        used        free      shared  buff/cache   available
Mem:         1968568      247412     1474800       74644      246356     1572044
Swap:              0           0           0
root@am62xx-evm:~#

The results we observed were:

For the 1GB test for 1 cycle:

root@am62xx-evm:/# memtester 1G 1
memtester version 4.5.1 (64-bit)
Copyright (C) 2001-2020 Charles Cazabon.
Licensed under the GNU General Public License version 2 (only).
pagesize is 4096
pagesizemask is 0xfffffffffffff000
want 1024MB (1073741824 bytes)
got 1024MB (1073741824 bytes), trying mlock ...locked.
Loop 1/1:
Stuck Address : testing 3FAILURE: possible bad address line at offset 0x153c4558.
Skipping to next test...
Random Value : ok
Compare XOR : ok
FAILURE: 0x1041e82a881663ce != 0x1041e82a881463ce at offset 0x115a4d78.
Compare SUB : Compare MUL : ok
Compare DIV : ok
Compare OR : ok
Compare AND : ok
Sequential Increment: ok
Solid Bits : ok

For the 1.5GB test for 1 cycle:

root@am62xx-evm:/# memtester 1500M 1
memtester version 4.5.1 (64-bit)
Copyright (C) 2001-2020 Charles Cazabon.
Licensed under the GNU General Public License version 2 (only).
pagesize is 4096
pagesizemask is 0xfffffffffffff000
want 1500MB (1572864000 bytes)
got 1500MB (1572864000 bytes), trying mlock ...locked.
Loop 1/1:
Stuck Address : testing 1FAILURE: possible bad address line at offset 0x24460c10.
Skipping to next test...
Random Value : ok
FAILURE: 0x80758605128b4e9 != 0x80758604128b4e9 at offset 0x21ecebc0.
FAILURE: 0x80758605128b4e9 != 0x80758604128b4e9 at offset 0x22086bc0.
FAILURE: 0x80758604528b4e9 != 0x80758604128b4e9 at offset 0x2466cc10.
FAILURE: 0x80758604528b4e9 != 0x80758604128b4e9 at offset 0x249e4c10.
FAILURE: 0x82758604128b4e9 != 0x80758604128b4e9 at offset 0x28128558.
Compare XOR : FAILURE: 0x89a43f3704365584 != 0x89843f3704365584 at offset 0x281c0558.
Compare SUB : Compare MUL : ok
Compare DIV : ok
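For anyone comparing logs like the two above across several boards, a small script can pull the data-mismatch lines out of a saved memtester log. This is only a rough sketch: the function name `parse_mismatches` is made up for illustration, and the pattern assumes the `FAILURE: 0x... != 0x... at offset 0x...` line format seen in the output above. The "possible bad address line" messages from the Stuck Address test have a different format and are deliberately not matched.

```python
import re

# Matches memtester data-mismatch lines such as:
#   FAILURE: 0x80758605128b4e9 != 0x80758604128b4e9 at offset 0x21ecebc0.
# The pattern is an assumption based on the log format shown in this thread.
MISMATCH = re.compile(
    r"FAILURE: (0x[0-9a-fA-F]+) != (0x[0-9a-fA-F]+) at offset (0x[0-9a-fA-F]+)"
)


def parse_mismatches(log_text):
    """Return a list of (value_a, value_b, offset) integer tuples."""
    return [
        tuple(int(field, 16) for field in match.groups())
        for match in MISMATCH.finditer(log_text)
    ]


if __name__ == "__main__":
    # Two lines copied from the 1500M run above.
    sample = (
        "FAILURE: 0x80758605128b4e9 != 0x80758604128b4e9 at offset 0x21ecebc0.\n"
        "Compare XOR : FAILURE: 0x89a43f3704365584 != 0x89843f3704365584 "
        "at offset 0x281c0558.\n"
    )
    for value_a, value_b, offset in parse_mismatches(sample):
        print(f"offset {offset:#x}: {value_a:#018x} vs {value_b:#018x}")
```

The regular expression uses `finditer` rather than line-by-line matching because memtester's progress output can leave a FAILURE message on the same line as a test name, as in the `Compare XOR` line above.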

We compared these results with those from the SK-AM62B-P1, which passed every test:

For the 1GB test for 1 cycle:

root@am62xx-evm:/# memtester 1G 1
memtester version 4.5.1 (64-bit)
Copyright (C) 2001-2020 Charles Cazabon.
Licensed under the GNU General Public License version 2 (only).
pagesize is 4096
pagesizemask is 0xfffffffffffff000
want 1024MB (1073741824 bytes)
got 1024MB (1073741824 bytes), trying mlock ...locked.
Loop 1/1:
Stuck Address : ok
Random Value : ok
Compare XOR : ok
Compare SUB : ok
Compare MUL : ok
Compare DIV : ok
Compare OR : ok
Compare AND : ok
Sequential Increment: ok
Solid Bits : ok
Block Sequential : ok

For the 1.5GB test for 1 cycle:

root@am62xx-evm:/# memtester 1500M 1
memtester version 4.5.1 (64-bit)
Copyright (C) 2001-2020 Charles Cazabon.
Licensed under the GNU General Public License version 2 (only).
pagesize is 4096
pagesizemask is 0xfffffffffffff000
want 1500MB (1572864000 bytes)
got 1500MB (1572864000 bytes), trying mlock ...locked.
Loop 1/1:
Stuck Address : ok
Random Value : ok
Compare XOR : ok
Compare SUB : ok
Compare MUL : ok
Compare DIV : ok
Compare OR : ok
Compare AND : ok
Sequential Increment: ok
Solid Bits : ok
Block Sequential : ok

Queries:
1) What might be the reasons for the failures during the memtest?
2) What are the methods to mitigate the same?
3) Do we need to do RAM tuning?

  • Greetings TI community,

     We would appreciate your guidance on this query. Thank you for your time and consideration.

    Regards,

    Visweshwar Selvaraj

  • Hi Visweshwar, are you using the device tree file in the SDK from the EVM, or did you use the DDR register configuration tool: https://dev.ti.com/sysconfig/?product=Processor_DDR_Config&device=AM62x to customize the DDR configuration based on the device you chose and board design?  Can you post the .dtsi file that you are using for DDR configuration?

    Did you perform any board simulations on the DDR interface?

    Were all of the guidelines followed in https://www.ti.com/lit/pdf/sprad06?

    Are the failures on 1 or multiple boards?

    Regards,

    James

  • Hi JJD,

    1) We are using the device tree file from ti-uboot as we have utilized the same DDR4 RAM (Part No: MT40A1G16TB-062EIT:F) module as used in the EVM.

    The ti-uboot device tree: ti-u-boot-2023.04

    // SPDX-License-Identifier: GPL-2.0+
    /*
    * This file was generated with the
    * AM62x SysConfig DDR Subsystem Register Configuration Tool v0.09.07
    * Tue Feb 28 2023 14:47:40 GMT-0600 (Central Standard Time)
    * DDR Type: DDR4
    * Frequency = 800MHz (1600MTs)
    * Density: 16Gb
    * Number of Ranks: 1
    */
    #define DDRSS_PLL_FHS_CNT 6
    #define DDRSS_PLL_FREQUENCY_1 400000000
    #define DDRSS_PLL_FREQUENCY_2 400000000
    #define DDRSS_CTL_0_DATA 0x00000A00
    #define DDRSS_CTL_1_DATA 0x00000000
    #define DDRSS_CTL_2_DATA 0x00000000
    #define DDRSS_CTL_3_DATA 0x00000000
    #define DDRSS_CTL_4_DATA 0x00000000
    #define DDRSS_CTL_5_DATA 0x00000000

    The ti-linux device tree: ti-linux-6.1.y

    // SPDX-License-Identifier: GPL-2.0
    /*
    *
    * AM625 Minimal dts file
    * Copyright (C) 2021-2022 Texas Instruments Incorporated - https://www.ti.com/
    *
    */
    /dts-v1/;
    #include <dt-bindings/leds/common.h>
    #include <dt-bindings/gpio/gpio.h>
    #include <dt-bindings/net/ti-dp83867.h>
    #include "k3-pinctrl.h"
    #include "k3-am625.dtsi"
    / {
    compatible = "ti,am625-sk","ti,am625";
    model = " ********* ";

    2) Based on the thread below, we understood that customizing the DDR configuration files was not required for this combination of AM62x SoC and DDR4 RAM module.

    https://e2e.ti.com/support/processors-group/processors/f/processors-forum/1294007/am625-ddr4-tuning-tool/4921666?tisearch=e2e-sitesearch&keymatch=am625%252525252525252520memtester#4921666

    3) No, we have not performed any board simulations, but we followed the guidelines mentioned above while designing the boards. We are also seeing this issue on multiple boards.

  • Can you check the following voltage rails while performing the tests:

      
    Processor Voltage Rails:
    • VDDS_DDR = VDDS_DDR_C = 1.2V
    • VDDA_PLL0 = 1.8V (AM62x)
    • VDDA_DDR_PLL0 = VDD_CORE = 0.75 or 0.85V (AM62 AMC package only)

    Memory voltage rails:

    • VTT = 0.6V
    • DDR_VREFCA = 0.6V
    • DDR_VPP = 2.5V
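If the rail measurements are logged during a test run, a check along the following lines can flag excursions. It is a sketch only: the nominal values come from the list above, but the 5% tolerance, the function name `out_of_tolerance`, and the sample readings are placeholders I am assuming for illustration. The real limits must come from the processor and memory datasheets.

```python
# Nominal values taken from the rail list above. VDDA_DDR_PLL0 / VDD_CORE is
# omitted because it has two valid nominal values (0.75 V or 0.85 V).
NOMINAL_V = {
    "VDDS_DDR": 1.2,
    "VDDA_PLL0": 1.8,
    "VTT": 0.6,
    "DDR_VREFCA": 0.6,
    "DDR_VPP": 2.5,
}

# ASSUMPTION: placeholder tolerance, not a datasheet figure.
TOLERANCE = 0.05


def out_of_tolerance(measured, nominal=NOMINAL_V, tolerance=TOLERANCE):
    """Return {rail: (measured, nominal)} for rails outside nominal +/- tolerance."""
    bad = {}
    for rail, volts in measured.items():
        target = nominal[rail]
        if abs(volts - target) > tolerance * target:
            bad[rail] = (volts, target)
    return bad


if __name__ == "__main__":
    # Hypothetical worst-case readings captured while memtester is running.
    readings = {"VDDS_DDR": 1.12, "VTT": 0.6, "DDR_VPP": 2.5}
    print(out_of_tolerance(readings))
```

Feeding it the minimum and maximum values seen on a scope during the test, rather than a DC multimeter reading, is more likely to catch load-dependent droop.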

    Was the EVM design also followed, including the number of layers, ground reference layers, routing layers, and location of decoupling caps?

    It looks like you have subtle, infrequent, single-bit errors, so the details of the differences between the EVM and your custom board may be important.
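The single-bit nature of the errors can be checked by XOR-ing the two values in each FAILURE line and listing the set bits. The following is a minimal sketch: the helper name `flipped_bits` is invented for illustration, and the value pairs are copied from the memtester output posted earlier in this thread.

```python
def flipped_bits(value_a, value_b):
    """Return the bit positions (0 = least significant) where the values differ."""
    diff = value_a ^ value_b
    return [bit for bit in range(diff.bit_length()) if (diff >> bit) & 1]


# Value pairs copied from the memtester FAILURE lines posted in this thread.
PAIRS = [
    (0x1041e82a881663ce, 0x1041e82a881463ce),
    (0x80758605128b4e9, 0x80758604128b4e9),
    (0x80758604528b4e9, 0x80758604128b4e9),
    (0x82758604128b4e9, 0x80758604128b4e9),
    (0x89a43f3704365584, 0x89843f3704365584),
]

if __name__ == "__main__":
    for value_a, value_b in PAIRS:
        bits = flipped_bits(value_a, value_b)
        print(f"{value_a:#018x} ^ {value_b:#018x} -> bits {bits}")
```

Every pair differs in exactly one bit (bits 17, 28, 26, 53 and 53 of the 64-bit word). That fits marginal signal or power integrity better than a grossly wrong configuration, although it does not identify the cause by itself.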

    Can you implement the patch on this page in sections "Adding debug statements to u-boot" and "Getting DDR register dump after initialization" and post the register dump here? That will give me some training results to review.

    Regards,

    James

  • Hi James,

    We checked the voltage rails while running the memtester tool and found some voltage fluctuations. Hence, we used external wires to source more current to stabilize the voltage. When the voltage was stabilized, we ran the memtester tool again and the results had no failures.

    Regards,

    Visweshwar Selvaraj