AM3359: Linux hangs

Part Number: AM3359

Hello, we developed a custom board using the AM3359 and we are seeing a strange issue where the CPU gets completely stuck and the device becomes unusable: the LEDs stop blinking, the serial console outputs nothing and accepts no input, and only a reset makes it work again.

This happens especially when transferring big files via SSH (read and write) or applying software updates (write-intensive eMMC block access), while most of the time it works fine. The hang also happens only at higher CPU frequencies. After many tries, with the frequency locked using the userspace governor, this seems to be the pattern:

- 300MHz -> never hangs

- 600MHz -> rarely hangs

- 720MHz -> almost always hangs

- 800MHz -> almost always hangs
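
For anyone reproducing the pattern above, locking the frequency can be sketched as follows. This is a minimal sketch, assuming the standard Linux cpufreq sysfs layout and a kernel built with the userspace governor; the `lock_cpu_freq` helper name and its optional directory argument are hypothetical, not part of the SDK.

```shell
#!/bin/sh
# Sketch: pin the CPU to one frequency through the cpufreq sysfs interface.
# Assumes the "userspace" governor is available in the running kernel.
# lock_cpu_freq is a hypothetical helper; the optional second argument
# (cpufreq directory) only exists so another CPU or policy can be targeted.
lock_cpu_freq() {
    freq_khz=$1
    dir=${2:-/sys/devices/system/cpu/cpu0/cpufreq}

    [ -n "$freq_khz" ] || { echo "usage: lock_cpu_freq <kHz> [dir]" >&2; return 1; }
    [ -w "$dir/scaling_governor" ] || { echo "no writable cpufreq interface in $dir" >&2; return 1; }

    # Hand frequency control to userspace, then request the target frequency.
    echo userspace > "$dir/scaling_governor" || return 1
    echo "$freq_khz" > "$dir/scaling_setspeed" || return 1
}

# Example (as root on the target); cpufreq values are in kHz, so 300 MHz is 300000:
# lock_cpu_freq 300000 && cat /sys/devices/system/cpu/cpu0/cpufreq/scaling_cur_freq
```

Reading `scaling_cur_freq` back afterwards is a cheap way to confirm the request was accepted before starting a transfer test.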

We thought it could be a problem with the eMMC memory, but we tried transferring files via FTP and it works fine and never hangs.

I tried writing a good portion of the eMMC with dd using a big file and reading it back, and the CPU still never got stuck.
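
A write-and-read-back check of this kind could look roughly like the sketch below. It is not the exact command sequence used here: `write_readback` is a hypothetical helper, the target is assumed to be a file on the eMMC filesystem, and the reference data is kept in temporary files (often RAM-backed on embedded systems, so the size has to fit).

```shell
#!/bin/sh
# Sketch: write random data to a target, read it back, and compare.
# write_readback is a hypothetical helper. <target> is assumed to be a file
# on the eMMC filesystem (or a scratch location whose contents may be lost).
write_readback() {
    target=$1
    count_mb=$2
    src=$(mktemp) || return 1
    back=$(mktemp) || { rm -f "$src"; return 1; }
    rc=1

    # 1. generate reference data, 2. write it out and flush it to the device,
    # 3. read the same amount back, 4. compare byte for byte.
    # On the target, drop the page cache between steps 2 and 3
    # (echo 3 > /proc/sys/vm/drop_caches) so the read really hits the eMMC.
    if dd if=/dev/urandom of="$src" bs=1M count="$count_mb" 2>/dev/null &&
       dd if="$src" of="$target" bs=1M conv=fsync 2>/dev/null &&
       dd if="$target" of="$back" bs=1M count="$count_mb" 2>/dev/null &&
       cmp -s "$src" "$back"
    then
        rc=0
    fi

    rm -f "$src" "$back"
    return $rc
}

# Example: write_readback /path/on/emmc/testfile 64 && echo "data matches"
```

Comparing against the original data, rather than only checking that dd completes, also catches silent corruption and not just hangs.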

So we thought it might be related to the hardware crypto coprocessor as SSH encrypts the traffic. I tried removing the kernel modules with these commands:

# modprobe -r omap-aes-driver
# modprobe -r omap-sham

but it still happens.

For now I set the CPU frequency to 300MHz when applying updates and use the hardware watchdog timer to reset the board in case it should happen while running normally, but this solution is obviously sub-optimal.

Also, in U-Boot this never happens, even while intensively writing or reading the eMMC, although the device tree is pretty much identical.

It would be really helpful if anyone knows what could trigger this kind of issue.

I am using the latest SDK (v06.03).

This is the Linux device tree:

/*
 * Copyright (C) 2012 Texas Instruments Incorporated - http://www.ti.com/
 *
 * This program is free software; you can redistribute it and/or modify
 * it under the terms of the GNU General Public License version 2 as
 * published by the Free Software Foundation.
 */
/dts-v1/;

#include "am33xx.dtsi"
#include "am335x-bone-common.dtsi"

/ {
	model = "TI AM335x";
	compatible = "ti,am335x-bone-black", "ti,am335x-bone", "ti,am33xx";

	memory@80000000 {
		device_type = "memory";
		reg = <0x80000000 0x20000000>; /* 512 MB */
	};

	chosen {
		stdout-path = &uart0;
		tick-timer = &timer2;
	};

	leds {
		pinctrl-names = "default";
		pinctrl-0 = <&user_leds_s0>;

		compatible = "gpio-leds";

		led2 {
			label = "led1";
			gpios = <&gpio3 16 GPIO_ACTIVE_HIGH>;
			default-state = "off";

		};

		led3 {
			label = "led2";
			gpios = <&gpio3 20 GPIO_ACTIVE_HIGH>;
			default-state = "off";

		};

		led4 {
			label = "led3";
			gpios = <&gpio3 19 GPIO_ACTIVE_HIGH>;
			default-state = "off";

		};

		led5 {
			label = "led4";
			gpios = <&gpio3 15 GPIO_ACTIVE_HIGH>;
			default-state = "off";

		};
	};
};

&user_leds_s0 {
	pinctrl-single,pins = <
		AM33XX_IOPAD(0x994, PIN_OUTPUT_PULLDOWN | MUX_MODE7) /* (B13) mcasp0_fsx.gpio3_15 */
		AM33XX_IOPAD(0x9a4, PIN_OUTPUT_PULLDOWN | MUX_MODE7) /* (C13) mcasp0_fsr.gpio3_19 */
		AM33XX_IOPAD(0x998, PIN_OUTPUT_PULLDOWN | MUX_MODE7) /* (D12) mcasp0_axr0.gpio3_16 */
		AM33XX_IOPAD(0x9a8, PIN_OUTPUT_PULLDOWN | MUX_MODE7) /* (D13) mcasp0_axr1.gpio3_20 */
	>;
};


&ldo3_reg {
	regulator-min-microvolt = <1800000>;
	regulator-max-microvolt = <1800000>;
	regulator-always-on;
};

&mmc1 {
	vmmc-supply = <&vmmcsd_fixed>;
	cd-inverted;
};

&mmc2 {
	vmmc-supply = <&vmmcsd_fixed>;
	pinctrl-names = "default";
	pinctrl-0 = <&emmc_pins>;
	bus-width = <8>;
	status = "okay";
};

&lcdc {
	status = "disabled";
};

&rtc {
	system-power-controller;
};

&i2c2 {
	status = "disabled";
};

&baseboard_eeprom {
	status = "disabled";
};

&usb0 {
	status = "okay";
	dr_mode = "host";
};

&usb1 {
	status = "disabled";
	dr_mode = "peripheral";
};

&usb1_phy {
	status = "disabled";
};

&cpsw_emac0 {
	phy_id = <&davinci_mdio>, <1>;
	phy-mode = "mii";
};

&cpsw_emac1 {
	phy_id = <&davinci_mdio>, <0>;
	phy-mode = "mii";
};

&cpsw_default {
	pinctrl-single,pins = <
		/* Slave 1 */
		AM33XX_IOPAD(0x910, PIN_INPUT_PULLUP | MUX_MODE0)	/* mii1_rxerr.mii1_rxerr */
		AM33XX_IOPAD(0x914, PIN_OUTPUT_PULLDOWN | MUX_MODE0)	/* mii1_txen.mii1_txen */
		AM33XX_IOPAD(0x918, PIN_INPUT_PULLUP | MUX_MODE0)	/* mii1_rxdv.mii1_rxdv */
		AM33XX_IOPAD(0x91c, PIN_OUTPUT_PULLDOWN | MUX_MODE0)	/* mii1_txd3.mii1_txd3 */
		AM33XX_IOPAD(0x920, PIN_OUTPUT_PULLDOWN | MUX_MODE0)	/* mii1_txd2.mii1_txd2 */
		AM33XX_IOPAD(0x924, PIN_OUTPUT_PULLDOWN | MUX_MODE0)	/* mii1_txd1.mii1_txd1 */
		AM33XX_IOPAD(0x928, PIN_OUTPUT_PULLDOWN | MUX_MODE0)	/* mii1_txd0.mii1_txd0 */
		AM33XX_IOPAD(0x92c, PIN_INPUT_PULLUP | MUX_MODE0)	/* mii1_txclk.mii1_txclk */
		AM33XX_IOPAD(0x930, PIN_INPUT_PULLUP | MUX_MODE0)	/* mii1_rxclk.mii1_rxclk */
		AM33XX_IOPAD(0x934, PIN_INPUT_PULLUP | MUX_MODE0)	/* mii1_rxd3.mii1_rxd3 */
		AM33XX_IOPAD(0x938, PIN_INPUT_PULLUP | MUX_MODE0)	/* mii1_rxd2.mii1_rxd2 */
		AM33XX_IOPAD(0x93c, PIN_INPUT_PULLUP | MUX_MODE0)	/* mii1_rxd1.mii1_rxd1 */
		AM33XX_IOPAD(0x940, PIN_INPUT_PULLUP | MUX_MODE0)	/* mii1_rxd0.mii1_rxd0 */
		
		/* Slave 0 */
		AM33XX_IOPAD(0x874, PIN_INPUT_PULLUP | MUX_MODE1)	/* gpmc_wpn.mii2_rxerr */
		AM33XX_IOPAD(0x840, PIN_OUTPUT_PULLDOWN | MUX_MODE1)	/* gpmc_a0.mii2_txen */
		AM33XX_IOPAD(0x844, PIN_INPUT_PULLUP | MUX_MODE1)	/* gpmc_a1.mii2_rxdv */
		AM33XX_IOPAD(0x848, PIN_OUTPUT_PULLDOWN | MUX_MODE1)	/* gpmc_a2.mii2_txd3 */
		AM33XX_IOPAD(0x84c, PIN_OUTPUT_PULLDOWN | MUX_MODE1)	/* gpmc_a3.mii2_txd2 */
		AM33XX_IOPAD(0x850, PIN_OUTPUT_PULLDOWN | MUX_MODE1)	/* gpmc_a4.mii2_txd1 */
		AM33XX_IOPAD(0x854, PIN_OUTPUT_PULLDOWN | MUX_MODE1)	/* gpmc_a5.mii2_txd0 */
		AM33XX_IOPAD(0x858, PIN_INPUT_PULLUP | MUX_MODE1)	/* gpmc_a6.mii2_txclk */
		AM33XX_IOPAD(0x85c, PIN_INPUT_PULLUP | MUX_MODE1)	/* gpmc_a7.mii2_rxclk */
		AM33XX_IOPAD(0x860, PIN_INPUT_PULLUP | MUX_MODE1)	/* gpmc_a8.mii2_rxd3 */
		AM33XX_IOPAD(0x864, PIN_INPUT_PULLUP | MUX_MODE1)	/* gpmc_a9.mii2_rxd2 */
		AM33XX_IOPAD(0x868, PIN_INPUT_PULLUP | MUX_MODE1)	/* gpmc_a10.mii2_rxd1 */
		AM33XX_IOPAD(0x86c, PIN_INPUT_PULLUP | MUX_MODE1)	/* gpmc_a11.mii2_rxd0 */
	>;
};

&cpsw_sleep {
	pinctrl-single,pins = <
		/* Slave 1 reset value */
		AM33XX_IOPAD(0x910, PIN_INPUT_PULLDOWN | MUX_MODE7)
		AM33XX_IOPAD(0x914, PIN_INPUT_PULLDOWN | MUX_MODE7)
		AM33XX_IOPAD(0x918, PIN_INPUT_PULLDOWN | MUX_MODE7)
		AM33XX_IOPAD(0x91c, PIN_INPUT_PULLDOWN | MUX_MODE7)
		AM33XX_IOPAD(0x920, PIN_INPUT_PULLDOWN | MUX_MODE7)
		AM33XX_IOPAD(0x924, PIN_INPUT_PULLDOWN | MUX_MODE7)
		AM33XX_IOPAD(0x928, PIN_INPUT_PULLDOWN | MUX_MODE7)
		AM33XX_IOPAD(0x92c, PIN_INPUT_PULLDOWN | MUX_MODE7)
		AM33XX_IOPAD(0x930, PIN_INPUT_PULLDOWN | MUX_MODE7)
		AM33XX_IOPAD(0x934, PIN_INPUT_PULLDOWN | MUX_MODE7)
		AM33XX_IOPAD(0x938, PIN_INPUT_PULLDOWN | MUX_MODE7)
		AM33XX_IOPAD(0x93c, PIN_INPUT_PULLDOWN | MUX_MODE7)
		AM33XX_IOPAD(0x940, PIN_INPUT_PULLDOWN | MUX_MODE7)
		
		/* Slave 0 reset value */
		AM33XX_IOPAD(0x874, PIN_INPUT_PULLDOWN | MUX_MODE7)
		AM33XX_IOPAD(0x840, PIN_INPUT_PULLDOWN | MUX_MODE7)
		AM33XX_IOPAD(0x844, PIN_INPUT_PULLDOWN | MUX_MODE7)
		AM33XX_IOPAD(0x848, PIN_INPUT_PULLDOWN | MUX_MODE7)
		AM33XX_IOPAD(0x84c, PIN_INPUT_PULLDOWN | MUX_MODE7)
		AM33XX_IOPAD(0x850, PIN_INPUT_PULLDOWN | MUX_MODE7)
		AM33XX_IOPAD(0x854, PIN_INPUT_PULLDOWN | MUX_MODE7)
		AM33XX_IOPAD(0x858, PIN_INPUT_PULLDOWN | MUX_MODE7)
		AM33XX_IOPAD(0x85c, PIN_INPUT_PULLDOWN | MUX_MODE7)
		AM33XX_IOPAD(0x860, PIN_INPUT_PULLDOWN | MUX_MODE7)
		AM33XX_IOPAD(0x864, PIN_INPUT_PULLDOWN | MUX_MODE7)
		AM33XX_IOPAD(0x868, PIN_INPUT_PULLDOWN | MUX_MODE7)
		AM33XX_IOPAD(0x86c, PIN_INPUT_PULLDOWN | MUX_MODE7)
	>;
};

&mac {
	slaves = <2>;
};

&pruss_soc_bus {
	status = "disabled";

	pruss: pruss@4a300000 {
		status = "disabled";
	};
};

U-Boot device tree:

/*
 * Copyright (C) 2012 Texas Instruments Incorporated - http://www.ti.com/
 *
 * This program is free software; you can redistribute it and/or modify
 * it under the terms of the GNU General Public License version 2 as
 * published by the Free Software Foundation.
 */
/dts-v1/;

#include "am33xx.dtsi"
#include "am335x-bone-common.dtsi"

/ {
	model = "TI AM335x";
	compatible = "ti,am335x-bone-black", "ti,am335x-bone", "ti,am33xx";
	
	memory@80000000 {
		device_type = "memory";
		reg = <0x80000000 0x20000000>; /* 512 MB */
	};
	
	chosen {
		stdout-path = &uart0;
		tick-timer = &timer2;
	};

	config {
		u-boot,mmc-env-offset = <0x560000>;
		u-boot,mmc-env-offset-redundant = <0x580000>;
	};

	leds {
		pinctrl-names = "default";
		pinctrl-0 = <&user_leds_s0>;

		compatible = "gpio-leds";

		led@2 {
			label = "bled1";
			gpios = <&gpio3 16 GPIO_ACTIVE_HIGH>;
			default-state = "off";
			/delete-property/ linux,default-trigger;
		};

		led@3 {
			label = "bled2";
			gpios = <&gpio3 20 GPIO_ACTIVE_HIGH>;
			default-state = "off";
			/delete-property/ linux,default-trigger;
		};

		led@4 {
			label = "bled3";
			gpios = <&gpio3 19 GPIO_ACTIVE_HIGH>;
			default-state = "off";
			/delete-property/ linux,default-trigger;
		};

		led@5 {
			label = "bled4";
			gpios = <&gpio3 15 GPIO_ACTIVE_HIGH>;
			default-state = "off";
			/delete-property/ linux,default-trigger;
		};
	};
};

&ldo3_reg {
	regulator-min-microvolt = <1800000>;
	regulator-max-microvolt = <1800000>;
	regulator-always-on;
};

&mmc1 {
	vmmc-supply = <&vmmcsd_fixed>;
	cd-gpios = <&gpio0 6 GPIO_ACTIVE_HIGH>;
};

&mmc2 {
	vmmc-supply = <&vmmcsd_fixed>;
	pinctrl-names = "default";
	pinctrl-0 = <&emmc_pins>;
	bus-width = <8>;
	status = "okay";
};

&rtc {
	system-power-controller;
};

&usb0 {
	status = "okay";
	dr_mode = "host";
};

&usb1 {
	status = "okay";
	dr_mode = "peripheral";
};

&cpsw_emac0 {
	phy_id = <&davinci_mdio>, <1>;
	phy-mode = "mii";
};

&cpsw_emac1 {
	phy_id = <&davinci_mdio>, <0>;
	phy-mode = "mii";
};

cpsw_default: &cpsw_default {
	pinctrl-single,pins = <
		/* Slave 1 */
		AM33XX_IOPAD(0x910, PIN_INPUT_PULLUP | MUX_MODE0)	/* mii1_rxerr.mii1_rxerr */
		AM33XX_IOPAD(0x914, PIN_OUTPUT_PULLDOWN | MUX_MODE0)	/* mii1_txen.mii1_txen */
		AM33XX_IOPAD(0x918, PIN_INPUT_PULLUP | MUX_MODE0)	/* mii1_rxdv.mii1_rxdv */
		AM33XX_IOPAD(0x91c, PIN_OUTPUT_PULLDOWN | MUX_MODE0)	/* mii1_txd3.mii1_txd3 */
		AM33XX_IOPAD(0x920, PIN_OUTPUT_PULLDOWN | MUX_MODE0)	/* mii1_txd2.mii1_txd2 */
		AM33XX_IOPAD(0x924, PIN_OUTPUT_PULLDOWN | MUX_MODE0)	/* mii1_txd1.mii1_txd1 */
		AM33XX_IOPAD(0x928, PIN_OUTPUT_PULLDOWN | MUX_MODE0)	/* mii1_txd0.mii1_txd0 */
		AM33XX_IOPAD(0x92c, PIN_INPUT_PULLUP | MUX_MODE0)	/* mii1_txclk.mii1_txclk */
		AM33XX_IOPAD(0x930, PIN_INPUT_PULLUP | MUX_MODE0)	/* mii1_rxclk.mii1_rxclk */
		AM33XX_IOPAD(0x934, PIN_INPUT_PULLUP | MUX_MODE0)	/* mii1_rxd3.mii1_rxd3 */
		AM33XX_IOPAD(0x938, PIN_INPUT_PULLUP | MUX_MODE0)	/* mii1_rxd2.mii1_rxd2 */
		AM33XX_IOPAD(0x93c, PIN_INPUT_PULLUP | MUX_MODE0)	/* mii1_rxd1.mii1_rxd1 */
		AM33XX_IOPAD(0x940, PIN_INPUT_PULLUP | MUX_MODE0)	/* mii1_rxd0.mii1_rxd0 */
		
		/* Slave 0 */
		AM33XX_IOPAD(0x874, PIN_INPUT_PULLUP | MUX_MODE1)	/* gpmc_wpn.mii2_rxerr */
		AM33XX_IOPAD(0x840, PIN_OUTPUT_PULLDOWN | MUX_MODE1)	/* gpmc_a0.mii2_txen */
		AM33XX_IOPAD(0x844, PIN_INPUT_PULLUP | MUX_MODE1)	/* gpmc_a1.mii2_rxdv */
		AM33XX_IOPAD(0x848, PIN_OUTPUT_PULLDOWN | MUX_MODE1)	/* gpmc_a2.mii2_txd3 */
		AM33XX_IOPAD(0x84c, PIN_OUTPUT_PULLDOWN | MUX_MODE1)	/* gpmc_a3.mii2_txd2 */
		AM33XX_IOPAD(0x850, PIN_OUTPUT_PULLDOWN | MUX_MODE1)	/* gpmc_a4.mii2_txd1 */
		AM33XX_IOPAD(0x854, PIN_OUTPUT_PULLDOWN | MUX_MODE1)	/* gpmc_a5.mii2_txd0 */
		AM33XX_IOPAD(0x858, PIN_INPUT_PULLUP | MUX_MODE1)	/* gpmc_a6.mii2_txclk */
		AM33XX_IOPAD(0x85c, PIN_INPUT_PULLUP | MUX_MODE1)	/* gpmc_a7.mii2_rxclk */
		AM33XX_IOPAD(0x860, PIN_INPUT_PULLUP | MUX_MODE1)	/* gpmc_a8.mii2_rxd3 */
		AM33XX_IOPAD(0x864, PIN_INPUT_PULLUP | MUX_MODE1)	/* gpmc_a9.mii2_rxd2 */
		AM33XX_IOPAD(0x868, PIN_INPUT_PULLUP | MUX_MODE1)	/* gpmc_a10.mii2_rxd1 */
		AM33XX_IOPAD(0x86c, PIN_INPUT_PULLUP | MUX_MODE1)	/* gpmc_a11.mii2_rxd0 */
	>;
};

cpsw_sleep: &cpsw_sleep {
	pinctrl-single,pins = <
		/* Slave 1 reset value */
		AM33XX_IOPAD(0x910, PIN_INPUT_PULLDOWN | MUX_MODE7)
		AM33XX_IOPAD(0x914, PIN_INPUT_PULLDOWN | MUX_MODE7)
		AM33XX_IOPAD(0x918, PIN_INPUT_PULLDOWN | MUX_MODE7)
		AM33XX_IOPAD(0x91c, PIN_INPUT_PULLDOWN | MUX_MODE7)
		AM33XX_IOPAD(0x920, PIN_INPUT_PULLDOWN | MUX_MODE7)
		AM33XX_IOPAD(0x924, PIN_INPUT_PULLDOWN | MUX_MODE7)
		AM33XX_IOPAD(0x928, PIN_INPUT_PULLDOWN | MUX_MODE7)
		AM33XX_IOPAD(0x92c, PIN_INPUT_PULLDOWN | MUX_MODE7)
		AM33XX_IOPAD(0x930, PIN_INPUT_PULLDOWN | MUX_MODE7)
		AM33XX_IOPAD(0x934, PIN_INPUT_PULLDOWN | MUX_MODE7)
		AM33XX_IOPAD(0x938, PIN_INPUT_PULLDOWN | MUX_MODE7)
		AM33XX_IOPAD(0x93c, PIN_INPUT_PULLDOWN | MUX_MODE7)
		AM33XX_IOPAD(0x940, PIN_INPUT_PULLDOWN | MUX_MODE7)
		
		/* Slave 0 reset value */
		AM33XX_IOPAD(0x874, PIN_INPUT_PULLDOWN | MUX_MODE7)
		AM33XX_IOPAD(0x840, PIN_INPUT_PULLDOWN | MUX_MODE7)
		AM33XX_IOPAD(0x844, PIN_INPUT_PULLDOWN | MUX_MODE7)
		AM33XX_IOPAD(0x848, PIN_INPUT_PULLDOWN | MUX_MODE7)
		AM33XX_IOPAD(0x84c, PIN_INPUT_PULLDOWN | MUX_MODE7)
		AM33XX_IOPAD(0x850, PIN_INPUT_PULLDOWN | MUX_MODE7)
		AM33XX_IOPAD(0x854, PIN_INPUT_PULLDOWN | MUX_MODE7)
		AM33XX_IOPAD(0x858, PIN_INPUT_PULLDOWN | MUX_MODE7)
		AM33XX_IOPAD(0x85c, PIN_INPUT_PULLDOWN | MUX_MODE7)
		AM33XX_IOPAD(0x860, PIN_INPUT_PULLDOWN | MUX_MODE7)
		AM33XX_IOPAD(0x864, PIN_INPUT_PULLDOWN | MUX_MODE7)
		AM33XX_IOPAD(0x868, PIN_INPUT_PULLDOWN | MUX_MODE7)
		AM33XX_IOPAD(0x86c, PIN_INPUT_PULLDOWN | MUX_MODE7)
	>;
};

&mac {
	slaves = <2>;
	active_slave = <0>;
	pinctrl-names = "default", "sleep";
	pinctrl-0 = <&cpsw_default>;
	pinctrl-1 = <&cpsw_sleep>;
};

We are also using a 25MHz OSC and 303MHz DDR3 RAM.

If more info is necessary please tell me.

Thank you

  • I would like to correct the statement regarding U-Boot: very rarely it also gets stuck in U-Boot, especially at "Starting kernel...", but after the watchdog resets the board it boots successfully many times in a row.

  • Hi Davide,

    There is also DDR3 to suspect. Did you try increasing the DDR3 frequency? Have you cross-checked that the DDR configuration is fine (https://www.ti.com/lit/an/sprack4/sprack4.pdf?ts=1592724397876 and https://processors.wiki.ti.com/index.php/AM335x_EMIF_Configuration_tips)?

    Do you see any crash logs before the hang? You can run a DDR stress test (mtest) in U-Boot. You can also try using tmpfs in Linux so that the files (through FTP/SSH) land in DDR3 instead of eMMC, then repeat the transfer test above and see if it hangs.

  • I tried doing the DDR tuning but the same thing still happens, although I'm not sure I set all the parameters correctly, especially the PCB delay per inch and the trace lengths of CK and DQS (those I should be able to verify tomorrow).

    Copying files via SSH to tmpfs still hangs, while with FTP it doesn't.
    With the DDR frequency set to 400MHz before the tuning, Linux wouldn't even boot and the board would just hang; after the tuning the result is the same as with 303MHz, so that's a step forward.

    Running mtest in U-Boot like this (we have 512MB of RAM): "mtest 0x80500000 0x9FFFF000" shows a bunch of failures:
    ...
    FAILURE (read/write): @ 0x8050d9b8: expected 0xf8965d4e, actual 0x00000000)

    FAILURE (read/write): @ 0x8050d9bc: expected 0xf8965d4d, actual 0x00000000)

    FAILURE (read/write): @ 0x8050d9c0: expected 0xf8965d4c, actual 0x00000000)

    FAILURE (read/write): @ 0x8050d9c4: expected 0xf8965d4b, actual 0x00000000)

    FAILURE (read/write): @ 0x8050d9c8: expected 0xf8965d4a, actual 0x00000000)

    FAILURE (read/write): @ 0x8050d9cc: expected 0xf8965d49, actual 0x00000000)

    FAILURE (read/write): @ 0x8050d9d0: expected 0xf8965d48, actual 0x00000000)

    FAILURE (read/write): @ 0x8050d9d4: expected 0xf8965d47, actual 0x00000000)

    FAILURE (read/write): @ 0x8050d9d8: expected 0xf8965d46, actual 0x00000000)

    FAILURE (read/write): @ 0x8050d9dc: expected 0xf8965d45, actual 0x00000000)

    FAILURE (read/write): @ 0x8050d9e0: expected 0xf8965d44, actual 0x00000000)

    FAILURE (read/write): @ 0x8050d9e4: expected 0xf8965d43, actual 0x00000000)

    FAILURE (read/write): @ 0x8050d9e8: expected 0xf8965d42, actual 0x00000000)

    FAILURE (read/write): @ 0x8050d9ec: expected 0xf8965d41, actual 0x00000000)

    FAILURE (read/write): @ 0x8050d9f0: expected 0xf8965d40, actual 0x00000000)

    FAILURE (read/write): @ 0x8050d9f4: expected 0xf8965d3f, actual 0x00000000)

    FAILURE (read/write): @ 0x8050d9f8: expected 0xf8965d3e, actual 0x00000000)
    ...

    But if I run it like this it shows no errors: "mtest 0x8050d9f4 0x80510000".
    I also ran memtester many times in Linux, and it failed just once, reading all zeros like above.
    But when running "memtester -p 0x9FFF0000 20" or "memtester -p 0xA0000000 20" (which is outside the 512MB) the exact same hang happens. So I then tried reducing the total amount to 256MB, and Linux did indeed show only 256MB from 0x80000000, but the hang still happens while transferring files.
    It seems to hang when accessing certain RAM addresses, but I couldn't track down which ones, except the two above.


    There is also no crash log and it's impossible to recover any kernel log as the board gets stuck.

  • Hi,

    One pointer for running mtest from U-Boot: we need to account for the U-Boot code, its sections and its size. U-Boot is generally loaded at address 0x80800000 (this can be found by running mkimage -l u-boot.img on the host). It would be better to run mtest either from code under CONFIG_SPL or from the command line, starting at the kernel load address, e.g. 0x81000000 (or 0x82000000), and going to the end of memory.

  • Hello,

    We tried exactly the same Linux build, filesystem and U-Boot on the BeagleBone Black, with only the device trees changed, and the hang never happens there. But if I try writing outside the RAM boundaries, it still hangs in the same way. I know that shouldn't be done, but is it possible that on our custom board Linux writes to some address outside the boundaries for some reason?

    Also, running mtest from 0x82000000 to 0xA0000000 on both our board and the BeagleBone Black reports exactly the same failures, so I suspect mtest could be wrong. Is this also possible?

    Also, could you explain a bit better what step 3B of section 2.2.3 means by "Input the trace lengths per byte" (https://www.ti.com/lit/an/sprack4/sprack4.pdf?ts=1592826954231)?

  • Hi Davide,

    I have not looked into section 2.2.3 yet and will get back to you after checking. Meanwhile:

    Are you using the same DDR part number on the custom board as on the BeagleBone Black?

    I see the dts file is for the BeagleBone, and if you check the file board/ti/am335x/board.c, there is code that loads the DDR configuration based on the EEPROM contents. Maybe it is picking up the correct DDR configuration for the BeagleBone Black.

    I just realized mtest is not enabled by default. Did you enable it?

  • The part number is mt41k256m16ha-125, so I set the correct values in the spreadsheet from the part datasheet. What we are still trying to figure out is the impedance values.

    The dts is based on the BeagleBone Black because the board is similar to it. We don't have the I2C EEPROM though, so we set the configuration directly with our values.

    I did indeed enable mtest on both the BeagleBone Black and our board, both running the exact same U-Boot, SPL and Linux with just the device tree and board.c changes.

  • Hi Davide,

    "Input the trace lengths per byte" means the following: since the AM335x EMIF data bus is 16 bits wide at most, it can be split into byte 0 (DQ0-DQ7) and byte 1 (DQ8-DQ15). There could be two 8-bit DDR devices connected, in which case the trace lengths differ for both clock and data bytes. Or there could be one 16-bit DDR3 device connected, in which case the trace lengths can differ for the data bytes only and not for the clock (since there is only one clock pair, going to a single DDR device). The trace length values are worth suspecting because, even though the part number is similar, they vary with the PCB. Another experiment: try your custom board's DDR3 configuration on the BeagleBone Black, if you have not already, and check whether it hangs.

    If you enabled mtest, can you please confirm the values of these macros: CONFIG_SYS_MEMTEST_START (overridden if a start address is given), CONFIG_SYS_MEMTEST_END (overridden if an end address is given) and CONFIG_SYS_MEMTEST_SCRATCH. If any of these values overlaps with U-Boot or with the range under test, it could give errors. Otherwise I don't see a reason for the test to fail or be wrong. You can try giving a pattern and an iteration count of 1.

    Generally, I have seen that if you write to an address equal to end_address + offset, the data ends up written to start_address + offset, so there is a high chance that it corrupts memory and causes a hang. But for a userspace command like ssh hanging, I am not yet sure the issue is a memory address going out of range (because the dts has a proper range entry).
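
    The wrap-around described above can be illustrated with a little address arithmetic. This is only a sketch: `alias_addr` is a hypothetical helper, the base and size are taken from the 512 MB memory node in the posted device tree, and the modulo aliasing is an assumption based on the behaviour described, not something verified against the EMIF configuration. It needs a shell with 64-bit arithmetic, so it is easiest to run on a host PC.

```shell
#!/bin/sh
# Sketch of the wrap-around described above, for a 512 MB bank at 0x80000000.
# Assumption: an access past the end of RAM aliases modulo the RAM size; the
# real behaviour depends on the EMIF configuration and is not verified here.
RAM_BASE=$((0x80000000))
RAM_SIZE=$((0x20000000))   # 512 MB, as in the device tree "reg" property

# alias_addr is a hypothetical helper: print where an access would land.
alias_addr() {
    addr=$(($1))
    printf '0x%08x\n' $(( RAM_BASE + (addr - RAM_BASE) % RAM_SIZE ))
}

alias_addr 0xA0000000   # first address past the end of RAM -> 0x80000000
alias_addr 0xA0001000   # end of RAM + 0x1000               -> 0x80001000
```

    Under that assumption, a write just past the end of RAM would land at the very start of RAM, which would overwrite unrelated data there.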

  • OK, so the trace lengths are correct. We are now suspecting the impedance values. Do you have any suggestions about those? For now we left the default ones.

    For mtest I set these constants:

    CONFIG_SYS_MEMTEST_START: 0x82000000

    CONFIG_SYS_MEMTEST_END: 0xA0000000

    CONFIG_SYS_MEMTEST_SCRATCH: 0x9FFF0000

    Could the problem be that the test is overwriting the scratch area?

    I noticed it starts again from the start address, so I probably was indeed overwriting something. I don't think it's going out of range either.
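
    Whether the scratch address falls inside the tested range can be checked with a quick comparison. This is a minimal sketch using the three values listed above; `in_range` is a hypothetical helper, and the range is treated as [start, end), which may not match how mtest handles its end address.

```shell
#!/bin/sh
# Sketch: check whether an address falls inside the range handed to mtest.
# in_range is a hypothetical helper; the range is treated as [start, end).
# Needs a shell with 64-bit arithmetic (easiest to run on a host PC).
in_range() {
    addr=$(($1)); start=$(($2)); end=$(($3))
    [ $(( addr >= start && addr < end )) -eq 1 ]
}

MEMTEST_START=0x82000000
MEMTEST_END=0xA0000000
MEMTEST_SCRATCH=0x9FFF0000

if in_range "$MEMTEST_SCRATCH" "$MEMTEST_START" "$MEMTEST_END"; then
    echo "scratch address lies inside the tested range"
else
    echo "scratch address is outside the tested range"
fi
```

    With these values the scratch address is reported as inside the range, so the test and its scratch location do overlap.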

  • Hi Davide,

    You should get the impedance values from the PCB fabricator and match them (TI has a range/recommendation of impedance values that the PCB hardware should meet).

  • Hello,

    The team working on the PCB still has problems understanding what certain values are.

    Would it be possible to get a more extensive explanation of section 2.2.1, steps 1C and 1E (https://www.ti.com/lit/an/sprack4/sprack4.pdf?ts=1593091955921)?

    Also, for step 1E they measured the impedance values with the CAD software, but the maximum values come out higher than the spreadsheet maximum, which is 80. Changing the values still doesn't really do anything, so we suspect it's something else.

    Are the values from steps 1C and 1E extremely important, or would they not solve our problem anyway?

    Thank you

  • Hi,

    Please go through https://www.ti.com/lit/an/sprabn2a/sprabn2a.pdf?ts=1593148322385&ref_url=https%253A%252F%252Fwww.google.com%252F, section "2.6 General DDR Guidelines", and section "7.7.2 mDDR(LPDDR), DDR2, DDR3, DDR3L Memory Interface" in the datasheet.

    If it is still not clear, I would suggest opening a new query with the exact clarification you need, so that it reaches people who can answer and you get a quicker response.

  • The PCB team will go through the datasheet and check if there is anything wrong. Thank you for the help