SK-AM68: Intermittent boot failure and Ethernet timeout issues

Part Number: SK-AM68

I've noticed that the SK-AM68 dev board running SDK 09.02.00.05 software (https://www.ti.com/tool/download/PROCESSOR-SDK-LINUX-AM68A/09.02.00.05) intermittently fails to boot completely. I've seen this occur after the board has been powered off for a while as well as on warm resets (i.e., running Linux and then issuing `reboot`). I ran into this again today and captured the output below. In this instance, it was a warm reset:

[89489.170895] reboot: Restarting system

U-Boot SPL 2023.04-ti-gf9b966c67473 (Mar 19 2024 - 20:31:40 +0000)
SYSFW ABI: 3.1 (firmware rev 0x0009 '9.2.4--v09.02.04 (Kool Koala)')
EEPROM not available at 0x50, trying to read at 0x51
SPL initial stack usage: 13472 bytes
Trying to boot from MMC2
sdhci_transfer_data: Transfer data timeout
spl: mmc init failed with error: -110
SPL: failed to boot from all boot devices
### ERROR ### Please RESET the board ###

I pressed the reset button on the board and saw it boot, but then it hit an Ethernet timeout.  Since I'm loading the kernel and rootfs over Ethernet via NFS, this prevents the system from booting. 

Here's the trace of that:

U-Boot SPL 2023.04-ti-gf9b966c67473 (Mar 19 2024 - 20:31:40 +0000)
SYSFW ABI: 3.1 (firmware rev 0x0009 '9.2.4--v09.02.04 (Kool Koala)')
EEPROM not available at 0x50, trying to read at 0x51
SPL initial stack usage: 13472 bytes
Trying to boot from MMC2
Authentication passed
Authentication passed
Authentication passed
Loading Environment from nowhere... OK
Authentication passed
Authentication passed
Starting ATF on ARM64 core...

NOTICE:  BL31: v2.10.0(release):v2.10.0-367-g00f1ec6b87-dirty
NOTICE:  BL31: Built : 16:09:05, Feb  9 2024
I/TC: 
I/TC: OP-TEE version: 4.1.0-51-g012cdca49 (gcc version 11.4.0 (GCC)) #1 Tue Jan 30 10:48:03 UTC 2024 aarch64
I/TC: WARNING: This OP-TEE configuration might be insecure!
I/TC: WARNING: Please check optee.readthedocs.io/.../porting_guidelines.html
I/TC: Primary CPU initializing
I/TC: GIC redistributor base address not provided
I/TC: Assuming default GIC group status and modifier
I/TC: SYSFW ABI: 3.1 (firmware rev 0x0009 '9.2.4--v09.02.04 (Kool Koala)')
I/TC: HUK Initialized
I/TC: Activated SA2UL device
I/TC: Enabled firewalls for SA2UL TRNG device
I/TC: SA2UL TRNG initialized
I/TC: SA2UL Drivers initialized
I/TC: Primary CPU switching to normal world boot

U-Boot SPL 2023.04-ti-gf9b966c67473 (Mar 19 2024 - 20:31:40 +0000)
SYSFW ABI: 3.1 (firmware rev 0x0009 '9.2.4--v09.02.04 (Kool Koala)')
Trying to boot from MMC2
Authentication passed
Authentication passed


U-Boot 2023.04-ti-gf9b966c67473 (Mar 19 2024 - 20:31:40 +0000)

SoC:   J721S2 SR1.0 HS-FS
Model: Texas Instruments AM68 SK
Board: AM68-SK-SOM rev E2
DRAM:  2 GiB (effective 8 GiB)
Core:  74 devices, 32 uclasses, devicetree: separate
Flash: 0 Bytes
MMC:   mmc@4fb0000: 1
Loading Environment from nowhere... OK
In:    serial@2880000
Out:   serial@2880000
Err:   serial@2880000
am65_cpsw_nuss ethernet@46000000: K3 CPSW: nuss_ver: 0x6BA02102 cpsw_ver: 0x6BA82102 ale_ver: 0x00293904 Ports:1 mdio_freq:1000000
Net:   eth0: ethernet@46000000port@1
Hit any key to stop autoboot:  0 
switch to partitions #0, OK
mmc1 is current device
SD/MMC found on device 1
Failed to load 'boot.scr'
2326 bytes read in 2 ms (1.1 MiB/s)
Loaded env from uEnv.txt
Importing environment from mmc1 ...
Running uenvcmd ...
syntax error
k3-navss-ringacc ringacc@2b800000: Ring Accelerator probed rings:286, gp-rings[96,20] sci-dev-id:272
k3-navss-ringacc ringacc@2b800000: dma-ring-reset-quirk: disabled
am65_cpsw_nuss_port ethernet@46000000port@1: K3 CPSW: rflow_id_base: 2
ethernet@46000000port@1 Waiting for PHY auto negotiation to complete......... TIMEOUT !
am65_cpsw_nuss_port ethernet@46000000port@1: phy_startup failed
am65_cpsw_nuss_port ethernet@46000000port@1: am65_cpsw_start end error
k3_r5f_rproc r5f@41000000: Core 1 is already in use. No rproc commands work
Failed to load '/lib/firmware/j721s2-mcu-r5f0_1-fw'
474508 bytes read in 26 ms (17.4 MiB/s)
Warning: Did not detect image signing certificate. Skipping authentication to prevent boot failure. This will fail on Security Enforcing(HS-SE) devices
Load Remote Processor 2 with data@addr=0x82000000 474508 bytes: Success!
330964 bytes read in 20 ms (15.8 MiB/s)
Warning: Did not detect image signing certificate. Skipping authentication to prevent boot failure. This will fail on Security Enforcing(HS-SE) devices
Load Remote Processor 3 with data@addr=0x82000000 330964 bytes: Success!
Failed to load '/lib/firmware/j721s2-main-r5f1_0-fw'
Failed to load '/lib/firmware/j721s2-main-r5f1_1-fw'
15212688 bytes read in 292 ms (49.7 MiB/s)
Warning: Did not detect image signing certificate. Skipping authentication to prevent boot failure. This will fail on Security Enforcing(HS-SE) devices
Load Remote Processor 6 with data@addr=0x82000000 15212688 bytes: Success!
9703328 bytes read in 65 ms (142.4 MiB/s)
Warning: Did not detect image signing certificate. Skipping authentication to prevent boot failure. This will fail on Security Enforcing(HS-SE) devices
Load Remote Processor 7 with data@addr=0x82000000 9703328 bytes: Success!
am65_cpsw_nuss_port ethernet@46000000port@1: K3 CPSW: rflow_id_base: 2
ethernet@46000000port@1 Waiting for PHY auto negotiation to complete.. done
link up on port 1, speed 1000, full duplex
*** ERROR: `ipaddr' not set
am65_cpsw_nuss_port ethernet@46000000port@1: K3 CPSW: rflow_id_base: 2
link up on port 1, speed 1000, full duplex
*** ERROR: `ipaddr' not set
Working FDT set to 88000000
am65_cpsw_nuss_port ethernet@46000000port@1: K3 CPSW: rflow_id_base: 2
link up on port 1, speed 1000, full duplex
*** ERROR: `ipaddr' not set
am65_cpsw_nuss_port ethernet@46000000port@1: K3 CPSW: rflow_id_base: 2
link up on port 1, speed 1000, full duplex
*** ERROR: `ipaddr' not set
am65_cpsw_nuss_port ethernet@46000000port@1: K3 CPSW: rflow_id_base: 2
link up on port 1, speed 1000, full duplex
*** ERROR: `ipaddr' not set
Bad Linux ARM64 Image magic!

At this point, I was at the U-Boot console prompt, so I entered the `reset` command.  The board booted completely as expected with no other physical intervention on my part.  Ethernet came up fine and I eventually was presented with the Linux login prompt.

Here's the trace up to the point it starts loading the kernel.

=> reset
resetting ...

U-Boot SPL 2023.04-ti-gf9b966c67473 (Mar 19 2024 - 20:31:40 +0000)
SYSFW ABI: 3.1 (firmware rev 0x0009 '9.2.4--v09.02.04 (Kool Koala)')
EEPROM not available at 0x50, trying to read at 0x51
SPL initial stack usage: 13472 bytes
Trying to boot from MMC2
Authentication passed
Authentication passed
Authentication passed
Loading Environment from nowhere... OK
Authentication passed
Authentication passed
Starting ATF on ARM64 core...

NOTICE:  BL31: v2.10.0(release):v2.10.0-367-g00f1ec6b87-dirty
NOTICE:  BL31: Built : 16:09:05, Feb  9 2024
I/TC: 
I/TC: OP-TEE version: 4.1.0-51-g012cdca49 (gcc version 11.4.0 (GCC)) #1 Tue Jan 30 10:48:03 UTC 2024 aarch64
I/TC: WARNING: This OP-TEE configuration might be insecure!
I/TC: WARNING: Please check optee.readthedocs.io/.../porting_guidelines.html
I/TC: Primary CPU initializing
I/TC: GIC redistributor base address not provided
I/TC: Assuming default GIC group status and modifier
I/TC: SYSFW ABI: 3.1 (firmware rev 0x0009 '9.2.4--v09.02.04 (Kool Koala)')
I/TC: HUK Initialized
I/TC: Activated SA2UL device
I/TC: Enabled firewalls for SA2UL TRNG device
I/TC: SA2UL TRNG initialized
I/TC: SA2UL Drivers initialized
I/TC: Primary CPU switching to normal world boot

U-Boot SPL 2023.04-ti-gf9b966c67473 (Mar 19 2024 - 20:31:40 +0000)
SYSFW ABI: 3.1 (firmware rev 0x0009 '9.2.4--v09.02.04 (Kool Koala)')
Trying to boot from MMC2
Authentication passed
Authentication passed


U-Boot 2023.04-ti-gf9b966c67473 (Mar 19 2024 - 20:31:40 +0000)

SoC:   J721S2 SR1.0 HS-FS
Model: Texas Instruments AM68 SK
Board: AM68-SK-SOM rev E2
DRAM:  2 GiB (effective 8 GiB)
Core:  74 devices, 32 uclasses, devicetree: separate
Flash: 0 Bytes
MMC:   mmc@4fb0000: 1
Loading Environment from nowhere... OK
In:    serial@2880000
Out:   serial@2880000
Err:   serial@2880000
am65_cpsw_nuss ethernet@46000000: K3 CPSW: nuss_ver: 0x6BA02102 cpsw_ver: 0x6BA82102 ale_ver: 0x00293904 Ports:1 mdio_freq:1000000
Net:   eth0: ethernet@46000000port@1
Hit any key to stop autoboot:  0 
switch to partitions #0, OK
mmc1 is current device
SD/MMC found on device 1
Failed to load 'boot.scr'
2326 bytes read in 3 ms (756.8 KiB/s)
Loaded env from uEnv.txt
Importing environment from mmc1 ...
Running uenvcmd ...
syntax error
k3-navss-ringacc ringacc@2b800000: Ring Accelerator probed rings:286, gp-rings[96,20] sci-dev-id:272
k3-navss-ringacc ringacc@2b800000: dma-ring-reset-quirk: disabled
am65_cpsw_nuss_port ethernet@46000000port@1: K3 CPSW: rflow_id_base: 2
ethernet@46000000port@1 Waiting for PHY auto negotiation to complete........ done
link up on port 1, speed 1000, full duplex
BOOTP broadcast 1
BOOTP broadcast 2
BOOTP broadcast 3
DHCP client bound to address 192.168.128.92 (757 ms)
k3_r5f_rproc r5f@41000000: Core 1 is already in use. No rproc commands work
Failed to load '/lib/firmware/j721s2-mcu-r5f0_1-fw'
474508 bytes read in 25 ms (18.1 MiB/s)
Warning: Did not detect image signing certificate. Skipping authentication to prevent boot failure. This will fail on Security Enforcing(HS-SE) devices
Load Remote Processor 2 with data@addr=0x82000000 474508 bytes: Success!
330964 bytes read in 20 ms (15.8 MiB/s)
Warning: Did not detect image signing certificate. Skipping authentication to prevent boot failure. This will fail on Security Enforcing(HS-SE) devices
Load Remote Processor 3 with data@addr=0x82000000 330964 bytes: Success!
Failed to load '/lib/firmware/j721s2-main-r5f1_0-fw'
Failed to load '/lib/firmware/j721s2-main-r5f1_1-fw'
15212688 bytes read in 292 ms (49.7 MiB/s)
Warning: Did not detect image signing certificate. Skipping authentication to prevent boot failure. This will fail on Security Enforcing(HS-SE) devices
Load Remote Processor 6 with data@addr=0x82000000 15212688 bytes: Success!
9703328 bytes read in 65 ms (142.4 MiB/s)
Warning: Did not detect image signing certificate. Skipping authentication to prevent boot failure. This will fail on Security Enforcing(HS-SE) devices
Load Remote Processor 7 with data@addr=0x82000000 9703328 bytes: Success!
am65_cpsw_nuss_port ethernet@46000000port@1: K3 CPSW: rflow_id_base: 2
link up on port 1, speed 1000, full duplex
Using ethernet@46000000port@1 device
File transfer via NFS from server 192.168.128.11; our IP address is 192.168.128.92
Filename '/srv/nfs/katana-kernel//boot/Image'.
Load address: 0x82000000
Loading: #################################################################
	 #################################################################
	 #################################################################

Are these known issues?  Any idea why I'm running into these intermittent boot and Ethernet timeout problems during boot?

  • Hi,

    sdhci_transfer_data: Transfer data timeout
    spl: mmc init failed with error: -110

    Seems like reading the next boot binary from SD card is timing out. Can you prepare a fresh SD and try if the same issue is observed?

    - Keerthy

  • Keerthy,

    Thanks for the reply!  For reference, my current microSD card is a: SanDisk Ultra SDSQUNS-016G-GN3MN 16GB 80MB/s UHS-I Class 10 microSDHC Card

    I've swapped that with a: SanDisk Ultra SDSQUAR-032G-GN6MA 32GB 98MB/s UHS-I Class 10 microSDHC Card A1 U1

    With the 32GB SanDisk part, I set up a configuration to issue a reboot at the end of the "start)" definition in `/etc/init.d/edgeai-launcher.sh`.  After 38 reboot iterations, I encountered the following (cutting and pasting from the last few shutdown trace lines to the error):

    [ 21.996031] systemd-shutdown[1]: Unmounting file systems.
    [ 22.002319] systemd-shutdown[1]: All filesystems unmounted.
    [ 22.007988] systemd-shutdown[1]: Deactivating swaps.
    [ 22.013014] systemd-shutdown[1]: All swaps deactivated.
    [ 22.018237] systemd-shutdown[1]: Detaching loop devices.
    [ 22.026204] systemd-shutdown[1]: All loop devices detached.
    [ 22.032227] systemd-shutdown[1]: Stopping MD devices.
    [ 22.038123] systemd-shutdown[1]: All MD devices stopped.
    [ 22.043456] systemd-shutdown[1]: Detaching DM devices.
    [ 22.049952] systemd-shutdown[1]: All DM devices detached.
    [ 22.055366] systemd-shutdown[1]: All filesystems, swaps, loop devices, MD devices and DM devices detached.
    [ 22.068150] systemd-shutdown[1]: Syncing filesystems and block devices.
    [ 22.074971] systemd-shutdown[1]: Rebooting.
    [ 22.123177] reboot: Restarting system

    U-Boot SPL 2023.04-ti-gf9b966c67473 (Mar 19 2024 - 20:31:40 +0000)
    SYSFW ABI: 3.1 (firmware rev 0x0009 '9.2.4--v09.02.04 (Kool Koala)')
    EEPROM not available at 0x50, trying to read at 0x51
    Timeout during frequency handshake
    ### ERROR ### Please RESET the board ###

  • Hi,

    I've created a systemd service to constantly reboot the board (and count each reboot) to see if I can replicate your issue.

    Does this issue only occur with NFS, or does it also occur if you run everything from an SD card?

    Best,
    Jared
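
For anyone who wants to run the same kind of reboot-count test, here is a minimal Python sketch. It is not the service Jared used; the counter path, the JSON layout, and the function names are all assumptions made for illustration. It only relies on `systemctl reboot` being available on the target.

```python
import json
import subprocess
from pathlib import Path

# Hypothetical location for the persistent counter; any path that
# survives a reboot (SD card or NFS rootfs) would work.
COUNTER_FILE = Path("/var/lib/reboot-test/count.json")


def bump_counter(path=COUNTER_FILE):
    """Increment the boot count stored at `path` and return the new value."""
    try:
        count = json.loads(path.read_text())["boots"]
    except (FileNotFoundError, KeyError, TypeError, ValueError):
        count = 0  # first run, or the file content was not usable
    count += 1
    path.parent.mkdir(parents=True, exist_ok=True)
    tmp = path.with_suffix(".tmp")
    tmp.write_text(json.dumps({"boots": count}))
    tmp.replace(path)  # rename into place so a partially written file is never read
    return count


def reboot_if_below(limit, count):
    """Request a warm reboot while `count` is below `limit`.

    Returns True if a reboot was requested, False once the limit is reached.
    """
    if count >= limit:
        return False
    subprocess.run(["systemctl", "reboot"], check=True)
    return True
```

A boot-time unit or init script could call `reboot_if_below(200, bump_counter())`. When the board hangs in SPL, the last value in the counter file tells you how many warm reboots it survived.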

  • Jared,

    Good question.  I've been running NFS exclusively for a while now so I haven't tried it with kernel/dtb/rootfs on the microSD.  I figured this was all upstream of NFS executing so haven't bothered with putting everything on an SD card.  Additionally, I have seen this occur straight up with a fresh power cycle (board is off for a bit, then boot.)  We were troubleshooting some image sensor board issues last month so had a lot of fresh power cycle iterations and I know there were several instances of this timeout error occurring.

    Thanks for trying to replicate things!  I was troubleshooting some different issues yesterday, and was able to warm reboot with my original microSD 16GB card over 100 times without seeing this particular timeout even occur.  So, your mileage may vary.

  • Hi,

    I was able to see the error when booting from the SD card after 15 reboots, but the error message is different:

    U-Boot SPL 2023.04-ti-gf9b966c67473 (Mar 19 2024 - 20:31:40 +0000)
    SYSFW ABI: 3.1 (firmware rev 0x0009 '9.2.4--v09.02.04 (Kool Koala)')
    EEPROM not available at 0x50, trying to read at 0x51
    SPL initial stack usage: 13472 bytes
    Trying to boot from MMC2
    Authentication passed
    Authentication passed
    Authentication passed
    Loading Environment from nowhere... OK
    Authentication passed
    ti_sci system-controller@44083000: Message not acknowledgedAuthentication failed!
    ### ERROR ### Please RESET the board ###

    I've found that this issue was also reported against 8.6, but it could not be reproduced at the time.

    I'll try adding some debug statements to ti_sci to see if there's any more information.

    Best,
    Jared
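
Since several distinct failure messages have now appeared in this thread, it may help to tally them automatically when running long reboot loops with the serial console captured to a file. The sketch below is only an illustration: the regular expressions are taken from the log lines quoted above, while the category labels and the `classify` function name are my own.

```python
import re

# Signatures copied from the logs in this thread; the labels are made up here.
SIGNATURES = [
    (re.compile(r"sdhci_transfer_data: Transfer data timeout"), "mmc-timeout"),
    # -110 is -ETIMEDOUT in the Linux errno numbering that U-Boot uses.
    (re.compile(r"mmc init failed with error: -110"), "mmc-timeout"),
    (re.compile(r"Timeout during frequency handshake"), "ddr-handshake"),
    (re.compile(r"ti_sci .*Message not acknowledged"), "ti-sci-no-ack"),
    (re.compile(r"Waiting for PHY auto negotiation to complete\.* TIMEOUT"), "phy-autoneg"),
]


def classify(log_text):
    """Return the distinct failure labels found in `log_text`, in order of first appearance."""
    found = []
    for line in log_text.splitlines():
        for pattern, label in SIGNATURES:
            if pattern.search(line) and label not in found:
                found.append(label)
    return found
```

Running `classify()` on the capture from each boot and counting the labels shows which failure dominates, which could help tell an SD card problem apart from a DDR or system firmware problem.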

  • Hi,

    I added sysfw traces, but didn't get much helpful information from them.

    I assumed that a timeout was occurring in ti_sci, so I doubled max_rx_timeout_ms. I no longer see my original error, but I am now seeing your error:

    U-Boot SPL 2024.04-dirty (Oct 10 2024 - 17:44:23 -0500)
    SYSFW ABI: 4.0 (firmware rev 0x000a '10.0.8--v10.00.08 (Fiery Fox)')
    EEPROM not available at 0x50, trying to read at 0x51
    Timeout during frequency handshake
    ### ERROR ### Please RESET the board ###

    The frequency handshake timing out means there's an issue with the DDR.

    Best,
    Jared

  • Thanks for the update!

  • Hi,

    Increasing the DDR timeout has caused other issues. I will continue testing.

    Best,
    Jared

  • Any updates on this?

  • Hi,

    It looks like there is an issue with the DDR configuration. Our development team is currently working on the issue.

    Best,
    Jared

  • Hi,

    Can you try using this DDR configuration and rebuild U-Boot?

    https://e2e.ti.com/cfs-file/__key/communityserver-discussions-components-files/791/k3_2D00_j721s2_2D00_ddr_2D00_evm_2D00_lp4_2D00_4266.dtsi

    Best,
    Jared