AM623: Random ATF failure on one sample

Part Number: AM623


Hello

We have one board on which ATF sometimes fails during boot; we have not seen this issue on other boards so far. It usually takes many reboot cycles (e.g. 1000 cycles) before the failure occurs. This ATF issue is similar to the one reported in the E2E thread below.

https://e2e.ti.com/support/processors-group/processors/f/processors-forum/1386509/am625-boot-atf-error

The system always freezes at the same position. The CPU's internal watchdog should generate a reset after 3 min, but it does not fire; the system stays completely frozen until a manual power off/on cycle. We tested the DDR with Linux memtester (both with and without the inline ECC setting) without a failure, so I don't think anything is wrong with the DRAM; the sample works fine once it has successfully booted to U-Boot / Linux. I have not found any production defect on the sample; everything looks normal and the 25 MHz input clock is fine. The console output when the system freezes is below (reproducible, it always fails like this; our SW is based on SDK 09.01.00.08):

U-Boot SPL 2023.04-00002-g2362508993 (Jun 10 2024 - 14:06:58 +0200)
SYSFW ABI: 3.1 (firmware rev 0x0009 '9.1.8--v09.01.08 (Kool Koala)')
ECC is enabled, priming DDR which will take several seconds.
SPL initial stack usage: 13384 bytes
Trying to boot from MMC1
Authentication passed
Authentication passed
Authentication passed
Authentication passed
Authentication passed
Starting ATF on ARM64 core...

NOTICE: BL31:
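When many console captures are collected during a reboot stress test, the hang signature shown above (output ending on a `NOTICE: BL31:` line) can be picked out automatically. The following is only a minimal sketch, assuming one capture per power cycle; the function names `last_console_line`, `stopped_in_bl31`, and `hung_captures` are hypothetical helpers and not part of any TI tool.

```python
import re

# A console line that begins with the BL31 notice prefix, as seen in the log above.
BL31_NOTICE = re.compile(r"^NOTICE:\s*BL31:")


def last_console_line(log_text):
    """Return the last non-blank console line (stripped), or '' if there is none."""
    for line in reversed(log_text.splitlines()):
        if line.strip():
            return line.strip()
    return ""


def stopped_in_bl31(log_text):
    """True if the capture ends on a BL31 notice line, i.e. nothing was printed after it."""
    return bool(BL31_NOTICE.match(last_console_line(log_text)))


def hung_captures(captures):
    """Given {capture_name: log_text}, return the names whose output ends in BL31."""
    return [name for name, text in captures.items() if stopped_in_bl31(text)]
```

A capture that continues past BL31 (for example into the next U-Boot banner) is not flagged, because its last line no longer matches the BL31 prefix.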

We used JTAG to read out the position of the failure and found that it freezes at address 0x9e788200.

Do you have any idea what might be wrong with the CPU / board?

The CPU is HS-FS, i.e. keys have not been written yet and the device is still unlocked.

Thanks!

Best regards

Libor

  • Hi Libor,
    1. Was the lockup observed on only one specific board, or on multiple boards?
    2. How many boards in total are under test?
    Best,
    -Hong

  • Hi Hong,

    Only 1 board has this issue. The other boards are working fine. However, only a limited number of boards (e.g. 10 to 20 samples) have been tested intensively; this rare boot failure would not be detected by the production tester.

    Best regards

    Libor

  • Hi Libor,
    The log indicates that SDK 9.x is the baseline SW in your testing.
    I'd recommend reviewing the link below on the PLL programming sequence update and taking one of two options:
    - upgrade the baseline SW to SDK 10.0, or
    - pick up the TIFS/DM/SPL updates and integrate them with SDK 9.x
    https://software-dl.ti.com/processor-sdk-linux/esd/AM62X/10_00_07_04/exports/docs/devices/AM62X/linux/Release_Specific_Migration_Guide.html#pll-programing-sequence-update-to-avoid-pll-instability

    Best,
    -Hong

  • Hi Hong,

    We will check the SW patch and try to confirm that it fixes our ATF issue; I will let you know. Over the last few days we also tried the earlier SDK 8.06 and 9.0. Interestingly, the board works fine there; so far we have seen the issue only with SDK 9.01, just FYI.

    Thanks!

    Best regards

    Libor

  • Hi Libor,
    Thanks for your update and for sharing the ongoing test results.
    Please let us know how the test goes with the PLL programming patch in SDK 10.0.
    Best,
    -Hong

  • Hi Hong,

    We tried unmodified SDK 9.02; not everything works with our system, but it is sufficient for the test. Unfortunately, we did not see the failure there either, so we have the issue only with SDK 9.01, for which there is no official SW patch for the PLL instability. We therefore tried adding the 9.02 patch to 9.01, which is probably not the right way to do it, but we just wanted to confirm that the issue is really related to the PLL instability. We created a fast reboot test from U-Boot; the test failed again, but slightly differently: more characters (a longer log) appear on the console. I have added the log below, just FYI. We consider all this testing sufficient proof (there is probably nothing else to do anyway) that the issue is related to SW configuration and that we can fix it with a SW update. The issue can be closed. Thanks a lot for your support!

    Best regards

    Libor

    U-Boot SPL 2023.04-dirty (Oct 15 2024 - 07:19:13 +0200)
    SYSFW ABI: 3.1 (firmware rev 0x0009 '9.2.8--v09.02.08 (Kool Koala)')
    SPL initial stack usage: 13384 bytes
    Trying to boot from MMC1
    Authentication passed
    Authentication passed
    Authentication passed
    Authentication passed
    Authentication passed
    Starting ATF on ARM64 core...

    NOTICE: BL31: v2.10.0(release):v2.10.0-367-g00f1ec6b87-dirty
    NOTICE: BL31:
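For a failure that shows up roughly once in many reboot cycles, it can help to estimate how many clean cycles are needed before concluding that a SW change removed it. The sketch below is only an illustration under stated assumptions: every reboot is treated as an independent trial with a fixed failure probability, and the value 1/1000 is taken from the rough "e.g. 1000 cycles" figure in the thread, not from a measured rate. The function name `clean_cycles_needed` is hypothetical.

```python
import math


def clean_cycles_needed(fail_prob, confidence):
    """Smallest n such that n consecutive clean boots would occur with probability
    at most (1 - confidence) if the per-boot failure probability were still fail_prob.

    Assumes independent boots with a constant failure probability.
    """
    if not (0.0 < fail_prob < 1.0) or not (0.0 < confidence < 1.0):
        raise ValueError("fail_prob and confidence must lie strictly between 0 and 1")
    # (1 - fail_prob) ** n <= 1 - confidence, solved for n
    return math.ceil(math.log(1.0 - confidence) / math.log(1.0 - fail_prob))


if __name__ == "__main__":
    # Assumed failure rate of about 1 in 1000 boots, 95 % confidence target.
    print(clean_cycles_needed(1.0 / 1000.0, 0.95))
```

With these assumed numbers, roughly 3000 clean reboot cycles would be needed, so a run of about 1000 clean cycles alone gives only weak evidence that the failure is gone.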