AM3351: Kernel boot failure

Part Number: AM3351
Other Parts Discussed in Thread: CSD

Hi,

My customer, who is using the AM3351, reported a boot-up issue on their board.

The software version is TI-PROCESSOR-SDK-LINUX-AM335X-EVM-04.01.00.06.
The board is custom and micro-SD boot is used.

There are several failure patterns at boot-up:
a) “Starting kernel ...” is printed repeatedly.
b) Boot stops at “random: crng init done”, just after the Arago Project logo.
c) “A start job is running for…” is printed repeatedly.
d) Boot stops at “[FAILED] Failed to start Synchronize System and HW clocks.”

I will send you logs for all 4 failure types, as well as a normal start-up log, in a separate e-mail.
Please review them and advise what needs to be checked.

Thanks and regards,
Koichiro Tashiro

  • Hi Koichiro,

    Do they have only one AM335x custom board that has boot-up issues? And do they have AM335x custom boards that boot up fine with the same SW as the failing custom board?

    My first suggestion is to check the DDR memory on this custom board. You can run the memory diagnostic test (mem_test) to check whether the RAM works correctly.

    http://software-dl.ti.com/processor-sdk-rtos/esd/docs/latest/rtos/index_board.html#diagnostics

    External memory (DDR): DDR timing and leveling setting can be checked out using mem_test

    You can also use the U-Boot memory tester and/or the Linux memtester.

    For u-boot memory test:

    u-boot/doc/README.memory-test
    u-boot/common/memsize.c

     

    For linux memtester:

    https://e2e.ti.com/support/legacy_forums/embedded/linux/f/354/t/567360

    http://www.ti.com/lit/an/spraca1/spraca1.pdf
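
    A minimal sketch of how a Linux memtester run might be sized on the target. The helper name memtester_size_mb and the choice of testing half of MemAvailable are assumptions made for illustration, not an SDK recommendation. memtester itself takes a size (with a B/K/M/G suffix) and an optional loop count.

```shell
# memtester_size_mb [MEMINFO_FILE]
# Hypothetical helper: prints half of MemAvailable in whole megabytes,
# so that memtester does not starve the rest of the system.
# Reads /proc/meminfo unless another file is given.
memtester_size_mb() {
    # MemAvailable is reported in kB; kB / 1024 / 2 = kB / 2048.
    awk '$1 == "MemAvailable:" { printf "%d\n", $2 / 2048 }' "${1:-/proc/meminfo}"
}

# Example on the target (not executed here): three passes over that amount.
#   memtester "$(memtester_size_mb)M" 3
```

    Testing only part of the free memory keeps the system responsive during the run; a larger fraction covers more of the DDR but may trigger the OOM killer.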

    Check also:

    http://www.ti.com/lit/an/sprack4/sprack4.pdf

    Regards,
    Pavel

  • Hi Pavel,

    Thanks for your inputs and sorry for my late reply.

    The customer found that the issue comes from the micro-SD card; the failure rate is ~10%.
    - If the customer removes the SD card from a failing board and puts it into a working board, the working board also fails.
    - If the failing card is rewritten with the same boot image, the failing board boots up correctly.
    So it was concluded that the image on the card was corrupted.

    The customer has now replaced their micro-SD cards with high-reliability ones and written the same boot image to them.
    These boards boot up correctly with the new micro-SD cards.
    However, the customer found that the file system on the card still shows an error when a file system checker is run.

    Here is the checker log:
    ============================================================
    e2fsck 1.42 (29-Nov-2011)
    Pass 1: Checking inodes, blocks, and sizes
    Inodes that were part of a corrupted orphan linked list found.  Fix<y>? no

    Inode 16713 was part of the orphaned inode list.  IGNORED.
    Inode 16716 was part of the orphaned inode list.  IGNORED.
    Pass 2: Checking directory structure
    Pass 3: Checking directory connectivity
    Pass 4: Checking reference counts
    Pass 5: Checking group summary information

    rootfs: ********** WARNING: Filesystem still has errors **********

        5297 inodes used (2.63%)
         680 non-contiguous files (12.8%)
           5 non-contiguous directories (0.1%)
             # of inodes with ind/dind/tind blocks: 665/22/0
      131702 blocks used (16.36%)
           0 bad blocks
           1 large file

        4298 regular files
         368 directories
           0 character device files
           0 block device files
           0 fifos
           0 links
         622 symbolic links (619 fast symbolic links)
           0 sockets
    --------
        5288 files
    ************************************************************

    What does the error below mean? Is it OK to ignore it?
    Inodes that were part of a corrupted orphan linked list found.

    If the customer selects “yes” for the fix, the checker tool repairs it, but the checker runs on a Linux host server and not on the Sitara.
    So this fix cannot be used in the real system.
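
    For background: in ext2/3/4, the orphan inode list records inodes that were unlinked or being truncated while still in use, so that they can be cleaned up at the next mount. The check can be repeated without touching the card by running e2fsck with -f (force a check) and -n (open read-only, answer "no" to every prompt). The exit status is a sum of the bit values documented in the e2fsck(8) man page. The sketch below decodes it; the function name describe_e2fsck_status and the device name /dev/sdX2 are placeholders, not taken from the thread.

```shell
# describe_e2fsck_status CODE
# Hypothetical helper: prints one line per condition encoded in an
# e2fsck exit status (bit values as listed in the e2fsck(8) man page).
describe_e2fsck_status() {
    code=$1
    if [ "$code" -eq 0 ]; then
        echo "no errors"
        return 0
    fi
    for entry in \
        "1:errors corrected" \
        "2:errors corrected, system should be rebooted" \
        "4:errors left uncorrected" \
        "8:operational error" \
        "16:usage or syntax error" \
        "32:canceled by user request" \
        "128:shared library error"
    do
        bit=${entry%%:*}
        if [ $((code & bit)) -ne 0 ]; then
            echo "${entry#*:}"
        fi
    done
}

# Example on the host PC (not executed here; /dev/sdX2 is a placeholder):
#   sudo e2fsck -fn /dev/sdX2; describe_e2fsck_status $?
```

    Logging the decoded status for each card makes it easier to compare cards over time than reading the full e2fsck report.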

    Thanks and regards,
    Koichiro Tashiro

     

  • Koichiro,

    Koichiro Tashiro said:
    What does the error below mean? Is it OK to ignore it?
    Inodes that were part of a corrupted orphan linked list found.

    This error is NOT specific to the AM335x SDK, so I cannot advise on it. What I can suggest is to check with the SD card vendor and/or a generic Linux mainline forum.

    I can also suggest trying the AM335x SD card create script:

    {PSDK}/bin/create-sdcard.sh

    http://software-dl.ti.com/processor-sdk-linux/esd/docs/latest/linux/Overview_Getting_Started_Guide.html#linux-sd-card-creation-guide

    Regards,
    Pavel

  • Hi Pavel.

    The customer uses the SD card create script you suggested.
    I also got additional information from the customer.
    The SD card image has no errors when the SD card is created and the image is written.
    After the SD card has been used in the customer’s system for a while, the inodes for the files below are corrupted.
    These are files in the TI SDK.
               inode    Path
               16713  /sys/devices/platform/ocp/48060000.mmc/mmc_host/mmc0/mmc0:5048/csd
               16716  /sys/devices/platform/ocp/48060000.mmc/mmc_host/mmc0/mmc0:5048/date 

    As I mentioned, the system boots up correctly even when these inodes are corrupted.
    Q1) What are these files for?
    Q2) What happens if these files are corrupted?
    Q3) Are these files updated during operation? (The customer suspects these files are corrupted during write operations.)

    Thanks and regards,
    Koichiro Tashiro

  • Tashiro-san,

    Koichiro Tashiro said:
    After the SD card has been used in the customer’s system for a while, the inodes for the files below are corrupted.
    These are files in the TI SDK.
               inode    Path
               16713  /sys/devices/platform/ocp/48060000.mmc/mmc_host/mmc0/mmc0:5048/csd
               16716  /sys/devices/platform/ocp/48060000.mmc/mmc_host/mmc0/mmc0:5048/date 

    Is this the result of 'e2fsck' after removing the card from the AM335x system and putting it into an SD card reader on the Linux PC? When I try this with an AM335x card that has already been used in an AM335x EVM for a while, it comes back clean...

    # Check SD card containing an AM335x rootfs partition using a host PC
    a0797059@jiji:~/git/u-boot (ti-u-boot-2019.01-next-dev)
    $ sudo e2fsck /dev/sda2
    e2fsck 1.44.1 (24-Mar-2018)
    rootfs: clean, 75282/479552 files, 694343/1918208 blocks

    If so, that is odd. The /sys folder should always appear empty when the card is read on a different machine (rather than booted from), as it is a virtual folder and no actual files get written there.

    # Reading out /sys folder from SD card in SD card reader on host PC
    a0797059@jiji:~/git/u-boot (ti-u-boot-2019.01-next-dev)
    $ ll /media/a0797059/rootfs/sys
    total 8
    dr-xr-xr-x  2 root root 4096 Oct 19 15:17 ./
    drwxr-xr-x 21 root root 4096 Oct 19 21:19 ../

    What does the customer get doing the same?
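
    Inode numbers are only unique within a single filesystem. One possible explanation for paths under /sys (an assumption, not confirmed in this thread) is that the inode numbers reported by e2fsck were looked up on a running system across all mounted filesystems, where sysfs happens to reuse the same numbers. A minimal sketch that confines the lookup to one filesystem with find -xdev; the function name paths_for_inode is hypothetical.

```shell
# paths_for_inode ROOT INODE
# Hypothetical helper: lists paths below ROOT that have the given inode
# number, without descending into other mounted filesystems (such as
# sysfs or procfs when ROOT is / on a running system).
paths_for_inode() {
    find "$1" -xdev -inum "$2" 2>/dev/null
}

# Example on the host PC with the card's rootfs mounted (not executed here):
#   paths_for_inode /media/a0797059/rootfs 16713
```

    With the card unmounted, debugfs -R "ncheck 16713 16716" on the rootfs partition is another way to map inode numbers to names directly from the ext filesystem.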

    Maybe there is some issue with mounting sysfs on the AM335x system, perhaps temporarily? It should look like this...

    # Checking sysfs mount state and location on running AM335x system
    root@am335x-evm:~# mount | grep sysfs
    sysfs on /sys type sysfs (rw,nosuid,nodev,noexec,relatime)
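
    If the sysfs mount is suspected to be missing only some of the time, the same check can be scripted and logged at each boot. A minimal sketch, assuming /proc/mounts is available; the function name sysfs_on_sys is hypothetical.

```shell
# sysfs_on_sys [MOUNTS_FILE]
# Hypothetical helper: succeeds if a filesystem of type sysfs is mounted
# on /sys. Reads /proc/mounts unless another file is given.
sysfs_on_sys() {
    awk '$2 == "/sys" && $3 == "sysfs" { found = 1 }
         END { exit found ? 0 : 1 }' "${1:-/proc/mounts}"
}

# Example on the target (not executed here):
#   sysfs_on_sys || echo "sysfs is not mounted on /sys"
```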

    Koichiro Tashiro said:
    Q1) What are these files for?
    Q2) What happens if these files are corrupted?
    Q3) Are these files updated during operation? (The customer suspects these files are corrupted during write operations.)

    These are virtual files that are exposed by the kernel and mostly used for interfacing with userspace. Yes, files under /sys get updated by the kernel during operation all the time, but since they are purely virtual, there should not be any related MMC activity.

    How does the customer shut down their boards?

    Do they perform a 'poweroff' command before removing the SD card and/or turning off power to the board to ensure the shutdown happens in an orderly and synchronized fashion?
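
    One way to see whether a card was last removed without a clean unmount is to look at the ext superblock on the host PC. An ext3/ext4 filesystem carries the needs_recovery feature flag while its journal is in use, and the flag is cleared on a clean unmount. A minimal sketch that reads dumpe2fs -h output on stdin; the function name journal_needs_recovery and the device name /dev/sdX2 are placeholders.

```shell
# journal_needs_recovery
# Hypothetical helper: reads `dumpe2fs -h` output on stdin and succeeds
# if the "Filesystem features:" line lists needs_recovery.
journal_needs_recovery() {
    awk '/^Filesystem features:/ {
             for (i = 3; i <= NF; i++) if ($i == "needs_recovery") found = 1
         }
         END { exit found ? 0 : 1 }'
}

# Example on the host PC with the card NOT mounted (not executed here):
#   sudo dumpe2fs -h /dev/sdX2 | journal_needs_recovery \
#       && echo "rootfs was not cleanly unmounted"
```

    On the target, running sync followed by poweroff and waiting for the shutdown to finish before cutting power should leave the flag cleared.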

    Can they run the unmodified TI filesystem image from the *current* AM335x Linux SDK? The initial post suggests they are using version 4. Please have them use the current version, v6.01, available at http://software-dl.ti.com/processor-sdk-linux/esd/AM335X/latest/index_FDS.html

    Regards, Andreas