PROCESSOR-SDK-AM335X: Kernel hangs early in boot

Part Number: PROCESSOR-SDK-AM335X
Other Parts Discussed in Thread: TIDA-01568

I am having huge trouble bringing up a new custom target board. I have been struggling with MLO and U-Boot, but they are now running. I am using SDK 8.02.

My target does not have an RTC, so all accesses to the RTC registers had to be removed (using defines and the DTS). Any access to an RTC register while the RTC is not running causes an abort interrupt.

I expect that the kernel has a similar problem, but I am unable to find a solution or even to verify that this is the case.

U-Boot runs until "Starting kernel ..."

Before that, it also appears to have read the kernel correctly: the trace shows 49xxxxx bytes read, matching the size of the kernel zImage.

No messages after "Starting kernel ..."

I have attached a JTAG debugger with Code Composer Studio and placed breakpoints at 0x82000000 and 0x8010000; both are hit.

Execution continues to code at 0xC0F002E0 (which seems to be __mmap_switched). After a few instructions it crashes.

Apart from the RTC not being used, we do not have a TPS chip for the power supply. That could also be the cause, but I would expect the kernel to output something if it were.

I have not been able to load the kernel with symbols, so I have been using the disassembly and the map file to figure out where execution is.

Can anyone give me a hint on how to proceed, or any clue as to what could be causing this?

  • My custom target matches the description in TIDA-01568.

    Comparing to the AM335x EVM or Beagle Bone family, the differences in hardware are:

    • This board does not support the internal RTC.
    • This board does not have a PMIC for power management.

    I have tried to modify the code according to this list:

    1. U-Boot: Disable the board detect function and enforce the return value for "am335x-boneblack".
    2. U-Boot: Disable the I2C communication with PMIC.
    3. U-Boot: Disable the RTC related function.
    4. U-Boot: Disable the "BOOTCOUNT_LIMIT" configuration during auto booting.
    5. Kernel: Remove the RTC node, PMIC node and related codes in device tree (DTS file).
    6. Kernel: Disable the "RTC Real Time Clocking" configuration.

    I am not 100% sure I have done all of the above correctly.

  • Hello Thomas,
    Your change list for customizing U-Boot for your custom board looks good to me.
    I'm also attaching a short description of the transition from U-Boot loading the kernel to kernel start, for your reference.
    I'm looping in my colleague on the kernel side for input.
    Best,
    -Hong

  • Attaching the PDF covering the U-Boot to kernel transition: 5850.u-boot_kernel.pdf

  • Thanks a lot for the feedback.

    I have followed all the steps in the U-Boot-to-kernel document you attached, and they all succeed.

    Since I last wrote, I have also figured out how to load the kernel symbols and am now able to step through the kernel startup code.

    I have reached the rest_init() function, where I think thread scheduling takes over, but right now I have no clue how to proceed from here.

    I wonder why I do not see anything on the console. The code has passed a number of pr_notice("...") calls.

  • Note that early_printk does not print anything to the UART until the UART is initialized. The output is initially stored in a buffer and is dumped only once the UART is operable.
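
Because the messages sit in RAM until the console comes up, they can in principle be read back over JTAG. The following is a minimal sketch, assuming the address of __log_buf has been looked up in System.map and that region has been saved from the debugger as a raw binary file. The helper name extract_strings and the sample bytes are hypothetical and only illustrate the idea. It pulls out runs of printable ASCII, much like the strings tool, and does not parse the kernel's record headers.

```python
import re


def extract_strings(data: bytes, min_len: int = 6) -> list[str]:
    """Return runs of printable ASCII that are at least min_len bytes long."""
    pattern = re.compile(rb"[\x20-\x7e]{%d,}" % min_len)
    return [m.group().decode("ascii") for m in pattern.finditer(data)]


# Synthetic stand-in for a memory dump: text separated by binary bytes.
sample = b"\x00\x00first message\x00\x07\x01second message\n\x00ab\x00"

for line in extract_strings(sample):
    print(line)
```

To use it on a real dump, replace sample with the bytes read from the saved file, for example open(path, "rb").read(), where path is wherever the debugger stored the dump.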

  • For your reference on the kernel console port setup, below is an example from the file "/include/configs/am335x_evm.h" (for AM335x boards),
    where the console port is set depending on the board type in a U-Boot environment variable definition:
    init_console=if test $board_name = A335_ICE; then setenv console ttyO3,115200n8;else setenv console ttyO0,115200n8;fi;

  • Thank you for that information, I was wondering why I did not see anything. That makes sense if it never reaches the initialization of the UART.

  • I have the following specified in the u-boot env.

    => env print console
    console=ttyO0,115200n8

    That should be OK as uart0 is the console port.

  • The last place I can find to break before it reboots is the vclkdev_alloc() function in clkdev.c (suspicious...).

    It does not return from the call to strlcpy(). The parameters seem to be valid addresses, but I am also thinking that a legitimate task switch might be occurring.

    Can anyone tell me where I could break in with a debugger after the kernel has started task scheduling?

    I am still not able to get any console output to help me.

  • Hi Thomas,

    When porting the kernel to a new board, please first disable as many modules as possible in your board DTS and keep only the bare minimum nodes needed for the CPU to run the Linux kernel, such as the CPU and the console UART.

    Please use the following board DTS as an example:
    https://git.ti.com/cgit/ti-linux-kernel/ti-linux-kernel/tree/arch/arm64/boot/dts/ti/k3-am625-skeleton.dts?h=ti-linux-5.10.y

  • This is my current full dts:

    I have tried to reduce it as much as possible, but maybe I am missing something required?

    /dts-v1/;

    #include "am33xx.dtsi"
    #include <dt-bindings/interrupt-controller/irq.h>

    / {
        model = "Prolon PID6000";
        compatible = "prolon,pid6000", "ti,am33xx";

        chosen {
            stdout-path = &uart0;
            tick-timer = &timer2;
        };

        memory {
            device_type = "memory";
            reg = <0x80000000 0x10000000>; /* 256 MB */
        };
    };

    &am33xx_pinmux {
        jtag_pins_default: jtag_pins_default {
            pinctrl-single,pins = <
                AM33XX_IOPAD(0x9d0, PIN_INPUT | MUX_MODE0) /* (C11) TMS.TMS */
                AM33XX_IOPAD(0x9d4, PIN_INPUT | MUX_MODE0) /* (B11) TDI.TDI */
                AM33XX_IOPAD(0x9d8, PIN_OUTPUT | MUX_MODE0) /* (A11) TDO.TDO */
                AM33XX_IOPAD(0x9dc, PIN_INPUT | MUX_MODE0) /* (A12) TCK.TCK */
                AM33XX_IOPAD(0x9e0, PIN_INPUT | MUX_MODE0) /* (B10) nTRST.nTRST */
                AM33XX_IOPAD(0x9e4, PIN_INPUT | MUX_MODE0) /* (C14) EMU0.EMU0 */
                AM33XX_IOPAD(0x9e8, PIN_INPUT | MUX_MODE0) /* (B14) EMU1.EMU1 */
            >;
        };

        com0_console_pins_default: com0_console_pins_default {
            pinctrl-single,pins = <
                AM33XX_IOPAD(0x970, PIN_INPUT | MUX_MODE0) /* (E15) uart0_rxd.uart0_rxd */
                AM33XX_IOPAD(0x974, PIN_OUTPUT | MUX_MODE0) /* (E16) uart0_txd.uart0_txd */
            >;
        };
    };

    &rtc {
        status = "disabled";
        ti,hwmods = "disabled";    
    };

    &uart0 {
        pinctrl-names = "default";
        pinctrl-0 = <&com0_console_pins_default>;
        status = "okay";
    };

  • Hi Thomas,

    I built your DTS file and can still boot the BeagleBone Black with it, so I don't think your DTS is the root cause. Note, though, that "ti,hwmods" is a deprecated property and will be ignored.

    How did you generate kernel .config?

    Please use './ti_config_fragments/defconfig_builder.sh' in the Processor SDK kernel, then select option 1, 2, 1. Then do 'make ARCH=arm CROSS_COMPILE=<toolchain> ti_sdk_am3x_release_defconfig' to generate the .config.

  • To make my defconfig file, I copied the am335x-evm_defconfig to my own and used menuconfig to remove and add various options, among them the RTC and the PMIC. (I have added my target to Rules.make.)

    I have created a new .config as you suggested and built the kernel with make arc.... zImage

    I will remove ti,hwmods (it was just one of my attempts, found while searching for solutions).

    The new kernel is built and I have copied it to the SD card that I use to boot the target.

    Sadly, the behaviour is the same. It stops after outputting "Starting kernel...".

    Here is the console output during boot:

    U-Boot SPL 2021.01-00001-gc59bf25a38-dirty (Aug 31 2022 - 11:37:30 +0200)
    Trying to boot from MMC1


    U-Boot 2021.01-00001-gc59bf25a38-dirty (Aug 31 2022 - 11:37:30 +0200)

    CPU : AM335X-GP rev 1.0
    Model: Prolon PID6000
    DRAM: 512 MiB
    WDT: Started with servicing (60s timeout)
    NAND: 0 MiB
    MMC: OMAP SD/MMC: 0
    Loading Environment from FAT... *** Warning - bad CRC, using default environment

    <ethaddr> not set. Validating first E-fuse MAC
    Net: eth2: ethernet@4a100000
    Hit any key to stop autoboot: 0
    switch to partitions #0, OK
    mmc0 is current device
    SD/MMC found on device 0
    Failed to load 'boot.scr'
    Failed to load 'uEnv.txt'
    switch to partitions #0, OK
    mmc0 is current device
    Scanning mmc 0:1...
    Scanning disk mmc@48060000.blk...
    Found 3 disks
    No EFI system partition
    BootOrder not defined
    EFI boot manager: Cannot load any image
    switch to partitions #0, OK
    mmc0 is current device
    SD/MMC found on device 0
    4997632 bytes read in 389 ms (12.3 MiB/s)
    57960 bytes read in 11 ms (5 MiB/s)
    ## Flattened Device Tree blob at 88000000
    Booting using the fdt blob at 0x88000000
    Loading Device Tree to 8ffee000, end 8ffff267 ... OK

    Starting kernel ...
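
One thing that can be checked directly from this log is whether the loaded images fit in RAM and stay clear of each other. The following is a minimal sketch, not a definitive check. It uses the sizes and addresses printed above, the memory node from the DTS (note that U-Boot reports 512 MiB while the DTS declares 256 MB, so the smaller DTS value is used), and it assumes the zImage load address is 0x82000000 because that is where the first breakpoint was hit. The Region and find_problems names are hypothetical.

```python
from typing import NamedTuple


class Region(NamedTuple):
    name: str
    start: int
    size: int

    @property
    def end(self) -> int:
        """First address past the region (exclusive end)."""
        return self.start + self.size


def find_problems(regions, ram):
    """List regions that fall outside RAM or overlap one another."""
    problems = []
    for r in regions:
        if r.start < ram.start or r.end > ram.end:
            problems.append(f"{r.name} is outside {ram.name}")
    for i, a in enumerate(regions):
        for b in regions[i + 1:]:
            if a.start < b.end and b.start < a.end:
                problems.append(f"{a.name} overlaps {b.name}")
    return problems


# Memory node from the DTS: reg = <0x80000000 0x10000000>
ram = Region("DTS memory", 0x80000000, 0x10000000)

regions = [
    # ASSUMPTION: load address inferred from the 0x82000000 breakpoint.
    Region("zImage", 0x82000000, 4997632),
    Region("fdt as loaded", 0x88000000, 57960),
    # "Loading Device Tree to 8ffee000, end 8ffff267" (end is inclusive)
    Region("fdt relocated", 0x8FFEE000, 0x8FFFF267 - 0x8FFEE000 + 1),
]

problems = find_problems(regions, ram)
print(problems if problems else "no overlap, everything inside RAM")
```

With the numbers from this log the check reports no problems, which suggests the image placement itself is not what stops the kernel. It says nothing about where the decompressed kernel ends up.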

  • After the first attempt with the new .config file, I used menuconfig to:

    remove OMAP_32K_TIMER
    remove RTC_DRV_OMAP

    (The 32 kHz RTC was causing problems with MLO and U-Boot, so I am just trying to make sure it is not being used.)

    Still no success....

  • I have now tried the SD card in my old target, which is closer to the BBB (the new one has different I/O and no RTC or PMIC).

    With the old target hardware the kernel boots, but not very successfully.

    I am not sure this matters at all, as this is my old target, which I have been updating from SDK 2 to SDK 7 without problems.
    But I would expect the kernel built here to boot... (could it be the board name?)

    [ 2.911198] 8<--- cut here ---
    [ 2.914281] Unhandled fault: external abort on non-linefetch (0x1008) at 0xe0326000
    [ 2.921975] pgd = 55ce7f33
    [ 2.924689] [e0326000] *pgd=82809811, *pte=4a326653, *ppte=4a326453
    [ 2.931003] Internal error: : 1008 [#1] PREEMPT ARM
    [ 2.935900] Modules linked in:
    [ 2.938973] CPU: 0 PID: 54 Comm: kworker/0:3 Not tainted 5.10.100-g7a7a3af903 #11
    [ 2.946486] Hardware name: Generic AM33XX (Flattened Device Tree)
    [ 2.952629] Workqueue: events deferred_probe_work_func
    [ 2.957803] PC is at sysc_probe+0xce8/0x1550
    [ 2.962096] LR is at omap_reset_deassert+0x1ac/0x318
    [ 2.967078] pc : [<c04e15fc>] lr : [<c0566df4>] psr: 60000113
    [ 2.973368] sp : c1eb9e28 ip : 00000000 fp : c0b2f23c
    [ 2.978612] r10: 00000028 r9 : 00000000 r8 : c0b2f6fc
    [ 2.983856] r7 : c1977c10 r6 : 00000000 r5 : c10df3e4 r4 : c2893340
    [ 2.990408] r3 : e0326000 r2 : 00000000 r1 : 00026000 r0 : 00000000
    [ 2.996963] Flags: nZCv IRQs on FIQs on Mode SVC_32 ISA ARM Segment none
    [ 3.004127] Control: 10c5387d Table: 80004019 DAC: 00000051
    [ 3.009895] Process kworker/0:3 (pid: 54, stack limit = 0x6717c297)
    [ 3.016188] Stack: (0xc1eb9e28 to 0xc1eba000)
    [ 3.020564] 9e20: 00000001 00000000 c10097d0 00000000 c10df3e4 c0d73b10
    [ 3.028778] 9e40: c1977c10 c0d739f0 00000001 00000001 00000030 c1003048 00000000 00000000
    [ 3.036991] 9e60: c1977c10 c10830a8 00000000 c10e3b7c c10830a8 0000000a c1e9bfa0 c0650a50
    [ 3.045204] 9e80: c1977c10 c10e3b74 00000000 00000000 c10e3b7c c064e924 00000000 c1977c10
    [ 3.053418] 9ea0: c10830a8 c064ee38 c1091bc8 00000000 00000000 c1091bf4 c1e9bfa0 c064ed38
    [ 3.061631] 9ec0: 00000000 c1eb9ef4 c064ee38 c064cc00 00000000 c183d11c c1ab0534 c1003048
    [ 3.069845] 9ee0: c1977c10 00000001 c1977c54 c064e7c0 c1977c10 c1977c10 00000001 c1003048
    [ 3.078060] 9f00: c1977c10 c1977c10 c1091e50 c064da74 c1977c10 c1091bbc c1091bbc c064df20
    [ 3.086273] 9f20: c1091bf0 c1e9c600 00000000 dfa29200 00000000 c013f790 c1eb8000 c10768e0
    [ 3.094486] 9f40: c1e9c600 c100bf8c c1e9c614 c1eb8000 c10768e0 c100bfa0 c100bf8c c013fc4c
    [ 3.102701] 9f60: 00000000 c1e9bf80 c1e9bf00 c1eb8000 00000000 c013fa18 c1e9c600 c186fed0
    [ 3.110914] 9f80: c1e9bfa0 c01445e0 00000000 c1e9bf00 c01444a0 00000000 00000000 00000000
    [ 3.119128] 9fa0: 00000000 00000000 00000000 c0100148 00000000 00000000 00000000 00000000
    [ 3.127341] 9fc0: 00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000
    [ 3.135554] 9fe0: 00000000 00000000 00000000 00000000 00000013 00000000 00000000 00000000
    [ 3.143786] [<c04e15fc>] (sysc_probe) from [<c0650a50>] (platform_drv_probe+0x48/0x98)
    [ 3.151746] [<c0650a50>] (platform_drv_probe) from [<c064e924>] (really_probe+0xf0/0x398)
    [ 3.159963] [<c064e924>] (really_probe) from [<c064ed38>] (driver_probe_device+0x5c/0xb4)
    [ 3.168180] [<c064ed38>] (driver_probe_device) from [<c064cc00>] (bus_for_each_drv+0x84/0xd0)
    [ 3.176744] [<c064cc00>] (bus_for_each_drv) from [<c064e7c0>] (__device_attach+0xf0/0x15c)
    [ 3.185046] [<c064e7c0>] (__device_attach) from [<c064da74>] (bus_probe_device+0x84/0x8c)
    [ 3.193262] [<c064da74>] (bus_probe_device) from [<c064df20>] (deferred_probe_work_func+0x7c/0xa8)
    [ 3.202270] [<c064df20>] (deferred_probe_work_func) from [<c013f790>] (process_one_work+0x1c4/0x44c)
    [ 3.211446] [<c013f790>] (process_one_work) from [<c013fc4c>] (worker_thread+0x234/0x5cc)
    [ 3.219662] [<c013fc4c>] (worker_thread) from [<c01445e0>] (kthread+0x140/0x184)
    [ 3.227093] [<c01445e0>] (kthread) from [<c0100148>] (ret_from_fork+0x14/0x2c)
    [ 3.234344] Exception stack(0xc1eb9fb0 to 0xc1eb9ff8)
    [ 3.239416] 9fa0: 00000000 00000000 00000000 00000000
    [ 3.247628] 9fc0: 00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000
    [ 3.255840] 9fe0: 00000000 00000000 00000000 00000000 00000013 00000000
    [ 3.262487] Code: e3130004 1a00014c e5943014 e0833001 (e593c000)
    [ 3.268611] ---[ end trace 1047a6c1834bb3ce ]---
    [ 3.302034] mmc1: new high speed MMC card at address 0001
    [ 3.308336] mmcblk1: mmc1:0001 004G60 3.69 GiB
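
The oops above contains enough information to work out which physical address was being accessed. The following is a minimal sketch of that arithmetic, assuming 4 KiB pages, so that the upper 20 bits of the printed *pte value are the physical page frame. The function name pte_to_phys is hypothetical.

```python
# Values copied from the oops above.
FAULT_VA = 0xE0326000   # "external abort ... at 0xe0326000"
LINUX_PTE = 0x4A326653  # "*pte=4a326653"

PAGE_SHIFT = 12  # ASSUMPTION: 4 KiB small pages
PAGE_MASK = ~((1 << PAGE_SHIFT) - 1) & 0xFFFFFFFF
OFFSET_MASK = (1 << PAGE_SHIFT) - 1


def pte_to_phys(pte: int, vaddr: int) -> int:
    """Combine the page frame from the PTE with the in-page offset of vaddr."""
    return (pte & PAGE_MASK) | (vaddr & OFFSET_MASK)


phys = pte_to_phys(LINUX_PTE, FAULT_VA)
print(hex(phys))  # prints 0x4a326000
```

The result, 0x4a326000, is a peripheral address rather than DRAM, and its low part matches r1 (0x00026000) in the register dump. If I read the AM335x memory map correctly (an assumption worth checking against the TRM), it lies in the PRU-ICSS range, so one possibility is that sysc_probe touched a module that is not present or not clocked on this hardware.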

  • Hi Thomas,

    I just realized there is a simple thing to check: I believe the AM335x SDK v8.2 adds an "extlinux" config file to the first partition of the SD card (where the U-Boot binaries sit), which makes U-Boot load the zImage and board DTB from the first partition rather than from the boot/ folder of the second partition, as SDK v7.x and older did.

    I am not sure where you copied the updated zImage and DTB, but please check the file sizes in the U-Boot log to ensure the correct files are loaded.

    SD/MMC found on device 0
    4997632 bytes read in 389 ms (12.3 MiB/s)
    57960 bytes read in 11 ms (5 MiB/s)

  • The correct zImage and dtb are loaded from rootfs/boot; my FAT partition contains only MLO and u-boot.img.

    The exception is the console output above, which is from another target, not the problematic one. I made a mistake when booting the old target: it was booting with the new dtb, which does not match the old target. Maybe I should not have confused things by bringing in the older target hardware.

  • Can anyone suggest where to break to check how far the kernel got in booting? Perhaps at the point where the console should be initialized and start producing output.

    I am quite desperate to get my target running.

    As far as I can see, the problem is one of the following:

    • The RTC not being enabled (fixing this requires a hardware change).
    • The missing PMIC, although the power-up sequence seems fine according to TIDA-01568 (also a big hardware change).

  • I have now tested a different approach.

    Since I assume that my new MLO and U-Boot are OK, I am testing with an older kernel.

    I have taken a working SDK 7 kernel and tried it on the new target.

    MLO and U-Boot still seem fine. Everything stops after "Starting kernel...".

    I can see that the correct kernel and dtb are loaded by looking at the sizes reported by U-Boot.

    I have spent a lot of time looking at the RTC and PMIC; could it be something different, such as the MMU or ...?

    What could stop the kernel before it outputs anything on the console?

  • In TIDA-01568 there is one sentence that I am not 100% sure about:

         6. Kernel: Disable the "RTC Real Time Clocking" configuration.

    Can anyone explain exactly what this means?