PROCESSOR-SDK-J721E: Reboot hangs after upgrade to psdk10

Part Number: PROCESSOR-SDK-J721E

After upgrading to psdk10, the Linux reboot hangs at:
[  189.454695] reboot: Restarting system

I ran a few tests:
1. From the U-Boot command line, we are able to use reset to reboot the device.
2. If we kick the watchdog and then wait, the device reboots when the watchdog times out.
3. If we kick the watchdog and reboot right away, the reboot still hangs at the same place and the watchdog does not reset the device.
4. With some debugging, we found that the reboot hangs in TF-A at: k3_system_reset -> ti_sci_core_reboot()

We are still booting from SD card; we have not tested reboot in other boot modes yet. Please help.

Regards,

Mandy
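
For test 3 above, it can help to record the watchdog state just before issuing the reboot. The following is a minimal sketch, assuming the kernel is built with CONFIG_WATCHDOG_SYSFS so that the attributes appear under /sys/class/watchdog/watchdog0; the helper name show_watchdog and the default path are assumptions for illustration, not part of the SDK.

```shell
#!/bin/sh
# Print selected watchdog attributes from sysfs.
# Assumes CONFIG_WATCHDOG_SYSFS=y; show_watchdog is a hypothetical helper.
show_watchdog() {
    dir="${1:-/sys/class/watchdog/watchdog0}"
    for attr in identity state timeout timeleft nowayout; do
        if [ -r "$dir/$attr" ]; then
            # Attribute exists: print it as name=value.
            printf '%s=%s\n' "$attr" "$(cat "$dir/$attr")"
        else
            # Some attributes depend on driver support and may be absent.
            printf '%s=<not available>\n' "$attr"
        fi
    done
}
```

Calling show_watchdog with no argument reads watchdog0; a different sysfs directory can be passed as the first argument.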

  • Hi Mandy,

    I have redirected this query to the corresponding expert. Thanks in advance for your patience.

    Regards

    Gokul

  • Hello Mandy,

    Can you share the steps you used to kick-start the watchdog in U-Boot and then try reset?

    Also, what is the use case for kicking the watchdog and then immediately triggering a reset?

    Best Regards,

    Keerthy 

  • Hi Keerthy,
    1. At the U-Boot prompt, if I type "reset", the device reboots. This works as expected.
    2. In Linux, if I run echo 1 > /dev/watchdog and wait, the device reboots successfully after 60 seconds.
    3. In Linux, if I run reboot, the reboot hangs as described in my first message. I debugged a little and found that it hangs in TF-A at: k3_system_reset -> ti_sci_core_reboot()
    4. In Linux, if I run echo 1 > /dev/watchdog and then type "reboot", the device starts to reboot but hangs at the same place (the watchdog does not reset the device even after a long wait).
    We have 2 issues:
    1. When the device is rebooted normally from Linux, it hangs (case 3).
    2. When the watchdog is started before the reboot, the watchdog does not reset the device when the reboot hangs (case 4).

    The first issue is more important: we need to be able to reboot the device normally. The second issue has lower priority, but if the device hangs during reboot we still want the watchdog to reset it as a backup.
    Thanks,
    Mandy
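
The Linux-side cases 2 to 4 above can be captured as a small script so they are repeatable across SDK builds. This is only a sketch: the function names and the WDT_DEV and RUN variables are hypothetical, and by default each command is printed rather than executed (dry run), so the file is safe to source on a development host.

```shell
#!/bin/sh
# Dry-run sketch of the Linux-side test cases (2-4) from the list above.
# WDT_DEV and RUN are assumptions for illustration. If RUN is unset, it
# defaults to "echo", so commands are printed instead of executed.
WDT_DEV="${WDT_DEV:-/dev/watchdog}"
RUN="${RUN-echo}"

# Case 2: write once to the watchdog device, then wait for the timeout.
case_watchdog_only() {
    $RUN sh -c "echo 1 > $WDT_DEV"
}

# Case 3: plain reboot (reported to stop at "reboot: Restarting system").
case_reboot_only() {
    $RUN reboot
}

# Case 4: start the watchdog, then reboot right away.
case_watchdog_then_reboot() {
    case_watchdog_only
    case_reboot_only
}
```

Sourcing the file and calling case_watchdog_then_reboot prints the two commands. Setting RUN to an empty value before sourcing makes the functions execute the commands for real on the target.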

  • Hi Mandy,

    1. When the device is rebooted normally from Linux, it hangs (case 3).

    Can you share the complete logs? Also, can you confirm that it is the 10.0 SDK?

    - Keerthy

  • Hi Keerthy,

    We set up the layers with: ./oe-layertool-setup.sh -f configs/processor-sdk-linux/processor-sdk-linux-10_00_08_06.txt
    So I believe it is 10.0.
    The reboot log is below.
    Mandy

    reboot
             Stopping Session c1 of User root...
    [  OK  ] Removed slice Slice /system/modprobe.
    [  OK  ] Stopped target Multi-User System.
    [  OK  ] Stopped target Login Prompts.
             Stopping DNS forwarder and DHCP server...
             Stopping Getty on tty1...
             Stopping Mosquitto MQTT Broker...
             Stopping Berkeley Internet Name Domain (DNS)...
             Stopping Netperf Benchmark Server...
             Stopping Telephony service...
             Stopping RTC service...
    [  117.070178] RTC: Launch script v1.0
             Stopping Serial Getty on ttyS2...
             Stopping Load/Save OS Random Seed...
             Stopping Telnet Server...
    [  OK  ] Stopped OpenSSH Key Generation.
    [  OK  ] Stopped Avahi mDNS/DNS-SD Stack.
    [  OK  ] Stopped Telephony service.
    [  OK  ] Stopped Netperf Benchmark Server.
    [  OK  ] Stopped containerd container runtime.
    [  OK  ] Stopped Getty on tty1.
    [  OK  ] Stopped Serial Getty on ttyS2.
    [  OK  ] Stopped Mosquitto MQTT Broker.
    [  OK  ] Stopped Bluetooth service.
    [  OK  ] Stopped DNS forwarder and DHCP server.
    [  OK  ] Stopped Berkeley Internet Name Domain (DNS).
    [  OK  ] Stopped RTC service.
    [  OK  ] Stopped Load/Save OS Random Seed.
    [  OK  ] Stopped Telnet Server.
    [  OK  ] Stopped Session c1 of User root.
    [  OK  ] Removed slice Slice /system/getty.
    [  OK  ] Removed slice Slice /system/serial-getty.
             Stopping User Login Management...
             Stopping Permit User Sessions...
             Stopping User Manager for UID 0...
    [  OK  ] Stopped Permit User Sessions.
    [  OK  ] Stopped User Manager for UID 0.
    [  OK  ] Stopped target Network.
    [  OK  ] Stopped target Remote File Systems.
             Stopping Network Configuration...
             Stopping User Runtime Directory /run/user/0...
    [  OK  ] Unmounted /run/user/0.
    [  OK  ] Stopped User Login Management.
    [  OK  ] Stopped Network Configuration.
    [  OK  ] Stopped User Runtime Directory /run/user/0.
    [  136.925927] kauditd_printk_skb: 18 callbacks suppressed
    [  136.925935] audit: type=1334 audit(1709054940.484:133): prog-id=64 op=UNLOAD
    [  OK  ] Removed slice User Slice of UID 0.
    [  136.938803] audit: type=1334 audit(1709054940.488:134): prog-id=66 op=UNLOAD
    [  OK  ] Stopped target Preparation for Network.
    [  OK  ] Stopped IPv6 Packet Filtering Framework.
    [  OK  ] Stopped IPv4 Packet Filtering Framework.
    [  OK  ] Stopped target Basic System.
    [  OK  ] Stopped target Path Units.
    [  OK  ] Stopped Dispatch Password Requests to Console Directory Watch.
    [  OK  ] Stopped Forward Password Requests to Wall Directory Watch.
    [  OK  ] Stopped target Slice Units.
    [  OK  ] Removed slice User and Session Slice.
    [  OK  ] Stopped target Socket Units.
    [  OK  ] Closed Avahi mDNS/DNS-SD Stack Activation Socket.
    [  OK  ] Closed Docker Socket for the API.
    [  OK  ] Closed sshd.socket.
    [  OK  ] Closed Network Service Netlink Socket.
             Stopping D-Bus System Message Bus...
    [  OK  ] Stopped Generate network units from Kernel command line.
    [  OK  ] Stopped D-Bus System Message Bus.
    [  OK  ] Closed D-Bus System Message Bus Socket.
    [  137.233669] audit: type=1334 audit(1709054940.796:135): prog-id=65 op=UNLOAD
    [  OK  ] Stopped target System Initialization.
             Stopping Network Name Resolution...
             Stopping Record System Boot/Shutdown in UTMP...
    [  OK  ] Stopped Network Name Resolution.
    [  137.323930] audit: type=1334 audit(1709054940.884:136): prog-id=72 op=UNLOAD
    [  OK  ] Stopped Apply Kernel Variables.
    [  OK  ] Closed Process Core Dump Socket.
    [  OK  ] Stopped Load Kernel Modules.
    [  OK  ] Stopped Record System Boot/Shutdown in UTMP.
    [  OK  ] Stopped Create Volatile Files and Directories.
    [  OK  ] Stopped target Local File Systems.
             Unmounting /ro...
             Unmounting /run/media/BOOT-mmcblk0p1...
             Unmounting /run/media/USERAPP-mmcblk0p3...
             Unmounting /run/media/mmcblk0p2...
             Unmounting /run/media/sda1...
    [  137.553260] EXT4-fs (sda1): unmounting filesystem fbb7e497-c621-495f-a9a3-ec611e825c95.
             Unmounting /run/media/sda2...
    [  137.562078] EXT4-fs (sda2): unmounting filesystem 5f580e3e-7149-4d6b-b991-803ce15bd7d7.
             Unmounting /run/media/sdb1...
             Unmounting /run/media/sdb2...
    [  137.613290] EXT4-fs (sdb1): unmounting filesystem 7e49d823-432f-46c8-a5aa-3f053f942d99.
    [  137.622796] EXT4-fs (sdb2): unmounting filesystem 0f4d1cb7-69fa-41d7-be27-57e3fbb08be2.
             Unmounting /run/media/sdb3...
             Unmounting /run/media/sdb4...
    [  137.658392] EXT4-fs (sdb3): unmounting filesystem 2ef81de8-617b-4eef-b65e-2b8fa9268ae0.
             Unmounting Temporary Directory /tmp...
    [  137.688974] EXT4-fs (sdb4): unmounting filesystem 9f3e6979-6867-4287-8178-74cfbba29829.
             Unmounting /var/volatile...
    [  OK  ] Unmounted /ro.
    [  OK  ] Unmounted /run/media/BOOT-mmcblk0p1.
    [  OK  ] Unmounted /run/media/USERAPP-mmcblk0p3.
    [  OK  ] Unmounted /run/media/mmcblk0p2.
    [  OK  ] Unmounted /run/media/sda1.
    [  OK  ] Unmounted /run/media/sda2.
    [  OK  ] Unmounted /run/media/sdb1.
    [  OK  ] Unmounted /run/media/sdb2.
    [  OK  ] Unmounted /run/media/sdb3.
    [  OK  ] Unmounted /run/media/sdb4.
    [  OK  ] Unmounted Temporary Directory /tmp.
    [  OK  ] Unmounted /var/volatile.
    [  OK  ] Stopped target Swaps.
    [  OK  ] Reached target Unmount All Filesystems.
    [  OK  ] Stopped File System Check on /dev/mmcblk0p1.
    [  OK  ] Stopped File System Check on /dev/mmcblk0p2.
    [  OK  ] Stopped File System Check on /dev/mmcblk0p3.
    [  OK  ] Stopped File System Check on /dev/sda1.
    [  OK  ] Stopped File System Check on /dev/sda2.
    [  OK  ] Stopped File System Check on /dev/sdb1.
    [  OK  ] Stopped File System Check on /dev/sdb2.
    [  OK  ] Stopped File System Check on /dev/sdb3.
    [  OK  ] Stopped File System Check on /dev/sdb4.
    [  OK  ] Removed slice Slice /system/systemd-fsck.
    [  OK  ] Stopped target Preparation for Local File Systems.
    [  OK  ] Stopped Remount Root and Kernel File Systems.
    [  OK  ] Stopped Create Static Device Nodes in /dev.
    [  OK  ] Stopped Create Static Device Nodes in /dev gracefully.
    [  OK  ] Reached target System Shutdown.
    [  OK  ] Reached target Late Shutdown Services.
    [  OK  ] Finished System Reboot.
    [  OK  ] Reached target System Reboot.
    [  138.308835] audit: type=1334 audit(1709054941.868:137): prog-id=60 op=UNLOAD
    [  138.315981] audit: type=1334 audit(1709054941.868:138): prog-id=59 op=UNLOAD
    [  138.323076] audit: type=1334 audit(1709054941.872:139): prog-id=63 op=UNLOAD
    [  138.330193] audit: type=1334 audit(1709054941.872:140): prog-id=62 op=UNLOAD
    [  138.337264] audit: type=1334 audit(1709054941.872:141): prog-id=61 op=UNLOAD
    [  138.344356] audit: type=1334 audit(1709054941.884:142): prog-id=68 op=UNLOAD
    [  138.362116] watchdog: watchdog0: nowayout prevents watchdog being stopped!
    [  138.368995] watchdog: watchdog0: watchdog did not stop!
    [  138.389915] systemd-shutdown[1]: Using hardware watchdog 'K3 RTI Watchdog', version 0, device /dev/watchdog0
    [  138.399812] systemd-shutdown[1]: Modifying watchdog timeout is not supported, reusing the programmed timeout.
    [  138.410064] systemd-shutdown[1]: Watchdog running with a timeout of 1min.
    [  138.438429] systemd-shutdown[1]: Syncing filesystems and block devices.
    [  138.452489] systemd-shutdown[1]: Sending SIGTERM to remaining processes...
    [  138.470763] systemd-journald[145]: Received SIGTERM from PID 1 (systemd-shutdow).
    [  138.484456] systemd-shutdown[1]: Sending SIGKILL to remaining processes...
    [  138.500695] systemd-shutdown[1]: Unmounting file systems.
    [  138.507939] (sd-remount)[2914]: Remounting '/' read-only with options 'lowerdir=/mnt/lower,upperdir=/mnt/usr/rw_psdk7/upper,workdir=/mnt/usr/rw_psdk7/work'.
    [  138.528339] systemd-shutdown[1]: All filesystems unmounted.
    [  138.533985] systemd-shutdown[1]: Deactivating swaps.
    [  138.539137] systemd-shutdown[1]: All swaps deactivated.
    [  138.544776] systemd-shutdown[1]: Detaching loop devices.
    [  138.551593] systemd-shutdown[1]: All loop devices detached.
    [  138.557176] systemd-shutdown[1]: Stopping MD devices.
    [  138.562507] systemd-shutdown[1]: All MD devices stopped.
    [  138.567816] systemd-shutdown[1]: Detaching DM devices.
    [  138.573295] systemd-shutdown[1]: All DM devices detached.
    [  138.578721] systemd-shutdown[1]: All filesystems, swaps, loop devices, MD devices and DM devices detached.
    [  138.588374] watchdog: watchdog0: nowayout prevents watchdog being stopped!
    [  138.595350] systemd-shutdown[1]: Failed to disable hardware watchdog, ignoring: Device or resource busy
    [  138.604740] watchdog: watchdog0: nowayout prevents watchdog being stopped!
    [  138.611601] watchdog: watchdog0: watchdog did not stop!
    [  138.617567] systemd-shutdown[1]: Syncing filesystems and block devices.
    [  138.624306] systemd-shutdown[1]: Rebooting.
    [  138.821338] ICC_BROKER: TimeSync thread stopped
    [  138.933342] ICC_MAIN: Removing MCU_1_0 bus
    [  138.945341] ICC_MAIN: Removed MCU_1_0 bus
    [  138.949354] ICC_MAIN: Removing MCU_2_0 bus
    [  138.965363] ICC_MAIN: Removed MCU_2_0 bus
    [  138.969390] ICC_MAIN: Removing MCU_2_1 bus
    [  138.989341] ICC_MAIN: Removed MCU_2_1 bus
    [  138.993340] ICC_MAIN: Removing MCU_3_0 bus
    [  139.021339] ICC_MAIN: Removed MCU_3_0 bus
    [  139.025341] ICC_MAIN: Removing MCU_3_1 bus
    [  139.049346] ICC_DDR: Freed shared memory pool, size: 0 MiB
    [  139.054833] ICC_MAIN: Removed MCU_3_1 bus
    [  139.058904] sd 0:0:0:1: [sdb] Synchronizing SCSI cache
    [  139.064109] sd 0:0:0:0: [sda] Synchronizing SCSI cache
    [  139.282078] reboot: Restarting system
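
Two things stand out in the log above: the kernel reports that nowayout prevents the watchdog from being stopped, and the capture ends at "reboot: Restarting system" with no bootloader output after it. A captured console log can be checked for both conditions with a short script. This is a sketch only; check_reboot_log is a hypothetical helper name, and the messages it prints are an interpretation of the log text, not a diagnosis.

```shell
#!/bin/sh
# Summarize a captured console log: was the watchdog refused a stop during
# shutdown, and is "reboot: Restarting system" the last line of the capture?
# check_reboot_log is a hypothetical helper name.
check_reboot_log() {
    log="$1"
    if grep -q 'nowayout prevents watchdog being stopped' "$log"; then
        echo "watchdog: stop refused during shutdown (nowayout)"
    else
        echo "watchdog: no nowayout message found"
    fi
    # Take the last non-empty line of the capture.
    last="$(grep -v '^[[:space:]]*$' "$log" | tail -n 1)"
    case "$last" in
        *'reboot: Restarting system'*)
            echo "reboot: log ends at 'Restarting system'" ;;
        *)
            echo "reboot: log does not end at 'Restarting system'" ;;
    esac
}
```

Running check_reboot_log on a file holding the console capture prints one line for each check, which makes it easier to compare captures from different boot modes or SDK versions.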
    

  • Mandy,

    Have you tried this on the EVM, if you have access to one?

    Linux reboot works well on the EVM.

    Best Regards,

    Keerthy