This thread has been locked.

If you have a related question, please click the "Ask a related question" button in the top right corner. The newly created question will be automatically linked to this question.

Linux/AM5728: Poweroff kernel panic

Part Number: AM5728

Tool/software: Linux

Hi,

I have a custom AM5728 based board. We are using a Phytec Phycore 057 SOM. It utilizes the TPS6590375 PMIC. 

I recently switched from kernel 4.9 to 4.14 from git.ti.com/ti-linux-kernel/ti-linux-kernel/ti-linux-4.14.y

Since the switch to 4.14. the Poweroff is not working reliably. It seems as if the i2c bus is already disabled when the palmas_power_off is called and tries to talk to the PMIC.

Here are two different error logs. In one case it just hangs, in the other a kernel panic is thrown. I have added some tracing messages.

         Starting Power-Off...
[  129.325900] systemd-shutdown[1]: Syncing filesystems and block devices.
[  129.349152] systemd-shutdown[1]: Sending SIGTERM to remaining processes...
[  129.364119] systemd-journald[143]: Received SIGTERM from PID 1 (systemd-shutdow).
[  129.438177] systemd-shutdown[1]: Sending SIGKILL to remaining processes...
[  129.453152] systemd-shutdown[1]: Unmounting file systems.
[  129.488663] EXT4-fs (mmcblk1p2): re-mounted. Opts: data=ordered
[  129.497320] systemd-shutdown[1]: All filesystems unmounted.
[  129.503459] systemd-shutdown[1]: Deactivating swaps.
[  129.508819] systemd-shutdown[1]: All swaps deactivated.
[  129.514306] systemd-shutdown[1]: Detaching loop devices.
[  129.523747] systemd-shutdown[1]: All loop devices detached.
[  129.529640] systemd-shutdown[1]: Detaching DM devices.
[  129.535255] systemd-shutdown[1]: All DM devices detached.
[  129.545822] systemd-shutdown[1]: Syncing filesystems and block devices.
[  129.552919] systemd-shutdown[1]: Powering off.
[  129.587829] reboot: migrate_to_reboot_cpu
[  129.592080] reboot: syscore_shutdown
[  129.595825] syscore_shutdown++
[  129.599054] PM: Calling ledtrig_cpu_syscore_shutdown+0x0/0x8
[  129.604990] PM: Calling irq_gc_shutdown+0x0/0x54
[  129.609846] PM: Calling cpufreq_suspend+0x0/0x110
[  129.614782] syscore_shutdown--
[  129.617992] reboot: Power down
[  129.621189] local_irq_disable
[  129.624331] smp_send_stop
[  129.627078] pm_power_off
[  129.629731] palmas_power_off++
[  129.632935] palmas-override-powerhold
[  129.636776] regmap_update_bits slave: 0 addr: 0xfb

         Starting Power-Off...
[   19.719908] systemd-shutdown[1]: Syncing filesystems and block devices.
[   19.742652] systemd-shutdown[1]: Sending SIGTERM to remaining processes...
[   19.757785] systemd-journald[144]: Received SIGTERM from PID 1 (systemd-shutdow).
[   19.827671] systemd-shutdown[1]: Sending SIGKILL to remaining processes...
[   19.842800] systemd-shutdown[1]: Unmounting file systems.
[   19.883415] EXT4-fs (mmcblk1p2): re-mounted. Opts: data=ordered
[   19.892337] systemd-shutdown[1]: All filesystems unmounted.
[   19.898261] systemd-shutdown[1]: Deactivating swaps.
[   19.903584] systemd-shutdown[1]: All swaps deactivated.
[   19.909105] systemd-shutdown[1]: Detaching loop devices.
[   19.918531] systemd-shutdown[1]: All loop devices detached.
[   19.924381] systemd-shutdown[1]: Detaching DM devices.
[   19.930047] systemd-shutdown[1]: All DM devices detached.
[   19.940524] systemd-shutdown[1]: Syncing filesystems and block devices.
[   19.947616] systemd-shutdown[1]: Powering off.
[   19.952277] reboot: kernel_power_off++
[   19.991093] reboot: migrate_to_reboot_cpu
[   19.995342] reboot: syscore_shutdown
[   19.999112] syscore_shutdown++
[   20.002326] PM: Calling ledtrig_cpu_syscore_shutdown+0x0/0x8
[   20.008297] PM: Calling irq_gc_shutdown+0x0/0x54
[   20.013145] PM: Calling cpufreq_suspend+0x0/0x110
[   20.018095] syscore_shutdown--
[   20.021289] reboot: Power down
[   20.024492] machine_power_off++
[   20.027814] local_irq_disable
[   20.030924] smp_send_stop
[   20.033668] pm_power_off
[   20.036318] palmas_power_off++
[   20.039521] palmas-override-powerhold
[   20.043362] regmap_update_bits slave: 0 addr: 0xfb
[   21.081850] omap_i2c 48070000.i2c: controller timed out
[   21.111748] palmas 0-0058: Unable to write PRIMARY_SECONDARY_PAD2 -110
[   21.118599] regmap_update_bits slave: 0 addr: 0xa0
[   22.197868] omap_i2c 48070000.i2c: controller timed out
[   22.227777] palmas_power_off: Unable to write to DEV_CTRL_DEV_ON: -110
[   22.234626] palmas_power_off--
[   22.237856] machine_power_off--
[   22.241279] reboot: kernel_power_off--
[   22.251875] Kernel panic - not syncing: Attempted to kill init! exitcode=0x00000000
[   22.251875]
[   22.261462] CPU: 0 PID: 1 Comm: systemd-shutdow Not tainted 4.14.54-02970-g9d1c806-dirty #321
[   22.270411] Hardware name: Generic DRA74X (Flattened Device Tree)
[   22.276822] [<c0210024>] (unwind_backtrace) from [<c020b840>] (show_stack+0x10/0x14)
[   22.284953] [<c020b840>] (show_stack) from [<c0a1be2c>] (dump_stack+0x7c/0x90)
[   22.292538] [<c0a1be2c>] (dump_stack) from [<c023661c>] (panic+0xdc/0x248)
[   22.299758] [<c023661c>] (panic) from [<c023aa0c>] (complete_and_exit+0x0/0x1c)
[   22.307432] [<c023aa0c>] (complete_and_exit) from [<ee0d9e78>] (0xee0d9e78)
[   22.314754] ---[ end Kernel panic - not syncing: Attempted to kill init! exitcode=0x00000000
[   22.314754]

Any idea what might be going wrong here?

I know that ti-linux-4.14.y is under development and some features may not work but I'm running 4.14 since ~half a year and everything else is fine. But this problem is there from day one and is not going away or didn't change behaviour. So I'm pretty sure I'm having an, in some way special, case which didn't turn up in any development at TI. Maybe it's my devietree or something like this.

Regards,

Michael

  • Hi Michael,

    First I want to note that the ti-linux-4.14.y branch is being updated constantly, as I checked the latest commit is from today. You CANNOT count that the latest sources will be stable until the kernel is officially released.

    Anyway to try and avoid the i2c timeout, which is the most likely reason for getting your poweroff to hang, I'd go with increasing the I2C timeouts in drivers/i2c/busses/i2c-omap.c:
    /* timeout waiting for the controller to respond */
    #define OMAP_I2C_TIMEOUT (msecs_to_jiffies(1000))

    /* timeout for pm runtime autosuspend */
    #define OMAP_I2C_PM_TIMEOUT 1000 /* ms */

    /* timeout for making decision on bus free status */
    #define OMAP_I2C_BUS_FREE_TIMEOUT (msecs_to_jiffies(10))

    Unfortunately I cannot recommend a timeout value at this point, you should experiment on your board.

    The reason for your timeouts is explained here:
    patchwork.kernel.org/.../

    Best Regards,
    Yordan
  • Also, I suggest to check your kernel 4.14 port against mainline kernel 4.14.54 (focus on the power management sources), downloaded from kernel.org. As a test, I copied the mainline kernel image built with omap2plus_defconfig and using the TI Processor SDK Linux dts file, because I didn't have time to adapt all drivers for the evm. As a result I got the AM57xx EVM working with kernel 4.14.54:
    root@am57xx-evm:~# uname -a
    Linux am57xx-evm 4.14.54 #1 SMP Wed Jul 11 15:04:12 EEST 2018 armv7l GNU/Linux

    Reboot & poweroff commands work fine on the EVM, see the log from poweroff:
    root@am57xx-evm:~# poweroff
    Stopping Load/Save Random Seed...
    [ OK ] Stopped target Timers.
    Stopping Lightning Fast Webserver With Light System Requirements...
    Stopping Job spooling tools...
    Stopping Periodic Command Scheduler...
    [ OK ] Stopped target Login Prompts.
    Stopping Avahi mDNS/DNS-SD Stack...
    Stopping Update UTMP about System Boot/Shutdown...
    Stopping User Manager for UID 0...
    St Stopping Network Name Resolution...
    [ OK ] Stopped Daily Cleanup of Temporary Directories.
    Stopping rng-tools.service...
    Stopping Simple Network Management Protocol (SNMP) Daemon....
    [ OK ] Stopped Session c1 of user root.
    Stopping Telephony service...
    Stopping strongSwan IPsec IKEv1/IKEv2 daemon using ipsec.conf...
    Stopping Serial Getty on ttyS2...
    [ OK ] Stopped Network Time Synchronization.
    [ OK ] Stopped Telephony service.
    [ OK ] Stopped Job spooling tools.
    [ OK ] Stopped Kernel Logging Service.
    [ OK ] Stopped Periodic Command Scheduler.
    [ OK ] Stopped System Logging Service.
    [ OK ] Stopped Avahi mDNS/DNS-SD Stack.
    [ OK ] Stopped Simple Network Management Protocol (SNMP) Daemon..
    [ OK ] Stopped strongSwan IPsec IKEv1/IKEv2 daemon using ipsec.conf.
    [ OK ] Stopped Network Name Resolution.
    [ OK ] Stopped Wait for Network to be Configured.
    [ OK ] Stopped Lightning Fast Webserver With Light System Requirements.
    [ OK ] Stopped Serial Getty on ttyO2.
    [ OK ] Stopped Getty on tty1.
    [ OK ] Stopped Serial Getty on ttyS2.
    [ OK ] Stopped Load/Save Random Seed.
    [ OK ] Stopped Update UTMP about System Boot/Shutdown.
    [ OK ] Removed slice system-getty.slice.
    [ OK ] Removed slice system-serial\x2dgetty.slice.
    [ OK ] Stopped Create Volatile Files and Directories.
    [ OK ] Stopped target Local File Systems.
    Unmounting /media/ram...
    Unmounting /run/user/0...
    Unmounting /run/media/mmcblk1p1...
    Unmounting /sys/kernel/debug/tracing...
    Unmounting /run/media/mmcblk1p2...
    Unmounting /run/media/mmcblk1p3...
    Unmounting /run/media/mmcblk0p1...
    [FAILED] Failed unmounting /run/user/0.
    [ OK ] Stopped User Manager for UID 0.
    [ OK ] Unmounted /media/ram.
    [ OK ] Unmounted /sys/kernel/debug/tracing.
    [ OK ] Stopped rng-tools.service.
    [ OK ] Unmounted /run/media/mmcblk1p1.
    Stopping thttpd.service...
    [ OK ] Removed slice User Slice of root.
    Stopping Login Service...
    Stopping Permit User Sessions...
    [ OK ] Stopped Login Service.
    [ OK ] Unmounted /run/media/mmcblk1p2.
    [ OK ] Unmounted /run/media/mmcblk1p3.
    [ OK ] Unmounted /run/media/mmcblk0p1.
    [ OK ] Stopped thttpd.service.
    [ OK ] Stopped Permit User Sessions.
    [ OK ] Stopped target Network.
    Stopping Network Service...
    Stopping telnetd.service...
    Stopping tiipclad-daemon.service...
    [ OK ] Stopped target Local File Systems (Pre).
    [ OK ] Stopped Remount Root and Kernel File Systems.
    [ OK ] Stopped Create Static Device Nodes in /dev.
    [ OK ] Stopped Network Service.
    [ OK ] Stopped telnetd.service.
    [ OK ] Stopped tiipclad-daemon.service.
    Stopping weston.service...
    Stopping D-Bus System Message Bus...
    [ OK ] Stopped Apply Kernel Variables.
    [ OK ] Stopped D-Bus System Message Bus.
    [ OK ] Stopped weston.service.
    [ OK ] Stopped target Remote File Systems.
    Stopping rc.pvr.service...
    [ OK ] Stopped rc.pvr.service.
    Stopping uim-sysfs.service...
    [ OK ] Stopped uim-sysfs.service.
    [ OK ] Stopped target Basic System.
    [ OK ] Stopped target Paths.
    [ OK ] Stopped Dispatch Password Requests to Console Directory Watch.
    [ OK ] Stopped Forward Password Requests to Wall Directory Watch.
    Unmounting Temporary Directory...
    [ OK ] Stopped target Sockets.
    [ OK ] Closed D-Bus System Message Bus Socket.
    [ OK ] Closed Avahi mDNS/DNS-SD Stack Activation Socket.
    [ OK ] Closed Syslog Socket.
    [ OK ] Closed RPCbind Server Activation Socket.
    [ OK ] Closed dropbear.socket.
    [ OK ] Stopped target Slices.
    [ OK ] Removed slice User and Session Slice.
    [ OK ] Reached target Shutdown.
    [FAILED] Failed unmounting Temporary Directory.
    [ OK ] Reached target Unmount All Filesystems.
    [ OK ] Reached target Final Step.
    Starting Power-Off...
    [ OK ] Stopped target Swap.
    [ 78.675041] systemd-shutdow: 25 output lines suppressed due to ratelimiting
    [ 78.694080] systemd-shutdown[1]: Sending SIGTERM to remaining processes...
    [ 78.719765] systemd-journald[99]: Received SIGTERM from PID 1 (systemd-shutdow).
    [ 88.721809] systemd-shutdown[1]: Sending SIGKILL to remaining processes...
    [ 88.740219] systemd-shutdown[1]: Sending SIGKILL to PID 743 (rngd).
    [ 88.754227] systemd-shutdown[1]: Unmounting file systems.
    [ 88.760582] systemd-shutdown[1]: Remounting '/run/user/0' read-only with options 'size=185632k,mode=700'.
    [ 88.771404] systemd-shutdown[1]: Unmounting /run/user/0.
    [ 88.802543] systemd-shutdown[1]: Remounting '/var/volatile' read-only with options 'size=51200k'.
    [ 88.812595] systemd-shutdown[1]: Unmounting /var/volatile.
    [ 88.872512] systemd-shutdown[1]: Remounting '/tmp' read-only with options ''.
    [ 88.880681] systemd-shutdown[1]: Unmounting /tmp.
    [ 88.932473] systemd-shutdown[1]: Remounting '/' read-only with options 'data=ordered'.
    [ 90.397055] EXT4-fs (mmcblk0p2): re-mounted. Opts: data=ordered
    [ 90.423021] EXT4-fs (mmcblk0p2): re-mounted. Opts: data=ordered
    [ 90.429934] EXT4-fs (mmcblk0p2): re-mounted. Opts: data=ordered
    [ 90.482497] reboot: Power down

    Best Regards,
    Yordan
  • Hi Yordan,

    thanks for the ideas. The increased timeout didn't change a thing, just takes longer to panic.

    I pulled out the AM5728-IDK I have laying around to have a clean slate. I see the problem with the mainline kernel as well as the ti-linux branch using "ti_sdk_dra7x_debug_defconfig" and "am572x-idk.dtb". 

    [   37.009358] systemd-shutdown[1]: Powering off.
    [   37.014050] reboot: kernel_power_off++
    [   37.042324] reboot: migrate_to_reboot_cpu
    [   37.046573] reboot: syscore_shutdown
    [   37.050324] syscore_shutdown++
    [   37.053578] PM: Calling ledtrig_cpu_syscore_shutdown+0x0/0x8
    [   37.059521] PM: Calling irq_gc_shutdown+0x0/0x54
    [   37.064381] syscore_shutdown--
    [   37.067581] reboot: Power down
    [   37.070781] machine_power_off++
    [   37.074099] local_irq_disable
    [   37.077204] smp_send_stop
    [   37.079948] pm_power_off
    [   37.082601] palmas_power_off++
    [   37.085807] palmas-override-powerhold
    [   37.089648] regmap_update_bits slave: 0 addr: 0xfb 
    ---> stuck

    alternatively the write gives an error and I get a kernel panic.

    4.9. is turning off the board just fine.

    I tinkered around a bit in the shutdown code and came across the call to "local_irq_disable();" in "void machine_power_off(void)".

    If I comment out this call the pmic is successfuly switching off the board. So I suspect the I2C driver is relying on an irq which gets disabled before the communication with the PMIC is happening.

    So if I have time I will dig some more (quick look showed no big changes in i2c-omap, so it might be something else). But for the moment I'm fine with commenting out the "local_irq_disable();" call which is no problem for me as the PMIC is switching of the power anyways.

    Regards,

    Michael

  • Hi Michael,

    Thanks for sharing. As I said you could also try with the omap2plus_defconfig. I tested my GP EVM with this configuration and couldn't see a kernel panic or hang when executing poweroff.

    Also thank you for sharing your hack. I am sure this will be useful for other community members as well.

    I am closing this thread for now.

    Best Regards,
    Yordan
  • I could reproduce the issue with the latest ti-linux-4.14.y.

    Reverting any of the below patches is allowing me to shutdown the pmic:

    0d5e04e sched/core: Require cpu_active() in select_task_rq(), for user tasks
    e4c55e0e sched/core: Fix rules for running on online && !active CPUs


    More debug revealed that power_off call schedules couple i2c writes to PMIC.
    The isr is getting triggered but with the above two patches in place the bottom half
    of the I2C isr never gets scheduled and hence i2c times out.

    So still need to figure out the right way to deal with this.