This thread has been locked.

If you have a related question, please click the "Ask a related question" button in the top right corner. The newly created question will be automatically linked to this question.

AM4372: sdhci-omap timout during reboot

Part Number: AM4372
Other Parts Discussed in Thread: AM4378

Hi Andrew

Do you have any updates to the https://e2e.ti.com/support/processors-group/processors/f/processors-forum/1114937/am4372-sdhci-omap-timout-during-reboot/ topic?

In the meantime we've updated Linux kernel to version 5.10.145 from TI GIT (SRCREV = "76979ae4875941639fe73982a10d69c8cfc533d9") and issue is reproducing also on this version.

Best Regards

Paweł

  • Hi Pawel,

    can you provide the output of `cat /sys/kernel/debug/mmc1/ios`?

    It might be a bit far-fetched for this particular scenario (since during normal operation you don't seem to have issues) but I'd still be curious to see if limiting the eMMC interface clock frequency to let's say 25MHz and reducing the bus width (all via DTS entry) have any impact on the observed failure rate.

    Regards, Andreas

  • Hi Andreas

    At this moment clock is 52MHz and bus width 4 bits:

    # cat /sys/kernel/debug/mmc1/ios
    clock: 52000000 Hz
    vdd: 21 (3.3 ~ 3.4 V)
    bus mode: 2 (push-pull)
    chip select: 0 (don't care)
    power mode: 2 (on)
    bus width: 2 (4 bits)
    timing spec: 1 (mmc high-speed)
    signal voltage: 0 (3.30 V)
    driver type: 0 (driver type B)

    Best Regards

    Paweł

  • Hi Andreas

    After changing clock for mmc1 to 26MHz issue is also reproducing:

    [ 64.668415] sdhci-omap 481d8000.mmc: Timeout waiting on controller reset in t
    [ 74.723248] mmc1: Timeout waiting for hardware cmd interrupt.
    [ 74.729046] mmc1: sdhci: ============ SDHCI REGISTER DUMP ===========
    [ 74.735532] mmc1: sdhci: Sys addr: 0x00000000 | Version: 0x00003101
    [ 74.742013] mmc1: sdhci: Blk size: 0x00000000 | Blk cnt: 0x00000000
    [ 74.748493] mmc1: sdhci: Argument: 0x00010000 | Trn mode: 0x00000000
    [ 74.754971] mmc1: sdhci: Present: 0x01f70000 | Host ctl: 0x00000000
    [ 74.761450] mmc1: sdhci: Power: 0x00000000 | Blk gap: 0x00000000
    [ 74.767928] mmc1: sdhci: Wake-up: 0x00000000 | Clock: 0x00000000
    [ 74.774406] mmc1: sdhci: Timeout: 0x00000000 | Int stat: 0x00000000
    [ 74.780885] mmc1: sdhci: Int enab: 0x007f0003 | Sig enab: 0x007f0003
    [ 74.787364] mmc1: sdhci: ACmd stat: 0x00000000 | Slot int: 0x00000000
    [ 74.793842] mmc1: sdhci: Caps: 0x05e10080 | Caps_1: 0x00000000
    [ 74.800320] mmc1: sdhci: Cmd: 0x00000d1a | Max curr: 0x00000000
    [ 74.806798] mmc1: sdhci: Resp[0]: 0x00000000 | Resp[1]: 0x00000000
    [ 74.813276] mmc1: sdhci: Resp[2]: 0x00000000 | Resp[3]: 0x00000000
    [ 74.819751] mmc1: sdhci: Host ctl2: 0x00000000
    [ 74.824223] mmc1: sdhci: ============================================
    [ 74.859991] sdhci-omap 481d8000.mmc: Timeout waiting on controller reset in t

    # cat /sys/kernel/debug/mmc1/ios
    clock: 26000000 Hz
    vdd: 21 (3.3 ~ 3.4 V)
    bus mode: 2 (push-pull)
    chip select: 0 (don't care)
    power mode: 2 (on)
    bus width: 2 (4 bits)
    timing spec: 1 (mmc high-speed)
    signal voltage: 0 (3.30 V)
    driver type: 0 (driver type B)

    Best Regards

    Paweł

  • Hi Andreas, 

    I'm also involved with this issue, the hardware part of  device is on my side. Below schematic how it is connected eMMC with Sitara on MMC1 port. I checked the power supply and observe the clk/data signals and it looks not bad when device is booted. The voltage are also nominal without any glitches. Also if my observation are correct if I see the problem with reboot command "timeout post regarding to mmc1"  the signals clk/data are turned off. But if the reboot command goes fine the signals clk/data are still observed until the system is reseted.

    My colleague from software team added some delay for reboot, because I needed more time to observe the signals and it seems to accidentally patched this problem but this is not the solution.

    What could I checked from my side, any suggestion ?

  • Hi Damina and Pawl,

    Have you tried to reproduce the issue on AM437x GP EVM? If so it would be helpful for us locally debugging it.

    Roughly how many reboot iterations is it when the issue happens in about an hour?

    How often do you see the time out issue happens when mounting SD card? If run mount/umount SD card test in a loop, do you see the timeout issue happen sooner?

  • Hi Bin

    We will make a setup and run the reboot test on AM437x GP EVM next week.

    Usually it takes about 1 hour to reproduce during test of reboot in the loop on our board.

    For SD card we will prepare similar test and let you know.

    Best Regards

    Paweł

  • Hi Pawel,

    Thanks for collecting the date on the EVM.

    I tested SD card mount/umount stress test on the AM437x GP EVM last night. It runs over night for 35K+ iterations, but the timeout wasn't triggered.

    I used the SDK v8.2.0.24 prebuilt base filesystem, using NFS root filesystem, then run the following script to mount/umount the SD card 2nd partition.

    #!/bin/sh
    
    mount /dev/mmcblk0p2 /mnt || exit 1
    ls /mnt
    umount /mnt || exit 2
    

    (I have another script to run this script in a loop.)

    Now I just stopped the test, and run 'reboot -f' with Linux base filesystem running on the SD card. Let's see if this test can trigger the timeout issue.

  • Hi Bin

    Thanks for the update, so far both issues were reproduced when system on our board was running from eMMC. We're planning to prepare the same setup for AM437x GP EVM.

    One more observation, today when issue was reproduced, there was an additional line at the end:

    [ 64.025605] mmc1: card 0001 removed

    Sending all processes the TERM signal...
    Sending all processes the KILL signal...
    Deactivating swap...
    Unmounting local filesystems...
    [ 22.765354] EXT4-fs (dm-0): re-mounted. Opts: (null)
    Rebooting... [ 33.122542] mmc1: Timeout waiting for hardware cmd interrupt.
    [ 33.128355] mmc1: sdhci: ============ SDHCI REGISTER DUMP ===========
    [ 33.134843] mmc1: sdhci: Sys addr: 0x00000000 | Version: 0x00003101
    [ 33.141325] mmc1: sdhci: Blk size: 0x00000000 | Blk cnt: 0x00000000
    [ 33.147804] mmc1: sdhci: Argument: 0x00010000 | Trn mode: 0x00000000
    [ 33.154287] mmc1: sdhci: Present: 0x01f70000 | Host ctl: 0x00000000
    [ 33.160766] mmc1: sdhci: Power: 0x00000000 | Blk gap: 0x00000000
    [ 33.167247] mmc1: sdhci: Wake-up: 0x00000000 | Clock: 0x00000000
    [ 33.173726] mmc1: sdhci: Timeout: 0x00000000 | Int stat: 0x00000000
    [ 33.180207] mmc1: sdhci: Int enab: 0x007f0003 | Sig enab: 0x007f0003
    [ 33.186687] mmc1: sdhci: ACmd stat: 0x00000000 | Slot int: 0x00000000
    [ 33.193166] mmc1: sdhci: Caps: 0x05e10080 | Caps_1: 0x00000000
    [ 33.199646] mmc1: sdhci: Cmd: 0x00000d1a | Max curr: 0x00000000
    [ 33.206126] mmc1: sdhci: Resp[0]: 0x00000000 | Resp[1]: 0x00000000
    [ 33.212605] mmc1: sdhci: Resp[2]: 0x00000000 | Resp[3]: 0x00000000
    [ 33.219083] mmc1: sdhci: Host ctl2: 0x00000000
    [ 33.223555] mmc1: sdhci: ============================================
    [ 33.259633] sdhci-omap 481d8000.mmc: Timeout waiting on controller reset in sdhci_omap_reset
    [ 33.297324] sdhci-omap 481d8000.mmc: Timeout waiting on controller reset in sdhci_omap_reset
    [ 43.362522] mmc1: Timeout waiting for hardware cmd interrupt.
    [ 43.368320] mmc1: sdhci: ============ SDHCI REGISTER DUMP ===========
    [ 43.374804] mmc1: sdhci: Sys addr: 0x00000000 | Version: 0x00003101
    [ 43.381286] mmc1: sdhci: Blk size: 0x00000000 | Blk cnt: 0x00000000
    [ 43.387767] mmc1: sdhci: Argument: 0x00010000 | Trn mode: 0x00000000
    [ 43.394248] mmc1: sdhci: Present: 0x01f70000 | Host ctl: 0x00000000
    [ 43.400728] mmc1: sdhci: Power: 0x00000000 | Blk gap: 0x00000000
    [ 43.407207] mmc1: sdhci: Wake-up: 0x00000000 | Clock: 0x00000000
    [ 43.413686] mmc1: sdhci: Timeout: 0x00000000 | Int stat: 0x00000000
    [ 43.420167] mmc1: sdhci: Int enab: 0x007f0003 | Sig enab: 0x007f0003
    [ 43.426647] mmc1: sdhci: ACmd stat: 0x00000000 | Slot int: 0x00000000
    [ 43.433126] mmc1: sdhci: Caps: 0x05e10080 | Caps_1: 0x00000000
    [ 43.439606] mmc1: sdhci: Cmd: 0x00000d1a | Max curr: 0x00000000
    [ 43.446085] mmc1: sdhci: Resp[0]: 0x00000000 | Resp[1]: 0x00000000
    [ 43.452564] mmc1: sdhci: Resp[2]: 0x00000000 | Resp[3]: 0x00000000
    [ 43.459041] mmc1: sdhci: Host ctl2: 0x00000000
    [ 43.463513] mmc1: sdhci: ============================================
    [ 43.499242] sdhci-omap 481d8000.mmc: Timeout waiting on controller reset in sdhci_omap_reset
    [ 43.536926] sdhci-omap 481d8000.mmc: Timeout waiting on controller reset in sdhci_omap_reset
    [ 53.602515] mmc1: Timeout waiting for hardware cmd interrupt.
    [ 53.608308] mmc1: sdhci: ============ SDHCI REGISTER DUMP ===========
    [ 53.614794] mmc1: sdhci: Sys addr: 0x00000000 | Version: 0x00003101
    [ 53.621276] mmc1: sdhci: Blk size: 0x00000000 | Blk cnt: 0x00000000
    [ 53.627756] mmc1: sdhci: Argument: 0x00010000 | Trn mode: 0x00000000
    [ 53.634236] mmc1: sdhci: Present: 0x01f70000 | Host ctl: 0x00000000
    [ 53.640715] mmc1: sdhci: Power: 0x00000000 | Blk gap: 0x00000000
    [ 53.647194] mmc1: sdhci: Wake-up: 0x00000000 | Clock: 0x00000000
    [ 53.653673] mmc1: sdhci: Timeout: 0x00000000 | Int stat: 0x00000000
    [ 53.660153] mmc1: sdhci: Int enab: 0x007f0003 | Sig enab: 0x007f0003
    [ 53.666632] mmc1: sdhci: ACmd stat: 0x00000000 | Slot int: 0x00000000
    [ 53.673112] mmc1: sdhci: Caps: 0x05e10080 | Caps_1: 0x00000000
    [ 53.679591] mmc1: sdhci: Cmd: 0x00000d1a | Max curr: 0x00000000
    [ 53.686070] mmc1: sdhci: Resp[0]: 0x00000000 | Resp[1]: 0x00000000
    [ 53.692549] mmc1: sdhci: Resp[2]: 0x00000000 | Resp[3]: 0x00000000
    [ 53.699026] mmc1: sdhci: Host ctl2: 0x00000000
    [ 53.703497] mmc1: sdhci: ============================================
    [ 53.739221] sdhci-omap 481d8000.mmc: Timeout waiting on controller reset in sdhci_omap_reset
    [ 53.776899] sdhci-omap 481d8000.mmc: Timeout waiting on controller reset in sdhci_omap_reset
    [ 63.842515] mmc1: Timeout waiting for hardware cmd interrupt.
    [ 63.848307] mmc1: sdhci: ============ SDHCI REGISTER DUMP ===========
    [ 63.854791] mmc1: sdhci: Sys addr: 0x00000000 | Version: 0x00003101
    [ 63.861273] mmc1: sdhci: Blk size: 0x00000000 | Blk cnt: 0x00000000
    [ 63.867754] mmc1: sdhci: Argument: 0x00010000 | Trn mode: 0x00000000
    [ 63.874235] mmc1: sdhci: Present: 0x01f70000 | Host ctl: 0x00000000
    [ 63.880713] mmc1: sdhci: Power: 0x00000000 | Blk gap: 0x00000000
    [ 63.887193] mmc1: sdhci: Wake-up: 0x00000000 | Clock: 0x00000000
    [ 63.893673] mmc1: sdhci: Timeout: 0x00000000 | Int stat: 0x00000000
    [ 63.900153] mmc1: sdhci: Int enab: 0x007f0003 | Sig enab: 0x007f0003
    [ 63.906632] mmc1: sdhci: ACmd stat: 0x00000000 | Slot int: 0x00000000
    [ 63.913112] mmc1: sdhci: Caps: 0x05e10080 | Caps_1: 0x00000000
    [ 63.919592] mmc1: sdhci: Cmd: 0x00000d1a | Max curr: 0x00000000
    [ 63.926071] mmc1: sdhci: Resp[0]: 0x00000000 | Resp[1]: 0x00000000
    [ 63.932551] mmc1: sdhci: Resp[2]: 0x00000000 | Resp[3]: 0x00000000
    [ 63.939028] mmc1: sdhci: Host ctl2: 0x00000000
    [ 63.943499] mmc1: sdhci: ============================================
    [ 63.979238] sdhci-omap 481d8000.mmc: Timeout waiting on controller reset in sdhci_omap_reset
    [ 64.016913] sdhci-omap 481d8000.mmc: Timeout waiting on controller reset in sdhci_omap_reset
    [ 64.025605] mmc1: card 0001 removed

  • Hi Pawel,

    One more observation, today when issue was reproduced, there was an additional line at the end:

    [ 64.025605] mmc1: card 0001 removed

    Is this line printed in the log in the normal cases when reboot successfully?

  • Hi Bin

    Reboot was unsuccessful. Device was halted and required power off, the same like in previous cases.

  • Hi Pawel,

    I meant when the timeout issue doesn't happen, is this line printed in the Linux shutdown log?

  • Hi Bin

    No, during successful reboot it is not printed:

    Sending all processes the TERM signal...
    logout
    Sending all processes the KILL signal...
    Deactivating swap...
    Unmounting local filesystems...
    [ 1587.586922] EXT4-fs (dm-0): re-mounted. Opts: (null)
    Rebooting... [ 1587.885399] reboot: Restarting system
    CCCCCCCC

  • Thanks Pawel,

    Can you please apply the following kernel debug patch and re-run the test on your platform? It should dump the kernel call stack which likely will show how this card remove routine it triggered. Please provide the kernel shutdown log if the timeout issue happens and the call stack is dumped in the log.

    diff --git a/drivers/mmc/core/bus.c b/drivers/mmc/core/bus.c
    index 4383c262b3f5..5aafea731a38 100644
    --- a/drivers/mmc/core/bus.c
    +++ b/drivers/mmc/core/bus.c
    @@ -406,6 +406,7 @@ void mmc_remove_card(struct mmc_card *card)
                    } else {
                            pr_info("%s: card %04x removed\n",
                                    mmc_hostname(card->host), card->rca);
    +                       dump_stack();
                    }
                    device_del(&card->dev);
                    of_node_put(card->dev.of_node);

  • No problem, here is the shutdown log with the call stack:

    Sending all processes the TERM signal...
    Sending all processes the KILL signal...
    Deactivating swap...
    Unmounting local filesystems...
    [   40.692484] EXT4-fs (dm-0): re-mounted. Opts: (null)
    Rebooting... [   51.043986] mmc1: Timeout waiting for hardware cmd interrupt.
    [   51.049801] mmc1: sdhci: ============ SDHCI REGISTER DUMP ===========
    [   51.056287] mmc1: sdhci: Sys addr:  0x00000000 | Version:  0x00003101
    [   51.062769] mmc1: sdhci: Blk size:  0x00000000 | Blk cnt:  0x00000000
    [   51.069249] mmc1: sdhci: Argument:  0x00010000 | Trn mode: 0x00000000
    [   51.075729] mmc1: sdhci: Present:   0x01f70000 | Host ctl: 0x00000000
    [   51.082209] mmc1: sdhci: Power:     0x00000000 | Blk gap:  0x00000000
    [   51.088688] mmc1: sdhci: Wake-up:   0x00000000 | Clock:    0x00000000
    [   51.095168] mmc1: sdhci: Timeout:   0x00000000 | Int stat: 0x00000000
    [   51.101648] mmc1: sdhci: Int enab:  0x007f0003 | Sig enab: 0x007f0003
    [   51.108128] mmc1: sdhci: ACmd stat: 0x00000000 | Slot int: 0x00000000
    [   51.114608] mmc1: sdhci: Caps:      0x05e10080 | Caps_1:   0x00000000
    [   51.121088] mmc1: sdhci: Cmd:       0x00000d1a | Max curr: 0x00000000
    [   51.127568] mmc1: sdhci: Resp[0]:   0x00000000 | Resp[1]:  0x00000000
    [   51.134047] mmc1: sdhci: Resp[2]:   0x00000000 | Resp[3]:  0x00000000
    [   51.140524] mmc1: sdhci: Host ctl2: 0x00000000
    [   51.144997] mmc1: sdhci: ============================================
    [   51.181055] sdhci-omap 481d8000.mmc: Timeout waiting on controller reset in sdhci_omap_reset
    [   51.218764] sdhci-omap 481d8000.mmc: Timeout waiting on controller reset in sdhci_omap_reset
    [   61.284005] mmc1: Timeout waiting for hardware cmd interrupt.
    [   61.289804] mmc1: sdhci: ============ SDHCI REGISTER DUMP ===========
    [   61.296289] mmc1: sdhci: Sys addr:  0x00000000 | Version:  0x00003101
    [   61.302770] mmc1: sdhci: Blk size:  0x00000000 | Blk cnt:  0x00000000
    [   61.309252] mmc1: sdhci: Argument:  0x00010000 | Trn mode: 0x00000000
    [   61.315733] mmc1: sdhci: Present:   0x01f70000 | Host ctl: 0x00000000
    [   61.322213] mmc1: sdhci: Power:     0x00000000 | Blk gap:  0x00000000
    [   61.328693] mmc1: sdhci: Wake-up:   0x00000000 | Clock:    0x00000000
    [   61.335173] mmc1: sdhci: Timeout:   0x00000000 | Int stat: 0x00000000
    [   61.341655] mmc1: sdhci: Int enab:  0x007f0003 | Sig enab: 0x007f0003
    [   61.348134] mmc1: sdhci: ACmd stat: 0x00000000 | Slot int: 0x00000000
    [   61.354614] mmc1: sdhci: Caps:      0x05e10080 | Caps_1:   0x00000000
    [   61.361094] mmc1: sdhci: Cmd:       0x00000d1a | Max curr: 0x00000000
    [   61.367574] mmc1: sdhci: Resp[0]:   0x00000000 | Resp[1]:  0x00000000
    [   61.374053] mmc1: sdhci: Resp[2]:   0x00000000 | Resp[3]:  0x00000000
    [   61.380530] mmc1: sdhci: Host ctl2: 0x00000000
    [   61.385003] mmc1: sdhci: ============================================
    [   61.420751] sdhci-omap 481d8000.mmc: Timeout waiting on controller reset in sdhci_omap_reset
    [   61.458450] sdhci-omap 481d8000.mmc: Timeout waiting on controller reset in sdhci_omap_reset
    [   71.523985] mmc1: Timeout waiting for hardware cmd interrupt.
    [   71.529781] mmc1: sdhci: ============ SDHCI REGISTER DUMP ===========
    [   71.536267] mmc1: sdhci: Sys addr:  0x00000000 | Version:  0x00003101
    [   71.542748] mmc1: sdhci: Blk size:  0x00000000 | Blk cnt:  0x00000000
    [   71.549228] mmc1: sdhci: Argument:  0x00010000 | Trn mode: 0x00000000
    [   71.555709] mmc1: sdhci: Present:   0x01f70000 | Host ctl: 0x00000000
    [   71.562188] mmc1: sdhci: Power:     0x00000000 | Blk gap:  0x00000000
    [   71.568667] mmc1: sdhci: Wake-up:   0x00000000 | Clock:    0x00000000
    [   71.575146] mmc1: sdhci: Timeout:   0x00000000 | Int stat: 0x00000000
    [   71.581626] mmc1: sdhci: Int enab:  0x007f0003 | Sig enab: 0x007f0003
    [   71.588105] mmc1: sdhci: ACmd stat: 0x00000000 | Slot int: 0x00000000
    [   71.594585] mmc1: sdhci: Caps:      0x05e10080 | Caps_1:   0x00000000
    [   71.601065] mmc1: sdhci: Cmd:       0x00000d1a | Max curr: 0x00000000
    [   71.607544] mmc1: sdhci: Resp[0]:   0x00000000 | Resp[1]:  0x00000000
    [   71.614023] mmc1: sdhci: Resp[2]:   0x00000000 | Resp[3]:  0x00000000
    [   71.620500] mmc1: sdhci: Host ctl2: 0x00000000
    [   71.624971] mmc1: sdhci: ============================================
    [   71.660715] sdhci-omap 481d8000.mmc: Timeout waiting on controller reset in sdhci_omap_reset
    [   71.698414] sdhci-omap 481d8000.mmc: Timeout waiting on controller reset in sdhci_omap_reset
    [   81.763987] mmc1: Timeout waiting for hardware cmd interrupt.
    [   81.769779] mmc1: sdhci: ============ SDHCI REGISTER DUMP ===========
    [   81.776262] mmc1: sdhci: Sys addr:  0x00000000 | Version:  0x00003101
    [   81.782743] mmc1: sdhci: Blk size:  0x00000000 | Blk cnt:  0x00000000
    [   81.789224] mmc1: sdhci: Argument:  0x00010000 | Trn mode: 0x00000000
    [   81.795704] mmc1: sdhci: Present:   0x01f70000 | Host ctl: 0x00000000
    [   81.802184] mmc1: sdhci: Power:     0x00000000 | Blk gap:  0x00000000
    [   81.808664] mmc1: sdhci: Wake-up:   0x00000000 | Clock:    0x00000000
    [   81.815142] mmc1: sdhci: Timeout:   0x00000000 | Int stat: 0x00000000
    [   81.821621] mmc1: sdhci: Int enab:  0x007f0003 | Sig enab: 0x007f0003
    [   81.828100] mmc1: sdhci: ACmd stat: 0x00000000 | Slot int: 0x00000000
    [   81.834579] mmc1: sdhci: Caps:      0x05e10080 | Caps_1:   0x00000000
    [   81.841059] mmc1: sdhci: Cmd:       0x00000d1a | Max curr: 0x00000000
    [   81.847538] mmc1: sdhci: Resp[0]:   0x00000000 | Resp[1]:  0x00000000
    [   81.854018] mmc1: sdhci: Resp[2]:   0x00000000 | Resp[3]:  0x00000000
    [   81.860495] mmc1: sdhci: Host ctl2: 0x00000000
    [   81.864966] mmc1: sdhci: ============================================
    [   81.900700] sdhci-omap 481d8000.mmc: Timeout waiting on controller reset in sdhci_omap_reset
    [   81.938397] sdhci-omap 481d8000.mmc: Timeout waiting on controller reset in sdhci_omap_reset
    [   81.947059] mmc1: card 0001 removed
    [   81.950578] Printing stack
    [   81.950605] CPU: 0 PID: 68 Comm: kworker/0:4 Not tainted 5.10.153-g3eee621d16 #1
    [   81.961391] Hardware name: Generic AM43 (Flattened Device Tree)
    [   81.967456] Workqueue: events_freezable mmc_rescan
    [   81.972341] [<c010d058>] (unwind_backtrace) from [<c010a218>] (show_stack+0x10/0x14)
    [   81.980221] [<c010a218>] (show_stack) from [<c072269c>] (mmc_remove_card+0xb8/0xbc)
    [   81.987985] [<c072269c>] (mmc_remove_card) from [<c0724060>] (mmc_detect+0x3c/0x6c)
    [   81.995728] [<c0724060>] (mmc_detect) from [<c07217a8>] (mmc_rescan+0xec/0x53c)
    [   82.003122] [<c07217a8>] (mmc_rescan) from [<c014139c>] (process_one_work+0x1b8/0x450)
    [   82.011138] [<c014139c>] (process_one_work) from [<c014168c>] (worker_thread+0x58/0x5cc)
    [   82.019332] [<c014168c>] (worker_thread) from [<c0146224>] (kthread+0x140/0x184)
    [   82.026815] [<c0146224>] (kthread) from [<c0100148>] (ret_from_fork+0x14/0x2c)
    [   82.034105] Exception stack(0xc2c4ffb0 to 0xc2c4fff8)
    [   82.039200] ffa0:                                     00000000 00000000 00000000 00000000
    [   82.047460] ffc0: 00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000
    [   82.055725] ffe0: 00000000 00000000 00000000 00000000 00000013 00000000

  • Hi Pawel,

    Thanks for the log. It shows the mmc removal is called in a kernel WORK. Can you please add the following kernel patch and capture the failure again? The log would show the call stack which schedules the kernel WORK. The previous kernel debug patch is no longer needed, you can remove it if you want to.

    diff --git a/drivers/mmc/core/core.c b/drivers/mmc/core/core.c
    index eb82f6aac951..722651e445b6 100644
    --- a/drivers/mmc/core/core.c
    +++ b/drivers/mmc/core/core.c
    @@ -1466,6 +1466,7 @@ void _mmc_detect_change(struct mmc_host *host, unsigned long delay, bool cd_irq)
                    __pm_wakeup_event(host->ws, 5000);
    
            host->detect_change = 1;
    +       dump_stack();
            mmc_schedule_delayed_work(&host->detect, delay);
     }
    

  • Hi Pawel,

    Just an update on my side - I have done testing the following 3 scenarios on AM437x GP EVM, each ran for 12-24 hours, but didn't observe any reboot issue, no mmc timeout.

    - rootfs on NFS, mount/umount SD card 2nd partition (ext4);

    - rootfs on SD card, reboot -f;

    - rootfs on eMMC, reboot -f.

    I think we probably should look at your system in hardware perspective as well, but not sure where to start yet.

  • Hi Bin

    Thank you for the update, our test on AM437x GP EVM is planned for tomorrow. Regarding hardware I think we can provide you through the local support, the board schematic, Linux kernel config and dts file.

    I've added dump_stack() to the _mmc_detect_change function in core.c, but I see calls of this function only during startup. I don't see it in shutdown for both failure and normal modes:

    CCCCCCCC
    U-Boot SPL 2021.10-gd80bb749fa (Oct 04 2021 - 15:09:26 +0000)
    Trying to boot from MMC1
    
    
    U-Boot 2021.10-gd80bb749fa (Oct 04 2021 - 15:09:26 +0000)
    
    CPU  : AM437X-GP rev 1.2
    Model: xxx
    DRAM:  1 GiB
    MMC:   OMAP SD/MMC: 0
    Loading Environment from FAT... OK
    Loading Environment from nowhere... OK
    In:    serial@44e09000
    Out:   serial@44e09000
    Err:   serial@44e09000
    Net:   eth2: ethernet@4a100000
    Hit any key to stop autoboot:  0 
    The rootfspart variable was imported from FAT (2)
    12907008 bytes read in 730 ms (16.9 MiB/s)
    72376 bytes read in 7 ms (9.9 MiB/s)
    Kernel image @ 0x82000000 [ 0x000000 - 0xc4f200 ]
    ## Flattened Device Tree blob at 88000000
       Booting using the fdt blob at 0x88000000
       Loading Device Tree to bdf55000, end bdf69ab7 ... OK
    
    Starting kernel ...
    
    [    0.000000] Booting Linux on physical CPU 0x0
    [    0.000000] Linux version 5.10.153-g3eee621d16 (oe-user@oe-host) (arm-poky-linux-gnueabi-gcc (GCC) 9.5.0, GNU ld (GNU Binutils) 2.34.0.20200910) #1 PREEMPT Fri Dec 2 16:30:46 UTC 2022
    [    0.000000] CPU: ARMv7 Processor [412fc09a] revision 10 (ARMv7), cr=10c53c7d
    [    0.000000] CPU: PIPT / VIPT nonaliasing data cache, VIPT aliasing instruction cache
    [    0.000000] OF: fdt: Machine model: xxx
    [    0.000000] Memory policy: Data cache writeback
    [    0.000000] efi: UEFI not found.
    [    0.000000] cma: Reserved 48 MiB at 0xbac00000
    [    0.000000] Zone ranges:
    [    0.000000]   Normal   [mem 0x0000000080000000-0x00000000afffffff]
    [    0.000000]   HighMem  [mem 0x00000000b0000000-0x00000000bfffffff]
    [    0.000000] Movable zone start for each node
    [    0.000000] Early memory node ranges
    [    0.000000]   node   0: [mem 0x0000000080000000-0x00000000bfffffff]
    [    0.000000] Initmem setup node 0 [mem 0x0000000080000000-0x00000000bfffffff]
    [    0.000000] CPU: All CPU(s) started in SVC mode.
    [    0.000000] AM437x ES1.2 (neon)
    [    0.000000] Built 1 zonelists, mobility grouping on.  Total pages: 260416
    [    0.000000] Kernel command line: console=ttyS0,115200n8 root=/dev/mmcblk1p2 rootwait ro
    [    0.000000] Dentry cache hash table entries: 131072 (order: 7, 524288 bytes, linear)
    [    0.000000] Inode-cache hash table entries: 65536 (order: 6, 262144 bytes, linear)
    [    0.000000] mem auto-init: stack:off, heap alloc:off, heap free:off
    [    0.000000] Memory: 961520K/1048576K available (9216K kernel code, 672K rwdata, 3088K rodata, 13312K init, 277K bss, 37904K reserved, 49152K cma-reserved, 212920K highmem)
    [    0.000000] SLUB: HWalign=64, Order=0-3, MinObjects=0, CPUs=1, Nodes=1
    [    0.000000] rcu: Preemptible hierarchical RCU implementation.
    [    0.000000] rcu:     RCU event tracing is enabled.
    [    0.000000]  Trampoline variant of Tasks RCU enabled.
    [    0.000000] rcu: RCU calculated value of scheduler-enlistment delay is 10 jiffies.
    [    0.000000] NR_IRQS: 16, nr_irqs: 16, preallocated irqs: 16
    [    0.000000] L2C: platform modifies aux control register: 0x0e030000 -> 0x3e430000
    [    0.000000] L2C: DT/platform modifies aux control register: 0x0e030000 -> 0x3e430000
    [    0.000000] L2C-310 enabling early BRESP for Cortex-A9
    [    0.000000] OMAP L2C310: ROM does not support power control setting
    [    0.000000] L2C-310 dynamic clock gating disabled, standby mode disabled
    [    0.000000] L2C-310 cache controller enabled, 16 ways, 256 kB
    [    0.000000] L2C-310: CACHE_ID 0x410000c9, AUX_CTRL 0x4e430000
    [    0.000000] TI gptimer clocksource: always-on /ocp@44000000/interconnect@44c00000/segment@200000/target-module@31000
    [    0.000011] sched_clock: 32 bits at 25MHz, resolution 40ns, wraps every 85899345900ns
    [    0.000030] clocksource: dmtimer: mask: 0xffffffff max_cycles: 0xffffffff, max_idle_ns: 76450417870 ns
    [    0.000160] OMAP clocksource: 32k_counter at 32768 Hz
    [    0.000176] clocksource: 32k_counter: mask: 0xffffffff max_cycles: 0xffffffff, max_idle_ns: 58327039986419 ns
    [    0.000456] TI gptimer clockevent: 25000000 Hz at /ocp@44000000/interconnect@48000000/segment@0/target-module@40000
    [    0.001733] Console: colour dummy device 80x30
    [    0.001797] Calibrating delay loop... 1594.16 BogoMIPS (lpj=7970816)
    [    0.087723] pid_max: default: 32768 minimum: 301
    [    0.087955] Mount-cache hash table entries: 2048 (order: 1, 8192 bytes, linear)
    [    0.087984] Mountpoint-cache hash table entries: 2048 (order: 1, 8192 bytes, linear)
    [    0.088984] CPU: Testing write buffer coherency: ok
    [    0.089053] CPU0: Spectre v2: using BPIALL workaround
    [    0.090156] Setting up static identity map for 0x80100000 - 0x80100060
    [    0.090330] rcu: Hierarchical SRCU implementation.
    [    0.090449] EFI services will not be available.
    [    0.091004] devtmpfs: initialized
    [    0.104130] VFP support v0.3: implementor 41 architecture 3 part 30 variant 9 rev 4
    [    0.104425] clocksource: jiffies: mask: 0xffffffff max_cycles: 0xffffffff, max_idle_ns: 19112604462750000 ns
    [    0.104453] futex hash table entries: 256 (order: -1, 3072 bytes, linear)
    [    0.108287] pinctrl core: initialized pinctrl subsystem
    [    0.109256] DMI not present or invalid.
    [    0.110083] NET: Registered protocol family 16
    [    0.111922] DMA: preallocated 256 KiB pool for atomic coherent allocations
    [    0.120434] thermal_sys: Registered thermal governor 'fair_share'
    [    0.120448] thermal_sys: Registered thermal governor 'bang_bang'
    [    0.120465] thermal_sys: Registered thermal governor 'step_wise'
    [    0.120473] thermal_sys: Registered thermal governor 'user_space'
    [    0.120926] cpuidle: using governor ladder
    [    0.120965] cpuidle: using governor menu
    [    0.945259] No ATAGs?
    [    0.945298] hw-breakpoint: found 5 (+1 reserved) breakpoint and 1 watchpoint registers.
    [    0.945310] hw-breakpoint: maximum watchpoint size is 4 bytes.
    [    0.973740] iommu: Default domain type: Translated 
    [    0.976144] SCSI subsystem initialized
    [    0.976793] pps_core: LinuxPPS API ver. 1 registered
    [    0.976810] pps_core: Software ver. 5.3.6 - Copyright 2005-2007 Rodolfo Giometti <giometti@linux.it>
    [    0.976838] PTP clock support registered
    [    0.976907] EDAC MC: Ver: 3.0.0
    [    0.979709] clocksource: Switched to clocksource dmtimer
    [    0.987552] NET: Registered protocol family 2
    [    0.988385] IP idents hash table entries: 16384 (order: 5, 131072 bytes, linear)
    [    0.989897] tcp_listen_portaddr_hash hash table entries: 512 (order: 0, 4096 bytes, linear)
    [    0.989959] TCP established hash table entries: 8192 (order: 3, 32768 bytes, linear)
    [    0.990033] TCP bind hash table entries: 8192 (order: 3, 32768 bytes, linear)
    [    0.990126] TCP: Hash tables configured (established 8192 bind 8192)
    [    0.990313] UDP hash table entries: 512 (order: 1, 8192 bytes, linear)
    [    0.990345] UDP-Lite hash table entries: 512 (order: 1, 8192 bytes, linear)
    [    0.990519] NET: Registered protocol family 1
    [    0.991204] RPC: Registered named UNIX socket transport module.
    [    0.991224] RPC: Registered udp transport module.
    [    0.991232] RPC: Registered tcp transport module.
    [    0.991239] RPC: Registered tcp NFSv4.1 backchannel transport module.
    [    1.080690] Initialise system trusted keyrings
    [    1.081004] workingset: timestamp_bits=14 max_order=18 bucket_order=4
    [    1.084999] squashfs: version 4.0 (2009/01/31) Phillip Lougher
    [    1.085937] NFS: Registering the id_resolver key type
    [    1.086015] Key type id_resolver registered
    [    1.086024] Key type id_legacy registered
    [    1.086111] nfs4filelayout_init: NFSv4 File Layout Driver Registering...
    [    1.086133] nfs4flexfilelayout_init: NFSv4 Flexfile Layout Driver Registering...
    [    1.086177] ntfs: driver 2.1.32 [Flags: R/O].
    [    1.086981] Key type asymmetric registered
    [    1.087003] Asymmetric key parser 'x509' registered
    [    1.087065] bounce: pool size: 64 pages
    [    1.087124] Block layer SCSI generic (bsg) driver version 0.4 loaded (major 244)
    [    1.087138] io scheduler mq-deadline registered
    [    1.087147] io scheduler kyber registered
    [    1.217042] OMAP GPIO hardware version 0.1
    [    1.312731] ti-sysc: probe of 44e31000.target-module failed with error -16
    [    1.358012] ti-sysc: probe of 48040000.target-module failed with error -16
    [    1.393839] omap-mailbox 480c8000.mailbox: omap mailbox rev 0x400
    [    1.513227] debugfs: Directory '49000000.dma' with parent 'dmaengine' already present!
    [    1.513269] edma 49000000.dma: TI EDMA DMA engine driver
    [    1.524852] pinctrl-single 44e10800.pinmux: 199 pins, size 796
    [    1.575227] Serial: 8250/16550 driver, 6 ports, IRQ sharing enabled
    [    1.578206] omap8250 44e09000.serial: No clock speed specified: using default: 48000000
    [    1.579169] printk: console [ttyS0] disabled
    [    1.579352] 44e09000.serial: ttyS0 at MMIO 0x44e09000 (irq = 56, base_baud = 3000000) is a 8250
    [    2.279539] printk: console [ttyS0] enabled
    [    2.284887] omap8250 481a6000.serial: No clock speed specified: using default: 48000000
    [    2.294026] 481a6000.serial: ttyS3 at MMIO 0x481a6000 (irq = 68, base_baud = 3000000) is a 8250
    [    2.303845] omap8250 481a8000.serial: No clock speed specified: using default: 48000000
    [    2.313009] 481a8000.serial: ttyS4 at MMIO 0x481a8000 (irq = 69, base_baud = 3000000) is a 8250
    [    2.322774] omap8250 481aa000.serial: No clock speed specified: using default: 48000000
    [    2.331906] 481aa000.serial: ttyS5 at MMIO 0x481aa000 (irq = 70, base_baud = 3000000) is a 8250
    [    2.344004] omap_rng 48310000.rng: Random Number Generator ver. 20
    [    2.350529] random: crng init done
    [    2.374717] loop: module loaded
    [    2.387688] tpm_tis_spi spi1.0: 2.0 TPM (device-id 0x1B, rev-id 22)
    [    2.449767] davinci_mdio 4a101000.mdio: davinci mdio revision 1.6, bus freq 1000000
    [    2.466397] davinci_mdio 4a101000.mdio: phy[2]: device 4a101000.mdio:02, driver unknown
    [    2.475331] cpsw-switch 4a100000.switch: initialized cpsw ale version 1.4
    [    2.482263] cpsw-switch 4a100000.switch: ALE Table size 1024
    [    2.488058] cpsw-switch 4a100000.switch: cpts: overflow check period 500 (jiffies)
    [    2.495823] cpsw-switch 4a100000.switch: CPTS: ref_clk_freq:250000000 calc_mult:2147483648 calc_shift:29 error:0 nsec/sec
    [    2.506954] cpsw-switch 4a100000.switch: Detected MACID = 48:70:1e:bd:aa:6f
    [    2.514909] cpsw-switch 4a100000.switch: initialized (regs 0x4a100000, pool size 256) hw_ver:0019010F 1.15 (0)
    [    2.526630] rtc-pcf2123 spi2.0: spiclk 1000 KHz.
    [    2.532094] rtc-pcf2123 spi2.0: registered as rtc1
    [    2.537016] i2c /dev entries driver
    [    2.542927] device-mapper: ioctl: 4.43.0-ioctl (2020-10-01) initialised: dm-devel@redhat.com
    [    2.553146] cpuidle: enable-method property 'ti,am4372' found operations
    [    2.561734] sdhci: Secure Digital Host Controller Interface driver
    [    2.567952] sdhci: Copyright(c) Pierre Ossman
    [    2.573741] sdhci-pltfm: SDHCI platform and OF driver helper
    [    2.581544] omap_gpio 44e07000.gpio: Could not set line 6 debounce to 200000 microseconds (-22)
    [    2.593354] sdhci-omap 481d8000.mmc: supply vqmmc not found, using dummy regulator
    [    2.604996] NET: Registered protocol family 10
    [    2.610651] sdhci-omap 47810000.mmc: allocated mmc-pwrseq
    [    2.616211] sdhci-omap 47810000.mmc: supply vqmmc not found, using dummy regulator
    [    2.623918] sdhci-omap 48060000.mmc: Got CD GPIO
    [    2.628752] sdhci-omap 48060000.mmc: supply vqmmc not found, using dummy regulator
    [    2.650892] Segment Routing with IPv6
    [    2.654713] sit: IPv6, IPv4 and MPLS over IPv4 tunneling driver
    [    2.670660] NET: Registered protocol family 17
    [    2.676075] Key type dns_resolver registered
    [    2.687579] omap_voltage_late_init: Voltage driver support not added
    [    2.694135] _mmc_detec_change
    [    2.694155] CPU: 0 PID: 59 Comm: kworker/u2:2 Not tainted 5.10.153-g3eee621d16 #1
    [    2.706193] Loading compiled-in X.509 certificates
    [    2.711072] Hardware name: Generic AM43 (Flattened Device Tree)
    [    2.717043] Workqueue: events_unbound async_run_entry_fn
    [    2.730027] _mmc_detec_change
    [    2.730052] CPU: 0 PID: 53 Comm: kworker/u2:1 Not tainted 5.10.153-g3eee621d16 #1
    [    2.741149] _mmc_detec_change
    [    2.741161] CPU: 0 PID: 7 Comm: kworker/u2:0 Not tainted 5.10.153-g3eee621d16 #1
    [    2.760197] [<c010d058>] (unwind_backtrace) from [<c010a218>] (show_stack+0x10/0x14)
    [    2.768016] [<c010a218>] (show_stack) from [<c09f1724>] (_mmc_detect_change+0x48/0x68)
    [    2.777538] Hardware name: Generic AM43 (Flattened Device Tree)
    [    2.783582] Hardware name: Generic AM43 (Flattened Device Tree)
    [    2.789552] Workqueue: events_unbound async_run_entry_fn
    [    2.794922] Workqueue: events_unbound async_run_entry_fn
    [    2.800364] [<c010d058>] (unwind_backtrace) from [<c010a218>] (show_stack+0x10/0x14)
    [    2.808214] [<c09f1724>] (_mmc_detect_change) from [<c07222f8>] (mmc_add_host+0x4c/0x6c)
    [    2.816445] [<c010d058>] (unwind_backtrace) from [<c010a218>] (show_stack+0x10/0x14)
    [    2.824279] [<c010a218>] (show_stack) from [<c09f1724>] (_mmc_detect_change+0x48/0x68)
    [    2.832487] [<c07222f8>] (mmc_add_host) from [<c07371b0>] (__sdhci_add_host+0x180/0x318)
    [    2.840799] [<c010a218>] (show_stack) from [<c09f1724>] (_mmc_detect_change+0x48/0x68)
    [    2.848806] [<c09f1724>] (_mmc_detect_change) from [<c07222f8>] (mmc_add_host+0x4c/0x6c)
    [    2.857019] tps65218-pwrbutton: Failed to locate of_node [id: -2]
    [    2.863185] [<c07371b0>] (__sdhci_add_host) from [<c07441dc>] (sdhci_omap_probe+0x308/0x614)
    [    2.871696] [<c09f1724>] (_mmc_detect_change) from [<c07222f8>] (mmc_add_host+0x4c/0x6c)
    [    2.880141] tps65218-gpio: Failed to locate of_node [id: -2]
    [    2.886243] [<c07222f8>] (mmc_add_host) from [<c07371b0>] (__sdhci_add_host+0x180/0x318)
    [    2.894472] [<c07441dc>] (sdhci_omap_probe) from [<c06191a0>] (platform_drv_probe+0x48/0x98)
    [    2.903001] [<c07222f8>] (mmc_add_host) from [<c07371b0>] (__sdhci_add_host+0x180/0x318)
    [    2.911231] [<c07371b0>] (__sdhci_add_host) from [<c07441dc>] (sdhci_omap_probe+0x308/0x614)
    [    2.919763] [<c06191a0>] (platform_drv_probe) from [<c0617044>] (really_probe+0xf0/0x398)
    [    2.928038] [<c07371b0>] (__sdhci_add_host) from [<c07441dc>] (sdhci_omap_probe+0x308/0x614)
    [    2.936584] [<c07441dc>] (sdhci_omap_probe) from [<c06191a0>] (platform_drv_probe+0x48/0x98)
    [    2.945408] [<c0617044>] (really_probe) from [<c0617458>] (driver_probe_device+0x5c/0xb4)
    [    2.953698] [<c07441dc>] (sdhci_omap_probe) from [<c06191a0>] (platform_drv_probe+0x48/0x98)
    [    2.962316] [<c06191a0>] (platform_drv_probe) from [<c0617044>] (really_probe+0xf0/0x398)
    [    2.970584] [<c0617458>] (driver_probe_device) from [<c0617500>] (__driver_attach_async_helper+0x50/0x54)
    [    2.980273] [<c06191a0>] (platform_drv_probe) from [<c0617044>] (really_probe+0xf0/0x398)
    [    2.988538] [<c0617044>] (really_probe) from [<c0617458>] (driver_probe_device+0x5c/0xb4)
    [    2.997069] [<c0617500>] (__driver_attach_async_helper) from [<c014a554>] (async_run_entry_fn+0x44/0x140)
    [    3.006785] [<c0617044>] (really_probe) from [<c0617458>] (driver_probe_device+0x5c/0xb4)
    [    3.015089] [<c0617458>] (driver_probe_device) from [<c0617500>] (__driver_attach_async_helper+0x50/0x54)
    [    3.024768] [<c0617458>] (driver_probe_device) from [<c0617500>] (__driver_attach_async_helper+0x50/0x54)
    [    3.034464] [<c014a554>] (async_run_entry_fn) from [<c014139c>] (process_one_work+0x1b8/0x450)
    [    3.043184] [<c0617500>] (__driver_attach_async_helper) from [<c014a554>] (async_run_entry_fn+0x44/0x140)
    [    3.052929] [<c0617500>] (__driver_attach_async_helper) from [<c014a554>] (async_run_entry_fn+0x44/0x140)
    [    3.062597] [<c014139c>] (process_one_work) from [<c014168c>] (worker_thread+0x58/0x5cc)
    [    3.071032] [<c014a554>] (async_run_entry_fn) from [<c014139c>] (process_one_work+0x1b8/0x450)
    [    3.079752] [<c014168c>] (worker_thread) from [<c0146224>] (kthread+0x140/0x184)
    [    3.087469] [<c014a554>] (async_run_entry_fn) from [<c014139c>] (process_one_work+0x1b8/0x450)
    [    3.096188] [<c014139c>] (process_one_work) from [<c014168c>] (worker_thread+0x58/0x5cc)
    [    3.104399] [<c0146224>] (kthread) from [<c0100148>] (ret_from_fork+0x14/0x2c)
    [    3.111737] [<c014139c>] (process_one_work) from [<c014168c>] (worker_thread+0x58/0x5cc)
    [    3.119934] [<c014168c>] (worker_thread) from [<c0146224>] (kthread+0x140/0x184)
    [    3.127404] Exception stack(0xc27fdfb0 to 0xc27fdff8)
    [    3.132803] [<c014168c>] (worker_thread) from [<c0146224>] (kthread+0x140/0x184)
    [    3.140314] dfa0:                                     00000000 00000000 00000000 00000000
    [    3.148587] [<c0146224>] (kthread) from [<c0100148>] (ret_from_fork+0x14/0x2c)
    [    3.155910] [<c0146224>] (kthread) from [<c0100148>] (ret_from_fork+0x14/0x2c)
    [    3.163457] Exception stack(0xc2641fb0 to 0xc2641ff8)
    [    3.168564] dfc0: 00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000
    [    3.176855] 1fa0:                                     00000000 00000000 00000000 00000000
    [    3.185108] Exception stack(0xc1c9ffb0 to 0xc1c9fff8)
    [    3.190244] dfe0: 00000000 00000000 00000000 00000000 00000013 00000000
    [    3.196929] ffa0:                                     00000000 00000000 00000000 00000000
    [    3.205373] 1fc0: 00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000
    [    3.213692] mmc2: SDHCI controller on 47810000.mmc [47810000.mmc] using External DMA
    [    3.222108] omap_i2c 44e0b000.i2c: bus 0 rev0.12 at 100 kHz
    [    3.228579] ffc0: 00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000
    [    3.236893] 1fe0: 00000000 00000000 00000000 00000000 00000013 00000000
    [    3.244653] omap_i2c 4819c000.i2c: bus 2 rev0.12 at 100 kHz
    [    3.250672] ffe0: 00000000 00000000 00000000 00000000 00000013 00000000
    [    3.257382] mmc0: SDHCI controller on 48060000.mmc [48060000.mmc] using External DMA
    [    3.270214] mmc1: SDHCI controller on 481d8000.mmc [481d8000.mmc] using External DMA
    [    3.278034] omap_reset_deassert: timedout waiting for per:1
    [    3.284161] ti-sysc: probe of 54426000.target-module failed with error -110
    [    3.302504] omap_reset_deassert: timedout waiting for gfx:0
    [    3.308219] ti-sysc: probe of 5600fe00.target-module failed with error -110
    [    3.317177] mmc2: new high speed SDIO card at address 0001
    [    3.324516] input: gpio-keys as /devices/platform/gpio-keys/input/input0
    [    3.333478] cfg80211: Loading compiled-in X.509 certificates for regulatory database
    [    3.363002] cfg80211: Loaded X.509 cert 'sforshee: 00b28ddf47aef9cea7'
    [    3.370205] platform regulatory.0: Direct firmware load for regulatory.db failed with error -2
    [    3.385537] cfg80211: failed to load regulatory.db
    [    3.392460] mmc0: new high speed SDHC card at address 1234
    [    3.403983] mmcblk0: mmc0:1234 SA08G 7.21 GiB 
    [    3.422091] Freeing unused kernel memory: 13312K
    [    3.426998]  mmcblk0: p1
    [    3.433939] mmc1: new high speed MMC card at address 0001
    [    3.440451] mmcblk1: mmc1:0001 MMC04G 3.66 GiB 
    [    3.445527] mmcblk1boot0: mmc1:0001 MMC04G partition 1 1.00 MiB
    [    3.451801] Run /init as init process
    [    3.468010] mmcblk1boot1: mmc1:0001 MMC04G partition 2 1.00 MiB
    [    3.490162] mmcblk1rpmb: mmc1:0001 MMC04G partition 3 128 KiB, chardev (243:0)
    [    3.512011]  mmcblk1: p1 p2 p3 p4
    [    3.770244] udevd[159]: starting version 3.2.9
    [    3.791252] udevd[160]: starting eudev-3.2.9
    [    5.527487] device-mapper: verity: sha256 using implementation "sha256-generic"
    [    6.028885] EXT4-fs (dm-0): mounted filesystem with ordered data mode. Opts: (null)
    INIT: version 2.96 booting
    [    7.699611] EXT4-fs (mmcblk1p4): recovery complete
    [    7.706368] EXT4-fs (mmcblk1p4): mounted filesystem with ordered data mode. Opts: (null)
    Starting udev
    [    8.030946] udevd[234]: starting version 3.2.9
    [    8.220317] udevd[234]: specified user 'tss' unknown
    [    8.225714] udevd[234]: specified group 'tss' unknown
    [    8.237369] udevd[235]: starting eudev-3.2.9
    [    8.563942] omap_wdt: OMAP Watchdog Timer Rev 0x01: initial timeout 60 sec
    [    8.589156] omap_rtc 44e3e000.rtc: already running
    [    8.640516] omap_rtc 44e3e000.rtc: registered as rtc0
    [    8.645641] omap_rtc 44e3e000.rtc: setting system clock to 2022-12-09T15:24:21 UTC (1670599461)
    [    8.703308] wfx: loading out-of-tree module taints kernel.
    [    8.744476] wfx: Silicon Labs 2.13.2
    [    9.567018] wfx-sdio mmc2:0001:1: started firmware 3.16.0 "WF200_ASIC_WFM_(Jenkins)_FW3.16.0" (API: 3.11, keyset: C0, caps: 0x00000000)
    [    9.598088] omap-sham 53100000.sham: hw accel on OMAP rev 0.0
    [    9.633200] omap-aes 53501000.aes: OMAP AES hw accel rev: 0.1
    [    9.663661] omap-des 53701000.des: OMAP DES hw accel rev: 0.33
    [    9.686138] omap-sham 53100000.sham: will run requests pump with realtime priority
    [    9.700153] omap-aes 53501000.aes: will run requests pump with realtime priority
    [    9.722520] omap-des 53701000.des: will run requests pump with realtime priority
    [    9.745186] wfx-sdio mmc2:0001:1: MAC address 0: 84:fd:27:9e:d2:93
    [    9.766211] wfx-sdio mmc2:0001:1: MAC address 1: 84:fd:27:9e:d2:94
    [   10.804500] usbcore: registered new interface driver usbfs
    [   10.811207] usbcore: registered new interface driver hub
    [   10.817582] usbcore: registered new device driver usb
    [   10.862657] xhci-hcd xhci-hcd.4.auto: xHCI Host Controller
    [   10.868354] xhci-hcd xhci-hcd.4.auto: new USB bus registered, assigned bus number 1
    [   10.876884] xhci-hcd xhci-hcd.4.auto: hcc params 0x0238f06d hci version 0x100 quirks 0x0000002002010010
    [   10.892274] xhci-hcd xhci-hcd.4.auto: irq 100, io mem 0x483d0000
    [   10.898730] usb usb1: New USB device found, idVendor=1d6b, idProduct=0002, bcdDevice= 5.10
    [   10.907112] usb usb1: New USB device strings: Mfr=3, Product=2, SerialNumber=1
    [   10.914431] usb usb1: Product: xHCI Host Controller
    [   10.919333] usb usb1: Manufacturer: Linux 5.10.153-g3eee621d16 xhci-hcd
    [   10.926517] usb usb1: SerialNumber: xhci-hcd.4.auto
    [   10.933101] hub 1-0:1.0: USB hub found
    [   10.936944] hub 1-0:1.0: 1 port detected
    [   10.944426] xhci-hcd xhci-hcd.4.auto: xHCI Host Controller
    [   10.950091] xhci-hcd xhci-hcd.4.auto: new USB bus registered, assigned bus number 2
    [   10.957808] xhci-hcd xhci-hcd.4.auto: Host supports USB 3.0 SuperSpeed
    [   10.964540] usb usb2: We don't know the algorithms for LPM for this host, disabling LPM.
    [   10.972937] usb usb2: New USB device found, idVendor=1d6b, idProduct=0003, bcdDevice= 5.10
    [   10.981296] usb usb2: New USB device strings: Mfr=3, Product=2, SerialNumber=1
    [   10.988554] usb usb2: Product: xHCI Host Controller
    [   10.993467] usb usb2: Manufacturer: Linux 5.10.153-g3eee621d16 xhci-hcd
    [   11.000753] usb usb2: SerialNumber: xhci-hcd.4.auto
    [   11.007153] hub 2-0:1.0: USB hub found
    [   11.011074] hub 2-0:1.0: 1 port detected
    [   11.106896] EXT4-fs (dm-0): re-mounted. Opts: (null)
    INIT: Entering runlevel: 5
    Configuring network interfaces... [   13.099698] cpsw-switch 4a100000.switch: starting ndev. mode: dual_mac
    [   13.118873] Generic PHY 4a101000.mdio:02: attached PHY driver [Generic PHY] (mii_bus:phy_addr=4a101000.mdio:02, irq=POLL)
    [   13.133287] cpsw-switch 4a100000.switch eth0: Link is Up - 100Mbps/Full - flow control off
    Configuration file: /etc/hostapd.conf
    Using interface wlan1 with hwaddr 84:fd:27:9e:d2:94 and ssid "xxx"
    WPS: Converting push_button to virtual_push_button for WPS 2.0 compliance
    [   13.864347] IPv6: ADDRCONF(NETDEV_CHANGE): wlan1: link becomes ready
    [   14.010232] cryptd: max_cpu_qlen set to 1000
    wlan1: interface state UNINITIALIZED->ENABLED
    wlan1: AP-ENABLED 
    Running hostapd
    done.
    Starting system message bus: dbus.
    Starting random number generator daemon.
    Starting OpenBSD Secure Shell server: sshd
    done.
    Starting rpcbind daemon...done.
    Starting ntpd: done
    Starting syslogd/klogd: done
    Starting firewall
    Switching on green LED blinking
    INIT: Switching to runlevel: 6
    INIT: Sending processes configured via /etc/inittab the TERM signal
    Stopping OpenBSD Secure Shell server: sshd.
    Stopping system message bus: dbus.
    Stopping ntpd: done
    Stopping syslogd/klogd: done
    Stopping random number generator daemon.
    Unmounting remote filesystems...
    Stopping firewall
    Stopping rpcbind daemon...
    done.
    swupdate is not running; none killed.
    Deconfiguring network interfaces... [   37.500582] cpsw-switch 4a100000.switch eth0: Link is Down
    done.
    LED blinking not active
    Sending all processes the TERM signal...
    Sending all processes the KILL signal...
    Deactivating swap...
    Unmounting local filesystems...
    [   42.909098] EXT4-fs (dm-0): re-mounted. Opts: (null)
    Rebooting... [   53.599752] mmc1: Timeout waiting for hardware cmd interrupt.
    [   53.605564] mmc1: sdhci: ============ SDHCI REGISTER DUMP ===========
    [   53.612051] mmc1: sdhci: Sys addr:  0x00000000 | Version:  0x00003101
    [   53.618534] mmc1: sdhci: Blk size:  0x00000000 | Blk cnt:  0x00000000
    [   53.625016] mmc1: sdhci: Argument:  0x00010000 | Trn mode: 0x00000000
    [   53.631497] mmc1: sdhci: Present:   0x01f70000 | Host ctl: 0x00000000
    [   53.637977] mmc1: sdhci: Power:     0x00000000 | Blk gap:  0x00000000
    [   53.644457] mmc1: sdhci: Wake-up:   0x00000000 | Clock:    0x00000000
    [   53.650937] mmc1: sdhci: Timeout:   0x00000000 | Int stat: 0x00000000
    [   53.657418] mmc1: sdhci: Int enab:  0x007f0003 | Sig enab: 0x007f0003
    [   53.663897] mmc1: sdhci: ACmd stat: 0x00000000 | Slot int: 0x00000000
    [   53.670377] mmc1: sdhci: Caps:      0x05e10080 | Caps_1:   0x00000000
    [   53.676857] mmc1: sdhci: Cmd:       0x00000d1a | Max curr: 0x00000000
    [   53.683336] mmc1: sdhci: Resp[0]:   0x00000000 | Resp[1]:  0x00000000
    [   53.689815] mmc1: sdhci: Resp[2]:   0x00000000 | Resp[3]:  0x00000000
    [   53.696293] mmc1: sdhci: Host ctl2: 0x00000000
    [   53.700764] mmc1: sdhci: ============================================
    [   53.736780] sdhci-omap 481d8000.mmc: Timeout waiting on controller reset in sdhci_omap_reset
    [   53.774487] sdhci-omap 481d8000.mmc: Timeout waiting on controller reset in sdhci_omap_reset
    [   63.839751] mmc1: Timeout waiting for hardware cmd interrupt.
    [   63.845552] mmc1: sdhci: ============ SDHCI REGISTER DUMP ===========
    [   63.852036] mmc1: sdhci: Sys addr:  0x00000000 | Version:  0x00003101
    [   63.858518] mmc1: sdhci: Blk size:  0x00000000 | Blk cnt:  0x00000000
    [   63.865000] mmc1: sdhci: Argument:  0x00010000 | Trn mode: 0x00000000
    [   63.871480] mmc1: sdhci: Present:   0x01f70000 | Host ctl: 0x00000000
    [   63.877959] mmc1: sdhci: Power:     0x00000000 | Blk gap:  0x00000000
    [   63.884438] mmc1: sdhci: Wake-up:   0x00000000 | Clock:    0x00000000
    [   63.890918] mmc1: sdhci: Timeout:   0x00000000 | Int stat: 0x00000000
    [   63.897397] mmc1: sdhci: Int enab:  0x007f0003 | Sig enab: 0x007f0003
    [   63.903877] mmc1: sdhci: ACmd stat: 0x00000000 | Slot int: 0x00000000
    [   63.910357] mmc1: sdhci: Caps:      0x05e10080 | Caps_1:   0x00000000
    [   63.916835] mmc1: sdhci: Cmd:       0x00000d1a | Max curr: 0x00000000
    [   63.923314] mmc1: sdhci: Resp[0]:   0x00000000 | Resp[1]:  0x00000000
    [   63.929793] mmc1: sdhci: Resp[2]:   0x00000000 | Resp[3]:  0x00000000
    [   63.936270] mmc1: sdhci: Host ctl2: 0x00000000
    [   63.940741] mmc1: sdhci: ============================================
    [   63.976489] sdhci-omap 481d8000.mmc: Timeout waiting on controller reset in sdhci_omap_reset
    [   64.014189] sdhci-omap 481d8000.mmc: Timeout waiting on controller reset in sdhci_omap_reset
    [   74.079759] mmc1: Timeout waiting for hardware cmd interrupt.
    [   74.085553] mmc1: sdhci: ============ SDHCI REGISTER DUMP ===========
    [   74.092039] mmc1: sdhci: Sys addr:  0x00000000 | Version:  0x00003101
    [   74.098521] mmc1: sdhci: Blk size:  0x00000000 | Blk cnt:  0x00000000
    [   74.105002] mmc1: sdhci: Argument:  0x00010000 | Trn mode: 0x00000000
    [   74.111483] mmc1: sdhci: Present:   0x01f70000 | Host ctl: 0x00000000
    [   74.117963] mmc1: sdhci: Power:     0x00000000 | Blk gap:  0x00000000
    [   74.124443] mmc1: sdhci: Wake-up:   0x00000000 | Clock:    0x00000000
    [   74.130923] mmc1: sdhci: Timeout:   0x00000000 | Int stat: 0x00000000
    [   74.137403] mmc1: sdhci: Int enab:  0x007f0003 | Sig enab: 0x007f0003
    [   74.143882] mmc1: sdhci: ACmd stat: 0x00000000 | Slot int: 0x00000000
    [   74.150362] mmc1: sdhci: Caps:      0x05e10080 | Caps_1:   0x00000000
    [   74.156841] mmc1: sdhci: Cmd:       0x00000d1a | Max curr: 0x00000000
    [   74.163320] mmc1: sdhci: Resp[0]:   0x00000000 | Resp[1]:  0x00000000
    [   74.169799] mmc1: sdhci: Resp[2]:   0x00000000 | Resp[3]:  0x00000000
    [   74.176276] mmc1: sdhci: Host ctl2: 0x00000000
    [   74.180747] mmc1: sdhci: ============================================
    [   74.216570] sdhci-omap 481d8000.mmc: Timeout waiting on controller reset in sdhci_omap_reset
    [   74.254271] sdhci-omap 481d8000.mmc: Timeout waiting on controller reset in sdhci_omap_reset
    [   84.329727] mmc1: Timeout waiting for hardware cmd interrupt.
    [   84.335520] mmc1: sdhci: ============ SDHCI REGISTER DUMP ===========
    [   84.342003] mmc1: sdhci: Sys addr:  0x00000000 | Version:  0x00003101
    [   84.348486] mmc1: sdhci: Blk size:  0x00000000 | Blk cnt:  0x00000000
    [   84.354967] mmc1: sdhci: Argument:  0x00010000 | Trn mode: 0x00000000
    [   84.361448] mmc1: sdhci: Present:   0x01f70000 | Host ctl: 0x00000000
    [   84.367927] mmc1: sdhci: Power:     0x00000000 | Blk gap:  0x00000000
    [   84.374406] mmc1: sdhci: Wake-up:   0x00000000 | Clock:    0x00000000
    [   84.380884] mmc1: sdhci: Timeout:   0x00000000 | Int stat: 0x00000000
    [   84.387364] mmc1: sdhci: Int enab:  0x007f0003 | Sig enab: 0x007f0003
    [   84.393843] mmc1: sdhci: ACmd stat: 0x00000000 | Slot int: 0x00000000
    [   84.400322] mmc1: sdhci: Caps:      0x05e10080 | Caps_1:   0x00000000
    [   84.406801] mmc1: sdhci: Cmd:       0x00000d1a | Max curr: 0x00000000
    [   84.413280] mmc1: sdhci: Resp[0]:   0x00000000 | Resp[1]:  0x00000000
    [   84.419758] mmc1: sdhci: Resp[2]:   0x00000000 | Resp[3]:  0x00000000
    [   84.426235] mmc1: sdhci: Host ctl2: 0x00000000
    [   84.430706] mmc1: sdhci: ============================================
    [   84.466485] sdhci-omap 481d8000.mmc: Timeout waiting on controller reset in sdhci_omap_reset
    [   84.504186] sdhci-omap 481d8000.mmc: Timeout waiting on controller reset in sdhci_omap_reset
    [   84.512837] mmc1: card 0001 removed
    
    

  • Hi Pawel,

    Thanks for the update.

    I probably didn't review the kernel code correctly and placed dump_stack() at a wrong place. Let me try to look it again.

  • Pawel,

    While we continue debugging this issue. Do you consider to enable the watchdog timer (if you haven't done it) so that it will reset the system and reboot in case this type of lockup happens?

  • Hi Bin

    We were considering the watchdog timer as a workaround solution when the issue was observed only during shutdown/reboot. When it happened also during mounting of SD card it was discarded. System and other applications are still working then (excluding access to the SD card). The issue is when you try to restart the system. Each running of reboot command is only adding another reboot process to the list of running processes. If the system will be restarted with the watchdog while all applications are working we have a risk of filesystem or database corruption.

  • Agreed, it is not ideal to issue warm reset during the Linux operation. I will continue reviewing the kernel code and provide another debug patch.

  • Hi Andreas, Hi Bin Liu

    I still investigate this issue but from hardware side, on the board  "ARM MPU AM4378 General Purpose EV"  from  TI schematis I see the reset control pin on the eMMC is done via port B3 from Sitara CPU, is this signal on that port B3 is controled via sdhci-omap driver ?

    Also regarding into topic below I suppose that we must have a reset pin from eMMC assigned in pinux for Sitara, please correct me if I'm wrong.

    e2e.ti.com/.../tmdsevm437x-pin-conflict-for-gpmc-wpn

  • Hi Bin

    We have already results from reboot test on the AM437x HS EVM. Timeout issue for the sdhci-omap didn't happen, but test was failed with a timeout on the omap_i2c (communication with PMIC). Device was halted and terminal was blocked. Do you think that these two things can be somehow connected? In parallel we're continue investigation on the hardware side for our board.

    [53265.063613] tps65218 0-0024: Write for reg Oxla failed 
    [53265.068818] cpu cpu0: _set_opp_voltage: failed to set voltage (931000 950000 969000 mV): -110 
    [53265.077523] cpufreq: _target_index: Failed to change cpu frequency: -110 
    [53268.053590] omap_i2c 44e0b000.i2c: controller timed out 
    [53270.083536] omap_i2c 44e0b000.i2c: controller timed out 
    [53271.053612] tps65218 0-0024: Write for reg 0x17 failed 
    [53273.053592] omap_i2c 44e0b000.i2c: controller timed out 
    [53273.083544] tps65218 0-0024: Write for reg Oxla failed 
    [53276.053590] omap_i2c 44e0b000.i2c: controller timed out 
    [53278.083534] omap_i2c 44e0b000.i2c: controller timed out 
    [53279.053609] tps65218 0-0024: Write for reg 0x17 failed 
    [53281.053596] omap_i2c 44e0b000.i2c: controller timed out 
    [53281.083546] tps65218 0-0024: Write for reg Oxla failed 
    [53281.088744] cpu cpu0: _set_opp_voltage: failed to set voltage (931000 950000 969000 mV): -110 
    [53281.097442] cpufreq: _target_index: Failed to change cpu frequency: -110 
    [53284.053597] omap_i2c 44e0b000.i2c: controller timed out 
    [53286.083533] omap_i2c 44e0b000.i2c: controller timed out 
    [53287.053611] tps65218 0-0024: Write for reg 0x17 failed 
    [53289.053600] omap_i2c 44e0b000.i2c: controller timed out 
    [53289.083549] tps65218 0-0024: Write for reg Oxla failed 
    [53292.053600] omap_i2c 44e0b000.i2c: controller timed out 
    [53294.083531] omap_i2c 44e0b000.i2c: controller timed out 
    [53295.053614] tps65218 0-0024: Write for reg 0x17 failed
    

  • Hi Damina,

    I just came back from the holiday break and am catching up the work. I will review your question soon about eMMC reset and get back to you.

    Hi Pawel,

    I am not sure if this i2c error is related. But can you please check the default cpufreq governor? If it is ondemand, can you please set it to performance or userspace to see if this i2c timeout error still happens?

  • Hi, everyone.

    I'm working with Pawel and Damian in the same team and also investigating the issue. I've found the timeout is happening, because there is no response to the CMD13 (MMC_SEND_STATUS). Increasing the timeout and number of retries doesn't change anything. Furthermore the timeout occurs only after suspending the eMMC - that's why it's not getting the response. Upon deeper analysis, I discovered that the CMD13 sent by __mmc_send_status() method in drivers/mmc/core/mmc_ops.c is executed from _mmc_detect_card_removed() method in drivers/mmc/core/core.c.

    This shouldn't be happening, because our eMMC is non-removable. Manually setting the MMC_CAP_NONREMOVABLE flag to the mmc1 is probably a fix to our issue. I haven't noticed a single timeout since I've been rebooting in a loop for almost a day.

    In our DTS we're using `ti,non-removable` device property, the same as it is in am437x-gp-evm.dts, which I find is not read properly by the driver. However, `non-removable` property is working correctly. Which one should we use? Is `ti,non-removable` really deprecated as mentioned in this patch: https://patchwork.kernel.org/project/linux-omap/patch/20210921110029.21944-5-tony@atomide.com/ ?

  • Hi Dawid,

    Good catch.

    But seems you are referring to an old kernel version of am437x-gp-evm.dts: https://git.ti.com/cgit/ti-linux-kernel/ti-linux-kernel/commit/arch/arm/boot/dts/am437x-gp-evm.dts, which is v4.4.

    Long story short, in kernel v5.10 which is used in your project, you should use 'non-removable'.

    Full story, in old kernel versions, AM437x uses MMC driver omap_hsmmc.c (DTS compatible = "ti,omap4-hsmmc"), which uses 'ti,non-removable' for none removable devices. In 2020, AM437x kernel is updated to use MMC driver sdhci-omap.c (compatible = "ti,am437-sdhci"), which uses 'non-removable'.

  • Hi Bin and Dawid

    Today I regained access to the reply function on the forum :)

    @Dawid thank you for the investigation and verification. Looking on current test results it seems that issue is solved.

    @Bin thank you for the clarification. We copied the ti,non-removable flag from an older project and then it was verified in the documentation of the Linux kernel 5.10, but looking to the ti-omap-hsmmc.txt which as you wrote is not used in 5.10. When comparing to the am437x-gp-evm.dts we looked on the master branch and didn't notice that Linux kernel 4.4 is used there. On the ti-linux-5.10.y branch used in linux-ti-staging_5.10.bb from the meta-ti, the non-removable flag is used correctly. 

    Regarding i2c error, after changing the governor to performance, the AM437x EVM is also halted on a reboot after few hours of test. But this issue doesn't happen on our board. I propose that I will create a separate topic in a case when it will be observed on our board. Below is the startup/shutdown log in the failure mode:

    U-Boot SPL 2021.10-rc2-gb9cb74a5aa (Aug 27 2021 - 12:33:02 +0000)
    Trying to boot from MMC1
    Authentication passed
    Authentication passed
    
    
    U-Boot 2021.10-rc2-gb9cb74a5aa (Aug 27 2021 - 12:33:02 +0000)SCU200
    
    CPU  : AM437X-HS rev 1.2
    Model: TI AM437x GP EVM
    DRAM:  2 GiB
    PMIC:  TPS65218
    MMC:   OMAP SD/MMC: 0, OMAP SD/MMC: 1
    Loading Environment from FAT... *** Error - No Valid Environment Area found
    *** Warning - bad env area, using default environment
    
    Loading Environment from nowhere... OK
    In:    serial@44e09000
    Out:   serial@44e09000
    Err:   serial@44e09000
    Net:   eth2: ethernet@4a100000
    Hit any key to stop autoboot:  0
    4559360 bytes read in 245 ms (17.7 MiB/s)
    78978 bytes read in 8 ms (9.4 MiB/s)
    Kernel image @ 0x82000000 [ 0x000000 - 0x459200 ]
    ## Flattened Device Tree blob at 88000000
       Booting using the fdt blob at 0x88000000
       Loading Device Tree to f9f51000, end f9f67481 ... OK
    
    Starting kernel ...
    
    [    0.000000] Booting Linux on physical CPU 0x0
    [    0.000000] Linux version 5.10.158scu200-g14d010266b (oe-user@oe-host) (arm-poky-linux-gnueabi-gcc (GCC) 9.5.0, GNU ld (GNU Binutils) 2.34.0.20200910) #1 PREEMPT Sat Dec 10 00:44:06 UTC 2022
    [    0.000000] CPU: ARMv7 Processor [412fc09a] revision 10 (ARMv7), cr=10c53c7d
    [    0.000000] CPU: PIPT / VIPT nonaliasing data cache, VIPT aliasing instruction cache
    [    0.000000] OF: fdt: Machine model: TI AM437x GP EVM
    [    0.000000] Memory policy: Data cache writeback
    [    0.000000] efi: UEFI not found.
    [    0.000000] cma: Reserved 48 MiB at 0xfcc00000
    [    0.000000] Zone ranges:
    [    0.000000]   Normal   [mem 0x0000000080000000-0x00000000afffffff]
    [    0.000000]   HighMem  [mem 0x00000000b0000000-0x00000000ffffefff]
    [    0.000000] Movable zone start for each node
    [    0.000000] Early memory node ranges
    [    0.000000]   node   0: [mem 0x0000000080000000-0x00000000bdafffff]
    [    0.000000]   node   0: [mem 0x00000000bdb00000-0x00000000bfafffff]
    [    0.000000]   node   0: [mem 0x00000000bfb00000-0x00000000ffffefff]
    [    0.000000] Initmem setup node 0 [mem 0x0000000080000000-0x00000000ffffefff]
    [    0.000000] CPU: All CPU(s) started in SVC mode.
    [    0.000000] AM437x ES1.2 (neon)
    [    0.000000] Built 1 zonelists, mobility grouping on.  Total pages: 522559
    [    0.000000] Kernel command line: console=ttyS0,115200n8 root=/dev/mmcblk0p2 rootwait rw
    [    0.000000] Dentry cache hash table entries: 131072 (order: 7, 524288 bytes, linear)
    [    0.000000] Inode-cache hash table entries: 65536 (order: 6, 262144 bytes, linear)
    [    0.000000] mem auto-init: stack:off, heap alloc:off, heap free:off
    [    0.000000] Memory: 1980384K/2097148K available (9216K kernel code, 667K rwdata, 2976K rodata, 1024K init, 275K bss, 67612K reserved, 49152K cma-reserved, 1228716K highmem)
    [    0.000000] SLUB: HWalign=64, Order=0-3, MinObjects=0, CPUs=1, Nodes=1
    [    0.000000] rcu: Preemptible hierarchical RCU implementation.
    [    0.000000] rcu:     RCU event tracing is enabled.
    [    0.000000]  Trampoline variant of Tasks RCU enabled.
    [    0.000000] rcu: RCU calculated value of scheduler-enlistment delay is 10 jiffies.
    [    0.000000] NR_IRQS: 16, nr_irqs: 16, preallocated irqs: 16
    [    0.000000] L2C: platform modifies aux control register: 0x0e030000 -> 0x3e430000
    [    0.000000] L2C: DT/platform modifies aux control register: 0x0e030000 -> 0x3e430000
    [    0.000000] L2C-310 enabling early BRESP for Cortex-A9
    [    0.000000] OMAP L2C310: ROM does not support power control setting
    [    0.000000] L2C-310 dynamic clock gating disabled, standby mode disabled
    [    0.000000] L2C-310 cache controller enabled, 16 ways, 256 kB
    [    0.000000] L2C-310: CACHE_ID 0x410000c9, AUX_CTRL 0x4e430000
    [    0.000000] TI gptimer clocksource: always-on /ocp@44000000/interconnect@44c00000/segment@200000/target-module@31000
    [    0.000010] sched_clock: 32 bits at 24MHz, resolution 41ns, wraps every 89478484971ns
    [    0.000027] clocksource: dmtimer: mask: 0xffffffff max_cycles: 0xffffffff, max_idle_ns: 79635851949 ns
    [    0.000159] OMAP clocksource: 32k_counter at 32768 Hz
    [    0.000176] clocksource: 32k_counter: mask: 0xffffffff max_cycles: 0xffffffff, max_idle_ns: 58327039986419 ns
    [    0.000477] TI gptimer clockevent: 24000000 Hz at /ocp@44000000/interconnect@48000000/segment@0/target-module@40000
    [    0.001799] Console: colour dummy device 80x30
    [    0.001868] Calibrating delay loop... 1594.16 BogoMIPS (lpj=7970816)
    [    0.087538] pid_max: default: 32768 minimum: 301
    [    0.087766] Mount-cache hash table entries: 2048 (order: 1, 8192 bytes, linear)
    [    0.087792] Mountpoint-cache hash table entries: 2048 (order: 1, 8192 bytes, linear)
    [    0.088803] CPU: Testing write buffer coherency: ok
    [    0.088874] CPU0: Spectre v2: using BPIALL workaround
    [    0.089959] Setting up static identity map for 0x80100000 - 0x80100060
    [    0.090137] rcu: Hierarchical SRCU implementation.
    [    0.090264] EFI services will not be available.
    [    0.091009] devtmpfs: initialized
    [    0.105056] VFP support v0.3: implementor 41 architecture 3 part 30 variant 9 rev 4
    [    0.105348] clocksource: jiffies: mask: 0xffffffff max_cycles: 0xffffffff, max_idle_ns: 19112604462750000 ns
    [    0.105375] futex hash table entries: 256 (order: -1, 3072 bytes, linear)
    [    0.109214] pinctrl core: initialized pinctrl subsystem
    [    0.110181] DMI not present or invalid.
    [    0.111018] NET: Registered protocol family 16
    [    0.112885] DMA: preallocated 256 KiB pool for atomic coherent allocations
    [    0.122080] thermal_sys: Registered thermal governor 'fair_share'
    [    0.122092] thermal_sys: Registered thermal governor 'bang_bang'
    [    0.122109] thermal_sys: Registered thermal governor 'step_wise'
    [    0.122116] thermal_sys: Registered thermal governor 'user_space'
    [    0.122615] cpuidle: using governor ladder
    [    0.122655] cpuidle: using governor menu
    [    1.215913] No ATAGs?
    [    1.215949] hw-breakpoint: found 5 (+1 reserved) breakpoint and 1 watchpoint registers.
    [    1.215959] hw-breakpoint: maximum watchpoint size is 4 bytes.
    [    1.248181] iommu: Default domain type: Translated
    [    1.250844] SCSI subsystem initialized
    [    1.251535] pps_core: LinuxPPS API ver. 1 registered
    [    1.251550] pps_core: Software ver. 5.3.6 - Copyright 2005-2007 Rodolfo Giometti <giometti@linux.it>
    [    1.251576] PTP clock support registered
    [    1.251640] EDAC MC: Ver: 3.0.0
    [    1.254295] clocksource: Switched to clocksource dmtimer
    [    1.262691] NET: Registered protocol family 2
    [    1.263118] IP idents hash table entries: 16384 (order: 5, 131072 bytes, linear)
    [    1.264632] tcp_listen_portaddr_hash hash table entries: 512 (order: 0, 4096 bytes, linear)
    [    1.264692] TCP established hash table entries: 8192 (order: 3, 32768 bytes, linear)
    [    1.264770] TCP bind hash table entries: 8192 (order: 3, 32768 bytes, linear)
    [    1.264863] TCP: Hash tables configured (established 8192 bind 8192)
    [    1.265052] UDP hash table entries: 512 (order: 1, 8192 bytes, linear)
    [    1.265085] UDP-Lite hash table entries: 512 (order: 1, 8192 bytes, linear)
    [    1.265252] NET: Registered protocol family 1
    [    1.265926] RPC: Registered named UNIX socket transport module.
    [    1.265944] RPC: Registered udp transport module.
    [    1.265951] RPC: Registered tcp transport module.
    [    1.265958] RPC: Registered tcp NFSv4.1 backchannel transport module.
    [    1.267998] Initialise system trusted keyrings
    [    1.268338] workingset: timestamp_bits=14 max_order=19 bucket_order=5
    [    1.272353] squashfs: version 4.0 (2009/01/31) Phillip Lougher
    [    1.273274] NFS: Registering the id_resolver key type
    [    1.273359] Key type id_resolver registered
    [    1.273368] Key type id_legacy registered
    [    1.273456] nfs4filelayout_init: NFSv4 File Layout Driver Registering...
    [    1.273478] nfs4flexfilelayout_init: NFSv4 Flexfile Layout Driver Registering...
    [    1.273517] ntfs: driver 2.1.32 [Flags: R/O].
    [    1.274470] Key type asymmetric registered
    [    1.274492] Asymmetric key parser 'x509' registered
    [    1.274555] bounce: pool size: 64 pages
    [    1.274621] Block layer SCSI generic (bsg) driver version 0.4 loaded (major 244)
    [    1.274634] io scheduler mq-deadline registered
    [    1.274643] io scheduler kyber registered
    [    1.581493] ti-sysc: probe of 44e31000.target-module failed with error -16
    [    1.650245] ti-sysc: probe of 48040000.target-module failed with error -16
    [    1.665532] OMAP GPIO hardware version 0.1
    [    1.705234] omap-mailbox 480c8000.mailbox: omap mailbox rev 0x400
    [    1.977770] debugfs: Directory '49000000.dma' with parent 'dmaengine' already present!
    [    1.977810] edma 49000000.dma: TI EDMA DMA engine driver
    [    2.024224] pinctrl-single 44e10800.pinmux: 199 pins, size 796
    [    2.027645] pwm-backlight backlight: supply power not found, using dummy regulator
    [    2.077644] Serial: 8250/16550 driver, 10 ports, IRQ sharing enabled
    [    2.081798] omap8250 44e09000.serial: No clock speed specified: using default: 48000000
    [    2.082740] printk: console [ttyS0] disabled
    [    2.082922] 44e09000.serial: ttyS0 at MMIO 0x44e09000 (irq = 56, base_baud = 3000000) is a 8250
    [    2.804113] printk: console [ttyS0] enabled
    [    2.809460] omap8250 481a6000.serial: No clock speed specified: using default: 48000000
    [    2.818601] 481a6000.serial: ttyS3 at MMIO 0x481a6000 (irq = 70, base_baud = 3000000) is a 8250
    [    2.830803] omap_rng 48310000.rng: Random Number Generator ver. 20
    [    2.837510] random: crng init done
    [    2.842922] omapdss_dss 4832a000.dss: supply vdda_video not found, using dummy regulator
    [    2.858565] panel-simple display: supply power not found, using dummy regulator
    [    2.880775] loop: module loaded
    [    2.954317] davinci_mdio 4a101000.mdio: davinci mdio revision 1.6, bus freq 1000000
    [    2.967930] davinci_mdio 4a101000.mdio: phy[0]: device 4a101000.mdio:00, driver unknown
    [    2.976802] cpsw-switch 4a100000.switch: initialized cpsw ale version 1.4
    [    2.983632] cpsw-switch 4a100000.switch: ALE Table size 1024
    [    2.989530] cpsw-switch 4a100000.switch: cpts: overflow check period 500 (jiffies)
    [    2.997203] cpsw-switch 4a100000.switch: CPTS: ref_clk_freq:250000000 calc_mult:2147483648 calc_shift:29 error:0 nsec/sec
    [    3.008306] cpsw-switch 4a100000.switch: Detected MACID = 50:65:83:d5:c1:c9
    [    3.016240] cpsw-switch 4a100000.switch: initialized (regs 0x4a100000, pool size 256) hw_ver:0019010F 1.15 (0)
    [    3.027600] i2c /dev entries driver
    [    3.033582] device-mapper: ioctl: 4.43.0-ioctl (2020-10-01) initialised: dm-devel@redhat.com
    [    3.043893] cpuidle: enable-method property 'ti,am4372' found operations
    [    3.051628] sdhci: Secure Digital Host Controller Interface driver
    [    3.057895] sdhci: Copyright(c) Pierre Ossman
    [    3.063650] sdhci-pltfm: SDHCI platform and OF driver helper
    [    3.072127] sdhci-omap 481d8000.mmc: supply vqmmc not found, using dummy regulator
    [    3.083917] NET: Registered protocol family 10
    [    3.089455] sdhci-omap 47810000.mmc: supply vqmmc not found, using dummy regulator
    [    3.106001] Segment Routing with IPv6
    [    3.109830] sit: IPv6, IPv4 and MPLS over IPv4 tunneling driver
    [    3.117177] NET: Registered protocol family 17
    [    3.122075] Key type dns_resolver registered
    [    3.126491] mmc0: SDHCI controller on 481d8000.mmc [481d8000.mmc] using External DMA
    [    3.135405] omap_voltage_late_init: Voltage driver support not added
    [    3.142963] Loading compiled-in X.509 certificates
    [    3.164696] gpio-119 (SelEMMCorNAND): hogged as output/high
    [    3.206022] tps65218-pwrbutton: Failed to locate of_node [id: -2]
    [    3.212474] tps65218-gpio: Failed to locate of_node [id: -2]
    [    3.223325] mmc0: new high speed MMC card at address 0001
    [    3.230077] mmcblk0: mmc0:0001 MMC04G 3.60 GiB
    [    3.235985] mmcblk0boot0: mmc0:0001 MMC04G partition 1 16.0 MiB
    [    3.242623] mmcblk0boot1: mmc0:0001 MMC04G partition 2 16.0 MiB
    [    3.248969] mmcblk0rpmb: mmc0:0001 MMC04G partition 3 128 KiB, chardev (243:0)
    [    3.260471]  mmcblk0: p1 p2
    [    3.264523] omap_i2c 44e0b000.i2c: bus 0 rev0.12 at 100 kHz
    [    3.273878] omap_i2c 4802a000.i2c: bus 1 rev0.12 at 100 kHz
    [    3.280379] gpio-136 (SelLCDorHDMI): hogged as output/high
    [    3.299223] omap_reset_deassert: timedout waiting for gfx:0
    [    3.304915] ti-sysc: probe of 5600fe00.target-module failed with error -110
    [    3.312745] omapdss_dss 4832a000.dss: supply vdda_video not found, using dummy regulator
    [    3.321205] DSS: OMAP DSS rev 2.0
    [    3.327061] omapdss_dss 4832a000.dss: bound 4832a400.dispc (ops dispc_component_ops)
    [    3.335404] omapdrm omapdrm.0: DMM not available, disable DMM support
    [    3.409350] Console: switching to colour frame buffer device 100x30
    [    3.432419] omapdrm omapdrm.0: [drm] fb0: omapdrmdrmfb frame buffer device
    [    3.440120] [drm] Initialized omapdrm 1.0.0 20110917 for omapdrm.0 on minor 0
    [    3.450106] omap_gpio 44e07000.gpio: Could not set line 6 debounce to 200000 microseconds (-22)
    [    3.460446] sdhci-omap 47810000.mmc: supply vqmmc not found, using dummy regulator
    [    3.472484] sdhci-omap 48060000.mmc: Got CD GPIO
    [    3.477975] sdhci-omap 48060000.mmc: supply vqmmc not found, using dummy regulator
    [    3.502955] mmc2: SDHCI controller on 47810000.mmc [47810000.mmc] using External DMA
    [    3.511127] mmc1: SDHCI controller on 48060000.mmc [48060000.mmc] using External DMA
    [    3.551598] EXT4-fs (mmcblk0p2): mounted filesystem with ordered data mode. Opts: (null)
    [    3.559995] mmc1: new high speed SDHC card at address 59b4
    [    3.565729] VFS: Mounted root (ext4 filesystem) on device 179:2.
    [    3.572851] mmcblk1: mmc1:59b4 USDU1 7.51 GiB
    [    3.579717] devtmpfs: mounted
    [    3.582942]  mmcblk1: p1 p2
    [    3.588117] Freeing unused kernel memory: 1024K
    [    3.614552] Run /sbin/init as init process
    INIT: version 2.96 booting
    Starting udev
    [    4.328442] udevd[102]: starting version 3.2.9
    [    4.435884] udevd[103]: starting eudev-3.2.9
    [    4.819742] omap_rtc 44e3e000.rtc: already running
    [    4.844621] omap_wdt: OMAP Watchdog Timer Rev 0x01: initial timeout 60 sec
    [    4.887734] omap_rtc 44e3e000.rtc: registered as rtc0
    [    4.892865] omap_rtc 44e3e000.rtc: setting system clock to 2018-03-13T02:54:23 UTC (1520909663)
    [    4.971251] omap_i2c 4802a000.i2c: Arbitration lost
    [    5.550290] omap-sham 53100000.sham: hw accel on OMAP rev 0.0
    [    5.557092] omap_aes_driver: Unknown symbol aes_expandkey (err -2)
    [    5.563541] omap_aes_driver: Unknown symbol aes_encrypt (err -2)
    [    5.581962] omap-des 53701000.des: OMAP DES hw accel rev: 0.33
    [    5.631772] omap-sham 53100000.sham: will run requests pump with realtime priority
    [    5.643265] omap-des 53701000.des: will run requests pump with realtime priority
    [    5.666939] input: matrix_keypad0 as /devices/platform/matrix_keypad0/input/input1
    [    7.064295] sched: RT throttling activated
    [    7.068848] pixcir_ts 1-005c: pixcir_int_enable: can't write reg 52 : -11
    [    7.090371] pixcir_ts 1-005c: Failed to disable interrupt generation: -11
    [   20.104057] OF: graph: no port node found in /ocp@44000000/interconnect@48000000/segment@300000/target-module@a8000/ocp2scp@0/phy@8000
    [   67.097558] udevd[103]: worker [109] /devices/platform/44000000.ocp/48000000.interconnect/48000000.interconnect:segment@0/4802a000.target-module/4802a000.i2c/i2c-1/1-005c is taking a long time
    [   80.074369] udevd[103]: worker [108] /devices/platform/44000000.ocp/48000000.interconnect/48000000.interconnect:segment@300000/48380000.target-module/48380000.omap_dwc3/48390000.usb is taking a long time
    [   80.092594] udevd[103]: worker [114] /devices/platform/44000000.ocp/48000000.interconnect/48000000.interconnect:segment@300000/483c0000.target-module/483c0000.omap_dwc3/483d0000.usb is taking a long time
    [  127.085122] EXT4-fs (mmcblk0p2): re-mounted. Opts: (null)
    INIT: Entering runlevel: 5
    Configuring network interfaces... ip: can't find device 'wlan1'
    ip: SIOCGIFFLAGS: No such device
    [  148.072429] cpsw-switch 4a100000.switch: starting ndev. mode: dual_mac
    [  148.089805] Generic PHY 4a101000.mdio:00: attached PHY driver [Generic PHY] (mii_bus:phy_addr=4a101000.mdio:00, irq=POLL)
    Starting rpcbind daemon...done.
    Starting syslogd/klogd: done
    Starting programming
    Programming done.
    INIT: Switching to runlevel: 6
    INIT: Sending processes configured via /etc/inittab the TERM signal
    Stopping syslogd/klogd: done
    Unmounting remote filesystems...
    Stopping nanotap_server_bin
    No process in pidfile '/var/run/nanotap_server_bin.pid' found running; none killed.
    Stopping dpcmtpm20
    No process in pidfile '/var/run/dpcmtpm20.pid' found running; none killed.
    Stopping rpcbind daemon...
    done.
    Deconfiguring network interfaces... ifdown: interface wlan1 not configured
    done.
    Sending all processes the TERM signal...
    Sending all processes the KILL signal...
    Deactivating swap...
    Unmounting local filesystems...
    [  173.054480] EXT4-fs (mmcblk0p2): re-mounted. Opts: (null)
    Rebooting...

    Especially please look to these lines, where you can see that system is stopped for some time during startup, before failure during reboot:

    [ 7.090371] pixcir_ts 1-005c: Failed to disable interrupt generation: -11
    [ 20.104057] OF: graph: no port node found in /ocp@44000000/interconnect@48000000/segment@300000/target-module@a8000/ocp2scp@0/phy@8000
    [ 67.097558] udevd[103]: worker [109] /devices/platform/44000000.ocp/48000000.interconnect/48000000.interconnect:segment@0/4802a000.target-module/4802a000.i2c/i2c-1/1-005c is taking a long time
    [ 80.074369] udevd[103]: worker [108] /devices/platform/44000000.ocp/48000000.interconnect/48000000.interconnect:segment@300000/48380000.target-module/48380000.omap_dwc3/48390000.usb is taking a long time
    [ 80.092594] udevd[103]: worker [114] /devices/platform/44000000.ocp/48000000.interconnect/48000000.interconnect:segment@300000/483c0000.target-module/483c0000.omap_dwc3/483d0000.usb is taking a long time
    [ 127.085122] EXT4-fs (mmcblk0p2): re-mounted. Opts: (null)
    INIT: Entering runlevel: 5

    In the normal mode it is not observed: 

    [ 7.456783] EXT4-fs (mmcblk0p2): re-mounted. Opts: (null)
    INIT: Entering runlevel: 5
    Configuring network interfaces... ip: can't find device 'wlan1'
    ip: SIOCGIFFLAGS: No such device
    [ 9.031267] cpsw-switch 4a100000.switch: starting ndev. mode: dual_mac
    [ 9.048830] Generic PHY 4a101000.mdio:00: attached PHY driver [Generic PHY] (mii_bus:phy_addr=4a101000.mdio:00, irq=POLL)
    Starting rpcbind daemon...done.

  • Hi Pawel,

    Regarding i2c error, after changing the governor to performance, the AM437x EVM is also halted on a reboot after few hours of test. But this issue doesn't happen on our board. I propose that I will create a separate topic in a case when it will be observed on our board.

    Thanks for reporting the issue. Yes please create a new e2e thread if the issue does happen on your board.