BEAGL-BONE-BLACK: Problems reading from eMMC boot partitions on beaglebone with kernel 6.1

Part Number: BEAGL-BONE-BLACK


I have a problem with unstable read/write on the eMMC 'boot' partitions (/dev/mmcblk1boot1) with kernel 6.1. The behavior looks related to power management. A read from /dev/mmcblk1boot1 will sometimes fail and return incorrect data, apparently when the device has been idle for a little while. Once in this state, reads keep failing. But if I read from /dev/mmcblk1boot0 or run the 'sync' command right before, the read works. The following is a fairly reproducible way to demonstrate the problem:

$ hexdump -C -n 0x40 /dev/mmcblk1boot0; hexdump -C /dev/mmcblk1boot1 -n 0x40 ; sleep 5; hexdump -C /dev/mmcblk1boot1 -n 0x40
00000000 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 |................|
*
00000040
00000000 00 b4 e0 a9 44 61 72 63 68 3d 61 72 6d 00 61 72 |....Darch=arm.ar|
00000010 67 73 5f 6d 6d 63 3d 72 75 6e 20 66 69 6e 64 75 |gs_mmc=run findu|
00000020 75 69 64 3b 73 65 74 65 6e 76 20 62 6f 6f 74 61 |uid;setenv boota|
00000030 72 67 73 20 63 6f 6e 73 6f 6c 65 3d 24 7b 63 6f |rgs console=${co|
00000040
00000000 fa b8 00 10 8e d0 bc 00 b0 b8 00 00 8e d8 8e c0 |................|
00000010 fb be 00 7c bf 00 06 b9 00 02 f3 a4 ea 21 06 00 |...|.........!..|
00000020 00 be be 07 38 04 75 0b 83 c6 10 81 fe fe 07 75 |....8.u........u|
00000030 f3 eb 16 b4 02 b0 01 bb 00 7c b2 80 8a 74 01 8b |.........|...t..|
00000040

The first read from /dev/mmcblk1boot1 comes right after a read from /dev/mmcblk1boot0. You can see the start of the U-Boot environment I have saved there. After waiting 5 seconds, another read returns wrong data. It keeps failing for a long time (possibly several minutes), I guess until something pokes the flash again. But this seems to always work:

$ sync; hexdump -C -n 0x40 /dev/mmcblk1boot1
00000000 00 b4 e0 a9 44 61 72 63 68 3d 61 72 6d 00 61 72 |....Darch=arm.ar|
00000010 67 73 5f 6d 6d 63 3d 72 75 6e 20 66 69 6e 64 75 |gs_mmc=run findu|
00000020 75 69 64 3b 73 65 74 65 6e 76 20 62 6f 6f 74 61 |uid;setenv boota|
00000030 72 67 73 20 63 6f 6e 73 6f 6c 65 3d 24 7b 63 6f |rgs console=${co|
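
The manual steps above can be turned into a repeatable check. The following is only a rough sketch, not something taken from the thread: the helper names head_sum and check_idle_reads are made up for illustration, and it assumes that a read taken immediately after 'sync' is good, which matches the observation above but is not guaranteed. It uses plain buffered reads, like the hexdump commands do.

```shell
# Checksum the first N bytes of a file or block device (default 64 = 0x40).
head_sum() {
    dd if="$1" bs="${2:-64}" count=1 2>/dev/null | cksum
}

# Take a baseline right after 'sync' (the state reported as working),
# then re-read after each idle period and report any mismatch.
# Usage: check_idle_reads DEVICE [ROUNDS] [IDLE_SECONDS]
# Returns 0 only if every round matched the baseline.
check_idle_reads() {
    dev=$1 rounds=${2:-5} idle=${3:-5}
    sync
    baseline=$(head_sum "$dev")
    bad=0
    i=1
    while [ "$i" -le "$rounds" ]; do
        sleep "$idle"
        if [ "$(head_sum "$dev")" != "$baseline" ]; then
            echo "round $i: MISMATCH after ${idle}s idle"
            bad=$((bad + 1))
        else
            echo "round $i: ok"
        fi
        i=$((i + 1))
    done
    [ "$bad" -eq 0 ]
}
```

On the board this would be run as, for example, check_idle_reads /dev/mmcblk1boot1 10 5, which waits 5 seconds between each of 10 reads.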

Some googling led me to this, which looks related: lore.kernel.org/.../

On my board, when it is failing, I read:

$ cat /sys/kernel/debug/mmc1/ios
clock: 0 Hz
vdd: 0 (invalid)
bus mode: 2 (push-pull)
chip select: 0 (don't care)
power mode: 0 (off)
bus width: 0 (1 bits)
timing spec: 0 (legacy)
signal voltage: 0 (3.30 V)
driver type: 0 (driver type B)

When it is working: 

$ sync; cat /sys/kernel/debug/mmc1/ios
clock: 52000000 Hz
vdd: 21 (3.3 ~ 3.4 V)
bus mode: 2 (push-pull)
chip select: 0 (don't care)
power mode: 2 (on)
bus width: 3 (8 bits)
timing spec: 1 (mmc high-speed)
signal voltage: 0 (3.30 V)
driver type: 0 (driver type B)
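
The two dumps differ most visibly in the 'power mode' line (0 = off, 2 = on). A small helper can pull out that field so a script can tell which state the host is in before reading. This is a sketch under assumptions: the function names mmc_ios_field and mmc_is_powered are hypothetical, and the parsing simply splits each line at the first colon, which fits the output shown above.

```shell
# Print the value of one field from an mmc 'ios' debugfs dump.
# Usage: mmc_ios_field IOS_FILE FIELD_NAME
# Returns non-zero if the field is not present.
mmc_ios_field() {
    awk -v key="$2" '
        {
            pos = index($0, ":")
            if (pos == 0) next
            name = substr($0, 1, pos - 1)
            value = substr($0, pos + 1)
            sub(/^[ \t]+/, "", value)      # drop spaces or tabs after the colon
            if (name == key) { print value; found = 1; exit }
        }
        END { exit found ? 0 : 1 }
    ' "$1"
}

# Succeed only when the dump reports power mode 2 (on).
# Returns 1 for any other mode, 2 if the field is missing.
mmc_is_powered() {
    mode=$(mmc_ios_field "$1" "power mode") || return 2
    case "$mode" in
        2|"2 "*) return 0 ;;
        *) return 1 ;;
    esac
}
```

For example, mmc_is_powered /sys/kernel/debug/mmc1/ios || echo "host is powered down" would flag the failing state shown above. Reading debugfs normally requires root.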

I am building Yocto Kirkstone, using the linux-ti-staging_6.1 kernel from this commit of the meta-ti layer:

commit 6a3f358e33a20034ef7b48a7df69bace6fc80c6b (HEAD, tag: cicd.kirkstone.202401251532, tag: 09.02.00.002)
Author: LCPD Automation Script <lcpdbld@list.ti.com>
Date: Thu Jan 25 15:32:34 2024 -0600

CI/CD Auto-Merger: cicd.kirkstone.202401251532

Updated the value(s) for:
u-boot-ti-staging_2023.04: SRCREV
linux-ti-staging-rt_6.1: SRCREV
linux-ti-staging_6.1: SRCREV

Signed-off-by: LCPD Automation Script <lcpdbld@list.ti.com>

  • Hi Filip,

    The following kernel patch disables autosuspend in the MMC host driver. Could you please check whether it resolves the issue?

    diff --git a/drivers/mmc/host/sdhci-omap.c b/drivers/mmc/host/sdhci-omap.c
    index 033be559a730..f30462d4eb17 100644
    --- a/drivers/mmc/host/sdhci-omap.c
    +++ b/drivers/mmc/host/sdhci-omap.c
    @@ -1303,7 +1303,7 @@ static int sdhci_omap_probe(struct platform_device *pdev)
             * callback will be invoked as part of pm_runtime_get_sync.
             */
            pm_runtime_use_autosuspend(dev);
    -       pm_runtime_set_autosuspend_delay(dev, 50);
    +       pm_runtime_set_autosuspend_delay(dev, -1);
            pm_runtime_enable(dev);
            ret = pm_runtime_resume_and_get(dev);
            if (ret) {
    

  • Thanks for the help! I tried the patch, but unfortunately it didn't help. I still have the same issue. 
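
When testing a patch like the one above, it can be worth confirming from user space that the running kernel really has autosuspend disabled for the host. The standard runtime-PM sysfs attributes (power/control, power/runtime_status, power/autosuspend_delay_ms) expose this. The sketch below is not from the thread: the function names are hypothetical, and the device directory to pass in (possibly /sys/class/mmc_host/mmc1/device on this board) is an assumption to be checked on the target.

```shell
# Print the runtime-PM attributes found under DEVICE_DIR/power.
# Attributes that are missing or unreadable are shown as "n/a".
rpm_show() {
    for attr in control runtime_status autosuspend_delay_ms; do
        f="$1/power/$attr"
        if [ -r "$f" ]; then
            printf '%s=%s\n' "$attr" "$(cat "$f")"
        else
            printf '%s=n/a\n' "$attr"
        fi
    done
}

# Succeed when runtime suspend is prevented, either because power/control
# is "on" or because the autosuspend delay is negative (what a delay of -1
# in the driver is meant to produce).
rpm_autosuspend_disabled() {
    ctl=$(cat "$1/power/control" 2>/dev/null)
    delay=$(cat "$1/power/autosuspend_delay_ms" 2>/dev/null)
    [ "$ctl" = "on" ] && return 0
    case "$delay" in
        -*) return 0 ;;
    esac
    return 1
}
```

For example, rpm_show /sys/class/mmc_host/mmc1/device prints the three attributes. As a runtime alternative to rebuilding the kernel, writing "on" to the power/control file of the same directory also keeps the device from runtime-suspending; whether that avoids the read failures was not tested in this thread.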

  • Hi Filip,

    The link you referred to discusses reverting two kernel patches related to MMC PM. Have you tried reverting those patches in your test as well?

  • I tried reverting both patches now and that seems to work. After reverting these I have not observed any read failures.