AM625: Linux Kernel MMC Oops

Part Number: AM625

Hello,

We have hit the following Oops a few times while accessing the eMMC device on custom hardware running TI Linux Kernel 09.02.00.008.

[   31.377291] Unable to handle kernel paging request at virtual address 0000fffffc386a14
[   31.385348] Mem abort info:
[   31.388136]   ESR = 0x0000000096000006
[   31.392338]   EC = 0x25: DABT (current EL), IL = 32 bits
[   31.397681]   SET = 0, FnV = 0
[   31.400730]   EA = 0, S1PTW = 0
[   31.405397]   FSC = 0x06: level 2 translation fault
[   31.410355] Data abort info:
[   31.413245]   ISV = 0, ISS = 0x00000006
[   31.417086]   CM = 0, WnR = 0
[   31.420049] user pgtable: 4k pages, 48-bit VAs, pgdp=0000000084f89000
[   31.426552] [0000fffffc386a14] pgd=0800000084af2003, p4d=0800000084af2003, pud=0800000083ec0003, pmd=0000000000000000
[   31.437393] Internal error: Oops: 0000000096000006 [#1] PREEMPT SMP
[   31.443657] Modules linked in: crct10dif_ce ti_k3_r5_remoteproc virtio_rpmsg_bus rpmsg_ns rtc_ti_k3 ti_k3_m4_remoteproc ti_k3_common tidss drm_dma_helper mcrc sa2ul lontium_lt8912b tc358768 display_connector drm_kms_helper ina2xx syscopyarea sysfillrect sysimgblt fb_sys_fops spi_omap2_mcspi pwm_tiehrpwm drm lm75 drm_panel_orientation_quirks optee_rng rng_core
[   31.475530] CPU: 0 PID: 8 Comm: kworker/0:0H Not tainted 6.1.80+git.ba628d222cde #1
[   31.483179] Hardware name: Toradex Verdin AM62 on Verdin Development Board (DT)
[   31.490480] Workqueue: kblockd blk_mq_run_work_fn
[   31.495216] pstate: 60000005 (nZCv daif -PAN -UAO -TCO -DIT -SSBS BTYPE=--)
[   31.502172] pc : __mmc_blk_ioctl_cmd+0x12c/0x590
[   31.506795] lr : __mmc_blk_ioctl_cmd+0x2cc/0x590
[   31.511408] sp : ffff8000092a39e0
[   31.514717] x29: ffff8000092a3b50 x28: ffff8000092a3d28 x27: 0000000000000000
[   31.521853] x26: ffff80000a5a3cf0 x25: ffff000018bbb400 x24: 0000fffffc386a08
[   31.528989] x23: ffff000018a8b808 x22: 0000000000000000 x21: 00000000ffffffff
[   31.536124] x20: ffff000018a8b800 x19: ffff0000048c6680 x18: 0000000000000000
[   31.543260] x17: 0000000000000000 x16: 0000000000000000 x15: 0000146d78b52ba4
[   31.550394] x14: 0000000000000206 x13: 0000000000000001 x12: 0000000000000000
[   31.557529] x11: 0000000000000000 x10: 00000000000009b0 x9 : 0000000000000651
[   31.564664] x8 : ffff8000092a3ad8 x7 : 0000000000000000 x6 : 0000000000000000
[   31.571800] x5 : 0000000000000200 x4 : 0000000000000000 x3 : 00000000000003e8
[   31.578935] x2 : 0000000000000000 x1 : 000000000000001d x0 : 0000000000000017
[   31.586071] Call trace:
[   31.588513]  __mmc_blk_ioctl_cmd+0x12c/0x590
[   31.592782]  mmc_blk_mq_issue_rq+0x50c/0x920
[   31.597049]  mmc_mq_queue_rq+0x118/0x2ac
[   31.600970]  blk_mq_dispatch_rq_list+0x1a8/0x8b0
[   31.605588]  __blk_mq_sched_dispatch_requests+0xb8/0x164
[   31.610898]  blk_mq_sched_dispatch_requests+0x3c/0x80
[   31.615946]  __blk_mq_run_hw_queue+0x68/0xa0
[   31.620215]  blk_mq_run_work_fn+0x20/0x30
[   31.624223]  process_one_work+0x1d0/0x320
[   31.628238]  worker_thread+0x14c/0x444
[   31.631989]  kthread+0x10c/0x110
[   31.635219]  ret_from_fork+0x10/0x20
[   31.638801] Code: 12010000 2a010000 b90137e0 b4000078 (b9400f00) 
[   31.644888] ---[ end trace 0000000000000000 ]---

0xffff8000087f3a10 is in __mmc_blk_ioctl_cmd (drivers/mmc/core/block.c:570).
565			 * We don't do any blockcount validation because the max size
566			 * may be increased by a future standard. We just copy the
567			 * 'Reliable Write' bit here.
568			 */
569			sbc.arg = data.blocks | (idata->ic.write_flag & BIT(31));
570			if (prev_idata)
571				sbc.arg = prev_idata->ic.arg;
572			sbc.flags = MMC_RSP_R1 | MMC_CMD_AC;
573			mrq.sbc = &sbc;
574		}
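
As a cross-check of the register dump against the source listing, the faulting instruction word from the `Code:` line (the one in parentheses, `b9400f00`) can be decoded by hand. The snippet below is only a minimal sketch: the helper name `decode_ldr_w_uimm` is made up for illustration, and it handles only the AArch64 `LDR Wt, [Xn, #imm]` (32-bit, unsigned offset) encoding. The register and address values are copied from the oops above.

```python
# Sketch: decode the faulting AArch64 instruction from the oops "Code:" line.
# decode_ldr_w_uimm is a hypothetical helper, not part of any kernel tooling.

def decode_ldr_w_uimm(word):
    """Decode 'LDR Wt, [Xn, #imm]' (32-bit load, unsigned scaled offset)."""
    # The top 10 bits identify this encoding: size=10, opc=01, unsigned offset.
    if (word & 0xFFC00000) != 0xB9400000:
        raise ValueError("not an LDR Wt, [Xn, #imm] instruction")
    imm12 = (word >> 10) & 0xFFF   # offset, scaled by the 4-byte access size
    rn = (word >> 5) & 0x1F        # base register number
    rt = word & 0x1F               # destination register number
    return rt, rn, imm12 * 4

FAULT_INSN = 0xB9400F00            # "(b9400f00)" in the Code: line
X24 = 0x0000FFFFFC386A08           # x24 from the register dump
FAULT_ADDR = 0x0000FFFFFC386A14    # "virtual address" in the first oops line

rt, rn, offset = decode_ldr_w_uimm(FAULT_INSN)
print(f"ldr w{rt}, [x{rn}, #{offset}]")
print(f"x{rn} + {offset} = {X24 + offset:#018x}")
print("matches fault address:", X24 + offset == FAULT_ADDR)
```

Under these assumptions the load is `ldr w0, [x24, #12]`, and `x24 + 12` equals the faulting address. That would be consistent with the `prev_idata->ic.arg` read at block.c:571, if `ic` is the first member of the ioctl data structure and `arg` sits at offset 12 in `struct mmc_ioc_cmd`. The value in `x24` looks like a user-space address rather than a kernel pointer, which suggests `prev_idata` did not hold a valid pointer at that point.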

Looking at the code, the problem may be related to commit `c4edcd134bb7 ("mmc: core: Use mrq.sbc in close-ended ffu")`, which came in when v6.1.80 was merged into the TI Linux kernel branch.

The issue is not systematic: it has happened more than once, but so far it is not clear what triggers it.

It is possible, though we are not certain, that the issue occurred while reading from or writing to the mmcboot partition.

Any ideas?