TDA4VM: TDA4VM eMMC issues

xin alex

Part Number: TDA4VM
Other Parts Discussed in Thread: CSD, DRA829

Dear experts,

We are seeing the following problems on a custom TDA4VM board (SDK 7.1):

Problem 1: Setting the eMMC Signal Voltage to 1.8V Breaks the State Machine

On some boards fitted with a Samsung automotive eMMC, the kernel prints the following error messages when selecting HS200 mode:

mmc0: mmc_select_hs200 failed, error -110
mmc0: error -110 whilst initialising MMC card
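
For readers decoding the log: kernel functions return negative errno values, and on Linux with the generic errno numbering (which arm64 uses) 110 is ETIMEDOUT. The small userspace sketch below shows the mapping; the helper names are hypothetical and not part of the kernel.

```c
#include <errno.h>
#include <stdio.h>
#include <string.h>

/*
 * Translate a negative kernel-style return code (for example -110)
 * into the matching errno description.
 * Hypothetical helper, for illustration only.
 */
static const char *kernel_err_to_str(int kernel_err)
{
	/* Kernel functions return -errno, so negate before the lookup. */
	return strerror(-kernel_err);
}

/* Return 1 if the code is the timeout error, 0 otherwise. */
static int is_timeout(int kernel_err)
{
	return kernel_err == -ETIMEDOUT;
}
```

Calling `is_timeout(-110)` on a Linux build host should report a timeout, which matches the bus-width switch timing out rather than being rejected outright.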

The relevant kernel code is as follows:

static int mmc_select_hs200(struct mmc_card *card)
{
	......
	old_signal_voltage = host->ios.signal_voltage;
	if (card->mmc_avail_type & EXT_CSD_CARD_TYPE_HS200_1_2V)
		err = mmc_set_signal_voltage(host, MMC_SIGNAL_VOLTAGE_120);

	if (err && card->mmc_avail_type & EXT_CSD_CARD_TYPE_HS200_1_8V)
		err = mmc_set_signal_voltage(host, MMC_SIGNAL_VOLTAGE_180);

	/* If fails try again during next card power cycle */
	if (err)
		return err;

	mmc_select_driver_type(card);

	/*
	 * Set the bus width(4 or 8) with host's support and
	 * switch to HS200 mode if bus width is set successfully.
	 */
	err = mmc_select_bus_width(card);
	if (err > 0) {
		......
	}
err:
	if (err) {
		/* fall back to the old signal voltage, if fails report error */
		if (mmc_set_signal_voltage(host, old_signal_voltage))
			err = -EIO;

		pr_err("%s: %s failed, error %d\n", mmc_hostname(card->host),
		       __func__, err);
	}
	return err;
}

This happens because mmc_set_signal_voltage() sets the V1P8_SIGNAL_ENA bit field of the MMCSD0_HOST_CONTROL2 register, which causes the subsequent mmc_select_bus_width() call to return a timeout error.

The description of the V1P8_SIGNAL_ENA bit field explains its intended use:

Host Driver can set this bit to 1h when Host Controller supports 1.8 V signaling (one of support bits is set to 1h: SDR50, SDR104 or DDR50 in the MMCSD0_CAPABILITIES register) and the card or device supports UHS-I.
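
As a sketch of what that bit looks like from software, the snippet below tests and clears the 1.8 V signaling enable bit in a Host Control 2 value. The offset and mask are the ones the Linux SDHCI driver uses for the standard register; that V1P8_SIGNAL_ENA on MMCSD0 maps to this standard bit is an assumption here, and the helper names are hypothetical.

```c
#include <stdint.h>

/* Names and values as defined in Linux drivers/mmc/host/sdhci.h. */
#define SDHCI_HOST_CONTROL2  0x3E    /* register offset */
#define SDHCI_CTRL_VDD_180   0x0008  /* 1.8 V signaling enable */

/* Hypothetical helper: is 1.8 V signaling enabled in this register value? */
static int v1p8_signal_enabled(uint16_t host_control2)
{
	return (host_control2 & SDHCI_CTRL_VDD_180) != 0;
}

/* Hypothetical helper: return the register value with the bit cleared. */
static uint16_t v1p8_signal_clear(uint16_t host_control2)
{
	return (uint16_t)(host_control2 & ~SDHCI_CTRL_VDD_180);
}
```

Checking a dumped Host Control 2 value with `v1p8_signal_enabled()` before and after the voltage switch is one way to confirm whether the bit was actually set at the point of failure.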

It seems that there is no need to set this bit when using eMMC HS200/HS400 mode. To work around the problem, we implemented the sdhci_j721e_voltage_switch() method:

static int sdhci_j721e_voltage_switch(struct mmc_host *mmc,
				       struct mmc_ios *ios)
{
	switch (ios->signal_voltage) {
	case MMC_SIGNAL_VOLTAGE_180:
		/*
		 * Please don't switch to 1V8 as j721e MMCSD0, 5.1 doesn't
		 * actually refer to this setting to indicate the
		 * signal voltage and the state machine will be broken
		 * actually if we force to enable 1V8. That's something
		 * like broken quirk but we could work around here.
		 */
		return 0;
	case MMC_SIGNAL_VOLTAGE_330:
	case MMC_SIGNAL_VOLTAGE_120:
		/* We don't support 3V3 and 1V2 */
		break;
	}

	return -EINVAL;
}

static int sdhci_am654_probe(struct platform_device *pdev)
{
	......
	host->mmc_host_ops.execute_tuning = sdhci_am654_execute_tuning;
	/* EXPERIMENTAL: From the TRM 12.3.6, we learned that the MMCSD0
	 * host controller does not support 3.3V and 1.2V.
	 *                                               wsl, 09/07/2021.
	 */
	if (of_device_is_compatible(dev->of_node, "ti,j721e-sdhci-8bit"))
		host->mmc_host_ops.start_signal_voltage_switch = 
						sdhci_j721e_voltage_switch;
	......
}
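
To make the effect of the workaround easier to follow, here is a simplified userspace model of the voltage selection at the top of mmc_select_hs200() combined with the callback above. It is only a sketch: the struct, the SIG_* constants and the model_* functions are hypothetical stand-ins, and the model assumes the core restores the previous signal voltage when the callback fails.

```c
#include <errno.h>
#include <stddef.h>

/* Stand-ins for MMC_SIGNAL_VOLTAGE_330 / _180 / _120. */
enum { SIG_330 = 0, SIG_180 = 1, SIG_120 = 2 };

/* Minimal host model: current voltage plus the switch callback. */
struct model_host {
	int signal_voltage;
	int (*switch_cb)(int signal_voltage);
};

/* Models the workaround: accept 1.8 V without touching hardware. */
static int model_j721e_switch(int signal_voltage)
{
	return signal_voltage == SIG_180 ? 0 : -EINVAL;
}

/* Models the core helper: record the new voltage, undo it on failure. */
static int model_set_signal_voltage(struct model_host *host, int voltage)
{
	int old = host->signal_voltage;
	int err = 0;

	host->signal_voltage = voltage;
	if (host->switch_cb)
		err = host->switch_cb(voltage);
	if (err)
		host->signal_voltage = old;
	return err;
}

/* Models the HS200 voltage choice: try 1.2 V first, then 1.8 V. */
static int model_pick_hs200_voltage(struct model_host *host,
				    int has_1v2, int has_1v8)
{
	int err = -EINVAL;

	if (has_1v2)
		err = model_set_signal_voltage(host, SIG_120);
	if (err && has_1v8)
		err = model_set_signal_voltage(host, SIG_180);
	return err;
}
```

In this model a card that advertises both HS200 voltages ends up at 1.8 V with a return value of 0, because the 1.2 V attempt is rejected with -EINVAL and the 1.8 V attempt is accepted without any register write.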

Problem 2: CQE Recovery With Damaged Filesystem When Running Under Heavy Load In HS400 Mode

[   16.434022] mmc0: running CQE recovery
[   16.434354] ------------[ cut here ]------------
[   16.434358] mmc0: cqhci: spurious TCN for tag 9
[   16.434390] WARNING: CPU: 0 PID: 122 at drivers/mmc/host/cqhci.c:727 cqhci_irq+0x310/0x490
[   16.434392] Modules linked in: shared_memory(O) ti_k3_dsp_remoteproc ti_k3_r5_remoteproc
[   16.434400] CPU: 0 PID: 122 Comm: kworker/0:1H Tainted: G           O      5.4.74-g9574bba32a #2
[   16.434402] Hardware name: Texas Instruments K3 J721E SoC (DT)
[   16.434409] Workqueue: kblockd blk_mq_run_work_fn
[   16.434413] pstate: 40000085 (nZcv daIf -PAN -UAO)
[   16.434415] pc : cqhci_irq+0x310/0x490
[   16.434418] lr : cqhci_irq+0x310/0x490
[   16.434419] sp : ffff80001000fd20
[   16.434420] x29: ffff80001000fd20 x28: ffff000846f60580 
[   16.434423] x27: ffff000846fe5680 x26: 0000000000000001 
[   16.434425] x25: ffff800010d5de60 x24: ffff000846fe5698 
[   16.434428] x23: ffff800011f1a9a7 x22: ffff000846ba0000 
[   16.434430] x21: ffff000846f60000 x20: 0000000000000002 
[   16.434432] x19: 0000000000000009 x18: 0000000000000001 
[   16.434434] x17: 0000000000000000 x16: 0000000000000000 
[   16.434436] x15: ffff800011eaa000 x14: ffff800011f674e0 
[   16.434438] x13: 0000000000000000 x12: ffff800011f66000 
[   16.434440] x11: ffff800011eaa000 x10: ffff800011f66b30 
[   16.434442] x9 : 0000000000000000 x8 : 0000000000000000 
[   16.434445] x7 : 0000000000000004 x6 : 00000000000001cf 
[   16.434447] x5 : ffff00087fb81848 x4 : 0000000000000001 
[   16.434449] x3 : ffff00087fb81848 x2 : 0000000000000007 
[   16.434451] x1 : 919b97856fdf9800 x0 : 0000000000000000 
[   16.434454] Call trace:
[   16.434457]  cqhci_irq+0x310/0x490
[   16.434460]  sdhci_am654_cqhci_irq+0x54/0x80
[   16.434462]  sdhci_irq+0xa8/0xe94
[   16.434466]  __handle_irq_event_percpu+0x64/0x170
[   16.434469]  handle_irq_event_percpu+0x30/0x88
[   16.434471]  handle_irq_event+0x44/0xd8
[   16.434473]  handle_fasteoi_irq+0xb4/0x160
[   16.434475]  generic_handle_irq+0x24/0x38
[   16.434477]  __handle_domain_irq+0x60/0xb8
[   16.434480]  gic_handle_irq+0x5c/0x148
[   16.434482]  el1_irq+0xb8/0x180
[   16.434486]  _raw_spin_unlock_irqrestore+0x38/0x48
[   16.434489]  cqhci_request+0xac/0x4f0
[   16.434492]  mmc_cqe_start_req+0x50/0x60
[   16.434496]  mmc_blk_mq_issue_rq+0x450/0x7b0
[   16.434498]  mmc_mq_queue_rq+0x104/0x270
[   16.434500]  blk_mq_dispatch_rq_list+0xa4/0x5c8
[   16.434505]  blk_mq_sched_dispatch_requests+0xec/0x1d0
[   16.434507]  __blk_mq_run_hw_queue+0xa8/0x120
[   16.434509]  blk_mq_run_work_fn+0x1c/0x28
[   16.434513]  process_one_work+0x198/0x320
[   16.434515]  worker_thread+0x48/0x420
[   16.434518]  kthread+0x138/0x158
[   16.434521]  ret_from_fork+0x10/0x1c
[   16.434523] ---[ end trace 840021488594c02d ]---
[   16.441832] mmc0: running CQE recovery
[   16.448984] mmc0: running CQE recovery
[   16.458225] mmc0: running CQE recovery
[   16.460761] blk_update_request: I/O error, dev mmcblk0, sector 95232 op 0x1:(WRITE) flags 0x4800 phys_seg 8 prio class 0
[   16.471631] Buffer I/O error on dev mmcblk0p8, logical block 88, lost async page write
[   16.479588] Buffer I/O error on dev mmcblk0p8, logical block 89, lost async page write

Disabling CQ with the patch "mmc: core: Add MMC Command Queue Support kernel parameter" did not help.

Thanks

over 4 years ago

0 Shiou Mei Huang over 4 years ago

TI__Genius 10180 points

XA,

1) Does the system work after you modified the voltage switching sequence? What values are you driving to the eMMC device and voltage rail?

2) After you disabled CQE support, what failures did you see? Are you able to dump the registers at point of failure? Is the failure happening on multiple units and easily reproducible?

Best Regards,

Shiou Mei

0 xin alex over 4 years ago in reply to Shiou Mei Huang

Intellectual 506 points

Hi Shiou Mei

1. The system works normally when the eMMC signal voltage is not switched to 1.8V. The hardware supplies 3.3V to VCC and 1.8V to VCCQ:

VCC Supply (3.3V) = NAND interface I/O and NAND Flash power supply.

VCCQ Supply (1.8V) = e·MMC controller core and e·MMC interface I/O power supply.

2. After disabling CQE support, the following error occurs:

blk_update_request: I/O error, dev mmcblk0, sector 95232 op 0x1:(WRITE) flags 0x4800 phys_seg 8 prio class 0

It happens every time.

3. Supplementary remarks on Problem 2:

The J721E DRA829_TDA4VM_AM752x Processors Silicon Revision 1.1 document states that HS400 mode is supported.

But according to [errata of J721e SR1.0/SR1.1](https://www.ti.com/lit/er/sprz455a/sprz455a.pdf?ts=1631068529743&ref_url=https%253A%252F%252Fwww.google.com%252F), HS400 mode is not supported in MMCSD0 subsystem (i2024).

> **i2024** ***MMC/SD Peripherals Do Not Support HS400***
>
> **Details:** The MMCSD peripherals do not support the Multimedia Card HS400 mode.
>
> **Workaround(s):** None

$ cat /sys/devices/soc0/revision
SR1.0

Can you confirm whether eMMC HS400 mode is supported now?

4. We replaced the Samsung eMMC with a Micron eMMC. The Micron eMMC works normally in HS400 mode, and Problem 2 does not occur.
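
For anyone who wants to check the silicon revision from a program rather than with `cat`, a minimal sketch follows. It reads the first line of a text file such as /sys/devices/soc0/revision (the path used in item 3) and tests for an SR1.x string, since the quoted erratum covers SR1.0 and SR1.1. Both helper names are hypothetical.

```c
#include <stdio.h>
#include <string.h>

/*
 * Read the first line of a text file into buf and strip the newline.
 * Returns 0 on success, -1 if the file cannot be opened or read.
 */
static int read_first_line(const char *path, char *buf, size_t len)
{
	FILE *f;

	if (len == 0)
		return -1;
	f = fopen(path, "r");
	if (!f)
		return -1;
	if (!fgets(buf, (int)len, f)) {
		fclose(f);
		return -1;
	}
	fclose(f);
	buf[strcspn(buf, "\n")] = '\0';
	return 0;
}

/* Return 1 for revision strings of the form "SR1.x", 0 otherwise. */
static int is_sr1_x(const char *revision)
{
	return strncmp(revision, "SR1.", 4) == 0;
}
```

On the board from this thread, `read_first_line("/sys/devices/soc0/revision", buf, sizeof(buf))` would be expected to fill `buf` with the same SR1.0 string shown in item 3.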

Thanks

0 Shiou Mei Huang over 3 years ago in reply to xin alex

TI__Genius 10180 points

XA,

1 & 2. When do you see the error happening? Before u-boot, before kernel, or during kernel boot up?

3. We do not support HS400 on TDA4VM; this is documented in the datasheet.

4. Have you checked with Samsung why the issues could be happening? Are you OK proceeding with Micron eMMC or still would like to use Samsung eMMC in your system?

Best Regards,

Shiou Mei
