TDA4VM: eMMC HS200 mode not stable

jiale huang

Prodigy 50 points

Part Number: TDA4VM

Dear TI experts

After applying the HS200 patch (patchwork.kernel.org/.../), an error occurs intermittently. Details are as follows:

Could you give us some clues: what does "error -110" mean?

BTW: The eMMC device we use is KLMAG2GEUF-B04Q, and the configuration description is:

The nOK log:

Sometimes it gets stuck here:

And the OK log (same PCBA, same test conditions):

Thanks in advance, Best regards!

over 4 years ago

0 Vishal Mahaveer over 4 years ago

TI__Mastermind 33010 points

Hi Jiale,

The screenshots didn't come through in the original post, could you re-post them?

Regards,
Vishal

0 jiale huang over 4 years ago in reply to Vishal Mahaveer

Prodigy 50 points

After applying the HS200 patch (patchwork.kernel.org/.../), an error occurs intermittently. Details are as follows:

Could you give us some clues: what does "error -110" mean?

BTW: The eMMC device we use is KLMAG2GEUF-B04Q, and the configuration description is:

The nOK log:

Sometimes it gets stuck here:

And the OK log (same PCBA, same test conditions):

Thanks in advance, Best regards!

0 jiale huang over 4 years ago in reply to jiale huang

Prodigy 50 points

I suspect the root cause may be high CPU temperature, for these reasons:

1) After running the stress test without a heat sink, the CPU gets very hot and shuts down (self-protection). If we power off and then power on again while the CPU is still hot, the failure occurs.

2) If we cool the CPU with a fan and then power on, there is no failure.

3) If we heat the CPU with a hot air gun and then power on, the failure occurs.

The log halts at "waiting for root device PARTUUID=0f671c52-02..."

0 Vishal Mahaveer over 4 years ago in reply to jiale huang

TI__Mastermind 33010 points

Hi Jiale Huang,

Is this behavior consistent, i.e. does the failure happen only at high temperature?

-110 means timeout (-ETIMEDOUT). That error code alone doesn't give much information.
Could you reproduce the issue with CONFIG_MMC_DEBUG enabled in the kernel config?
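
For reference, the kernel reports errors as negated errno values, so it can help to map the number in the log back to a name. The helper below is only a sketch: the function name `decode_mmc_err` is made up for illustration, and the table covers just a few codes that commonly show up in MMC logs, with values taken from the Linux asm-generic errno headers.

```shell
#!/bin/sh
# decode_mmc_err: hypothetical helper (not part of any kernel tool).
# Maps a kernel error code such as -110 to its errno name.
# Values are from Linux include/uapi/asm-generic/errno-base.h and errno.h.
decode_mmc_err() {
    # Strip a leading minus sign so both "-110" and "110" are accepted.
    case "${1#-}" in
        5)   echo "EIO" ;;        # generic I/O error
        84)  echo "EILSEQ" ;;     # used by the MMC core for CRC errors
        110) echo "ETIMEDOUT" ;;  # command or data timeout
        123) echo "ENOMEDIUM" ;;  # no card present
        *)   echo "unknown" ;;
    esac
}

decode_mmc_err -110
```

Run as written, this prints `ETIMEDOUT`, i.e. the controller did not get a response in time.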

Regards,
Vishal

0 Vishal Mahaveer over 4 years ago in reply to Vishal Mahaveer

TI__Mastermind 33010 points

Hi Jiale Huang,

In addition to enabling the CONFIG_MMC_DEBUG option, could you also apply the debug patch below and share the logs when you reproduce the issue?

diff --git a/drivers/mmc/host/sdhci.c b/drivers/mmc/host/sdhci.c
index 50514fedbc76..695cff3e1d4b 100644
--- a/drivers/mmc/host/sdhci.c
+++ b/drivers/mmc/host/sdhci.c
@@ -1347,6 +1347,8 @@ void sdhci_send_command(struct sdhci_host *host, struct mmc_command *cmd)
 	u32 mask;
 	unsigned long timeout;
 
+	pr_err("cmd:%d\n", cmd->opcode);
+
 	WARN_ON(host->cmd);
 
 	/* Initially, a command has no error */
@@ -3061,7 +3063,7 @@ static irqreturn_t sdhci_irq(int irq, void *dev_id)
 	}
 
 	do {
-		DBG("IRQ status 0x%08x\n", intmask);
+		pr_err("IRQ status 0x%08x\n", intmask);
 
 		if (host->ops->irq) {
 			intmask = host->ops->irq(host, intmask);

A few additional questions:
- Is this behavior seen on multiple units? If so, how many?
- Could you provide more information on the temperature? At what temperature does the issue happen?

Regards,
Vishal

0 jiale huang over 4 years ago in reply to Vishal Mahaveer

Prodigy 50 points

Hi, Vishal

Many thanks for your support. Last Friday, we enabled CONFIG_MMC_DEBUG and captured the error log as follows:

We will do further analysis and debug.

0 Vishal Mahaveer over 4 years ago in reply to jiale huang

TI__Mastermind 33010 points

Hi Jiale Huang,

The images you are uploading are not coming through.
Could you follow the guidelines here to upload images to E2E: https://e2e.ti.com/support/site-support/f/1024/t/761613

Also, when the issue happens at high temperature, could you run the commands below and read the temperature sensor registers?

devmem2 0x42050300 write 0x50
devmem2 0x42050320 write 0x50
devmem2 0x42050340 write 0x50
devmem2 0x42050360 write 0x50
devmem2 0x42050380 write 0x50
devmem2 0x42040308
devmem2 0x42040328
devmem2 0x42040348
devmem2 0x42040368
devmem2 0x42040388
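
The ten commands above follow a regular pattern: five registers spaced 0x20 apart starting at 0x42050300 are written with 0x50, then five registers spaced 0x20 apart starting at 0x42040308 are read back. As a sketch, the same sequence can be generated in a loop. The function name `tempsensor_cmds` is made up for illustration, and the script only prints the commands (a dry run), so it does not touch hardware.

```shell
#!/bin/sh
# tempsensor_cmds: hypothetical helper that prints the devmem2 sequence.
# Base addresses, the 0x20 stride, the count of 5 and the 0x50 value are
# all taken from the command list in this post.
tempsensor_cmds() {
    i=0
    while [ "$i" -lt 5 ]; do
        # Write step: 0x42050300, 0x42050320, ... 0x42050380
        printf 'devmem2 0x%08x write 0x50\n' $((0x42050300 + $i * 0x20))
        i=$((i + 1))
    done
    i=0
    while [ "$i" -lt 5 ]; do
        # Read step: 0x42040308, 0x42040328, ... 0x42040388
        printf 'devmem2 0x%08x\n' $((0x42040308 + $i * 0x20))
        i=$((i + 1))
    done
}

tempsensor_cmds
```

On the target, piping the output into a shell (`tempsensor_cmds | sh`) should execute the commands, assuming devmem2 is installed and the script runs as root. Capturing the output while the board is hot keeps the register values alongside the failure log.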

Regards,
Vishal

0 jiale huang over 4 years ago in reply to Vishal Mahaveer

Prodigy 50 points

Hi, Vishal

Over the past two days, we tested two PCBAs. Both fail when the temperature is high (running a DDR test without a heat sink or cooling fan).

Below is the failure log of PCBA #1:

Below is the failure log of PCBA #2 (file system down):

I will inform you of any progress in the analysis.

Best regards! :)
