Hi Jiale,
The screenshots didn't come through in the original post. Could you re-post them?
Regards,
Vishal
I suspect the root cause is high CPU temperature, for these reasons:
1) After running the stress test without a heat sink, the CPU gets very hot and shuts down (self-protection). If I power off and then power on while the CPU is still hot, the failure occurs.
2) If I cool the CPU with a fan and then power on, there is no failure.
3) If I heat the CPU with a hot air gun and then power on, the failure occurs.
The log halts at "waiting for root device PARTUUID=0f671c52-02..."
Hi Jiale Huang,
Is this behavior consistent?
Failure happens only with high temperature.
-110 is -ETIMEDOUT, i.e. a timeout. That error code alone doesn't give much information.
Could you reproduce the issue with CONFIG_MMC_DEBUG enabled in the kernel config?
Regards,
Vishal
Hi Jiale Huang,
In addition to enabling the CONFIG_MMC_DEBUG option, could you also apply the debug patch below and share the logs when you reproduce the issue?
diff --git a/drivers/mmc/host/sdhci.c b/drivers/mmc/host/sdhci.c
index 50514fedbc76..695cff3e1d4b 100644
--- a/drivers/mmc/host/sdhci.c
+++ b/drivers/mmc/host/sdhci.c
@@ -1347,6 +1347,8 @@ void sdhci_send_command(struct sdhci_host *host, struct mmc_command *cmd)
 	u32 mask;
 	unsigned long timeout;
 
+	pr_err("cmd:%d\n", cmd->opcode);
+
 	WARN_ON(host->cmd);
 
 	/* Initially, a command has no error */
@@ -3061,7 +3063,7 @@ static irqreturn_t sdhci_irq(int irq, void *dev_id)
 	}
 
 	do {
-		DBG("IRQ status 0x%08x\n", intmask);
+		pr_err("IRQ status 0x%08x\n", intmask);
 
 		if (host->ops->irq) {
 			intmask = host->ops->irq(host, intmask);
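The "IRQ status 0x..." values printed by this patch are raw SDHCI interrupt status words, so it can help to translate them into bit names when reading the log. The following is only a minimal sketch: decode_sdhci_irq is a hypothetical helper name, and the table covers only a subset of the SDHCI_INT_* masks defined in drivers/mmc/host/sdhci.h.

```shell
#!/bin/sh
# decode_sdhci_irq (hypothetical helper): print the names of the bits set
# in an SDHCI interrupt status word, e.g. a value taken from an
# "IRQ status 0x..." log line. Masks follow SDHCI_INT_* in
# drivers/mmc/host/sdhci.h; only a subset is listed here.
decode_sdhci_irq() {
    status=$(($1))   # accepts hex (0x...) or decimal input
    names=""
    for entry in \
        0x00000001:RESPONSE 0x00000002:DATA_END 0x00000008:DMA_END \
        0x00000010:SPACE_AVAIL 0x00000020:DATA_AVAIL 0x00008000:ERROR \
        0x00010000:TIMEOUT 0x00020000:CRC 0x00040000:END_BIT \
        0x00080000:INDEX 0x00100000:DATA_TIMEOUT 0x00200000:DATA_CRC \
        0x00400000:DATA_END_BIT 0x02000000:ADMA_ERROR
    do
        mask=${entry%%:*}   # text before the colon
        name=${entry#*:}    # text after the colon
        if [ $((status & $mask)) -ne 0 ]; then
            names="${names:+$names }$name"
        fi
    done
    printf '%s\n' "$names"
}
```

For example, "decode_sdhci_irq 0x00018000" prints "ERROR TIMEOUT", which is the command timeout case that would be consistent with a -110 return code.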
A few additional questions:
- Is this behavior seen on multiple units? If so, how many?
- Could you provide more info on the temperature? At what temperature does the issue happen?
Regards,
Vishal
Hi Vishal,
Many thanks for your support. Last Friday we enabled CONFIG_MMC_DEBUG and captured the error log as follows:
We will do further analysis and debugging.
Hi Jiale Huang,
The images you are uploading are not coming through.
Could you follow the guidelines here to upload images to E2E? https://e2e.ti.com/support/site-support/f/1024/t/761613
Also, when the issue happens at high temperature, could you run the commands below to read the temperature sensor registers?
devmem2 0x42050300 write 0x50
devmem2 0x42050320 write 0x50
devmem2 0x42050340 write 0x50
devmem2 0x42050360 write 0x50
devmem2 0x42050380 write 0x50
devmem2 0x42040308
devmem2 0x42040328
devmem2 0x42040348
devmem2 0x42040368
devmem2 0x42040388
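The ten commands above follow a regular pattern: five writes at a stride of 0x20 starting from 0x42050300, then five reads at the same stride starting from 0x42040308. If typing them by hand on a hot board is awkward, a small generator along these lines may help. It is only a sketch: sensor_cmds is a hypothetical helper name, and the addresses, stride, and value are copied from the commands above rather than from a datasheet.

```shell
#!/bin/sh
# sensor_cmds (hypothetical helper): print the ten devmem2 command lines
# listed above, one per line, without executing them.
sensor_cmds() {
    # Five writes of 0x50, 0x20 apart, starting at 0x42050300.
    n=0
    while [ "$n" -lt 5 ]; do
        printf 'devmem2 0x%08x write 0x50\n' $((0x42050300 + n * 0x20))
        n=$((n + 1))
    done
    # Five reads, 0x20 apart, starting at 0x42040308.
    n=0
    while [ "$n" -lt 5 ]; do
        printf 'devmem2 0x%08x\n' $((0x42040308 + n * 0x20))
        n=$((n + 1))
    done
}
```

Running "sensor_cmds" alone only prints the commands for review. Running "sensor_cmds | sh" on the target would execute them, assuming devmem2 is installed there.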
Regards,
Vishal