TDA4VM: eMMC HS200 mode not stable

Part Number: TDA4VM
Dear TI experts
After patching the HS200 firmware (patchwork.kernel.org/.../), an error occurs intermittently. Details are as follows:
Could you give us some clues: what does "error -110" mean?
BTW: The eMMC device we use is KLMAG2GEUF-B04Q, and the configuration description is:
The nOK log:
Sometimes it gets stuck here:
and the OK log is (same PCBA, same test conditions):
Thanks in advance, best regards!

  • Hi Jiale,

    The screenshots didn't come through in the original post. Could you re-post them?

    Regards,
    Vishal

  • After patching the HS200 firmware (patchwork.kernel.org/.../), an error occurs intermittently. Details are as follows:
    Could you give us some clues: what does "error -110" mean?
    BTW: The eMMC device we use is KLMAG2GEUF-B04Q, and the configuration description is:

    The nOK log:

    Sometimes it gets stuck here:


    and the OK log is (same PCBA, same test conditions):



    Thanks in advance, best regards!


  • I suspect the root cause may be high CPU temperature, for these reasons:

    1) After running the stress test without a heat sink, the CPU gets very hot and shuts down (self-protection). If the board is powered off and then on again while the CPU is still hot, the failure occurs.

    2) If the CPU is cooled with a fan before power-on, no failure occurs.

    3) If the CPU is heated with a hot air gun before power-on, the failure occurs.

    The log halts at "waiting for root device PARTUUID=0f671c52-02..."

  • Hi Jiale Huang,

    Is this behavior consistent?
    Does the failure happen only at high temperature?

    -110 is -ETIMEDOUT, i.e. a timeout. On its own, that error code doesn't give much information.
    Could you reproduce the issue with CONFIG_MMC_DEBUG enabled in the kernel config?
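
To make the "-110" reading concrete, here is a minimal sketch that maps the negative error codes typically seen in MMC kernel messages to their Linux errno names. The table is hard-coded with Linux generic errno values, and the helper name `describe_mmc_error` is hypothetical; it is only an illustration, not part of any TI or kernel tool.

```python
# Linux generic errno values that the MMC core commonly reports.
# Hard-coded so the lookup gives Linux values on any host OS.
MMC_ERRNO = {
    5: ("EIO", "I/O error"),
    84: ("EILSEQ", "used by the MMC core for CRC / end-bit / index errors"),
    110: ("ETIMEDOUT", "command or data phase timed out"),
    123: ("ENOMEDIUM", "no card or device present"),
}


def describe_mmc_error(code):
    """Return a short description for a kernel-style error code such as -110."""
    name, meaning = MMC_ERRNO.get(abs(code), ("UNKNOWN", "not in this table"))
    return f"{code}: {name} ({meaning})"


if __name__ == "__main__":
    print(describe_mmc_error(-110))
```

A timeout only says that the host controller gave up waiting; it does not say whether the command, the response, or the data phase stalled, which is why more detailed logs are needed.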

    Regards,
    Vishal

  • Hi Jiale Huang,

    In addition to enabling the CONFIG_MMC_DEBUG option, could you also apply the debug patch below and share the logs when you reproduce the issue?

    diff --git a/drivers/mmc/host/sdhci.c b/drivers/mmc/host/sdhci.c
    index 50514fedbc76..695cff3e1d4b 100644
    --- a/drivers/mmc/host/sdhci.c
    +++ b/drivers/mmc/host/sdhci.c
    @@ -1347,6 +1347,8 @@ void sdhci_send_command(struct sdhci_host *host, struct mmc_command *cmd)
     	u32 mask;
     	unsigned long timeout;
     
    +	pr_err("cmd:%d\n", cmd->opcode);
    +
     	WARN_ON(host->cmd);
     
     	/* Initially, a command has no error */
    @@ -3061,7 +3063,7 @@ static irqreturn_t sdhci_irq(int irq, void *dev_id)
     	}
     
     	do {
    -		DBG("IRQ status 0x%08x\n", intmask);
    +		pr_err("IRQ status 0x%08x\n", intmask);
     
     		if (host->ops->irq) {
     			intmask = host->ops->irq(host, intmask);

    A few additional questions:
    - Is this behavior seen on multiple units? If so, how many?
    - Could you provide more information on the temperature? At what temperature does the issue happen?
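
Once the debug patch above is applied, the kernel log will contain lines of the form "IRQ status 0x%08x". The following is a rough sketch of how those values could be decoded offline. The bit names follow the SDHCI_INT_* definitions in the Linux `sdhci.h` header (exact names can vary slightly between kernel versions); the function names and the sample value in the demo are hypothetical and are not taken from the logs in this thread.

```python
import re

# Interrupt status bits, named after SDHCI_INT_* in drivers/mmc/host/sdhci.h.
SDHCI_INT_BITS = {
    0x00000001: "RESPONSE",
    0x00000002: "DATA_END",
    0x00000004: "BLK_GAP",
    0x00000008: "DMA_END",
    0x00000010: "SPACE_AVAIL",
    0x00000020: "DATA_AVAIL",
    0x00000040: "CARD_INSERT",
    0x00000080: "CARD_REMOVE",
    0x00000100: "CARD_INT",
    0x00001000: "RETUNE",
    0x00004000: "CQE",
    0x00008000: "ERROR",
    0x00010000: "TIMEOUT",
    0x00020000: "CRC",
    0x00040000: "END_BIT",
    0x00080000: "INDEX",
    0x00100000: "DATA_TIMEOUT",
    0x00200000: "DATA_CRC",
    0x00400000: "DATA_END_BIT",
    0x00800000: "BUS_POWER",
    0x01000000: "AUTO_CMD_ERR",
    0x02000000: "ADMA_ERROR",
}

# Matches the format string used by the debug patch.
IRQ_LINE = re.compile(r"IRQ status (0x[0-9a-fA-F]{8})")


def decode_irq_status(intmask):
    """Return the names of all known bits set in an interrupt status value."""
    return [name for bit, name in SDHCI_INT_BITS.items() if intmask & bit]


def decode_log(lines):
    """Yield (value, bit names) for every 'IRQ status' line in a log."""
    for line in lines:
        match = IRQ_LINE.search(line)
        if match:
            value = int(match.group(1), 16)
            yield value, decode_irq_status(value)


if __name__ == "__main__":
    # Hypothetical sample line, for illustration only.
    for value, names in decode_log(["[    2.000000] IRQ status 0x00018000"]):
        print(f"0x{value:08x}: {', '.join(names)}")
```

Pairing each decoded status with the preceding "cmd:%d" line shows which command was in flight when a TIMEOUT or CRC bit was raised.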

    Regards,
    Vishal

    Hi Vishal,

    Many thanks for your support. Last Friday, we enabled CONFIG_MMC_DEBUG and captured the error log as follows:

    We will do further analysis and debugging.

  • Hi Jiale Huang,

    The images you are uploading are not coming through.
    Could you follow the guidelines here to upload images to E2E: https://e2e.ti.com/support/site-support/f/1024/t/761613


    Also, when the issue happens at high temperature, could you run the commands below and read the temp sensor registers?

    devmem2 0x42050300 write 0x50
    devmem2 0x42050320 write 0x50
    devmem2 0x42050340 write 0x50
    devmem2 0x42050360 write 0x50
    devmem2 0x42050380 write 0x50
    devmem2 0x42040308
    devmem2 0x42040328
    devmem2 0x42040348
    devmem2 0x42040368
    devmem2 0x42040388
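
If the readings need to be captured repeatedly while the board heats up, the command sequence above can be scripted. This is only a sketch: the addresses, stride, and write value are copied from the commands above, while the helper names `build_commands` and `run_all` are hypothetical. It assumes `devmem2` is installed on the target and that the script runs as root. It collects the raw `devmem2` output and does not attempt to convert register values to temperatures.

```python
import shutil
import subprocess

# Values taken from the devmem2 commands listed above.
WRITE_BASE = 0x42050300
READ_BASE = 0x42040308
STRIDE = 0x20
SENSOR_COUNT = 5
WRITE_VALUE = 0x50


def build_commands():
    """Return the ten devmem2 invocations: five writes, then five reads."""
    writes = [
        ["devmem2", f"0x{WRITE_BASE + i * STRIDE:08x}", "write", f"0x{WRITE_VALUE:x}"]
        for i in range(SENSOR_COUNT)
    ]
    reads = [
        ["devmem2", f"0x{READ_BASE + i * STRIDE:08x}"]
        for i in range(SENSOR_COUNT)
    ]
    return writes + reads


def run_all():
    """Run every command on the target and return the raw devmem2 output."""
    if shutil.which("devmem2") is None:
        raise RuntimeError("devmem2 not found; run this on the target board")
    outputs = []
    for cmd in build_commands():
        result = subprocess.run(cmd, capture_output=True, text=True, check=True)
        outputs.append(result.stdout)
    return outputs


if __name__ == "__main__":
    # Print the commands only; call run_all() on the target to execute them.
    for cmd in build_commands():
        print(" ".join(cmd))
```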


    Regards,
    Vishal

    Hi Vishal,

    Over the past two days, we tested two PCBAs. Both fail when the temperature is high (running the DDR test without a heat sink or cooling fan).

    Below is the failure log of PCBA #1:

    Below is the failure log of PCBA #2 (file system down):

    I will inform you of any progress in the analysis.

    Best regards! :)