TDA4VE-Q1: eMMC command timeout

Part Number: TDA4VE-Q1


Hi there,

We are using the J721S2 with SDK 8.4.

During a temperature stress test, we got the following eMMC error from the Linux kernel:

[87489.660360] mmc0: cqhci: timeout for tag 27

It appears to be an eMMC access timeout. We would like to know whether this timeout depends on the timeout value configured in the SDHCI registers.

The current setting is:

mmc0: sdhci: Timeout:   0x0000000e

If it does not, where can we find the timeout value for cqhci?

We would like to extend the cqhci timeout value.

Thanks,

Sean

=========error log from Linux==================================

[87428.456913] mmc0: running CQE recovery
[87489.660360] mmc0: cqhci: timeout for tag 27
[87489.665585] mmc0: cqhci: ============ CQHCI REGISTER DUMP ===========
[87489.673620] mmc0: cqhci: Caps:      0x000030c8 | Version:  0x00000510
[87489.681654] mmc0: cqhci: Config:    0x00000001 | Control:  0x00000000
[87489.689685] mmc0: cqhci: Int stat:  0x00000000 | Int enab: 0x00000006
[87489.697716] mmc0: cqhci: Int sig:   0x00000006 | Int Coal: 0x00000000
[87489.705748] mmc0: cqhci: TDL base:  0x00000000 | TDL up32: 0x00000000
[87489.713780] mmc0: cqhci: Doorbell:  0x30000000 | TCN:      0x00000000
[87489.721814] mmc0: cqhci: Dev queue: 0x00000000 | Dev Pend: 0x00000000
[87489.729846] mmc0: cqhci: Task clr:  0x00000000 | SSC1:     0x00011000
[87489.737877] mmc0: cqhci: SSC2:      0x00000000 | DCMD rsp: 0x00000000
[87489.745908] mmc0: cqhci: RED mask:  0xfdf9a080 | TERRI:    0x00001d2c
[87489.753939] mmc0: cqhci: Resp idx:  0x00000000 | Resp arg: 0x00000000
[87489.761969] mmc0: sdhci: ============ SDHCI REGISTER DUMP ===========
[87489.770001] mmc0: sdhci: Sys addr:  0x00000000 | Version:  0x00001004
[87489.778033] mmc0: sdhci: Blk size:  0x00007200 | Blk cnt:  0x00000000
[87489.786067] mmc0: sdhci: Argument:  0x00000001 | Trn mode: 0x00000000
[87489.794099] mmc0: sdhci: Present:   0x01ff00f0 | Host ctl: 0x00000018
[87489.802130] mmc0: sdhci: Power:     0x0000000b | Blk gap:  0x00000080
[87489.810161] mmc0: sdhci: Wake-up:   0x00000000 | Clock:    0x0000fa07
[87489.818192] mmc0: sdhci: Timeout:   0x0000000e | Int stat: 0x00000000
[87489.826224] mmc0: sdhci: Int enab:  0x02ff4000 | Sig enab: 0x02ff4000
[87489.834255] mmc0: sdhci: ACmd stat: 0x00000000 | Slot int: 0x00000000
[87489.842285] mmc0: sdhci: Caps:      0x7cecc801 | Caps_1:   0x98002407
[87489.850317] mmc0: sdhci: Cmd:       0x00003013 | Max curr: 0x00000000
[87489.858348] mmc0: sdhci: Resp[0]:   0x00ff8080 | Resp[1]:  0x00000000
[87489.866379] mmc0: sdhci: Resp[2]:   0x00000000 | Resp[3]:  0x00000000
[87489.874409] mmc0: sdhci: Host ctl2: 0x00000000
[87489.879947] mmc0: sdhci: ADMA Err:  0x00000000 | ADMA Ptr: 0x0000000000000000
[87489.888844] mmc0: sdhci: ============================================
[87489.896884] mmc0: running CQE recovery
[87489.902748] blk_update_request: I/O error, dev mmcblk0, sector 4824016 op 0x0:(READ) flags 0x0 phys_seg 1 prio class 0
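
A small sketch for working with the log above: it extracts the kernel timestamps to measure the gap between the first "running CQE recovery" line and the tag 27 timeout, and pulls register name/value pairs out of a dump line. The helper names (`parse_lines`, `parse_registers`) are hypothetical, and the regular expressions assume only the line layout seen in this log.

```python
import re
from decimal import Decimal

# Lines copied from the log above.
LOG = """\
[87428.456913] mmc0: running CQE recovery
[87489.660360] mmc0: cqhci: timeout for tag 27
[87489.818192] mmc0: sdhci: Timeout:   0x0000000e | Int stat: 0x00000000
"""

# "[seconds.microseconds] message"
LINE_RE = re.compile(r"^\[\s*(\d+\.\d+)\]\s+(.*)$")
# "Name: 0x..." pairs as printed in the CQHCI/SDHCI register dumps
REG_RE = re.compile(r"([A-Za-z][\w\[\] -]*?):\s+(0x[0-9a-fA-F]+)")


def parse_lines(text):
    """Return (timestamp, message) pairs for every kernel log line in text."""
    entries = []
    for line in text.splitlines():
        match = LINE_RE.match(line.strip())
        if match:
            entries.append((Decimal(match.group(1)), match.group(2)))
    return entries


def parse_registers(message):
    """Return {register name: value} for the 'Name: 0x...' pairs in a dump line."""
    return {name.strip(): int(value, 16) for name, value in REG_RE.findall(message)}


entries = parse_lines(LOG)
recovery_ts = next(ts for ts, msg in entries if "running CQE recovery" in msg)
timeout_ts = next(ts for ts, msg in entries if "timeout for tag" in msg)
gap = timeout_ts - recovery_ts
print(f"gap between recovery and timeout: {gap} s")

registers = parse_registers(entries[2][1])
print(registers)
```

On these lines the gap comes out at 61.203447 s. That would be consistent with a software request timeout of roughly 60 s rather than a hardware register timeout, but this is an inference from two timestamps and is not confirmed anywhere in this thread.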

  • Hi Sean,

    At what temperature are you seeing this timeout? Have you checked the eMMC datasheet for the maximum supported temperature?

    Best Regards,

    Keerthy 

  • Hi Keerthy,

    We put our board into an oven with the ambient temperature set to 85 °C. That does not appear to exceed the eMMC's maximum temperature.

    We would like to do more testing, for example by extending the eMMC command timeout.

    Does cqhci send commands through SDHCI, so that the timeout is the one set in the SDHCI register? If not, how can we extend the timeout behind the error log below for testing?

    Thanks,

    Sean

    ============cqhci timeout error log===========

    [87489.660360] mmc0: cqhci: timeout for tag 27
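
For reference, a sketch of how the `Timeout: 0x0000000e` value from the SDHCI dump is usually interpreted. In the SD Host Controller specification, bits 3:0 of the Timeout Control register select the data timeout as TMCLK x 2^(13+n), with 0xF reserved. The timeout clock frequency passed in below (`tmclk_hz`) is a hypothetical input, not a value taken from this board, and nothing in this thread shows that this register governs the "cqhci: timeout for tag" message.

```python
def sdhci_data_timeout_cycles(timeout_reg: int) -> int:
    """Return the SDHCI data timeout length in timeout-clock (TMCLK) cycles."""
    counter = timeout_reg & 0xF  # bits 3:0: Data Timeout Counter Value
    if counter == 0xF:
        raise ValueError("counter value 0xF is reserved")
    return 1 << (13 + counter)  # 0x0 -> 2^13 cycles, 0xE -> 2^27 cycles


def sdhci_data_timeout_seconds(timeout_reg: int, tmclk_hz: float) -> float:
    """Convert the cycle count to seconds for a given timeout clock frequency."""
    if tmclk_hz <= 0:
        raise ValueError("tmclk_hz must be positive")
    return sdhci_data_timeout_cycles(timeout_reg) / tmclk_hz


cycles = sdhci_data_timeout_cycles(0x0000000E)
print(cycles)  # 134217728, i.e. 2^27, the largest non-reserved setting

# Hypothetical 1 MHz timeout clock, for illustration only.
print(sdhci_data_timeout_seconds(0x0000000E, 1_000_000))
```

Since 0xE is already the largest non-reserved setting, this field leaves no headroom to extend, whatever the timeout clock turns out to be.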

  • Hi Sean,

    CQ is a command engine to which we submit requests for command execution.

    Can you help us with this information?

    Regards
    Diwakar

  • Hi Diwakar,

    Please find my replies below.

    Thanks,

    Sean

    ======================================================================================

    CQ is a command engine to which we submit requests for command execution.

    [Sean] Does this CQ HW module send eMMC commands via SDHCI? If so, does it use the command timeout setting in the SDHCI module?

           We are eager to know the timeout value that applies to a CQHCI command response from the eMMC device.

    Can you help us with this information?

    • Which eMMC part are you using?

            [Sean] We use 'mmc0: SDHCI controller on 4f80000.mmc [4f80000.mmc] using ADMA 64-bit', which is defined in your original Linux device tree.

    • Can you share the complete logs, and also enable more logging in the mmc driver to get a better picture?

            [Sean] Unfortunately, we only copied the log when it crashed, so that is all the log we have for that crash.

    • What speed mode are you testing with?

            [Sean] I powered on the test board again and got the following:

                   "mmc0: new HS200 MMC card at address 0001"

    • What boot flow are you using?

            [Sean] It was booted from eMMC (the images for all cores are stored on that device).

    • How many samples have this issue?

            [Sean] 2 sets. We have run the same test several times but only encountered the problem that one time.

    • Have you tried capturing waveforms of the CMD, DATA and CLOCK lines when the issue occurs?

            [Sean] No.

            [Sean] We have not changed any settings from your original release, except for disabling the HS400 feature in the device tree.

  • Hi Diwakar,

    I want to correct one of my replies: only one set of our in-house boards has been tested with this issue.

    Regards,

    Sean

  • Hi Sean,

    [Sean] Does this CQ HW module send eMMC commands via SDHCI? If so, does it use the command timeout setting in the SDHCI module?

           We are eager to know the timeout value that applies to a CQHCI command response from the eMMC device.

    • The command queue engine is part of the host controller. You queue all the commands to it, which eliminates software overhead and speeds up processing. When an error occurs in the eMMC host controller, the controller notifies the CQ engine, so we need to check which error the host controller reported.

    • By device, I mean: which eMMC part are you using?
    • In order to analyse the issue, we need the full logs.
    • Have you tried replacing the eMMC part on the affected board? Most likely the issue is with the eMMC device.

    Regards
    Diwakar

  • Hi Diwakar,

    Please see my replies below.

    Regards,

    Sean

    ==================================================================

    • By device, I mean: which eMMC part are you using?

            [Sean] The eMMC device that we use is:

                    Product Number: PTE7A0MI-16GX

                    Vendor: Phison

    • In order to analyse the issue, we need the full logs.

            [Sean] We are running the same test right now and saving the log at the same time. We will provide the log once we encounter the issue.

    • Have you tried replacing the eMMC part on the affected board? Most likely the issue is with the eMMC device.

            [Sean] Understood, but we have not replaced it.

                   For now, we are running the same test on more boards.

  • Hi Diwakar,

    How do we configure the eMMC speed mode to DDR50 or SDR50?

  • Hi Terry,

    The higher speed modes (HS400, HS200) operate at 1.8 V.

    If you add the "no-1-8-v" property under your eMMC node in the dtsi, you will be able to use the DDR50 speed mode, because Linux tries to select the highest speed mode that the device supports and that is still allowed.

    Regards
    Diwakar

  • Hi Diwakar,

    Thanks for your reply.

    How do we enable the eMMC debug log on the J721S2?

  • Hi Terry 

    You need to increase the Linux kernel log level in the bootargs.

    Please refer to: https://www.kernel.org/doc/html/v4.14/admin-guide/kernel-parameters.html

    Regards
    Diwakar

  • How can we check whether the eMMC debug log has been enabled successfully?

  • Hi Terry,

    Once you increase the log level, you should see more logs printed on the console, including debug logs. You can grep for the mmc logs in dmesg to analyse the issue.

    Regards
    Diwakar
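
As an illustration of the "grep for the mmc logs" step, here is a minimal sketch that filters captured dmesg text for MMC-related lines, roughly what `dmesg | grep mmc` does on the target. The function name `mmc_lines` and the pattern are assumptions chosen for this example. The first three sample lines are taken from the log earlier in this thread, and the last one is a made-up unrelated line.

```python
import re

# Matches host names such as mmc0, block devices such as mmcblk0,
# and the cqhci/sdhci driver prefixes seen in the register dumps.
MMC_RE = re.compile(r"\b(?:mmc\d+|mmcblk\d+|cqhci|sdhci)\b")


def mmc_lines(dmesg_text):
    """Return only the lines of dmesg_text that mention the MMC subsystem."""
    return [line for line in dmesg_text.splitlines() if MMC_RE.search(line)]


SAMPLE = """\
[87489.660360] mmc0: cqhci: timeout for tag 27
[87489.896884] mmc0: running CQE recovery
[87489.902748] blk_update_request: I/O error, dev mmcblk0, sector 4824016 op 0x0:(READ) flags 0x0 phys_seg 1 prio class 0
[87489.950000] eth0: link is up (hypothetical unrelated line)
"""

for line in mmc_lines(SAMPLE):
    print(line)
```

Comparing the number of matching lines before and after changing the log level is one simple way to check whether additional mmc messages are actually being emitted.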