
TPS6594-Q1: TPS6594-Q1 problem

Part Number: TPS6594-Q1
Other Parts Discussed in Thread: TDA4VM

Hi engineers,

     

Issue description:

We use the LEO power solution (TPS65941212 + TPS65941111) for the TDA4VM in our application. Occasionally a fault occurs in which the board gets stuck drawing 31 mA during the power-on stage.

Root cause analysis: 

Step 1. Reproducing the fault

To reproduce the fault quickly, we stop feeding the WDG 1 s after a successful power-up. The WDG is fed ~20 times and then feeding stops, so the system warm-resets periodically. The failure rate is not consistent: sometimes the fault reproduces within several cycles, and sometimes it takes a day. Roughly one in 50 boards is affected.

Step 2. When the fault occurred, we read the PMIC registers listed below. The PMICs had entered orderly shutdown mode, and the SPMI_ERR_INT flag was set.

  • PMIC-A
    • INT_MISC (0x66)=00h
    • INT_MODERATE_ERR (0x67)=08h
    • INT_FSM_ERR (0x69)=82h
  • PMIC-B: 
    • INT_MISC (0x66)=01h
    • INT_MODERATE_ERR (0x67)=10h
    • INT_FSM_ERR (0x69)=02h
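As a reading aid for the dump above, here is a minimal sketch that turns the interrupt bytes into flag names. The bit-to-name tables are assumptions based on the TPS6594-Q1 register map and should be checked against the datasheet revision in use; `decode_flags` is a hypothetical helper, not part of any TI driver.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* ASSUMED bit names (index 0 = bit 0); verify against the datasheet. */
static const char *const INT_MODERATE_ERR_BITS[8] = {
    "TSD_ORD_INT", "BIST_FAIL_INT", "REG_CRC_ERR_INT", "RECOV_CNT_INT",
    "SPMI_ERR_INT", "NPWRON_LONG_INT", "NINT_READBACK_INT",
    "NRSTOUT_READBACK_INT"
};
static const char *const INT_FSM_ERR_BITS[8] = {
    "IMM_SHUTDOWN_INT", "ORD_SHUTDOWN_INT", "MCU_PWR_ERR_INT",
    "SOC_PWR_ERR_INT", "COMM_ERR_INT", "READBACK_ERR_INT", "ESM_INT",
    "WD_INT"
};

/* Write the names of all set bits of `value` into `out`, lowest bit first,
 * separated by '|'. Returns the number of set bits. */
static int decode_flags(uint8_t value, const char *const names[8],
                        char *out, size_t out_size)
{
    int count = 0;

    assert(out != NULL && out_size > 0);
    out[0] = '\0';
    for (int bit = 0; bit < 8; ++bit) {
        if (value & (1u << bit)) {
            if (count > 0)
                strncat(out, "|", out_size - strlen(out) - 1);
            strncat(out, names[bit], out_size - strlen(out) - 1);
            ++count;
        }
    }
    return count;
}
```

Under those assumed tables, PMIC-A's 0x08 in INT_MODERATE_ERR reads as RECOV_CNT_INT and PMIC-B's 0x10 as SPMI_ERR_INT, while 0x82 in INT_FSM_ERR reads as ORD_SHUTDOWN_INT plus WD_INT.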

Step 3. We monitored the SPMI_SCLK/SPMI_DATA waveforms with an oscilloscope, and they looked clean.

 

My question: 

Why was the SPMI_ERR_INT flag set, and what is the root cause in my case?

  • Does this only happen in the presence of a watchdog failure?

    After a warm reset induced by WDG fail, do you clear the recovery counter after a successful reset?

  • Thanks to Johnsin for transferring my question here. Sorry, I think I clicked the wrong button just now!

    Dear Mike,

    In our application, the fault was observed during remote wake-up testing by CAN application message. The test setup is as follows.

    (1) Power on KL30 and KL31, power off KL15

    (2) Simulate in-vehicle CAN traffic with CANoe, sending an application message with a period of 500 ms

    (3) The expectation is that the DUT is woken up by the application message, detects that KL15 is off and that no NM message is present, and then shuts itself down. The next cycle begins with the next application-message wake-up event, and so on.

    After a warm reset induced by WDG fail, do you clear the recovery counter after a successful reset?  

    Answer: Yes, on every reset the recovery counter is cleared in the SBL session.

  • Hello Tim, Johnsin,

    I'm working with Mike on this and will be looking into it with him.

    I'm sorry, I'm not familiar with the entire system. What exactly are KL30, KL31, and KL15? I assume these are modules in the system; can you give more detail?

    BR,

    Nicholas

  • Hello Nicholas,

    KL30 is the battery power supply for the DUT, KL31 is the power return, and KL15 is the IGN signal.

  • Hello Tim, Johnsin,

    Sorry for the long-awaited reply; it took a while to find the cause. After setting up an EVM stack, we were able to replicate the issue.

    The situation is caused by a watchdog error (long-window timeout, incorrect feeding, etc.) that triggers a warm reset. If the settings that caused the warm reset are loaded back in, the system loops until the recovery counter limit is hit; the PMIC is then LOCKED out and needs to be power cycled.

    My suggestion is to read the 0x09 Watchdog Error register to find the cause of the watchdog issue (if it's a timing issue, loosen the timing tolerances).

    In the case of the EVM stack, I made the long window too short, as seen here in the 0x09 register.

    Please try this.

    Thank you for your patience,

    Nicholas
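The reset loop described in this reply can be illustrated with a small toy model. This is only a sketch under stated assumptions: the model locks when the count reaches the threshold (the exact comparison the PMIC uses should be checked in the datasheet), and all names here are hypothetical.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Toy model of the warm-reset loop; this is not PMIC firmware. */
typedef struct {
    unsigned count;      /* recovery attempts since the last clear */
    unsigned threshold;  /* attempts allowed before lockout        */
    bool locked;         /* true once a power cycle is required    */
} recovery_model_t;

/* One watchdog-triggered warm reset. `sbl_clears` states whether the
 * boot code managed to clear the counter after the reset. */
static void warm_reset(recovery_model_t *m, bool sbl_clears)
{
    assert(m != NULL);
    if (m->locked)
        return;
    m->count++;
    if (m->count >= m->threshold) {  /* ASSUMED: lock on reaching the limit */
        m->locked = true;
        return;
    }
    if (sbl_clears)
        m->count = 0;
}

/* Returns the number of the warm reset that caused lockout, or 0 if the
 * model is still unlocked after `max_resets` resets. */
static unsigned resets_until_lockout(unsigned threshold, bool sbl_clears,
                                     unsigned max_resets)
{
    recovery_model_t m = { 0, threshold, false };

    for (unsigned n = 1; n <= max_resets; ++n) {
        warm_reset(&m, sbl_clears);
        if (m.locked)
            return n;
    }
    return 0;
}
```

With a threshold of 15, the model locks on the 15th consecutive warm reset when the counter is never cleared, and never locks when every boot clears it.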

  • Hi Nicholas,

    Thank you for the information.

    In our setup, the DUT is expected to power up successfully. The WDG counter and recovery counter are CLEARED during the SBL session, the APP then feeds the WDG for about 1 s (~20 feeds), and then it stops feeding the WDG to force a warm reset artificially.

     (This is done in the SBL code.)

    When the fault occurred, the PMIC registers read as follows:

    PMIC-A: 

    • RECOV_CNT_REG_1 Register (Offset = 83h) = 0x0F
    • RECOV_CNT_REG_2 Register (Offset = 84h) = 0x0F
    • INT_VMON Register (Offset = 62h) = 0x0
    • INT_GPIO Register (Offset = 63h) = 0x0

    PMIC-B: 

    • RECOV_CNT_REG_1 Register (Offset = 83h) = 0x01
    • RECOV_CNT_REG_2 Register (Offset = 84h) = 0x0F
    • INT_VMON Register (Offset = 62h) = 0x0
    • INT_GPIO Register (Offset = 63h) = 0x08

    As shown above, the recovery count was NOT cleared successfully, and no VCCA_OV/UV error was detected. We suspect that the PMICs somehow did not boot successfully.

    Any comments from your side? Are there any analysis approaches we can try?
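For anyone comparing their own dumps against the values in this post, a minimal sketch of the count-versus-threshold check follows. The field layout is an assumption (low nibble of RECOV_CNT_REG_1 as the current count, low nibble of RECOV_CNT_REG_2 as the threshold) and should be confirmed in the datasheet; `decode_recov` is a hypothetical helper.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* ASSUMED field layout; verify against the TPS6594-Q1 register map. */
#define RECOV_CNT_MASK      0x0Fu  /* RECOV_CNT_REG_1 (offset 83h) */
#define RECOV_CNT_THR_MASK  0x0Fu  /* RECOV_CNT_REG_2 (offset 84h) */

typedef struct {
    uint8_t count;      /* recovery attempts recorded so far */
    uint8_t threshold;  /* configured limit                  */
    bool at_limit;      /* count has reached the threshold   */
} recov_status_t;

/* Decode the two raw register bytes into `out`; returns out->at_limit. */
static bool decode_recov(uint8_t reg1, uint8_t reg2, recov_status_t *out)
{
    assert(out != NULL);
    out->count = (uint8_t)(reg1 & RECOV_CNT_MASK);
    out->threshold = (uint8_t)(reg2 & RECOV_CNT_THR_MASK);
    out->at_limit = out->count >= out->threshold;
    return out->at_limit;
}
```

Applied to the values above under that assumed layout, PMIC-A (0x0F, 0x0F) sits at its limit of 15, while PMIC-B (0x01, 0x0F) has recorded a single attempt.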

  • Hi Tim,

    The device expert is out for the Christmas holidays. Please expect a response on 27 December at the earliest.

    regards,

    Niko

  • Hello Niko and Nicholas,

    Happy new year in advance!

    Do you have any comments on the issue above? We're looking forward to your response. Thank you!

  • Hi Tim,

    The device expert is still out of the office. When they return, they will look into this and provide a response. Please expect some delay accordingly.

    Thanks,
    Field

  • Hello Tim,

    Is it possible to probe the lines with a logic analyzer to see whether the commands in the SBL are actually sent?

    There is a possibility that a WD error causing a warm reset occurs and the SBL is not able to clear the counter before the problem arises again, so the RECOVERY COUNTER is never reset.

    wdgData[0]=0x84;//RECOV_CNT_REG_2
    wdgData[1]=0x1F;
    I2C_write(handle_pmic,0x48,wdgData,2);//clean pmic-a
    I2C_write(handle_pmic,0x4C,wdgData,2);//clean pmic-b

    Where wdgData is an array of: register address @ [0] & register data @ [1], correct?

    BR,

    Nicholas
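One way to tell from software whether the clear in the snippet above actually took effect is to check the write status and read the counter back. The sketch below assumes a hypothetical `pmic_bus_t` wrapper around the SoC I2C driver, and an in-memory `fake_pmic_t` that exists only so the logic can be exercised off-target; the clear behavior of bit 4 in the fake is also an assumption.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define RECOV_CNT_REG_1  0x83u  /* offset quoted in the thread      */
#define RECOV_CNT_REG_2  0x84u  /* offset quoted in the thread      */
#define RECOV_CLR_VALUE  0x1Fu  /* value written by the SBL snippet */

/* Hypothetical bus wrapper; both callbacks return 0 on success. */
typedef struct {
    int (*write_reg)(void *ctx, uint8_t dev, uint8_t reg, uint8_t val);
    int (*read_reg)(void *ctx, uint8_t dev, uint8_t reg, uint8_t *val);
    void *ctx;
} pmic_bus_t;

typedef enum {
    RECOV_OK = 0, RECOV_WRITE_FAILED, RECOV_READ_FAILED, RECOV_NOT_CLEARED
} recov_result_t;

/* Write the clear value, then read the count back to confirm it is zero. */
static recov_result_t clear_recovery_counter(const pmic_bus_t *bus, uint8_t dev)
{
    uint8_t cnt = 0xFF;

    assert(bus != NULL && bus->write_reg != NULL && bus->read_reg != NULL);
    if (bus->write_reg(bus->ctx, dev, RECOV_CNT_REG_2, RECOV_CLR_VALUE) != 0)
        return RECOV_WRITE_FAILED;
    if (bus->read_reg(bus->ctx, dev, RECOV_CNT_REG_1, &cnt) != 0)
        return RECOV_READ_FAILED;
    return ((cnt & 0x0Fu) == 0) ? RECOV_OK : RECOV_NOT_CLEARED;
}

/* In-memory stand-in for one PMIC, for off-target testing only. */
typedef struct { uint8_t regs[256]; int bus_stuck; } fake_pmic_t;

static int fake_write(void *ctx, uint8_t dev, uint8_t reg, uint8_t val)
{
    fake_pmic_t *p = (fake_pmic_t *)ctx;

    (void)dev;
    if (p->bus_stuck)
        return -1;                       /* models a latched I2C bus */
    if (reg == RECOV_CNT_REG_2 && (val & 0x10u))
        p->regs[RECOV_CNT_REG_1] = 0;    /* ASSUMED: bit 4 clears the count */
    p->regs[reg] = val;
    return 0;
}

static int fake_read(void *ctx, uint8_t dev, uint8_t reg, uint8_t *val)
{
    fake_pmic_t *p = (fake_pmic_t *)ctx;

    (void)dev;
    if (p->bus_stuck)
        return -1;
    *val = p->regs[reg];
    return 0;
}
```

If the SBL logged a non-OK result from such a check, a failed clear would be visible immediately instead of surfacing only as a lockout many resets later.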

  • Hi Nicholas,

    Sorry for the late reply; debugging the issue and stress testing took some time. I'd like to share the information below.

    We believe we have found the root cause. At the moment PMIC-A performs the warm reset, the SoC is trying to read the PMIC status via the config I2C, and sometimes PMIC-B locks the config I2C bus, holding I2C_SDA low while I2C_SCL is high. The SBL code therefore has no chance to run COMPLETELY, which means the recovery count is not cleared. After 15 failed retries, the PMIC enters LOCKED.

    The solution is to configure the SoC-side I2C pins as standard GPIOs at the beginning of the SBL code to unlock the I2C bus held by PMIC-B. With that change, the fault disappeared.

  • Hello Tim,

    No problem at all.

    Sorry that was the problem, but it's good to know why the resets weren't occurring and what is needed to move forward.

    Solution is that config the I2C pins as standard GPIOs with SoC side at beginning of SBL codes to unlock I2C bus with PMIC-B, then the fault disappeared.

    I'm assuming this is GPIO1/2 (I2C2) on PMIC-A of course?

    I also assume the watchdog is the Q&A type. There are two I2C interfaces: I2C1, which is common to both PMICs and shares a single bus, and I2C2, a separate bus connected to PMIC-A as stated above.

    I'm guessing you later switch GPIO1/2 back to SDA_I2C2/SCL_I2C2 to get your Watchdog Q&A mode back?

    BR,

    Nicholas

  • Hello Nicholas,

    When the fault occurred, the config I2C was latched between the SoC and PMIC-B. The config I2C is I2C1 of the PMICs, which is a common bus shared by both PMICs, and I2C2 (GPIO1/2) of PMIC-A is configured for the Q&A watchdog.

    Our solution is to configure the SoC-side I2C pins as GPIOs at the beginning of boot to unlatch the I2C1 bus held by PMIC-B, and then reconfigure the SoC side for the I2C function to continue booting.
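The thread does not show what the SBL does with the pins while they are GPIOs. A commonly used approach is the bus-clear procedure from the I2C-bus specification: clock SCL up to nine times until the target releases SDA, then issue a STOP. The sketch below assumes that approach; the `i2c_gpio_t` hooks and the `fake_bus_t` simulator are hypothetical and stand in for the SoC pin-mux and GPIO calls.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical pin hooks; level 1 = released (pulled up), 0 = driven low. */
typedef struct {
    void (*set_scl)(void *ctx, int level);
    void (*set_sda)(void *ctx, int level);
    int  (*get_sda)(void *ctx);
    void (*delay_half_period)(void *ctx);
    void *ctx;
} i2c_gpio_t;

/* Pulse SCL until SDA is released (at most 9 pulses), then send a STOP.
 * Returns the number of pulses used, or -1 if SDA is still held low. */
static int i2c_bus_recover(const i2c_gpio_t *io)
{
    int pulses = 0;

    assert(io != NULL && io->set_scl && io->set_sda && io->get_sda);
    io->set_sda(io->ctx, 1);
    io->set_scl(io->ctx, 1);
    while (pulses < 9 && !io->get_sda(io->ctx)) {
        io->set_scl(io->ctx, 0);
        io->delay_half_period(io->ctx);
        io->set_scl(io->ctx, 1);
        io->delay_half_period(io->ctx);
        ++pulses;
    }
    if (!io->get_sda(io->ctx))
        return -1;
    /* STOP condition: SDA rises while SCL is high. */
    io->set_scl(io->ctx, 0);
    io->set_sda(io->ctx, 0);
    io->delay_half_period(io->ctx);
    io->set_scl(io->ctx, 1);
    io->delay_half_period(io->ctx);
    io->set_sda(io->ctx, 1);
    return pulses;
}

/* Simulated target that holds SDA low until it has seen `clocks_needed`
 * rising SCL edges. For off-target testing only. */
typedef struct { int scl; int sda_master; int clocks_needed; } fake_bus_t;

static void fake_set_scl(void *ctx, int level)
{
    fake_bus_t *b = (fake_bus_t *)ctx;

    if (b->scl == 0 && level == 1 && b->clocks_needed > 0)
        b->clocks_needed--;
    b->scl = level;
}

static void fake_set_sda(void *ctx, int level)
{
    ((fake_bus_t *)ctx)->sda_master = level;
}

static int fake_get_sda(void *ctx)
{
    fake_bus_t *b = (fake_bus_t *)ctx;

    return b->sda_master && b->clocks_needed == 0;  /* open-drain wired-AND */
}

static void fake_delay(void *ctx) { (void)ctx; }
```

On real hardware the hooks would switch the pads to GPIO mode first and restore the I2C function afterwards, matching the sequence described in this reply.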

  • Hello Tim,

    Now I see: the I2C1 bus latch-up was resolved on the SoC side of the system, and the solution was switching the pins to GPIO and then back to I2C.

    As of right now, are there any remaining issues? If not, I'd like to close this thread.

    To summarize the problem & Solution:

    An error occurred in which the only power draw from the Leo power stage was 31 mA.

    The interrupts indicated that an SPMI error was the primary cause, and a similar setup of EVMs was tested in the lab.

    Testing for a warm reset caused by a watchdog timeout yielded the same interrupts.

    It was concluded that the watchdog was the root cause of the problem.

    I suspect after a warm reset, the lock out counter kept incrementing, but you showed the code that would normally clear the counter which would prevent lockout. It was found that there was an I2C1 bus issue preventing the counter reset. The system now boots without the PMIC being the issue.

    BR,

    Nicholas

  • hello Nicholas,

    EXACTLY! Thanks for your kind support! The thread can be closed.

    BR,

    Tim