BQ40Z50-R2: Bricked pack during compound CUV and CHGC faults

Jeff LaBundy

Part Number: BQ40Z50-R2
Other Parts Discussed in Thread: BQ25798


Hi there—we have encountered a corner case in which both the CHG and DSG FETs are permanently disabled if the gauge experiences a PCHGC, CHGC or OCC fault while recovering from a CUV fault. The sequence is as follows:

The battery naturally discharges below the CUV threshold.
The gauge enters CUV, and disables the DSG FET while waiting for the cell voltage(s) to clear the CUV recovery threshold.
The ChargingCurrent() is lowered in response to temperature.
Due to system latency, the host SW does not reduce the charger IC constant-charging current within the CHGC delay time.
The gauge enters CHGC, and disables the CHG FET.
With the CHG FET disabled, the battery stops charging and the CUV condition persists. This in turn keeps the DSG FET disabled, so no discharge current can flow to satisfy the (negative) CHGC recovery threshold.
Both FETs remain disabled effectively forever, and the battery never charges again until it self-discharges enough for the gauge to shut down; any reasonable customer will RMA the product well before this time.
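
The deadlock in the sequence above can be reproduced with a toy model. This is only a sketch under stated assumptions, not the gauge firmware: the class, the voltages and currents, and the one-tick charge model are all made up for illustration, fault and recovery delays are ignored, and the recovery rule (CHGC clears when the current is at or below the recovery threshold) is taken from the description in this thread.

```python
from dataclasses import dataclass


@dataclass
class ToyPack:
    """Toy model of the compound fault; every value is illustrative."""
    cell_mv: int            # lowest cell voltage
    cuv_recovery_mv: int    # assumed: CUV clears at or above this voltage
    chgc_recovery_ma: int   # assumed: CHGC clears when current <= this value
    cuv: bool = True        # start with both faults already latched
    chgc: bool = True

    def step(self, charger_ma: int, load_ma: int) -> int:
        """Advance one tick and return pack current (positive = charging)."""
        chg_fet_on = not self.chgc   # CHGC holds the CHG FET off
        dsg_fet_on = not self.cuv    # CUV holds the DSG FET off
        current = (charger_ma if chg_fet_on else 0) - (load_ma if dsg_fet_on else 0)
        if current > 0:
            self.cell_mv += 1        # crude stand-in for charge accumulation
        if self.cell_mv >= self.cuv_recovery_mv:
            self.cuv = False
        if current <= self.chgc_recovery_ma:
            self.chgc = False
        return current


def stays_bricked(chgc_recovery_ma: int, ticks: int = 1000) -> bool:
    """Return True if both faults are still latched after `ticks` steps."""
    pack = ToyPack(cell_mv=2400, cuv_recovery_mv=3000,
                   chgc_recovery_ma=chgc_recovery_ma)
    for _ in range(ticks):
        pack.step(charger_ma=1000, load_ma=500)
    return pack.cuv and pack.chgc


print(stays_bricked(-50))  # negative threshold: zero current never clears CHGC
print(stays_bricked(50))   # positive threshold: zero current clears CHGC
```

With a negative recovery threshold the pack current is pinned at zero and neither fault can clear; with a positive one, zero current satisfies the CHGC recovery rule, charging resumes, and CUV clears once the cell voltage rises.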

This issue was originally reported in the following thread: https://e2e.ti.com/support/power-management-group/power-management/f/power-management-forum/1451501/bq40z50-r2-precharge-start-voltage-vs-charging-voltage-low/5620339#5620339

The workaround I decided upon at the time was to change the PCHGC, CHGC and OCC recovery thresholds from negative to positive. This allows a current of zero to satisfy the recovery threshold, which avoids this "chicken and egg" problem.

The problem with this workaround is that, if the overcurrent condition persists, the gauge repeatedly enables and disables charging with a period equal to the sum of the fault delay and the recovery delay. After further consideration, we do not wish to accept this behavior; in case of a faulty charger, we want charging to stop until the charger is disabled or removed. This seems to be the recommended behavior anyway.

A second workaround is for our host SW to monitor for prolonged periods of DSG = CHG = 0 and reset the gauge using the DeviceReset command (0x0041); however, this command is not available while the gauge is sealed. My questions are as follows:

[1] Are there any other workarounds for this issue? I see a few other instances of this issue reported across E2E, but no clear solution other than to say this issue is unlikely to happen often.

[2] Is there any problem with leaving the gauge unsealed during production so that DeviceReset remains available, or at least temporarily unsealing the gauge for the purpose of resetting it?

Thank you in advance for your support—in case I can clarify either of my questions, please let me know.

6 months ago

0 Anthony 6 months ago

TI__Mastermind 28440 points

Hi Jeff,

I think the only other workaround would be to extend the CHGC: Delay so that it is long enough for the host to apply the reduced charging current before the fault trips. Regarding the unseal, I believe that unsealing the gauge in the field could be a security risk, and it also opens the door to data flash values being accidentally changed by the host. This is why we do not recommend doing this in the field.

Regards,

Anthony

0 Jeff LaBundy 6 months ago in reply to Anthony

Intellectual 300 points

Hi Anthony—thanks again for your support on this topic; I very much appreciate your guidance and patience. I had a similar concern about unsealing the pack in the field—thank you for sharing your feedback; I am aligned with you.

I studied this issue some more, but it seems it is not possible for our design to trigger an OCC fault under normal circumstances anyway. Our OCC1 and OCC2 thresholds are set to 6500 and 8000 mA, respectively; however, the maximum charging current of our charger IC (BQ25798) is only 5000 mA. There would have to be some major hardware problem for an OCC fault to occur, and the corner case described here would be irrelevant.

Similarly, it is not possible for our design to trigger a PCHGC fault either. Our precharge start voltage is set to 2500 mV; however, the BQ25798 has its own precharge start voltage near 3000 mV as configured in our design. As such, the BQ25798 automatically reduces the precharge current below the PCHG threshold with plenty of margin, and does not rely on our host SW to react in time.

Therefore, there is no practical chance of OCC or PCHGC faults; the only concern is a possible CHGC fault. I agree that it is prudent to set the CHGC delay longer than our host SW reaction time; however, we should assume the latter is infinite in case our host SW fails.

I think a compromise we can make here is to use a positive CHGC recovery threshold as we discussed before, but set the CHGC recovery delay to something much higher (e.g. 60 seconds). This way, the gauge can recover from a compound CUV and CHGC fault; however, the retry period is much longer, and the cells are not repeatedly faced with excessive charging current as rapidly. In case you have any concerns, please let me know.
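
To put rough numbers on this trade-off, here is a small sketch. It assumes the retry period is the fault delay plus the recovery delay, as described in the original post, and that the charger keeps over-driving the pack for the whole fault delay on every retry. The function name is made up for illustration; the 3-second CHGC delay is the value given later in this post, and the 1-second recovery delay is a hypothetical comparison point only.

```python
def retry_cycle(fault_delay_s: float, recovery_delay_s: float):
    """Return (retry period in seconds, fraction of time spent over-current)."""
    period_s = fault_delay_s + recovery_delay_s
    exposure = fault_delay_s / period_s
    return period_s, exposure


# Hypothetical short recovery delay versus the proposed 60-second delay.
for recovery_s in (1, 60):
    period_s, exposure = retry_cycle(3, recovery_s)
    print(f"recovery delay {recovery_s:>2} s: period {period_s} s, "
          f"over-current {exposure:.1%} of the time")
```

With a 60-second recovery delay, the cells would see the excessive current for 3 of every 63 seconds, roughly 5% of the time, if the charger never backs off.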

My last question for now is to understand what is the expected behavior if both OCC and CHGC faults occur simultaneously? Does the gauge wait for both faults' recovery criteria to be met, or does one fault take priority over another?

In our design, the OCC1 and CHGC (RT) thresholds are the same (6500 mA), but the delays are 6 and 3 seconds, respectively. If our charger IC did somehow drive over 6500 mA into the cells, the CHGC fault would occur first; it's therefore impossible for an OCC fault to occur according to our configuration. In case I have misunderstood, please let me know.

0 Jeff LaBundy 6 months ago in reply to Jeff LaBundy

Intellectual 300 points

Hi Anthony—one additional question; I noticed this statement on page 40 of the TRM:

The bq40z50-R2 device goes through a full reset when exiting from SHUTDOWN mode, which means the device will reinitialize.

I see the ShutdownMode command (0x0010) is available while the pack is sealed. As an alternative workaround, can our host SW simply send this command to shut down the gauge during prolonged periods of DSG = CHG = 0? After the customer removes and inserts the charger, it seems the gauge would reset as if we had sent the DeviceReset command (0x0041).

I tested this hypothesis by decreasing VC1–4 below the CUV threshold such that the gauge signaled a CUV fault, then sent the ShutdownMode command. While the gauge remained in shutdown mode, I increased VC1–4 above the CUV threshold, but below the CUV recovery threshold. For the purpose of this experiment, I set the CUV recovery threshold unrealistically high.

I then very briefly applied a voltage higher than the charger present threshold, but below the CUV recovery threshold. The gauge powered back up as expected, and the CUV fault was cleared.

This suggests that the ShutdownMode command followed by a charger unplug/plug cycle can clear faults in the same way as the DeviceReset command. Can you confirm whether this is correct, and whether it can be a viable SW workaround to this chicken-and-egg problem?

0 Anthony 6 months ago in reply to Jeff LaBundy

TI__Mastermind 28440 points

Hi Jeff,

Jeff LaBundy said:
I think a compromise we can make here is to use a positive CHGC recovery threshold as we discussed before, but set the CHGC recovery delay to something much higher (e.g. 60 seconds). This way, the gauge can recover from a compound CUV and CHGC fault; however, the retry period is much longer, and the cells are not repeatedly faced with excessive charging current as rapidly. In case you have any concerns, please let me know.

I am aligned with you here; I think this could be a viable workaround for this kind of issue. I would recommend testing it to confirm its validity.

Jeff LaBundy said:
My last question for now is to understand what is the expected behavior if both OCC and CHGC faults occur simultaneously? Does the gauge wait for both faults' recovery criteria to be met, or does one fault take priority over another?

Since OCC and CHGC are both firmware protections, there is no priority between them. If they are triggered simultaneously, the XCHG bit is set regardless and the CHG FET opens. For the gauge to clear the XCHG bit and close the FET again, the recovery conditions of both protections must be met. The same applies to any firmware protections that are triggered at the same time and affect the same FET.
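
As a minimal sketch of that rule (not the gauge's actual implementation; the function name and the dictionary of flags are made up for illustration), the CHG FET is allowed to close only once every protection that opened it has recovered:

```python
def chg_fet_allowed(active_protections) -> bool:
    """CHG FET may close only when no protection acting on it is active."""
    return not any(active_protections.values())


faults = {"OCC": True, "CHGC": True}   # both tripped at the same time
print(chg_fet_allowed(faults))         # FET held open

faults["CHGC"] = False                 # CHGC recovery condition met first
print(chg_fet_allowed(faults))         # still held open by OCC

faults["OCC"] = False                  # OCC recovery condition met as well
print(chg_fet_allowed(faults))         # FET may close again
```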

Jeff LaBundy said:
In our design, the OCC1 and CHGC (RT) thresholds are the same (6500 mA), but the delays are 6 and 3 seconds, respectively. If our charger IC did somehow drive over 6500 mA into the cells, the CHGC fault would occur first; it's therefore impossible for an OCC fault to occur according to our configuration. In case I have misunderstood, please let me know.

You are correct here. If the CHGC protection is triggered first, then it would be impossible for the OCC protection to trigger, since the FET is now open and no charge current flows through the sense resistor, which the OCC protection relies on.
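
The timing argument can be checked with a short simulation. This is a sketch under assumptions: one shared effective threshold of 6500 mA as described in this thread, one-second ticks, timers that reset whenever the measured current drops below the threshold, a trip when the current is at or above the threshold for the full delay, and no recovery modeled. The function and its parameters are hypothetical.

```python
def trip_events(charge_ma, threshold_ma=6500, chgc_delay_s=3,
                occ_delay_s=6, duration_s=20):
    """Return {protection name: trip time in seconds} for a constant charge current."""
    chgc_timer = occ_timer = 0
    chg_fet_closed = True
    tripped = {}
    for t in range(1, duration_s + 1):
        # Once the CHG FET opens, no charge current reaches the sense resistor.
        measured_ma = charge_ma if chg_fet_closed else 0
        over = measured_ma >= threshold_ma
        chgc_timer = chgc_timer + 1 if over else 0
        occ_timer = occ_timer + 1 if over else 0
        if "CHGC" not in tripped and chgc_timer >= chgc_delay_s:
            tripped["CHGC"] = t
        if "OCC" not in tripped and occ_timer >= occ_delay_s:
            tripped["OCC"] = t
        if tripped:
            chg_fet_closed = False   # recovery is not modeled in this sketch
    return tripped


print(trip_events(7000))  # {'CHGC': 3}
```

With the 3-second and 6-second delays, CHGC opens the FET at 3 seconds and the OCC timer resets before it can reach 6 seconds.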

Regards,

Anthony

0 Jeff LaBundy 6 months ago in reply to Anthony

Intellectual 300 points

Hi Anthony—thank you for your feedback; I'm aligned with you. Thanks to your generous support these last few weeks, I feel I've become adept at managing the gas gauge configuration.

I think the last question for now is whether the ShutdownMode command (0x0010) followed by a charger unplug/plug can effectively act as the DeviceReset command (0x0041) for the purpose of clearing out any such "chicken and egg" compound faults, as my quick test suggests.

This is more attractive as a general workaround for cases where DSG = CHG = 0 for prolonged periods of time, as opposed to trying to protect against any one case using carefully selected golden file parameters.

+1 Anthony 6 months ago in reply to Jeff LaBundy

TI__Mastermind 28440 points

Hi Jeff,

Jeff LaBundy said:
I think the last question for now is whether the ShutdownMode command (0x0010) followed by a charger unplug/plug can effectively act as the DeviceReset command (0x0041) for the purpose of clearing out any such "chicken and egg" compound faults, as my quick test suggests.

This could work since the command is accessible in sealed mode (I believe you will have to send it twice if the gauge is sealed), with the unplug/plug meeting the VStartup condition needed to exit shutdown. A full reset is performed upon exiting shutdown, so it seems to fit the goal here. Again, I would perform more extensive testing to confirm this.
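
A host-side watchdog along these lines might look like the sketch below. It is only an illustration under assumptions: `send_command` is a hypothetical callback that delivers a 16-bit command to the gauge, the caller is assumed to read the CHG and DSG FET states itself, the 600-second timeout is an arbitrary placeholder, and sending the command twice reflects the unconfirmed belief stated above rather than verified behavior.

```python
SHUTDOWN_MODE = 0x0010  # ShutdownMode command, as given in this thread


class FetWatchdog:
    """Request shutdown after both FETs have been off for too long."""

    def __init__(self, send_command, timeout_s=600.0):
        self.send_command = send_command   # hypothetical transport callback
        self.timeout_s = timeout_s         # arbitrary illustrative value
        self.both_off_since = None
        self.fired = False

    def poll(self, now_s, chg_on, dsg_on):
        """Call periodically; returns True when the shutdown request is sent."""
        if chg_on or dsg_on:
            # Either FET conducting means the pack is not stuck; start over.
            self.both_off_since = None
            self.fired = False
            return False
        if self.both_off_since is None:
            self.both_off_since = now_s
        if not self.fired and now_s - self.both_off_since >= self.timeout_s:
            # Sent twice on the assumption that a sealed gauge requires it.
            self.send_command(SHUTDOWN_MODE)
            self.send_command(SHUTDOWN_MODE)
            self.fired = True
            return True
        return False


sent = []
watchdog = FetWatchdog(sent.append, timeout_s=10)
for now_s in (0, 5, 10, 11):
    watchdog.poll(now_s, chg_on=False, dsg_on=False)
print([hex(cmd) for cmd in sent])  # ['0x10', '0x10']
```

The watchdog fires once per stuck episode and re-arms as soon as either FET turns back on, so a pack that recovers on its own never receives the command.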

Regards,

Anthony

0 Jeff LaBundy 6 months ago in reply to Anthony

Intellectual 300 points

Hi Anthony—thank you for your feedback; I'm aligned with you. Thanks again for your continued support!
