We have found a failure mode in which the SCL pin on the BQ25895 becomes stuck pulling low, stalling the I2C bus indefinitely. Once the device enters this state, it persists across power cycles, even after power is completely removed and all capacitors are discharged. In this state, whether the board is powered or not, the pin shows a low resistance to GND (measured at roughly 9 to 28 ohms).
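As a rough bench check, here is a small sketch of what that resistance implies for the SCL node with the 4.7k pull-up to 3.3 V described in the background below. It assumes the fault behaves like a plain resistor to GND, which may not hold if the path is actually a FET or a damaged junction. The function names are our own, for illustration only. Under that assumption the line should sit at about 6 mV to 20 mV, with the pin sinking about 0.7 mA.

```c
#include <assert.h>

/* Voltage at the SCL node for a simple divider: pull-up to vdd on top,
 * fault resistance to GND on the bottom. Units are volts and ohms.
 * Assumption: the stuck pin is modeled as a linear resistor. */
double scl_node_voltage(double vdd, double r_pullup, double r_fault)
{
    assert(r_pullup > 0.0 && r_fault >= 0.0);
    return vdd * r_fault / (r_pullup + r_fault);
}

/* Current the stuck pin sinks through the pull-up, in amps. */
double scl_sink_current(double vdd, double r_pullup, double r_fault)
{
    assert(r_pullup > 0.0 && r_fault >= 0.0);
    return vdd / (r_pullup + r_fault);
}
```

A scope reading well above that range while the bus is stalled would suggest the low-side path is not purely resistive.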
It almost looks as if the BQ25895 is clock-stretching (holding SCL low), except that it stays in this state indefinitely.
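To tell a slow release apart from a truly stuck line in firmware, something like the sketch below could be run before each transaction. It is only an illustration: the `scl_probe_t` hooks, the state names, and the simulated line are hypothetical and not part of any vendor library. The I2C specification itself puts no upper limit on clock stretching, so the timeout is a design choice; SMBus, for comparison, uses a clock-low timeout in the 25 ms to 35 ms range.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical hooks: wire these to the real GPIO read and delay on the target. */
typedef struct {
    bool (*read_scl)(void *ctx);              /* true when SCL reads high */
    void (*delay_us)(void *ctx, uint32_t us); /* busy-wait or sleep */
    void *ctx;
} scl_probe_t;

typedef enum { SCL_IDLE_HIGH, SCL_RELEASED_LATE, SCL_STUCK_LOW } scl_state_t;

/* Poll SCL every step_us until it goes high or timeout_us has elapsed. */
scl_state_t scl_classify(const scl_probe_t *p, uint32_t timeout_us, uint32_t step_us)
{
    assert(p != NULL && p->read_scl != NULL && p->delay_us != NULL && step_us > 0);
    if (p->read_scl(p->ctx)) {
        return SCL_IDLE_HIGH;
    }
    /* 64-bit counter so a large timeout cannot wrap around. */
    for (uint64_t waited = 0; waited < timeout_us; waited += step_us) {
        p->delay_us(p->ctx, step_us);
        if (p->read_scl(p->ctx)) {
            return SCL_RELEASED_LATE; /* looked like stretching, then released */
        }
    }
    return SCL_STUCK_LOW; /* still low after the whole timeout */
}

/* Simulated line for a host-side dry run: reads high once now_us reaches
 * release_after_us. Purely a test double. */
typedef struct { uint32_t now_us; uint32_t release_after_us; } sim_line_t;

bool sim_read_scl(void *ctx)
{
    sim_line_t *s = ctx;
    return s->now_us >= s->release_after_us;
}

void sim_delay_us(void *ctx, uint32_t us)
{
    sim_line_t *s = ctx;
    s->now_us += us;
}
```

Logging the `SCL_STUCK_LOW` result along with what the firmware did just before it might help narrow down what triggers the state.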
Isolating the Issue:
We have isolated the issue to this pin on the BQ25895; it is not caused by other devices on the bus. We are puzzled as to how this is even possible, but we have now seen it on 3 different devices. So far we have been unable to reproduce the problem on demand, so it is still unclear what conditions cause the BQ25895 to enter this state. Particularly puzzling is why it remains in this state across power cycles.
During rework, in one case hot-air reflowing the module fixed it, only for the problem to return several hours later. On another board, the problem went away after cutting traces, probing, power cycling several times, etc. It worked for a while, then the problem returned for several minutes, then cleared again. Momentarily shorting across the SCL line's pull-up resistor (forcing the line high) also seems to temporarily clear the error condition, though this does not seem like a reliable long-term solution.
Some background on the circuit / application:
The application microcontroller is powered from the SYS supply rail. We are disabling the BQ25895 watchdog, and we are not using PMID. After the system is brought out of ship mode with a button press, the application processor powers on, does its work, then eventually sends the command to place the BQ25895 back in ship mode to power itself off. Normally this works great, until the BQ25895 pin becomes locked up. We are using 4.7k ohm pull-up resistors (to the microcontroller's 3.3V supply).
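For context, the two register operations described above look roughly like the sketch below. This is not our exact firmware. The `i2c_bus_t` hooks and the fake register file are hypothetical stand-ins so the logic can be exercised on a host, and the address and bit positions (7-bit address 0x6A, WATCHDOG in REG07 bits 5:4, BATFET_DIS in REG09 bit 5) reflect our reading of the BQ25895 datasheet and should be double-checked against it.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define BQ25895_I2C_ADDR     0x6Au /* 7-bit address; verify in the datasheet */
#define BQ25895_REG07        0x07u
#define BQ25895_REG09        0x09u
#define REG07_WATCHDOG_MASK  0x30u /* bits 5:4; 00 assumed to disable the watchdog */
#define REG09_BATFET_DIS     0x20u /* bit 5; 1 assumed to turn BATFET off (ship mode) */
#define BQ25895_NUM_REGS     0x15u /* REG00 through REG14 */

/* Hypothetical bus hooks; both return 0 on success, nonzero on bus error. */
typedef struct {
    int (*read_reg)(void *ctx, uint8_t dev, uint8_t reg, uint8_t *val);
    int (*write_reg)(void *ctx, uint8_t dev, uint8_t reg, uint8_t val);
    void *ctx;
} i2c_bus_t;

/* Read-modify-write so the other fields in the register are preserved. */
static int update_bits(const i2c_bus_t *bus, uint8_t reg, uint8_t mask, uint8_t value)
{
    uint8_t cur;
    assert(bus != NULL && bus->read_reg != NULL && bus->write_reg != NULL);
    int rc = bus->read_reg(bus->ctx, BQ25895_I2C_ADDR, reg, &cur);
    if (rc != 0) {
        return rc;
    }
    cur = (uint8_t)((cur & (uint8_t)~mask) | (value & mask));
    return bus->write_reg(bus->ctx, BQ25895_I2C_ADDR, reg, cur);
}

int bq25895_disable_watchdog(const i2c_bus_t *bus)
{
    return update_bits(bus, BQ25895_REG07, REG07_WATCHDOG_MASK, 0x00u);
}

int bq25895_enter_ship_mode(const i2c_bus_t *bus)
{
    return update_bits(bus, BQ25895_REG09, REG09_BATFET_DIS, REG09_BATFET_DIS);
}

/* Fake register file standing in for the charger during a host-side test. */
typedef struct { uint8_t regs[BQ25895_NUM_REGS]; } fake_bq_t;

int fake_read(void *ctx, uint8_t dev, uint8_t reg, uint8_t *val)
{
    fake_bq_t *f = ctx;
    if (dev != BQ25895_I2C_ADDR || reg >= BQ25895_NUM_REGS) {
        return -1;
    }
    *val = f->regs[reg];
    return 0;
}

int fake_write(void *ctx, uint8_t dev, uint8_t reg, uint8_t val)
{
    fake_bq_t *f = ctx;
    if (dev != BQ25895_I2C_ADDR || reg >= BQ25895_NUM_REGS) {
        return -1;
    }
    f->regs[reg] = val;
    return 0;
}
```

The ship-mode write is the last I2C transaction before SYS collapses and the microcontroller loses power, which is one reason we wonder whether the timing of that final transfer matters.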
[Circuit snippet schematic]
[Edit - We are wondering if it might have something to do with our QON button circuit. We are sharing this button with a microcontroller input pin, which pulls the line up to 3.3V through its internal pull-up. The QON pin should also have an internal 200k pull-up (to ~5V or Vbatt, depending on which supply is present), according to the BQ25895 datasheet. An image of this circuit is included below.]
Any help or suggestions would be greatly appreciated regarding (1) what might be causing this condition, or (2) how best to recover from it (since power cycling does not clear the problem).
Thank you!