BQ33100: Flash programming caution

Charles Bergren

Part Number: BQ33100

Your document slua645-1.pdf states this about programming the BQ33100 flash - "Caution: If power is interrupted during this process, the device may become unusable". I can confirm that this caution appears to be true; I executed a power interruption during programming before seeing the cautionary statement. The BQ33100 now no longer ACKs its default address of 0x16 and no pulse activity shows up on the TS pin (thermistor). To me, the BQ33100 appears to be 'bricked'. I have some extra questions about this.

1. There are probably extra circumstances under which this bricking occurs. On another drive, I saw the host fw start a learn cycle in the middle of the programming cycle. That chip too appears bricked. This makes me worried about how to treat the chip; I'd like to figure out all the circumstances I must avoid.

2. Can you throw light on why the chip becomes bricked? It must be some state variable within the chip, since I seem to have bricked a chip without a power cycle.

3. Is there no recovery possible for a bricked chip? Do I need to replace chips that are bricked? If so, what is a definitive measure that indicates bricking has taken place? Is NACKing address 0x16 sufficient? Is lack of pulses on the TS pin sufficient?

Best regards

over 6 years ago

0 Onyx Ahiakwo over 6 years ago

TI__Guru* 76230 points

Hi CB,

If your device is bricked, there will be no pulsing of the TS pin and the 2.5 V LDO will not have a voltage output or it will be less than 2.5V.
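The symptoms mentioned in this thread (NACK at the device address, no pulsing on TS, 2.5 V LDO low or absent) could be collected into one bench check. The following is only a sketch: the `GaugeObservation` fields and the 0.1 V tolerance are assumptions made for illustration, and the readings would have to come from your own scope or meter.

```python
from dataclasses import dataclass


@dataclass
class GaugeObservation:
    """Bench readings for one gauge (hypothetical structure, not a TI API)."""
    acks_address: bool   # did the gauge ACK its address (0x16 in this thread)?
    ts_pulsing: bool     # is pulse activity visible on the TS pin?
    ldo_volts: float     # measured voltage on the 2.5 V LDO output


def brick_symptoms(obs, ldo_nominal=2.5, ldo_tolerance=0.1):
    """List which of the symptoms named in the thread are present.

    ldo_tolerance is an assumed margin, not a datasheet figure.
    """
    symptoms = []
    if not obs.acks_address:
        symptoms.append("NACK at device address")
    if not obs.ts_pulsing:
        symptoms.append("no pulses on TS pin")
    if obs.ldo_volts < ldo_nominal - ldo_tolerance:
        symptoms.append("2.5 V LDO low or absent")
    return symptoms


healthy = GaugeObservation(acks_address=True, ts_pulsing=True, ldo_volts=2.5)
suspect = GaugeObservation(acks_address=False, ts_pulsing=False, ldo_volts=0.0)
print(brick_symptoms(healthy))
print(brick_symptoms(suspect))
```

A part showing all three symptoms matches the description of a bricked device given here; a part showing only one of them may deserve a closer look at the bus or supply first.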

You will need to replace your chip; there is no recovery mechanism for a bricked chip.

Bricking the IC by programming while a learning cycle is in progress is new to me, but thanks for pointing that out. The only other known case in which the chip bricks is cycling power while programming.

thanks
Onyx

0 Charles Bergren over 6 years ago

Prodigy 130 points

Here is the response of what I call a 'bricked' drive. Hope someone can spot something. It shows an immediate NACK of the default address 0x16.

0 Charles Bergren over 6 years ago in reply to Onyx Ahiakwo

Prodigy 130 points

Thanks for the prompt information, Onyx. I will relay our information to the team. I'll resolve the issue after contacting them to see if they have more questions on it.

0 Bryan Kahler over 6 years ago in reply to Charles Bergren

TI__Mastermind 25955 points

Hi Charles,

I have attempted to brick the device in the lab by removing power during programming both in ROM mode and also when in FIRMWARE mode. I was not able to replicate the failure. The device either recovered after reset in FIRMWARE mode or reverted to ROM mode whereupon flashing the .senc file restored device functionality. The device was able to ACK the default address of 0x16.

Some follow up/through questions and steps:

Are you using an EV2300 to program the device?
If so, what version of the firmware is currently on the EV2300?
If an EV2300 is not being used to program the device, what is?

Regarding the scope plot, where on the board was this plot taken?
Please scope the SMBC/SMBD lines:
1. At the gauge
2. At the master device

Sincerely,
Bryan Kahler

0 Charles Bergren over 6 years ago in reply to Bryan Kahler

Prodigy 130 points

Hi Bryan

Thanks for the post. The master for this transaction is a host microprocessor on an internal I2C bus. I took traces both at the host and right at the BQ33100 TSSOP24 (pins 14, 16: SDA, SCL). I'll have to check which measurement this graph shows. The salient detail is the immediate NACK; this couldn't happen if the BQ wasn't bricked. On a 'normal' BQ, the waveforms are identical but proceed to full interaction (i.e., past the NACK).

The circuitry has the signals coming from the host, through an I2C level translator, then through an RC combination (100 ohms, 100 pF). Measured at the host or at the BQ chip, the rise times looked very similar. I suspect internal chip capacitances predominate. I think the BQ-side pull-ups are 5K into 3.3V.
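For a rough feel of the numbers above, the time constants can be worked out directly. This is a back-of-envelope sketch that assumes the 100 pF is the only capacitance the pull-up has to charge; it ignores chip, translator, and trace capacitance, so the real rise time would be longer. The 30%-70% thresholds are the ones commonly used for I2C rise-time figures.

```python
import math


def rc_time_constant(r_ohms, c_farads):
    """Time constant of a single RC stage, in seconds."""
    return r_ohms * c_farads


def rise_time_30_70(r_ohms, c_farads):
    """Time for an RC charging curve to go from 30% to 70% of its final value."""
    return math.log(0.7 / 0.3) * r_ohms * c_farads


series_tau = rc_time_constant(100, 100e-12)     # 100 ohm series R with 100 pF
pullup_rise = rise_time_30_70(5_000, 100e-12)   # 5K pull-up charging 100 pF

print(f"series RC time constant: {series_tau * 1e9:.1f} ns")
print(f"pull-up 30-70% rise time: {pullup_rise * 1e9:.1f} ns")
```

Under these assumptions the 5K pull-up, not the 100 ohm series resistor, sets the rise time, which would be consistent with the edges looking similar at both ends of the line.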

0 Bryan Kahler over 6 years ago in reply to Charles Bergren

TI__Mastermind 25955 points

Hi Charles,

Thank you for those details. I agree wholeheartedly - nack is the important detail.

Could you please shed a little more light on the programming flow of the bq33100 from the host micro? Are there any integrity checks? Are the first 2 rows of data flash saved to non-volatile memory, then erased, data flash is written (except for the first 2 rows), followed by the first two rows being written to data flash?

To try and recover the device, please spam the device with the enter ROM command (write 0x00, 0x0F00) as you power the device. If the device ACKs, program the device with the senc file.

Please try the spamming/powerup several times and let me know how it turns out!

Sincerely,
Bryan Kahler

0 Charles Bergren over 6 years ago in reply to Bryan Kahler

Prodigy 130 points

Hi Bryan

Thanks for the reply. I'm putting my replies >>>inline below

Could you please shed a little more light on the programming flow of the bq33100 from the host micro?
Are there any integrity checks?

>>> We haven't performed any integrity check on the flash data yet. If we do manage to get to a reliable programming process, we will probably add a read-back (during ROM mode) and a full byte compare. Please see my other recent thread about the BQ checksum. For us, it's walk before crawl.

Are the first 2 rows of data flash saved to non-volatile memory, then erased, data flash is written (except for the first 2 rows), followed by the first two rows being written to data flash?

>>> I'm not quite sure what you're advising here; can you restate the question? During a successful programming, we erase 32 times then program 32 times. They are each done in order, 0-31 much like the slua645-1.pdf document section 2.3 advises.
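The flow described here (erase rows 0-31, program rows 0-31, then the planned read-back with a full byte compare) might look roughly like the sketch below. `FakeFlash` is an in-memory stand-in written for this example, not a TI interface, and the 32-byte row size is an assumption; on real hardware each method would issue the ROM-mode commands from slua645.

```python
ROW_COUNT = 32   # from the thread: 32 erases, then 32 programs, rows 0-31
ROW_SIZE = 32    # assumption for illustration only; not stated in the thread


class FakeFlash:
    """In-memory stand-in for the gauge flash (hypothetical, not a TI API)."""

    def __init__(self):
        self.rows = [bytes([0xFF]) * ROW_SIZE for _ in range(ROW_COUNT)]

    def erase_row(self, row):
        self.rows[row] = bytes([0xFF]) * ROW_SIZE

    def program_row(self, row, data):
        self.rows[row] = bytes(data)

    def read_row(self, row):
        return self.rows[row]


def program_and_verify(flash, image):
    """Erase all rows in order, program all rows in order, then read back.

    Returns the list of row numbers whose read-back differs from the image.
    """
    for row in range(ROW_COUNT):
        flash.erase_row(row)
    for row in range(ROW_COUNT):
        flash.program_row(row, image[row])
    return [row for row in range(ROW_COUNT) if flash.read_row(row) != image[row]]


image = [bytes([row]) * ROW_SIZE for row in range(ROW_COUNT)]
print(program_and_verify(FakeFlash(), image))
```

An empty list means every row compared equal; any row numbers returned would be candidates for a retry before leaving ROM mode.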

To try and recover the device, please spam the device with the enter ROM command (write 0x00, 0x0F00) as you power the device. If the device ACKs, program the device with the senc file.

>>> I have tried hundreds of powerup cycles followed by addressing the BQ33100 (BQ). The BQ nacks the address 0x16 every time without fail. We tried probing the TS pin and it had no pulse on it; whereas a working BQ does. We tried every other address besides 0x16, and the only one NOT nacked was 0x00, the general call address. However, the BQ would nack any command following the 0x00.

Please try the spamming/powerup several times and let me know how it turns out!

0 Bryan Kahler over 6 years ago in reply to Charles Bergren

TI__Mastermind 25955 points

Hi Charles,

Thank you for the inline comments. It is not recommended to remove power from the device during programming. However, for a more resilient programming flow, please see Fig 8-10 in this document: http://www.ti.com/lit/slua449f

The app note is for a single cell gauge and the commands may differ from those of the bq33100; however, these flowcharts describe the process of programming the device in a more resilient manner (instead of straight erase and rewrite). In this flow, the first two rows are written after data flash has been confirmed to be written successfully.
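The ordering described above (everything except the first two rows is written and confirmed, and only then are the first two rows written) can be sketched as follows. This is an illustration of the ordering only, not a transcription of the slua449 flowcharts: `FakeFlash` and its `bad_row` fault injection are hypothetical, and the real commands for the bq33100 may differ.

```python
ERASED = 0xFF


class FakeFlash:
    """In-memory flash model; bad_row simulates a write that silently fails."""

    def __init__(self, row_count, row_size, bad_row=None):
        self.row_size = row_size
        self.bad_row = bad_row
        self.rows = [bytes([ERASED]) * row_size for _ in range(row_count)]

    def erase_row(self, row):
        self.rows[row] = bytes([ERASED]) * self.row_size

    def program_row(self, row, data):
        if row != self.bad_row:
            self.rows[row] = bytes(data)

    def read_row(self, row):
        return self.rows[row]


def resilient_program(flash, image, guard_rows=2):
    """Write the first guard_rows rows only after the rest has been verified."""
    row_count = len(image)
    for row in range(row_count):
        flash.erase_row(row)
    body = range(guard_rows, row_count)
    for row in body:
        flash.program_row(row, image[row])
    # Confirm the body before touching the first rows.
    if any(flash.read_row(row) != image[row] for row in body):
        return False
    for row in range(guard_rows):
        flash.program_row(row, image[row])
    return all(flash.read_row(row) == image[row] for row in range(guard_rows))


image = [bytes([row + 1]) * 4 for row in range(32)]
print(resilient_program(FakeFlash(32, 4), image))
print(resilient_program(FakeFlash(32, 4, bad_row=5), image))
```

In the failing case the first two rows are left erased, so an incomplete update stays detectable instead of looking like a finished one.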

As for an attempted device recovery:

1) Please power up the host individually (do not power up the bq33100).
2) With the host powered up, try sending the enter ROM command in an infinite loop from the host.
3) As the commands are continually being sent (to address 0x16), power up the bq33100.
4) If the device ACKs in response to its address and enters ROM mode, please program the device.
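On the host side, steps 2 to 4 might be sketched like this. The `write_word(addr, cmd, data)` callable is a hypothetical wrapper around whatever I2C/SMBus driver the host uses and is assumed to return True on ACK; the command and data values are the ones given in this thread (write 0x00, 0x0F00). The loop is bounded here rather than truly infinite so that a test run eventually gives up.

```python
import time

GAUGE_ADDR = 0x16        # address as written in the thread
ENTER_ROM_CMD = 0x00     # from the thread: write 0x00, 0x0F00
ENTER_ROM_DATA = 0x0F00


def spam_enter_rom(write_word, max_attempts=10_000, delay_s=0.001, sleep=time.sleep):
    """Send the enter-ROM command repeatedly while the gauge is powered up.

    write_word(addr, cmd, data) is a hypothetical bus helper returning True on ACK.
    Returns the attempt number that was ACKed, or None if none was.
    """
    for attempt in range(1, max_attempts + 1):
        if write_word(GAUGE_ADDR, ENTER_ROM_CMD, ENTER_ROM_DATA):
            return attempt
        sleep(delay_s)
    return None


# Simulated gauge that starts ACKing on the 5th attempt (stands in for power-up).
attempts_seen = []

def simulated_write_word(addr, cmd, data):
    attempts_seen.append((addr, cmd, data))
    return len(attempts_seen) >= 5

print(spam_enter_rom(simulated_write_word, sleep=lambda s: None))
```

If 0x16 is the 8-bit write address, as is usual for SBS-style gauges, a driver that expects 7-bit addresses would need 0x0B instead. An ACK here would be the cue to go on and program the senc file.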

Please let me know how this test turns out.

Sincerely,
Bryan Kahler

0 Charles Bergren over 6 years ago in reply to Bryan Kahler

Prodigy 130 points

Hi Thanks
I put my comments >>>inline below
I thought I replied properly once. Please delete this if it is a duplicate

Hi Charles,

Thank you for the inline comments. It is not recommended to remove power from the device during programming. However, for a more resilient programming flow, please see Fig 8-10 in this document: http://www.ti.com/lit/slua449f

The app note is for a single cell gauge and the commands may differ from those of the bq33100; however, these flowcharts describe the process of programming the device in a more resilient manner (instead of straight erase and rewrite). In this flow, the first two rows are written after data flash has been confirmed to be written successfully.

>>>CB Thanks - I will study the newer programming algorithm to see if we can implement it in place of the one we're using

As for an attempted device recovery:

1) Please power up the host individually (do not power up the bq33100).
2) With the host powered up, try sending the enter ROM command in an infinite loop from the host.
3) As the commands are continually being sent (to address 0x16), power up the bq33100.
4) If the device ACKs in response to its address and enters ROM mode, please program the device.

>>>CB Thanks - I've relayed this info to the team and we can decide if we want to try to 'unbrick' drives. We'll try this if we decide to attempt recovery. I will surely let you know how it works out.

Please let me know how this test turns out.
