BQ79616-Q1: HV transients in the battery modules cause BQ chips to lock up and require a physical power cycle

Part Number: BQ79616-Q1
Other Parts Discussed in Thread: BQ79616, ESD562

Dear TI Engineers,

We are a Formula Student racing team from Germany, developing and building a fully electric race car. Our high-voltage accumulator operates at 600 V and consists of 12 modules, each equipped with a slave board. The system also includes a master board, which uses a BQ79616 in master mode with galvanic isolation (implemented according to TI’s recommendation with PCB-to-PCB isolation between two slaves).

Last year, we encountered a very similar problem once the modules were connected together. With our second-generation accumulator, the failure has reappeared—this time in a slightly different manner.

For more than three months now, we have been facing this issue without finding a sustainable solution. I am at a loss as to whether and how we can still solve the problem within the next two weeks, which is why I am contacting you in a desperate attempt. Unfortunately, time is running out for us, and we want to make sure we have tried everything possible to get the race car running at the upcoming events.

We are turning to you in the hope that you might somehow be able to help us, or that you may know someone who could.

To summarize briefly, it concerns the precharge process of our HV battery, during which two errors occur randomly:

  1. Communication crash between master and slaves

    • Communication between the master board and the twelve slave boards (12 × 50 V stacks) crashes.

    • The precharge process is aborted and the shutdown circuit opens.

    • Sometimes no values are transmitted, or suddenly implausible values appear and continue to be transmitted even after the shutdown circuit opens.

    • Only when the master board is power-cycled (reinitializing the BQ79616-Q1 startup sequence) do the readings (voltage and temperature) become valid again.

  2. Complete crash of one or more slaves

    • At least one BQ79616-Q1 on the stack boards crashes completely.

    • The shutdown circuit opens, and communication cannot be restored until the affected board is manually restarted via its hardware switch.

Both errors occur either when the negative-pole contactor (AIR-) is closed or when the battery’s precharge relay is closed. However, once the precharge succeeds a single time, the battery can be used without further problems.

We have performed extensive troubleshooting: oscilloscope measurements at many points, replacement of all components, redesign of boards with improvements, cable swapping, EMC shielding with copper tape, software validation, and more. Despite this, the issue persists and we are running out of ideas. I fear that we can no longer see the forest for the trees.

Additional observations:

  • We do observe HV transients on the daisy-chain communication lines.

  • Each slave board hosts two BQ devices.

  • Our layout does not include ground planes, as we initially assumed the copper busbars beneath the boards would provide a sufficient reference.

We are currently preparing a structured write-up with a detailed description of the accumulator, the failure cases, and our attempted solutions. The information provided here is therefore only a summary.

Perhaps, however, you can already classify the problem and indicate whether there are known sensitivities in such setups, or whether you could potentially support us in narrowing down the root cause.

We are fully aware that you receive many support requests, and we have no expectations. We simply wanted to make this attempt, because hope for a solution is the last thing we are holding on to.

Many thanks in advance for your time and any guidance you can provide.

With best regards,
Hasan Fawaz

  • Hello Hasan,

    My first suspicion is the COM lines: closing the contactors can induce transients on the COM pins that can soft-lock the device.
    Are you sure those TVS diodes 'ESD2CAN12' are rated properly?

    We typically use the ESD562; maybe you can try swapping those in?

    Best,

        Quentin

  • Hi Quentin,
    Thank you so much for your reply.
    Yes, that is something we can still try. We have already tried different types of diodes; with those we got better results, and the problem occurred less often.

    This may address the root cause, but I would still like to know how to clear the soft lock and get the BQ to restart. We have, of course, tried sending a HW reset tone to the base device and then used it to propagate the HW reset tone up the stack.

    Is a HW reset really equivalent to manually power cycling the BQs? What is the difference between the two?

  • Hasan,

    If you have some sort of soft latch on the COM pins, then a HW_RST is not going to do anything.
    The reason is that an issue on the COM pins may prevent any further communication, including the reset tone, from reaching the IC.
    Therefore, a HW_RST wouldn't work.

    A power cycle completely unpowers all internal blocks and disables the COM pins, which removes any latched state.

    Best,

         Quentin
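
The distinction drawn in this thread (a HW reset tone travels over the COM pins and so cannot clear a latch on those same pins, while a power cycle can) suggests an escalating recovery routine on the master side. The following is only a minimal sketch of that logic, not TI's or the poster's implementation. All names (`recover_stack`, `comms_ok`, `send_hw_reset`, `power_cycle`, `LatchedDevice`) are hypothetical, and it assumes the hardware offers some way to remove and restore the supply of the affected board under software control, which the setup described above does not have (it uses a manual switch).

```python
from enum import Enum
from typing import Callable


class Recovery(Enum):
    NONE_NEEDED = "none_needed"
    HW_RESET = "hw_reset"
    POWER_CYCLE = "power_cycle"
    FAILED = "failed"


def recover_stack(
    comms_ok: Callable[[], bool],
    send_hw_reset: Callable[[], None],
    power_cycle: Callable[[], None],
    hw_reset_attempts: int = 2,
) -> Recovery:
    """Escalate from the least to the most invasive recovery step.

    All three callbacks are hypothetical hooks supplied by the caller;
    comms_ok() is assumed to re-run the startup sequence and then check
    that valid readings come back from the stack.
    """
    if comms_ok():
        return Recovery.NONE_NEEDED

    # Step 1: HW reset tone. It is sent over the COM pins, so it cannot
    # help if those pins are the part that is latched.
    for _ in range(hw_reset_attempts):
        send_hw_reset()
        if comms_ok():
            return Recovery.HW_RESET

    # Step 2: remove and restore power, which unpowers all internal
    # blocks including the COM interface.
    power_cycle()
    if comms_ok():
        return Recovery.POWER_CYCLE
    return Recovery.FAILED


class LatchedDevice:
    """Toy model of a device whose COM pins are soft-latched."""

    def __init__(self) -> None:
        self.latched = True
        self.hw_resets = 0

    def comms_ok(self) -> bool:
        return not self.latched

    def send_hw_reset(self) -> None:
        # The tone never reaches the IC while the COM pins are latched.
        self.hw_resets += 1

    def power_cycle(self) -> None:
        self.latched = False
```

With the toy `LatchedDevice`, `recover_stack(dev.comms_ok, dev.send_hw_reset, dev.power_cycle)` exhausts the HW reset attempts and only succeeds after the power cycle, mirroring the behaviour reported in the thread. Logging which step was needed on each precharge attempt would also give a rough measure of whether a TVS diode change reduces the latching.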