BQ27510-G3: Difficulty with jamming the I2C Bus to test an error-recovery routine

Part Number: BQ27510-G3
Other Parts Discussed in Thread: BQ27510

Two years ago, I asked a question about "unjamming" the I2C slave interface of a BQ27510 Battery Gauge. (With a little luck, this post has a link back there. If not, I'll add one.)

Based on the answer to that question, we wrote the unjammer and included it in our code. Now that we are testing our completed system, we need to prove that the unjammer is effective, which means that we first need to be able to JAM the I2C interface of the Battery Gauge. So I wrote some code that bit-bangs out an I2C "Start condition" followed by a few Address Bits but then abandons that I2C transaction partway through. Here's a 'scope shot of that code running:

 

This jams some samples of our Battery Gauges (such that subsequent attempts to do I2C transactions with the gauge fail until we do the recovery procedure) but not other samples.
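
For anyone who wants to reproduce the experiment, the jam sequence described above (a START followed by a few address bits, then nothing) might look roughly like the sketch below. This is not the code that produced the 'scope shot: `jam_sequence`, `set_sda`, `set_scl`, and `bits_to_send` are hypothetical names, the 7-bit address 0x55 is an assumption to check against the data sheet, and bit timing is left out. The pin setters are passed in so the logic can be exercised on a host before it is ported to firmware.

```python
from typing import Callable


def jam_sequence(set_sda: Callable[[int], None],
                 set_scl: Callable[[int], None],
                 address: int = 0x55,      # assumed 7-bit gauge address
                 bits_to_send: int = 3) -> None:
    """Bit-bang a START plus the first few address bits, then walk away."""
    # Bus idle: both lines released (pulled high by the pull-ups).
    set_sda(1)
    set_scl(1)
    # START condition: SDA falls while SCL is high.
    set_sda(0)
    set_scl(0)
    # Clock out only the most significant bits of the 7-bit address.
    for i in range(bits_to_send):
        bit = (address >> (6 - i)) & 1
        set_sda(bit)   # data changes while SCL is low
        set_scl(1)     # the slave samples SDA while SCL is high
        set_scl(0)
    # Deliberately no more clocks and no STOP: the transaction is abandoned.
    # How the lines are released afterwards is not covered by this sketch.


# Host-side usage: record the pin activity instead of driving real GPIOs.
trace = []
jam_sequence(lambda v: trace.append(("SDA", v)),
             lambda v: trace.append(("SCL", v)))
```

Recording the calls like this makes it easy to compare the intended waveform against what the 'scope actually shows.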

So we're wondering "What's up?"

If the Battery Gauges are true, correct I2C devices, except that they can be reset by holding SCL and SDA low for two seconds, then the jammer should be effective against all of them. But it's not.

Can you shed any light on this? For example, have there been any "steppings" of the BQ27510 lately? Is there an errata sheet published that might give me some clues?

Atlant

  • Hello Atlant,

    Are you holding the lines down for 2 seconds after you jam the gauge?

    Let me check with some other members of my team and get back to you tomorrow.

    Sincerely,

    Wyatt Keller

  • Wyatt:

    Thanks for your reply!

    After we do the "jam" sequence (that I showed in my first post), the next thing we attempt to do is an ordinary I2C Battery Gauge read operation (which happens anywhere from instantaneously to 60 seconds later). We expect that read to fail but for most of our Battery Gauge chips, it doesn't. The Battery Gauges respond to that I2C transaction just as if we hadn't tried to jam the bus. One sample jams about 50% of the time. One sample jams rarely. But most* seem to completely ignore the jam attempt**.

    If the ordinary read fails, it is only THEN that we do the "two second unjam" operation and, as far as we know, that always works: after that, the Battery Gauge responds to the next ordinary I2C Battery Gauge read operation.
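
    A minimal sketch of that "two second unjam" step is shown below, assuming bit-banged control of both lines. The names `unjam_gauge`, `set_sda`, `set_scl`, and `hold_s` are hypothetical, and the order in which the lines are released afterwards is an assumption (SCL first, so that SDA rises while SCL is high); the thread does not say how the real unjammer releases them.

```python
import time
from typing import Callable


def unjam_gauge(set_sda: Callable[[int], None],
                set_scl: Callable[[int], None],
                sleep: Callable[[float], None] = time.sleep,
                hold_s: float = 2.0) -> None:
    """Hold SCL and SDA low for `hold_s` seconds, then release both lines."""
    set_scl(0)
    set_sda(0)
    sleep(hold_s)      # the two-second hold described in the thread
    # Assumed release order: SCL first, then SDA.
    set_scl(1)
    set_sda(1)


# Host-side usage with a fake sleep, so nothing actually waits two seconds.
log = []
unjam_gauge(lambda v: log.append(("SDA", v)),
            lambda v: log.append(("SCL", v)),
            sleep=lambda s: log.append(("sleep", s)))
```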

    Atlant

    * We don't have a huge amount of statistics yet. The date code for the gauge that does jam is "99 TD" (which ought to be quite recent) but other gauges of that same date code don't jam. We believe they all have identical firmware and chemistry files and other configuration data.

    ** Ordinarily, I'd say that that's a great thing but it's giving our test personnel grief because it prevents them from proving that our "unjammer" works.

  • Hello Atlant,

    I'm waiting to hear back from the team. If you are able to un-jam with the 2 second wait and also still able to get some good communication after trying to jam wouldn't this condition be okay to proceed with? It sounds like they are actually harder to jam than expected.

    Sincerely,

    Wyatt Keller

  • Wyatt:

    > I'm waiting to hear back from the team.

    Thanks -- I appreciate your help!

    > If you are able to un-jam with the 2 second wait and also still able to get some good communication
    > after trying to jam wouldn't this condition be okay to proceed with? It sounds like they are actually
    > harder to jam than expected.

    I agree that it's a good thing that the gauges appear to be "harder to jam"* than expected but the
    problem for me is that our test folks want us to be able to affirmatively demonstrate that all of our
    code is working. For the unjammer code, that means that they would like to see the gauge be
    proven to be jammed, have the unjammer run, and then have the gauge be proven to be unjammed.

    That's a good approach to testing. The difficulty comes if the gauge really is hard to jam (in which
    case we'll need a different testing approach or we'll need to write a justification about why the
    unjammer can't be explicitly tested).

    But that still leaves us wondering why different samples of the gauge are behaving differently.

    Atlant

    * Maybe some SMBus technology has snuck into the BQ27510? ;-) Reading the SMBus
    spec, it looks like those gauges would be essentially impossible to jam owing to how the
    SMBus spec requires resetting the state machine after 25 ms of CLK low (CLK stretching)
    or after (I think) 10 ms of CLK high (representing bus idle).
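
    To make that reasoning concrete, here is a toy model of a slave that resets its own state machine after a clock-low timeout. It is an illustration only and says nothing about how the BQ27510-G3 is actually built: `ToySlave` and its methods are hypothetical, and only the 25 ms figure comes from the SMBus discussion above.

```python
from typing import Optional

SMBUS_CLOCK_LOW_TIMEOUT_S = 0.025  # 25 ms, as cited above


class ToySlave:
    """Toy slave state machine; not the BQ27510-G3's real behavior."""

    def __init__(self, clock_low_timeout_s: Optional[float]):
        # None models a plain I2C device with no timeout at all.
        self.clock_low_timeout_s = clock_low_timeout_s
        self.mid_transaction = False
        self.scl_low_since: Optional[float] = None

    def abandon_with_scl_low(self, now: float) -> None:
        """Master sent START plus some bits, then stopped with SCL low."""
        self.mid_transaction = True
        self.scl_low_since = now

    def is_jammed(self, now: float) -> bool:
        """True if the slave is still stuck mid-transaction at time `now`."""
        if (self.mid_transaction
                and self.clock_low_timeout_s is not None
                and self.scl_low_since is not None
                and now - self.scl_low_since >= self.clock_low_timeout_s):
            # Timeout elapsed: the slave forgets the abandoned transaction.
            self.mid_transaction = False
            self.scl_low_since = None
        return self.mid_transaction


plain_i2c = ToySlave(None)
smbus_like = ToySlave(SMBUS_CLOCK_LOW_TIMEOUT_S)
plain_i2c.abandon_with_scl_low(now=0.0)
smbus_like.abandon_with_scl_low(now=0.0)
```

    In this model the plain device stays jammed indefinitely, while the SMBus-like device is only jammed if the next transaction arrives within the timeout window.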

  • Hello Atlant,

    I see your problem now, it is more about testing the recovery.

    I've heard back from the designers, they told me the following:

    There's no 2-second timeout requirement in the I2C protocol so not all devices will honor it. The best thing to do in an abandoned-transaction scenario is to send a STOP to reset the bus. If the transaction is abandoned while the slave is holding SDA low, preventing a STOP from being generated, then they need to follow the I2C specification and send up to 9 clocks to make the slave release SDA first, then send a STOP.
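
    The recovery the designers describe (clock until the slave releases SDA, at most 9 times, then send a STOP) could be sketched as follows. This is a host-side illustration, not TI's code: `recover_bus`, the pin accessors, and the `StuckSlave` stand-in are hypothetical, and bit timing is omitted.

```python
from typing import Callable


def recover_bus(read_sda: Callable[[], int],
                set_sda: Callable[[int], None],
                set_scl: Callable[[int], None],
                max_clocks: int = 9) -> int:
    """Clock SCL until SDA is released, then issue a STOP.

    Returns the number of clock pulses that were needed.
    """
    set_sda(1)                      # the master releases SDA
    clocks = 0
    while read_sda() == 0 and clocks < max_clocks:
        set_scl(0)
        set_scl(1)                  # one clock pulse
        clocks += 1
    if read_sda() == 0:
        raise RuntimeError("SDA still low after %d clocks" % clocks)
    # STOP condition: SDA rises while SCL is high.
    set_scl(0)
    set_sda(0)
    set_scl(1)
    set_sda(1)
    return clocks


class StuckSlave:
    """Toy slave that holds SDA low until it has seen enough SCL rising edges."""

    def __init__(self, clocks_needed: int):
        self.remaining = clocks_needed
        self.scl = 1
        self.master_sda = 1
        self.events = []

    def set_scl(self, value: int) -> None:
        if value == 1 and self.scl == 0 and self.remaining > 0:
            self.remaining -= 1     # count a rising edge
        self.scl = value
        self.events.append(("SCL", value))

    def set_sda(self, value: int) -> None:
        self.master_sda = value
        self.events.append(("SDA", value))

    def read_sda(self) -> int:
        # Open-drain bus: the line is low if either side pulls it low.
        return 0 if (self.remaining > 0 or self.master_sda == 0) else 1


slave = StuckSlave(clocks_needed=5)
pulses = recover_bus(slave.read_sda, slave.set_sda, slave.set_scl)
```

    Checking SDA before every pulse means the routine stops clocking as soon as the slave lets go, rather than always sending all nine clocks.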

    I'm not sure if this helps with your testing, they didn't specify how to cause a glitch in the I2C to be able to recover from, just how to fix it.

    Sincerely,

    Wyatt Keller

  • Wyatt:

    > I'm not sure if this helps with your testing, they didn't specify how to cause a glitch
    > in the I2C to be able to recover from, just how to fix it.

    Can they (publicly) describe the BQ27510-G3's I2C state machine enough to allow
    me/you/us to understand why most of my gauge samples don't jam? It's still puzzling
    why one does and most don't when they ought to be identical.

    Also, a new and maybe-related question:

    The Technical Reference Manual mentions the "I2C Timeout" parameter:

    It's stated as a percentage of something and I ASSUME that it's a percentage
    of this Data Sheet value:

    But as far as I can see, none of this is explained further. And I
    wonder if any of this feeds-in to my question of why my gauges
    generally won't jam.

    BTW, I THINK we set this parameter to "4" which may mean 0.5 seconds
    (the math works out sorta-kinda close).

    Atlant

  • Hello Atlant,

    I don't believe we can go into the details of the state machine. They said that the gauge can't "jam"; what you are seeing is basically an abandoned transaction, and the procedure I passed along was the design team's answer for getting out of any abandoned transaction.

    Yes, this could be related to your question. Have you tried adjusting the value to see if it affects your testing? The gauge could be timing out before any errors occur, which would prevent any jams.

    Sincerely,

    Wyatt Keller

  • Wyatt:

    > There's no 2-second timeout requirement in the I2C protocol so not all devices will honor it.

    Thanks; I understand that. But at this point, the BQ27510-G3 is the only device left on our I2C Bus; everything else has been migrated to their SPI Bus equivalents. (Would this be the moment for me to suggest that we really wish that the TI Gas Gauges came in SPI Bus versions? ;-) They don't, do they?)

    > I see your problem now, it is more about testing the recovery.

    Absolutely! We generally require that every high-level specification item (such as "The Battery Gauge subsystem shall recover from transient bus errors.") propagate down to the subsystem specs (same language) and then farther down to the very specific test plans (such as "The test of the Battery Gauge bus error recovery logic shall 1. Create an I2C Bus error; 2. Demonstrate successful recovery from that bus error."). It's that first item in the test that's giving us grief right now.
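
    One way to express that two-step test plan in code is a harness with three possible outcomes, so that a gauge that refuses to jam is reported separately from a recovery that fails. This is only a sketch: `run_recovery_test`, the three callables, and the `FakeGauge` stand-in are hypothetical names, not part of our test system.

```python
from typing import Callable


def run_recovery_test(create_bus_error: Callable[[], None],
                      read_gauge: Callable[[], bool],
                      recover: Callable[[], None]) -> str:
    """Return "PASS", "FAIL", or "INCONCLUSIVE" for one test attempt."""
    create_bus_error()              # step 1: try to create an I2C Bus error
    if read_gauge():
        # The read still works, so the error was never created and the
        # recovery logic has not been exercised.
        return "INCONCLUSIVE"
    recover()                       # step 2: run the recovery routine
    return "PASS" if read_gauge() else "FAIL"


class FakeGauge:
    """Stand-in gauge for trying the harness on a host."""

    def __init__(self, jams: bool):
        self.jams = jams            # whether a jam attempt takes effect
        self.jammed = False

    def create_bus_error(self) -> None:
        self.jammed = self.jams

    def read(self) -> bool:
        return not self.jammed      # True means the read succeeded

    def recover(self) -> None:
        self.jammed = False


easy = FakeGauge(jams=True)
stubborn = FakeGauge(jams=False)
result_easy = run_recovery_test(easy.create_bus_error, easy.read, easy.recover)
result_stubborn = run_recovery_test(stubborn.create_bus_error,
                                    stubborn.read, stubborn.recover)
```

    With this shape, a gauge that ignores the jam attempt produces an explicit "INCONCLUSIVE" rather than a misleading pass.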

    > ...they didn't specify how to cause a glitch in the I2C to be able to recover from...

    We may need to figure out how to justify testing our requirement purely through software and specification analysis. I guess I'll see how folks react to that.

    Atlant

    P.S.:

    > The best thing to do in an abandoned-transaction scenario is to send a STOP to reset the bus. If the
    > transaction is abandoned while the slave is holding SDA low, preventing a STOP from being generated,
    > then they need to follow the I2C specification and send up to 9 clocks to make the slave release SDA
    > first, then send a STOP.

    In another of our projects that had more-conventional I2C Bus slave devices,
    this is exactly the approach that we took and we have the Bus Analyzer traces
    to prove that our code works! ;-)

  • Hello Atlant,

    Currently none of the gauges use SPI, and I'm not sure if there are plans in the near future to support this communication as well. Hopefully there will be!

    I am sorry I can't give a better answer. It looks like you may have to change your testing procedure to see how well the communication can recover.

    Sincerely,

    Wyatt Keller