BQ76942: POR reset by internal watchdog after exit of config mode

Part Number: BQ76942
Other Parts Discussed in Thread: BQSTUDIO

Hi!

We have some problems related to config mode and the internal watchdog of the AFE. When powering up, we have two subroutines that we run.

Subroutine 1 checks if AFE is ready:

  • Check if AFE is in deepsleep and wake AFE if so
  • Check if there are any permanent failures stored in AFE by checking PF Status (0x0053).
  • Check OTP programming etc
  • Check if AFE is configured. Right now we do this by checking some version numbers stored in the MCU, the POR bit, and some registers in the AFE.
    • If a configuration is needed, we first perform a partial reset via pin.
    • Run subroutine 2 if configuration is needed
  • Wait for a fullscan to occur before resuming normal operation, if configuration is not needed
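
The decision order in subroutine 1 can be sketched as a pure function. This is only an illustration: the `afe_snapshot_t` fields, the action names, and the choice to stop when a permanent failure is found are assumptions rather than the actual firmware, and the OTP check is left out for brevity.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical snapshot of what the MCU has learned about the AFE at power-up. */
typedef struct {
    bool in_deepsleep;       /* AFE reported DEEPSLEEP */
    bool permanent_failure;  /* any bit set in PF Status (0x0053) */
    bool config_matches;     /* version numbers and spot-checked registers agree */
    bool fullscan_seen;      /* a full scan has completed since start-up */
} afe_snapshot_t;

typedef enum {
    AFE_WAKE,           /* wake the AFE from deepsleep first */
    AFE_HALT_ON_PF,     /* assumed reaction to a stored permanent failure */
    AFE_RECONFIGURE,    /* partial reset via pin, then run subroutine 2 */
    AFE_WAIT_FULLSCAN,  /* configured, but no full scan yet */
    AFE_RUN             /* resume normal operation */
} afe_action_t;

/* Returns the next step, checking conditions in the order listed above. */
afe_action_t afe_next_action(const afe_snapshot_t *s)
{
    assert(s != NULL);
    if (s->in_deepsleep) {
        return AFE_WAKE;
    }
    if (s->permanent_failure) {
        return AFE_HALT_ON_PF;
    }
    if (!s->config_matches) {
        return AFE_RECONFIGURE;
    }
    if (!s->fullscan_seen) {
        return AFE_WAIT_FULLSCAN;
    }
    return AFE_RUN;
}
```

Keeping the decision separate from the I2C traffic makes the ordering easy to unit test on a host, without an AFE attached.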

Subroutine 2 happens if configuration of the AFE is needed. The configuration subroutine looks like:

  • Enter config mode
  • Write parameters over I2C  (~150+ parameters)
    • Read back each parameter after the write to check that it is set.
  • Exit config mode
  • Store version numbers in MCU
  • Go back through subroutine 1 to resume normal operation
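
A minimal sketch of subroutine 2 in C, assuming a hypothetical `afe_bus_t` set of transport hooks that would wrap the MCU's I2C driver. All names here are invented for illustration, parameters are simplified to 16-bit values (real data-memory parameters vary in width), and a small in-memory fake stands in for the AFE so the sketch can run on a host.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical transport hooks; real ones would wrap the MCU's I2C driver. */
typedef struct {
    bool (*enter_config)(void *ctx);
    bool (*exit_config)(void *ctx);
    bool (*write_param)(void *ctx, uint16_t addr, uint16_t value);
    bool (*read_param)(void *ctx, uint16_t addr, uint16_t *value);
    void *ctx;
} afe_bus_t;

typedef struct {
    uint16_t addr;
    uint16_t value;  /* simplified: real parameters vary in width */
} afe_param_t;

/* Writes and reads back each parameter inside config mode.
 * *verified receives the number of parameters confirmed by readback.
 * Returns true only if every parameter verified and config mode was exited. */
bool afe_configure(const afe_bus_t *bus, const afe_param_t *params,
                   size_t count, size_t *verified)
{
    assert(bus != NULL && verified != NULL);
    *verified = 0;
    if (!bus->enter_config(bus->ctx)) {
        return false;
    }
    for (size_t i = 0; i < count; ++i) {
        uint16_t readback = 0;
        if (!bus->write_param(bus->ctx, params[i].addr, params[i].value) ||
            !bus->read_param(bus->ctx, params[i].addr, &readback) ||
            readback != params[i].value) {
            break;  /* stop at the first parameter that cannot be confirmed */
        }
        ++*verified;
    }
    /* Always leave config mode, even after a failed write. */
    bool exited = bus->exit_config(bus->ctx);
    return exited && *verified == count;
}

/* In-memory stand-in for the AFE so the sketch can run on a host. */
#define FAKE_SLOTS 8u
typedef struct {
    bool in_config;
    size_t used;
    afe_param_t slot[FAKE_SLOTS];
} fake_afe_t;

bool fake_enter(void *ctx) { ((fake_afe_t *)ctx)->in_config = true; return true; }
bool fake_exit(void *ctx) { ((fake_afe_t *)ctx)->in_config = false; return true; }

bool fake_write(void *ctx, uint16_t addr, uint16_t value)
{
    fake_afe_t *f = ctx;
    if (!f->in_config || f->used >= FAKE_SLOTS) {
        return false;  /* rejected outside config mode or when the fake is full */
    }
    f->slot[f->used].addr = addr;
    f->slot[f->used].value = value;
    f->used++;
    return true;
}

bool fake_read(void *ctx, uint16_t addr, uint16_t *value)
{
    fake_afe_t *f = ctx;
    for (size_t i = f->used; i-- > 0;) {  /* newest write wins */
        if (f->slot[i].addr == addr) {
            *value = f->slot[i].value;
            return true;
        }
    }
    return false;
}

afe_bus_t fake_bus(fake_afe_t *f)
{
    afe_bus_t bus = { fake_enter, fake_exit, fake_write, fake_read, f };
    return bus;
}
```

Storing the version numbers in the MCU and returning to subroutine 1 would follow only when `afe_configure` returns true; the `verified` count helps locate the first parameter that failed readback.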

We've run into a case where the AFE seems to reboot due to the internal watchdog. 

What we've done is remove two parameters from our parameter list: the CFETF permanent fail threshold and delay registers, 0x92EE and 0x92F0. When we do this, we get no response to the 0x0053 command. When we later check the battery status, we see in the MCU that the POR and WD bits have been set.
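
For reference, a check like the one described could be sketched as below. The bit positions (POR as bit 3 and WD as bit 4 of the 16-bit Battery Status word) are an assumption that should be verified against the technical reference manual, and the enum and function names are made up for this example.

```c
#include <assert.h>
#include <stdint.h>

/* Assumed bit positions in the Battery Status word; verify against the TRM. */
#define BATT_STATUS_POR (1u << 3)
#define BATT_STATUS_WD  (1u << 4)

typedef enum {
    AFE_RESET_NONE,      /* neither flag set */
    AFE_RESET_OTHER,     /* POR set without WD: a full reset from another cause */
    AFE_RESET_WATCHDOG   /* WD set: treated here as a watchdog-triggered reset */
} afe_reset_cause_t;

/* Classifies the reset flags found in a Battery Status reading. */
afe_reset_cause_t afe_reset_cause(uint16_t battery_status)
{
    if ((battery_status & BATT_STATUS_WD) != 0u) {
        return AFE_RESET_WATCHDOG;
    }
    if ((battery_status & BATT_STATUS_POR) != 0u) {
        return AFE_RESET_OTHER;
    }
    return AFE_RESET_NONE;
}
```

Logging the raw status word alongside the classification keeps the evidence available if the assumed bit layout turns out to be wrong.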

Questions:

  • Any idea what may trigger the internal watchdog in this scenario? Is there any way of finding the reason?
  • Is there any relationship between the CFETF permanent fail registers and other registers? Do we have to write all of them? We want AFE to handle DFETF protection, but not CFETF.
  • Is there any particular sequence that should occur when exiting config mode? We figured out that we should wait for a fullscan. Is there any other timing related events?

Best regards

// Erik Almqvist

  • Hi Erik,

    This seems very strange - I cannot see why removing two parameters from your list would cause any issue. Is it possible some other error was introduced when making this change?

    Matt

  • Hi Matt!

    Thanks for your support!

    I work as a programmer, so I'm quite familiar with the feeling of finding out that it's your own code that's messed up. It wouldn't surprise me if, in the end, this was the case yet again. I'm quite certain that the only change between my two builds is sending these extra parameters (two commented lines in my case). I've also tried to validate that the rest of the data sent to the AFE is consistent. Although, for sure, I might have missed things.

    I've previously sent this forum post regarding reset:

    BQ76942: AFE reset when query DA STATUS 5 after config mode
    where it seems like the AFE resets after querying stuff right after config mode. We still don't use CRC, but our protocol is quite stable once up and running.

    We are trying to figure out whether it might even be possible to send a garbage setting that would trigger the watchdog, or whether anything else might come into play timing-wise. As far as I can tell, the documentation doesn't mention any procedure for exiting config mode. The documentation does mention that overflow of serial communication might cause the watchdog to trigger. We send less data in our case though...

    Is there any way of finding out why the WD has reset?

    Best regards

    //Erik Almqvist

  • Hi Erik,

    There isn't any method that I am aware of that would cause the watchdog to trigger. I also have never been able to cause a watchdog by communicating too much with the device (I have tried to test this before to see if I could trigger it on purpose this way). Is there any way you could share your settings with me - you could send them in a private E2E message?

    One other thing that would be good to verify - there was a watchdog issue on the early samples of the device before this device was released to production. Can you confirm you are using the production version (FW version should be 36)? 

    Regards,

    Matt

  • Hi Matt!

    Thanks for your support!

    The firmware version of the AFE is 36, so no luck there.

    We are not using bqstudio to flash the parameters. We have embedded C code for that. Creating a minimal example with this code that preserves the timing of the configuration sequence is possible, but could take a long time. I can share the parameter set if you want though?

    I've been able to find some additional clues today. It seems like the AFE can cause the POR reset even without the parameter changes I mentioned. It seems to be more likely to happen when powering on the board from shutdown than when the reconfiguration sequence is triggered. Adding a delay between each parameter seems to affect the likelihood of the POR, but not necessarily in a positive way. I'm not sure the problem lies in the configuration sequence though. If I understand the manual correctly, the internal watchdog should be disabled while in CONFIG UPDATE MODE. Hence, it must be something that happens at the exit of config mode, when the AFE resumes normal operation.

    Is there any bit that should be set prior to querying the AFE for data after update mode? Is it safe to poll the 0x0053 PERMANENT FAILURE status at exit of config mode? Or should we poll e.g. Alarm Status INITCOMP or similar first?

    Best regards

    //Erik Almqvist

  • Hi Erik,

    There is always the possibility you've uncovered a new behavior with the device we have not encountered before. How much time is there between the exit of CONFIG_UPDATE mode and the read from 0x0053? Does adding some wait time at this point have any impact? 

    Regards,

    Matt

  • Hi Matt!

    Thanks for your support!

    It's a bit hard to summarize in a good way, because we've tested a bunch of things back and forth on our end. But we learned some things from your post yesterday.

    * If we added a 50 ms delay after the exit of config mode, the sequence seems a lot more stable (no fullscan check is made prior to this in this test). We ran a couple of hundred iterations successfully. Normally we would have a 7 ms delay before we query the AFE for the PF STATUS register. If we extend this to ~60 ms it looks better.

    * We've also worked a bit with re-adding a fullscan check after exit of config mode. In previous tests, we waited for up to 500 ms after exit of config mode, but still had problems. It seems like it takes around 1.1 seconds for the full scan bit to set again if we clear it just after exiting config mode. I'm not sure that waiting this long actually guarantees success though; we are still testing. It does make sense to me to wait for a fullscan after exit of config mode.

    It seems like the AFE performs a POR due to the watchdog if we query some commands before the AFE is ready after exit of config mode. It seems timing related. Some commands seem OK to send, like battery status and control status. Others do not seem OK to send, like PF status, REG 12 (and DA Status 5 requests).


    (We are still evaluating on our side. Please keep the thread open if we have any followup questions within next few days)

    Best regards

    //Erik Almqvist

  • Hi Erik,

    These are interesting findings. So it seems that subcommand requests are causing issues without enough wait time upon exiting CONFIG_UPDATE mode, but direct commands are okay (just a hypothesis based on the commands that cause issues vs the ones that do not).

    Regards,

    Matt
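
The hypothesis in the reply above (direct commands are fine, subcommands need the AFE to be ready first) could be enforced on the host side with a small gate. This is a sketch under that unconfirmed hypothesis; the type and function names are illustrative and do not come from any TI library.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

typedef enum {
    AFE_DIRECT_COMMAND,  /* e.g. battery status, control status */
    AFE_SUBCOMMAND       /* e.g. PF status, DA Status 5 */
} afe_request_kind_t;

typedef struct {
    bool fullscan_since_config_exit;
} afe_link_state_t;

/* Call right after leaving config mode: subcommands become blocked. */
void afe_on_config_exit(afe_link_state_t *st)
{
    assert(st != NULL);
    st->fullscan_since_config_exit = false;
}

/* Call when a completed full scan has been observed. */
void afe_on_fullscan(afe_link_state_t *st)
{
    assert(st != NULL);
    st->fullscan_since_config_exit = true;
}

/* Direct commands always pass; subcommands wait for the first full scan. */
bool afe_request_allowed(const afe_link_state_t *st, afe_request_kind_t kind)
{
    assert(st != NULL);
    if (kind == AFE_DIRECT_COMMAND) {
        return true;
    }
    return st->fullscan_since_config_exit;
}
```

Routing every request through one such check keeps the timing rule in a single place instead of scattering delays across the start-up code.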

  • Hi Matt!

    Thanks for your support once again!

    I did some additional tweaking to our sequences today. We now wait for a complete fullscan after exit of configuration, and have changed the order of some checks we do when we boot up. To me, it seems like the time it takes for the AFE to load the parameters depends on the number of parameters. We currently don't have pin-driven fullscan checks, but check every 50 ms for new data. If we write 179 parameters in config mode, it takes 23*50 ms = 1.15 s for the fullscan bit to set. If I instead write 71 parameters in config mode, it takes 16*50 ms = 800 ms for the fullscan to complete. The fact that it takes time for the AFE to load the parameters seems logical. I don't find any references to this in the documentation though (or to avoiding subcommands... :) )

    It seems like the fix on our side to avoid the POR in this case is a slight adjustment to how we wait for fullscan. I've been looping hundreds of iterations without problems since I fixed our fullscan check (previously we waited max 500 ms). If we communicate prior to fullscan, the AFE seems prone to reset due to the internal watchdog. As it seems timing related, maybe there could be something with the processing of incoming subcommands during the sequence of applying the configuration (just guessing at this point though).

    Edit: The addition / removal of the CFETF parameters likely changed the timing of the events in the AFE slightly, which is why I originally had problems with the sequence by changing only these parameters...

    Best regards
    //Erik Almqvist
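
The adjusted wait described in the reply above might look roughly like this: poll at a fixed interval until the full-scan flag is seen or a timeout expires. The callback types, the fake helpers, and the 2000 ms timeout used in the usage note are assumptions for illustration; only the 50 ms poll interval, the old 500 ms limit, and the 23-poll observation come from the thread.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

typedef bool (*fullscan_poll_fn)(void *ctx);  /* true once the full-scan flag is seen */
typedef void (*delay_ms_fn)(uint32_t ms);

/* Polls every poll_ms until the flag is seen or timeout_ms has elapsed.
 * Returns the elapsed time in milliseconds, or -1 on timeout. */
int32_t afe_wait_fullscan(fullscan_poll_fn poll, void *ctx, delay_ms_fn delay,
                          uint32_t poll_ms, uint32_t timeout_ms)
{
    assert(poll != NULL && delay != NULL && poll_ms > 0u);
    uint32_t waited = 0;
    while (waited < timeout_ms) {
        delay(poll_ms);
        waited += poll_ms;
        if (poll(ctx)) {
            return (int32_t)waited;
        }
    }
    return -1;
}

/* Host-side fakes: the flag appears on the Nth poll, and delays are skipped. */
typedef struct {
    uint32_t polls;
    uint32_t ready_after;
} fake_scan_t;

bool fake_poll(void *ctx)
{
    fake_scan_t *s = ctx;
    s->polls++;
    return s->polls >= s->ready_after;
}

void fake_delay(uint32_t ms)
{
    (void)ms;  /* no real waiting in the host simulation */
}
```

With the fake set to report the flag on the 23rd poll, a 500 ms timeout gives up after ten polls, while a 2000 ms timeout succeeds after 1150 ms, matching the 23*50 ms figure reported for 179 parameters.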

  • Hi Erik,

    Thanks for sharing these details. I didn't realize the timing changed based on the number of parameters, but this makes sense and I can check with our firmware team to confirm this. I am sure this is going to help other users in the future who run into this issue.

    Thanks!

    Matt
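
As a rough aid for choosing a timeout, the two measurements reported earlier in the thread (71 parameters, about 800 ms; 179 parameters, about 1.15 s) can be joined by a straight line. This is only a back-of-the-envelope sketch: two points with 50 ms polling granularity do not establish that the relationship is linear, the behavior is not confirmed by the firmware team, and the names below are invented for the example.

```c
#include <assert.h>
#include <stdint.h>

typedef struct {
    int32_t params;  /* number of parameters written in config mode */
    int32_t ms;      /* observed time until the full-scan bit was set */
} scan_sample_t;

/* Linear estimate through two measured samples, in whole milliseconds. */
int32_t estimate_fullscan_ms(scan_sample_t a, scan_sample_t b, int32_t params)
{
    assert(a.params != b.params);
    int64_t rise = (int64_t)b.ms - a.ms;
    int64_t run = (int64_t)b.params - a.params;
    return (int32_t)(a.ms + ((int64_t)params - a.params) * rise / run);
}
```

Between the two samples this works out to 350 ms over 108 parameters, a little over 3 ms per parameter. Any timeout derived from it should carry a generous margin, and polling the full-scan flag remains the safer signal.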

  • Hi Matt!

    Yes, I guess it makes sense in the end. It's just a long way to conclude this.

    Thanks for mentoring through this issue!


    Best regards

    //Erik Almqvist

    Edit: Please check with firmware team and update documentation if you find it relevant!