Obscure device failure during production: Software stops, no debug possible

Other Parts Discussed in Thread: CC2640

Hi there,

We are at the SOP stage. The firmware of our BLE device (CC2640, battery powered, some I2C devices) proved stable in extensive field testing, and there were no HW issues either.

Now, in the first two batches of series production, a strange effect hits about 5% of all devices: during casting, the processor stops working. The devices no longer advertise and cannot be reached by the JTAG debugger; the FLASH Programmer does not recognize the device.

  

After a HW reset the devices run normally again. Unfortunately, this does not help, because the JTAG connector is not accessible in the final product.

Apparently there is no current draw; at least the battery voltage seems normal, even after three weeks in the failed state. As far as I can measure, there are none of the current peaks that are typical for a running device (e.g. during advertising). So far we have not implemented a watchdog, but we do have a software "advert lifeguard": if there is no advertising callback for longer than 3 min, the device resets. So apparently the SW is no longer running.
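
A lifeguard of this kind can be sketched roughly as follows. This is a minimal illustration and not the firmware discussed here: the names `lifeguard_feed` and `lifeguard_expired`, the struct, and the millisecond tick are all assumptions made for the example.

```c
#include <stdbool.h>
#include <stdint.h>

/* Timeout taken from the post ("longer than 3min"); the unit is an assumption. */
#define LIFEGUARD_TIMEOUT_MS (3u * 60u * 1000u)

/* Hypothetical state: tick value of the most recent advertising callback. */
typedef struct {
    uint32_t last_advert_ms;
} lifeguard_t;

/* Call from the advertising callback to record that adverts are still going out. */
void lifeguard_feed(lifeguard_t *lg, uint32_t now_ms)
{
    lg->last_advert_ms = now_ms;
}

/* Call periodically; returns true when the caller should request a reset.
 * Unsigned subtraction keeps the comparison correct across tick wrap-around. */
bool lifeguard_expired(const lifeguard_t *lg, uint32_t now_ms)
{
    return (uint32_t)(now_ms - lg->last_advert_ms) > LIFEGUARD_TIMEOUT_MS;
}
```

A check like this can only fire while something still calls it and while the tick behind `now_ms` keeps counting. If the CPU or the clock source feeding the tick stops, the lifeguard stays silent, which would match a device that neither advertises nor resets.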

We did implement an exception handler that writes some debug info to an external EEPROM and resets the device, but apparently it was not triggered.
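
To make "not triggered" a reliable conclusion, the stored record has to be distinguishable from a blank or corrupted EEPROM. The following is a hedged sketch of one way to pack such a record. The field choice (`pc`, `lr`, `cfsr`), the magic value, the 15-byte layout and the additive checksum are assumptions for illustration, and the actual I2C write to the EEPROM is deliberately left out.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define FAULT_MAGIC       0xFA17u /* hypothetical marker for a written record */
#define FAULT_RECORD_SIZE 15u     /* 2 magic + 3 * 4 data + 1 checksum bytes */

/* Hypothetical debug info captured by the exception handler. */
typedef struct {
    uint32_t pc;   /* stacked program counter at the fault */
    uint32_t lr;   /* stacked link register */
    uint32_t cfsr; /* fault status register value */
} fault_info_t;

/* Store a 32-bit value little-endian, independent of host byte order. */
static void put_u32_le(uint8_t *dst, uint32_t v)
{
    dst[0] = (uint8_t)(v);
    dst[1] = (uint8_t)(v >> 8);
    dst[2] = (uint8_t)(v >> 16);
    dst[3] = (uint8_t)(v >> 24);
}

/* 8-bit additive sum over a buffer. */
static uint8_t sum8(const uint8_t *buf, size_t len)
{
    uint8_t s = 0;
    for (size_t i = 0; i < len; i++) {
        s = (uint8_t)(s + buf[i]);
    }
    return s;
}

/* Serialize the record; the last byte makes the sum of all bytes zero. */
void fault_record_pack(const fault_info_t *info, uint8_t out[FAULT_RECORD_SIZE])
{
    out[0] = (uint8_t)(FAULT_MAGIC & 0xFFu);
    out[1] = (uint8_t)(FAULT_MAGIC >> 8);
    put_u32_le(&out[2], info->pc);
    put_u32_le(&out[6], info->lr);
    put_u32_le(&out[10], info->cfsr);
    out[14] = (uint8_t)(0u - sum8(out, FAULT_RECORD_SIZE - 1u));
}

/* True only if the magic matches and the checksum closes to zero. */
bool fault_record_valid(const uint8_t rec[FAULT_RECORD_SIZE])
{
    return rec[0] == (FAULT_MAGIC & 0xFFu)
        && rec[1] == (FAULT_MAGIC >> 8)
        && sum8(rec, FAULT_RECORD_SIZE) == 0u;
}
```

With a validity check like `fault_record_valid`, an EEPROM that holds no valid record after a hang suggests that the handler never ran at all, rather than that it ran and the write failed halfway.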

Any ideas on what else could be the reason for this? A stopped crystal? Electromagnetic immunity?

What measurements could help to find out the root cause for this?

Regards

Harald

  • Hi,

    Can you try to move the chip on the failing board to a functional board? Then we can at least see if it's a board problem or chip problem.
  • Hi Christim,

    Thanks a lot for your response. I probably did not express myself well: as soon as I perform a HW reset (connect pin 10 to pin 9 on the JTAG connector), the device runs like a charm. And I am pretty sure that I can't unsolder the chip from the board without triggering a reset.

    Nothing is damaged, it is only in a bad state.

    Regards

    Harald

  • Is it reproducible after you reset the device?
    Can you send us some photos of the final product packaging, schematics and layout?
  • Hi,

    >Is it reproducible after you reset the device?

    No, the only situation in which it happens is during casting of the device. We have never observed any device getting into this state before or after this process.

    We now have one theory that would explain the behaviour:

    During encapsulation of the device, for whatever reason (temperature, mechanical stress), one pin (e.g. on the crystal) lifts, so that the device stops. After the casting compound has cooled and cured, the pin returns to its original position, but the controller/crystal does not start again because it never gets a proper reset signal. As soon as we actively reset it by pulling down the reset pin, everything is back to normal.

    Does that sound reasonable to you?

    Regards

    Harald

  • Harald,

    I'm asking simply out of curiosity, in case I ever encounter the same thing, since you linked to this from the other thread. What is your casting process - what's the material, temperature, and cooling time?

    I have had some unexplained failures where a hard reset works (and we do currently have the JTAG exposed for troubleshooting), but I would also be very interested in seeing whether implementing the built-in watchdog helps in your situation. That seems like the best thing to try in your case. I am currently planning to convert from a software "soft watchdog" implementation to the hardware watchdog, and hope this helps with our issues.

    Best,
    --Allen
  • Hi there,

    Although this was almost a year ago, I still owe you our explanation: we found out that the 32 kHz crystal is apparently very susceptible to being touched.

    We defined new handling rules for production (wear gloves, ESD protection, etc.) and managed to eliminate the issue almost completely. In addition, a small SW change described here helped, too.

    Regards

    Harald

  • Harald,

    Thank you for following up with this useful information.  ESD protection and handling methods are indeed sometimes overlooked, and the consequences can be difficult to diagnose.  Thank you for identifying and documenting this device handling issue.

    ~Leonard 

  • Harald,

    Thanks very much for your reply, it's actually very timely.  I was just trying to diagnose some board failures on the line last week and this might help.

    Did this (suspected) issue ever cause permanent device damage for you?  I have two boards that will no longer connect to the debugger through JTAG, regardless of performing hard resets or completely removing power.

    Thanks,

    --Allen

  • Hi Allen,
    Actually, I do have quite an impressive pile of "bricked" devices on my desk, but I can't blame any of them on ESD issues. It seems that there are some circumstances, which can be triggered by software, that put devices into a non-recoverable state. For instance, during implementation of the watchdog, three devices passed away. OAD development also has some casualties on its conscience. However, the effect described in this thread can always be reverted by a hardware reset.
    Regards
    Harald
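
For readers considering the move from a "soft watchdog" to the hardware watchdog that Allen mentions above, one common pattern is to reload the hardware watchdog only when every monitored task has checked in since the last reload. The sketch below is a simulation of that gating logic under stated assumptions: `hw_watchdog_kick` is a hypothetical stand-in for the platform's real reload call, and the task bits are invented for the example.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical task identifiers, one bit each. */
#define TASK_ADVERT (1u << 0)
#define TASK_SENSOR (1u << 1)
#define ALL_TASKS   (TASK_ADVERT | TASK_SENSOR)

static uint32_t checked_in; /* bits of tasks seen since the last reload */
static unsigned kick_count; /* simulation only: counts reloads */

/* Hypothetical stand-in for the platform's watchdog reload function. */
static void hw_watchdog_kick(void)
{
    kick_count++;
}

/* Each monitored task calls this from its normal processing loop. */
void task_checkin(uint32_t task_bit)
{
    checked_in |= task_bit;
}

/* Call periodically. Reloads the watchdog only if all tasks reported;
 * otherwise it does nothing and the hardware watchdog is left to expire. */
bool watchdog_service(void)
{
    if ((checked_in & ALL_TASKS) != ALL_TASKS) {
        return false;
    }
    checked_in = 0;
    hw_watchdog_kick();
    return true;
}
```

The point of the gating is that a single stuck task is enough to let the watchdog expire. Whether the hardware watchdog itself keeps counting when a particular clock source stops depends on the device, so that is worth confirming in the datasheet before relying on it for a failure like the one in this thread.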