RTOS/CC2640R2F: In field device failures - crashes in Icall_init() in main.c

Part Number: CC2640R2F
Other Parts Discussed in Thread: SYSBIOS

Tool/software: TI-RTOS

I need some pointers on how to track down exactly what is happening with some device failures in the field.

Basically, in the failed mode the devices do not start properly (they will not run our application code). Using the debugger in Code Composer Studio 7, I have isolated it to one particular oddity.

The devices are all identical. The vast majority work OK, but 2-3% of them die. The failed devices get as far in our main.c as calling ICall_init() but go no further. Inside ICall_init(), the device seems to be calling a function ti_sysbios_BIOS_start__E(), which doesn't seem right to me, since the device never gets as far as the BIOS_start() call at the end of main.c.

It always ends up spinning inside a HWI exception function, which I guess is understandable since none of the application has been registered at that point.

I initially thought it might be a crash related to loading some data using the SNV but it never gets that far.

If I reflash the failed devices with a fresh copy of the (identical) firmware, they start OK.

Is there any way for the firmware to get corrupted? The devices are very simple, running on a CR2032 battery, and as far as I can tell they are not subjected to a disastrous amount of static, i.e. they can be revived by re-flashing them (although I don't know how long they stay revived). Is there a hardware / processor failure mode that would cause something like this?
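One way to test the corruption theory directly is to read the flash back from a failed device before re-flashing it and compare the dump byte for byte against the image that was programmed. The following is only a minimal host-side sketch of that comparison, assuming both images are already loaded into memory as equal-length buffers; the names `image_diff_t` and `image_compare` are made up for illustration and are not part of any TI tool.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Result of comparing a flash dump against the reference image. */
typedef struct {
    size_t first_mismatch;  /* offset of the first differing byte, or len if none */
    size_t mismatch_count;  /* total number of differing bytes */
} image_diff_t;

/* Hypothetical helper: compare len bytes of a dump against the reference. */
image_diff_t image_compare(const uint8_t *reference, const uint8_t *dump,
                           size_t len)
{
    assert(reference != NULL && dump != NULL);

    image_diff_t result = { len, 0 };
    for (size_t i = 0; i < len; i++) {
        if (reference[i] != dump[i]) {
            if (result.mismatch_count == 0) {
                result.first_mismatch = i;  /* remember where it starts */
            }
            result.mismatch_count++;
        }
    }
    return result;
}
```

If several failed devices report the same `first_mismatch` offset, that would point at one specific flash region being rewritten rather than at random corruption. A dump that matches the reference exactly would suggest the problem is somewhere other than the program image.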

  • Hi David,

        I suggest that you reproduce the failure in house by subjecting the device to whatever outside factors might affect it. One such test is to run the device in a high-temperature chamber and see if it fails. If the device fails, you then need to find out how the temperature causes the failure. It is possible that the power supplied to the device is affected. To confirm that, you would need to analyze the current consumption while subjecting the device to high temperature. Below are some ways to measure BLE device current consumption.

    1. Oscilloscope + Current Probe.

    2. DC Power Analyzer.

    3. Digital Multimeter.

      Also, I think it is important to know the battery level of the CR2032 cells in the devices that failed in the field.

    -kel

  • Thanks Markel - what am I looking for in the power consumption when doing the heat / environmental test? How would that be related to a potential firmware corruption? Not sure what the link would be (unless perhaps the battery is over-volting or something like that)? Or is a moisture incursion causing a short across some pins, and that is affecting the flash memory in the CC2640R2? Since they all fail in the same place that seems unlikely, i.e. I would have thought any corruption would be fairly random.
  • David Rubie said:
    How would that be related to a potential firmware corruption? Not sure what the link would be (unless perhaps the battery is over-volting or something like that)?

    I have extensive experience in failure analysis and debugging. Based on that, I would look into how environmental factors affect the device power and the device clock.

    Going back to my earlier suggestion.

    I am just giving an example of how temperature might affect your device. Let's say you subject your device to a high-temperature chamber test and it fails. The question is then why, and what the root cause is.

    The device current consumption profile can give you an idea of which operation the device is performing when it fails at high temperature. Compare that profile to the profile of a device running at normal operating temperature.

    Again, ask why and what the root cause is. Which particular part of the hardware fails? Is it the power circuitry, the device clock, a particular component, or a cold solder joint?

    If the device does not fail in the high-temperature chamber test, then look into other factors that might cause this failure.

    If you suspect the failure is caused by moisture intrusion, put several boards into a moisture-proof casing and see if they still fail. If they do not fail, then your device failure was caused by a casing problem.

    How about implementing a software solution, such as a watchdog, or shutting down the device when the battery voltage reaches a critical level? If several boards with the software fix do not fail, then the software fix resolves your device failure.
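As a rough illustration of the low-battery shutdown idea, here is a minimal sketch of the decision logic only. It assumes the raw battery reading is a fixed-point value with 8 fractional bits (volts times 256), which I believe is the format returned by the CC26xx driverlib call AONBatMonBatteryVoltageGet(), but please check the driverlib documentation. The thresholds, the sample count, and the names `batt_raw_to_mv`, `batt_monitor_t` and `batt_should_shutdown` are hypothetical and would need tuning for a real CR2032 design.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define BATT_SHUTDOWN_MV  2000u  /* hypothetical critical level, in millivolts */
#define BATT_LOW_SAMPLES  3u     /* hypothetical: consecutive low readings required */

/* Convert a raw reading (assumed: volts * 256) to millivolts. */
uint32_t batt_raw_to_mv(uint32_t raw)
{
    return (raw * 1000u) / 256u;
}

/* Tracks how many consecutive readings were below the critical level. */
typedef struct {
    uint8_t low_samples;
} batt_monitor_t;

/* Returns true once BATT_LOW_SAMPLES consecutive readings are below the
 * critical level, so a single brief dip does not trigger a shutdown. */
bool batt_should_shutdown(batt_monitor_t *monitor, uint32_t millivolts)
{
    assert(monitor != NULL);

    if (millivolts < BATT_SHUTDOWN_MV) {
        if (monitor->low_samples < BATT_LOW_SAMPLES) {
            monitor->low_samples++;
        }
    } else {
        monitor->low_samples = 0;  /* voltage recovered, start over */
    }
    return monitor->low_samples >= BATT_LOW_SAMPLES;
}
```

The application would call `batt_should_shutdown()` periodically with the latest reading and, when it returns true, stop any flash writes and put the device into its shutdown state. Requiring several consecutive low readings is a design choice here, since a coin cell can sag briefly under load.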

    -kel