RTOS/CC2640: WFI (wait for interrupt) hang in multi role device

Part Number: CC2640
Other Parts Discussed in Thread: CC2650, SYSBIOS

Tool/software: TI-RTOS

I have an issue where our firmware gets stuck in a WFI instruction (ARM Wait For Interrupt).

When the device is re-attached to the debugger, it correctly moves to the next instruction and updates the clocks, although they are all obviously wrong (i.e. waaaaay past the tick they are supposed to expire on).

Reading through the forum, it seems the only way this can happen is if interrupts have been disabled before the WFI instruction is executed.  Our board is extremely simple with no external peripherals, and the problem also shows up on the SensorTag.

The stacks, heaps and ICall resources are all fine: no overruns and no errors in the ROV.

Searching through the RTOS source tree (tirtos_cc13xx_cc26xx_2_20_01_08) the only place I can find where the WFI instruction is explicitly referenced is in the power management code, specifically for us in the standby function.

Compiler versions don't seem to make much difference, but we are following the instructions in the documentation: our BLE stack (2_02_01_18) is compiled with 5.2.6 and the app itself with 15.12.5.LTS.

I have replaced the standby function with a custom one to try to debug this problem, but I don't really know what I'm looking for. Is there a way to test whether interrupts are disabled, so I can skip the WFI instruction until the next pass through the idle loop, by which time the state has hopefully been reset?  Unfortunately it's very hard to debug, and just as hard to produce a simple program that reproduces it.
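One way to answer the "are interrupts disabled?" question is to read the Cortex-M3 mask registers just before the WFI. The sketch below is only an illustration, not code from TI-RTOS: the helper names (`read_primask`, `read_basepri`, `interrupts_masked`) are made up for this example, the register reads use GCC-style inline assembly (the TI compiler would need its own equivalent), and the non-target branch only simulates the registers so the logic can be exercised on a PC. It checks BASEPRI as well as PRIMASK, on the assumption that the RTOS may mask interrupts by priority rather than globally.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#if defined(__GNUC__) && defined(__ARM_ARCH_7M__)
/* On a Cortex-M3 target: read the special registers directly. */
static inline uint32_t read_primask(void)
{
    uint32_t value;
    __asm volatile ("mrs %0, primask" : "=r" (value));
    return value;
}

static inline uint32_t read_basepri(void)
{
    uint32_t value;
    __asm volatile ("mrs %0, basepri" : "=r" (value));
    return value;
}
#else
/* Host build: simulated registers (hypothetical, for testing the logic). */
static uint32_t sim_primask;
static uint32_t sim_basepri;

static inline uint32_t read_primask(void) { return sim_primask; }
static inline uint32_t read_basepri(void) { return sim_basepri; }
#endif

/* True if interrupts are masked globally (PRIMASK bit 0 set)
 * or by priority (BASEPRI non-zero). */
static bool interrupts_masked(void)
{
    return ((read_primask() & 1u) != 0u) || (read_basepri() != 0u);
}
```

In a custom standby function, the result could be logged or used to skip the WFI for one idle pass. Logging the two registers separately is probably more useful than a single flag, since it shows which mask was left set.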

Anyone else have any insight into where else the WFI instruction might be generated so I can instrument that?

  • OK, so it appears that we might be affected by the bug that appears in this thread:

    However, changing RTOS to increase the timeout value (see the thread), recompiling it with NO_ROM set, and putting in a custom power handling function didn't fix it.  The system still ended up stuck at address 0x10003298 when reattached to the debugger (attaching the debugger causes the preceding WFI instruction to unlock).  I have no idea where the code at 0x10003298 comes from. It's not an address generated by the stack or the application, so I'm assuming it's mapped to the ROM on the CC2650.

    It looks to me as if the Bluetooth stack project itself is calling the CPU sleep functions directly instead of using RTOS to idle.  Looking at the build, it seems possible to set a few flags to force the Bluetooth stack to use RTOS (and hence the modified version of RTOS with different timer settings).  We are testing this at the moment, since the application-level code is definitely using the modified version of RTOS but the Bluetooth stack still seems to be using the ROM routines.

  • Hello David,

    Yes, the 0x1000xxxx addresses match to ROM. You can load the RTOS symbols from the TI-RTOS SDK:
    C:\ti\tirtos_cc13xx_cc26xx_2_20_01_08\products\bios_6_46_01_38\packages\ti\sysbios\rom\cortexm\cc26xx\golden\CC26xx
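When a halted program counter such as 0x10003298 shows up, a quick range check tells you which symbol file to load. The helper below is a rough sketch: the function name `cc26xx_region` is hypothetical, the ROM test uses only the 0x1000xxxx range confirmed in this thread (the ROM may extend further), and the flash and SRAM bounds assume a CC2640 with 128 KB of flash and 20 KB of SRAM.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Classify an address by memory region.
 * Bounds are assumptions for a CC2640 (128 KB flash, 20 KB SRAM);
 * the ROM check covers only the 0x1000xxxx range named in the thread. */
static const char *cc26xx_region(uint32_t addr)
{
    if (addr < 0x00020000u) {
        return "flash";              /* application / stack image */
    }
    if ((addr >> 16) == 0x1000u) {
        return "rom";                /* load the ROM symbols for these */
    }
    if (addr >= 0x20000000u && addr < 0x20005000u) {
        return "sram";
    }
    return "other";
}
```

For example, `cc26xx_region(0x10003298u)` returns `"rom"`, which matches the answer above: that address belongs to ROM code rather than to the application or stack image.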

    Also, the BLE stack does not call any power management sleep APIs - this is handled by the Power driver in the RTOS when there is no scheduled RTOS activity. In other words, when the Idle task is able to run, the power driver will enter standby based on the power policy and if no constraints are set. More details are in the Power Management document in the TI-RTOS.

    Beyond that, can you provide more specific steps to reproduce this on the TI CC2650 LaunchPad? Are you using the latest multi-role sample app from GitHub?

    Best wishes
  • Thanks for the reply JXS.

    It looks to me that the out of the box version of the BLE stack does call power management directly unless you explicitly specify you want it to call RTOS - I couldn't get our custom power management function called consistently unless we recompiled the BLE stack with the extra predefined symbol OSAL_PORT2TIRTOS and fixed a few little issues that result from doing that.

    However, I think we did manage to solve the issue. It came down to a very stupid bug of ours with context switching: we were calling the GAP role from one task but using the task ID of a different one.  We have fixed this error and the devices are more stable.  We only found it by accident: changing an application clock from static to dynamic still resulted in a crash, but in a different place, which led us to believe we were looking at some kind of memory corruption problem.

    I will mark this as solved for the moment. When I have more time I will try to build a minimal crashing system, but unfortunately it's quite a complicated thing to recreate.  If anybody reading this thread is hitting a random crash like this, the best advice is probably to take a very good look at all your code going in and out of the BLE stack and make sure that the calls to the GAPRole functions from each task are consistent with the "selfEntity" variable you are passing.  I don't really know enough about how the RTOS context switcher works, but it seems that mixing them up will appear to work while being very unstable.
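A cheap way to catch this kind of mismatch early is a debug-only guard that remembers which entity each task registered and checks it before every call into the stack. The sketch below is hypothetical throughout: `guard_register`, `guard_check`, the table size and the `uint8_t` entity type are stand-ins invented for this example, not part of ICall or the BLE stack. On target, the task handle could come from something like `Task_self()` cast to `uintptr_t`.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define GUARD_MAX_TASKS 8   /* hypothetical limit for this sketch */

typedef struct {
    uintptr_t task;     /* task handle, stored as an integer */
    uint8_t   entity;   /* entity ID that task registered with */
    bool      used;
} guard_entry_t;

static guard_entry_t guard_table[GUARD_MAX_TASKS];

/* Record that `task` owns `entity`. Returns false if the table is full. */
static bool guard_register(uintptr_t task, uint8_t entity)
{
    for (int i = 0; i < GUARD_MAX_TASKS; i++) {
        /* Slots fill in order and are never freed, so the first
         * unused slot always follows every used one. */
        if (!guard_table[i].used || guard_table[i].task == task) {
            guard_table[i].task = task;
            guard_table[i].entity = entity;
            guard_table[i].used = true;
            return true;
        }
    }
    return false;
}

/* True only if `task` is registered and `entity` is the one it owns. */
static bool guard_check(uintptr_t task, uint8_t entity)
{
    for (int i = 0; i < GUARD_MAX_TASKS; i++) {
        if (guard_table[i].used && guard_table[i].task == task) {
            return guard_table[i].entity == entity;
        }
    }
    return false;
}
```

Each task would call `guard_register` once after obtaining its entity, and an assert on `guard_check` placed in front of the GAPRole calls would then fail immediately when a task passes another task's entity, instead of surfacing later as an unrelated-looking hang.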

  • Glad you have it working, David. I suspected a deadlock, which the misrouted GAP selfEntity task ID may explain.

    Best wishes