LAUNCHXL-CC26X2R1: Stops advertising after ~20 minutes with multiple devices connected

Part Number: LAUNCHXL-CC26X2R1

Hi all,

We went through a lot of trouble lately to get the BLE stack a bit more stable and have made some progress. Not there yet unfortunately, as explained below.

Setup

  • SDK 2.30.00.34 (12 Oct release)
  • Simple_Peripheral example from this release
  • Three bug fixes applied to the example, one change
  • Three Android phones

Procedure

  • Connect all phones (e.g. with the nRF Connect app), then keep disconnecting and reconnecting them for at least 30 minutes. Sometimes the issue occurs very quickly and sometimes it takes a long time; we do not know why. We often use up to 7 phones.

Result

  • Application stops advertising (no crash), and does not resume advertising ever.

Applied changes to simple_peripheral

We have applied the bug fixes and changes below to the simple_peripheral project. Without the bug fixes, the application crashes very quickly if connection parameter updates are sent (as any Android phone seems to do at least 3 times).

Fix 1 (we stop the clock and free it, as is also done in the removeConn function):

/*********************************************************************
 * @fn      SimplePeripheral_processParamUpdate
 *
 * @brief   Process a parameters update request
 *
 * @return  None
 */
static void SimplePeripheral_processParamUpdate(uint16_t connHandle)
{
  gapUpdateLinkParamReq_t req;
  uint8_t connIndex;

  req.connectionHandle = connHandle;
  req.connLatency = DEFAULT_DESIRED_SLAVE_LATENCY;
  req.connTimeout = DEFAULT_DESIRED_CONN_TIMEOUT;
  req.intervalMin = DEFAULT_DESIRED_MIN_CONN_INTERVAL;
  req.intervalMax = DEFAULT_DESIRED_MAX_CONN_INTERVAL;

  connIndex = SimplePeripheral_getConnIndex(connHandle);
  // SIMPLEPERIPHERAL_ASSERT(connIndex < MAX_NUM_BLE_CONNS);
  if (connIndex < MAX_NUM_BLE_CONNS)
  {
    if (connList[connIndex].pUpdateClock != NULL)
    {
      // Stop and destruct the RTOS clock if it's still alive
      if (Util_isActive(connList[connIndex].pUpdateClock))
      {
        Util_stopClock(connList[connIndex].pUpdateClock);
      }
      // Destruct the clock object
      Clock_destruct(connList[connIndex].pUpdateClock);
      // Free clock struct
      ICall_free(connList[connIndex].pUpdateClock);
      connList[connIndex].pUpdateClock = NULL;
      // Free ParamUpdateEventData
      ICall_free(connList[connIndex].pParamUpdateEventData);
    }

    // Send parameter update
    bStatus_t status = GAP_UpdateLinkParamReq(&req);

    // If there is an ongoing update, queue this for when the update completes
    if (status == bleAlreadyInRequestedMode)
    {
      spConnHandleEntry_t *connHandleEntry = ICall_malloc(sizeof(spConnHandleEntry_t));
      if (connHandleEntry)
      {
        connHandleEntry->connHandle = connHandle;

        List_put(&paramUpdateList, (List_Elem *)connHandleEntry);
      }
    }
  }
  else
  {
    Display_printf(dispHandle, SP_ROW_STATUS_1, 0, ANSI_COLOR_RED"Not Matched Handle"ANSI_COLOR_RESET);
  }
}

Fix 2 (change line 1296 to set connHandleEntry to NULL)

if (connHandleEntry != NULL) {ICall_free(connHandleEntry); connHandleEntry = NULL;}

Fix 3 (remove ampersand)

We removed the ampersand (see fix 1 code) as discussed in: e2e.ti.com/.../2714998

Change 1

We changed the DEFAULT_ADDRESS_MODE to ADDRMODE_PUBLIC

What's next?

After the issue occurred (advertising stopped), we tried, as a workaround, to disable advertising and re-enable it from a timer. We see the callback to SimplePeripheral_processAdvEvent when we issue the disable command, but when we issue the enable command, the stack doesn't send a callback to SimplePeripheral_processAdvEvent. (Note: we are not using 8 phones, so the device should still advertise.)

static void SimplePeripheral_performPeriodicTaskAdvRestart(void)
{
  GapAdv_disable(advHandleLegacy, GAP_ADV_ENABLE_OPTIONS_USE_MAX, 0);
  GapAdv_disable(advHandleLongRange, GAP_ADV_ENABLE_OPTIONS_USE_MAX, 0);

  GapAdv_enable(advHandleLegacy, GAP_ADV_ENABLE_OPTIONS_USE_MAX, 0);
  GapAdv_enable(advHandleLongRange, GAP_ADV_ENABLE_OPTIONS_USE_MAX, 0);
}
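
The workaround above issues the enable calls immediately after the disable calls. Whether that ordering matters for this bug is not established; the sketch below only illustrates one way to sequence a restart so that enable is requested after the disable callback has been seen, and so that the "enable requested but never confirmed" state we observe becomes visible to the application. All type and function names here are hypothetical and are not TI SDK identifiers; mapping the actions to GapAdv_disable/GapAdv_enable and the events to what arrives in SimplePeripheral_processAdvEvent is an assumption.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical names: none of these are TI SDK identifiers. */
typedef enum { ADV_EVT_DISABLED, ADV_EVT_ENABLED } adv_event_t;
typedef enum { ADV_ACTION_NONE, ADV_ACTION_DISABLE, ADV_ACTION_ENABLE } adv_action_t;
typedef enum { RESTART_IDLE, RESTART_WAIT_DISABLED, RESTART_WAIT_ENABLED } restart_state_t;

typedef struct {
    restart_state_t state;
} adv_restart_t;

/* Called from the periodic timer. Returns the action the caller should perform. */
adv_action_t adv_restart_begin(adv_restart_t *r)
{
    assert(r != NULL);
    if (r->state != RESTART_IDLE) {
        return ADV_ACTION_NONE;          /* a restart is already in progress */
    }
    r->state = RESTART_WAIT_DISABLED;
    return ADV_ACTION_DISABLE;           /* caller would disable advertising here */
}

/* Called when the advertising callback reports an event. */
adv_action_t adv_restart_on_event(adv_restart_t *r, adv_event_t evt)
{
    assert(r != NULL);
    if (r->state == RESTART_WAIT_DISABLED && evt == ADV_EVT_DISABLED) {
        r->state = RESTART_WAIT_ENABLED;
        return ADV_ACTION_ENABLE;        /* caller would enable advertising here */
    }
    if (r->state == RESTART_WAIT_ENABLED && evt == ADV_EVT_ENABLED) {
        r->state = RESTART_IDLE;         /* restart completed */
    }
    return ADV_ACTION_NONE;
}

/* True while enable has been requested but not yet confirmed by a callback. */
bool adv_restart_waiting_for_enable(const adv_restart_t *r)
{
    assert(r != NULL);
    return r->state == RESTART_WAIT_ENABLED;
}
```

If adv_restart_waiting_for_enable() is still true on the next timer tick, the application has at least detected the condition described above and can log it or escalate.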

What causes the stack to stop advertising? Is there a workaround for this issue? We can't launch our product beta, because within a few hours people can no longer connect.

  • Hi,

    I will try to reproduce it and get back to you ASAP.
  • Hi,

    Sorry for the inconvenience. We are aware of the connection parameter update queue issue and we are working on fixing that.

    What are your connection parameters from the Android phone?
    If you don't have a custom app, just use BLE Scanner; the default should then be 30~50 ms, depending on the phone type.

    From your description, the CC26x2 stops advertising after connecting to multiple devices but is still functional. The reason could simply be that the CC26x2 no longer has time to advertise once it reaches 7 connections with the connection parameters requested by the phones.

    We have tested that the CC26x2 can connect to more than 20 devices; however, the connection interval we used was 1 s.
    On Android phones, the connection parameters used once the connection is established are often those of balanced mode: a 30~50 ms connection interval with 0 slave latency.
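
A rough way to picture this scheduling argument: each connection needs one connection event per connection interval, so the share of radio time taken by connections grows with the number of connections and shrinks with the interval. The sketch below is only back-of-envelope arithmetic; the function name is hypothetical, and the per-event duration is an assumed input, not a measured or specified figure.

```c
#include <assert.h>

/*
 * Percentage of radio time consumed by connection events, assuming every
 * connection uses the same interval and each event lasts event_ms
 * (event_ms is an illustrative assumption, not a TI-specified value).
 */
double conn_airtime_percent(unsigned num_conns, double interval_ms, double event_ms)
{
    assert(interval_ms > 0.0);
    assert(event_ms >= 0.0);
    return 100.0 * (double)num_conns * event_ms / interval_ms;
}
```

With an assumed 3 ms per connection event, conn_airtime_percent(7, 30.0, 3.0) gives 70, while conn_airtime_percent(20, 1000.0, 3.0) gives 6, which is why a long interval leaves far more room for advertising.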

    One thing you can try is to mux the PA and LNA signals out to IO pins and use a logic analyzer to see what those signals look like when the CC26x2 stops advertising.

    The only way to fix this is to use a different set of connection parameters; this is a universal problem for any wireless device.

    I would suggest that, after establishing the connection, you change the connection priority to low if there is no need for data transfer.
    Then change it back to balanced mode or high priority, depending on the data rate you want.
  • Hi,

    The problem persists after disconnecting all the phones, so it seems unlikely that the chip is too busy to advertise. The chip won't start advertising again (or at least not within ~15 minutes). Therefore I doubt it has anything to do with the connection parameters. Also, we see that the chip is idle almost all the time when this issue occurs.

    Can you try to reproduce this issue?

  • I can try to reproduce it, but I don't have 7 Android phones.
    I only have 2 iPhones and 1 Android phone, since we are not really focused on app development.

    All I can do is test it with 3 phones.

    Can you check whether there is a HW exception when the CC26x2 stops advertising?
    Can you also implement the HAL Assert function?
    You can find the information here:
    dev.ti.com/.../ble-index.html
  • Hi,


    To make testing easier, and to make 100% sure you can reproduce it, our app developers have helped us out. We have created an iOS and an Android app that will:

    • Connect to the Simple_Peripheral project
    • Wait a random time between 2000ms and 10000ms
    • Disconnect
    • Wait a random time between 2000ms and 3000ms
    • ...and then start over again.
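
The timing of one such cycle can be sketched as follows. This is a minimal illustration in C, not the code of the actual apps; the type and function names are hypothetical, and rand() with modulo is used for brevity even though it is slightly biased.

```c
#include <assert.h>
#include <stdlib.h>

/* Random delay in [min_ms, max_ms]; modulo bias is ignored for this sketch. */
unsigned random_delay_ms(unsigned min_ms, unsigned max_ms)
{
    assert(min_ms <= max_ms);
    return min_ms + (unsigned)rand() % (max_ms - min_ms + 1u);
}

/* Timing of one connect/disconnect cycle of the test app (hypothetical type). */
typedef struct {
    unsigned connected_ms;  /* time to stay connected before disconnecting */
    unsigned idle_ms;       /* time to wait before connecting again */
} stress_cycle_t;

stress_cycle_t next_stress_cycle(void)
{
    stress_cycle_t c;
    c.connected_ms = random_delay_ms(2000u, 10000u);
    c.idle_ms = random_delay_ms(2000u, 3000u);
    return c;
}
```

A test loop would then connect, sleep for connected_ms, disconnect, sleep for idle_ms, and ask for the next cycle.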

    The Android app can be installed using this link (first install Hockeyapp): rink.hockeyapp.net/.../36b20176810e4770a744b38d369a94d0
    The iOS app can be installed using this link (first install Testflight): [I will soon post this]

    Next to those apps, we also use an ESP32 with our custom firmware. If you also want to try that, let me know and I’ll send you this firmware.
    Our firmware developer is currently out of office for 2 weeks, so all we can do for now is help you reproduce the error. We worry that this issue won’t be resolved in the Q4 release, which would have a huge impact on us as a startup company.

    Test 1: 4 iOS devices (of which one with BLE5 support, we see the PHY speed change to 2M)

    Run 1:

    After 3 minutes, the Ti2642 ran into an abort. This is something we had not seen before.
    Note: The app will connect to the device advertising with SimplePeripheral and some specific service. If the iOS app stops connecting, kill it and start it again.

    Run 2:

    After 1.5 hours everything was still stable. This shows how difficult the problem actually is.

    Run 3:

    Same as run 2.

    Test 2: 4 Android devices (with three parameter updates per connection)

    Using the test app, after approx. 5 minutes (per phone 35 connections/disconnections):

    • The Ti2642 stopped advertising.
    • All devices lost connection, but the SimplePeripheral app seems to be stuck: it doesn’t show any phone disconnecting.
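
As a sanity check on that count, the expected number of cycles per phone can be estimated from the app's wait times (2000ms to 10000ms connected, 2000ms to 3000ms idle). The sketch below is plain arithmetic with a hypothetical function name; it assumes uniformly distributed waits and ignores the time needed to establish and tear down a connection.

```c
#include <assert.h>

/*
 * Expected number of connect/disconnect cycles in a test window, assuming
 * uniformly distributed waits and zero connection setup time.
 */
double expected_cycles(double window_s,
                       double conn_min_s, double conn_max_s,
                       double idle_min_s, double idle_max_s)
{
    double mean_cycle_s = (conn_min_s + conn_max_s) / 2.0
                        + (idle_min_s + idle_max_s) / 2.0;
    assert(mean_cycle_s > 0.0);
    return window_s / mean_cycle_s;
}
```

For a 5 minute window, expected_cycles(300.0, 2.0, 10.0, 2.0, 3.0) is 300 / 8.5, or roughly 35, which is consistent with the count we observed per phone.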

    Test 3: 7 ESP32 devices (with one parameter update per connection)

    The ESP32 firmware doesn’t support automatically connecting/disconnecting. Thus, we just left the devices connected; one or two always lose connection after a while and then try to reconnect automatically. If you have an ESP32 available, we can share our firmware with you.

    Run 1:

    After about 5 minutes, and after disconnecting and reconnecting the power of each ESP32 3 times:
    The Ti2642 stopped advertising.
    5 devices were still connected.
    When we unplug the power of a device, we see SimplePeripheral lose the connection.
    When we unplug them all, suddenly it starts advertising again, and all 7 devices connect again.

    Run 2:

    Same as run 1, however, when we unplugged the devices, advertisement didn’t start again.

    Run 3:

    Same as run 2; however, when we unplugged the devices, SimplePeripheral didn’t notice that the devices had disconnected, nor did advertising start again.


    Conclusion so far

    The bug is not easily reproducible, and the results differ slightly between runs. iOS seems to be pretty stable (though we saw an ABORT once), whereas Android and the ESP32 cause advertising to stop pretty fast. Although we fixed a few issues with parameter updates in the SimplePeripheral project that make it somewhat more stable, we’re starting to think that there might also be a problem with parameter updates in the stack itself. That would at least explain why Android and ESP32 devices have issues and iOS does not. Of course, there could be other differences between iOS and Android/ESP32 that we don’t know of.

    This instability is making our development process slow and troublesome, and it puts our startup company at risk: we can’t sell a door lock that isn’t stable to our investors or to our customers. We really hope you can thoroughly review this report and fix the issue in the Q4 SDK release. Thanks!!

  • Hi Christin!

    Any update/input on our findings?

  • Hi,

    Sorry for the delay, I have already reported bugs on connection parameter update. I will get back to you with the finding.

    Sorry again for the delayed response.
  • Hi,

    I will be off the following days for holidays, will continue working on this thread when I get back.
  • Hi,

    Any update on this? Will the issue be fixed in new SDK that will be released this month?

  • Hi there,

    I'm really disappointed with the support on this. We've reported this issue months ago, but no updates are being given.

    Pierre

  • Hi,

    Sorry for getting back to you this late. We have not yet identified the root cause of the bug, so the Q4 SDK will still have it.

    We are still working on this, sorry again for not updating frequently.
  • Hi,

    Can you route out the PA and LNA signals on the CC26X2 device? From what I can see, the reason advertising stops could be that the BLE stack simply does not have enough time slots to advertise while maintaining all the connections.

    The easiest way to verify this assumption is to check the PA and LNA activities.

        // Map RFC_GPO0 to DIO6, LNA
        IOCPortConfigureSet(IOID_6, IOC_PORT_RFC_GPO0,
                          IOC_IOMODE_NORMAL);
        // Map RFC_GPO1 to DIO7, PA
        IOCPortConfigureSet(IOID_7, IOC_PORT_RFC_GPO1,
                          IOC_IOMODE_NORMAL);
    

    The main reason it's not easy to reproduce is that our link layer scheduler places the anchor points randomly, so sometimes it has time to advertise and sometimes it does not.

    dev.ti.com/.../link-layer-cc13x2_26x2.html