CC2640R2F: connection parameter update request to iOS central device causes firmware hang

lakshmikanth satyavolu

Part Number: CC2640R2F

Tool/software:

Hi Team,

I am facing an issue with dynamic connection parameter update request in iOS.

I am using the simplelink_cc2640r2_sdk_5_30_00_03 SDK (BLE 4.2). By default I use the configuration below, which works fine.

#define DEFAULT_DESIRED_MIN_CONN_INTERVAL 60

#define DEFAULT_DESIRED_MAX_CONN_INTERVAL 108

#define DEFAULT_DESIRED_SLAVE_LATENCY 3

#define DEFAULT_DESIRED_CONN_TIMEOUT 600

Since my application requires live data transfer, I sometimes need to transfer data at a faster rate, with the configuration parameters below.

#define DEFAULT_DESIRED_MIN_CONN_INTERVAL 12 //15ms

#define DEFAULT_DESIRED_MAX_CONN_INTERVAL 24 //30ms

#define DEFAULT_DESIRED_SLAVE_LATENCY 3

#define DEFAULT_DESIRED_CONN_TIMEOUT 600

I am using the GAP_UpdateLinkParamReq API to modify these parameters dynamically.

I made sure all these settings are in line with the iOS design guidelines.
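
For reference, these raw values are in Bluetooth Core Specification units: the connection interval is in steps of 1.25 ms and the supervision timeout in steps of 10 ms. A minimal sketch of the conversion, plus the Core Specification rule that the supervision timeout must be larger than (1 + slave latency) x max interval x 2, could look like this. The helper names are made up for illustration and are not part of the TI SDK, and Apple's additional accessory guidelines are not checked here.

```c
#include <stdbool.h>
#include <stdint.h>

/* Connection interval is expressed in units of 1.25 ms; return microseconds. */
static uint32_t conn_interval_us(uint16_t units)
{
    return (uint32_t)units * 1250u;
}

/* Supervision timeout is expressed in units of 10 ms; return microseconds. */
static uint32_t supervision_timeout_us(uint16_t units)
{
    return (uint32_t)units * 10000u;
}

/* Core Specification rule: timeout > (1 + latency) * max interval * 2. */
static bool timeout_is_valid(uint16_t max_interval, uint16_t latency,
                             uint16_t timeout)
{
    uint64_t min_timeout_us =
        (uint64_t)(1u + latency) * conn_interval_us(max_interval) * 2u;
    return (uint64_t)supervision_timeout_us(timeout) > min_timeout_us;
}
```

With the values above, 12 and 24 correspond to 15 ms and 30 ms, 60 and 108 to 75 ms and 135 ms, and a timeout of 600 to 6 s, so both parameter sets pass this particular check.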

On Android everything works fine. But on an iOS central (iPhone 7 Plus), as soon as I call this API, the firmware hangs and the device becomes unresponsive. It needs a power-on reset to bring it back.

Any idea what could be going wrong here?

Best

Lakshmi

5 months ago

0 lakshmikanth satyavolu 5 months ago

Expert 1670 points

Hi Community,

Please look into it. Any help would be highly appreciated.

Thanks in advance.

Best

Lakshmi

0 Jan 5 months ago in reply to lakshmikanth satyavolu

TI__Mastermind 38970 points

Hi Lakshmi,

Thank you for reaching out. Can you try pausing the execution of the device when it hangs and sharing the call stack?

Best Regards,

Jan

0 lakshmikanth satyavolu 4 months ago in reply to Jan

Expert 1670 points

Hi Jan,

Since this problem occurs very randomly, I have not been able to catch the event with the debugger.

I need some info on how to properly get a connection parameter update done.

I am registering a callback using GAPRole_RegisterAppCBs to catch the event. I do nothing inside this callback except check whether the parameters were updated or not.

When I want to apply new parameters, I wait until the connection is established, and in the GAPROLE_CONNECTED event I call GAP_UpdateLinkParamReq() and wait for the callback to be invoked.

But in the sniffer log, I observe multiple connection parameter updates being requested by the sensor, even if the request is accepted on the first attempt.

Am I doing anything wrong? Would you please suggest the proper way of requesting connection parameters?

Thanks in advance.

Best

Lakshmi

0 David 4 months ago in reply to lakshmikanth satyavolu

TI__Genius 13535 points

Hello Lakshmi,

Could you please confirm what is the return (bStatus_t) status (SUCCESS, INVALIDPARAMETER, bleIncorrectMode, bleAlreadyInRequestedMode, bleNotConnected ) when executing GAP_UpdateLinkParamReq()? In addition would it be possible for you to share the sniffer logs to take a look?

BR,

David.

0 lakshmikanth satyavolu 4 months ago in reply to David

Expert 1670 points

Hi David,

Attached is the nRF sniffer log for the test performed. Please have a look while I check the return status. You will find the firmware hang event around the 13.16 timestamp. You can open the file in Wireshark.

fw_hang_wireshark_log.zip

Best

Lakshmi

0 David 4 months ago in reply to lakshmikanth satyavolu

TI__Genius 13535 points

Hello Lakshmi,

Alright, thanks. Please allow me tomorrow to take a look at it.

BR,

David.

0 lakshmikanth satyavolu 4 months ago in reply to David

Expert 1670 points

Hi David,

Thanks for getting back.

>> Could you please confirm what is the return (bStatus_t) status.

My GAP_UpdateLinkParamReq call always returns SUCCESS. But I observed that my firmware does not invoke the callback for this event most of the time. Hence the new parameters are not taking effect. After that, my firmware eventually hangs. It seems the master is accepting the new connection parameters but somehow the firmware is unable to process them. I have attached a sniffer screenshot for reference.

Best

Lakshmi

0 David 4 months ago in reply to lakshmikanth satyavolu

TI__Genius 13535 points

Hello Lakshmi,

I see. Would it be possible to test with another iOS device (newer version)? I suspect the issue is the parameters. Could you confirm what values they actually take after GAP_UpdateLinkParamReq(), and compare them with the Android ones? Maybe this can give us a hint.

BR,

David.

0 lakshmikanth satyavolu 4 months ago in reply to David

Expert 1670 points

Hi David,

I have tested on multiple iOS devices and versions. The issue is the same. Android also sometimes shows this pattern. I confirm that the parameters I try to update are within the iOS BLE guidelines, and they are accepted by the master. Please check the screenshot I attached.

But you can see 3 back-to-back connection parameter requests in that screenshot, even though I call GAP_UpdateLinkParamReq only once.

Sometimes I do not receive the connection parameter update callback in my application even if the request was accepted by the master. So the updated parameters are not being used by my device. I suspect that is causing the hang after a while.

0 lakshmikanth satyavolu 4 months ago in reply to lakshmikanth satyavolu

Expert 1670 points

Hi David,

Any update on this? Any help is highly appreciated. This is critical to our project.

Thanks in advance.

Best
Lakshmi

0 David 4 months ago in reply to lakshmikanth satyavolu

TI__Genius 13535 points

Hello Lakshmi,

Apologies for the delay here. I will work on reproducing this issue on my side with the modified parameters you shared and come back to you as soon as possible. If you start the connection with the new parameters, does the firmware also hang?

From the logs I see there is a second Connection Parameter Update Request, after which the peripheral device is unresponsive. May I ask if this second update is to go back to the default parameters?

BR,

David.

0 lakshmikanth satyavolu 4 months ago in reply to David

Expert 1670 points

Hi David,

If I start the connection with the new parameters, the firmware doesn't hang.

Yes, the second connection update is to get back to the default ones. In the log you can observe that it is accepted. But a lot of the time the callback is not invoked.

Best

Lakshmi.

0 David 4 months ago in reply to lakshmikanth satyavolu

TI__Genius 13535 points

Hello

By the callback, do you mean the task is not receiving a GAP_LINK_PARAM_UPDATE_EVENT inside the gapRole_processGAPMsg() function after GAP_UpdateLinkParamReq() has been executed? Or do you mean the callback registered with GAPRole_RegisterAppCBs()?

BR,

David.

0 lakshmikanth satyavolu 4 months ago in reply to David

Expert 1670 points

Hi David,

I register a callback with the GAPRole_RegisterAppCBs API. I expect it to be invoked every time a connection parameter update happens, but it is not getting invoked. I have not tested whether GAP_UpdateLinkParamReq() has been executed or not.

Best

Lakshmi

0 lakshmikanth satyavolu 3 months ago in reply to David

Expert 1670 points

Hi David,

I finally managed to capture the call stack. Please have a look at the attachment. It seems the code never returns from the HeartRate_MeasNotify() API.

What could be the possible reason?

Best.

Lakshmi

0 David 3 months ago in reply to lakshmikanth satyavolu

TI__Genius 13535 points

Hello Lakshmi,

What is the reason for the application to enter the HeartRate function? Is it entered right after changing the parameters back to the default ones? Could it be that we are not correctly freeing the memory allocated for a notification (that failed to be sent)? In addition, I think we can get more information by finding the cause of the HAL assert. This can shed some more light on the matter. Inside your project, can you locate an AssertHandler() function? Some of the causes are: HAL_ASSERT_CAUSE_OUT_OF_MEMORY, HAL_ASSERT_CAUSE_ICALL_ABORT, HAL_ASSERT_CAUSE_WRONG_API_CALL, etc. (found inside hal_assert.h).

BR,

David.

0 lakshmikanth satyavolu 3 months ago in reply to David

Expert 1670 points

Hi David,

I use GATT notifications to send data to the phone. Since this device streams live data, the HeartRate function gets called continuously. Below is how I free the memory when the notification fails. However, you can see from the call stack that the firmware hangs even before the notification API fails. It does not return from the HeartRate_MeasNotify() call.

// Send notification.
if (HeartRate_MeasNotify(gapConnHandle, &heartRateMeas) != SUCCESS)
{
    // Sending failed: free the notification buffer ourselves.
    GATT_bm_free((gattMsg_t *)&heartRateMeas, ATT_HANDLE_VALUE_NOTI);
    return;
}

Best

Lakshmi

0 David 3 months ago in reply to lakshmikanth satyavolu

TI__Genius 13535 points

Hello Lakshmi,

Is it always the same number of notifications that the device is able to send before it crashes (hal assert)? Could you please try running Runtime Object Viewer to see if we might be running out of memory heap size? https://software-dl.ti.com/simplelink/esd/simplelink_cc13xx_cc26xx_sdk/7.40.00.77/exports/docs/ble5stack/ble_user_guide/html/ble-stack-5.x-guide/debugging-index.html?highlight=rov#ti-rtos-object-viewer

BR,

David.

0 lakshmikanth satyavolu 3 months ago in reply to David

Expert 1670 points

Hi David,

I managed to capture ROV during the hang. Attaching bios->scan for errors and hwi states during hang. Please verify.

Thanks

Lakshmi

0 David 3 months ago in reply to lakshmikanth satyavolu

TI__Genius 13535 points

Hello Lakshmi,

Thank you. Could you also please attach the HeapMem view? Also, is it always the same number of notifications that the device is able to send before it crashes (HAL assert), or does the issue reproduce with a different number of notifications before the crash?

BR.

David.

0 lakshmikanth satyavolu 3 months ago in reply to David

Expert 1670 points

Hi David,

Please find HeapMem view attached.

I will send the notification count later, once I have calculated it.

Best

Lakshmi

0 lakshmikanth satyavolu 3 months ago in reply to David

Expert 1670 points

Hi David,

We have also checked the number of notifications before each crash. It was not the same; it was different for each crash.

Best

Lakshmi

0 lakshmikanth satyavolu 3 months ago in reply to lakshmikanth satyavolu

Expert 1670 points

Hi David,

Below are a few more insights about this issue.

I have disabled the connection parameter update and am using the same parameters throughout the session. But I still see my firmware hang on iPhone.
It happens only with iOS central devices (mobile).
It happens only when BLE disconnects and reconnects again.
On Android it does not happen.

I hope the above points provide more details about the issue. I really need your help here to fix this. We have been working on this for a couple of months now.

Please look into it ASAP.

Thanks in advance.

Best

Lakshmi

0 David 3 months ago in reply to lakshmikanth satyavolu

TI__Genius 13535 points

Hello Lakshmi,

Okay, so it is not about the parameter update. I do not see anything unusual in the memory stats you have shared. However, I was expecting to also see the freed heap memory and the total heap size, as shown here: https://software-dl.ti.com/lprf/sdg-latest/html/debugging/ble-common_heap.html. Before attempting that, or directly increasing the heap size to see whether this avoids the issue or lengthens the time until it reproduces, could you please share the air logs gathered with the Android device vs. the iOS device? That way we can try to spot the reason for the issue in the log differences.

The firmware hangs with the default parameters correct?

What is the PDU size? I see from your previous IOS log it is 20 bytes, is it the same for Android?

#define DEFAULT_DESIRED_MIN_CONN_INTERVAL              60

#define DEFAULT_DESIRED_MAX_CONN_INTERVAL               108

#define DEFAULT_DESIRED_SLAVE_LATENCY                         3

#define DEFAULT_DESIRED_CONN_TIMEOUT                         600

BR,

David.

0 lakshmikanth satyavolu 3 months ago in reply to David

Expert 1670 points

Hi David,

I will share the requested information tomorrow.

The parameters are the same for both Android and iOS.

Best

Lakshmi

0 David 3 months ago in reply to lakshmikanth satyavolu

TI__Genius 13535 points

Hello Lakshmi,

It would be useful to take a look at how the notification function has been implemented. For instance, the code below shows a typical way of allocating memory, sending notifications using GATT_Notification(), and freeing memory.

I would also suggest taking a look at this example, which implements a UART over BLE server/peripheral that uses notifications: simple_serial_socket_server.

static bStatus_t gattServApp_SendNotiInd( uint16 connHandle, uint8 cccValue,
                                          uint8 authenticated, gattAttribute_t *pAttr,
                                          uint8 taskId, pfnGATTReadAttrCB_t pfnReadAttrCB )
{
  attHandleValueNoti_t noti;
  uint16 len;
  bStatus_t status;

  // If the attribute value is longer than (ATT_MTU - 3) octets, then
  // only the first (ATT_MTU - 3) octets of this attributes value can
  // be sent in a notification.
  noti.pValue = (uint8 *)GATT_bm_alloc( connHandle, ATT_HANDLE_VALUE_NOTI,
                                        GATT_MAX_MTU, &len );
  if ( noti.pValue != NULL )
  {
    status = (*pfnReadAttrCB)( connHandle, pAttr, noti.pValue, &noti.len,
                               0, len, GATT_LOCAL_READ );
    if ( status == SUCCESS )
    {
      noti.handle = pAttr->handle;

      if ( cccValue & GATT_CLIENT_CFG_NOTIFY )
      {
        status = GATT_Notification( connHandle, &noti, authenticated );
      }
      else // GATT_CLIENT_CFG_INDICATE
      {
        status = GATT_Indication( connHandle, (attHandleValueInd_t *)&noti,
                                  authenticated, taskId );
      }
    }

    if ( status != SUCCESS )
    {
      GATT_bm_free( (gattMsg_t *)&noti, ATT_HANDLE_VALUE_NOTI );
    }
  }
  else
  {
    status = bleNoResources;
  }

  return ( status );
}

BR,

David.

0 lakshmikanth satyavolu 3 months ago in reply to David

Expert 1670 points

Hi David,

1) Except for the check below, I am sending notifications the same way as in the code you have shared.

status = (*pfnReadAttrCB)( connHandle, pAttr, noti.pValue, &noti.len,0, len, GATT_LOCAL_READ );

My notification has a fixed length (20 bytes).

I am not sure how deep the internal TX FIFO of the radio is. But after some tests I found that 6 packets of 20 bytes each go out without the GATT_Notification API returning a failure.

So I send at most 6 notifications per call, stopping earlier if the GATT_Notification() API returns a failure. Below is my code. Please review.

for (i = 0; i < 6; i++)
{
    heartRateMeas.pValue = GATT_bm_alloc(gapConnHandle, ATT_HANDLE_VALUE_NOTI,
                                         HEARTRATE_MEAS_LEN, NULL);
    if (heartRateMeas.pValue != NULL)
    {
        uint8_t *p = heartRateMeas.pValue;

        // Prepend packet counter (high byte first)
        *p++ = (counter & 0xFF00) >> 8;
        *p++ = (counter & 0xFF);

        // Copy sensor data
        memcpy(p, &recv_buf[mwptr * PKT_DATA_SIZE], PKT_DATA_SIZE);

        heartRateMeas.len = HEARTRATE_MEAS_LEN;

        // Send notification.
        gattErr = HeartRate_MeasNotify(gapConnHandle, &heartRateMeas);
        if (gattErr != SUCCESS)
        {
            switch (gattErr)
            {
            case INVALIDPARAMETER:
                break;
            case MSG_BUFFER_NOT_AVAIL:
                break;
            case bleMemAllocError:
                break;
            case bleInvalidMtuSize:
                break;
            case bleTimeout:
                break;
            default:
                break;
            }
            PIN_setOutputValue(hLEDPin, Board_LED1, 1);
            // Free the notification buffer
            GATT_bm_free((gattMsg_t *)&heartRateMeas, ATT_HANDLE_VALUE_NOTI);
            return;
        }
    }
}

2) I also attached android central device log as requested. Please check.

Andorid_wireshark_air_log_no_fw_hang.zip

3) I have not enabled HEAPMGR_METRICS and HEAPMGR_CONFIG preprocessor defines in my project. I will enable them and take heap snapshot. Will share those details later.

I hope 1) and 2) provide some insights.
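
The "send up to 6, stop at the first failure" idea in 1) can be sketched independently of the BLE stack as below. This is only an illustration under assumptions: send_fn_t, tx_state_t, send_burst and fake_send are hypothetical names that do not exist in the SDK, and the transport hook stands in for whatever call actually queues the notification. The point of the sketch is that the read index only advances after a packet has been accepted, so a failed packet is retried on the next call rather than skipped.

```c
#include <stddef.h>
#include <stdint.h>

#define PKT_LEN   20u   /* fixed notification length used in this thread */
#define MAX_BURST 6u    /* packets attempted per call */

/* Hypothetical transport hook: returns 0 on success, non-zero on failure. */
typedef int (*send_fn_t)(const uint8_t *pkt, size_t len);

typedef struct {
    const uint8_t *data;  /* packed packets, PKT_LEN bytes each */
    size_t total;         /* number of packets in data */
    size_t next;          /* index of the next unsent packet */
} tx_state_t;

/* Send up to MAX_BURST packets. Stop at the first failure and leave 'next'
   pointing at the failed packet so that the next call retries it.
   Returns the number of packets accepted in this call. */
static size_t send_burst(tx_state_t *st, send_fn_t send)
{
    size_t sent = 0;
    while (sent < MAX_BURST && st->next < st->total) {
        if (send(&st->data[st->next * PKT_LEN], PKT_LEN) != 0) {
            break;  /* transport busy: retry this packet later */
        }
        st->next++;
        sent++;
    }
    return sent;
}

/* Test double: accepts packets until its budget is used up. */
static unsigned fake_budget;

static int fake_send(const uint8_t *pkt, size_t len)
{
    (void)pkt;
    (void)len;
    if (fake_budget == 0u) {
        return 1;
    }
    fake_budget--;
    return 0;
}
```

In a real application the hook would wrap the allocate/notify/free-on-failure sequence shown above, and send_burst would be driven from task context.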

Best

Lakshmi

0 lakshmikanth satyavolu 3 months ago in reply to lakshmikanth satyavolu

Expert 1670 points

Hi David,

In addition to my previous response, I have gone through the documentation of the heap configuration for my project. I realized my project is using the configuration below.

0x80: OSAL HeapMgr, static heap size. Automatically determined by the amount of free space available at link time between heapStart and heapEnd symbols.

But to get detailed heap metrics from ROV, I may need to configure it to be HeapMem or HeapMem + HeapTrack.

Are there any guidelines and/or reference projects using this configuration? Please point me to them.

Best

Lakshmi

0 David 3 months ago in reply to lakshmikanth satyavolu

TI__Genius 13535 points

Hello Lakshmi,

Please see my comments below.

The amount of bytes sent in an iOS notification is 20 bytes vs. 2 bytes in Android. This can be explained by the MTU exchange request that the iOS master issues, which does not happen in Android (based on your last sniffer log at least). I suspect that the issue is related to how we are handling the data that is being sent (not the notification itself but maybe the process of reading/copying sensor data from the buffer).

Based on the screenshot you provided, where we see the error happening inside the HeartRate_MeasNotify() function and which led us to a HAL assert issue, I would double-check that you are copying the data correctly from the buffer. Also, were you able to read the assert cause?

memcpy(p, &recv_buf[mwptr * PKT_DATA_SIZE], PKT_DATA_SIZE);

BR,

David.

0 lakshmikanth satyavolu 3 months ago in reply to David

Expert 1670 points

Hi David,

The firmware copies data the same way for Android and iOS, since we use the same code for both. I am not sure why iOS interprets it differently. Could it be that the application running on iOS interprets the data packet differently than the one on Android?

Best

Lakshmi

0 David 3 months ago in reply to lakshmikanth satyavolu

TI__Genius 13535 points

Hello Lakshmi,

The difference is that the iOS device triggers an MTU size exchange request while the Android device doesn't.

BR,

David.

0 lakshmikanth satyavolu 3 months ago in reply to David

Expert 1670 points

Hi David,

1) Even if iOS requests a larger MTU size, we can observe that the device limits it to 23 in its response. This should not cause any problem in this case, right?

>>The amount of bytes sent in an IOS notification is 20 bytes vs 2 bytes in Android

Regarding the above query, the iOS log displays the complete raw 20-byte packet for some reason, whereas the Android log displays only the first two bytes. But the actual notification sent by the firmware was 20 bytes in both cases. We can observe that from the raw data packet in the right-side window of the log.
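
The relation between ATT_MTU and the notification payload can be written down in one line: a notification carries a 1-byte opcode and a 2-byte attribute handle, so at most ATT_MTU - 3 bytes of value fit. A small sketch (max_notify_payload is a hypothetical helper name, not an SDK function):

```c
#include <stdint.h>

#define ATT_NOTIFY_HEADER_LEN 3u  /* 1-byte opcode + 2-byte attribute handle */

/* Largest attribute value that fits in one notification for a given ATT_MTU. */
static uint16_t max_notify_payload(uint16_t att_mtu)
{
    return (att_mtu > ATT_NOTIFY_HEADER_LEN)
               ? (uint16_t)(att_mtu - ATT_NOTIFY_HEADER_LEN)
               : 0u;
}
```

For an ATT_MTU of 23 this yields 20 bytes, which matches the fixed 20-byte notification used here.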

Best

Lakshmi

0 lakshmikanth satyavolu 3 months ago in reply to David

Expert 1670 points

Hi David,

I happened to capture a sniffer log of a firmware hang. In this capture I observed that during reconnection, when the central device (iOS mobile phone) has been out of BLE range for a while and then comes back into range, a massive number of LL_VERSION_IND packets are sent continuously by the central, to which the device responds with an empty PDU. Eventually my device goes into an unknown state, which causes the firmware hang.

Attached is the sniffer log for the same for your reference. Please have a look.

Best

Lakshmi

fw_hang_during_reconnection_with_repeat_LL_VERSION_IND.zip

0 lakshmikanth satyavolu 3 months ago in reply to lakshmikanth satyavolu

Expert 1670 points

Hi David,

We are in critical phase of this project. Any help on this is highly appreciated.

Would you please look into it ASAP?

Best

Lakshmi

0 David 3 months ago in reply to lakshmikanth satyavolu

TI__Genius 13535 points

Hello Lakshmi,

Apologies for the delay. I see the slave is not answering with the corresponding LL_VERSION_IND opcode, and therefore the master seems to keep attempting the request for a long time. I also see no difference in the link layer information between the first connection attempt (which is successful, and the versions are correctly exchanged) and the second one (after reconnection). Does this also happen if you reconnect for other reasons, such as resetting the CC26xx device? May I ask if the issue is related to the previous one (HAL assert issue while transmitting notifications in HeartRate_MeasNotify())?

BR,

David.

0 lakshmikanth satyavolu 3 months ago in reply to David

Expert 1670 points

Hi David,

This happens (sometimes, not all the time) when I move the phone out of BLE range and bring it back into range to reattempt the connection.

Yes, when this happens, I encounter the HAL assert inside HeartRate_MeasNotify().

Best

Lakshmi

0 David 3 months ago in reply to lakshmikanth satyavolu

TI__Genius 13535 points

Hello Lakshmi,

Could you please share a sniffer log of this process? In the previous logs I do not see the LL_VERSION_IND, and in the latest one I do not see any attempt to transmit notifications, just several LL_VERSION_IND attempts from the master. Is the HAL assert happening during this exchange? I don't think so, as the device is still answering with empty packets until the end of the log.

I think we need to systematically search for the halassert reason/cause here. I think this guide might be of use: https://software-dl.ti.com/simplelink/esd/simplelink_cc13xx_cc26xx_sdk/7.40.00.77/exports/docs/ble5stack/ble_user_guide/html/ble-stack-5.x-guide/debugging-index.html?highlight=assert#hal-assert-handling.

BR,

David

0 lakshmikanth satyavolu 3 months ago in reply to David

Expert 1670 points

Hi David,

I was able to find the HAL assert reason: the assertion is caught with HAL_ASSERT_CAUSE_ICALL_TIMEOUT. Please find the screenshot attached.

According to the documentation, it is caused by a stack hang. I have also attached the sniffer log for this case. This time many empty PDUs were exchanged between the master and the device, and then the master issued a connection termination. This is what happens in most instances of the hang.

I will capture a sniffer log of the LL_VERSION_IND hang soon and send it.

Best

Lakshmi

fw_hang_dueto_icall_timout.zip

0 David 3 months ago in reply to lakshmikanth satyavolu

TI__Genius 13535 points

Hello Lakshmi,

ICall is a communication mechanism used between different processors or tasks in TI's BLE implementation. When an ICall operation takes longer than expected to complete, it results in this timeout error. I would consider the following two possibilities:

Resource contention: If multiple tasks are competing for the same resources, it can lead to delays and timeouts.
System overload: If the system is processing too many tasks simultaneously, it may not respond to ICall requests in time.

This can be translated to having another task that has a higher priority than ICall (?), or the ICall task (BLE task), being interrupted by some higher priority tasks such as interrupts (maybe the interrupt callback coming from the sensor when it has collected the data?) Could you also try adding some delay in between notifications (inside the for loop)?

BR,

David.

0 lakshmikanth satyavolu 3 months ago in reply to David

Expert 1670 points

Hi David,

I have tested with a delay of 100 clock ticks in between. The issue still persists. The more I increased the delay, the more frequent and earlier the hangs became compared to running without the delay.

Below is my program flow for your understanding. Please check it and let me know if you see any issue with it.

I have a sensor on SPI0 which generates an interrupt on a GPIO every 5 milliseconds. Inside the ISR I accumulate approximately 1 second of data into a buffer and write it to external flash on the SPI1 interface.

A periodic event timer is programmed to fire every 10 milliseconds to send GATT notifications (HeartRate_MeasNotify), with the data read from flash over SPI1, when data is ready from the above routine.

Best

Lakshmi

0 David 2 months ago in reply to lakshmikanth satyavolu

TI__Genius 13535 points

Hello Lakshmi,

Thanks for the extra info and the logs. As you mentioned, the peripheral is issuing empty packets. I don't see the same behaviour as before, where the peripheral goes into a HAL assert (timeout) and therefore stops answering while in a connection. In addition, there are no repeated LL_VERSION_IND packets before a reconnection is attempted.

If the logic you have shared is the same when using iOS or Android, then I would think the error is triggered by differences in timing (connection intervals) and packet sizes (MTU).

Are you still using the same parameters below, and does it still fail only for iOS?

#define DEFAULT_DESIRED_MIN_CONN_INTERVAL 60

#define DEFAULT_DESIRED_MAX_CONN_INTERVAL 108

#define DEFAULT_DESIRED_SLAVE_LATENCY 3

#define DEFAULT_DESIRED_CONN_TIMEOUT 600

BR,

David

0 lakshmikanth satyavolu 2 months ago in reply to David

Expert 1670 points

Hi David,

Yes, I am using the same connection parameters for this debugging.

I experimented with different timing variables (delays between GATT notifications, and different iOS-compatible connection intervals, slave latencies and supervision timeouts) without success.

I realized that there are two variables which are causing this issue. I am providing a summary and my program logic below. Please review.

1) Connection interval

After reconnection, no matter what connection parameters I use, the firmware hangs sooner or later.

2) The frequency of HeartRate_MeasNotify() API calls

I use a software timer of 10ms to send notifications after collecting data from sensor.

There are two scenarios here.

a) Before disconnection: I collect 28 packets (28 x 20 = 560 bytes) in the sensor's ISR (external GPIO interrupt) and let the software timer callback know that the data is ready. During every callback execution I send only 6 notifications; the next 6 are sent in the next call. Meanwhile, the sensor fills the next 28 packets into the other buffer (ping-pong mode) so that it is ready to be sent after the current transfer. If the current transfer is delayed, BLE is disconnected unexpectedly, or the phone is out of range, I freeze the ping-pong buffers and write subsequent sensor data into external (SPI) flash memory so that no data is lost.

b) After disconnection: After BLE is restored or the current buffer transfer is completed, I update the connection parameters with new values (15 ms CI). Then data transfers resume from where they stopped. This time, unlike scenario a), the frequency of HeartRate_MeasNotify calls increases rapidly. Even though I still send only 6 notifications in every timer callback, the data is always ready inside the flash memory, so HeartRate_MeasNotify is called continuously until all the data is flushed out.

The firmware hang always happens in scenario b). I have tested various timer periods (100 ms, 200 ms, and so on up to 1000 ms) with different values of CI (15 ms, 30 ms, and so on) without success. My application always gets stuck inside the GATT_Notification() API and hangs, no matter how much delay I provide between these calls (6 calls, then a delay, then 6 calls, and so on).
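
The ping-pong (double-buffer) scheme from scenario a) can be sketched roughly as follows. This is a simplified model under assumptions, not the actual firmware: pingpong_t, pp_put, pp_ready_index and pp_release are hypothetical names, and on real hardware the state shared between the ISR and the sender would additionally need protection against concurrent access.

```c
#include <stdbool.h>
#include <stdint.h>
#include <string.h>

#define PKT_LEN      20u
#define PKTS_PER_BUF 28u

typedef struct {
    uint8_t  buf[2][PKTS_PER_BUF * PKT_LEN];
    uint8_t  fill_idx;    /* buffer the producer is writing (0 or 1) */
    uint16_t fill_count;  /* packets already written into buf[fill_idx] */
    bool     ready[2];    /* true when a buffer is full and waiting to be sent */
} pingpong_t;

/* Producer: store one packet. Returns false when both buffers are full,
   in which case the caller has to spill the data elsewhere (e.g. flash). */
static bool pp_put(pingpong_t *pp, const uint8_t *pkt)
{
    if (pp->ready[pp->fill_idx]) {
        return false;
    }
    memcpy(&pp->buf[pp->fill_idx][pp->fill_count * PKT_LEN], pkt, PKT_LEN);
    if (++pp->fill_count == PKTS_PER_BUF) {
        pp->ready[pp->fill_idx] = true;  /* hand the full buffer over */
        pp->fill_idx ^= 1u;              /* continue in the other buffer */
        pp->fill_count = 0;
    }
    return true;
}

/* Consumer: index of the oldest full buffer, or -1 if none is ready. */
static int pp_ready_index(const pingpong_t *pp)
{
    if (pp->ready[pp->fill_idx]) {
        return pp->fill_idx;             /* both full: this one is older */
    }
    if (pp->ready[pp->fill_idx ^ 1u]) {
        return (int)(pp->fill_idx ^ 1u);
    }
    return -1;
}

/* Consumer: mark a buffer as sent so the producer may reuse it. */
static void pp_release(pingpong_t *pp, int idx)
{
    pp->ready[idx] = false;
}
```

When pp_put returns false, both buffers are waiting to be sent, which corresponds to the point where the description above switches to writing sensor data into external flash.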

Sending stored data very fast is a basic requirement for our application, as the backup recovery otherwise takes a lot of time. We are relying on BLE (we are using 4.2) bandwidth for maximum throughput.

I am requesting you to review the logic explained above and suggest the best way to handle these scenarios. This flow works perfectly on Android devices.

Does increasing the MTU size help here?

Or do I need to change my logic?

As we are at a very critical phase of the project and have been fighting this issue for more than 8 weeks, any help is really appreciated.

Thanks in advance.

Best

Lakshmi

0 David 2 months ago in reply to lakshmikanth satyavolu

TI__Genius 13535 points

Hello Lakshmi,

Thanks for the summary here. It helps a lot, and now we can focus on scenario b). Here are my thoughts:

The peripheral updates the connection interval to 15 ms. However, I only see this being accepted by the master when using an Android master, not with iOS (I am looking at the air logs again, and I see the connection update sets the interval to 30 ms).
You mentioned that the "frequency of HeartRate_MeasNotify calls will be rapidly increased". Does that mean that the timer is triggered more often than every 10 ms? How fast is this?
How are you reading the external flash? I only see memcpy(p, &recv_buf[mwptr * PKT_DATA_SIZE], PKT_DATA_SIZE); which I thought was the buffer from the SPI transaction interrupt. Would this also be a buffer that is filled by the NVS driver (when reading from external flash)? As mentioned, HAL_ASSERT_CAUSE_ICALL_TIMEOUT happens because some other higher-priority task is executing (NVS or SPI sensor driver callbacks) and the BLE task is not able to deliver in time.

[Screenshots: iOS air log; Android air log]

I will ask my colleagues as well in the meantime for insights on how to debug this further.

BR,

David.

0 lakshmikanth satyavolu 2 months ago in reply to David

Expert 1670 points

Hi David,

Below are my answers. Hope they provide more insights.

1) The iOS master accepts the new parameters. The hang happens only after data transfers have started.

2) In scenario a), which is before BLE disconnection, no external flash memory is involved. I use ping-pong buffers to store the sensor data. While the ping buffer is receiving data, data is transferred from the pong buffer. Once the pong buffer has been sent completely, the ping buffer is ready for transmission. As I mentioned, I collect 28 data packets and then start the transfer. Collecting 28 packets takes approximately 1 second. So, at a 70ms connection interval, I have plenty of time to transfer 28 packets. But in scenario b), while no BLE connection is available, I store data inside external flash. After BLE reconnects, I have to flush all the stored data quickly, so I change the CI to 15ms and start sending continuous notifications, as I don't need to wait for data to be ready. So the frequency of notifications increases in scenario b).

3) I read data over SPI into a local buffer (28 packets at a time), then use a local 20-byte buffer to copy each packet and send it. There is no chance of an SPI block in this case.

But while the GATT notification is ongoing, there is a chance of a sensor interrupt event, which again uses SPI to read data from the sensor. This interrupt is an external GPIO interrupt triggered by the sensor. We cannot avoid this event as it provides the data from the sensor. Also, I am using SPI in blocking mode.
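
As a rough cross-check of the timing in 2), the number of packets that has to go out per connection event to keep up with a given packet rate can be estimated as below. This is a back-of-the-envelope sketch: pkts_per_event_needed is a hypothetical helper, the interval is given in 1.25 ms units, and slave latency, retransmissions and the exact interval chosen by the central are ignored.

```c
#include <stdint.h>

/* Minimum number of packets per connection event needed to sustain
   'pkts_per_sec', for a connection interval given in 1.25 ms units. */
static uint32_t pkts_per_event_needed(uint32_t pkts_per_sec,
                                      uint16_t interval_units)
{
    uint64_t interval_us = (uint64_t)interval_units * 1250u;
    /* ceil(pkts_per_sec * interval_us / 1,000,000) using integer math */
    return (uint32_t)((pkts_per_sec * interval_us + 999999u) / 1000000u);
}
```

For 28 packets per second this gives 3 packets per event at an interval value of 60 (75 ms) and 1 packet per event at 12 (15 ms).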

Best

Lakshmi

0 David 2 months ago in reply to lakshmikanth satyavolu

TI__Genius 13535 points

Hello Lakshmi,

Just to reduce the scope of the issue, would you mind disabling the process that reads the data from the external flash and transmits it through notifications? I would like to see whether the error stops reproducing without this process, so that we can then focus on how to avoid it with the mechanism in place. In addition, would it be possible to enable SPI interrupts only after all the sensor data in external flash has been transmitted? What do you currently do with that data in the meantime?

BR,

David.

0 lakshmikanth satyavolu 2 months ago in reply to David

Expert 1670 points

Hi David,

I am working on the first part. Meanwhile,

>> In addition, would it be possible to enable SPI interrupts after all the external flash sensor data has been transmitted? What do you currently do with that data in the meantime?

Can you please elaborate on this? I did not understand what exactly needs to be done.

Best

Lakshmi

0 David 2 months ago in reply to lakshmikanth satyavolu

TI__Genius 13535 points

Hello Lakshmi,

Sure, from the previous description of the code, it wasn't clear to me if the MCU is still fetching data from the sensor through SPI while the data from the external flash is being read and transmitted at a higher frequency, or is it that the sensor data is lost (not transmitted) while the data inside the external flash is transmitted?

lakshmikanth satyavolu said:
But, while the gatt notification is ongoing, there is a chance of sensor interrupt event which again uses SPI to read data from sensor. This interrupt is external GPIO triggered by sensor. We cannot avoid this event as it provides the data from sensor. Also I am using SPI in blocked mode.

BR,

David

0 lakshmikanth satyavolu 2 months ago in reply to David

Expert 1670 points

Hi David,

Yes. The MCU still fetches sensor data through SPI while data is being transmitted from external flash. The collected data is written into the same flash. So the transmitting routine reads from the flash and transmits while the collection routine (sensor ISR) collects the data and writes it into the flash. I am not stopping data collection from the sensor since we cannot afford data loss.

David said:
lakshmikanth satyavolu said:
But, while the gatt notification is ongoing, there is a chance of sensor interrupt event which again uses SPI to read data from sensor. This interrupt is external GPIO triggered by sensor. We cannot avoid this event as it provides the data from sensor. Also I am using SPI in blocked mode.

SPI0 (sensor) generates data ready interrupt every 5ms. I use 10ms timer to call GATT_notification API. In this callback I read the data from flash and call the API. If GATT_notifications loop cannot be completed in 5ms, there is a chance of interruption from sensor. Hope it helps. Let me know if it is not clear.

Best

Lakshmi

0 David 2 months ago in reply to lakshmikanth satyavolu

TI__Genius 13535 points

Hello Lakshmi,

Thanks for the clarification. What I am curious about is if any of these mechanisms to fetch data (interrupt based ones for instance) are blocking the execution of the BLE task. I would suggest using the ROV tool again to look at the Task and Hwi information during runtime, like it is explained in this video: https://www.ti.com/video/5631158932001#transcript-tab. In addition, digging a bit more on other threads that might have reported a similar issue I found this one that tracked the error back to the incorrect usage of clock instances, which might be worth checking in your application.

BR,

David.

0 lakshmikanth satyavolu 2 months ago in reply to David

Expert 1670 points

Hi David,

1) I am using SPI flash read in blocking mode inside the clock handler constructed by Util_constructClock. I think this clock handler runs in Swi context. I also use another SPI write (flash) and read (sensor) inside the external interrupt handler. All SPI operations use blocking SPI mode. Could this be an issue?

2) I am using only one clock instance for data transmission, in which the GATT notifications and SPI flash read operations happen back to back as explained in point 1) above.
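
For context on point 1): TI-RTOS clock functions run in Swi context, and blocking calls are generally not allowed from Swi or Hwi context, so one commonly recommended pattern is to keep the clock callback and the GPIO interrupt handler minimal and let them only flag work for the application task. The sketch below models that pattern in plain C so that it can be compiled on a host. Everything in it is an assumption for illustration: the event names and functions are hypothetical, and in real firmware the flagging would typically be done with an RTOS event or message queue instead of a shared variable.

```c
#include <stdint.h>

#define EVT_SEND_DATA  (1u << 0)  /* periodic clock expired */
#define EVT_SENSOR_RDY (1u << 1)  /* sensor data-ready interrupt fired */

static volatile uint32_t pending_events;
static unsigned sends_done;
static unsigned sensor_reads_done;

/* Called from interrupt/Swi context: only record that work is pending. */
static void post_event(uint32_t evt)
{
    pending_events |= evt;
}

static void clock_callback(void) { post_event(EVT_SEND_DATA); }
static void sensor_isr(void)     { post_event(EVT_SENSOR_RDY); }

/* Called from the application task loop: blocking work belongs here. */
static void process_events(void)
{
    /* Real firmware must read and clear the flags atomically. */
    uint32_t evts = pending_events;
    pending_events &= ~evts;

    if (evts & EVT_SENSOR_RDY) {
        sensor_reads_done++;  /* placeholder for the blocking sensor SPI read */
    }
    if (evts & EVT_SEND_DATA) {
        sends_done++;         /* placeholder for flash read + notifications */
    }
}
```

Note that a plain bit flag merges repeated events of the same kind, so a design that must not miss a sensor sample would need a counter or a queue instead.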

Best

Lakshmi
