CC2640R2F: After BLE connects, Hal events stop and reception of characteristic changes does not work

Dale Kramer

Part Number: CC2640R2F

We are working with code that has been ported to CC2640R2F from a CCS 7.0.0.00042 ProjectZero based CC2640F128 project which uses BLE stack 2.2.1.

The ported code uses CCS 8.3.1.00004, SimpleLink CC2640R2 SDK 1.40.0.45 and BLE Stack 3.x.

We also use our own IOS App that connects and works perfectly with CC2640F128 boards that are running the original un-ported code.

We have been stalled for weeks now trying to get the ported code to work on our CC2640R2F boards.

We have made progress but we are stuck at this point:

1. The ported code advertises and the R2F boards show up in our IOS App listing of nearby advertising devices (the list also includes un-ported F128 boards that are also advertising nearby).
2. When we touch an un-ported F128 board in the list, the connection is made and the IOS App works fully with 4 characteristics that are notifying, and the IOS App can send and receive data to the un-ported F128 board.
3. When we touch a ported R2F board in the list, the connection is made and the Gap Role state shows 'Connected' during a debug session with that ported R2F board.
4. After connection, there is some back and forth IOS to R2F communication that gets 4 characteristics into a notifying state.
5. Then the IOS App sends values for a characteristic but this never generates a '.._processCharValueChangeEvt(uint8_t paramID)' event in the CCS8 debug session.
6. Also, at this time, no more Hal events are generated in the debug session. For instance, a valid and active breakpoint on this line 'events = Event_pend(syncEvent, Event_Id_NONE, HAL_ALL_EVENTS, ICALL_TIMEOUT_FOREVER);' is never reached.
7. However, the debug session has not crashed: code is executing, and if I do a debug 'Pause', the code breaks at many different locations. Here is one location:

Please take careful note of item #2 in the above list. The fact that item #2 happens means that the IOS App is not at fault here, and that the un-ported F128 code works perfectly in all of its BLE communications with the IOS App.

So, there is a reason that the ported code does not work on our ported code CC2640R2F boards and we need help finding out what that reason is.

Can anyone suggest a place to start from here?

Thanks,

Dale

over 4 years ago

0 Jan over 4 years ago

TI__Mastermind 42320 points

Hi Dale,

I have assigned an expert to comment on this. To be clear, you are porting from the BLE 2.2.1 stack to the BLE 3.x stack, correct? If so, then I suggest looking at the following section of the BLE User's Guide:

Porting Guide from CC2640 to CC2640R2F

Seeing as you have an almost working version of the code, you may have already referenced the above document, but as a sanity check can you verify that all steps were implemented during the porting process?

Also, the image you have provided in your post did not upload properly. Can you try uploading the image using the Insert -> Image option?

Best Regards,

Jan

0 Dale Kramer over 4 years ago in reply to Jan

Expert 2615 points

EDIT: Yes we are porting from BLE Stack 2.2.1

I am sure that Markel used that guide when he ported the code.

Markel will you please re-verify that you followed all the steps in that guide?

I did use the Insert->Image option, I will try again here:

0 Dale Kramer over 4 years ago in reply to Jan

Expert 2615 points

Jan,

Here is the image after I applied your ROV fix in this topic of mine that you just resolved

Can you help me with this 'stackPeak' issue?

Dale

0 Dale Kramer over 4 years ago in reply to Dale Kramer

Expert 2615 points

I increased the stack size to 1024 with this define:

#define HAL_TASK_STACK_SIZE 1024

The stackPeak for HaloLogger_taskFxn is now a comfortable value of 672.

But items #5 and #6 still persist as unexpected behaviors.

Here is the Paused debug session at this stage:

I can also report that at this stage, ROV reports hwiStackSize as 1024 and hwiStackPeak as 416.
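The stack figures above can be put in context with a little arithmetic. The sketch below is illustrative only: it shows stack painting, a common way to measure a high-water mark, together with the headroom calculation. The fill byte, the function names, and the idea that ROV derives stackPeak in a similar way are assumptions, not something stated in this thread.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define STACK_FILL_BYTE 0xBEu /* assumed fill pattern, for illustration only */

/* Paint the whole stack buffer with the fill pattern before the task runs. */
void stack_paint(uint8_t *stack, size_t size)
{
    memset(stack, STACK_FILL_BYTE, size);
}

/* On a descending stack the untouched bytes sit at the low addresses.
 * Count them from the bottom; the peak is the size minus that count. */
size_t stack_peak(const uint8_t *stack, size_t size)
{
    size_t untouched = 0;
    while (untouched < size && stack[untouched] == STACK_FILL_BYTE) {
        untouched++;
    }
    return size - untouched;
}

/* Remaining headroom as a whole percentage of the stack size. */
unsigned stack_headroom_percent(size_t size, size_t peak)
{
    assert(size > 0 && peak <= size);
    return (unsigned)(((size - peak) * 100u) / size);
}
```

With the values reported here, a 1024-byte task stack with a peak of 672 leaves about 34% headroom, and the 1024-byte Hwi stack with a peak of 416 leaves about 59%. A scan like this can under-report the peak if the task happens to write the fill value at the boundary, so the numbers are best read as a lower bound on usage.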

Where to go now ???

Dale

EDIT: Here are more datapoints, I will continue the item numbering from my 1st post of this forum topic:

8. We have confirmed with a packet analyzer that, at about the same time that Hal events stop, a write of 0 0 0 0 1 appears. This must be what our IOS App sent on our 'Info' characteristic in item #5 of my first post. Here is that packet analyzer listing:

This data was never received by our R2F ported code, perhaps because the code was already in its 'hanging' state.

9. Hal events trigger and the code breaks repeatedly at an active breakpoint on ‘events = Event_pend…’ while in the ‘advertising’ state but not ‘connected’ state, due to timer events.

10. We know that the R2F ported code is still running at item #7 and that this is NOT a simple hang because we can 'Pause' the code and it breaks at many different code locations.

11. Single stepping the code after a ‘pause’ takes forever to find a 'repeat point' (I never actually single stepped enough to find that repeat point or realize that I had found a repeat point).

12. During single stepping, it actually steps through many iCall code lines.

I hope this sparks an idea in someone's mind as to where to investigate next ...

Dale

0 Dale Kramer over 4 years ago in reply to Dale Kramer

Expert 2615 points

Jan, have you found an expert who can help with this topic yet?

0 desouza over 4 years ago in reply to Dale Kramer

TI__Guru**** 171224 points

Dale,

Thanks for a very thorough description. I have a few possible ideas to help further debug this.

The fact the stack stops responding to a characteristic value sent by the iOS could indicate that perhaps a buffer overrun has happened - i.e., the buffer received could be larger than its allocated space. Can you check the allocation of the data for the characteristic?

However, the stack may have stopped before the data was sent by the iOS - in this case check if a specific scenario is happening, such as constant traffic for example. You could also try to use the host_test project plus the Btool to get a more controlled Host that allows to read and write to characteristics.

These two topics are addressed by our Simplelink Academy material: take a look at the Custom Profile module to get ideas on where the characteristic data is stored and the Fundamentals module to get familiar with the Btool utility.

https://dev.ti.com/tirex/explore/node?node=AO9tDQs88Bcw33.wxV2moA__krol.2c__LATEST

Those would be the two approaches I would use to start investigating this.

One aspect about the breakpoint setting and single stepping is that, depending on the code optimization used, the correlation between the debugger and the code itself can be very confusing due to inlining and code compaction. The section Debugging of our BLE Stack user's guide can help with this.

https://dev.ti.com/tirex/explore/node?node=ANPdcnhbDh6Eu5rLf2U9uA__krol.2c__LATEST

Hope this helps,

Rafael

0 Dale Kramer over 4 years ago in reply to desouza

Expert 2615 points

Hello Rafael,

Sorry for the late response, I have been trying a few more ideas to solve this but they did not work so I can now address your suggestions.

First, I would like to provide a detailed answer to Jan's query below:

Jan said:
Porting Guide from CC2640 to CC2640R2F

Seeing as you have an almost working version of the code, you may have already referenced the above document, but as a sanity check can you verify that all steps were implemented during the porting process?

The document was followed, but we diverged in that we chose to use SimpleLink CC2640R2 SDK 1.40.0.45. This was done so that we would have more free RAM, since there is a faction here that believes all our BLE communications issues are due to low RAM. I do not believe that conclusion is justified: we have obtained much more free RAM than was available for the un-ported code on the CC2640F128, and, as described in detail above, the un-ported code on the CC2640F128 runs flawlessly with our IOS App whereas the ported CC2640R2F code 'hangs' and stops executing Hal timer events. We obtained this extra free RAM by using cache as RAM and by placing a 4k byte array that we use into cache memory.

We also used the simple_peripheral CC2640R2F base program. I have concerns about this since the un-ported CC2640F128 code used ProjectZero for its base code.

Also I have been informed that we used BLE Stack 3.20x.

Beyond the above changes, very little of the rest of the porting guide was needed since the simple_peripheral base code already had those changes in it.

Now for your suggestions:

desouza said:
The fact the stack stops responding to a characteristic value sent by the iOS could indicate that perhaps a buffer overrun has happened - i.e., the buffer received could be larger than its allocated space. Can you check the allocation of the data for the characteristic?

I am not sure how the allocated space for the characteristic could be different from the length of the space in the un-ported code since I believe that code was ported directly to the CC2640R2F code. I did check and both un-ported code and ported code have this line in the Haloservice.h file for the length of the HS_INFO_ID characteristic:

#define HS_INFO_LEN  86 //3 records for 255 bytes plus page id byte at end of each 85 bytes 
                        //size limited by MAX_PDU_SIZE=255 when getting from ios device
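A length check of the kind Rafael suggests could look like the sketch below. It is a minimal illustration, not the code from Haloservice: the storage array, status codes, and function name are hypothetical, and only HS_INFO_LEN comes from the post. The point is that an incoming write is rejected before the copy whenever offset plus length would run past the allocated buffer.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define HS_INFO_LEN 86 /* value quoted from Haloservice.h above */

/* Hypothetical status codes for this sketch, not the BLE stack's own. */
typedef enum {
    WRITE_OK = 0,
    WRITE_ERR_INVALID_OFFSET,
    WRITE_ERR_INVALID_SIZE
} write_status_t;

uint8_t  hs_info_val[HS_INFO_LEN]; /* hypothetical characteristic storage */
uint16_t hs_info_val_len;          /* number of valid bytes stored */

/* Validate offset and length first, then copy. Nothing is written on error. */
write_status_t hs_info_write(const uint8_t *pValue, uint16_t len,
                             uint16_t offset)
{
    assert(pValue != NULL);

    if (offset >= HS_INFO_LEN) {
        return WRITE_ERR_INVALID_OFFSET;
    }
    /* Widen before adding so the sum cannot wrap around in 16 bits. */
    if ((uint32_t)offset + len > HS_INFO_LEN) {
        return WRITE_ERR_INVALID_SIZE;
    }

    memcpy(hs_info_val + offset, pValue, len);
    hs_info_val_len = (uint16_t)(offset + len);
    return WRITE_OK;
}
```

In a real service the write callback would return the stack's own ATT error codes rather than these placeholder values. If the ported write handler already performs an equivalent check, a short write such as the one seen on the packet analyzer cannot overrun an 86-byte buffer, which would point the investigation elsewhere.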

desouza said:
However, the stack may have stopped before the data was sent by the iOS - in this case check if a specific scenario is happening, such as constant traffic for example.

After the IOS App sends out the 0 0 0 1 on the info characteristic, the IOS App simply waits in a connected state for the CC2640XX to send back the data it requested by sending 0 0 0 1. So there is no more BLE traffic around to my knowledge.

desouza said:
You could also try to use the host_test project plus the Btool to get a more controlled Host that allows to read and write to characteristics.

These two topics are addressed by our Simplelink Academy material: take a look at the Custom Profile module to get ideas on where the characteristic data is stored and the Fundamentals module to get familiar with the Btool utility.

https://dev.ti.com/tirex/explore/node?node=AO9tDQs88Bcw33.wxV2moA__krol.2c__LATEST

Those would be the two approaches I would use to start investigating this.

I am really trying to avoid having to start from scratch here because the un-ported code on the CC2640F128 works flawlessly (including reading and writing characteristics) with our IOS App so the ported CC2640R2F should do the same thing.

desouza said:
One aspect about the breakpoint setting and single stepping is that, depending on the code optimization used, the correlation between the debugger and the code itself can be very confusing due to inlining and code compaction. The section Debugging of our BLE Stack user's guide can help with this.

https://dev.ti.com/tirex/explore/node?node=ANPdcnhbDh6Eu5rLf2U9uA__krol.2c__LATEST

I am very familiar with debugging and the use of breakpoint limitations when using high optimization levels (in this case we use 4). I have been debugging on Opt 4 for years with the un-ported CC2640F128 code. My experience with breakpoints is that if you are allowed to set a breakpoint on a line of code and also that you can obtain a checkmark on the breakpoint in the 'Breakpoints' debugging tab, then the code will ALWAYS break there. So when I say that the code does not break at a breakpoint in this forum topic, I mean a valid breakpoint that was set after the code was built and that there is a checkmark on the breakpoint in the 'Breakpoints' debugging tab.

The breakpoints that I have mentioned so far have confirmed that prior to this 'hanging but running state' that we reach, we can break the code when Hal events are generated and that there is no reception of the info characteristic with value 0 0 0 1 by the CC2640R2F.

Also, in this 'hanging state' code is running and is not in a simple loop since when we pause the code in a debug session, the code breaks at many different locations...

Does any of this help or generate more ideas that we can look into for this issue?

Is there a way to understand what code is actually being executed in the 'hanging state' more efficiently?
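One possible approach to that question, offered as a sketch rather than anything suggested in this thread, is a small breadcrumb buffer: each code path of interest records an ID, and after a 'Pause' the buffer shows the most recent path without single stepping. All names here are hypothetical, and the buffer is not interrupt-safe as written.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define TRACE_DEPTH 16u /* power of two so the index wraps with a mask */

typedef struct {
    uint16_t id[TRACE_DEPTH]; /* most recent breadcrumb IDs */
    uint32_t count;           /* total number of marks ever recorded */
} trace_buf_t;

/* Record a breadcrumb, overwriting the oldest entry once the buffer is full. */
void trace_mark(trace_buf_t *t, uint16_t id)
{
    assert(t != NULL);
    t->id[t->count & (TRACE_DEPTH - 1u)] = id;
    t->count++;
}

/* Return the n-th most recent mark (0 = newest), or 0 if it is not held. */
uint16_t trace_recent(const trace_buf_t *t, uint32_t n)
{
    assert(t != NULL);
    if (n >= t->count || n >= TRACE_DEPTH) {
        return 0;
    }
    return t->id[(t->count - 1u - n) & (TRACE_DEPTH - 1u)];
}
```

A call such as trace_mark(&trace, 3) placed at the top of each task loop and callback would leave a short history that can be read from the debugger's Expressions view after pausing. IDs that never appear identify the code that has stopped running, and IDs that repeat identify the code that is consuming the processor.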

Do any of these code sections tell you what is executing at this pause point?

Thanks,

Dale

0 Dale Kramer over 4 years ago in reply to Dale Kramer

Expert 2615 points

MORE INFO:

I have been able to investigate our 'hung state' more but I am reaching the limit of what I can do here without further guidance.

Here is a commented screenshot of my latest debugging attempt:

0 Dale Kramer over 4 years ago in reply to Dale Kramer

Expert 2615 points

I decided to leave my ...CRITICAL_SECTION... breakpoint active after a forced board reset. This would allow me to see what the code does during the initial advertising state, where my 1 ms Hal events are continually triggered.

This breakpoint seems to be in the very basic code that controls all the processor tasks and it still breaks regularly in the advertising state with very little code executed between repeats of the breakpoint.

This was encouraging, so I connected again and got back to our 'hung state'.

Then I tediously single stepped between triggers of my ...CRITICAL_SECTION... breakpoint.

Finally, I got to where an 'ICALL_ERRNO_SUCCESS' is generated as you can see below:

Now, I have to believe that there is somebody out there that can tell me how to fix this error... please

EDIT: DARN, maybe that is not an unexpected error here

BUT: Here is another clue, maybe this is an unexpected error along the way while looping to find the highest priority task of the 9 active tasks in 'void osal_run_system( void )':

0 Dale Kramer over 4 years ago in reply to Dale Kramer

Expert 2615 points

Another week goes by, our path to the future with CC26XX is very uncertain right now.

What else can I try?

Dale

EDIT: MORE INFO

0 Dale Kramer over 4 years ago in reply to Dale Kramer

Expert 2615 points

Dale Kramer said:
We also used the simple_peripheral CC2640R2F base program. I have concerns about this since the un-ported CC2640F128 code used ProjectZero for its base code.

I have begun looking down this rabbit hole and it appears to me that perhaps we could expect better results with the ported code if we had used Project_zero for the CC2640R2F as the base code for the porting to CC2640R2F.

The reason I say this is because 2 functions in the unported code, which get triggered 7 times during a successful connection process with our IOS App, are missing completely from the simple peripheral sample code for the CC2640R2F.

The functions are:

/*
 * @brief  Generic message constructor for characteristic data.
 *
 *         Sends a message to the application for handling in Task context where
 *         the message payload is a char_data_t struct.
 *
 *         From service callbacks the appMsgType is APP_MSG_SERVICE_WRITE or
 *         APP_MSG_SERVICE_CFG, and functions running in another context than
 *         the Task itself, can set the type to APP_MSG_UPDATE_CHARVAL to
 *         make the user Task loop invoke user_updateCharVal function for them.
 *
 * @param  appMsgType    Enumerated type of message being sent.
 * @param  connHandle    GAP Connection handle of the relevant connection
 * @param  serviceUUID   16-bit part of the relevant service UUID
 * @param  paramID       Index of the characteristic in the service
 * @param  *pValue       Pointer to characteristic value
 * @param  len           Length of characteristic data
 */
static void user_enqueueCharDataMsg( app_msg_types_t appMsgType,
                                     uint16_t connHandle,
                                     uint16_t serviceUUID, uint8_t paramID,
                                     uint8_t *pValue, uint16_t len )
{
  // Called in Stack's Task context, so can't do processing here.
  // Send message to application message queue about received data.
  uint16_t readLen = len; // How much data was written to the attribute

  // Allocate memory for the message.
  // Note: The pCharData message doesn't have to contain the data itself, as
  //       that's stored in a variable in the service implementation.
  //
  //       However, to prevent data loss if a new value is received before the
  //       service's container is read out via the GetParameter API,
  //       we copy the characteristic's data now.
  app_msg_t *pMsg = ICall_malloc( sizeof(app_msg_t) + sizeof(char_data_t) +
                                  readLen );

  if (pMsg != NULL)
  {
    pMsg->type = appMsgType;

    char_data_t *pCharData = (char_data_t *)pMsg->pdu;
    pCharData->svcUUID = serviceUUID; // Use 16-bit part of UUID.
    pCharData->paramID = paramID;
    // Copy data from service now.
    memcpy(pCharData->data, pValue, readLen);
    // Update pCharData with how much data we received.
    pCharData->dataLen = readLen;
    // Enqueue the message using pointer to queue node element.
    Queue_enqueue(hApplicationMsgQ, &pMsg->_elem);
    // Let application know there's a message.
    Semaphore_post(sem);
  }
}

/*
 * @brief  Generic message constructor for application messages.
 *
 *         Sends a message to the application for handling in Task context.
 *
 * @param  appMsgType    Enumerated type of message being sent.
 * @param  *pData        Pointer to the message data
 * @param  len           Length of the message data
 */
static void user_enqueueRawAppMsg(app_msg_types_t appMsgType, uint8_t *pData,
                                  uint16_t len)
{
  // Allocate memory for the message.
  app_msg_t *pMsg = ICall_malloc( sizeof(app_msg_t) + len );

  if (pMsg != NULL)
  {
    pMsg->type = appMsgType;

    // Copy data into message
    memcpy(pMsg->pdu, pData, len);

    // Enqueue the message using pointer to queue node element.
    Queue_enqueue(hApplicationMsgQ, &pMsg->_elem);
    // Let application know there's a message.
    Semaphore_post(sem);
  }
}

Also, from their descriptions, I reason that they likely influence whether an event is triggered when new characteristic data arrives, and they possibly affect Hal event generation.
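The pattern the two functions share (allocate a message, copy the payload, enqueue it, then wake the application task) can be sketched in portable C as follows. This is a simplified stand-in, not the ProjectZero code: malloc replaces ICall_malloc, a hand-rolled list replaces Queue_enqueue, a counter replaces Semaphore_post, and all type and function names are hypothetical. It is also not thread-safe, unlike the RTOS queue.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-in for app_msg_t with its payload appended. */
typedef struct msg {
    struct msg *next;    /* queue link, plays the role of _elem */
    uint8_t     type;    /* message type, like appMsgType */
    uint16_t    dataLen; /* number of payload bytes that follow */
    uint8_t     data[];  /* payload copied in at enqueue time */
} msg_t;

typedef struct {
    msg_t   *head;
    msg_t   *tail;
    unsigned pending; /* stands in for the semaphore count */
} msg_queue_t;

/* Producer side: copy the data and queue it, with no processing here. */
int msg_enqueue(msg_queue_t *q, uint8_t type, const uint8_t *pData,
                uint16_t len)
{
    assert(q != NULL);
    msg_t *m = malloc(sizeof(msg_t) + len);
    if (m == NULL) {
        return -1; /* report the drop so the caller can notice it */
    }
    m->next = NULL;
    m->type = type;
    m->dataLen = len;
    if (len > 0) {
        memcpy(m->data, pData, len);
    }
    if (q->tail != NULL) {
        q->tail->next = m;
    } else {
        q->head = m;
    }
    q->tail = m;
    q->pending++; /* wake-up signal for the application task */
    return 0;
}

/* Consumer side: the application task takes the oldest message and frees it. */
msg_t *msg_dequeue(msg_queue_t *q)
{
    assert(q != NULL);
    msg_t *m = q->head;
    if (m == NULL) {
        return NULL;
    }
    q->head = m->next;
    if (q->head == NULL) {
        q->tail = NULL;
    }
    q->pending--;
    return m;
}
```

The wake-up step is the part most likely to matter in a port. The quoted functions post a semaphore, while the Event_pend(syncEvent, ...) line quoted at the start of this thread suggests the ported task waits on an event object. If that is the case (an assumption here), a message enqueued with only a semaphore post would sit in the queue without ever waking the task.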

Are there any comments on this or any other rabbit holes I have gone down?

Thanks,

Dale
