This thread has been locked.

If you have a related question, please click the "Ask a related question" button in the top right corner. The newly created question will be automatically linked to this question.

CC2340R5: OAD Service possible memory alloc/free bug?

Part Number: CC2340R5
Other Parts Discussed in Thread: CC2640

Hey Friends,

In debugging an Persistent App model OAD on Chip project based on the Basic BLE project, I am wondering if the default code in the 7_40 version of the SDK has an unintended issue: 

The code in question is in the bStatus_t OADService_setParameter(oadServiceChar_e srvChar, uint8 len, void *value) function in [oad_service.c] and it appears that the memory allocated for the notification structure

      attHandleValueNoti_t notification;

using this line of code: 

        notification.pValue = GATT_bm_alloc(activeOadCxnHandle, ATT_HANDLE_VALUE_NOTI, len, NULL);

if you examine the error paths in the function there are two paths where this allocated memory (in my case during a BLK SIZE REQUEST from the app) can fail to be returned to the heap.

One is here that might result in a leak:

     pAttr = GATTServApp_FindAttr(oadAttrTbl, GATT_NUM_ATTRS(oadAttrTbl), oadCharVals+srvChar);

     if(pAttr == NULL)
     {
          // If we cannot find the attribute, report an error
         return (ATT_ERR_ATTR_NOT_FOUND);
     }

The other is here, where the wrong pointer is used in the free function:

     status = GATT_Notification(activeOadCxnHandle, &notification, FALSE);

     if (status != SUCCESS)
     {
          // The stack will free the memory for us if the
          // the notification is successful, otherwise we have
          // to free the memory manually
         GATT_bm_free((gattMsg_t *)&notification, ATT_HANDLE_VALUE_NOTI);
     }

In the later case, the call to GATT_bm_free() attempts to pass the address of the local that is created on the stack which the free command most likely avoids action on (and this perhaps results in another leak).

Question: Am I missing something here? Does the local working copy of this source file need to be modified to correct for these two error paths? 

Thanks in advance.

PATH C:\ti\simplelink_lowpower_f3_sdk_7_40_00_64\source\ti\bleapp\services\oad\oad_service.c

  • Hey,

    Good catch for the memory leak in the first path, I reported it to the maintainers. While waiting for a patch I suggest adding GATT_bm_free before the return.

    if(pAttr == NULL)
    {
        // If we cannot find the attribute, report an error
        GATT_bm_free((gattMsg_t *)&notification, ATT_HANDLE_VALUE_NOTI);
        return (ATT_ERR_ATTR_NOT_FOUND);
    }

    In the second path you identified, there is no issue with the GATT_bm_free function because it free the payload of the notification and not the pointer itself. For more details about memory allocation for GATT procedures you can take a look at the following documentation: https://software-dl.ti.com/simplelink/esd/simplelink_lowpower_f3_sdk/7.40.00.64/exports/docs/ble5stack/ble_user_guide/html/ble-stack-5.x/gatt-cc23xx.html#allocating-memory-for-gatt-procedures

    Regards

    Tanguy

  • Thanks! Good to know that GATT_bm_free() handles the payload memory!

  • Another suspect malloc/free leak path is found in OAD_SERVICE.C in the procedure OadWriteAttrCB() at two places (i don't think they are contributing to my issue (which I shall post in a separate thread) but if the ICall_malloc() for the body of the post to BLEAppUtil_invokeFunction() fails, the allocated memory for the pOADSrvWriteReq object is not freed, and similarly, if the call to BLEAppUtil_invokeFunction() fails (say if the queue is full) then both objects are not freed as they should be.

    P.S. My issue is my OAD transfer of FW image blocks fails after after about 128K of the 187K that needs to traverse to the main application via the persistent app. For whatever reason, and we have dug very deep into the stack and surrounding details, and we have made many test/diagnostics builds of our app (the app works fine with a CC2640 sister product, but not the CC2340R5 [SDK 7_40], the process stalls and GATT_Notification() within oad_setParameter() returns 0x16 (blePending) bring an end to the transfer process.

    This is very unexpected as the process is in lock step for each block transferred from the app to the device via the persistent app. Essential the model which is stock from the SDK example (sans one encryption/step), uses a notify to the app to request the next block be sent via the WriteCharacteristic, which in turn generates a call to BLEAppUtil_invokeFunction(), sending the block to the app level for programming into the flash. This lock step method means that with no other BLE operations running concurrently to the OAD download, the internal queues should be essentially drained, and there are no outstanding INdications or steps that could be causing the blePending return status from GATT_Notification().

    If we see a 0x16 in return from Gatt_Notification(), we set a timer to resend the notification for the desired block, but that subsequent call (250mSec later) to GATT_Notfication() also returns a status of 0x16, as will every retry attempt thereafter.

    The issue has onset somewhere after typically 400-600 blocks (236 bytes per block) out of 802 bytes in our FW image for the main ON-CHIP application. But interestingly, the stall doesn't occur at the same point each time.

    Examination of the details in ROV show healthy stacks for each thread, and the heap is a little over 50% allocated, with a deficit of about 99 malloc/frees out of 12K operations, so these seem in line with normal use. 

    Our other CC2640 based device only has about 44K in its OAD update, the CC2340R5 always makes it past the 44K point getting stuck after about 100-130K being transferred. We can see the correct code flashed into the mainapp area up to the point it fails.

    Hence all the looks at memory leak susceptibilities in the API level. Aside from a few little tweaks in oad_service.c and oad_profile.c to add some diagnostic counters (which have not revealed obvious resource issues) these are the example source versions provided in SLP SDK 7_40.     

    Haven't yet figured out how ROV can be used to check the queue health (some instructions needed there folks), but we have operated the transfer process in slow-motion (adding delay in the app side after the notification) and that has not made an impact.

    My thinking is the team there has not tried a large FW image over OAD, but kept things quick with smaller images (like < 80KB) and has not seen this issue.

    Your thoughts?

  • Hello Stuart,

    Thanks for reaching out and providing valuable insights.

    Please allow me today to go through it and return back to you.

    BR,

    David.

  • Hello Stuart,

    Thanks. I see the issue is being addressed by the R&D team based on what has been posted on the other thread.

    As for this other potential issue, in case the ICall_malloc() function fails then it will not allocate mem and return NULL, therefore no need to free. The two allocations you are referring too are ICall_malloc() inside the condition  if(oadWriteCB != NULL) correct?

    BR,

    David.