This thread has been locked.

If you have a related question, please click the "Ask a related question" button in the top right corner. The newly created question will be automatically linked to this question.

RTOS/BLE-STACK: MSP432 SimpleLink Bluetooth Plugin memory leaks and error handling

Part Number: BLE-STACK

Tool/software: TI-RTOS

Hi,

I'm porting some code from an MSP432P401R to an MSP432P401M so my SRAM has reduced from 64KB to 32KB. I avoid dynamic memory allocation in my code, but it's used extensively throughout your drivers and the bluetooth stack, previously I allocated 16KB for the heap, but no longer have room for that so have had to reduce it to 7 KB. This isn't super relevant, but the point is that simplelink_msp432_sdk_bluetooth_plugin_1_25_00_42 has inadequate error handling and will leak memory if initialisation fails

Relevant config: I'm using UART transport with sapParams.port.remote.stackSize = 2048.

There's a few different cases that I will go through, but the relevant function that's called from my code is always SAP_open.

Going through its code, it initially does default parameter, callback configuration. None of this requires cleanup if it fails, so that's fine. Then SNP_open is called which doesn't allocate memory, but does require that SNP_close is called to destroy its semaphores. So if the passed in portType is invalid, we return an error without calling SNP_close. This doesn't leak memory or infinitely loop, but is still not ideal. Eventually we get into NPITask_open which is where most of the issues happen (npi_task_posix.c).

We have the same sem_init call that'll never be destroyed, but the main issues are that npiTxQueuenpiRxQueuenpiSyncRxQueuenpiSyncTxQueue and npiPacketQueue are all opened (which does an allocation) without checking the result. So some of these could have been allocated, while others may not have. A little further down, if npiTaskStack fails to be allocated, then we've just leaked several kilobytes of RAM that we'll never get back. We could potentially call SAP_close even if SAP_open fails, but that will be very error prone because it assumes that SAP_open was successful, and if for example one of the queues failed to allocate, then we'll try to call mq_free on -1 which then tries to call Memory_free on it, which will cause all sorts of issues. So, basically, NPITask_open desperately needs error handling and to free any allocated resources if it fails. Also, it seems that npiTaskStack is completely redundant and never even used? pthread_attr_setstacksize will cause pthread_create to allocate the stack anyway, so we've just wasted some more RAM for no reason at all.

Moving onto NPITL_openTL we get both memory leaks (npiRxBufnpiTxBuf), potential memory corruption if one of the buffers fails to allocate (in reality it'll be trying to overwrite 0x00000000 which will be a no-op because it's flash, but still not good practice) and infinite loops within transportOpentransportOpen is aliased to NPITLUART_openTransport for me, so off to there I go.

There's two more queues that are potentially leaked (uartQueueRecuartQueueSend) but more annoyingly, the "error handling" for pthread calls is to simply infinitely loop. I'm writing firmware for a battery operated device where bluetooth is mainly a convenience for configuration. If it fails, that's a shame and I guess I could reset if worst came to worst. However, if the initialisation never actually returns and prevents low power modes, I'm going to run out of battery fairly quickly which makes the entire device useless. 

Sorry for the long winded post, but can we please get some error handling within this supposedly release quality code? I haven't listed all of the leaks, etc. but I feel like I've illustrated my point. I'm already compiling it myself, so I'm going to go through and fix issues that affect me, but I can't be the only person bothered by this. At a minimum can we get:

  1. Return value for NPITLUART_openTransport that indicates error (after cleaning up any allocated resources). Presumably there's the same issues with SPI.
  2. Resource cleanup in NPITask_open and SAP_open.
  3. Guarantees when SAP_open doesn't return SNP_SUCCESS that we at least haven't leaked any resources.
  4. Ability to configure queue depths so we can adjust the memory requirements for more resource constraint devices. I wonder if it makes sense to just put _npiFrame_t on the queues in npi_task_posix.c instead of pointers, to save 4x16x4 = 256 bytes.

  • Campbell,
    Thanks for your excellent, detailed post alerting us to these needed improvements. I'll put the request in to our software development team to address these issues.

    Thanks,
    Bob L.
  • Campbell,

    Thank you for bringing these issues to our attention. I am working on resolving them now. Please expect a release early next month (April).

    Thanks and Best Regards,
  • Thanks for the speedy response, I look forward to the next release :)

    While you're looking at this code is it also possible to ensure the following is also fixed in the next release?

    e2e.ti.com/.../660592

    I believe there's also a few issues with how threads are closed, for example NPITLUART_closeTransport does pthread_join to wait for the uart thread, but this doesn't work if the thread hasn't attached yet (e.g. SAP_open, then immediately do SAP_close, yield execution in calling thread then uartThread is still scheduled to run and will dereference structures that were released). You need the fixes in my above link before you can properly open/close SAP and notice the issues.

    Would be good to test this a bit as there's probably more issues that I haven't come across yet, but I appreciate the fast response and ongoing work, so thank you!

**Attention** This is a public forum