
CC2640R2F: Increase data transfer rate in BLE

Part Number: CC2640R2F


Hi,

I want to receive serial data from Host Test, which is connected via BLE to a peripheral device that samples an A/D at 1 kHz and sends BLE data to Host Test whenever a 100-word data buffer fills up, which happens every 0.1 sec. Host Test is connected to the PC and is supposed to transfer the data over UART. Transferring one buffer takes about 1 sec, which is roughly 0.2 KByte/sec and much slower than the BLE data transfer rate. This is also the case when I tried the transfer using BTool. How can I increase the data transfer rate?
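
For reference, a rough throughput budget (a sketch assuming 16-bit samples, i.e. 2 bytes per word):

    /* Rough throughput budget (assumption: 1 word = 2 bytes). */
    #define SAMPLE_RATE_HZ   1000u   /* A/D sampled at 1 kHz                   */
    #define BYTES_PER_WORD   2u
    #define BUFFER_WORDS     100u    /* 200-byte buffer, full every 0.1 sec    */

    /* Required sustained rate: 1000 * 2 = 2000 bytes/sec (about 2 KByte/sec), */
    /* i.e. roughly 10x the 0.2 KByte/sec currently observed.                  */
    #define REQUIRED_BYTES_PER_SEC   (SAMPLE_RATE_HZ * BYTES_PER_WORD)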

Bests

  • Hi Ke,

    I increased the UART baud rate to 921600 and BLE_CONN_Interval to 8. Now reading 200 bytes of data takes 0.2 sec, which is about 1 KByte/sec. Is that the maximum achievable speed?

  • TI claims about 11 kbps throughput out of the box, which is approximately 1 KByte/sec, so your result seems acceptable. But that is far from the 1 Mbps BLE on-air rate; it seems the example needs to be optimized further for better throughput.

    Some things you could experiment with:
    1) Send the notification directly from the app task as soon as data is available, instead of going through the profile and being limited by the connection event interval.
    2) Use DMA to move data.
    3) Use a ring buffer to buffer data in the app instead of using a List (a minimal sketch follows below).
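
    For point 3, a minimal single-producer/single-consumer ring buffer sketch (plain C, assuming a power-of-two size; the producer would run in the ADC/data callback and the consumer in the app task):

    #include <stdint.h>
    #include <stdbool.h>

    #define RB_SIZE 1024u                      /* must be a power of two */

    typedef struct {
        uint8_t buf[RB_SIZE];
        volatile uint16_t head;                /* written by the producer */
        volatile uint16_t tail;                /* written by the consumer */
    } RingBuf_t;

    static bool RingBuf_put(RingBuf_t *rb, uint8_t byte)
    {
        uint16_t next = (rb->head + 1) & (RB_SIZE - 1);
        if (next == rb->tail) {
            return false;                      /* full: drop or count the overflow */
        }
        rb->buf[rb->head] = byte;
        rb->head = next;
        return true;
    }

    static bool RingBuf_get(RingBuf_t *rb, uint8_t *byte)
    {
        if (rb->tail == rb->head) {
            return false;                      /* empty */
        }
        *byte = rb->buf[rb->tail];
        rb->tail = (rb->tail + 1) & (RB_SIZE - 1);
        return true;
    }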

    How did you measure the speed? I could repeat your experiment if you provide your method. At the moment I can only use two CC2640R2F boards to run the simpler Serial Socket example successfully, and I do not know what the out-of-the-box throughput is.

  • Does out of the box mean data straight out of the LaunchPad kits?

    I have two LAUNCHXL-CC2640R2 boards. I have defined three characteristics (Notification, Buffer Count, and DataValue (200 bytes)).

    I implemented HCI in Python, and I measure the time whenever I enter/exit the function that reads the Host Test data (the 200-byte characteristic data only, ignoring Notification & Buffer Count for now). That's how I know how long it takes to flush 200 bytes to the PC, but I cannot figure out what causes the delay (the BLE or the UART transfer).

    Could you please explain these parts a little bit more? I don't know what you mean by sending the Notification from the app task directly, or by using DMA to access the data:

    Send the notification directly from the app task as soon as data is available, instead of going through the profile and being limited by the connection event interval. Use DMA to move data. Use a ring buffer to buffer data in the app instead of using a List.

  • Out of the box means using the example project without modification: fill in data through the server's UART input, send it from server to client, and measure the receive speed at the client's UART output.

    To measure the BLE throughput itself, it seems you should fill data from an SoC buffer directly into the server notification and measure the arrival speed at the client notification, so the example needs to be modified. To make such a modification, use DMA to move data from an SoC heap buffer into the notification packet on the server side, and use DMA to move data from the notification packet into an SoC heap buffer on the client side. On the server side, pack the notification and call GATT_Notification directly in the app task to send it, without using the UART on either side for the transmission.

  • I tested transferring data from Project Zero, and it took 0.15 sec to receive the 80 bytes of "It's pretty a long sentence! Isn't it!", which is about 500 Bytes/sec. So the problem is not with the modification I made in the code.

    Could you please explain more about how to use DMA for this data transport? I couldn't find relevant info using those keywords.

  • Hi Ke,

    Could you please verify that I got the points you mentioned correctly?

    To increase the transfer rate you suggested two different strategies:
    1) Transfer data using BLE notifications (not indications), which do not need an acknowledgement from the client and therefore speed up the transfer rate.
    2) Use DMA to transfer data between the notification buffer (which I guess is on the ARM Cortex-M0?) and the heap buffer (which I guess is on the ARM Cortex-M3?). This should be done on both the client (Host Test) and server (Project Zero) devices.


    If that makes sense, then to implement the second part, I found this code that transfers data on the server (Project Zero):
    memcpy(ADCBuffService_ADCBuffValVal, value, len);
    Should I implement the memory-to-memory data transfer using uDMA instead of the memcpy function?
  • A test with memcpy shows the BLE notification output speed can reach 28 KByte/sec at the server; BLE5 can reach 56 KByte/sec. The function used for the test:

    static uint16_t
    transmitNoti (uint8_t *pData, uint16_t len)
    {
      static attHandleValueNoti_t sssNoti;
      sssNoti.handle = connHandle;
      static uint16_t allocLen;
      static bStatus_t ret;
      for (uint16_t offset = 0; offset != len;)
        {
          ret = SUCCESS;
          allocLen = len - offset;
          if (allocLen > 62)
            allocLen = 62;
          sssNoti.len = 0;
          sssNoti.pValue =
            (uint8 *) GATT_bm_alloc (sssConnHandle, ATT_HANDLE_VALUE_NOTI,
                                     allocLen, &sssNoti.len);
          if (sssNoti.pValue)
            {
              memcpy (sssNoti.pValue, pData + offset, sssNoti.len);
              ret = GATT_Notification (sssConnHandle, &sssNoti, FALSE);
              if (ret == SUCCESS)
                offset += sssNoti.len;
              else
                GATT_bm_free ((gattMsg_t *) & sssNoti, ATT_HANDLE_VALUE_NOTI);
            }
    //       else
    //     ret = bleMemAllocError;
        }
      return len;
    }
    

    Two modifications are needed. Modification 1, in the incoming data callback:

    static uint16_t cntN;
    static void
    SimpleStreamServer_incomingDataCB (uint16_t connHandle,
                                       uint8_t paramID, uint16_t len,
                                       uint8_t * data)
    {
      if ( (len > 2) && (*data == 's'))
        { 
          cntN = *(data + 1);
          SimpleSerialSocketServer_enqueueMsg (SSSS_SPEED_TEST_EVT, 0, NULL, *(data + 1));
        }
    }
    

    Modification 2: define a new event ID SSSS_SPEED_TEST_EVT and its case:

        case SSSS_SPEED_TEST_EVT:
         {
           if(cntN == pMsg->arg0)
             tstamp[0] = AONRTCCurrentCompareValueGet ();
    //z        for(uint8_t i = 0; i < pMsg->arg0; i++)
               transmitNoti ((uint8_t *) 0x20000000, 1024);   /* 0x20000000 = base of SRAM */
               cntN--;
           if(pMsg->arg0 - cntN == 100)
             {
               tstamp[1] = AONRTCCurrentCompareValueGet ();
               uint32_t t = tstamp[1] - tstamp[0];
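               /* AONRTCCurrentCompareValueGet() returns 16.16 fixed-point seconds,
                * so t is the elapsed time in units of 1/65536 s. 100 transfers of
                * 1024 bytes = 100 KByte, so (100 << 24) / t is KByte/sec in 8.8
                * fixed-point: si = integer part, sf = fraction in 1/256 units. */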
               uint32_t b = 100 <<24;
               uint16_t s = b / t;
               uint16_t si = s >> 8;
               uint16_t sf = s & 0xFF;
               int fraction = (int)((double)sf * 100 / 256);  // Get 2 decimals
               static uint8_t printAddress[sizeof("SSSS SPEED: ") + 5 + sizeof("KByte/sec\n\r")];
               uint8_t len = sprintf(printAddress, "SSSS SPEED: %d.%02u KByte/sec\n\r", si, fraction);
               UART_write(uartHandle, printAddress, len);
             }
           else
             SimpleSerialSocketServer_enqueueMsg (SSSS_SPEED_TEST_EVT, 0, NULL, pMsg->arg0);
           break;
         }
    

    And the speed can reach 10 KByte/sec at the client with the standard 115200 bps UART configured. The UART receive speed can reach 25 KByte/sec with a 300000 bps UART.

    Run the python script below on the PC the client is connected to, after the client has connected to the server. Use your own serial device name and baud rate if they differ from 115200 bps and /dev/ttyUSB1:

    import serial
    import time
    
    ser = serial.Serial('/dev/ttyUSB1', 115200, timeout=60)
    # skip the banner lines the client prints at startup
    for i in range(0, 7):
        line = ser.readline()
    # 's', 0x65, 0x00: triggers SSSS_SPEED_TEST_EVT on the server (cntN = 0x65)
    ser.write(b'\x73\x65\x00')
    while True:
        data = ser.read(1)          # wait for the first byte before starting the timer
        t0 = time.time()
        data = ser.read(10240)      # time the next 10240 bytes (10 KByte)
        t1 = time.time()
        print("BR = %f KByte/sec" % (10.00 / (t1 - t0)))
    

    The DMA improvement turned out to be almost negligible.

  • ke fan said:
    1Mbps BLE speed



    I believe this is the on-air symbol rate, which your application cannot achieve because some of the bits make up non-data parts of the wireless packets (outside the PDU payload), such as the header, CRC, etc. Only a fraction of the bit rate carries actual data.

    Reyhaneh Bakhtiari said:
    I don't know what you mean by sending the Notification from the app task directly, or by using DMA to access the data?


    Notifications are a way to send data as opposed to reads. Consider what must happen during a read:

    Client: "Can I have data1"
    Server: "here is data1"
    Client: "I have received data1"
    Client: "Can I have data2"
    Server: "here is data2"
    Client: "I have received data2"

    Two out of three packets carry no data. With notifications, the exchange goes:

    Client: "Subscribe to notification"
    Server: "here is data1"
    Server: "here is data2"
    Server: "here is data3"
    Server: "here is data4"
    ...

    Notifications are a way of exchanging data at the protocol level; they have nothing to do with DMA. Also, you need more than 80 bytes to see the real data rate: 80 bytes can be sent in one single packet, which does not take advantage of notifications.

    Reyhaneh Bakhtiari said:
    BLE_CONN_Interval to 8


    The peripheral can only make suggestions for the connection parameters. The central has final say, and may reject requests.

    The connection parameters matter in other ways too. I believe up to 4 notifications can be sent per connection interval, versus only 1 read request. Look into the standard for authoritative numbers.
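
    As a rough illustration of the "suggestion" part: the peripheral can request new parameters, but the central decides. A sketch assuming the GAPRole peripheral API from the CC2640R2 BLE-Stack SDK (check the exact function and constant names against your SDK version):

    #include "peripheral.h"   /* GAPRole peripheral API (SDK dependent) */

    static void requestFasterConnection(void)
    {
        /* Connection interval is in 1.25 ms units: 8 => 8 * 1.25 ms = 10 ms. */
        /* The peripheral only asks; the central may accept or reject.        */
        bStatus_t status = GAPRole_SendUpdateParam(8,    /* min interval         */
                                                   8,    /* max interval         */
                                                   0,    /* slave latency        */
                                                   200,  /* timeout, 10 ms units */
                                                   GAPROLE_NO_ACTION);
        (void)status;         /* check for SUCCESS in real code */
    }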

  • The test result calculated in the code is the payload speed at each end, without accounting for data loss/errors, so there may be some errors. And since the BLE notification speed is faster than the UART receive speed, there must be data loss.
    The notification payload is 62 bytes with a 65-byte MTU.
    Each transmission sends a 1024-byte payload continuously, the transmission is repeated 100+ times, and the speed is calculated for sending 102400 bytes. The UART side calculates the speed for receiving 10240 bytes, 10 times. Since there is data loss, more bytes have to be sent for the UART side to receive a total of 102400 bytes.
    GATT_Notification might be replaced by ATT_HandleValueNoti for a better result. DMA might be used to transfer data from SRAM to Noti.pValue instead of memcpy for a further speed gain.
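
    For reference, the payload/MTU relationship and a rough ceiling (the 3 bytes are the ATT opcode plus attribute handle of a notification; the packets-per-event figure is an assumption):

    /* Notification payload = ATT_MTU - 3 (1-byte opcode + 2-byte attribute handle). */
    #define APP_ATT_MTU        65
    #define NOTI_PAYLOAD_LEN   (APP_ATT_MTU - 3)   /* = 62 bytes */

    /* Rough ceiling, assuming 4 notifications per 10 ms connection event:        */
    /*   4 * 62 bytes / 0.010 s = 24800 bytes/sec, about 24 KByte/sec,            */
    /* which is in the same ballpark as the 28 KByte/sec measured at the server.  */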

  • Hi,
    I used ble5_host_test_cc2640r2lp_app for the client and connect to it through HCI implemented in Python.
    How can I insert these functions into that code?

    Where can I find SimpleSerialSocketServer (which seems to be the project you modified)?
  • Hi Reyhaneh,

    You can find the Simple Serial Socket examples here:  

  • Hi Ke,

    Could you please help me with these errors:
    1) "sprintf" redeclared with incompatible type: null: symbol "sprintf" redeclared with incompatible type: simple_serial_socket_server_cc2640r2lp_app C/C++ Problem

    2) function "AONRTCCurrentCompareValueGet" declared implicitly simple_serial_socket_server.c /simple_serial_socket_server_cc2640r2lp_app/Application line 1067 C/C++ Problem

    3) #20 identifier "sssConnHandle" is undefined simple_serial_socket_server.c /simple_serial_socket_server_cc2640r2lp_app/Application line 1268 C/C++ Problem

    Thanks a million,
  • 1. sprintf needs SHOW_BD_ADDR to be defined; just look at the beginning of the source code and you will see this.
    2. AONRTCCurrentCompareValueGet needs #include <driverlib/aon_rtc.h>.
    3. Try using connHandle instead of sssConnHandle first, but check that connHandle is not 0xFFFF; only when it is 0 is a client connected to the server and data can be sent and received.
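
    Pulled together, the three fixes might look roughly like this in simple_serial_socket_server.c (a sketch based on the points above; sendIfConnected is just an illustrative helper, not SDK code):

    /* 1. Enable the code path that declares sprintf/printAddress. */
    #define SHOW_BD_ADDR

    /* 2. Declare AONRTCCurrentCompareValueGet(). */
    #include <driverlib/aon_rtc.h>

    /* 3. Use the existing connHandle; 0xFFFF means "not connected". */
    static void sendIfConnected(uint8_t *pData, uint16_t len)
    {
        if (connHandle != 0xFFFF)
        {
            transmitNoti(pData, len);
        }
    }
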
  • Thanks a lot, this is such neat code. I could get to 49 KByte/sec with it.

    Using notifications in my own implementation (Host_test/Project_zero), I could transfer data at a rate of 1.8 KByte/sec, which is almost half of what I need.

    I'd like to know what the differences are between simple_serial_client and host_test that result in such a huge difference in their data transfer. Part of it is due to the ATT header info included in the notifications received over HCI-UART, which is about 1/3 of all the data, but there is more to it than that.

    I'm wondering whether it makes sense to switch to simple_serial_client/server, or whether the speed in Host_test/project_zero can be increased as well.
  • Thanks for your explanation.

    Using DMA is suggested based on this discussion:

    e2e.ti.com/.../2822993

    use DMA to move data from the notification packet into an SoC heap buffer at the client side. At the server side, pack the notification and call GATT_Notification directly in the app task to send it

  • Hi Reyhaneh,

    If you're just looking to receive data over BLE you can use simple_serial_socket client/server. If you need the Host commands then you will have to use Host test.

    One big difference is that host test uses an NPI (network processor interface) layer which adds to the overhead of each packet (yet this layer is necessary).

    How are you measuring the throughput in each case?

    Can you look at the fundamental parameters of the UART driver implementation to see if there is something to gain (e.g. baud rate, buffer size)?
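
    For example, the baud rate is set through the TI Drivers UART parameters when the UART is opened. A sketch (Board_UART0 comes from the board file and may be named differently in your project):

    #include <ti/drivers/UART.h>

    static UART_Handle openFastUart(void)
    {
        UART_Params uartParams;

        UART_Params_init(&uartParams);
        uartParams.baudRate      = 921600;            /* up from the default 115200     */
        uartParams.readDataMode  = UART_DATA_BINARY;  /* raw bytes, no newline handling */
        uartParams.writeDataMode = UART_DATA_BINARY;
        uartParams.readMode      = UART_MODE_BLOCKING;

        return UART_open(Board_UART0, &uartParams);
    }
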
  • Hi Marie,

    Thanks for your explanation, it totally makes sense. I do not really need the HCI commands; just connecting to the server and streaming ADC data sampled at 1 kHz with an acceptable refresh rate is good enough. After acquiring 100 ADC samples (0.1 sec), an interrupt is generated that splits the ADC buffer into 11 notification packets of 19+1 bytes (1 byte is kept for the data buffer counter).
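
    In other words, the packetization is roughly the following (a sketch assuming 16-bit samples; sendNotification stands in for whatever hands the packet to the characteristic):

    #include <stdint.h>
    #include <string.h>

    #define ADC_SAMPLES    100                    /* samples per buffer (0.1 sec @ 1 kHz) */
    #define ADC_BYTES      (ADC_SAMPLES * 2)      /* = 200 bytes of 16-bit samples        */
    #define PKT_DATA_LEN   19                     /* data bytes per notification          */
    #define PKT_COUNT      ((ADC_BYTES + PKT_DATA_LEN - 1) / PKT_DATA_LEN)   /* = 11      */

    /* Each notification: [counter (1 byte)] [up to 19 ADC bytes] => 20-byte payload. */
    static void sendAdcBuffer(const uint8_t *adcBytes)
    {
        uint8_t pkt[PKT_DATA_LEN + 1];
        for (uint8_t i = 0; i < PKT_COUNT; i++)
        {
            uint16_t offset = (uint16_t)(i * PKT_DATA_LEN);
            uint16_t chunk  = (uint16_t)((ADC_BYTES - offset > PKT_DATA_LEN) ? PKT_DATA_LEN
                                                                             : ADC_BYTES - offset);
            pkt[0] = i;                           /* data buffer counter */
            memcpy(&pkt[1], &adcBytes[offset], chunk);
            sendNotification(pkt, (uint16_t)(chunk + 1));   /* hypothetical helper */
        }
    }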

    In my Python script, I start measuring elapsed time as soon as I start reading the incoming UART data. It takes about 0.1 sec to flush 6 notification packets, which means I always miss about 40% of the data (it is always the last 4-5 notifications that are lost).

    The UART baud rate is set to 921600, and I don't know how to get/set the UART buffer size.

    Thanks again,

    Reyhaneh

  • Hi Reyhaneh,

    Did you use a logic analyzer to see whether all the data goes over the UART line (to see whether it's lost on the CC2640R2 side or on the laptop side)?
  • Hi Marie,
    We just ordered a logic analyzer and will be able to provide detailed info once I get it.
    Thanks for your help,
    Reyhaneh

  • With both ends using UART, the UART is a critical bottleneck for the end-to-end throughput.
    With both ends using a 32-byte ring buffer, the end-to-end throughput is around 1 KByte/sec.
    With both ends using a 1024-byte ring buffer and optimized flow control, the end-to-end throughput can reach 12 KByte/sec.
    The ring buffer size can be changed in the CC2640R2_LAUNCHXL.c file:

    uint8_t uartCC26XXRingBuffer[CC2640R2_LAUNCHXL_UARTCOUNT][1024];

    Directly operating the UART via registers or driverlib to send and receive data, instead of using the TI-RTOS driver, might improve the throughput.

    The SSS stream is not robust; it easily suffers data loss or errors. Other measures are needed for data integrity (see the sketch below).
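
    One simple option (a sketch, not part of the example): prepend a sequence number and append a checksum to each notification payload so the receiver can detect dropped or corrupted packets.

    #include <stdint.h>
    #include <string.h>

    /* Frame layout: [seq (1)] [len (1)] [payload (len)] [checksum (1)].
     * The caller must provide a frame buffer of at least len + 3 bytes. */
    static uint8_t txSeq;

    static uint16_t buildFrame(uint8_t *frame, const uint8_t *payload, uint8_t len)
    {
        uint8_t sum = 0;

        frame[0] = txSeq++;                        /* receiver detects gaps in this counter */
        frame[1] = len;
        memcpy(&frame[2], payload, len);

        for (uint16_t i = 0; i < (uint16_t)(len + 2); i++)
        {
            sum += frame[i];                       /* simple additive checksum */
        }
        frame[len + 2] = (uint8_t)(0x100 - sum);   /* all frame bytes now sum to 0 mod 256 */

        return (uint16_t)(len + 3);                /* total frame length */
    }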