CC2642R: Peripheral disconnects with LL Response Timeout

Part Number: CC2642R
Other Parts Discussed in Thread: BLE-STACK

Hi.

We are having problems holding longer connections to some Android devices. The connection abruptly ends after about 900 connection events (about 40 s with the default Android 45 ms interval, and about 20 min with our preferred 145 ms interval and latency of 10). The disconnect reason given by the TI stack is LL_STATUS_ERROR_LL_TIMEOUT (0x22). After searching through the forums, I tried disabling Data Length Extension (DLE) in the local feature set, after which the connections seem to hold. I've attached sniffer traces for both cases, where the peripheral is running SimplePeripheral from SDK 6.20.00.29 (with slight modifications to disable DLE, use public addresses and disable the connection parameter request).

traces.zip

I'm looking for answers to the following questions:

1. Based on the results of the forum search, the current assumption is that the BLE stack of the Android devices is faulty. Since disabling DLE in the local feature set seems to make it work, the problem seems to be connected to the LL_LENGTH_REQ sent by the peripheral, which the Android device answers with an LL_REJECT_EXT_IND. Can you point out the error here? Is LL_REJECT_EXT_IND not a valid response to an LL_LENGTH_REQ? I've tried looking through the BLE Specification, but was unable to find a definite answer as to which responses are valid here. This would help in communication with the supplier of the Android devices.

2. In the working trace the central still sends an LL_LENGTH_REQ, which the peripheral answers with bounds of 251 octets / 2120 µs. Does this mean that the connection can still profit from larger payloads although the DLE bit in the local feature set is cleared? If yes, this would make this approach a valid workaround.

3. In the BLE Specification the LL response timeout is given as 40s (Vol 6, Part B, Section 5.2). Why does the timeout change with the connection parameters in our case?
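
As a rough cross-check of the timings behind question 3, the sketch below converts a count of 900 connection events into elapsed time for both parameter sets. It assumes (this is not confirmed by the traces) that with a latency of 10 only every 11th connection event is counted, since the peripheral may skip the others. The function name is made up for illustration.

```python
def elapsed_seconds(events, interval_ms, latency=0):
    """Time spanned by `events` counted events when only every
    (latency + 1)-th connection event is counted (an assumption)."""
    return events * (latency + 1) * interval_ms / 1000.0

default_s = elapsed_seconds(900, 45)          # Android default interval
preferred_s = elapsed_seconds(900, 145, 10)   # preferred interval, latency 10

print(f"45 ms interval: {default_s:.1f} s")
print(f"145 ms interval, latency 10: {preferred_s / 60:.1f} min")
```

Under that assumption the first case lands at 40.5 s and the second at roughly 24 minutes, which is in the same range as the observed values.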

Sincerely,

Alexander Wenner

  • Hey Alexander,

    Thanks for posting on the e2e forums. We've received the logs and data and will take a look and follow up with our comments.

  • Hi Alexander,

    You mentioned that the issue occurs on some devices. Does this mean you have some devices where the problem does not occur? Could you provide a list of devices (along with their OS versions) that exhibit this issue and those that do not?

    Best Regards,

    Jan

  • Hi.

    The issue only presents itself with a WF1032T tablet running Android 10 we recently acquired for prototyping. Apparently the manufacturer recently upgraded to a new wireless stack that supports BLE 5.0. With all other Android devices we have tested so far, even an older WF1032T, the issue does not present itself. This again leads me to believe that the BLE Stack on the new tablet is at fault here.

    I'm looking forward to your answers to my questions above.

    Sincerely,

    Alexander Wenner 

  • Hi Alexander,

    Got it. Thank you for the information. Please see my answers to your questions below:

    1. Based on the results of the forum search, the current assumption is that the BLE stack of the Android devices is faulty. Since disabling DLE in the local feature set seems to make it work, the problem seems to be connected to the LL_LENGTH_REQ sent by the peripheral, which the Android device answers with an LL_REJECT_EXT_IND. Can you point out the error here? Is LL_REJECT_EXT_IND not a valid response to an LL_LENGTH_REQ? I've tried looking through the BLE Specification, but was unable to find a definite answer as to which responses are valid here. This would help in communication with the supplier of the Android devices.

    When an LL_LENGTH_REQ PDU is received, the link layer must respond with an LL_LENGTH_RSP PDU even if DLE is not supported. The response should contain the parameters that the device supports. Information on this can be found in Section 5.1.9, Data Length Update procedure, of the Bluetooth Core Specification 5.2.

    The LL_REJECT_EXT_IND has an ErrorCode field in its CtrData. I am unable to parse the provided traces as BLE packets on my end. The [Vol 1] Part F, Controller Error Codes section of the Bluetooth Core Specification contains a list of all the error codes that could be generated here as well as their purpose.

    One of the reasons an LL_REJECT_EXT_IND (or LL_REJECT_IND) can be sent is a procedure collision. Section 5.3, Procedure Collisions, of the specification may provide further insight here.

    2. In the working trace the central still sends an LL_LENGTH_REQ, which the peripheral answers with bounds of 251 octets / 2120 µs. Does this mean that the connection can still profit from larger payloads although the DLE bit in the local feature set is cleared? If yes, this would make this approach a valid workaround.

    I am unable to parse the provided log as BLE packets. For some reason, the packets show up as UDP packets (see image below). Do you have access to an Ellisys Bluetooth LE sniffer? If so, could you provide equivalent captures taken with the Ellisys? They would give us a lot of valuable insight into what may be going on.

    If DLE is not enabled on the peripheral side, then I would expect the peripheral to limit its request to 27 bytes instead of 251. Is the DLE feature marked as disabled in the feature request packets?
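
    For reference, the 27-octet and 251-octet payload limits correspond to maximum transmit times of 328 µs and 2120 µs. The sketch below reproduces that arithmetic. It assumes the uncoded LE 1M PHY (8 µs per octet) and a packet with room for a 4-octet MIC; the constant and function names are illustrative only.

```python
# Fixed per-packet overhead on the LE 1M PHY, in octets.
PREAMBLE = 1
ACCESS_ADDRESS = 4
HEADER = 2
MIC = 4            # included in the quoted maximum transmit times
CRC = 3
US_PER_OCTET = 8   # 1 Mbit/s means 8 microseconds per octet

def max_tx_time_us(payload_octets):
    """Air time of one data packet carrying `payload_octets` of payload."""
    overhead = PREAMBLE + ACCESS_ADDRESS + HEADER + MIC + CRC
    return (overhead + payload_octets) * US_PER_OCTET

legacy_us = max_tx_time_us(27)    # 328
dle_us = max_tx_time_us(251)      # 2120
print(legacy_us, dle_us)
```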

    3. In the BLE Specification the LL response timeout is given as 40s (Vol 6, Part B, Section 5.2). Why does the timeout change with the connection parameters in our case?

    Can you identify the packet that is not responded to in time? Is it the LL_LENGTH_REQ packet?

    Best Regards,

    Jan

  • Hi Jan,

    thank you for the in-depth answers.

    Sadly, I do not have access to an Ellisys Bluetooth LE sniffer. However, the traces were generated with TI tools following these instructions. As I understand it, the SmartRF Sniffer Agent wraps the BLE packets in network packets for some reason. With the proper plugins (see instructions), Wireshark is able to parse the wrapped BLE packets just fine.

    Frame 35: 59 bytes on wire (472 bits), 59 bytes captured (472 bits) on interface 0
    Internet Protocol Version 4, Src: 192.168.1.3, Dst: 192.168.1.3
    User Datagram Protocol, Src Port: 17760, Dst Port: 17760
    TI Radio Packet Info
    TI BLE Packet Info
        Connection Event: 16
        Info: 0x01
            .... ..01 = Direction: Master -> Slave (0x1)
    Bluetooth Low Energy Link Layer
        Access Address: 0x6cca75c6
        [Master Address: c4:3c:b0:0a:ff:82 (c4:3c:b0:0a:ff:82)]
        [Slave Address: TexasIns_31:cc:a8 (80:6f:b0:31:cc:a8)]
        Data Header: 0x0303
            .... ..11 = LLID: Control PDU (0x3)
            .... .0.. = Next Expected Sequence Number: 0
            .... 0... = Sequence Number: 0
            ...0 .... = More Data: False
            000. .... = RFU: 0
            Length: 3
        Control Opcode: LL_REJECT_IND_EXT (0x11)
        Reject Opcode: LL_LENGTH_REQ (0x14)
        Error Code: Different Transaction Collision (0x2a)
        CRC: 0x68ea50
            [Expert Info (Note/Checksum): CRC unchecked, not all data available]
                [CRC unchecked, not all data available]
                [Severity level: Note]
                [Group: Checksum]
    

    The error code in the LL_REJECT_EXT_IND seems to indicate a transaction collision.
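
    To double-check the dissection by hand, the five bytes of frame 35 (data header 0x0303 followed by opcode, rejected opcode and error code) can be decoded with a few lines of Python. This is only a sketch: the lookup tables contain just the values relevant to this thread, and the function name is made up.

```python
CONTROL_OPCODES = {
    0x11: "LL_REJECT_EXT_IND",
    0x14: "LL_LENGTH_REQ",
    0x15: "LL_LENGTH_RSP",
}
ERROR_CODES = {
    0x22: "LL Response Timeout",
    0x23: "LL Procedure Collision",
    0x2A: "Different Transaction Collision",
}

def parse_reject_ext_ind(pdu: bytes):
    """Decode an LL_REJECT_EXT_IND from data header + payload (no CRC)."""
    llid = pdu[0] & 0x03          # lowest two bits of the first header byte
    length = pdu[1]               # payload length in octets
    if llid != 0x03:
        raise ValueError("not an LL control PDU")
    payload = pdu[2:2 + length]
    if length != 3 or payload[0] != 0x11:
        raise ValueError("not an LL_REJECT_EXT_IND")
    return {
        "opcode": CONTROL_OPCODES[payload[0]],
        "rejected": CONTROL_OPCODES.get(payload[1], hex(payload[1])),
        "error": ERROR_CODES.get(payload[2], hex(payload[2])),
    }

frame35 = bytes([0x03, 0x03, 0x11, 0x14, 0x2A])
decoded = parse_reject_ext_ind(frame35)
print(decoded)
```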

    Frame 6: 65 bytes on wire (520 bits), 65 bytes captured (520 bits) on interface 0
    Internet Protocol Version 4, Src: 192.168.1.3, Dst: 192.168.1.3
    User Datagram Protocol, Src Port: 17760, Dst Port: 17760
    TI Radio Packet Info
    TI BLE Packet Info
        Connection Event: 1
        Info: 0x02
            .... ..10 = Direction: Slave -> Master (0x2)
    Bluetooth Low Energy Link Layer
        Access Address: 0x76e629b6
        [Master Address: c4:3c:b0:0a:ff:82 (c4:3c:b0:0a:ff:82)]
        [Slave Address: TexasIns_31:cc:a8 (80:6f:b0:31:cc:a8)]
        Data Header: 0x091b
        Control Opcode: LL_FEATURE_RSP (0x09)
        Feature Set: 0x00000000000059dd
            .... ...1 = LE Encryption: True
            .... ..0. = Connection Parameters Request Procedure: False
            .... .1.. = Extended Reject Indication: True
            .... 1... = Slave Initiated Features Exchange: True
            ...1 .... = LE Ping: True
            ..0. .... = LE Data Packet Length Extension: False
            .1.. .... = LL Privacy: True
            1... .... = Extended Scanner Filter Policies: True
            .... ...1 = LE 2M PHY: True
            .... ..0. = Stable Modulation Index - Transmitter: False
            .... .0.. = Stable Modulation Index - Receiver: False
            .... 1... = LE Coded PHY: True
            ...1 .... = LE Extended Advertising: True
            ..0. .... = LE Periodic Advertising: False
            .1.. .... = Channel Selection Algorithm #2: True
            0... .... = LE Power Class 1: False
            .... ...0 = Minimum Number of Used Channels Procedure: False
            0000 000. = Reserved: 0
            Reserved: 0000000000
        CRC: 0xdd3087
    

    The DLE feature shows as disabled in the peripheral feature response.
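
    The same conclusion follows from decoding the raw feature set value 0x59dd bit by bit. The sketch below uses the feature names from the Wireshark output above, in bit order starting at bit 0; the function name is illustrative.

```python
# Feature names in bit order (bit 0 first), as listed in the dissection above.
FEATURE_BITS = [
    "LE Encryption",
    "Connection Parameters Request Procedure",
    "Extended Reject Indication",
    "Slave Initiated Features Exchange",
    "LE Ping",
    "LE Data Packet Length Extension",
    "LL Privacy",
    "Extended Scanner Filter Policies",
    "LE 2M PHY",
    "Stable Modulation Index - Transmitter",
    "Stable Modulation Index - Receiver",
    "LE Coded PHY",
    "LE Extended Advertising",
    "LE Periodic Advertising",
    "Channel Selection Algorithm #2",
    "LE Power Class 1",
    "Minimum Number of Used Channels Procedure",
]

def enabled_features(mask):
    """Return the names of all features whose bit is set in `mask`."""
    return [name for bit, name in enumerate(FEATURE_BITS) if mask & (1 << bit)]

features = enabled_features(0x59DD)
print("DLE advertised:", "LE Data Packet Length Extension" in features)
```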

    I cannot definitively identify the LL_LENGTH_REQ packet as the one causing the timeout; however, it seems to be the main difference between the working and failing traces. There are some packets exchanged afterwards (attribute reads and parameter update requests by the central), but it seems to be the last LL control request initiated by the peripheral. In my mind it is currently the most likely candidate.

    Sincerely,

    Alex

  • Hi Alex,

    Got it. According to the specification, the central device should only respond with the LL_REJECT_EXT_IND if it receives a procedure from the peripheral that conflicts with the procedure that was last started by the central device. Is there another LL procedure being started before the central receives the LL_LENGTH_REQ? For example, does the central send an LL_LENGTH_REQ before the peripheral sends its own? If so, then what should happen is that the peripheral's procedure is rejected and the central's procedure runs to completion.

    Best Regards,

    Jan

  • Hi Jan,

    Not as far as I can tell. The central sends its own LL_LENGTH_REQ a few connection events later. There is an LL_CONNECTION_UPDATE_REQ before, but its instant has already passed by the time the peripheral sends the LL_LENGTH_REQ.

    traces-txt.zip

    Even then, as I understand it, an LL_LENGTH_REQ does not have an instant, so Vol 6, Part B, Section 5.3 should not apply here, since collisions require incompatible procedures, which are defined as two procedures involving instants?

    Sincerely,

    Alex

  • Hi Alex,

    My apologies, you are correct. Section 5.1.9, Data Length Update procedure (shown below), details what should happen when an LL_LENGTH_REQ is received and what should happen if an LL_LENGTH_REQ is received as a response to an LL_LENGTH_REQ. It seems that the Android device is not handling the back-to-back LL_LENGTH_REQ, when it should be able to simply use the parameters in the request to perform the procedure.

    This collision should be harmless if the spec is followed.
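
    The negotiation in Section 5.1.9 comes down to each side taking the minimum of what it can send and what the peer can receive, regardless of whether the peer's values arrived in an LL_LENGTH_REQ or an LL_LENGTH_RSP. A minimal sketch of that rule follows. The class and function names are illustrative, the central's values are hypothetical, and Coded PHY adjustments to the time values are ignored.

```python
from dataclasses import dataclass

@dataclass
class LengthParams:
    """The four fields carried in LL_LENGTH_REQ and LL_LENGTH_RSP."""
    max_rx_octets: int
    max_rx_time_us: int
    max_tx_octets: int
    max_tx_time_us: int

def effective_lengths(local, remote):
    """Effective limits as seen from the local device."""
    return {
        "tx_octets": min(local.max_tx_octets, remote.max_rx_octets),
        "rx_octets": min(local.max_rx_octets, remote.max_tx_octets),
        "tx_time_us": min(local.max_tx_time_us, remote.max_rx_time_us),
        "rx_time_us": min(local.max_rx_time_us, remote.max_tx_time_us),
    }

peripheral = LengthParams(251, 2120, 251, 2120)
central = LengthParams(251, 2120, 27, 328)   # hypothetical peer values

result = effective_lengths(peripheral, central)
print(result)
```

    With these example values the peripheral may send 251-octet payloads, while it only ever receives 27-octet payloads from the central.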

    Best Regards,

    Jan

  • Hi Jan,

    thank you for that confirmation. I'm still waiting to hear back from the tablet manufacturer, but at least I am now convinced that the error is on their side.

    In the meantime, I'm still trying to evaluate whether disabling DLE on the peripheral is a valid workaround for us (question 2 of the initial post). We need to be able to move larger amounts of data fairly quickly. In the current workaround we clear the DLE bit in the local feature set, but still set the suggested default data length to 251 octets / 2120 µs. As mentioned, I've observed that the peripheral then still allows the larger PDU size, and so far there seems to be no further impact on the connection. Essentially, disabling DLE only seems to stop the peripheral from initiating its own LL_LENGTH_REQ at the start of a connection. Can you confirm this, or are there any other internal mechanisms that get disabled?

    Sincerely,

    Alex

  • Hi Alex,

    I am glad to help!

    Can you clarify how you are disabling DLE on the peripheral side? The Disabling DLE at Runtime section provides instructions on how to disable DLE.

    Best Regards,

    Jan

  • Hi Jan,

    currently I'm doing only the second part, clearing the bit in the local feature set. It seems to do exactly what I need: it disables the initial LL_LENGTH_REQ sent by the peripheral (the one that the central responds to wrongly). I'm not changing the MaxDataLen because I actually want to use the larger packets if possible. The question is whether doing only half will lead to problems later down the line, or whether the stack will just deal with it.

    Sincerely,

    Alex

  • Hi Alex,

    Got it! Thank you for clarifying. If that is the case, then you are not truly disabling DLE, but only causing the device not to send the data length request. Since DLE is not disabled, when the central's data length request comes in, the peripheral will still permit DLE lengths. At a glance, I don't think there would be an issue with this, but I cannot be 100% sure. I would highly recommend doing intensive long-term testing on your end to ensure the device behaves as expected.

    Best Regards,

    Jan

  • Hi Jan,

    thank you. I will run some long-term tests then.

    Sincerely,

    Alex