RTOS/CC3220MODA: MQTT connection would not succeed after many failed attempts.

Part Number: CC3220MODA

Tool/software: TI-RTOS

I am testing MQTT on the CC3220MODA with SDK 2.10.00.04 and found the issue described below.

The MQTT connection runs in a loop with the following steps:

  1. Check whether the WLAN is connected.
  2. Call MQTTClient_create with the appropriate parameters.
  3. Set the will parameters, user ID, and password.
  4. Invoke MQTTClient_connect.
  5. If the return value is greater than or equal to zero, subscribe to the topics and handle incoming messages. When the client disconnects, or if the return value is less than zero, continue to step 6.
  6. Sleep for 1 second.
  7. Unsubscribe from the topics.
  8. Invoke MQTTClient_disconnect.
  9. Invoke MQTTClient_delete.
  10. Loop back to step 1.
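
The steps above can be sketched as a loop. This is only a host-side illustration, not the actual application code: every `fake_*` function and the `FakeClient` type are hypothetical stand-ins for the SDK calls named in the list, and they model nothing beyond "a return value >= 0 means success, < 0 means failure". What happens when the WLAN is not connected is not stated above, so retrying after the sleep is an assumption.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for the MQTT client; NOT the SDK's types. */
typedef struct {
    int connect_result;  /* value the fake connect returns */
    int created;         /* create calls so far */
    int deleted;         /* delete calls so far */
    int subscribed;      /* subscribe calls so far */
} FakeClient;

/* Stubs that only count calls; on target these would be the SDK calls. */
static bool fake_wlan_is_connected(void)      { return true; }
static void fake_create(FakeClient *c)        { c->created++; }
static int  fake_connect(const FakeClient *c) { return c->connect_result; }
static void fake_subscribe(FakeClient *c)     { c->subscribed++; }
static void fake_unsubscribe(FakeClient *c)   { (void)c; }
static void fake_disconnect(FakeClient *c)    { (void)c; }
static void fake_delete(FakeClient *c)        { c->deleted++; }
static void fake_sleep_1s(void)               { /* 1-second task sleep on target */ }

/* Runs the ten steps `cycles` times; returns how many connects succeeded. */
int run_reconnect_cycles(FakeClient *c, int cycles)
{
    int successes = 0;
    assert(c != NULL);
    for (int i = 0; i < cycles; i++) {
        if (!fake_wlan_is_connected()) {  /* step 1 (retry is an assumption) */
            fake_sleep_1s();
            continue;
        }
        fake_create(c);                   /* steps 2-3: create and configure */
        if (fake_connect(c) >= 0) {       /* step 4 */
            successes++;
            fake_subscribe(c);            /* step 5: then handle messages */
        }
        fake_sleep_1s();                  /* step 6 */
        fake_unsubscribe(c);              /* step 7 */
        fake_disconnect(c);               /* step 8 */
        fake_delete(c);                   /* step 9 */
    }                                     /* step 10: loop */
    return successes;
}
```

Counting create and delete calls makes it easy to confirm that every cycle releases what it allocates, including the cycles where the connect fails.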

Everything works fine for normal connection, disconnection, and reconnection. However, if the internet connection fails for a long time, the WLAN stays connected and the MQTT steps keep cycling with a 1-second delay. This works for some time, but if it goes on for an hour or so, the connection never succeeds once the internet connection is restored. MQTTClient_connect fails with a return value less than zero, and I have to reset the board to get a successful connection.

I have checked the Runtime Object View to analyse the heap, and everything looks normal, i.e. heap memory usage is constant throughout the MQTT cycle.

Appreciate quick help.

Regards,

Zac

  • Hi Zac,

    When MQTTClient_connect() fails and returns an error code, what error code do you get?
    If it is -1 or another generic negative error code, it would help to pinpoint the source by digging into the MQTT library and finding exactly where in MQTTClient_connect() the failure occurs. A failed connect has a number of possible error points. For example, the socket connection itself could be failing, in which case the error would originate from the netConnect() function (in client_core.c). Once you find where the error occurs, I'll be better able to work out what is causing the behavior you are seeing.

    Regards,
    Michael
  • Hi Michael,

    For the first few retry attempts (WLAN connected, internet disconnected), the return code is -111 (Connection refused). Later it becomes -2006 (Parameters are invalid?) and stays at -2006.

    I find it hard to replicate the issue, but I will try to debug it. Is error code -2006 acceptable?

    Regards,
    Zac
  • Hi Zac,

    Error code -2006 indicates that the connect failed because invalid parameters were passed to the connection call. Usually, this is because the socket ID passed to the internal sl_Send() is invalid.

    When you only disconnect for short periods of time due to lack of internet connectivity, does the reconnect work correctly? 

    Regards,
    Michael
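
To see when the return code switches from one value to the other, the result of each connect attempt can be fed through a small tracker that flags changes and counts how long the current code has persisted. This is a minimal sketch and not part of the SDK: `ConnectStats`, `stats_init`, `stats_record`, `describe_code`, and the two `ERR_*` names are hypothetical, and the numeric values and descriptions are simply the ones quoted in this thread.

```c
#include <assert.h>
#include <stddef.h>

/* Values reported in this thread; the macro names are local to this sketch. */
#define ERR_CONN_REFUSED   (-111)
#define ERR_INVALID_PARAMS (-2006)

typedef struct {
    int  last_code;    /* most recent connect return code */
    long run_length;   /* consecutive attempts that returned last_code */
    long transitions;  /* number of times the return code changed */
} ConnectStats;

void stats_init(ConnectStats *s)
{
    assert(s != NULL);
    s->last_code = 0;
    s->run_length = 0;
    s->transitions = 0;
}

/* Records one connect result. Returns 1 when the code differs from the
 * previous attempt (a good moment to log), otherwise 0. */
int stats_record(ConnectStats *s, int code)
{
    assert(s != NULL);
    if (s->run_length > 0 && code == s->last_code) {
        s->run_length++;
        return 0;
    }
    if (s->run_length > 0) {
        s->transitions++;
    }
    s->last_code = code;
    s->run_length = 1;
    return 1;
}

/* Maps a return code to the description used in this thread. */
const char *describe_code(int code)
{
    if (code >= 0) {
        return "connected";
    }
    switch (code) {
    case ERR_CONN_REFUSED:   return "connection refused";
    case ERR_INVALID_PARAMS: return "invalid parameters";
    default:                 return "other error";
    }
}
```

Logging only on a change keeps the output short over thousands of retries, while `run_length` still records how many attempts in a row returned the same code.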

  • Hi Michael,
    The reconnect works in both cases, i.e. for -111 and -2006. The only issue I have noticed is that after many failed attempts (around 3000) with -2006, the connection does not succeed when internet connectivity is restored, until I reset the board.
    I am trying to replicate the issue.

    Regards,
    Zac
  • Hi Zac,

    Are you using one of the MQTT examples provided within the SDK, or are you running your own code using the MQTT APIs? Looking at the mqtt_client example, it should be cleaning up the MQTT context after an MQTT connection is closed, and it also resets the network processor through a sl_Stop() + sl_Start() cycle.

    Is there any major difference between the mqtt_client example and your code with regard to the reconnect behavior? If you manage to replicate this issue consistently and can give us instructions for reproducing what you're seeing, that would be greatly appreciated.

    Regards,
    Michael
  • Hi Michael,

    I am using the MQTT example from the SDK, with some changes. In my case I do not invoke sl_Stop() and sl_Start() on MQTT disconnections, but the other steps are followed. Is the sl_Stop() + sl_Start() cycle mandatory on disconnection?
    I am unable to replicate the issue now; the system has been running fine for a few days.

    Regards,
    Zac
  • Hi Zac,

    Using sl_Stop() and then sl_Start() to restart the NWP shouldn't be required. I suggested it as a debug step that might help, since it guarantees that the NWP is reset into a known good state without a full board/MCU reset. Since the problem you were encountering has not recurred, you probably don't need it for MQTT reconnection. However, if you encounter this issue again, try adding it to your code and see whether it helps.

    Regards,
    Michael
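
If the problem does come back, one way to apply that suggestion without restarting the NWP on every disconnect is to restart it only after a run of consecutive failed connects. The sketch below is an assumption about how that could be structured, not code from the SDK or the mqtt_client example: `RestartPolicy`, `policy_on_connect_result`, and `count_restart` are hypothetical names, and the restart hook is a plain callback so the logic can run on a host. On the target, the hook would be where the sl_Stop() + sl_Start() cycle goes.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical hook; on target it would wrap sl_Stop() then sl_Start(). */
typedef void (*NwpRestartFn)(void *ctx);

typedef struct {
    int          threshold;  /* consecutive failures before a restart */
    int          failures;   /* current consecutive failure count */
    NwpRestartFn restart;    /* called when the threshold is reached */
    void        *ctx;        /* passed through to the hook */
} RestartPolicy;

/* Feed in every connect return code. Returns 1 if a restart was triggered. */
int policy_on_connect_result(RestartPolicy *p, int rc)
{
    assert(p != NULL && p->threshold > 0);
    if (rc >= 0) {               /* success clears the failure streak */
        p->failures = 0;
        return 0;
    }
    p->failures++;
    if (p->failures < p->threshold) {
        return 0;
    }
    p->failures = 0;             /* start a fresh streak after restarting */
    if (p->restart != NULL) {
        p->restart(p->ctx);
    }
    return 1;
}

/* Example hook for host-side testing: counts how often it was called. */
void count_restart(void *ctx)
{
    (*(int *)ctx)++;
}
```

The threshold is a tuning choice. A small value recovers sooner, while a large one avoids restarting the NWP during short outages where the normal reconnect already works.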