Hi everyone,
I'm having difficulty handling MQTT disconnections on the device.
I want to build a system in which the CC3200 retries the connection whenever it loses its connection to the broker.
Any suggestions?
Thanks in advance.
Hi Aeromechs,
Is the new issue about the sl_MqttDisconnect() callback not being triggered, or about the device not entering hibernation?
Hi Victor,
Thanks for getting back to us.
While it is very difficult for me to provide that info precisely, I can comment on it to help steer the conversation.
The firmware has some logic that can be described as:
while( MqttConnect() )
{
    Message("Error: Can't connect to broker, retry in 5 seconds\r\n");
    osi_Sleep(5000);
}
MqttConnect() sets up the required parameters (port, authentication, topic, timeout, last will, etc., read from a saved config file) and then connects to the broker.
The message above is the last UART log line we get. The device never retries after 5 seconds; it just dies or hangs.
Web access to the device still works to a degree, but not 100% correctly: a reset button on the page that should allow a remote reset doesn't execute successfully, and the device never resets (hibernate cycle).
The above was logged via an ESP8266 module acting as a wireless UART logger since, as I mentioned, the devices are installed in places where it isn't easy to stay nearby, so keeping my laptop attached to the UART is near impossible.
This is from SDK 1.1.0.
After migrating to 1.2, thinking it would be better, what I see now is more hanging in the chip, which happens after many retries to connect to the broker. We also noted that we now get stuck after connecting to and disconnecting from the WiFi. That is, if the WiFi disconnects for any reason, whether because of signal issues or getting kicked off for technical/protocol reasons, the CC3200 just hangs after the reconnect attempts.
No more logging and no more cycles executed.
The only solution is to literally power-cycle the device. Even then, there is no guarantee that connecting to WiFi will go smoothly; it might still fail to connect and hang again.
I know that my issue might be broader than the original post, but I know we share the same issue partially with MQTT disconnects and failing to get reset properly after that.
Thank you,
Fadi
Hi Fadi, Aeromechs,
It looks to me that there are multiple issues involved in the MQTT application:
Hi Fadi,
Can you please provide more details on the issue?
Hi Victor,
Thanks for getting back to us.
I can provide you with the information requested. Just need time to do the test for you and get things sorted.
Should we use SDK 1.2 for the test, or do you recommend we stay with the previous version?
I have not tested the unmodified MQTT example that came with the SDK, because we had to make some changes for our own purposes (e.g. reading data from UART before pushing it back via MQTT, setting certain config parameters via an HTML page, etc.).
Let me study this on my end to see how I can present you with consistent, reproducible information to help you (TI) troubleshoot this with us and find a proper solution.
Please feel free to reach out to me via PM also.
Fadi
Hi Aeromechs,
These are async events coming from the NWP. You can set up a flag to detect these events and perform the corresponding actions. Please see chapter 19 of the NWP Programmer's Guide.
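A minimal sketch of the flag approach, written so that it builds off-target. The event IDs, on_sock_event() and take_link_down_flag() are hypothetical stand-ins, not SDK names; on the CC3200 the equivalent logic would sit inside the application's socket event handler and use the event codes from the SDK headers.

```c
#include <stdbool.h>

/* Hypothetical stand-ins for the socket event IDs, so the sketch builds
   without the SDK. On target, use the SDK's own event codes instead. */
typedef enum {
    EV_SOCKET_TX_FAILED = 1,
    EV_SOCKET_ASYNC     = 2
} sock_event_t;

/* Set from the event handler, consumed by the application task. */
static volatile bool g_broker_link_down = false;

/* Runs in the handler context: no blocking calls here, only record
   that something went wrong on the socket. */
void on_sock_event(sock_event_t ev)
{
    if (ev == EV_SOCKET_TX_FAILED || ev == EV_SOCKET_ASYNC) {
        g_broker_link_down = true;
    }
}

/* Polled from the application task. Returns true once per reported
   failure and clears the flag. A real port would likely protect this
   read-and-clear with an OS primitive. */
bool take_link_down_flag(void)
{
    if (!g_broker_link_down) {
        return false;
    }
    g_broker_link_down = false;
    return true;
}
```

The application task can then poll take_link_down_flag() in its main loop and start its own disconnect/reconnect sequence from task context rather than from the handler.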
Hi,
By connection failure I mean the MQTT connection failing.
I can't reconnect to the broker after the MQTT connection fails, or after an established MQTT connection hits a socket error (because the server dies or the internet connection is lost).
I want to handle these issues because the internet connection is not very stable.
When I try to reconnect after a connection failure, the connection fails even if the MQTT broker is running again.
The console returns:
[SOCK ERROR] - close socket (81) operation failed to transmit all queued packets
[SOCK ERROR] -close scoket (81) operation connection less mode, rx packet fragmentation
> 16K, packet is being released
Sometimes it is
[SOCK ERROR] - close socket (82) operation failed to transmit all queued packets
but the errors are otherwise the same.
When the CC3200 is connected to the broker and I shut down the server, the errors are:
[SOCK ERROR] -close scoket (81) operationconnection less mode, rx packet fragmentation
> 16K, packet is being released[SOCK ERROR] -close socket (81) operationremote side down from secure to unsecure
unknown sock async event: 2
[SOCK ERROR] - close socket (81) operation failed to transmit all queued packets
I've finally arrived at a partial solution to the problem.
I read the known issues of the SDK and modified the file as suggested. This works when the CC3200 is connected to the broker and I then kill the broker on an AWS server, but it doesn't work when I cut the internet connection (not the WiFi, only unplugging the ADSL cable).
Now the mqtt_client method is called when there is a disconnection.
I then implemented a method that signals a sync object, which in turn unblocks a task:
//these are two tasks
void retry_connection(void *pvParameters){
    int lRetVal=-1;
    while(1){
        osi_SyncObjWait(&semafore_down_position,OSI_WAIT_FOREVER);
        disconnect_from_broker();
        lRetVal = Network_IF_ConnectAP(wifi_ssid, SecurityParams);
        if(lRetVal < 0)
        {
            MAP_UtilsDelay(80000000);
            //LOOP_FOREVER();
        }
    }
}

void retry_broker_connetion(connect_config *local_con_conf){
    int lRetVal;
    while(1){
        osi_Sleep(4000);
        osi_SyncObjWait(&semaphore_retry_broker_connection,OSI_WAIT_FOREVER);
        reconnect_to_broker();
    }
}
//end

//modified Network_IF_ConnectAP as
long
Network_IF_ConnectAP(char *pcSsid, SlSecParams_t SecurityParams)
{
#ifndef NOTERM
    char acCmdStore[128];
    unsigned short usConnTimeout;
    unsigned char ucRecvdAPDetails;
#endif
    long lRetVal;
    unsigned long ulIP = 0;
    unsigned long ulSubMask = 0;
    unsigned long ulDefGateway = 0;
    unsigned long ulDns = 0;

    //
    // Disconnect from the AP
    //
    Network_IF_DisconnectFromAP();

    //
    // This triggers the CC3200 to connect to specific AP
    //
    lRetVal = sl_WlanConnect((signed char *)pcSsid, strlen((const char *)pcSsid),
                             NULL, &SecurityParams, NULL);
    ASSERT_ON_ERROR(lRetVal);

    //
    // Wait for ~10 sec to check if connection to desire AP succeeds
    // (should have been 15 instead of 10)
    //
    while(g_usConnectIndex < 10)
    {
#ifndef SL_PLATFORM_MULTI_THREADED
        _SlNonOsMainLoopTask();
#else
        osi_Sleep(1);
#endif
        MAP_UtilsDelay(8000000);
        if(IS_CONNECTED(g_ulStatus) && IS_IP_ACQUIRED(g_ulStatus))
        {
            break;
        }
        g_usConnectIndex++;
    }

#ifndef NOTERM
    //
    // Check and loop until AP connection successful, else ask new AP SSID name
    //
    while(!(IS_CONNECTED(g_ulStatus)) || !(IS_IP_ACQUIRED(g_ulStatus)))
    {
        //
        // Disconnect the previous attempt
        //
        Network_IF_DisconnectFromAP();

        CLR_STATUS_BIT(g_ulStatus, STATUS_BIT_CONNECTION);
        CLR_STATUS_BIT(g_ulStatus, STATUS_BIT_IP_AQUIRED);
        UART_PRINT("Device could not connect to %s\n\r",pcSsid);
        if(wpa_errata==1){return -1;}
        /*do
        {
            ucRecvdAPDetails = 0;

            UART_PRINT("\n\r\n\rPlease enter the AP(open) SSID name # ");

            //
            // Get the AP name to connect over the UART
            //
            lRetVal = GetCmd(acCmdStore, sizeof(acCmdStore));
            if(lRetVal > 0)
            {
                // remove start/end spaces if any
                lRetVal = TrimSpace(acCmdStore);

                //
                // Parse the AP name
                //
                strncpy(pcSsid, acCmdStore, lRetVal);
                if(pcSsid != NULL)
                {
                    ucRecvdAPDetails = 1;
                    pcSsid[lRetVal] = '\0';
                }
            }
        }while(ucRecvdAPDetails == 0);*/

        //
        // Reset Security Parameters to OPEN security type
        //
        //SecurityParams.Key = (signed char *)"";
        //SecurityParams.KeyLen = 0;
        //SecurityParams.Type = SL_SEC_TYPE_OPEN;

        UART_PRINT("\n\rTrying to connect to AP: %s ...\n\r",pcSsid);

        //
        // Get the current timer tick and setup the timeout accordingly
        //
        usConnTimeout = g_usConnectIndex + 15;

        //
        // This triggers the CC3200 to connect to specific AP
        //
        lRetVal = sl_WlanConnect((signed char *)pcSsid,
                                 strlen((const char *)pcSsid), NULL,
                                 &SecurityParams, NULL);
        //UART_PRINT("%d\n\r",lRetVal);
        ASSERT_ON_ERROR(lRetVal);

        //
        // Wait ~10 sec to check if connection to specifed AP succeeds
        //
        while(!(IS_CONNECTED(g_ulStatus)) || !(IS_IP_ACQUIRED(g_ulStatus)))
        {
#ifndef SL_PLATFORM_MULTI_THREADED
            _SlNonOsMainLoopTask();
#else
            osi_Sleep(1);
#endif
            MAP_UtilsDelay(8000000);
            if(g_usConnectIndex >= usConnTimeout)
            {
                break;
            }
            g_usConnectIndex++;
        }
    }
#endif

    //
    // Put message on UART
    //
    UART_PRINT("\n\rDevice has connected to %s\n\r",pcSsid);

    //
    // Get IP address
    //
    lRetVal = Network_IF_IpConfigGet(&ulIP,&ulSubMask,&ulDefGateway,&ulDns);
    ASSERT_ON_ERROR(lRetVal);

    //
    // Send the information
    //
    UART_PRINT("Device IP Address is %d.%d.%d.%d \n\r\n\r",
               SL_IPV4_BYTE(ulIP, 3),SL_IPV4_BYTE(ulIP, 2),
               SL_IPV4_BYTE(ulIP, 1),SL_IPV4_BYTE(ulIP, 0));
    if(check_disconnection==1){
        reconnect_to_broker();
    }
    return 0;
}

void reconnect_to_broker(){
    //RECONNECT_BROKER
    tPushButtonMsg sMsg;
    osi_messages var = RECONNECT_BROKER;
    sMsg.received=var;
    osi_MsgQWrite(&g_PBQueue,&sMsg,OSI_NO_WAIT);
}

//in the queue handler in main function
else if(RECONNECT_BROKER == RecvQue.ricevuto){
    if((sl_ExtLib_MqttClientConnect((void*)local_con_conf[iCount].clt_ctx,
                                    local_con_conf[iCount].is_clean,
                                    local_con_conf[iCount].keep_alive_time) & 0xFF) != 0)
    {
        UART_PRINT("\n\rBroker connect fail for conn no. %d \n\r",iCount+1);
    }
    else
    {
        UART_PRINT("\n\rSuccess: conn to Broker no. %d\n\r ", iCount+1);
        local_con_conf[iCount].is_connected = true;
        //iConnBroker++;
    }
    lRetVal=sl_ExtLib_MqttClientSub((void*)local_con_conf[iCount].clt_ctx,
                                    local_con_conf[iCount].topic,
                                    local_con_conf[iCount].qos, TOPIC_COUNT);
    if(lRetVal<0){
        sl_ExtLib_MqttClientDisconnect((void*)local_con_conf[iCount].clt_ctx);
        osi_SyncObjSignal(&semaphore_retry_broker_connection);
    }
//end
I am having a similar problem. This happens on both SDK 1.1.0 and 1.2.0. My code is supposed to reconnect when the MQTT broker connection is lost. I simulated a lost connection by removing the network cable from the WiFi router. It behaves as expected about 4 times, but then it consistently fails to reconnect to the MQTT broker on the fifth reconnection. After that it cannot reconnect again. The MQTT_client example in the SDKs was not designed to reconnect upon disconnection. For a successful broker connection there seems to be a small delay before the sl_ExtLib_MqttClientConnect function call returns. For the fifth and subsequent reconnections, the sl_ExtLib_MqttClientConnect function call returns immediately. It is as if the MQTT library does not even make an attempt to connect to the broker.
I have a similar problem, and despite several other TI forum posts, there is no final and reliable answer to this fundamental MQTT library problem.
After each reconnection with sl_ExtLib_MqttClientConnect(), the "net number" is incremented. It seems as if the library runs into a limit of 4 brokers.
I did not find a way to reset the net number to 17, because the programming and implementation style of the MQTT library is not very transparent.
In my application a total reset is not possible, because the user application must run continuously.
Terminal screenshots:
Version: Client LIB 1.0.4, Common LIB 1.1.1.
C: FH-B1 0x10 to net 17, Sent (43 Bytes) [@ 3]
C: Rcvd msg Fix-Hdr (Byte1) 0x20 from net 17 [@ 3]
C: Cleaning session for net 17
C: Msg w/ ID 0x0000, processing status: Good
...
After 4 reconnections:
C: FH-B1 0x10 to net 20, Sent (43 Bytes) [@ 527]
C: Rcvd msg Fix-Hdr (Byte1) 0x20 from net 20 [@ 527]
C: Cleaning session for net 20
C: Msg w/ ID 0x0000, processing status: Good
...
After the next reconnection attempt, the library hangs. No debug message, no disconnection callback.
Dear TI employees, could you please provide a final and reliable solution to this fundamental issue?
(Answers like "You can simply call the sl_ExtLib_MqttClientConnect() API to connect to the broker again" in an earlier post do not help)
Thanks and regards
Klaus
Hello Klaus,
Can you please open a new thread as this one is closed?
You can add a link to this one as a reference.
Regards,
Shlomi
Problem solved. Once you detect that the broker is disconnected, call the sl_ExtLib_MqttClientCtxDelete function before reconnecting. I suspect some memory allocation must be released before the new connection allocates its own.
Here is a pseudocode snippet:
while(1)
{
    if((broker_config.is_connected==false) && wifi_is_connected && IP_ACQUIRED)
    {
        sl_ExtLib_MqttClientCtxDelete(broker_config.clt_ctx);
        mqttbrokerconnect();
    }
    Task_sleep(5000);
}
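To illustrate why deleting the context first could matter, here is a small host-side model of the suspected behaviour. It is not the library's implementation: the pool size of 4 is an assumption taken from the "limit of 4 brokers" observation earlier in the thread, and ctx_t, ctx_create(), ctx_delete() and reconnect() are hypothetical stand-ins.

```c
#include <stddef.h>
#include <stdbool.h>

#define MAX_CTX 4   /* assumed pool size, not confirmed by the SDK sources */

typedef struct { bool in_use; } ctx_t;

static ctx_t g_pool[MAX_CTX];

/* Stand-in for context creation: hands out a free slot, or NULL once
   every slot is taken. */
ctx_t *ctx_create(void)
{
    for (size_t i = 0; i < MAX_CTX; i++) {
        if (!g_pool[i].in_use) {
            g_pool[i].in_use = true;
            return &g_pool[i];
        }
    }
    return NULL;
}

/* Stand-in for sl_ExtLib_MqttClientCtxDelete(): releases the slot. */
void ctx_delete(ctx_t *ctx)
{
    if (ctx != NULL) {
        ctx->in_use = false;
    }
}

/* Reconnect as described in the fix: release the old context before
   asking for a new one, so the pool never runs dry. */
ctx_t *reconnect(ctx_t *old)
{
    ctx_delete(old);
    return ctx_create();
}
```

In this model, calling ctx_create() repeatedly without ctx_delete() fails on the fifth call, which is consistent with the symptom reported above, while reconnect() can be called indefinitely.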