CC3220MODA: ClockSync_get gets stuck for a long time when internet connection is unstable

Part Number: CC3220MODA

I am using the latest SDK, 6.10.00.05, and found that ClockSync_get can get stuck for a very long time when the internet connection is very unstable. This is rare, but I have seen it happen in production and could not get debug logs.

Does ClockSync_get have a default timeout, or can we set one?

Zac

  • I think the problem is in sl_NetAppDnsGetHostByName (called internally by ClockSync_get to resolve the SNTP host address), which can block in such a case.

    You can implement a watchdog and call sl_Stop when it expires, which stops the NWP and with it the sl_NetAppDnsGetHostByName execution.
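The watchdog idea above can be sketched in portable C. This is only an illustration under stated assumptions: `call_guard_t`, `guard_arm`, `guard_disarm` and `guard_poll` are hypothetical names, not part of the SimpleLink SDK, and the millisecond tick value is assumed to come from whatever clock the application already has.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical guard around one blocking call; not an SDK type. */
typedef struct {
    volatile bool     armed;        /* true while the blocking call is in flight */
    volatile uint32_t deadline_ms;  /* tick at which the call counts as stuck */
    void            (*on_expire)(void); /* recovery hook, e.g. a function that calls sl_Stop */
} call_guard_t;

/* Arm the guard just before entering the blocking call. */
static void guard_arm(call_guard_t *g, uint32_t now_ms, uint32_t timeout_ms)
{
    g->deadline_ms = now_ms + timeout_ms;  /* unsigned wrap-around is intended */
    g->armed = true;
}

/* Disarm as soon as the blocking call has returned. */
static void guard_disarm(call_guard_t *g)
{
    g->armed = false;
}

/* Poll periodically from a supervisor task or timer.
 * Returns true if the deadline passed and the recovery hook was run. */
static bool guard_poll(call_guard_t *g, uint32_t now_ms)
{
    if (!g->armed) {
        return false;
    }
    /* The signed difference stays correct across tick counter wrap-around. */
    if ((int32_t)(now_ms - g->deadline_ms) < 0) {
        return false;
    }
    g->armed = false;
    if (g->on_expire != NULL) {
        g->on_expire();
    }
    return true;
}
```

In real firmware, `on_expire` would point at a small function that calls sl_Stop, `guard_arm`/`guard_disarm` would bracket the ClockSync_get call, and `guard_poll` would have to run from a task other than the one that is blocked. The `volatile` fields are a simplification; a production version would likely need proper synchronization between the two tasks.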

  • Will it not time out after a certain period, or can this be fixed in the ClockSync code itself?

    Zac

  • If I remember correctly, there is no timeout for this API (it can only get blocked when internet access is lost).

  • Theoretically, this SNTP access can be done by the app (with an external DNS implementation, or without DNS if you know the IP addresses of the SNTP servers).
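For the "SNTP done by the app" option, the protocol-level part is small. The sketch below only builds the 48-byte client request and extracts the server's transmit timestamp from a reply; the helper names `sntp_build_request` and `sntp_parse_reply` are made up for illustration, and the socket handling (a UDP exchange with the server on port 123, with a receive timeout so the call cannot block forever) is assumed to live elsewhere in the application.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define SNTP_PACKET_LEN  48u
#define NTP_UNIX_DELTA   2208988800u  /* seconds from 1900-01-01 to 1970-01-01 */

/* Fill a buffer with a minimal SNTP client request. */
static void sntp_build_request(uint8_t pkt[SNTP_PACKET_LEN])
{
    memset(pkt, 0, SNTP_PACKET_LEN);
    pkt[0] = 0x23;  /* LI = 0, version = 4, mode = 3 (client) */
}

/* Extract Unix time (seconds) from a server reply.
 * Returns 0 on success, -1 if the packet is not a usable server reply. */
static int sntp_parse_reply(const uint8_t *pkt, size_t len, uint32_t *unix_secs)
{
    uint32_t ntp_secs;

    if (len < SNTP_PACKET_LEN) {
        return -1;                      /* truncated packet */
    }
    if ((pkt[0] & 0x07u) != 4u) {
        return -1;                      /* mode 4 = server */
    }
    if (pkt[1] == 0u) {
        return -1;                      /* stratum 0: kiss-of-death or unspecified */
    }
    /* Transmit timestamp, seconds part: bytes 40..43, big-endian. */
    ntp_secs = ((uint32_t)pkt[40] << 24) | ((uint32_t)pkt[41] << 16) |
               ((uint32_t)pkt[42] << 8)  |  (uint32_t)pkt[43];
    if (ntp_secs < NTP_UNIX_DELTA) {
        return -1;                      /* unset or pre-1970 (era rollover not handled) */
    }
    *unix_secs = ntp_secs - NTP_UNIX_DELTA;
    return 0;
}
```

With a fixed server IP address, no DNS lookup is involved at all, so the only blocking point left is the socket receive, which the application can bound with its own timeout.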

  • Calling sl_Stop does not seem like an elegant solution to the problem. I guess the same issue would happen for other socket connections such as HTTP or MQTT! Moreover, loss of internet access isn't something we can avoid completely.

  • Again, implementing another solution for SNTP would be unnecessary rework. Can we not fix this issue in the function itself by implementing some kind of timeout mechanism?

  • Hi,

    Why not use this way?

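    SlNetAppDnsClientTime_t dnsTime;
    _i16 retVal;

    dnsTime.MaxResponseTime = 1500; /* upper bound on the DNS response timeout, in ms */
    dnsTime.NumOfRetries = 5;       /* retries before sl_NetAppDnsGetHostByName fails */
    retVal = sl_NetAppSet(SL_NETAPP_DNS_CLIENT_ID,
                          SL_NETAPP_DNS_CLIENT_TIME,
                          sizeof(dnsTime),
                          (unsigned char *)&dnsTime);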

    Jan

  • I have seen this in the documentation, but I guess it only helps under normal circumstances, i.e. when the internet is turned off while making a DNS call. In that case the default timeout also kicks in after a few seconds. The scenario we have come across is when the internet is very unstable: the call does not return and waits forever. It's very hard to reproduce, but I have seen it happen on some production devices.

  • Hi Zac,

    I haven't seen such an issue with sl_NetAppDnsGetHostByName() in my implementation, across the ~20k units I have deployed. I use the CC3220 and the generic sl_NetAppDnsGetHostByName() API, but on top of it I have implemented my own DNS cache, which decreases the number of DNS API calls.

    But I agree that your application may be different and that you may see issues which I have never experienced. I think the only way for you is to use your own DNS implementation. For example, you can use the DNS resolver from lwIP. DNS is a relatively simple protocol, and your own implementation should be an easy task.

    Jan
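The DNS cache mentioned in the post above is not shown in the thread. As a rough illustration of the idea only, here is a minimal fixed-size cache in plain C; all names (`dns_cache_t`, `dns_cache_get`, `dns_cache_put`), the slot count and the IPv4-only storage are assumptions made for this sketch, and the time-to-live is chosen by the caller rather than taken from the DNS response.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define DNS_CACHE_SLOTS  4
#define DNS_NAME_MAX     64

typedef struct {
    char     name[DNS_NAME_MAX];  /* host name, NUL-terminated */
    uint32_t ip;                  /* resolved IPv4 address */
    uint32_t expires_s;           /* uptime in seconds after which the entry is stale */
    bool     used;
} dns_entry_t;

typedef struct {
    dns_entry_t slot[DNS_CACHE_SLOTS];
    unsigned    next;             /* next slot to overwrite (round-robin eviction) */
} dns_cache_t;

/* Look up a name; returns true and fills *ip on a fresh hit. */
static bool dns_cache_get(const dns_cache_t *c, const char *name,
                          uint32_t now_s, uint32_t *ip)
{
    for (int i = 0; i < DNS_CACHE_SLOTS; i++) {
        const dns_entry_t *e = &c->slot[i];
        if (e->used && now_s < e->expires_s && strcmp(e->name, name) == 0) {
            *ip = e->ip;
            return true;
        }
    }
    return false;
}

/* Store or refresh a name after a successful resolver call. */
static void dns_cache_put(dns_cache_t *c, const char *name, uint32_t ip,
                          uint32_t now_s, uint32_t ttl_s)
{
    dns_entry_t *e = NULL;

    if (strlen(name) >= DNS_NAME_MAX) {
        return;                               /* too long to cache */
    }
    for (int i = 0; i < DNS_CACHE_SLOTS; i++) {
        if (c->slot[i].used && strcmp(c->slot[i].name, name) == 0) {
            e = &c->slot[i];                  /* refresh the existing entry */
            break;
        }
    }
    if (e == NULL) {
        e = &c->slot[c->next];                /* overwrite the oldest slot */
        c->next = (c->next + 1u) % DNS_CACHE_SLOTS;
    }
    strcpy(e->name, name);                    /* length checked above */
    e->ip = ip;
    e->expires_s = now_s + ttl_s;
    e->used = true;
}
```

The application would call `dns_cache_get` first and fall back to sl_NetAppDnsGetHostByName only on a miss, storing the result with `dns_cache_put`. This reduces how often the blocking call runs, but it does not by itself bound how long a single call can block.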

  • Hello Jan,

    My devices don't seem to have sufficient resources for our own DNS implementation. Regardless, I guess if there is an issue in the API (Kobi suspected that sl_NetAppDnsGetHostByName could be what gets stuck), then it must be fixed. It's hard to get logs at the moment, but I will try. Meanwhile, if anyone knows of or has seen such issues, please share any details that might help us debug further.

    Zac

  • I'm not sure it will be possible to fix this in a service pack. We will need to check before we can commit to such a fix.

    In the meantime, there are the two solutions on the host side (watchdog + sl_Stop, or adding a DNS implementation). After adding a DNS implementation, you can update the network libraries (ClockSync/HTTP/MQTT) by replacing their sl_NetAppDnsGetHostByName calls.