CC3235S: TI device as TCP client sometimes fails to ACK the server's TCP keepalive packet using the Network Terminal example

Part Number: CC3235S

Hi,

I have run into a problem: my TI CC3235S device fails to ACK my server's TCP keepalive packet once every 5 minutes to 4 hours. I originally thought the problem might be in my application, so I tried the TI Network Terminal example, and the result is the same. I also cross-checked with a Broadcom Wi-Fi device, and it responds to my server's keepalive packets without any problem.

Here is my test environment and some background:

     On the TI device side, I am using SDK version 4.30.00.06, and the device runs its Network Terminal example. I used 'wlanconnect -s "Amba_tliu" -t WPA2 -p "Ambarella2015"' to connect the device to my router. After that, I entered 'recv -c 10.0.0.5 -p 8888' to connect the device to my TCP server on port 8888.

     On the server side, my TCP server runs on my Linux PC, which is connected to my local router. It uses a default keepalive interval of 3 seconds and a default retry count of 10. I wrote the simplest possible TCP server to keep the test environment clean. It only accepts the TCP connection and never sends any data to the TCP client.
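The original server code is not shown in the thread. As a rough sketch of what such a test server could look like, the following Python uses the Linux socket options `TCP_KEEPIDLE`, `TCP_KEEPINTVL`, and `TCP_KEEPCNT`. The function names `enable_keepalive` and `serve_once` are hypothetical, and the 3-second idle time before the first probe is an assumption (the post only states the interval and the retry count).

```python
import socket


def enable_keepalive(sock, idle_s=3, interval_s=3, probes=10):
    """Turn on TCP keepalive for one connection (Linux option names)."""
    sock.setsockopt(socket.SOL_SOCKET, socket.SO_KEEPALIVE, 1)
    # Idle time before the first probe; assumed value, not given in the post.
    sock.setsockopt(socket.IPPROTO_TCP, socket.TCP_KEEPIDLE, idle_s)
    # Time between unanswered probes (3 seconds in the post).
    sock.setsockopt(socket.IPPROTO_TCP, socket.TCP_KEEPINTVL, interval_s)
    # Unanswered probes tolerated before the kernel drops the connection.
    sock.setsockopt(socket.IPPROTO_TCP, socket.TCP_KEEPCNT, probes)


def serve_once(port=8888):
    """Accept a single client, never send data, and wait until it goes away."""
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as listener:
        listener.setsockopt(socket.SOL_SOCKET, socket.SO_REUSEADDR, 1)
        listener.bind(("0.0.0.0", port))
        listener.listen(1)
        conn, peer = listener.accept()
        with conn:
            enable_keepalive(conn)
            try:
                # recv() returns b"" when the peer closes the connection.
                while conn.recv(1024):
                    pass
                print("client", peer, "closed the connection")
            except OSError as exc:
                # Raised on keepalive timeout or when the peer sends an RST.
                print("connection to", peer, "failed:", exc)
```

Calling `serve_once()` blocks until a client connects. The server itself sends no application data, so the only traffic it originates on an idle connection is the kernel's keepalive probes.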

Below is the Wireshark log captured when the issue happened. Server IP: 10.0.0.5; CC3235S IP: 10.0.0.67. Please take a look at the row with the timestamp of 218 seconds. No ACK appears after the keepalive packet. In addition, when the TCP server retried the keepalive packet, the CC3235S responded with an RST, which broke the TCP connection.

Looking through the network log, there seem to be two issues on the CC3235S side:
1. No ACK at 218 seconds after receiving the keepalive packet from the TCP server.
2. Instead of responding with an ACK to the server's retries, it breaks the connection by sending an RST to the TCP server.

As you know, the purpose of TCP keepalive is to check that the TCP connection and the client stay alive over the long run. Both issues on the CC3235S side get in the way of that. Do you have any idea how to fix them?

 

  • Hi, 

    I don't see the retransmission after #155 (i.e. at 218s).

    So the client (assuming SL_SO_KEEPALIVETIME was set to 3 sec) got disconnected after 3 seconds of idle connection.

    This explains the RST after the next message.

    Did you get any socket event (or did sl_Recv return an error or 0)?

    If you want to send keepalives every 3 seconds and disconnect only after 10 failures, the SL keepalive value should be 30.

    If you can get an air sniffer, we can try to get more info on the reason for missing the keepalive.

    Or you can get the NWP logs (see ch. 20.1 in https://www.ti.com/lit/pdf/swru455) to better understand what was happening on the CC3235 side.

    Br,

    Kobi
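Kobi's sizing rule above (probe interval multiplied by the number of allowed failures) can be checked with a one-line calculation. This is only the arithmetic from the reply; the helper name `keepalive_timeout_s` is hypothetical, and how SimpleLink interprets the value of `SL_SO_KEEPALIVETIME` should be confirmed against the SDK documentation.

```python
def keepalive_timeout_s(probe_interval_s, allowed_failures):
    """Idle time to tolerate before treating the peer as gone."""
    return probe_interval_s * allowed_failures


# 3-second probes, 10 allowed failures, as in the server setup described above.
print(keepalive_timeout_s(3, 10))  # 30
```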

     

  • Hi Kobi,

    I think #156 is the first retry from the TCP server. If no retries were configured, my server would close the TCP connection after #155, since no ACK was received.

    As for the SL_SO_KEEPALIVE setup on the CC3235S side, I left it at the default (5 minutes). As far as I know, the TCP keepalive settings only take effect when the device acts as a server. The keepalive message is initiated and counted by the TCP server, and the TCP client is only responsible for responding with an ACK when it receives a keepalive message.

    No, I do not get any socket event or error message on the CC3235S side. This is what makes the problem tricky. Both the keepalive and the ACK messages are handled in the TCP/IP layer, so the application/socket never sees them.

    Do you know how to avoid the RST coming from the CC3235S when the TCP server retries the keepalive? Comparing the contents of #155 and #156, only the identification field in the IP header differs, and its value in #156 is increased by one. Does the CC3235S check it? Why does the CC3235S reject the packet with an RST?

    Meanwhile, I will try to get more info about the missing ACK. Unfortunately, I do not have an air sniffer on hand, but I will look at the link and follow the instructions to get the NWP logs. I will keep you updated. Thank you.

    Br,

    Allen (Tao Liu)

  • The RST is received because the CC3235, for some reason, thinks that the connection is closed (and thus responds with an RST to any data message on the socket).

    I'm not sure whether the reason for the disconnection is the keepalive timeout or something else. We will need the other logs to get more info.

     

  • Hello Kobi,

    I finally got the NWP log from the UART port. I found that this problem happens randomly. I ran a long test over last weekend and it did not fail.

    Anyway, the attachment is the log captured today. I closed the serial port after the issue was triggered, so I believe the relevant part is at the very end.

    putty-20210202.log

    Hopefully this is helpful. Thank you for the help.

     Best Regards,

    Allen (Tao) Liu

  • I couldn't find anything in the log that explains the disconnection.

    Are you on an enterprise network?

    Can you try testing with a different AP?

  • Hi Kobi,

    I am testing on a local router, which is connected only to my PC with an Ethernet cable. Sure, I will try testing with another AP.

    The keepalive message is sent every 5 seconds. Could you please check in the log how much time passes between the last keepalive message and the RST? If the difference is 10 seconds, what happens during that period? Also, is there any clue as to why the CC3235 sends the RST?

  • Do you have a receiver on the connection (i.e. a thread that blocks on read, i.e. calling sl_Recv)?

    If so, this one should be invoked when the connection gets closed (e.g. sl_Recv returns 0 or a negative error code).

    We do not send an RST on a timeout. The RST is a response to a packet received on a non-active (i.e. closed) connection.
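The blocking-receiver pattern described in this reply can be sketched with ordinary BSD-style sockets. This is a Python analogue, not SimpleLink code: the helper name `wait_for_close` is hypothetical, and `socket.socketpair()` stands in for the real TCP connection so that the example runs on its own. The idea is that a zero-length read signals a closed connection, while an exception corresponds to a negative error code.

```python
import socket


def wait_for_close(sock):
    """Block on recv until the connection ends; report how it ended."""
    received = 0
    while True:
        try:
            chunk = sock.recv(1024)
        except OSError as exc:
            # Analogue of a negative return code from the receive call.
            return ("error", received, exc.errno)
        if not chunk:
            # Zero-length read: the connection was closed.
            return ("closed", received, None)
        received += len(chunk)


# Local demonstration: one end sends 4 bytes and then closes.
a, b = socket.socketpair()
b.sendall(b"ping")
b.close()
print(wait_for_close(a))  # ('closed', 4, None)
a.close()
```

Running such a loop in its own thread gives the application a notification at the moment the stack considers the connection closed, which can then be lined up against timestamps in a packet capture.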

     

  • What SP are you using?

    Please try the latest one, SP 4.9 (from SDK 4.40).

  • I switched to SP 4.9 this morning and the issue happened again. This time I also saw that sl_Recv returns 0 when the issue happens.

    Hmm, I am a bit confused. Based on your explanation, I guess the TCP connection was closed, either by the server or by the CC3235S, before the last keepalive message was sent. That would be why the CC3235 responds with an RST: from its point of view, the last keepalive arrived on a closed TCP socket.

    From your experience, which part could be the cause? I will try to cross-verify it. Also, do you know whether anyone else has managed to run the TCP keepalive feature with the CC3235S as the TCP client? I found some threads on the E2E forum where TCP keepalive is used, but with the CC3235S as the TCP server.

  • I'm not aware of any open issues regarding TCP KA, and I don't think being the server or the client has any impact on this feature.

    It seems that your server didn't close the connection, since it keeps sending the KA and we don't see any FIN message (is it possible that it was filtered out?).

    So there must be some other reason why the CC3235 thinks the connection is closed, which I still couldn't find.

    If you can get an air sniffer log it will be very helpful.

    And if you can test with a different AP, it would also help us focus on the root cause.

    Br,

    Kobi