DSP is occasionally very slow to send ACK on TCP connection

Hi

I am developing for a Keystone 2 device using:

  • PDK 3.0.3.15
  • NDK 2.24.1.18
  • SYS/BIOS 6.37.2.27
  • CCS 5.5

I have a simple project running on a DSP core that acts as a TCP server and connects to a TCP client running on a Windows 7 laptop. The TCP server exchanges the messages sent across the TCP connection with the other DSP cores using QMSS.
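
For context, the server is essentially the usual NDK BSD-style socket loop; here is a minimal sketch (the port number, buffer size, and the QMSS hand-off comment are my placeholders, not the actual project code):

    #include <string.h>
    #include <ti/ndk/inc/netmain.h>

    #define SERVER_PORT  5000        /* hypothetical port */
    #define RX_BUF_SIZE  1500

    void tcpServerTask(void)
    {
        SOCKET lsock, csock;
        struct sockaddr_in sin;
        static char buf[RX_BUF_SIZE];
        int n;

        /* every task that uses NDK sockets needs a file-descriptor session */
        fdOpenSession(TaskSelf());

        lsock = socket(AF_INET, SOCK_STREAM, IPPROTO_TCP);

        memset(&sin, 0, sizeof(sin));
        sin.sin_family      = AF_INET;
        sin.sin_len         = sizeof(sin);
        sin.sin_port        = htons(SERVER_PORT);
        sin.sin_addr.s_addr = INADDR_ANY;

        bind(lsock, (PSA)&sin, sizeof(sin));
        listen(lsock, 1);

        csock = accept(lsock, 0, 0);
        while ((n = recv(csock, buf, sizeof(buf), 0)) > 0) {
            /* hand the message to the other DSP cores via QMSS here,
             * then send any reply back on the same connection */
            send(csock, buf, n, 0);   /* placeholder echo */
        }

        fdClose(csock);
        fdClose(lsock);
        fdCloseSession(TaskSelf());
    }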

The TCP server works OK most of the time, but there is an occasional failure: the DSP sometimes delays sending an ACK to the client for around 16 s. In that time, the Windows client performs several retransmissions and then starts sending ARP packets; I guess the client assumes that the connection is broken.

Why might the NDK delay sending an ACK for 16s?

Best regards

David

  • Hi David,

    Are you using Wireshark to determine the ACK delay? If so, please share the Wireshark log.

    I also wanted to know what TCP packet size you are sending. I found a post reporting that a TCP packet size greater than the MTU led to packet delays and re-transmissions on Keystone devices. I am wondering if the same is happening in your project. (Link to the other E2E thread: e2e.ti.com/.../925842)

    Best,
    Ashish
  • Hi Ashish

    Thanks for your reply.


    Are you using Wireshark to determine the ACK delay? If so, please share the Wireshark log.

    Yes, I will attach the log.  The long delay can be seen after line #27.


    I also wanted to know what TCP packet size you are sending. I found a post reporting that a TCP packet size greater than the MTU led to packet delays and re-transmissions on Keystone devices. I am wondering if the same is happening in your project. (Link to the other E2E thread: e2e.ti.com/.../925842)

    The MTU is the standard size (~1500 bytes). The Windows PC is sending very large messages (~50,000 bytes) but, of course, the network stack segments these into MTU-sized pieces or smaller. The thread you referred to is very interesting but doesn't reach a conclusion.

    Best regards

    David

    2654.SlowAck_1May.zip

    As I wrote above, we need to send messages larger than the MTU. On the Windows side, the network stack or the NIC segments the messages into MTU-sized pieces or smaller.

    On the DSP side, it seems from experience that the NDK also does such segmentation. However, SPRU523i section 3.5.1 has a paragraph labelled "Sending and Receiving UDP Datagrams over MTU size", which implies that additional configuration is necessary, such as rebuilding the library.

    Would I be better off doing the segmentation at the application level?

    Best regards

    David
  • Hi

    I have changed my code so that messages passed from the application to the NDK for transmission are always <= the MSS (Maximum Segment Size, 1460 bytes). But I am still seeing long delays before the NDK sends an ACK to the remote client.
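
    In case it is useful, the change amounts to a wrapper like this sketch (the function name and MSS constant are mine; error handling trimmed):

        #define TCP_MSS_BYTES 1460   /* 1500-byte MTU minus 40 bytes of IP+TCP headers */

        /* never hand the NDK more than one MSS worth of data per send() */
        int sendSegmented(SOCKET s, const char *data, int len)
        {
            int sent = 0;
            int chunk, rc;

            while (sent < len) {
                chunk = len - sent;
                if (chunk > TCP_MSS_BYTES)
                    chunk = TCP_MSS_BYTES;
                rc = send(s, (void *)(data + sent), chunk, 0);
                if (rc <= 0)
                    return rc;       /* propagate the NDK error code */
                sent += rc;
            }
            return sent;
        }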

    Any comment please?

    Best regards

    David
  • Hi David,

    Which example code are you working with?

    Is that code from TI, or your own?

    You can find the PA EMAC example, which sends/receives packets via internal/external loopback.

    If it is TI-provided code, what are all the changes that you made?

    This will help us reproduce the problem.

  • Hello,

    I was wondering if this issue was ever resolved.  I am having a very similar issue with TCP acknowledgements.

    We are using C6655 processor, although I can see the problem on C6657 EVM board also.

    I am using the following software configuration:

    • Code Composer 6.1
    • Compiler 8.0.3
    • BIOS 6.42.01.20
    • XDC Tools 3.31.02.38.core
    • mcsdk 2.01.02.06
    • pdk c6657.1.1.2.6
    • NDK 2.24.03.35

    When the system first starts up, the C66 device receives UDP packets correctly. However, when I try to establish the TCP connection, I see a delay of up to 40 seconds before the TI stack acknowledges the SYN request.

    After this initial delay, the stack appears to be working correctly.  

    The attached snapshot of the wireshark window demonstrates the problem.

    Any responses and suggestions would be greatly appreciated.

    Best Regards,

    Dmitry.

    3404.tcp delay.zip

  • Dmitry,

    Interesting. Are you able to ping the target when this delay is happening?

    Also, can you please confirm who is who regarding the IP addresses in the screenshot and capture?

    When this problem is happening, can you halt the target in CCS and then open ROV? Please take a screen shot of the "Task" view, in particular the "advanced" view.

    Steve
  • Hi Steven,

    Thank you for your response.

    In the capture and the screenshot, address 192.168.0.100 is the PC, 192.168.0.63 is the C6655 device.

    When this problem is happening, pings from the PC are also not responded to.

    However, UDP broadcasts from the PC are read and responded to.

    I also noticed that in cases where the problem does not occur, I see a gratuitous ARP packet emitted by the DSP NDK on startup. This packet is not always sent out. The absence of the packet appears to correspond to the occurrence of this problem, but I am still trying to verify this, so I am not entirely sure the correlation is real.

    Here is some additional information that will help you interpret the task information below:

    We are running with the default NDK stack, but we use 32 task priorities, so the stack is configured to match: kernel priority is set to 26, high to 24, normal to 12, and low to 1.
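
    Roughly, that configuration looks like this (a sketch using the standard CFGITEM_OS_TASKPRI* configuration items, applied to the handle from CfgNew() before NC_NetStart(); check the headers of your NDK version):

        static void setNdkTaskPriorities(HANDLE hCfg)
        {
            uint pri;

            pri = 26;   /* kernel-level (llEnter/llExit) priority */
            CfgAddEntry(hCfg, CFGTAG_OS, CFGITEM_OS_TASKPRIKERN,
                        CFG_ADDMODE_UNIQUE, sizeof(uint), (UINT8 *)&pri, 0);
            pri = 24;   /* high-priority network tasks */
            CfgAddEntry(hCfg, CFGTAG_OS, CFGITEM_OS_TASKPRIHIGH,
                        CFG_ADDMODE_UNIQUE, sizeof(uint), (UINT8 *)&pri, 0);
            pri = 12;   /* normal-priority network tasks */
            CfgAddEntry(hCfg, CFGTAG_OS, CFGITEM_OS_TASKPRINORM,
                        CFG_ADDMODE_UNIQUE, sizeof(uint), (UINT8 *)&pri, 0);
            pri = 1;    /* low-priority network tasks */
            CfgAddEntry(hCfg, CFGTAG_OS, CFGITEM_OS_TASKPRILOW,
                        CFG_ADDMODE_UNIQUE, sizeof(uint), (UINT8 *)&pri, 0);
        }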

    We are doing a lot of data transfer over a gigabit connection, so I've rebuilt the stack to use jumbo PBM buffers.

    Those are really all the modifications I made to the stock NDK.

    We are running on a custom board using the C6655 and a Marvell 88E1512 PHY. We started with the NIMU and EMAC driver that came with the C6657 EVM and modified it to work with our PHY. I had to tweak NIMU_EMAC_PKT_SIZE to allow Ethernet packets larger than 1510 bytes (I posted about it in the C66 forum, but apparently this was also mentioned in an earlier post by Travis S.).
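
    (For illustration, that tweak is just enlarging the packet-size define in the NIMU driver; the value here is made up and has to match the jumbo PBM buffer sizing:)

        /* C6657 NIMU EMAC driver -- illustrative value only */
        #define NIMU_EMAC_PKT_SIZE   (1664)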

    One other tweak I made to the NIMU was in the EmacSend function, to address the issue of FCS errors on short Ethernet packets.

    In the capture below, the stack is configured and started from the Network task, running at priority 25. The NDK Stack Thread added by the RTSC tools only hooks up the NDK's llTimerTick() timer and then terminates. NDK code generation in the RTSC tools is disabled.

    I am working on a simplified example that will show this problem.

    Here is the screen capture that you were asking for:

  • Dmitry Gringauz53811 said:

    When this problem is happening, pings from the PC are also not responded to.

    However, UDP broadcasts from the PC are read and responded to.

    Very interesting that pings are not getting through, but UDP broadcasts are.

    Do you know if there are any statistics in your Ethernet driver, such as frames dropped, (good) RX frames received, etc.? It might be interesting to look at those (if supported in the driver).

    Dmitry Gringauz53811 said:
    Kernel priority is set to 26, High to 24, normal to 12 and low to 1. 

    I noticed in the screen shot that your network task is currently blocked, but it is at priority 25.  Can you please confirm these priority settings?

    Also, what is your network topology? Are you running on a private LAN?

    Steve

  • Hi Steve,

    Sorry for the delayed response, I've been a bit tied up.

    To answer your questions:

    I have not yet checked any statistics from the driver. I believe the C6657 driver supports stats; I just have not had a chance to retrieve and print them.

    Yes, the network task is indeed started at priority 25.  

    And yes, I am running on a private non-routing LAN, 192.168.0.0.  

    I also believe that I have remedied my problem, or at least the manifestation of it that I was seeing.  

    I believe the issue had to do with the priority at which the default NDK thread (NDK Stack Thread) was started. In my original configuration (seen in a previous post), this thread was started at priority 5, one of the lowest in our system. It therefore likely did not get a chance to run after I started the BIOS until very late in system initialization; I believe it ran after the NDK stack was already up. The NDK Stack Thread initializes the timer that ticks the stack every 100 ms, llTimerTick(). I do not know if this is the root cause, but if I make the NDK Stack Thread priority high enough that it runs as one of the first tasks during system startup, the problem I was seeing goes away.
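
    Concretely, the fix amounts to creating that thread at a higher priority; a sketch using the SYS/BIOS Task API (ndkStackThread is a hypothetical name standing in for whatever function your stack thread runs; with RTSC code generation enabled you would instead raise the generated thread's priority in the .cfg):

        #include <xdc/std.h>
        #include <ti/sysbios/knl/Task.h>

        extern Void ndkStackThread(UArg a0, UArg a1);  /* hypothetical name */

        Void createStackThread(Void)
        {
            Task_Params tp;

            Task_Params_init(&tp);
            tp.priority  = 15;       /* well above the old default of 5 */
            tp.stackSize = 8192;
            Task_create(ndkStackThread, &tp, NULL);
        }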

    Another approach that I found helps is delaying the IP address assignment until after the NDK stack is up and running, by doing it in the NetworkOpen() callback. Whatever condition causes the delay apparently gets cleared up by the late IP address assignment.
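
    In code, the late assignment looks roughly like this (addresses as in this thread; hCfg is the configuration handle passed to NC_NetStart(), NetworkOpen is its start hook, and the domain string is a placeholder):

        #include <string.h>
        #include <ti/ndk/inc/netmain.h>

        extern HANDLE hCfg;    /* configuration handle created with CfgNew() */

        /* start hook passed to NC_NetStart(); runs once the stack is up.
         * Adding the IPNET entry here, instead of before NC_NetStart(),
         * brings the interface address up only after the stack is running. */
        static void NetworkOpen(void)
        {
            CI_IPNET NA;

            memset(&NA, 0, sizeof(NA));
            NA.IPAddr = inet_addr("192.168.0.63");
            NA.IPMask = inet_addr("255.255.255.0");
            strcpy(NA.Domain, "demo.net");     /* placeholder domain */

            CfgAddEntry(hCfg, CFGTAG_IPNET, 1, 0,
                        sizeof(CI_IPNET), (UINT8 *)&NA, 0);
        }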

    Either of the two methods works. I chose to keep both, since the late IP address assignment also fits well with the industrial networking application that I will be integrating in the future.

    So I have found a workaround for the problem I am seeing, but I am bothered that I still do not know the root cause. I would appreciate any hints as to whether my workarounds and assumptions actually address it.

    Best Regards,

    Dmitry.

  • Hi Dmitry,

    Yes, task priority levels are very important in the NDK. If you change priorities, you must maintain the level of separation that existed in the original NDK priority values.

    Please refer to the NDK User's Guide, sections "5.2.3 Choosing the llEnter()/llExit() Exclusion Method", and "3.1.2.2 Priority Levels for Network Tasks" for more info.

    Steve