RTOS/AM3356: UDP transmit packet drop

Part Number: AM3356
Other Parts Discussed in Thread: SYSBIOS

Tool/software: TI-RTOS

Hello,

I have an application in which two different threads transmit UDP data frames to two different IP addresses. Each thread transmits at 100 frames per second.

As long as both threads are sending to existing IP addresses on the network the transmissions occur as expected.

The problem occurs when one of the target IP addresses does not exist. While the first thread is attempting to send UDP data frames to the nonexistent target IP address, I see periodic, momentary lapses in the UDP data being sent by the second thread to its own target IP address. Note that the second target node detects a problem when the expected periodic 10 millisecond frame is not received within approximately 30 milliseconds. I am not sure whether the frames are delayed or dropped.

I noted that the disruption of the communication between the second thread and its target node has a 20 second period. I am guessing this correlates with the 20 second timeout of the ARP requests for the nonexistent target IP address associated with the first thread.

I am currently configured for the NDK to be lower priority than my application threads.

Is the problem the result of the ARP requests triggered by the first thread trying to transmit UDP frames? Will configuring the NDK to be higher priority than my application threads resolve this problem? Are there any other options to consider to resolve this issue?

Thank you,

Mark

  • The RTOS team have been notified. They will respond here.
  • Thanks Biser,

    I have done additional exploration of the NDK (2.24.1.18) and believe that I may have a handle on what might be causing my issue.

    Examining the following:   file lliout.c,  function LLITxIpPacket,  starting at line 141:

    When my communication issue is occurring, I believe that my entire packet pool ends up on the ArpPktQ because one of my threads is sending UDP frames at a rate of 100 frames per second to a nonexistent IP address (address resolution does not succeed). A second thread that is also sending UDP frames at a similar rate cannot send because the packet pool is empty. The problem periodically resolves when the ARP timeout occurs and the ArpPktQ for the nonexistent IP address is dumped back into the packet pool. The second thread is then able to send UDP frames again until the pool once more empties.

    I believe that the problem involves the fix to SDOCM00088612, which introduced the placing of packets on the ArpPktQ queue. It appears that this fix was made to address a problem with transmitting a frame that requires fragmentation to an IP address that did not yet have a resolved MAC address. The queuing ensured that the entire fragmented frame was sent once the address was resolved.

    The possibly unforeseen consequence of this fix is that a large quantity of packets can temporarily accumulate on the ArpPktQ when frames are sent to a nonexistent IP address at a high rate. In the extreme, this could cause loss of all Ethernet communication during intervals when the packet pool is empty.
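The mechanism described above can be illustrated with a toy model. This is only a sketch of the hypothesis, not NDK code: the `PoolModel` type, the function names, and the pool size and timeout values are all assumptions made for illustration. Each tick stands for one 10 ms transmit period in which both threads try to send one frame.

```c
#include <stdio.h>

/* Hypothetical model of a shared packet pool; not an NDK structure. */
typedef struct {
    int pool_free;   /* buffers currently available in the pool      */
    int arp_queued;  /* buffers parked while address resolution waits */
} PoolModel;

/* Sender to the unresolved address: takes a buffer and parks it on the
 * ARP queue. Returns 1 if a buffer was parked, 0 if the pool was empty. */
int send_unresolved(PoolModel *m)
{
    if (m->pool_free == 0)
        return 0;
    m->pool_free--;
    m->arp_queued++;
    return 1;
}

/* Sender to a resolved address: borrows a buffer and returns it at once
 * (the frame goes out immediately). Returns 0 if no buffer was available,
 * which the model counts as a dropped frame. */
int send_resolved(const PoolModel *m)
{
    return m->pool_free > 0;
}

/* ARP timeout: every parked buffer goes back to the pool. */
void arp_timeout(PoolModel *m)
{
    m->pool_free += m->arp_queued;
    m->arp_queued = 0;
}

/* Runs sim_ticks transmit periods and returns how many frames the
 * second (resolved) sender failed to send. */
int count_dropped(int pool_size, int arp_timeout_ticks, int sim_ticks)
{
    PoolModel m = { pool_size, 0 };
    int dropped = 0;
    int t;

    for (t = 1; t <= sim_ticks; t++) {
        send_unresolved(&m);
        if (!send_resolved(&m))
            dropped++;
        if (t % arp_timeout_ticks == 0)
            arp_timeout(&m);
    }
    return dropped;
}
```

With the toy values `count_dropped(8, 20, 40)`, the model returns 26: the second sender works until the eighth tick, stalls until the timeout at tick 20 refills the pool, and then the cycle repeats. The real pool size and timing in the NDK will differ; the point is only that the stall length is set by the ARP timeout rather than by the second sender.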

    One possible solution (at least for my application) is to back the SDOCM00088612 fix out since my application does not send frames of sufficient size to require fragmentation.    I worry about other unforeseen consequences of doing this.

    I am hoping that the appropriate expert can comment on my assessment of the issue and also suggest a possible fix.

    Thanks,

    Mark   

  • Hi Mark,

    Can you try the earlier NDK release 2.24.0.11 (software-dl.ti.com/.../index_FDS.html) to confirm your assessment? The SDOCM00088612 fix was applied on top of this major release.

    You can also find all NDK releases here: software-dl.ti.com/.../index.html.

    As you can see from the release notes, multiple issues were addressed after the maintenance release 2.24.1.18; however, those fixes might be irrelevant to your application.

    Regards,
    Garrett
  • Hi Garrett,

    I have looked at NDK 2.24.0.11 and at the latest release. I do not see any fixes in the latest release that would address the problem I am seeing.

    I am in the process of trying to add a change to the NDK 2.24.1.18 LLIOUT.C as highlighted in the snapshot below. The intent of the change is to keep ONLY the last UDP frame with an unresolved MAC address on the ARP queue. The change will allow all fragments of a message to be placed on the ARP queue, but ensures that there is never more than one full frame on the queue.

    I am currently trying to rebuild the NDK and am having a problem with FDT/SOCKET.C, where the SA_IN structure element is reported as undefined.

    Below is how I modified the NDK.mak file.

    I am sure that I have done something wrong -- just not sure what!!! Any guidance on getting the build to work would be appreciated.

    Once I have the NDK successfully built I plan to test with the fix as indicated above and see if that small code addition to NDK 2.24.1.18 resolves the problem.

    Thanks for your support,

    Mark

  • Mark,

    The portion where you describe how you modified the NDK.mak file is missing. Nevertheless, here are the instructions for building the NDK: processors.wiki.ti.com/.../Rebuilding_The_NDK_Core_Using_Gmake

    You can try this:

    set XDC_INSTALL_DIR=C:/ti/xdctools_3_31_02_38_core
    set SYSBIOS_INSTALL_DIR=c:/ti/bios_6_45_00_19
    set gnu.targets.arm.A8F=c:/ti/ccsv6/tools/compiler/gcc-arm-none-eabi-4_8-2014q3
    path=%path%;C:\ti\xdctools_3_31_02_38_core
    gmake -f ndk.mak clean
    gmake -f ndk.mak all

    Regards, Garrett

  • Hi Garrett,

    Let me clarify the problem I am having with the NDK rebuild (version 2.24.1.18).

    When I do the rebuild (make -f ndk.mak all) I see unexpected compile errors in source code that I have not modified. Specifically, I see errors at lines 107, 216, 308, 394, 488, 733, 1175, and 1295 of "FDT/SOCKET.C". The error is: identifier "SA_IN" is undefined.

    HOPEFULLY attached is the text file from:  make -f ndk.mak all > ndk_make_out.txt in case that might help determine the cause of the rebuild problem.

    building ndk packages ...
    making all: Wed Jul 19 12:42:55 EDT 2017 ...
    ======== .interfaces [./packages/ti/ndk] ========
    ======== .interfaces [./packages/ti/ndk/config] ========
    ======== .interfaces [./packages/ti/ndk/hal/eth_stub] ========
    ======== .interfaces [./packages/ti/ndk/hal/ser_stub] ========
    ======== .interfaces [./packages/ti/ndk/hal/timer_bios] ========
    ======== .interfaces [./packages/ti/ndk/hal/userled_stub] ========
    ======== .interfaces [./packages/ti/ndk/netctrl] ========
    ======== .interfaces [./packages/ti/ndk/nettools] ========
    ======== .interfaces [./packages/ti/ndk/os] ========
    ======== .interfaces [./packages/ti/ndk/productview] ========
    ======== .interfaces [./packages/ti/ndk/rov] ========
    ======== .interfaces [./packages/ti/ndk/stack] ========
    ======== .interfaces [./packages/ti/ndk/tools/cgi] ========
    ======== .interfaces [./packages/ti/ndk/tools/console] ========
    ======== .interfaces [./packages/ti/ndk/tools/hdlc] ========
    ======== .interfaces [./packages/ti/ndk/tools/servers] ========
    .interfaces files complete: Wed Jul 19 12:43:05 EDT 2017.
    ======== .libraries [./packages/ti/ndk] ========
    ======== .libraries [./packages/ti/ndk/config] ========
    ======== .libraries [./packages/ti/ndk/hal/eth_stub] ========
    ======== .libraries [./packages/ti/ndk/hal/ser_stub] ========
    ======== .libraries [./packages/ti/ndk/hal/timer_bios] ========
    ======== .libraries [./packages/ti/ndk/hal/userled_stub] ========
    ======== .libraries [./packages/ti/ndk/netctrl] ========
    ======== .libraries [./packages/ti/ndk/nettools] ========
    ======== .libraries [./packages/ti/ndk/os] ========
    ======== .libraries [./packages/ti/ndk/productview] ========
    ======== .libraries [./packages/ti/ndk/rov] ========
    ======== .libraries [./packages/ti/ndk/stack] ========
    clea8fnv fdt/socket.c ...
    
    >> Compilation failure
    

    I do not understand why there are references to the "SA_IN" since the release notes state:

    Removal of sa_len, sin_len and sin6_len socket address struct members

    Starting in NDK 2.24, the socket address structures have been redefined to remove the following fields:

        sa_len
        sin_len
        sin6_len

    I am using SYS/BIOS 6.41.00.26, CCSv6, XDCtools 3.30.04.52, and TI code generation tools CORTEX-A8 5.1.9. These all meet the compatibility information in the NDK 2.24.1.18 release notes.

     

    I have modified the following lines in the ndk.mak as shown below:

    DESTDIR ?= C:\ti\ndk_2_24_01_18
    XDC_INSTALL_DIR ?= C:\ti\xdctools_3_30_04_52
    SYSBIOS_INSTALL_DIR ?= C:\ti\bios_6_41_00_26
    ti.targets.arm.elf.A8Fnv ?=C:\ti\ccsv6\tools\compiler\arm_5.1.9

    The A8Fnv target is consistent with the target in CCSv6.

     

    It does not seem to me to be a tool chain issue, but rather a code problem specific to the FDT/SOCKET.C source file, and I definitely need help in determining how to proceed.

     

    In reference to my original problem of the NDK stack stall when sending UDP frames to a nonexistent IP address at a high frequency, below is the code I am going to add to lliout.c, inserted at approximately line 156, immediately above the PBMQ_enq call.

     

     

    /////////////////////////////////// STACK STALL FIX ADDITION //////////////////////////////////////////////////////////////
    /*
     * additional fix to SDOCM00088612:
     * if this is the first or only packet, then clear the
     * ARP PACKET QUEUE. Note that the test is for FRAGMENT OFFSET
     * being zero -- which is the case for either the first fragment
     * of a fragmented packet or an unfragmented packet. Once the
     * ARP QUEUE is empty, the new first/only packet is added. If
     * additional fragments follow, they will also be added to the
     * ARP QUEUE
     */
    if ((HNC16(((IPHDR *)(pPkt2->pDataBuffer + pPkt2->DataOffset))->FlagOff) & 0x1fff) == 0)
    {
        // a FIRST/ONLY packet is being processed and ARP is required -- so if there are
        // other packets on the ARP QUEUE then return them to the pool and put the latest
        // packet(s) on the ARP QUEUE
        while ((pCurPkt = PBMQ_deq(&(plli->ArpPktQ))) != NULL)
        {
            PBM_free(pCurPkt);
        }
    }
    /////////////////////////////////// end of ADDITION //////////////////////////////////////////////////////////////////////
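The queue policy in the addition above can be exercised outside the NDK with a small stand-in. This is a minimal sketch under stated assumptions: `MockPkt`, `MockQ`, `MAX_Q`, and `mock_enqueue_for_arp` are hypothetical names, a counter stands in for `PBM_free`, and the fragment field is held in host byte order so no `HNC16` conversion is needed.

```c
#include <stddef.h>
#include <stdint.h>

#define MAX_Q 16  /* arbitrary capacity for the mock queue */

/* Hypothetical stand-in for a packet buffer: only the IP flags/fragment
 * offset field (host byte order) and an id for inspection. */
typedef struct {
    uint16_t frag_off;
    int      id;
} MockPkt;

/* Hypothetical stand-in for the per-entry ARP packet queue. */
typedef struct {
    MockPkt *items[MAX_Q];
    int      count;
    int      freed;  /* counts buffers "returned to the pool" */
} MockQ;

/* The low 13 bits hold the fragment offset; zero means this is either
 * an unfragmented packet or the first fragment of a datagram. */
static int is_first_or_only(const MockPkt *p)
{
    return (p->frag_off & 0x1fff) == 0;
}

/* Queue a packet that is waiting for address resolution. A first/only
 * packet flushes whatever was queued before it, so the queue never holds
 * more than one complete frame. Returns 1 if queued, 0 if discarded. */
int mock_enqueue_for_arp(MockQ *q, MockPkt *p)
{
    if (is_first_or_only(p)) {
        q->freed += q->count;  /* stands in for PBM_free() on each packet */
        q->count = 0;
    }
    if (q->count == MAX_Q) {   /* mock queue full: discard the newcomer */
        q->freed++;
        return 0;
    }
    q->items[q->count++] = p;
    return 1;
}
```

Queuing three unfragmented packets in a row leaves only the last one on the queue, while a first fragment (More Fragments flag set, offset zero) followed by a later fragment leaves both queued. The trade-off is that older frames for the unresolved address are discarded, which is in line with the stated intent of keeping only the last UDP frame.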

     

    Thanks for your help on this,

     

    Mark

  • Garrett,

    Please disregard the rebuild issue. I just downloaded a fresh copy of NDK 2.24.1.18 and compared it with my working copy. I found that a CHANGE had been MADE to an include in the working copy, and that change caused the compile errors I was experiencing.

    I will be testing the code change this afternoon -- but would appreciate any comments on the viability of the change.

    Thanks,

    Mark