Bad fragmented UDP message transmission

JPH-FR

Hello,

I have a problem when sending fragmented UDP messages using the NDK (2.23) on C6678 EVM, when a MAC address resolution has to be done.
The situation is the following:
   - I want to send one UDP message that must be split in 3 IP frames.
   - When trying to send the first IP packet, the NDK stack detects that a MAC address resolution must be done.
      * the IP packet is attached to the lli
      * the ARP is sent
   - Then the stack tries to send the second IP packet. As the ARP response has not been received yet, this packet is not sent and is attached to the lli.
   The problem is that the previously attached packet is discarded (file nkd/packages/ti/ndk/stack/lli/lliout.c, line 357).
   - When the stack tries to send the third packet, the same thing occurs: the second packet is discarded.
   - finally, when the ARP response is received, the packet attached to the lli is sent (file nkd/packages/ti/ndk/stack/lli/lliin.c).
   But only one packet is attached. The 2 other packets are lost.
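
The sequence above can be modelled in a few lines of C. This is only a sketch, assuming that each ARP entry holds a single pending packet as the analysis suggests; `ArpEntryModel`, `model_tx()` and the other names are hypothetical and are not taken from the NDK sources.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical model of one ARP (lli) entry with a single pending slot.
 * This is NOT the NDK data structure, only an illustration. */
typedef struct {
    int resolved;    /* 1 once the MAC address is known */
    int pending_id;  /* id of the fragment waiting for ARP, -1 if none */
    int discarded;   /* fragments dropped while waiting for ARP */
    int sent;        /* fragments that reached the wire */
} ArpEntryModel;

static void model_init(ArpEntryModel *e)
{
    assert(e != NULL);
    e->resolved = 0;
    e->pending_id = -1;
    e->discarded = 0;
    e->sent = 0;
}

/* Transmit one fragment through the entry. */
static void model_tx(ArpEntryModel *e, int frag_id)
{
    assert(e != NULL);
    if (e->resolved) {
        e->sent++;
        return;
    }
    if (e->pending_id >= 0) {
        e->discarded++;        /* previously attached fragment is lost */
    }
    e->pending_id = frag_id;   /* only the newest fragment is kept */
}

/* ARP reply received: send whatever is still attached. */
static void model_arp_reply(ArpEntryModel *e)
{
    assert(e != NULL);
    e->resolved = 1;
    if (e->pending_id >= 0) {
        e->sent++;
        e->pending_id = -1;
    }
}

/* Fragments on the wire when all of them are queued before the ARP reply. */
static int model_first_send(int fragments)
{
    ArpEntryModel e;
    model_init(&e);
    for (int i = 0; i < fragments; i++) {
        model_tx(&e, i);
    }
    model_arp_reply(&e);
    return e.sent;
}
```

With three fragments, `model_first_send(3)` returns 1: only the last fragment goes out, which matches the behavior described above.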

   Could you confirm this analysis ?
   Is there a way to send the 3 packets even if an ARP has to be done ?

   Thanks

   JP

over 11 years ago

0 Tom Kopriva over 11 years ago

TI__Mastermind 20480 points

JP,

how are you fragmenting the UDP message? Are you doing it or are you expecting the NDK stack to do it for you?

It's strange that a message can be sent before a MAC address is known. What APIs are you using to send the UDP message?

How were you able to determine your findings? Did you use Wireshark?

0 JPH-FR over 11 years ago in reply to Tom Kopriva

Intellectual 271 points

Tom,

I expect NDK to do the fragmentation for me: I use the "sendto" API with a size of 3000 bytes (so, I expect to have 3 frames at the output).
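
The expected frame count can be checked with a small calculation. This is a sketch that assumes a 1500-byte Ethernet MTU and a 20-byte IPv4 header without options; `udp_fragment_count()` is a hypothetical helper, not an NDK function.

```c
#include <assert.h>

#define IP_HEADER_LEN  20u  /* IPv4 header without options (assumption) */
#define UDP_HEADER_LEN 8u

/* Number of IPv4 fragments needed for one UDP datagram.
 * udp_payload: byte count passed to sendto(); mtu: link MTU in bytes. */
static unsigned udp_fragment_count(unsigned udp_payload, unsigned mtu)
{
    /* The MTU must leave room for at least one 8-byte fragment block. */
    assert(mtu >= IP_HEADER_LEN + 8u);

    /* The IP payload is the UDP header plus the application data. */
    unsigned ip_payload = udp_payload + UDP_HEADER_LEN;

    /* Every fragment except the last carries a multiple of 8 bytes. */
    unsigned per_fragment = ((mtu - IP_HEADER_LEN) / 8u) * 8u;

    /* Ceiling division. */
    return (ip_payload + per_fragment - 1u) / per_fragment;
}
```

Under these assumptions, `udp_fragment_count(3000, 1500)` gives 3 (1480 + 1480 + 48 bytes of IP payload), 2000 bytes gives 2, and 1000 bytes fits in a single frame.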

The message is not sent before the MAC address is known: the last frame of the message is stored until the ARP response is received and is then sent. The first two frames are lost.

I used Wireshark to determine my findings. I have attached a screenshot of the Wireshark capture. The C6678 EVM IP address is 220.1.11.30.

Since my last post I read in spru523H, page 57: "If sending to a new IP address, the very first send may be held up in the ARP layer while the stack determines the MAC address for the packet destination. While in this mode, subsequent sends are discarded." Perhaps this has a relationship with my problem.

Thanks for your help

0 Tom Kopriva over 11 years ago in reply to JPH-FR

TI__Mastermind 20480 points

JP,

I just briefly looked into the NDK code. The sendto doesn't fragment data. Are you checking the return of sendto for any errors (-1)? If there is an error, what does fdError() return?

JP said:
Since my last post I read in spru523H, page 57: "If sending to a new IP address, the very first send may be held up in the ARP layer while the stack determines the MAC address for the packet destination. While in this mode, subsequent sends are discarded." Perhaps this has a relationship with my problem.

Perhaps. I'm curious as to what return code you're getting on the 1st packet of ~1000 bytes. If you just send the 1st packet, do you see the ARP request followed by the UDP packet?

0 JPH-FR over 11 years ago in reply to Tom Kopriva

Intellectual 271 points

Tom,

Thanks for your answer.

Sendto doesn't do the fragmentation because the fragmentation is done later.

For UDP messages, sendto calls UdpOutput(), which calls IPTxPacket(). The fragmentation is done in IPTxPacket(), in the loop beginning at line 344 of file ipout.c.

When calling sendto with a length of 3000, the return value is 3000 and fdError() returns 0. In that case, the problem described before occurs.

When calling sendto with a length of 1000, the return value is 1000 and fdError() returns 0. In that case, I see the ARP packet followed by the UDP packet.

Thanks

JP.

0 Steven Connell over 11 years ago in reply to JPH-FR

TI__Mastermind 45025 points

Hi JP,

I'm looking into this issue and trying to reproduce it. I'll get back to you on this forum thread as soon as I have more details or questions.

Steve

0 dzhou over 11 years ago in reply to Steven Connell

TI__Genius 9065 points

JP,

In order for us to quickly reproduce your issue, could you please let me know:

1) The version of BIOS MCSDK you are using

2) hw version of the C6678 EVM

3) Can you attach your CCS project which shows the failure? You might want to cut unrelated stuff.

Thanks!

regards,

David

0 JPH-FR over 11 years ago in reply to dzhou

Intellectual 271 points

Hi David,

Please find the following information in order to reproduce the issue.

1) I use Bios MCSDK 2.1.1.4

2) The HW revision of the EVM C6678 is 1.0

3) Attached is a CCS project which shows the failure. I just have to run the executable on core 0. No other software is needed to answer the UDP messages. I also attach a screen capture of the analyzer window, in which you can see that only one UDP fragment is sent for a message of 2000 bytes.

Thanks,

udp_test_fragmentation.zip

0 JPH-FR over 11 years ago in reply to JPH-FR

Intellectual 271 points

Hi David,

Attached is the screen capture I was talking about before.

Thanks,

0 lding over 11 years ago in reply to JPH-FR

TI__Guru* 95265 points

JP,

Thanks for providing the project; I am trying to reproduce the issue. I only changed the local IP address and the destination IP and MAC addresses, and I got:

QMSS successfully initialized

CPPI successfully initialized

PA successfully initialized

PASS successfully initialized

Ethernet subsystem successfully initialized

Ethernet eventId : 48 and vectId (Interrupt) : 7

Registration of the EMAC Successful, waiting for link up ..

Network Added: If-1:158.218.109.89

Socket successfully created socket : -2145036092

Did you see this too? I saw the ARP request and ARP response in Wireshark, but I didn't see any UDP packets coming out of the 6678 EVM, whether I set data_len to 2000 or change it to 1000. In data = sendto( sudp, pBuf, data_len, 0, (PSA)&sin2, sizeof(sin2) ); the "data" returned is 2000 or 1000 respectively.

Attached is the packet log for the data_len = 2000 case. Do you have any comments on using your test project? http://e2e.ti.com/cfs-file.ashx/__key/communityserver-discussions-components-files/639/2335.send_5F00_2000.pcapng

Regards, Eric

0 JPH-FR over 11 years ago in reply to lding

Intellectual 271 points

Eric,

The prints you got are the same as the ones I get.

There is nothing particular to do to have the UDP packets sent.

You must just be careful to let the core run so that the NDK code can execute. If there is a breakpoint just after the "sendto" call to see the "data" value and the code is not resumed after this breakpoint, the data won't be sent (even if the data value is correct).

Regards,

Jean-Philippe.

0 lding over 11 years ago in reply to JPH-FR

TI__Guru* 95265 points

Jean-Philippe,

Thanks! Issue can be reproduced and we will analyze this.

Regards, Eric

0 Steven Connell over 11 years ago in reply to JPH-FR

TI__Mastermind 45025 points

JP,

I was able to reproduce the problem and am looking into the cause.

JP said:
Since my last post I read in spru523H, page 57: "If sending to a new IP address, the very first send may be held up in the ARP layer while the stack determines the MAC address for the packet destination. While in this mode, subsequent sends are discarded." Perhaps this has a relationship with my problem.

I see this too and I think you may be on to something. It may be a bug.

This code shouldn't execute if there is already an ARP mapping for the host you are trying to send UDP packets to.

You can get an ARP table entry for the other host by first pinging it from the NDK/6678 host, before your task that sends UDP packets runs.

Can you try doing that?

You could do this by trapping the "UDP sending function", with an infinite loop, for example:

static volatile int trap_udp_loop = 1;

Void UDP_sender_task(...)
{
    while (trap_udp_loop) {
        Task_sleep(1000);  // set break point here
    }

    ...
    <UDP sockets code>
    sendto(...);
    ...
}

Then, if you have the Telnet console linked into your app, Telnet into the 6678 and then ping out to the other host. This should cause the ARP table to update (if the host responds to the 6678's pings).

Once the pings have been responded to in your telnet window, you should halt the 6678 (at Task_sleep break point) and "untrap" your UDP sending function (set loop variable to 0), and allow it to then run the UDP sockets code with the sendto() call that was problematic.

Can you please give that a try?

Steve

0 JPH-FR over 11 years ago in reply to Steven Connell

Intellectual 271 points

Steve,

Thanks for your investigation.

I'm not able to do the exact manipulation you suggest because the telnet console is not available in my environment.

Instead, I used the LLIAddStaticEntry() function to tie the MAC address of the other host to its IP address. Doing this, the sendto() that was problematic runs OK with a size of 2000 bytes.

This is good information, but in my application, I can't do the association IP@<->MAC@ statically or manually. I need to have a dynamic association.

Do you have a workaround for this situation ?

Thanks,

0 Steven Connell over 11 years ago in reply to JPH-FR

TI__Mastermind 45025 points

JP said:
Instead, I used the LLIAddStaticEntry() function to tie the MAC address of the other host to its IP address. Doing this, the sendto() that was problematic runs OK with a size of 2000 bytes.

This is good news. No worries about not using the Telnet console, this was "just an easy way" for you to be able to try the experiment. Adding the static entry should be OK, too.

JP said:
This is good information, but in my application, I can't do the association IP@<->MAC@ statically or manually. I need to have a dynamic association.

What I think you can do in this case is run a ping from your app code to the other host, in order to "prime it" before doing the real work, the UDP transfers. The ping command will be sent to the host's IP address, but since there's no ARP mapping for the other host, the NDK will be forced to use ARP to find the host's MAC address.

So by the time the ping goes out, the ARP table will have been updated, which should have the same effect of the code you added in the above experiment. Then your UDP send should not see dropped packets, since it won't hit that code you found that's dropping them in the ARP layer.

Lastly, the NDK contains example code for sending a ping. It's just the Telnet console's code for the ping command. I recommend making a copy of it in your app. You would need to modify it to get rid of the "ConsolePrintf" calls, though. Also, that ping command will send something like 5 ping packets. You would just need to change it to send a single ping.

You can find the ping example code in:

ti/ndk/tools/console/conping.c

Steve

0 Steven Connell over 11 years ago in reply to Steven Connell

TI__Mastermind 45025 points

JP,

For the record, this issue is being tracked by the following bug:

SDOCM00088612 ARP Resolution Queue Len is 1

Steve
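
The bug title suggests that the pending queue of an ARP entry is one packet deep. As an illustration only (this is not the NDK fix, and `delivered_after_arp()`, `MAX_DEPTH` and the drop-oldest policy are assumptions made for the sketch), the following models how many fragments survive for a given queue depth:

```c
#include <assert.h>

#define MAX_DEPTH 8  /* arbitrary upper bound for this sketch */

/* Returns how many of `fragments` are delivered once ARP resolves, when
 * the pending queue holds at most `depth` packets and the oldest packet
 * is dropped on overflow. All fragments are queued before the ARP reply. */
static int delivered_after_arp(int fragments, int depth)
{
    assert(fragments >= 0);
    assert(depth >= 1 && depth <= MAX_DEPTH);

    int queue[MAX_DEPTH];  /* ring buffer of fragment ids */
    int head = 0;
    int count = 0;

    for (int id = 0; id < fragments; id++) {
        if (count == depth) {
            head = (head + 1) % depth;  /* drop the oldest fragment */
            count--;
        }
        queue[(head + count) % depth] = id;
        count++;
    }

    /* ARP reply: flush the queue in order. */
    int delivered = 0;
    while (count > 0) {
        (void)queue[head];              /* the fragment would be sent here */
        head = (head + 1) % depth;
        count--;
        delivered++;
    }
    return delivered;
}
```

In this model `delivered_after_arp(3, 1)` returns 1, while a depth of 3 or more delivers all three fragments. The receiver can only reassemble the datagram when every fragment arrives, so the depth would have to cover the largest fragmented message.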

0 JPH-FR over 11 years ago in reply to Steven Connell

Intellectual 271 points

Steve,

Thanks for this detailed workaround proposal.

This is a way to be able to send a first fragmented UDP message.

My next question is what happens when the ARP entry has to be validated again because the associated timeout expires.

In spru524H, §A.7.1, it is said that the revalidation is done before the ARP entry expires. This should ensure that new UDP fragment errors are prevented. But this case is difficult to validate by experimentation.

If you have any information concerning this point, it would be appreciated.

Thanks

0 Steven Connell over 11 years ago in reply to JPH-FR

TI__Mastermind 45025 points

JP,

Perhaps you could provide some more details on your use case, as this may help with thinking up a more robust work around for you.

Do you intend to send to multiple devices over the lifetime of your application? How many? What's known about these devices? Will they always be the same or possibly changing (devices added, removed from the network)?

I'm wondering if we can do the ping on an initial communication step, extrapolating the MAC address somehow and adding a static ARP entry. Then subsequent sends to a particular host would be found in the ARP table.

But knowing more about your use cases may bring better insights and ideas to the table.

Steve

0 JPH-FR over 11 years ago in reply to Steven Connell

Intellectual 271 points

Steve,

Our first use case is basic: ethernet communication with one distant device present at start-up and not removed.

However, we intend, in the future, to use the C6678 in a product that must be able to fit different Ethernet communication situations.

The present possibilities of the NDK with the ping workaround should be compatible with our first use case. The other use cases will come later and will use a future NDK function in which the bug will perhaps be corrected.

Thanks for your help understanding this situation,

JP.
