Missing UDP receive packets (broadcast)



We are implementing a media application where we receive an MPEG transport stream and demultiplex the MP3 audio streams contained within.

Audio extraction is flawless for every constant bit-rate transport stream we have tried. However, when a particular variable bit-rate stream (varying between about 2 and 12 Mbit/s) is sent to our unit, the diagnostic data indicates that some UDP packets are missing. Wireshark I/O graphs show a variable bit-rate with some pronounced peaks.

What is interesting is that tests with higher constant bitrate transport streams up to about 20Mbit/s do not generate problems.

I wonder (but cannot yet prove) whether this is a timing problem in which a burst of UDP packets arrives and some of them are missed. I have implemented the checks described in SPRU523G section 3.4.1 for ensuring that there are enough UDP receive buffers.

The problem still persists.

The problem does not appear to be with the transport stream itself, because a VLC player running on another computer connected to the same closed network plays it OK. I have also performed a "follow UDP stream" in Wireshark, saved it, and then played that in VLC player: the result was OK. Therefore, what is on the wire as seen by Wireshark and what is seen by another PC and played by VLC player are both OK.

 

Is it possible for UDP packets to be missed by the NDK if they all arrive very quickly (i.e. in bursts), even if there is enough buffering? I have looked at the Wireshark traces and it appears that some packets arrive within about 6 µs of each other. However, streams that work correctly show arrivals as close as about 8 µs apart in the worst case, so this may not be significant.

I've looked at EMAC_RxServiceCheck in the csl_emac module and wonder whether, while it is processing the interrupt for one packet, the interrupt for another packet (or more than one) could be missed. Or is that not possible?

We're running a C6457 at just under 1GHz with a Vitesse 8221 PHY

 

Any ideas?

  • Do you have a telnet server running in your program?  If you do, can you telnet into the board and use the following command to get some statistical data?

     

    telnet <C6457's IP address>

    >stat udp

    and

    >stat ip

    This command will display some UDP info such as the number of packets lost.  For example, I'm running the NDK client example and I see the following:

    >stat udp

    UDP Statistics:
      RcvTotal       = 0000271291
      RcvShort       = 0000000000
      RcvBadLen      = 0000000000
      RcvBadSum      = 0000000000
      RcvFull        = 0000000000
      RcvNoPort      = 0000000006
      RcvNoPortB     = 0000271242
      SndTotal       = 0000000043
      SndNoPacket    = 0000000000

    RAW Statistics:
      RcvTotal       = 0000172428
      RcvFull        = 0000000000
      SndTotal       = 0000000000
      SndNoPacket    = 0000000000
    >

    >stat ip

    IP Statistics:
       Total         = 0000444508     Odropped      = 0000000000
       Badsum        = 0000000000     Badhlen       = 0000000000
       Badlen        = 0000000000     Badoptions    = 0000000000
       Badvers       = 0000000000     Forward       = 0000000000
       Noproto       = 0000000000     Delivered     = 0000444427
       Cantforward   = 0000000000     CantforwardBA = 0000000000
       Expired       = 0000000000     Redirectsent  = 0000000000
       Localout      = 0000001046     Localnoroute  = 0000000000
       CacheHit      = 0000000114     CacheMiss     = 0000000058
       Fragments     = 0000000081     Fragdropped   = 0000000001
       Fragtimeout   = 0000000001     Reassembled   = 0000000040
       Ofragments    = 0000000080     Fragmented    = 0000000040
       Cantfrag      = 0000000000     Filtered      = 0000000000
    >

     

    It would be useful if you could run these commands

    A. Before the UDP transfer starts

    B. After the UDP transfer is complete

    If A. and B. were done for both the fail case (variable bit rate stream) and the success case (constant bit rate stream), I would like to see the differences in the stats that this command shows.

  • I've implemented a telnet console and added the stat ip/udp commands since your answer above. I have obtained the following statistics:

    FAIL CASE

    Software started and no media running

    >stat ip

    IP Statistics:
       Total         = 0000000072     Odropped      = 0000000000
       Badsum        = 0000000000     Badhlen       = 0000000000
       Badlen        = 0000000000     Badoptions    = 0000000000
       Badvers       = 0000000000     Forward       = 0000000000
       Noproto       = 0000000000     Delivered     = 0000000072
       Cantforward   = 0000000000     CantforwardBA = 0000000000
       Expired       = 0000000000     Redirectsent  = 0000000000
       Localout      = 0000000071     Localnoroute  = 0000000000
       CacheHit      = 0000000010     CacheMiss     = 0000000001
       Fragments     = 0000000000     Fragdropped   = 0000000000
       Fragtimeout   = 0000000000     Reassembled   = 0000000000
       Ofragments    = 0000000000     Fragmented    = 0000000000
       Cantfrag      = 0000000000     Filtered      = 0000000000
    >stat udp

    UDP Statistics:
      RcvTotal       = 0000000000
      RcvShort       = 0000000000
      RcvBadLen      = 0000000000
      RcvBadSum      = 0000000000
      RcvFull        = 0000000000
      RcvNoPort      = 0000000000
      RcvNoPortB     = 0000000000
      SndTotal       = 0000000000
      SndNoPacket    = 0000000000

    RAW Statistics:
      RcvTotal       = 0000000000
      RcvFull        = 0000000000
      SndTotal       = 0000000000
      SndNoPacket    = 0000000000

    Media streamed from customer hardware

    >stat ip

    IP Statistics:
       Total         = 0000358303     Odropped      = 0000000000
       Badsum        = 0000000000     Badhlen       = 0000000000
       Badlen        = 0000000000     Badoptions    = 0000000000
       Badvers       = 0000000000     Forward       = 0000000000
       Noproto       = 0000000000     Delivered     = 0000358303
       Cantforward   = 0000000000     CantforwardBA = 0000000000
       Expired       = 0000000000     Redirectsent  = 0000000000
       Localout      = 0000001287     Localnoroute  = 0000000000
       CacheHit      = 0000000238     CacheMiss     = 0000000004
       Fragments     = 0000000000     Fragdropped   = 0000000000
       Fragtimeout   = 0000000000     Reassembled   = 0000000000
       Ofragments    = 0000000000     Fragmented    = 0000000000
       Cantfrag      = 0000000000     Filtered      = 0000000000
    >stat udp

    UDP Statistics:
      RcvTotal       = 0000357016
      RcvShort       = 0000000000
      RcvBadLen      = 0000000000
      RcvBadSum      = 0000000000
      RcvFull        = 0000022518
      RcvNoPort      = 0000000000
      RcvNoPortB     = 0000000002
      SndTotal       = 0000000000
      SndNoPacket    = 0000000000

    RAW Statistics:
      RcvTotal       = 0000000000
      RcvFull        = 0000000000
      SndTotal       = 0000000000
      SndNoPacket    = 0000000000
    >

    PASS CASE

    Software started again and no media running

    >stat ip

    IP Statistics:
       Total         = 0000000037     Odropped      = 0000000000
       Badsum        = 0000000000     Badhlen       = 0000000000
       Badlen        = 0000000000     Badoptions    = 0000000000
       Badvers       = 0000000000     Forward       = 0000000000
       Noproto       = 0000000000     Delivered     = 0000000037
       Cantforward   = 0000000000     CantforwardBA = 0000000000
       Expired       = 0000000000     Redirectsent  = 0000000000
       Localout      = 0000000036     Localnoroute  = 0000000000
       CacheHit      = 0000000003     CacheMiss     = 0000000001
       Fragments     = 0000000000     Fragdropped   = 0000000000
       Fragtimeout   = 0000000000     Reassembled   = 0000000000
       Ofragments    = 0000000000     Fragmented    = 0000000000
       Cantfrag      = 0000000000     Filtered      = 0000000000
    >stat udp

    UDP Statistics:
      RcvTotal       = 0000000000
      RcvShort       = 0000000000
      RcvBadLen      = 0000000000
      RcvBadSum      = 0000000000
      RcvFull        = 0000000000
      RcvNoPort      = 0000000000
      RcvNoPortB     = 0000000000
      SndTotal       = 0000000000
      SndNoPacket    = 0000000000

    RAW Statistics:
      RcvTotal       = 0000000000
      RcvFull        = 0000000000
      SndTotal       = 0000000000
      SndNoPacket    = 0000000000
    >

    Media streamed from VLC player on PC

    >stat ip

    IP Statistics:
       Total         = 0000354436     Odropped      = 0000000000
       Badsum        = 0000000000     Badhlen       = 0000000000
       Badlen        = 0000000000     Badoptions    = 0000000000
       Badvers       = 0000000000     Forward       = 0000000000
       Noproto       = 0000000000     Delivered     = 0000354436
       Cantforward   = 0000000000     CantforwardBA = 0000000000
       Expired       = 0000000000     Redirectsent  = 0000000000
       Localout      = 0000001213     Localnoroute  = 0000000000
       CacheHit      = 0000000258     CacheMiss     = 0000000006
       Fragments     = 0000000000     Fragdropped   = 0000000000
       Fragtimeout   = 0000000000     Reassembled   = 0000000000
       Ofragments    = 0000000000     Fragmented    = 0000000000
       Cantfrag      = 0000000000     Filtered      = 0000000000
    >stat udp

    UDP Statistics:
      RcvTotal       = 0000353223
      RcvShort       = 0000000000
      RcvBadLen      = 0000000000
      RcvBadSum      = 0000000000
      RcvFull        = 0000000008
      RcvNoPort      = 0000000000
      RcvNoPortB     = 0000000004
      SndTotal       = 0000000000
      SndNoPacket    = 0000000000

    RAW Statistics:
      RcvTotal       = 0000000000
      RcvFull        = 0000000000
      SndTotal       = 0000000000
      SndNoPacket    = 0000000000
    >

    I would not expect the number of packets to be the same, because the customer hardware and the VLC player might be doing different things with the media I used for the latest test (a transport stream) before it is streamed.

    However, I have noticed a significant difference: The UDP statistics show the RcvFull has a large number for the fail case and a small number for the pass case.

    I expect that this is the cause. I'll study spru523g.pdf section 3.4.1 (and 5.3.1) again ...
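A rough sketch of the before/after comparison, using the two counters that differ. The struct and function names are hypothetical (they are not NDK types), the values are copied by hand from the console output above, and the ratio assumes that RcvTotal includes datagrams that were later counted as RcvFull, which the thread does not confirm.

```c
#include <stdint.h>

/* Subset of the 'stat udp' counters, copied by hand from the console.
 * Hypothetical helper type, not part of the NDK. */
typedef struct {
    uint32_t rcv_total;  /* RcvTotal */
    uint32_t rcv_full;   /* RcvFull  */
} UdpSnapshot;

/* Datagrams counted as RcvFull between two snapshots. */
uint32_t rcv_full_delta(UdpSnapshot before, UdpSnapshot after)
{
    return after.rcv_full - before.rcv_full;
}

/* Fraction of datagrams received in the interval that were counted as
 * RcvFull. Assumes RcvTotal includes the rejected datagrams. */
double rcv_full_ratio(UdpSnapshot before, UdpSnapshot after)
{
    uint32_t total = after.rcv_total - before.rcv_total;
    if (total == 0) {
        return 0.0;  /* nothing received, avoid dividing by zero */
    }
    return (double)rcv_full_delta(before, after) / (double)total;
}
```

With the numbers above, the fail case gives 22518 / 357016, roughly 6.3%, while the pass case gives 8 / 353223, roughly 0.002%.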

  • Performed a few more tests after inspecting spru523g.pdf section 3.4.1 "UDP application drops packets on recv() calls."

    • Increasing PKT_NUM_FRAMEBUF from 192 to 384 in pbm.c (after adding pbm.c to my build) made no significant difference. Increasing to more than this does not link because sections cannot be placed.
    • CFGITEM_IP_SOCKUDPRXLIMIT has always been set to 8192 and should not need to be increased. I'm only inputting a 4 Mbit/s stream, which has about 400 UDP packets per second.
    • Changed NC_SystemOpen from NC_PRIORITY_LOW to NC_PRIORITY_HIGH, which made no significant difference, so this does not look like a scheduling issue.
    • Increased PKT_MAX in ethdriver.c from 64 to 128 which again made no significant difference.
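The "about 400 UDP packets per second" figure can be reproduced with a one-line calculation. This is only a sketch: the function name is hypothetical, and the 1316-byte payload (7 transport stream packets of 188 bytes per datagram) is an assumption about how the stream is packed, not something stated in this thread.

```c
/* Average datagrams per second for a stream of bitrate_bps carried in
 * fixed-size UDP payloads. payload_bytes is an assumption supplied by
 * the caller, e.g. 7 * 188 = 1316 for MPEG-TS over UDP. */
double udp_packets_per_second(double bitrate_bps, double payload_bytes)
{
    return bitrate_bps / 8.0 / payload_bytes;
}
```

For 4 Mbit/s and 1316-byte payloads this gives about 380 packets per second. It is an average only; it says nothing about how tightly the packets are bunched within a burst.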

    I have followed what I believe to be all the recommendations outlined in spru523g and I am still losing data according to the RcvFull counter in the Telnet 'stat udp' command.

    The customer hardware broadcasting to ours always results in high RcvFull values but VLC player running on a PC receiving the same stream plays it OK and results in low RcvFull values. Also, using VLC player running on a PC broadcasting to our hardware works OK.

    In summary:

    • Customer hardware -> our hardware: RcvFull is HIGH, memory capture on our hardware shows corrupted media
    • Customer hardware -> Ethernet switch -> our hardware: RcvFull is HIGH, memory capture on our hardware shows corrupted media
    • Customer hardware -> PC VLC (open network...): VLC plays media OK
    • Customer hardware -> Ethernet switch -> PC VLC (open network...): VLC plays media OK
    • PC VLC (streaming) -> our hardware: RcvFull is LOW
    • PC VLC (streaming) -> Ethernet switch -> our hardware: RcvFull is LOW

    I cannot blame the customer hardware, because it works when sending to VLC on a PC, both directly and through a 'cheap' Ethernet switch. Meanwhile, 'stat udp' on the Telnet console of the C6457 clearly shows high RcvFull values whenever there are media problems.

    In order to attempt to see where UDP packets may be dropped I placed a breakpoint on 'memory_squeeze_error++' in ethdriver.c RxPacket function: This point would be reached if PBM_alloc fails and is described as "Increment the statistics to account for packets dropped because of memory squeeze". This breakpoint is NOT reached indicating that UDP packets are not being dropped in RxPacket.

    So where are they being dropped, and any idea what's going on here?

    I'm guessing that the packets are being dropped somewhere between the call to NIMUReceivePacket in nimu.c and the point where the data appears at the application socket layer. If (according to the description of RcvFull in udpif.h) the packets are being rejected because the receive socket queue is full, how can I increase the size of the queue for the socket? I have tried increasing the stack size from OS_TASKSTKNORM to OS_TASKSTKHIGH, and even using a value of 65536, in the DaemonNew call where dtask_udp is created, but with no improvement.

    I have measured the latency of the dtask_udp (i.e. how fast the system gets back to it after processing other tasks in the software) and it can go up to 90ms.

    So, if the RcvFull problem is caused by the socket queue filling up, how do I increase the size of the socket queue?

     

     

  • Just had a breakthrough ...

          CfgAddEntry( hCfg, CFGTAG_IP, CFGITEM_IP_SOCKUDPRXLIMIT,
                       CFG_ADDMODE_UNIQUE, sizeof(uint), (UINT8 *)&rc, 0 );

    Increased rc from 8192 to 65536. Now RcvFull is zero for the worst stream that I have. I figure that for a 4 Mbit/s stream where the task may not be serviced for up to 90 ms, that is the size I need.
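The reasoning behind that figure can be sketched as a small calculation. The function name is hypothetical, it counts stream bytes only (no per-packet overhead), and it assumes the receive limit is measured in bytes of queued data, which is how the numbers in this thread appear to be used.

```c
#include <stdint.h>

/* Bytes that arrive while the receiving task is not being serviced.
 * Counts stream data only; any per-packet overhead is ignored. */
uint32_t backlog_bytes(uint32_t bitrate_bps, uint32_t latency_ms)
{
    /* 64-bit intermediate so the product cannot overflow 32 bits. */
    uint64_t bits = (uint64_t)bitrate_bps * latency_ms / 1000u;
    return (uint32_t)(bits / 8u);
}
```

For 4 Mbit/s and 90 ms this gives 45,000 bytes, which is well above the old limit of 8192 and fits inside 65536. The same formula at a sustained 12 Mbit/s would give 135,000 bytes, so a stream that stays at its peak for a full service interval might need more headroom.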

  • Hi Rob Beck

    I have encountered a problem.

    I am working on a GigE camera with a DM648, and the RcvFull statistics are high.

    I have already increased rc from 8192 to 65536. It doesn't change the situation.

    Do you have any suggestions?

     

    thanks

  • Look at your packet arrival rate (packets per second) and ensure that your UDP buffer is large enough to hold the packets that arrive before your application gets to them. For example, if you have 100,000 packets/s and you do socket reads every 100 ms, reading until the call blocks so that you know the UDP RX queue is empty, then you would need room for at least 100,000 * 0.1 = 10,000 packets.

    Try increasing it until RcvFull stays at zero, and look at how you are reading from the UDP socket. I read until the call blocks to ensure the queue is empty each time.