This thread has been locked.

If you have a related question, please click the "Ask a related question" button in the top right corner. The newly created question will be automatically linked to this question.

UDP packets go missing at the NDK stack when MCSA is enabled.

My NDK is up on the C6670 EVM, a UDP socket is listening for packets, and MCSA is also enabled.

I can clearly see the logs in MCSA for the tasks running in the system, as well as for the dchild() and daemon() tasks of the NDK stack.

Now I keep sending packets from a different machine every 1 ms to the C6670 EVM over the Ethernet interface, to be received at the NDK stack.

I also run Wireshark on another machine, trying to capture both the packets sent by MCSA and the packets received by the NDK stack at the 1 ms rate.

After some time, I notice that Wireshark shows the packets being sent, but they are not received by the UDP socket on the C6670 EVM.

I wanted to know whether any changes need to be made to the PDSP interrupt configuration, to interrupt the core for the Packet Accelerator subsystem queues used in sending and receiving Ethernet packets.

  • I wanted to clarify your setup a bit.

    From your standpoint, System Analyzer is working properly (i.e. it is getting the UDP packets that contain log data from the C6670), correct?

    The problem is on the receiving side on the C6670 after some period of time. How long does it run before the failures start to occur?

    Todd

  • You are right,

    The System Analyzer is working properly, and I can see the execution graphs well over there.

    The period before the failure is not definite, but it crashes within 2 minutes of running. I'm sending frames of 560 bytes every 1 ms to the C6670 NDK application.

    It aborts because I've placed a check for proper packet reception, using an identification ID in each packet to detect packet loss at the C6670 NDK application.

    This occurs only when MCSA is running; if I switch off MCSA, the C6670 NDK application runs smoothly without any packet loss.

    Regards

    Santosh

  • Santosh,

    When the C6670 receives the UDP packets, does it send anything back? Are there any other signs of failure (e.g. console output) when the problem occurs? Or is it just that the receiver detects a missing packet via an out-of-order ID?

    Todd

    Well, the UDP packets are received by the socket every 1 ms; there are instances where the remote machine pumps 3 packets of approximately 530 bytes each to the C6670 NDK application.

    The C6670 also sends a single 10-byte packet to the remote machine every 1 ms.

    These two operations, sending and receiving, are done from two different BIOS tasks. Receiving is done by the NDK task, and I've created a separate task for sending, but only one socket is used for both operations.

    I can see the packets in Wireshark, but the C6670 receiver end shows a miss.

  • What are the priorities of the two tasks doing TX and RX on the one socket? I'm trying to check whether this is supported. Have you tried doing both sending and receiving on the socket from the same task?

    Note: I'm assuming you called fdOpenSession() in both Tasks.

    Todd

  • No, I have used two different tasks to send and receive. Both the sending and receiving tasks have the same priority, i.e. 5.

    Yes, I have called fdOpenSession() in both tasks.

  • Are you using fdShare()? Please refer to this post for sharing a file descriptor: http://e2e.ti.com/support/embedded/bios/f/355/t/120158.aspx#427788
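For reference, the call pattern that post describes looks roughly like this (a sketch only; the task body and names other than fdOpenSession(), fdCloseSession(), and fdShare() are illustrative, and the exact usage should be checked against the linked post and the NDK user's guide):

```c
/* Sketch: a second BIOS Task using a socket created elsewhere.
 * Each Task that uses NDK file descriptors needs its own session,
 * and a socket used from more than one Task should be fdShare()'d
 * so its reference count covers every user. */
void sendTask(SOCKET s)
{
    fdOpenSession(TaskSelf());   /* per-Task file-descriptor session */
    fdShare(s);                  /* add a reference for this Task    */

    /* ... sendto() loop on s ... */

    fdCloseSession(TaskSelf());  /* drops this Task's reference      */
}
```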

    Todd

  • I'm currently working on a Lyretech EVM.

    Here is what I found on further observation and some review of how the NIMU code is implemented.

    I observed that the RX_INT_THRESHOLD value was set to 31, which means the PDSP accumulator waits for 31 IP packets to arrive from the PASS before interrupting the core for them, OR it interrupts the core when the timerLoadCount, which is set to 40, expires (40 × 25 µs per PDSP timer tick = 1 ms).

    This is described in the Multicore Navigator document "sprugr9d.pdf", Table 4-51, Command Buffer Field Descriptions.

    After I changed RX_INT_THRESHOLD to "1", so that the PDSP interrupts the core immediately when even one packet arrives from the PASS, I saw that there was no drift and the packets were received at the proper intervals.

    However, with this change I observed that some packets were missed/lost while the application ran, and this led to errors in the program.

    Then I changed the value to "3", since I observed that I was pumping packets in such a way that at most 3 packets were being sent simultaneously to the NDK and NIMU.

    With these changes the application ran perfectly, with no drift in receiving packets and no packet loss.

    But what I observe now is that after some time, the sendto() function fails to send packets out through the NIMU driver.

    So that is the stage I'm at now. I need a proper *FIX* for how to set the accumulator register configuration so that packets are appropriately received and sent by the NIMU driver.

    Please help me sort this out.

    I have tested all this on Lyretech EVM.

  • I don't know if it's relevant, but we had a similar issue where we were sending UDP packets every 1 ms (1 kHz) and saw random dropped packets on the order of hours (sometimes a day or two).  They appeared in Wireshark but were never received.

    There were a number of separate issues involved which TI eventually untangled; the final one was that the particular PC sending the packets managed, on rare occasions, to lose packets in the Ethernet driver.  That is, Wireshark saw the packet but it never actually hit the wire.  So be slightly wary of believing Wireshark here, it might (possibly) not be the DSP's fault.

    - Gordon

  • Thanks Gordon for the info.

    But I would like some of the NIMU and NDK experts here on E2E to kindly help me with configuring the PDSP accumulator, which manages moving the received packets from the PASS to the core through the Multicore Navigator. I believe the fix lies in configuring the PDSP accumulator appropriately, so that I receive all the packets and none of them are dropped.

    Please help me out with this issue.

  • Please look into this; there has been no response for a couple of days.

    Regards
    Santosh

  • Can any of the NDK and NIMU driver experts help me out with this problem?

    I've observed that some packets are missed and not captured by the NIMU driver, maybe due to improper configuration of the PDSP accumulator.

    Have any such bugs been found in the NIMU driver for the C6670 EVM, and are they fixed in the latest release of the transport module in the PDK?

    Please guide me; it would be very useful.

    Thanks

    Santosh

  • Santosh,

    Let me take a look at this. I will let you know my findings ASAP.

    Thanks,

    Arun.

  • Arun,

    Have you come across any findings? Please let me know.

    This is hampering my testing, since packets get missed at the NIMU because they are not sent across by the PDSP accumulator.

    Regards

    Santosh

  • Santosh,

    Sorry for the delay. Varada from my team is already working on this issue. She should be able to respond to you by tomorrow.

    Regards,

    Bhavin

  • Santosh,

    I also have a few questions:

    1. When you say the packet is dropped and not received by the DSP, do you know whether the packet never even made it to the Navigator accumulator queue, OR it is in the accumulator queue but you are not getting an interrupt for it?

    2. If it is not in the accumulator queue (which is probably your PA Rx queue), can you check the PA statistics to identify whether the PA dropped the packet OR never received it?

    3. If the PA statistics say it did not drop the packet, can you check the statistics and conclude whether the PA received the packet at all? There is a very high chance that if the PA did not drop the packet, it never received it in the first place.

    4. If the PA statistics show it never received the packet, can you check the switch subsystem statistics to verify whether the packet was dropped at the switch (i.e. even before reaching the PA)?

    The switch subsystem and the PA are the only places where a packet can be dropped. If a packet is dropped at either of these two places, the respective statistics will show it, along with the reason for the drop.

    Regards,

    Bhavin

  • Hi Bhavin,

    I see that the packet is received by the PA Rx driver and is also present in the Navigator queue. RX_INT_THRESHOLD is set to "1", meaning that whenever one packet is in the queue, it should hit the ISR and present the packet from the queue to the upper-level application.

    It might be that the packet is not fetched by the PDSP from the Navigator after some time due to some latency, or that the ISR buffer is not large enough, since I saw in some posts that changes have been made to the NIMU ISR that is triggered by the PDSP on every interrupt from the PA Rx driver.

    I want to receive the packets every 1 ms, and I need to know the exact RX_INT_THRESHOLD and timerLoadCount values to set in my NIMU driver to facilitate this in the application.

    I've read some inputs regarding this in the post below:

    http://e2e.ti.com/support/dsp/c6000_multi-core_dsps/f/639/p/134675/502440.aspx#502440

  • Hi Santosh,

    I am looking at all the posts you have related to this issue. On Feb 13, you mentioned that you could resolve the missing-packet issue when you tweaked RX_INT_THRESHOLD = 3, but then there was the other send issue.

    Can you confirm that UDP packets are no longer missing with these settings?

    Also confirm that accCfg.interruptPacingMode = 1 (Qmss_AccPacingMode_LAST_INTERRUPT).

    If the packet header is corrupt, the PDSP inside the PA will drop the packet and update the statistics. If there is other MAC-level corruption (such as CRC or code errors), the gigabit switch will drop the packet and update the switch stats.

    Due to pacing, the interrupt (and hence the packet receive) may be delayed, but not lost altogether, unless you are reaching a high data rate close to ~1 Gbps.

    So can you confirm the rate at which you are sending packets to the DSP?

    For a 1 ms periodic interrupt, here is a suggestion:

    Interrupt Pacing Mode = 1

    Timer load count = 40

    RX_INT_THRESHOLD = 8, instead of the default 4, so that when more packets arrive, the list does not fill up and cause an early interrupt.
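Expressed against the QMSS accumulator command fields mentioned in this thread, that suggestion looks roughly like this (a sketch only; the exact struct and field names, and whether your driver reserves the first list slot for the entry count, depend on your NIMU/QMSS LLD version):

```c
/* Suggested accumulator pacing for a 1 ms periodic interrupt (sketch). */
accCfg.interruptPacingMode = Qmss_AccPacingMode_LAST_INTERRUPT; /* = 1  */
accCfg.timerLoadCount      = 40;      /* 40 ticks x 25 us = 1 ms        */
accCfg.maxPageEntries      = 8 + 1;   /* RX_INT_THRESHOLD = 8, plus one
                                         slot for the list entry count  */
```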



  • Varada,

    We have a system where the packets are received by the TI DSP at regular intervals of less than 1 ms. I need to receive at least one packet within that 1 ms period, since the remote entity is sending packets to the TI DSP at these intervals.

    With the old setting of RX_INT_THRESHOLD = 31, packets were received late at the application level, since the accumulator waited either for the timerLoadCount of 40 (i.e. 1 ms) to expire or for the list of 31 entries to fill. Since the packets sent by the remote entity arrived within that 1 ms span, there was some delay in receiving them at the upper-layer application.

    When this setting was changed to 1, meaning the interrupt triggers and the packet is sent to the upper layer as soon as a single entry fills in the PDSP ISR list, my requirements were met and packets no longer arrived delayed the way they did with the setting of 31.

    All was going smoothly and things were working fine, but when this test was carried on for a longer period of time, I observed that packets sent by the remote entity were captured in Wireshark but never arrived at the upper layer of the application. This suggests they were being missed at the PA-PDSP interface, or that the NIMU Rx ISR was not triggered at the right time to process the packet.

    Since you suggested changing RX_INT_THRESHOLD to "8": this would not work in my test scenario, since I need the packets to reach my upper-layer application within 1 ms, and with that setting the accumulator will wait either for 8 packets to fill the list or for the 1 ms timerLoadCount to expire.

    After reading some posts on E2E, setting RX_INT_THRESHOLD to "1" and the pacing mode to Qmss_AccPacingMode_LAST_INTERRUPT, and replacing the RxISR function with the latest one from MCSDK 2.0.7.19, things seem to be working fine for the moment, and I have not come across the delay in packets.

    I don't know what exactly fixed it, and I need to understand it in detail. There was a post from tscheck on E2E suggesting replacing the NIMU RxISR with the latest NIMU Rx ISR from the latest MCSDK, which worked.

  • Hi Santosh,

     Glad to know things are working on your system now.

    I have confirmed that the changes suggested in e2e post http://e2e.ti.com/support/dsp/c6000_multi-core_dsps/f/639/p/134675/502440.aspx#502440   have been incorporated in the release of BIOS-MCSDK 02_00_08_20.

    About the delay in getting the packet, that is, not getting an interrupt every 1 ms: is it still an issue? This delay can also be due to system loading. I will be talking to the internal systems test team to see if they suggest anything based on their tests.

     Please keep us posted.

  • Varada,

    Had one query: does the latest NIMU driver support jumbo frames, so that I can transmit and receive Ethernet frames above 1500 bytes in a single packet?

    Please let me know about this.

    Regards

    Santosh