64 byte packets Flooding and Burst causing NDK Halting

Other Parts Discussed in Thread: OMAPL138

I am using the DM648 EVM with NDK 2.0, running the helloWorld application that ships with the board. The application is configured in ALE BYPASS/PROMISCUOUS mode. Its throughput is very close to the benchmarks given in the board's documentation.
 
The problem is that when I send a burst of 64-byte packets, the board starts dropping packets abruptly. That is expected and not a problem for me. But when I then increase the number of packets in the burst, the board halts, and I have to power-cycle it before it will transmit or receive packets again. I can live with dropped packets, but not with the board halting and having to be re-powered. This behaviour is also sometimes seen with 128-byte packets. The board works fine with packet sizes above 128 bytes.
 
Can you please help me?
 

  • Mark,

    Mark Depp said:
    when i increase the no. of packets in the burst the board halts

    Where is the PC when this happens?  Are you in the UTL_halt function?

    If so, let's try to back trace.  Can you put a break point into the UTL_halt function?  Does the break point hit that function when this problem occurs?

    If so, once the break point is reached, try to find where UTL_halt was called from.  You can see this by going to the address contained in the register "B3".  If you put the value stored in B3 into the dis/assembly window, you should see where halt was called from.

    It may be another abort function.  If so, you will need to put a break point there, in addition to the UTL_halt one, and then try to reproduce the problem again.

    At this point, once the break point is hit, using the same method to back trace via the contents of B3, you would hopefully see the actual line of code that the abort is happening from.

    Once that is known, it should give some more insight into the cause of the problem.

    Steve

  • Steve

    I have uploaded two snapshots in the previous post.

    I did as you said: I reproduced the problem several times and checked the B3 value each time to find where the PC was.

    In 99% of the cases it was in the C$L1: function, but once it was in the NIMUPacketServiceCheck: function.

    The value of B3 was E00C55E8 every time, except for the one time it pointed to NIMUPacketServiceCheck:, when it was E007DF0A.

    Can you draw any conclusion from this?

    Your help is very much appreciated.

    Thanks

    Regards

    Mark

  • Mark,

    Mark Depp said:
    In 99% of the cases it was in the C$L1: Function

    The "C$L1" is not a function but a label.  If the B3 register is taking you back to C$L1, can you scroll up in the dis/assembly window until you see an actual function name?

    Once we know for sure where you are at in the 99% case, then we can try to pinpoint the place where it's failing and the cause.

    Steve

  • Steve

    I reproduced the problem and yes, you were right, it was in "UTL_halt". However, I tried many times to check the B3 register value for the PC. The functions leading to UTL_halt were SYS_abort, C$RL0, C$RL1 and LOCK. I cannot identify the function causing the problem, because it reaches UTL_halt from C$RL0: but does not stop at a breakpoint set at C$RL0:

    Please tell me what is wrong.

    Mark

  • Mark,

    You are using the standard OMAPL138 board? (i.e. not custom hardware?)

    If so, would it be possible for me to try to reproduce this on my end?  I have an OMAPL138 board at my desk and I think at this point will be easier for me to take over the debug.

    I'm not sure if you have proprietary code in your application or not; if so, then you should not post that here, as these are public forums.  If you do not want to post your project on the forum, then we can share it by becoming "friends" on the forum.  Once we are friends, you can attach files for me to see.

    However, if you are using custom hardware, I won't be able to debug this ...

    Steve

    P.S. I'll send you a friend request now ...

  • Steven

    Thank you so much for your reply. I have accepted your friend request, and would really like to work together on understanding and solving problems such as the one I am stuck on.

    As I said in my first post on this thread, I have the Lyrtech EVM DM648 board, and I am using the standard NDK 2.0 HelloWorld code, with the board working in switch mode.

    You can download the source code from TI Download Site. Link is http://software-dl.ti.com/dsps/dsps_public_sw/sdo_sb/targetcontent/ndk/index.html

     

    Regards

    Mark

  • Hi Mark,

    This is good news.  I just wanted to make sure that the hardware, etc was exactly the same.  This will make it easier for me to reproduce the problem you are seeing.

    In one of your previous posts you said:

    Mark Depp said:
    The application is configured in the ALEBYPASS/PROMISCUOUS MODE.

    So, I think you have modified the helloWorld application?  Is that true?

    If you have modified it, then I will need to know the exact modifications in order to reproduce the issue.  The easiest way to communicate the changes to me would be for you to simply zip up and attach your version of the helloworld example to this post.

    Also, I would need to know how you are sending the 64 byte packet bursts to the DSP?  Is there a Windows or Linux app that you are running to do this?

    Lastly, please let me know how your board is connected (with respect to the network).  Are you plugged into a router?  A switch?

    Steve

  • Steven

    I connect the board directly to an IXIA IxLoad hardware device ( http://www.ixiacom.com/products/ixload/index.php ) over a Gigabit link; there is no switch or router involved. The IxLoad device then sends the 64-byte packet burst to the DSP at full throughput. There is no Windows/Linux app or software involved. I use the standard HelloWorld application of NDK version 2.0; however, I did modify it to configure switch mode/ALE BYPASS mode. Install NDK 2.0 on your system and then configure it for ALE BYPASS mode the way I did. I have sent you a PDF file describing how I did it, so that we are on the same page.

    I hope that you will be able to reproduce the problem, figure out where I am going wrong, and correct me.

    Regards

    Mark

  • Hi Mark,

    Thanks for the pdf file.

    Unfortunately I don't have that hardware available to me.  This will make it hard to reproduce.  Do you know of any other way I could reproduce this 64 byte burst test that you are running?

    Steve

  • Basically, the board halts when I send an excessive number of small packets (64 or 128 bytes) in a single burst. You can reproduce the problem easily after building the project.

    You can use any tool (hardware or software) to send a large number of 64-byte or 128-byte packets to the board. For example, you can create a scenario in which a large number of ARP packets are sent to the board; this will recreate the problem.

    I hope this clarifies the situation.

    Regards

    Mark

  • Steven

    I am waiting for your reply.

    Please give me a yes or no.

  • Hi Mark,

    I apologize for the lack of response.  I'm working on trying to reproduce your problem.  I found a tool that will allow me to send a packet burst.  I haven't tried it yet but it sounds promising.  I will let you know as soon as I have more information.

    Thanks for your patience.

    Steve

  • Mark,

    I've been following the instructions you sent in the pdf.  The instructions are good.  However, the step for making the updates to the csl_emac.c file (in step 5) is not completely clear.

    There appears to be a block in your version of the code that I don't have (see screen shot - there is a curly brace surrounding the code - what does it belong to?  A different if statement?)

    I'm also suspecting that I have a different version of the files than you do.  I suspect that because the line numbers you state in the pdf do not match for the files that I have.

    Can you attach your updated csl_emac.c file?  Or better yet, attach the entire updated helloWorld project?  This would make things go quickest for me.  (I have limited time for each forum question, so the less time spent doing set up, the sooner I can get to debugging the real problem).

    Steve

    P.S.  Below is the code in my version of csl_emac.c ... see, I don't have the block that is shown in your code:

                /*
                 * If this is the last frag, the forward pointer is (void *)0
                 * Otherwise; this desc points to the next frag's desc
                 */
                if (PktFrags == 1)
                    pDescThis->pNext = 0;
                else
                    pDescThis->pNext = pdc->pDescWrite;

                pDescThis->pBuffer   = pPkt->pDataBuffer + pPkt->DataOffset;
                pDescThis->BufOffLen = pPkt->ValidLen;

                if (pPkt->Flags & EMAC_PKT_FLAGS_SOP)
                    pDescThis->PktFlgLen = ((pPkt->Flags&
                                           (EMAC_PKT_FLAGS_SOP|EMAC_PKT_FLAGS_EOP))
                                           |pPkt->PktLength|EMAC_DSC_FLAG_OWNER);
                else
                    pDescThis->PktFlgLen = (pPkt->Flags&EMAC_PKT_FLAGS_EOP)
                                           |EMAC_DSC_FLAG_OWNER;

                /* Enqueue this frag onto the desc queue */
                pqPush(&pdc->DescQueue, pPkt);
                PktFrags--;
            }
        }

  • Mark,

    Another thing.  While Googling around on the web about the ALE bypass, I saw some old forum posts in which some customers were having a lot of problems getting the DM648 into this mode.

    Would you mind if I posted your PDF instructions to the forum so that others can benefit from this?

    Steve

  • Steven

    I have sent you the csl_emac.c file.

    Unfortunately my organization's policy does not allow that PDF file to be used publicly, so I am sorry, but I have to ask you not to share the file on the forum.

    Can you please tell me the name of the tool you plan to use to reproduce the problem? That will let me check at my end whether or not the problem can be reproduced with the tool you selected.

    I hope this will help you reach some conclusion.

    Regards

    Mark

  • Hi Mark,

    Thanks for sending that source file.  I understand about the PDF instructions that you have sent; I won't post your PDF file then.

    The tool is called "Colasoft Packet Builder" (http://www.colasoft.com/packet_builder/).  Please let me know if you see any issue with it (or if you have another recommendation...)

    Thanks,

    Steve

  • Steven

    Thanks, the tool looks like a good one, and I think it will reproduce the problem.

    I hope to hear some good news from you soon.

    Regards

    Mark

  • Hi Mark,

    I have made all of the changes to the helloWorld example as stated in the steps of the PDF file that you sent to me privately.  I can build it, however there is a warning related to ALE code which may be problematic.  Do you see these warnings on your side?  If so I'd recommend fixing that.

        "csl_emac.c", line 1077: warning: integer conversion resulted in a change of sign

    This is coming from here:

        if (lpAleConfig->aleModeFlags & EMAC_CONFIG_ALE_ENABLE)
            aleCtlVal |= CSL_FMK(CPSW3G_ALE_CONTROL_ENABLE_ALE, 1);

    Next I tried running the application but I am unable to get an IP address from DHCP, it gets a fault.

    I'll send you the project so you can have a look at it and make sure I've made the changes correctly.

    Steve

    [C64XP_0]
    [C64XP_0] TCP/IP Stack 'Hello World!' Application
    [C64XP_0]
    [C64XP_0] Using MAC Address: 00-21-ba-12-4a-c1
    [C64XP_0] cpsw_MDIO_Init
    [C64XP_0] SetPhyMode:000001E1 Auto:1, FD10:64, HD10:32, FD100:256, HD100:128, FD1000:0 LPBK:0
    [C64XP_0] cpsw_MDIO_Init
    [C64XP_0] SetPhyMode:000001E1 Auto:1, FD10:64, HD10:32, FD100:256, HD100:128, FD1000:0 LPBK:0
    [C64XP_0]  EMAC should be up and running
    [C64XP_0] EMAC has been started successfully
    [C64XP_0] Registeration of the EMAC Successful
    [C64XP_0] Service Status: DHCPC    : Enabled  :          : 000
    [C64XP_0] Service Status: DHCPC    : Enabled  : Running  : 000
    [C64XP_0] cpsw_MDIO_FindingState: PhyNum: 0
    [C64XP_0] cpsw_MDIO_FindingState: PhyNum: 1
    [C64XP_0] cpsw_MDIO_PhYReset(0)
    [C64XP_0] Enable Phy to negotiate external connection
    [C64XP_0] NWAY Advertising: FullDuplex-100 HalfDuplex-100 FullDuplex-10 HalfDuplex-10
    [C64XP_0] cpsw_MDIO_PhYReset(1)
    [C64XP_0] Enable Phy to negotiate external connection
    [C64XP_0] NWAY Advertising: FullDuplex-100 HalfDuplex-100 FullDuplex-10 HalfDuplex-10
    [C64XP_0] Phy: 1, NegMode 01E1, NWAYadvertise 01E1, NWAYREadvertise CDE1
    [C64XP_0] Negotiated connection: FullDuplex 100 Mbs
    [C64XP_0] Link Status: 100Mb/s Full Duplex on PHY 1
    [C64XP_0] Service Status: DHCPC    : Enabled  : Fault    : 002

  • Steve

    My apologies; the engineer who wrote this code left incomplete documentation, which led to this confusion.

    I have now modified your code to make it the actual application required.

    Please use the project I sent, along with the emac_common.h file, to build, load and run the project. No other change is required at all.

    Also, ignore the warning about the change of sign; it does not matter.

    Kindly reproduce the problem using this code and 64-byte packets.

    Hope to hear from you soon.

    Regards

    Mark

  • Hi Mark,

    Thanks for sending the updated emac_common.h file.  I rebuilt my project using the update and also added a print statement in main() of one of the new macros you added just to double check that I am getting your updated file brought into my build.

    However, I still am not able to run the application successfully; the application gets a DHCP fault (see output below).

    Can you try running the .out file on your hardware setup?  Does it work for you?

    4621.helloWorld.out.txt

    (please rename it to "helloWorld.out")

    Steve

    [C64XP_0]  ------------ EMAC_DSC_FLAG_TOPORT1 = 65536
    [C64XP_0]
    [C64XP_0] TCP/IP Stack 'Hello World!' Application
    [C64XP_0]
    [C64XP_0] Using MAC Address: 00-21-ba-12-4a-c1
    [C64XP_0] cpsw_MDIO_Init
    [C64XP_0] SetPhyMode:000001E1 Auto:1, FD10:64, HD10:32, FD100:256, HD100:128, FD1000:0 LPBK:0
    [C64XP_0] cpsw_MDIO_Init
    [C64XP_0] SetPhyMode:000001E1 Auto:1, FD10:64, HD10:32, FD100:256, HD100:128, FD1000:0 LPBK:0
    [C64XP_0]  EMAC should be up and running
    [C64XP_0] EMAC has been started successfully
    [C64XP_0] Registeration of the EMAC Successful
    [C64XP_0] Service Status: DHCPC    : Enabled  :          : 000
    [C64XP_0] Service Status: DHCPC    : Enabled  : Running  : 000
    [C64XP_0] cpsw_MDIO_FindingState: PhyNum: 0
    [C64XP_0] cpsw_MDIO_FindingState: PhyNum: 1
    [C64XP_0] cpsw_MDIO_PhYReset(0)
    [C64XP_0] Enable Phy to negotiate external connection
    [C64XP_0] NWAY Advertising: FullDuplex-100 HalfDuplex-100 FullDuplex-10 HalfDuplex-10
    [C64XP_0] cpsw_MDIO_PhYReset(1)
    [C64XP_0] Enable Phy to negotiate external connection
    [C64XP_0] NWAY Advertising: FullDuplex-100 HalfDuplex-100 FullDuplex-10 HalfDuplex-10
    [C64XP_0] Phy: 1, NegMode 01E1, NWAYadvertise 01E1, NWAYREadvertise CDE1
    [C64XP_0] Negotiated connection: FullDuplex 100 Mbs
    [C64XP_0] Link Status: 100Mb/s Full Duplex on PHY 1
    [C64XP_0] Service Status: DHCPC    : Enabled  : Fault    : 002

  • Steve

    I can only tell you on Monday, when I am back in the office, whether or not I get the DHCP error on my hardware setup. However, I am sure I took care of this when I modified the project you sent me in a zip file. In your post you only mention using the emac_common.h file I sent. Did you use the project I sent you or not? I will reply in detail about the DHCP behaviour on Monday, once I have tested it on my hardware.

    Regards

    Mark

  • Steve

    I checked the code you sent me. There was actually no issue in the code regarding the NDK; it was a hardware link issue. It was incompatible with either FE or GE.

    However, I have removed the issue now; please recompile and run it.

    Thanks

    Regards

    Mark

  • Please use the RAR file I sent you today.

  • Hi Mark,

    I am getting closer.  I reviewed the changes and recompiled the project that you sent me.

    I ran the app and have been using the Colasoft packet builder tool to send a continuous stream of ARP packets (64bytes in size) to the board.  However, I couldn't see the stack crash.

    Let me review my set up with you to make sure that I've got it set up correctly:

    1. My PC is running the Colasoft packet builder tool

    2. PC is directly connected to the DM648 using a crossover cable

    3. I run the NDK application and get the following output:

    [C64XP_0]
    [C64XP_0] Using MAC Address: 00-21-ba-12-4a-c1
    [C64XP_0] cpsw_MDIO_Init
    [C64XP_0] SetPhyMode:000021E1 Auto:1, FD10:64, HD10:32, FD100:256, HD100:128, FD1000:8192 LPBK:0
    [C64XP_0] cpsw_MDIO_Init
    [C64XP_0] SetPhyMode:000021E1 Auto:1, FD10:64, HD10:32, FD100:256, HD100:128, FD1000:8192 LPBK:0
    [C64XP_0]  EMAC should be up and running
    [C64XP_0] EMAC has been started successfully
    [C64XP_0] Registeration of the EMAC Successful
    [C64XP_0] cpsw_MDIO_FindingState: PhyNum: 0
    [C64XP_0] cpsw_MDIO_FindingState: PhyNum: 1
    [C64XP_0] cpsw_MDIO_PhYReset(0)
    [C64XP_0] Enable Phy to negotiate external connection
    [C64XP_0] NWAY Advertising: FullDuplex-1000 FullDuplex-100 HalfDuplex-100 FullDuplex-10 HalfDuplex-10
    [C64XP_0] cpsw_MDIO_PhYReset(1)
    [C64XP_0] Enable Phy to negotiate external connection
    [C64XP_0] NWAY Advertising: FullDuplex-1000 FullDuplex-100 HalfDuplex-100 FullDuplex-10 HalfDuplex-10
    [C64XP_0] Negotiated connection: FullDuplex 1000 Mbs
    [C64XP_0] Link Status: 1000Mb/s Full Duplex on PHY 1
    [C64XP_0] Link Status: No Link on PHY 1

    4. In the packet builder tool, I construct an ARP packet.  I then send the same packet in an infinitely repeated burst.  I have set the ARP packet's source MAC address to that of my PC.  The destination address is set to the MAC address of the board.  (I also tried setting the destination address to the broadcast address ... screen shots below show some details ...)

    5. I can see the ARP packets showing up on Wireshark

    6. I can see the LED on the physical Ethernet port on the board blink continuously while sending the packets, so I know the packets are getting through.

    7. But, the application doesn't crash.

    Am I missing anything?  Perhaps my setup isn't correct or somehow different than yours.

    Can you try reproducing using the Colasoft tool on your side please?

    Steve

  • OK, fine, this is good. Here is how to reproduce the problem using Colasoft Packet Builder. The information you provided shows that your board is connected to the PC over a GE link. Just send the same ARP packet you are already sending, but change these settings:

    - destination MAC: 00:00:00:00:00:00
    - destination IP: 0.0.0.0
    - source MAC: 00:00:00:00:00:00
    - source IP: the IP address of your system

    When you generate a packet burst with these settings, an IP address conflict will be reported; ignore it, and the application will crash. You can then trace back the cause of the crash.

  • please see this also

  • Hi Mark,

    I've again tried to reproduce the problem using your instructions but still am not having luck.  Please review the below steps that I've taken in trying to reproduce the issue.  Please let me know if I am missing something or if any step is incorrect.

    Steve

    Scenario 1:  PC connected directly to DM648

    - PC
        - running Colasoft and Wireshark
        - connected to DM648 board using a crossover cable (directly, i.e. no switch or router in between).
        - IP address is 146.252.161.73 (obtained from DHCP server before direct crossover connection was made to the DM648 board).

    - DM648 board
        - runs the test application we have been working on (modified hello world).
        - It does not have an IP address.
        - directly connected to the PC via crossover cable.  Crossover cable plugged into Ethernet port labeled "J8".

    a. load and run test application on the DM648 board

    b. run Wireshark

    c. run Colasoft and configure an ARP packet as you describe in the post above (destination MAC 00:00:00:00:00:00, destination IP 0.0.0.0, source MAC 00:00:00:00:00:00, source IP set to the IP address of my system)



    d. send the packet in burst mode in an infinite loop.



    e. I see the ARP packet show up in Wireshark.  I see the LED continuously lit up on the J8 Ethernet port of the DM648 (indicating the data is coming in there).



    f. I do not see the NDK test application crashing.

    Scenario 2: Two PCs and DM648

        For this scenario I am trying to follow the instructions you sent in the attached Word document.

    - PC 1:
        - running Colasoft and Wireshark
        - IP address is 146.252.161.73 (obtained from DHCP server).
        - connected to Netgear 1Gbps switch
        - Netgear switch is connected to the TI network


    - PC 2:
        - running Colasoft
        - connected to DM648 board using a crossover cable into Ethernet port J7 (directly, i.e. no switch or router in between).
        - IP address is 146.252.161.242 (obtained from DHCP server before direct crossover connection was made to the DM648 board).

       
    a. send packets from PC 2 using Colasoft.  I see the LED on Ethernet port J7 lit up indicating that the data is coming into the DM648.



    b. send the packets from PC 1 using Colasoft.  I *do not* see the Ethernet port J8 lit up continuously.  It blinks the same as when normal network traffic is coming in.  Not sure if any of the ARP packets are coming into the board on that connection


    c. Also the NDK application still does not crash.

  • Steve

    I am sorry that I did not reply for a few days; I was busy with some other assignments.

    I have seen your last reply, and what concerns me now is that there must be some difference between my configuration and yours. I request you to give it a final try.

    I am sending you a PPT file with pictorials and instructions that will reproduce the problem.

    Please give it a shot.

    Regards

    Mark

  • Hi Mark,

    I have been on vacation and am just seeing your response.  I can definitely give this another shot and will let you know the result.

    Steve

  • Mark,

    I wanted to give you an update.  I have been trying to reproduce the crash again today following your revised instructions and diagram for h/w set up.

    The good news is that I finally was able to see the crash!  The bad news is that it took me a while, and I was changing different variables in order to make it happen because I didn't see it right away using the Colasoft settings you specified.  I didn't realize that the app had gone into UTL_halt until after I had already tried various settings (I was changing Colasoft settings such as the destination MAC to be broadcast, different IP addresses, etc.  I also changed the h/w setup slightly.)

    So I need to try again and find the exact scenario which caused it.  Once I am able to make it fail consistently, I will be able to debug the failure case.  I have a couple of questions ...

    When you see it, is the failure very consistent?  Or does it only happen sporadically?  Also, how long does it take for you to see the problem after you start the packet bursts?

    Steve

  • Mark,

    I've been working on your problem some more.  I'm still *not* able to reproduce a crash scenario (except for the one occurrence that I saw last week); however, I am able to see the connection slow down and get hung up after sending the packet bursts using Colasoft.  I also see some error messages coming out of the driver about an invalid packet ("EMAC_sendPacket() returned error").

    Before going into further details, the error message gave me a hint to a problem with the DM648 driver.  The driver contains printf() functions within ISR context.  This is not allowed within BIOS and is known to cause programs to abort, as you have reported seeing.  Can you try checking the LOG_system log within CCS at the point you see the program halt?  Do you see any error messages in there?  In any case, I think you should change all of the printf() calls in the driver code (I believe you should have copies of the Ethernet driver C files in your modified helloWorld project) to instead call LOG_printf().  See this for more details: http://e2e.ti.com/support/embedded/bios/f/355/t/68883.aspx
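One general way to keep formatted output out of interrupt context is sketched below. This is an alternative pattern, not the LOG_printf() change suggested above and not code from the DM648 driver: the ISR only stores a small event record in a ring buffer, and a task formats and prints the records later. All names (DrvEvent, DrvEventRing, drv_event_post, drv_event_drain) are hypothetical, and the sketch assumes a single core with exactly one producer (the ISR) and one consumer (a task); a real port may need memory barriers or interrupt masking.

```c
#include <assert.h>
#include <stdint.h>
#include <stdio.h>

#define EVT_RING_SIZE 64u   /* must be a power of two */

/* Hypothetical event record: small enough to store quickly from an ISR. */
typedef struct {
    uint16_t code;   /* e.g. 1 = send error (illustrative value) */
    uint32_t arg;    /* one word of context, e.g. an error value */
} DrvEvent;

typedef struct {
    volatile uint32_t head;      /* written only by the ISR */
    volatile uint32_t tail;      /* written only by the task */
    volatile uint32_t dropped;   /* events lost because the ring was full */
    DrvEvent slots[EVT_RING_SIZE];
} DrvEventRing;

/* Called from interrupt context: no blocking, no formatting, no I/O. */
void drv_event_post(DrvEventRing *r, uint16_t code, uint32_t arg)
{
    uint32_t head = r->head;

    if (head - r->tail >= EVT_RING_SIZE) {   /* ring full: count and drop */
        r->dropped++;
        return;
    }
    r->slots[head & (EVT_RING_SIZE - 1u)].code = code;
    r->slots[head & (EVT_RING_SIZE - 1u)].arg  = arg;
    r->head = head + 1u;                     /* publish after the slot is filled */
}

/* Called from task context: formatting and output happen here.
 * Returns the number of events that were drained. */
uint32_t drv_event_drain(DrvEventRing *r, FILE *out)
{
    uint32_t n = 0u;

    assert((EVT_RING_SIZE & (EVT_RING_SIZE - 1u)) == 0u);
    while (r->tail != r->head) {
        const DrvEvent *e = &r->slots[r->tail & (EVT_RING_SIZE - 1u)];
        fprintf(out, "driver event %u arg 0x%08lx\n",
                (unsigned)e->code, (unsigned long)e->arg);
        r->tail = r->tail + 1u;
        n++;
    }
    return n;
}
```

With this shape, the ISR's cost is a few stores regardless of how many errors occur, and a full ring shows up as a non-zero dropped count instead of a stalled interrupt handler.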

    Now, regarding the hang up, this is what I see.  Once the modified helloWorld application is loaded and running onto the DM648 board, I am then able to get an IP address from PC2 (The PC connected to the DM648 which is in turn connected to a switch that connects to the TI network).

    At this point I can use a web browser and visit various web sites.  I can also ping other machines on the network.

    I then run the colasoft tool, and I see the Windows warning message regarding a duplicate IP address, as expected (I'm using slightly different settings in order to get this problem to happen, I'll paste a screen shot after this).  I only need to run for about 5 seconds, then I stop sending the packet bursts.

    At this point, I can no longer ping from PC2.  Nor can I connect to web pages from within the browser.  But, what I noticed at first was that if I just left everything running for a while, everything would recover, I was able to ping again and connect to the web after 5 minutes or so.

    After looking at various stats in the NDK stack, I see that there isn't really anything going on.  I guess this makes sense, as you've configured the hardware for bypass mode.  I think this causes the network stack to be skipped entirely, and all the data is just going through the driver, is that correct?

    After further experimenting, I found that I can get the entire setup to recover immediately by allowing the Windows network stack on PC2 to reset.  I did this by pulling the Ethernet cable out, waiting for a few seconds, then plugging it back in.  So, I can run the Colasoft tool, causing the network connection on PC2 to hang up (along with the Windows duplicate IP address warning), and then get it to work again by unplugging and replugging in the Ethernet cable.  Based on this, I believe this hang up that I'm seeing is caused by the Windows side network stack being hung up, as the cable unplug/replug allows all to work again.

    Please let me know what you think.

    Steve

    P.S. here are the Colasoft settings that allowed me to see the duplicate IP message and the network hang up on PC2:

  • Dear Steven,
    Thank you very much for your help and your concern regarding the issue. I have been testing the board's throughput and behaviour for a long time. The board choking and stack crashing issue is still under examination. Some statistics: Colasoft generates approx. 19000 packets of 64 bytes in 1 second (1G connection), whereas according to our hardware testing (via IXIA) the stack crashes at a rate of approx. 200000 packets. In any case the board should not choke indefinitely, i.e. once the incoming data is removed the board should return to its original condition without needing a restart.

    I have replaced all the printf calls with LOG_printf, but the problem is still there. I am currently trying to apply that benchmark load in software by running multiple network applications simultaneously (i.e. Colasoft, Packet Builder, ping, mail etc.). Is there any way to soft-reset the board in this condition?
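To put the figures above in context, here is a small sketch of the standard Ethernet line-rate arithmetic. Each frame occupies its own length plus 20 bytes on the wire (7-byte preamble, 1-byte start-of-frame delimiter, 12-byte inter-frame gap). The function names are illustrative, and treating the approx. 200000 figure as packets per second is an assumption, since the post does not state the unit.

```c
#include <assert.h>
#include <stdint.h>

/* Wire overhead per frame that is not part of the frame itself:
 * 7-byte preamble + 1-byte start-of-frame delimiter + 12-byte inter-frame gap. */
#define ETH_WIRE_OVERHEAD_BYTES 20u

/* Maximum frames per second for a given frame size (FCS included)
 * and link speed in bits per second, rounded down. */
uint32_t eth_max_frames_per_sec(uint32_t frame_bytes, uint64_t link_bps)
{
    uint64_t bits_per_frame;

    assert(frame_bytes >= 64u);   /* minimum legal Ethernet frame size */
    bits_per_frame = (uint64_t)(frame_bytes + ETH_WIRE_OVERHEAD_BYTES) * 8u;
    return (uint32_t)(link_bps / bits_per_frame);
}

/* Observed packet rate as a percentage of line rate, rounded down. */
uint32_t percent_of_line_rate(uint32_t observed_pps, uint32_t frame_bytes,
                              uint64_t link_bps)
{
    return (uint32_t)(((uint64_t)observed_pps * 100u) /
                      eth_max_frames_per_sec(frame_bytes, link_bps));
}
```

For 64-byte frames on a 1 Gbit/s link, eth_max_frames_per_sec(64, 1000000000ULL) gives 1488095 frames per second. On that scale, 19000 packets per second is about 1% of line rate and 200000 packets per second is about 13%, so a software generator at the lower rate may simply not load the board enough to trigger the halt.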

  • Dear Steve,

              

    I have now been able to create a scenario in which the board crashes at a specific software load:

    1. Colasoft Packet Builder: 57720 ARP packets of 64 bytes each, sent in an infinite loop with no delay between loops (shown pictorially below).
    2. Ping: to any system on the intranet.
    3. E-mail: a message larger than 12 MB sent in a loop, i.e. back to the same generating station on the intranet (this should be repeated two or three times).

    Following the above procedure halts the board indefinitely. Even removing all of these loads does not help.

     

    The CCS status and Wireshark display at the time of the halt are shown below for reference:

    1. CCS snapshot
    2. Wireshark snapshot

     

    Is there any way to soft reset the board in this condition?

     

     

    Please try to reproduce the problem at your end as well. Use the latest csl_emac.c file I sent you in private chat: put this csl_emac.c into the earlier project to build a new .out file.

    Regards

     

  • Hi Mark --

    Can you use the Kernel/Object View tools (available via one of the CCS tools menus) to look at the state of your tasks?   It would be useful to see which task is in the "running" state at the time of the crash.   It would also be helpful to see the state of the task stacks.   One of your task stacks might have overflowed.

    If that doesn't help, it would be useful to check the value of 'B3'.   B3 will contain the return address of the last function call.   This might give some clues.    Use disassembly view and enter 'B3' and it should bring up some code.  Scroll up and we'll know which function was executing right before the failure.

    -Karl-

  • I am using CCS v3.3; kindly specify the tasks clearly. Also, I think you have only just started following this thread: Steven has already had me check the B3 register values, and it did not help.

    Please tell me which Kernel tool to use and I will try it. I agree that the stacks may have overflowed, but I don't know how to detect and fix that.

    Regards

    Mark

  • The Kernel/Object View tool is a CCS tool.   You can find it in your top-level CCS menus.  DSP/BIOS->Kernel/Object View.

    You should be able to see Task info and other info that might help by poking around in this tool.

    -Karl-

  • Karl

    I have been trying to do it; I will reply as soon as possible.

    @Steven I hope you are back; I am waiting for a reply from you.

  • Mark,

    The stack can be rebooted

    Mark Depp said:
    Is there any way to soft reset the board in this condition?

    You can reboot the stack by calling:

    NC_NetStop(1);

    So if you can catch this error condition, you can decide if you want to reboot at that time by calling the above.
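As a rough illustration of "catching the error condition", the sketch below counts consecutive send failures and signals when a threshold is reached. It is plain C with no NDK dependency; FaultMonitor and its functions are hypothetical names, and the choice of "N failed sends in a row" as the trigger is an assumption made for illustration, not something the NDK or the driver defines.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical bookkeeping for deciding when the stack looks stuck. */
typedef struct {
    uint32_t threshold;     /* consecutive failures that count as "stuck" */
    uint32_t consecutive;   /* current run of failures */
    uint32_t reboots;       /* how many times a reboot was requested */
} FaultMonitor;

void fault_monitor_init(FaultMonitor *m, uint32_t threshold)
{
    assert(threshold > 0u);
    m->threshold   = threshold;
    m->consecutive = 0u;
    m->reboots     = 0u;
}

/* Feed in the status of each send attempt (0 = success, non-zero = error).
 * Returns 1 when the caller should request a stack reboot, otherwise 0. */
int fault_monitor_update(FaultMonitor *m, int send_status)
{
    if (send_status == 0) {
        m->consecutive = 0u;    /* any success ends the failure run */
        return 0;
    }
    if (++m->consecutive < m->threshold) {
        return 0;
    }
    m->consecutive = 0u;        /* start a fresh count after requesting a reboot */
    m->reboots++;
    return 1;
}
```

In the application, a return value of 1 would be the point at which to call NC_NetStop(1). The reboots counter is kept so that the application can give up, or fall back to a hardware reset, if repeated stack reboots do not clear the condition.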

    Steve

  • Dear Mark,

    Does this mean that NDK stack will be crashed and the board halted with a "ping flood" attack?

    This is the effect that I can reproduce on a BeagleBone with SYS/BIOS and NDK under a ping flood attack. The same happens if a video stream is being sent into the network over RTP (Real-time Transport Protocol) with multicast frames.

    Is there any way to avoid this effect? Is there any NDK stack configuration that rejects frames before its buffer memory or the task stacks overflow?

    Thanks and regards

  • Hi All!

    Gilen, the NDK does crash under a "ping flood" attack. In a project that contains only the NDK and nothing more, the command

    hping --flood --udp --ipproto 1 -d 20 192.168.1.100

    crashes it very soon. Unfortunately, it is not only an attack that crashes it; a quite reasonable amount of data does too, it just takes a little longer (as you have seen from http://e2e.ti.com/support/embedded/bios/f/355/p/223029/838245.aspx#838245).


    Mark, have you tried NC_NetStop(1)? I tried it, and it seems that the NDK successfully restarts only once. (http://e2e.ti.com/support/embedded/bios/f/355/t/239456.aspx)