I am using the DM648 EVM with NDK 2.0, running the helloWorld application that ships with the board. The application is configured in ALEBYPASS/PROMISCUOUS mode. Its throughput is very close to the benchmarks in the board's documentation. The problem is that when I send a burst of 64-byte packets, the board abruptly starts dropping packets. That is expected and not a problem for me. But after some time, when I increase the number of packets in the burst, the board halts and I have to restart it before it will transmit or receive packets again. I can live with dropped packets, but not with a halt that requires re-powering the board. This behaviour is also sometimes seen with 128-byte packets. The board works fine with packet sizes above 128 bytes. Can you please help me?
Mark,
I wanted to give you an update. I have been trying to reproduce the crash again today following your revised instructions and diagram for h/w set up.
The good news is that I finally was able to see the crash! The bad news is that it took me a while, and I was changing different variables in order to make it happen because I didn't see it right away using the Colasoft settings you specified. I didn't realize that the app had gone into UTL_halt until after I had already tried various settings. (I was changing Colasoft settings such as the destination MAC to be broadcast, different IP addresses, etc. I also changed the h/w setup slightly.)
So I need to try again and find the exact scenario which caused it. Once I am able to make it fail consistently, I will be able to debug the failure case. I have a couple of questions ...
When you see it, is the failure very consistent? Or does it only happen sporadically? Also, how long does it take for you to see the problem after you start the packet bursts?
Steve
I've been working on your problem some more. I'm still *not* able to reproduce a crash scenario (except for the one occurrence that I saw last week); however, I am able to see the connection slow down and get hung up after sending the packet bursts using Colasoft. I also see some error messages coming out of the driver about an invalid packet ("EMAC_sendPacket() returned error").
Before going into further details, the error message gave me a hint to a problem with the DM648 driver. The driver contains printf() functions within ISR context. This is not allowed within BIOS and is known to cause programs to abort, as you have reported seeing. Can you try checking the LOG_system log within CCS at the point you see the program halt? Do you see any error messages in there? In any case, I think you should change all of the printf() calls in the driver code (I believe you should have copies of the Ethernet driver C files in your modified helloWorld project) to instead call LOG_printf(). See this for more details: http://e2e.ti.com/support/embedded/bios/f/355/t/68883.aspx
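A minimal sketch of the idea behind moving printf() out of ISR context, written as plain host C rather than against the DSP/BIOS LOG API: the interrupt side only records a small event in a ring buffer, and a task formats and prints it later. The names evt_post_from_isr, evt_drain_from_task and the Event layout are hypothetical and do not come from the DM648 driver. The sketch also assumes a single core with exactly one producer (the ISR) and one consumer (the task).

```c
#include <assert.h>
#include <stdint.h>
#include <stdio.h>

#define EVT_QUEUE_LEN 16u   /* must be a power of two */

/* Hypothetical event record: raw values only, no formatting in the ISR. */
typedef struct {
    uint16_t code;   /* e.g. an illustrative "send failed" code */
    uint32_t arg;    /* one raw argument, formatted later by the task */
} Event;

static volatile Event    evt_queue[EVT_QUEUE_LEN];
static volatile uint32_t evt_head;     /* written only by the ISR side  */
static volatile uint32_t evt_tail;     /* written only by the task side */
static volatile uint32_t evt_dropped;  /* events lost to a full queue   */

/* Intended for interrupt context: constant time, no blocking, no printf. */
void evt_post_from_isr(uint16_t code, uint32_t arg)
{
    uint32_t head = evt_head;
    uint32_t slot = head & (EVT_QUEUE_LEN - 1u);

    if (head - evt_tail >= EVT_QUEUE_LEN) {
        evt_dropped++;          /* queue full: count the loss and return */
        return;
    }
    evt_queue[slot].code = code;
    evt_queue[slot].arg  = arg;
    evt_head = head + 1u;       /* publish only after the slot is filled */
}

/* Intended for a low-priority task: does the slow formatting work.
 * Returns the number of events printed. */
int evt_drain_from_task(FILE *out)
{
    int printed = 0;

    assert(out != NULL);
    while (evt_tail != evt_head) {
        uint32_t slot = evt_tail & (EVT_QUEUE_LEN - 1u);
        uint16_t code = evt_queue[slot].code;
        uint32_t arg  = evt_queue[slot].arg;

        evt_tail = evt_tail + 1u;   /* release the slot */
        fprintf(out, "event %u arg 0x%08lx\n",
                (unsigned)code, (unsigned long)arg);
        printed++;
    }
    return printed;
}
```

On the target, the drain step is where a call such as LOG_printf() or a task-level printf() would go. The ISR path never touches the C I/O library.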
Now, regarding the hang up, this is what I see. Once the modified helloWorld application is loaded and running on the DM648 board, I am then able to get an IP address from PC2 (the PC connected to the DM648, which is in turn connected to a switch that connects to the TI network).
At this point I can use a web browser and visit various web sites. I can also ping other machines on the network.
I then run the colasoft tool, and I see the Windows warning message regarding a duplicate IP address, as expected (I'm using slightly different settings in order to get this problem to happen, I'll paste a screen shot after this). I only need to run for about 5 seconds, then I stop sending the packet bursts.
At this point, I can no longer ping from PC2, nor can I connect to web pages from within the browser. What I noticed at first, though, was that if I just left everything running for a while, everything would recover: I was able to ping again and connect to the web after 5 minutes or so.
After looking at various stats in the NDK stack, I see that there isn't really anything going on. I guess this makes sense, as you've configured the hardware for bypass mode. I think this causes the network stack to be skipped entirely, and all the data is just going through the driver, is that correct?
After further experimenting, I found that I can get the entire setup to recover immediately by allowing the Windows network stack on PC2 to reset. I did this by pulling the Ethernet cable out, waiting for a few seconds, then plugging it back in. So, I can run the Colasoft tool, causing the network connection on PC2 to hang up (along with the Windows duplicate IP address warning), and then get it to work again by unplugging and replugging in the Ethernet cable. Based on this, I believe this hang up that I'm seeing is caused by the Windows side network stack being hung up, as the cable unplug/replug allows all to work again.
Please let me know what you think.
P.S. here are the Colasoft settings that allowed me to see the duplicate IP message and the network hang up on PC2:
Dear Steven, thank you very much for your help and your concern regarding this issue. I have been testing the board's throughput and behaviour for a long time. The board choking and stack crashing issue is still under examination. Some statistics follow. Colasoft generates approx. 19,000 packets of 64 bytes in 1 sec (1G connection), whereas according to our hardware testing (via Ixia) the stack crashes at a rate of approx. 200,000 packets. In any case the board should not choke indefinitely, i.e. once the incoming data is removed the board should return to its original condition without needing a restart. I have replaced all the printf calls with LOG_printf, but the problem is still there. I am currently trying to apply that benchmark load in software by running multiple network applications simultaneously (i.e. Colasoft, Packet Builder, ping, mail, etc.). Is there any way to soft-reset the board in this condition?
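For context on these packet rates, here is a small sketch of the standard Ethernet line-rate arithmetic. It assumes the quoted 64 bytes is the full frame including the FCS, that each frame also costs 20 bytes on the wire (7 bytes preamble, 1 byte start-of-frame delimiter, 12 bytes inter-frame gap), and that the 200,000 figure is per second. The function name eth_max_frames_per_sec is made up for this illustration.

```c
#include <assert.h>
#include <stdint.h>

/* Per-frame wire overhead: 7 preamble + 1 SFD + 12 inter-frame gap. */
#define ETH_OVERHEAD_BYTES 20u

/* Theoretical maximum frames per second for a given link speed and
 * frame size (frame size includes the 4-byte FCS). */
uint32_t eth_max_frames_per_sec(uint64_t link_bps, uint32_t frame_bytes)
{
    uint64_t bits_per_frame;

    assert(frame_bytes >= 64u);   /* minimum legal Ethernet frame */
    bits_per_frame = (uint64_t)(frame_bytes + ETH_OVERHEAD_BYTES) * 8u;
    return (uint32_t)(link_bps / bits_per_frame);
}
```

Under those assumptions a 1 Gbit/s link carries at most 1,488,095 frames of 64 bytes per second, so 19,000 packets per second is a little over 1% of line rate and 200,000 is roughly 13%.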
Dear Steve,
I have now been able to create a scenario in which the board crashes under a specific software load.
Following the aforementioned procedure halts the board indefinitely. Even removing all these loads does not help.
The CCS status and Wireshark display at the time of the halt are shown below for reference:
1. CCS snapshot
2. Wireshark snapshot
Is there any way to soft reset the board in this condition?
Please try to reproduce the problem at your end as well. Use the latest csl_emac.c file that I sent you via private chat, and build it into the earlier project to get a new .out file.
Regards
Hi Mark --
Can you use the Kernel/Object View tools (available via one of the CCS tools menus) to look at the state of your tasks? It would be useful to see which task is in the "running" state at the time of the crash. It would also be helpful to see the state of the task stacks. One of your task stacks might have overflowed.
If that doesn't help, it would be useful to check the value of 'B3'. B3 will contain the return address of the last function call. This might give some clues. Use disassembly view and enter 'B3' and it should bring up some code. Scroll up and we'll know which function was executing right before the failure.
-Karl-
I am using CCS v3.3; kindly specify the tasks clearly. Also, I think you have only just started following this thread: Steven already did this with the B3 register values, and it didn't help.
Please tell me which kernel tool to use and I will try it. I agree that the stacks may have overflowed, but I don't know how to detect and fix that.
Mark
The Kernel/Object View tool is a CCS tool. You can find it in your top-level CCS menus. DSP/BIOS->Kernel/Object View.
You should be able to see Task info and other info that might help by poking around in this tool.
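As background on how a stack overflow can be detected at all, here is a minimal sketch of the common "paint and scan" technique, in plain host C. It is not DSP/BIOS code and does not claim to be how Kernel/Object View works internally. The fill value 0xA5 and the names stack_paint and stack_unused are arbitrary choices for this illustration, and the sketch assumes the stack grows downward, toward its base address.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define STACK_FILL 0xA5u   /* arbitrary sentinel chosen for this sketch */

/* Fill the whole stack region with the sentinel before the task runs. */
void stack_paint(uint8_t *base, size_t size)
{
    assert(base != NULL);
    memset(base, STACK_FILL, size);
}

/* Count sentinel bytes still intact at the low end of the region.
 * With a downward-growing stack this is the space never used so far;
 * a result of 0 means the task reached (or ran past) the bottom. */
size_t stack_unused(const uint8_t *base, size_t size)
{
    size_t n = 0;

    assert(base != NULL);
    while (n < size && base[n] == STACK_FILL) {
        n++;
    }
    return n;
}
```

If the unused count for a task drops to zero, or close to it, under the packet burst, increasing that task's stack size would be the first thing to try.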
Karl
I have been trying to do it; I will reply as soon as possible.
@Steven, I hope you are back. I am waiting for a reply from you.
The stack can be rebooted
Mark Depp: Is there any way to soft reset the board in this condition?
You can reboot the stack by calling:
NC_NetStop(1);
So if you can catch this error condition, you can decide if you want to reboot at that time by calling the above.
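One way to "catch this error condition" could look like the sketch below: count consecutive send failures and request a stack reboot once a threshold is reached. Everything except NC_NetStop itself is hypothetical (the NetWatch type, the threshold, the helper names). The reboot call is passed in as a function pointer so the logic can be tried on a host PC. On the target you would pass NC_NetStop, and fake_net_stop below is only a host-side stand-in.

```c
#include <assert.h>
#include <stddef.h>

/* Signature compatible with NC_NetStop(int rc). */
typedef void (*NetStopFn)(int rc);

typedef struct {
    unsigned  consecutive_errors;  /* failures since the last success */
    unsigned  threshold;           /* hypothetical tuning value       */
    NetStopFn net_stop;            /* on target: NC_NetStop           */
} NetWatch;

void netwatch_init(NetWatch *w, unsigned threshold, NetStopFn net_stop)
{
    assert(w != NULL && net_stop != NULL && threshold > 0u);
    w->consecutive_errors = 0u;
    w->threshold = threshold;
    w->net_stop = net_stop;
}

/* Report the outcome of one send attempt (nonzero = success).
 * Returns 1 if a stack reboot was requested, 0 otherwise. */
int netwatch_report(NetWatch *w, int send_ok)
{
    assert(w != NULL);
    if (send_ok) {
        w->consecutive_errors = 0u;   /* any success clears the streak */
        return 0;
    }
    if (++w->consecutive_errors < w->threshold) {
        return 0;
    }
    w->consecutive_errors = 0u;
    w->net_stop(1);                   /* 1 = reboot the stack */
    return 1;
}

/* Host-only stand-in so the logic can be exercised off-target. */
static int fake_stop_calls;
static int fake_last_rc;
static void fake_net_stop(int rc)
{
    fake_stop_calls++;
    fake_last_rc = rc;
}
```

Whether consecutive send errors are the right trigger for this particular failure is an open question; the point is only that the decision logic can live in ordinary task-level code.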
Dear Mark,
Does this mean that a "ping flood" attack will crash the NDK stack and halt the board?
This is the effect that I can reproduce on a BeagleBone with SYS/BIOS and the NDK under a ping flood attack. The same happens when a video stream is sent into the network over RTP (Real-time Transport Protocol) with multicast frames.
Is there any way to avoid this effect? Is there any NDK stack configuration that rejects frames before they overflow its buffer memory or the task stacks?
Thanks and regards
Hi All!
Gilen, the NDK does crash under a "ping flood" attack. In a project that contains only the NDK and nothing else,
hping --flood --udp --ipproto 1 -d 20 192.168.1.100
crashes it very soon. Unfortunately, it is not only an attack that crashes it: a quite reasonable amount of data does too, it just takes a little longer (as you have seen from http://e2e.ti.com/support/embedded/bios/f/355/p/223029/838245.aspx#838245).
Mark, have you tried NC_NetStop(1)? I tried it, and it seems that the NDK successfully restarts only once. (http://e2e.ti.com/support/embedded/bios/f/355/t/239456.aspx)