Tool/software: TI-RTOS
Hi all,
I am trying to benchmark UDP traffic load on the C6678, and the result is quite disappointing. My benchmark shows that the DSP can receive and send around 12k to 15k packets per second. The latency is about 700 us for the DSP to receive a packet and send it back out.
My setup is:
The host PC and the DSP are connected to a common switch.
I have written a UDP echo client on the host PC that sends UDP packets of 400 bytes each. Due to the nature of the application we are developing, the packets are small and we need to send a lot of them. On the DSP, I am running the ndk helloworld example in pdkc6678, which is essentially an echo server. On the host PC, I send a packet and wait for it to come back before sending the next one. The host PC uses the same socket for send and receive, and so does the DSP.
I am using MCSDK 2.1.2.6, NDK 2.21.2.43, MCSDK 6678 1.1.2.6 and CCS 6.1.3.
The board vendor's support package only supports the package versions I listed above.
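A minimal Python sketch of this kind of send-then-wait echo test follows. It is not the actual client used here: the function names `run_echo_server` and `ping_pong`, the loopback echo server standing in for the DSP, and the OS-assigned port are all assumptions for illustration.

```python
import socket
import threading
import time


def run_echo_server(sock, stop_event):
    """Echo every datagram back to its sender (hypothetical stand-in for the DSP)."""
    sock.settimeout(0.1)  # wake up periodically to check stop_event
    while not stop_event.is_set():
        try:
            data, addr = sock.recvfrom(2048)
        except socket.timeout:
            continue
        sock.sendto(data, addr)


def ping_pong(server_addr, count=1000, size=400, timeout=1.0):
    """Send one packet, wait for its echo, repeat.

    Uses a single socket for send and receive, as in the test described above.
    Returns the list of round-trip times in seconds. A lost packet raises
    socket.timeout rather than being retried.
    """
    payload = bytes(size)
    rtts = []
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as sock:
        sock.settimeout(timeout)
        for _ in range(count):
            start = time.perf_counter()
            sock.sendto(payload, server_addr)
            sock.recvfrom(2048)  # block until the echo returns
            rtts.append(time.perf_counter() - start)
    return rtts


if __name__ == "__main__":
    # Loopback self-test; pass the DSP's (ip, port) to ping_pong to measure the board.
    server = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    server.bind(("127.0.0.1", 0))  # port 0: let the OS pick a free port
    stop = threading.Event()
    worker = threading.Thread(target=run_echo_server, args=(server, stop), daemon=True)
    worker.start()
    rtts = ping_pong(server.getsockname(), count=1000)
    stop.set()
    worker.join()
    server.close()
    mean = sum(rtts) / len(rtts)
    print(f"mean RTT {mean * 1e6:.1f} us, {1 / mean:.0f} packets/s")
```

Timing on the host this way includes the PC's own socket and scheduling overhead, so it is best compared against the Wireshark timestamps rather than used alone.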
Benchmark result:
My result shows that the DSP can receive and send around 12k to 20k packets per second. For my application, the DSP will receive at least 15k packets and send 15k packets per second at maximum load. As you can see, the benchmark shows that the DSP is almost maxed out at maximum load, which leaves no CPU run time for anything else.
The reason I believe the DSP is the bottleneck is:
In Wireshark, I can see that it takes 700 us for a packet to go to the DSP and come back. I think this delay is huge. In Wireshark, I can also see that the latency of ping packets is very small, so I don't think the Ethernet switch is causing the problem. It only takes my PC 70 us to receive a packet and send another one out. In theory, the DSP should not be slower than my PC.
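As a back-of-the-envelope check on how the round-trip time relates to packets per second, here is a small Python sketch. It assumes strictly one packet in flight at a time, as in the send-then-wait test; the helper names are made up for illustration.

```python
def stop_and_wait_rate(rtt_seconds):
    """Upper bound on packets per second with exactly one packet in flight."""
    return 1.0 / rtt_seconds


def packets_in_flight(target_rate, rtt_seconds):
    """Average number of outstanding packets needed to sustain target_rate."""
    return target_rate * rtt_seconds


if __name__ == "__main__":
    print(round(stop_and_wait_rate(700e-6)))  # 1429
    print(round(stop_and_wait_rate(70e-6)))   # 14286
    print(packets_in_flight(15000, 700e-6))   # roughly 10.5
```

With one packet in flight, a 700 us round trip would bound the rate at roughly 1.4k packets per second, so the packets-per-second figures and the round-trip figure are worth cross-checking against how each one was measured.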
I am wondering whether there is any way I can optimize the NDK and SYS/BIOS settings to reduce the latency. Thanks.