NDK 2.0.x + C6474 EVM low throughput

Hello,

I recently got a Spectrum Digital dual-C6474 EVM board. I have also installed the latest NDK and am trying to measure its transfer throughput. The NDK ships with several example programs, including the benchmarking programs that were used to generate some of the published benchmark figures.

When testing NDK programs such as the recv and send Windows apps and their respective DSP programs, I am getting about 50-70 KBytes/s over a 100 Mbit link. This is much slower than I was hoping for. Do you have any suggestions?

I am running this on Core 0 Device 0 so it has the largest amount of L2 RAM allocated to it.

Thanks in advance,

Josh

  • I'm not an NDK guy and I haven't worked much with the C6474 EVM yet, but one thing you may want to check is which memory range is being used. Each core has its own "local" memory, and accesses to another core's memory are extremely slow (see here for an example). This could be contributing to the slowdown.

  • Thank you,

     

    I read that thread previously. I don't think I'm having that problem, since everything is running on a single core: Core 0 (or 1, whichever the first one is), which should have the most local memory assigned to it.

     

    Josh

  • Alright, here are some more data points. I tried the send test included with the NDK, which sends data from the desktop PC to the C6474. It sends 8192-byte packets over a 100 Mbit connection. The theoretical transfer rate would be 12.5 MBps. I would hope for at least 2/3 of that data rate, given no link loss or packet retries on a very underutilized local network.

    Here is what I am getting...

    send 192.168.1.109 1000
    1000 Sending 8192 bytes...passed - 4096000 bytes/s
    2000 Sending 8192 bytes...passed - 5461333 bytes/s
    3000 Sending 8192 bytes...passed - 4915200 bytes/s

    So I'm getting about 4-5 MBps, which is better. However, the NDK User Guide Quickstart (SPRAAX4.pdf) runs the same test in section 8.3 and gets these results:

    send 192.168.1.41 500
    10 Sending 8192 bytes...passed - 8192000 bytes/s
    20 Sending 8192 bytes...passed - 8192000 bytes/s
    30 Sending 8192 bytes...passed - 12288000 bytes/s

    I would love to be getting 12.2 MBps!
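
    To put the two sets of numbers side by side, here is a minimal sketch that converts the bytes/s figures printed by the send test into Mbit/s and a fraction of the raw 100 Mbit/s line rate. The helper name line_rate_fraction is made up for illustration, and the calculation ignores Ethernet/IP/TCP header overhead.

```python
LINK_BITS_PER_SECOND = 100_000_000  # nominal 100 Mbit/s link (raw line rate)


def line_rate_fraction(bytes_per_second, link_bps=LINK_BITS_PER_SECOND):
    """Fraction of the raw link rate used by a measured payload byte rate."""
    return bytes_per_second * 8 / link_bps


# Figures copied from the two send runs quoted in this thread.
measured = [4096000, 5461333, 4915200]
quickstart = [8192000, 8192000, 12288000]

for label, rates in (("measured", measured), ("quickstart", quickstart)):
    for rate in rates:
        print(f"{label}: {rate} bytes/s = {rate * 8 / 1e6:.1f} Mbit/s "
              f"({line_rate_fraction(rate):.0%} of the raw line rate)")
```

    By this measure the runs above sit at roughly a third to under half of the raw line rate, while the quickstart figures range from about two thirds to nearly all of it.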

     

    Now here is another interesting test. The recv test sends a request to the NDK client program asking for data to be sent back to the desktop PC. The default program is hard-coded to request 40,000 bytes of data. It then averages the time to receive the data, and that's where I get my 50-80 KBps. If I lower the request size to 1024 bytes, my throughput goes up to 2 MBps, which is still not quite what I hoped for but much better. I'm not sure exactly why performance improves. I would expect the NDK to segment the 40,000 bytes down to the MTU anyway. Perhaps when the DSP fetches the 40K of junk data to send, it goes out to DDR RAM, and that is where the bottleneck is.
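
    As a rough illustration of the segmentation mentioned above, the sketch below counts how many TCP segments each request size needs. It assumes an MSS of 1460 bytes (a 1500-byte MTU minus 20-byte IPv4 and 20-byte TCP headers, no options); the MSS actually negotiated by the NDK in this setup is not known from this thread, and tcp_segments is a made-up helper name.

```python
def tcp_segments(payload_bytes, mss=1460):
    """Number of TCP segments needed to carry payload_bytes.

    mss=1460 is an assumption: 1500-byte MTU minus 40 bytes of
    IPv4 + TCP headers without options.
    """
    # Integer ceiling division, avoids floating-point rounding.
    return -(-payload_bytes // mss)


for request in (40000, 1024):
    print(f"{request} bytes -> {tcp_segments(request)} segment(s)")
```

    Under that assumption the 40,000-byte reply is split across 28 segments while the 1024-byte reply fits in one, so the two cases exercise the stack quite differently even though the link is the same.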

     

    Any thoughts would be extremely appreciated right now.

    Thanks,

    Josh

  • Josh,

    The speed you obtained is typical of systems with switches and routers using 100 Mbps PHYs. Tests showed that performance suffers heavily because of these even on an underutilized network, which is why the numbers above were obtained with boards connected back-to-back to other boards or to host PCs. Even so, keep in mind that the 12.2 MB/s above may be too rough, since it relies on the not-so-precise timer of the host PC. More precise performance numbers can be found in the companion benchmark numbers (typically located at C:\CCStudio_v3.3\ndk_2_0_0_0\packages\ti\ndk\docs\evm6474)
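
    To illustrate why a coarse host timer makes such figures rough, here is a small sketch of the quantization effect. It assumes a clock that only advances in whole 1 ms ticks; how the send tool actually measures time is not confirmed in this thread, and reported_rate is a hypothetical helper.

```python
def reported_rate(bytes_sent, true_elapsed_us, tick_us=1000):
    """Bytes/s a host would report if its clock advances in whole ticks.

    tick_us=1000 (1 ms) is an assumed timer resolution. The true elapsed
    time is rounded to the nearest tick, with a minimum of one tick so
    the division is always defined.
    """
    ticks = max(1, (true_elapsed_us + tick_us // 2) // tick_us)
    return bytes_sent * 1_000_000 // (ticks * tick_us)


# One 8192-byte block at three different true transfer times.
for true_us in (655, 1400, 1600):
    true_rate = 8192 * 1_000_000 // true_us
    print(f"true {true_us} us ({true_rate} bytes/s) "
          f"-> reported {reported_rate(8192, true_us)} bytes/s")
```

    With this model, true transfer times of 655 us and 1400 us both report as 8192000 bytes/s, while 1600 us reports as 4096000 bytes/s, so small timing differences can appear as large jumps in the printed rate.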

    A suggestion is to build and run the benchmark tests in the directory below:

    C:\CCStudio_v3.3\ndk_2_0_0_0\packages\ti\ndk\benchmarks

    Also, please refer to the tips below to help improve NDK performance under such conditions.

    Important considerations on performance measurement results (the note in red): http://tiexpressdsp.com/index.php?title=NDK_benchmarks

    Tips to extract additional performance from NDK: Question 29 of NDK FAQ: http://tiexpressdsp.com/index.php?title=Network_Developers_Kit_FAQ

    Specifically for C6474, did you set the option PHY_MODE_100MBPS accordingly? This is necessary to obtain reasonable performance in 100Mbps networks. Also, make sure you do not have a Gigabit router/switch in your network, since low performance was experienced when one was present on the link.

    About the receive tests, more packet fragmentation means more overhead; tests here also showed what you observed in practice. No matter what the agreed TCP MSS is, the MAC/PHY combination will always have a hardware limitation. Jumbo packets help a lot in such scenarios, but their use is very limited by the network conditions.

    I hope this helps.

    Best regards,

    Rafael

  • Thank you for your comments Rafael,

     

    I have tried a direct crossover cable connection, with no real noticeable speed gains or changes. Yep, I have seen the benchmark numbers, which is what makes me envious; I want to achieve those same numbers.

    I have tried the benchmark project, but there didn't seem to be any Windows counterpart to the benchmark program. I noticed the Tester and Testee projects, but I did not see any Windows (winapp) counterpart. I went to your links above and found spraaq5a.pdf, which describes the benchmark process and procedure. Very good! However, it also describes the winapp tester program, which does not seem to be included in NDK 2.0. Do you know why this was left out? I am going to read through this document more thoroughly and see if there is anything I missed in the benchmark project.

    As for the option PHY_MODE_100MBPS, yes, that was set. It was actually the default for all the projects I had opened.

     

    Since my last posting I have tried gigabit Ethernet as well. Here are some of the performance figures I got from send.exe:

    ~50 MBps = 400 Mbit/s

    With a slightly modified recv.exe that requests 1024-byte packets only once at startup and then just listens for data, I got:

    ~30 MBps = 240 Mbit/s

     

    If you could get back to me on the winapp for benchmarking NDK 2.0 I would appreciate it.

     

    Thanks!

     

    Josh

  • Josh,

    Are you sure you downloaded the production release of NDK2.0? Make sure the install directory says ndk_2_0_0_0 (the confusing version numbers for NDK 2.0 are listed on this wiki page) since the winapps should be compiled - I think one of the intermediate versions was not.

    As for the NDK benchmarks, you can also use an open source network tester called iperf. The NDK benchmarks page was updated with useful information on how to use it.

    BTW, the numbers in GigE seem fine given the PC might also be the bottleneck. Check the above page for details.

    Hope this information helps,

    Rafael

     

  • Rafael,

     

    Thanks, that's great. I did not know about iperf. I was able to run it and get about 70 Mbit/s on a 100 Mbit link. When doing the same test from PC to PC over the same network I get about 96 Mbit/s. The only thing I was wondering is whether you can use iperf to test sending from the DSP to the PC.
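
    For context on those iperf figures, here is a back-of-the-envelope sketch of the best-case TCP payload rate on 100 Mbit Ethernet. It assumes standard 1500-byte-MTU frames with IPv4 and TCP headers without options; the function name is made up for illustration.

```python
# Per-frame Ethernet overhead in bytes:
# preamble + SFD (8), MAC header (14), FCS (4), inter-frame gap (12).
ETH_OVERHEAD = 8 + 14 + 4 + 12
# IPv4 header (20) + TCP header (20), assuming no options.
IP_TCP_HEADERS = 20 + 20


def tcp_goodput_ceiling_mbit(link_mbit=100.0, mtu=1500):
    """Best-case TCP payload rate in Mbit/s for full-size frames."""
    payload = mtu - IP_TCP_HEADERS   # TCP payload bytes per frame
    on_wire = mtu + ETH_OVERHEAD     # bytes of link time per frame
    return link_mbit * payload / on_wire


ceiling = tcp_goodput_ceiling_mbit()
print(f"ceiling: {ceiling:.1f} Mbit/s")
print(f"70 Mbit/s is {70 / ceiling:.0%} of that ceiling")
```

    Under these assumptions the ceiling is about 95 Mbit/s, which is close to the PC-to-PC result, and the 70 Mbit/s measured against the DSP is roughly three quarters of it.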

    As I understand it, the testee program running on the DSP simply receives the data from iperf. My end application will only be sending data from the DSP to a PC connected via Ethernet. I'm not sure whether I can use the tester application on the DSP and send data to iperf. That's the only direction where I really want to make sure it's performing top-notch.
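
    One way to measure the DSP-to-PC direction without iperf is a small PC-side sink that accepts a TCP connection, counts the bytes received, and times the transfer with a high-resolution clock. The sketch below is not taken from the NDK or its benchmarks: it assumes the DSP-side program connects out to the PC, streams data, and closes the socket when done, and the port number in the usage comment is hypothetical.

```python
import socket
import threading
import time


def make_listener(host, port):
    """Create a listening TCP socket on host:port (port 0 picks a free port)."""
    sock = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    sock.setsockopt(socket.SOL_SOCKET, socket.SO_REUSEADDR, 1)
    sock.bind((host, port))
    sock.listen(1)
    return sock


def measure_receive_rate(listener, bufsize=65536):
    """Accept one connection, read until the sender closes it.

    Returns (total_bytes, elapsed_seconds, bytes_per_second).
    """
    conn, _addr = listener.accept()
    total = 0
    with conn:
        start = time.perf_counter()
        while True:
            chunk = conn.recv(bufsize)
            if not chunk:          # empty read: sender closed the connection
                break
            total += len(chunk)
        elapsed = time.perf_counter() - start
    rate = total / elapsed if elapsed > 0 else 0.0
    return total, elapsed, rate


# Hypothetical usage on the PC (port 5001 is an arbitrary choice):
#   listener = make_listener("0.0.0.0", 5001)
#   total, elapsed, rate = measure_receive_rate(listener)
#   print(f"{total} bytes in {elapsed:.3f} s = {rate * 8 / 1e6:.1f} Mbit/s")
```

    Timing starts once the connection is accepted, so connection setup is excluded; longer transfers give steadier numbers than short ones.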

     

    Thanks,

    Josh

  • Hi,

    Joshua Hintze said:

    As I understand it, the testee program running on the DSP simply receives the data from iperf. My end application will only be sending data from the DSP to a PC connected via Ethernet. I'm not sure whether I can use the tester application on the DSP and send data to iperf. That's the only direction where I really want to make sure it's performing top-notch.

     

    Have you succeeded in improving the throughput of communication from the DSP to the PC? If so, can you please let me know how this was solved?

     

    Many thanks,

    Maroun

  • I don't think I ever got it higher than what I stated previously. But this project has been put on hold for quite a while.

  • How do I configure the EMAC of the C6474 for 1 Gbps?

    I am using NDK 2.1.0.
    I am running the example called "client" in "ndk_2_1_0\packages\ti\ndk\example\network\client".
    It works when run with the default configuration; the results are all correct.

    Now I want to try 1 Gbps.
    I have deleted the predefined symbol "PHY_MODE_100MBPS",
    but it still reports "Link Status:100Mb/s Full Duplex".

    How do I set the EMAC to 1 Gbps using the NDK?

    Thank you for your help.

    Best Regards.

  • Hi Wang,

    The solution is to recompile the Ethernet driver into the client application.

    How to do this:

    1. Download the source code for NDK 2.0.0 from TI:

    http://tiexpressdsp.com/index.php/Source_code_for_the_NDK

    2. Remove hal_eth_c6474.lib from the project, add the ethdriver.c and evminit.c files to the project, and recompile.
    Resolve any compilation errors by adding the appropriate C files to the project.

    3. Remove the PHY_MODE_100MBPS define and compile.

    4. You should now see the PHY running at 1000 Mbps.

    5. Make sure that the switch or device connected to the card is 1000 Mbps.
     Because AUTO_NEGOTIATE is disabled, if the router/switch is 100 Mbps then the card
     will not be able to run at 1000 Mbps - at least this is what I get.

    Have you tried to boot from EMAC?

  • Hi mfarah,

    Thank you very much!

    I did what you told me, and the EMAC now runs in gigabit mode.
    I am using NDK 2.1.0.

    Our project is going to use EMAC boot mode, but we have not started on the boot work yet.
    If I get it working, I will tell you what I learn as soon as possible.

    Thanks again.
    Best Regards.

  • Hi Rafael.

    I am using the EVMC6474 Dev board, and I am having an issue with the following:

    There are good low-level examples for the EMAC in the C6474 CSL, but no TCP/IP stack library with high-level examples. Our application requires TCP/IP over Ethernet, but the example code only covers the Ethernet layer. The only available high-level TCP/IP examples require DSP/BIOS, and the learning curve for understanding them is quite steep.

    So, I am looking for a proper TCP/IP stack library that does NOT require DSP/BIOS and has good example projects (just like the other CSL example projects). Can you help?

     

    Regards.

    Estian.

  • Hi Estian,

    The examples that come with the NDK stack installation are high level.

    I ran into plenty of problems when I tried to run core2 and core3 while TCP/IP was running on core1: plenty of crashes and interrupt problems. I wasted a huge amount of time debugging the sources of the problems with no success. On top of this, core1 was almost completely tied up by the high CPU usage of running the stack.

    In the end, I wrote my own stack doing UDP and low-level Ethernet, including a special second-level bootloader, and got a throughput of almost 600 Mbit/s between a PC and the DSP, with DSP usage mostly below 10%.

    For such a solution you will need to do some low-level programming of the MAC module of the DSP.

     

    Regards

    Maroun

  • Hi Maroun.

     

    Thank you for your reply. I know that the TCP/IP examples for the NDK are high level, but they require DSP/BIOS to run, and that is exactly what I DON'T want.

    DSP/BIOS is an operating system that takes away control over exactly what the core is doing, and I don't want that. I want a simple and optimised library with structure and examples like the Chip Support Library (CSL), IQMath Library, RTS Library, etc.

     

    Where can I find this?

    Estian.