TDA4VH-Q1: CPSW9G network throughput test

Part Number: TDA4VH-Q1
Other Parts Discussed in Thread: TDA4VH, DRA821, TDA4VM

Tool/software:

Hardware: TDA4VH custom board

SDK: ti-processor-sdk-linux-adas-j784s4-evm-09_01_00_06

Eight ports are configured on eight independent SERDES lanes in SGMII mode.

#1. Each port was first tested individually with iperf3: one server (ifconfig eth1 160.0.0.1 && iperf3 -s) and one client (iperf3 -c 160.0.0.1 -t 120).

Single-port test results:

  • Port throughput is 93 Mb/s in 100 Mb/s full-duplex mode.
  • Port throughput is 943 Mb/s in 1000 Mb/s full-duplex mode.

The communication throughput meets expectations.

#2. When multiple ports are tested together, the per-port throughput drops significantly. For example, using four boards as servers

(each running iperf3 -s) and one client running four sessions at the same time (iperf3 -c 160.0.0.1 -t 120 / iperf3 -c 161.0.0.1 -t 120 / iperf3 -c 162.0.0.1 -t 120 / iperf3 -c 163.0.0.1 -t 120).
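
For example, launching the four client sessions in parallel from the client board (the same commands as above, simply backgrounded):

  iperf3 -c 160.0.0.1 -t 120 &
  iperf3 -c 161.0.0.1 -t 120 &
  iperf3 -c 162.0.0.1 -t 120 &
  iperf3 -c 163.0.0.1 -t 120 &
  wait    # all four tests run concurrently against the four server boards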

Test results with multiple ports running at the same time:

  • Each port reaches 46 Mb/s in 100 Mb/s full-duplex mode.
  • Each port reaches about 303 Mb/s in 1000 Mb/s full-duplex mode; total throughput of the 4 ports = eth2 + eth3 + eth4 + eth5 = 304 + 304 + 307 + 304 = 1219 Mb/s.

Although each port uses a separate SERDES lane, the test results show the ports affect each other.

Can each port still achieve 1 Gbps when multiple ports are used at the same time? How can the throughput be improved?

  • Hi, 

    Each SERDES lane is independent and can support a maximum of 1 Gbps on the SGMII interface.

    Can you check the CPU load once? If the CPU load is high, you can adjust the interrupt pacing of each Ethernet interface to improve throughput.

    Please refer to the SDK documentation for interrupt pacing:

    software-dl.ti.com/.../CPSW2g.html
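
    A minimal sketch of checking the load and enabling pacing, assuming the am65-cpsw driver on this SDK exposes interrupt coalescing through ethtool (the 250 us value below is only an illustration, not a recommendation):

      mpstat -P ALL 1                 # per-core CPU load while iperf3 runs (needs sysstat)
      ethtool -c eth1                 # show current coalescing/pacing settings
      ethtool -C eth1 rx-usecs 250    # pace RX completion interrupts (example value)
      ethtool -C eth1 tx-usecs 250    # pace TX completion interrupts, if supported

    Repeat for each interface under test and re-run iperf3 to compare.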

    We have seen a maximum throughput of ~3.5 Gbps due to s/w limitations (general network stack and processing of the data by the application).

    Best Regards, 

    Sudheer

  • Hi,

    We checked the CPU load; it was not high.
    We are on SDK 09_00_01_06, where the TX channels share a single interrupt. Does that slow things down?

    We set the ports to 100 Mb/s full duplex. Each port only reaches 40 Mb/s, which is much less than expected.

    Thanks

  • Hi,

    We are on SDK 09_00_01_06, where the TX channels share a single interrupt. Does that slow things down?

    Distributing the interrupts across cores can improve performance somewhat.
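
    A rough sketch of spreading the completion IRQs across A72 cores (the IRQ numbers below are placeholders; check /proc/interrupts on your build for the actual CPSW DMA completion interrupts):

      grep -i eth /proc/interrupts              # identify the CPSW TX/RX completion IRQ numbers
      echo 2 > /proc/irq/<rx_irq>/smp_affinity  # pin the RX IRQ to core 1 (CPU bitmask 0x2)
      echo 4 > /proc/irq/<tx_irq>/smp_affinity  # pin the TX IRQ to core 2 (CPU bitmask 0x4)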

    We set the ports to 100 Mb/s full duplex. Each port only reaches 40 Mb/s, which is much less than expected.

    Can you run with multiple threads and a different server port for each iperf3 instance?

    Best Regards,
    Sudheer

  • Hi

    1. cmd: taskset -c 6 iperf3 -c 166.0.0.1, so each iperf3 instance runs on its own core. The test result is the same. eth2, eth3, eth4 and eth5 were set to 100 Mb/s full duplex (ps: ethtool -s eth2 speed 100 duplex full).
    Our board test result: eth2 + eth3 + eth4 + eth5 = 52 + 52 + 33 + 33 = 170 Mb/s.

    2. We also tested on the TI demo board (QSGMII), again with eth2, eth3, eth4 and eth5 set to 100 Mb/s full duplex (ps: ethtool -s eth2 speed 100 duplex full).
    TI board test result: eth2 + eth3 + eth4 + eth5 = 60 + 60 + 30 + 30 = 180 Mb/s.

    Each port cannot reach 93 Mb/s, and adding more ports reduces the per-port rate further. The total bandwidth across the ports is only about 170 Mb/s.

    Thanks

  • Hi,

    Can you please check with the latest SDK 10.1? It adds support for sharing interrupts across cores (IRQ affinity).

    Best Regards,
    Sudheer

  • Hi

    We tested with SDK 10.1, changing only uEnv.txt:
    -name_overlays=ti/k3-j784s4-evm-ethfw.dtbo ti/k3-j784s4-vision-apps.dtbo
    +name_overlays=ti/k3-j784s4-evm-ethfw.dtbo ti/k3-j784s4-vision-apps.dtbo ti/k3-j784s4-evm-quad-port-eth-exp1.dtbo
    The kernel hangs during boot.

    The message:
    ALSA device list:
    No soundcards found.
    Waiting for root device PARTUUID=3ab21cfa-02
    platform 4fb0000.mmc: deferred probe pending
    platform regulator-sd: deferred probe pending
    platform regulator-dp0-prw: deferred probe pending
    platform regulator-dp1-prw: deferred probe pending
    platform bus@100000:wiz@5020000: deferred probe pending
    platform c000000.ethernet: deferred probe pending

    If k3-j784s4-evm-quad-port-eth-exp1.dtbo is not added, the kernel boots normally. Did we configure it the wrong way?

    Thanks

  • Hi,

    Can you update "k3-j784s4-evm-quad-port-eth-exp1.dtbo" as below, i.e. comment out the "mux-sel-hog" node, rebuild the dtbo, and use the updated overlay.
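
    Roughly, the steps look like the following sketch (the overlay source path and cross-compiler prefix are assumptions based on the SDK kernel tree; adjust them to your environment):

      # edit the overlay source in the SDK Linux kernel tree and comment out the mux-sel-hog node
      vi arch/arm64/boot/dts/ti/k3-j784s4-evm-quad-port-eth-exp1.dtso
      # rebuild the device tree blobs and overlays
      make ARCH=arm64 CROSS_COMPILE=aarch64-none-linux-gnu- dtbs
      # copy the regenerated k3-j784s4-evm-quad-port-eth-exp1.dtbo to the boot partition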

    Best Regards,
    Sudheer

  • Hi

    The kernel now boots normally, but the driver keeps reporting errors. Are other devices in the new SDK occupying the relevant GPIO resources?

    The message:

    platform bus@100000:wiz@5020000: deferred probe pending
    platform c000000.ethernet: deferred probe pending

    Thanks

  • Hi,

    The kernel now boots normally, but the driver keeps reporting errors. Are other devices in the new SDK occupying the relevant GPIO resources?

    The message:

    platform bus@100000:wiz@5020000: deferred probe pending
    platform c000000.ethernet: deferred probe pending

    Are you checking on the TI EVM or on a custom board?

    If it is a custom board, you need to configure the device tree for your board, the same as was done for SDK 9.1.
    For the TI EVM, you can use the SDK as is, with the change suggested above if the kernel was not booting.

    Best Regards,
    Sudheer

  • Hi

    Platform: TDA4VH TI EVM

    SDK: ti-processor-sdk-linux-adas-j784s4-evm-10_01_00_05

    The test results are the same as before.

    speed: 1000 Mb/s, Duplex: Full  --->  eth2 + eth3 + eth4 + eth5 ≈ 1219 Mb/s
    speed: 100 Mb/s,  Duplex: Full  --->  eth2 + eth3 + eth4 + eth5 ≈ 180 Mb/s

    Thanks

  • Hi,

    How are you running against the same iperf3 server?

    Is the setup two boards, with one side acting as the server with all interfaces (eth2+eth3+eth4+eth5) in bridge mode,
    and the other side using individual MAC-only ports?

    If a bridge is enabled, you can see flooding of broadcast packets at the h/w layer because switch mode is enabled.
    You can capture the CPSW statistics before and after the test and check whether the broadcast/multicast packet counters increase significantly.
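
    For example (a sketch; the exact counter names exposed by the driver vary, so grep broadly):

      ethtool -S eth2 | grep -iE 'bcast|mcast' > stats_before.txt
      # ... run the iperf3 test ...
      ethtool -S eth2 | grep -iE 'bcast|mcast' > stats_after.txt
      diff stats_before.txt stats_after.txt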

    Can you also check with two boards as 4 server/client pairs, e.g. eth2 <-> eth2 with the server on one side and the client on the other, and similarly for eth3, eth4 and eth5?

    Best Regards,
    Sudheer

  • Hi

    We use ti,mac-only mode.
    The TI board is the client, connected to four of our boards acting as servers.

    TI board NUM1 --> ifconfig eth2 162.0.0.1 && iperf3 -c 162.0.0.1 -t 100               our board NUM1 --> iperf3 -s -B 162.0.0.1

    TI board NUM1 --> ifconfig eth3 163.0.0.1 && iperf3 -c 163.0.0.1 -t 100               our board NUM2 --> iperf3 -s -B 163.0.0.1

    TI board NUM1 --> ifconfig eth4 164.0.0.1 && iperf3 -c 164.0.0.1 -t 100               our board NUM3 --> iperf3 -s -B 164.0.0.1

    TI board NUM1 --> ifconfig eth5 165.0.0.1 && iperf3 -c 165.0.0.1 -t 100               our board NUM4 --> iperf3 -s -B 165.0.0.1

    Thanks

  • Hi,

    Is your board also based on TDA4VH? If not, can you connect to a PC using a bridge and check,
    or check two TI EVMs back to back?

    Best Regards,
    Sudheer

  • Hi Sudheer,

    Does TI have throughput data for multiple Ethernet ports running at the same time?

    If not, do you have a test environment to replicate this on your side?

    Looking at the SoC structure and internal bus bandwidth, is there a bottleneck that prevents reaching 8x1 Gbps on the upstream port of CPSW-9G?

    The throughput data impacts the customer's system design decisions, and it is time to respin a new prototype board for the customer, so we need your help to reach a conclusion ASAP. Thanks.

  • Hi

    1. We ran the same test using a PC and the results are the same.

    2. We don't have two TI boards here, so we cannot run that test at present.

    Thanks

  • Hi,

    We have done some testing on TI EVMs, but not on TDA4VH.
    From TDA4VM to DRA821 we observe ~2 Gbps total throughput at most with all cores together.

    We will check once on TDA4VH to TDA4VH; it should be more than ~2 Gbps.
    As mentioned above, we may achieve at most ~3.3 Gbps due to s/w limitations (general network stack, embedded Arm cores).

    Best Regards,
    Sudheer

  • Hi

    Can you add another test? Set ETH* (all ports) to 100 Mbit/s full-duplex mode and measure the total speed across all ports.

    Thanks

  • Hi,

    Can you add another test? Set ETH* (all ports) to 100 Mbit/s full-duplex mode and measure the total speed across all ports.

    We are observing ~90 Mbps for each port, and more than 300 Mbps with 4 ports together.

    Best Regards,
    Sudheer

  • Hi

    What is your test command and test environment?

    When we bring up the third port, TX reports a transmit timeout:
    am65_cpsw_nuss_ndo_host_tx_timeout()  -->  netdev_err(ndev,"txq ....." )

    Thanks

  • Hi,

    am65_cpsw_nuss_ndo_host_tx_timeout()  -->  netdev_err(ndev,"txq ....." )

    Sometimes we observe the same, but only on one port.

    Can you increase the number of descriptors and check?
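
    A quick way to see whether the ring/descriptor sizes can be changed from user space (a sketch; if the driver does not support ethtool ring-parameter changes, the descriptor pool size has to be increased in the driver source instead):

      ethtool -g eth2            # query current RX/TX ring sizes, if the driver reports them
      ethtool -G eth2 tx 500     # attempt to enlarge the TX ring (example value; may be unsupported)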

    Best Regards,
    Sudheer

  • Hi

    With two ports we can reach 90 Mb/s each. When the third or fourth port is added, throughput drops to 0 Mb/s and the above error occurs.

    SDK 9.1 uses one TX channel and SDK 10.1 uses eight TX channels, but both give around ~200 Mb/s in total (TI TDA4VH board).

    Thanks

  • Hi, 

    Let us check once on TDA4VH board. 

    Best Regards, 

    Sudheer

  • Hi,

    I tried multiple experiments on TDA4VH. My setup was as follows:

    server cmd : iperf3 -s -p 5001 & iperf3 -s -p 5002 & iperf3 -s -p 5003 & iperf3 -s -p 5004 & iperf3 -s -p 5005 & iperf3 -s -p 5006 & iperf3 -s -p 5007 & iperf3 -s -p 5008 &

    What I did and observed is as follows:

    1. With all the links as 1G, I ran iperf3 TCP on all 4 links simultaneously.
      1. I observed ~240M traffic on all 4 ports
      2. This is because the link that carries the aggregated traffic is limited to 1G (external switch to PC)
      3. cmd: iperf3 --bind-dev eth1 -c 192.168.1.230 -p 5001 -T eth1 & iperf3 --bind-dev eth2 -c 192.168.1.230 -p 5002 -T eth2 & iperf3 --bind-dev eth3 -c 192.168.1.230 -p 5003 -T eth3 & iperf3 --bind-dev eth4 -c 192.168.1.230 -p 5004 -T eth4 &
    2. With all the links as 100M, I ran iperf3 TCP on all 4 links simultaneously.
      1. I observed ~30M sustained throughput on all 4 ports
      2. cmd : iperf3 --bind-dev eth1 -c 192.168.1.230 -p 5001 -T eth1 -b 100M & iperf3 --bind-dev eth2 -c 192.168.1.230 -p 5002 -T eth2 -b 100M & iperf3 --bind-dev eth3 -c 192.168.1.230 -p 5003 -T eth3 -b 100M & iperf3 --bind-dev eth4 -c 192.168.1.230 -p 5004 -T eth4 -b 100M &
    3. With all links as 100M, I ran iperf3 UDP on 4 links bidirectionally.
      1. I observed full ~190M (~95M on tx and ~95M on rx) on all 4 ports
      2. cmd : iperf3 --bind-dev eth1 -c 192.168.1.230 -p 5001 -T eth1 -u -b 100M & iperf3 --bind-dev eth2 -c 192.168.1.230 -p 5002 -T eth2 -u -b 100M & iperf3 --bind-dev eth3 -c 192.168.1.230 -p 5003 -T eth3 -u -b 100M & iperf3 --bind-dev eth4 -c 192.168.1.230 -p 5004 -T eth4 -u -b 100M & iperf3 --bind-dev eth1 -c 192.168.1.230 -p 5005 -T eth1R -u -b 100M -R & iperf3 --bind-dev eth2 -c 192.168.1.230 -p 5006 -T eth2R -u -b 100M -R & iperf3 --bind-dev eth3 -c 192.168.1.230 -p 5007 -T eth3R -u -b 100M -R & iperf3 --bind-dev eth4 -c 192.168.1.230 -p 5008 -T eth4R -u -b 100M -R &

    In my experiment, I never saw the netdev timeout error.

    Regards,
    Tanmay

  • Hi Tanmay,

    Attached is the customer's test setup. The customer would like to achieve higher and more consistent throughput on each port.

    I understand that in your test setup the bottleneck is the single 1G link between the switch and the PC. Please help to test according to the customer's use case and share the test data.

    The real use case uses all 8 ports of CPSW-9G, connected to different modules in the system. Testing that way would help a lot; I know the ENET module needs to be modified to support 8 ports.

  • Hi Tony,

    What is the exact requirement coming from the customer? I will try to find the maximum value with TDA4VH at 1G link speed with all the ports.

    Regards,
    Tanmay

    The customer's test results are not good, far from expectation, and we are not sure whether that is due to the SDK version. So we want to know:

    #1. The theoretical maximum throughput.

    #2. Real test results.

    #3. The reason for the gap.

    #4. Whether it can be improved, and how.

    The customer wants throughput data with all ports at 1 Gbps and with all ports at 100 Mbps (manually set to 100M mode), since some ports will be used as EtherCAT masters, which must run at 100 Mbps.

  • Hi Tanmay,

    Since our product uses all 8 ports of CPSW-9G, we set up a test with 8 ports, TDA4VH ---> 8*TDA4VH, as follows:

    Server command:  iperf3 -s -B 161.0.0.1 /  iperf3 -s -B 162.0.0.1 /  iperf3 -s -B 163.0.0.1 /  iperf3 -s -B 164.0.0.1 /  ...

    Client default LINK mode: 1000Mbps/FULL
    1. With all the links at 1G, TCP test
            command: iperf3 --bind-dev eth1 -c 161.0.0.1 -T eth1 -b 1000M &
                     iperf3 --bind-dev eth2 -c 162.0.0.1 -T eth2 -b 1000M &
                     iperf3 --bind-dev eth3 -c 163.0.0.1 -T eth3 -b 1000M &
                     iperf3 --bind-dev eth4 -c 164.0.0.1 -T eth4 -b 1000M &
                     iperf3 --bind-dev eth5 -c 165.0.0.1 -T eth5 -b 1000M &
                     iperf3 --bind-dev eth6 -c 166.0.0.1 -T eth6 -b 1000M &
                     iperf3 --bind-dev eth7 -c 167.0.0.1 -T eth7 -b 1000M &
                     iperf3 --bind-dev eth8 -c 168.0.0.1 -T eth8 -b 1000M
           Throughput: 143+156+160+151+149+155+157+160=1231Mb/s

    2. With all the links at 1G, UDP test
            command: iperf3 --bind-dev eth1 -c 161.0.0.1 -T eth1 -u -b 1000M &
                     iperf3 --bind-dev eth2 -c 162.0.0.1 -T eth2 -u -b 1000M &
                     iperf3 --bind-dev eth3 -c 163.0.0.1 -T eth3 -u -b 1000M &
                     iperf3 --bind-dev eth4 -c 164.0.0.1 -T eth4 -u -b 1000M &
                     iperf3 --bind-dev eth5 -c 165.0.0.1 -T eth5 -u -b 1000M &
                     iperf3 --bind-dev eth6 -c 166.0.0.1 -T eth6 -u -b 1000M &
                     iperf3 --bind-dev eth7 -c 167.0.0.1 -T eth7 -u -b 1000M &
                     iperf3 --bind-dev eth8 -c 168.0.0.1 -T eth8 -u -b 1000M
           Throughput: 212 + 209 + 209 + 211 + 207 + 212 + 213 + 210 = 1683 Mb/s

    Test conclusion: the maximum total bandwidth is less than 3.3 Gbps.

    Our actual use case is 100 Mbps/FULL, so we changed the mode and tested again:

    Command: ethtool -s eth1 speed 100 duplex full && ethtool -s eth2 speed 100 duplex full ...
    1. TCP test
          command: iperf3 --bind-dev eth1 -c 161.0.0.1 -T eth1 &
                   iperf3 --bind-dev eth2 -c 162.0.0.1 -T eth2 &
                   iperf3 --bind-dev eth3 -c 163.0.0.1 -T eth3 &
                   iperf3 --bind-dev eth4 -c 164.0.0.1 -T eth4 &
                   iperf3 --bind-dev eth5 -c 165.0.0.1 -T eth5 &
                   iperf3 --bind-dev eth6 -c 166.0.0.1 -T eth6 &
                   iperf3 --bind-dev eth7 -c 167.0.0.1 -T eth7 &
                   iperf3 --bind-dev eth8 -c 168.0.0.1 -T eth8
          Throughput: 18 + 18 + 18 + 18 + 18 + 18 + 18 + 18 =144Mb/s
    2. UDP test
          command: iperf3 --bind-dev eth1 -c 161.0.0.1 -T eth1 -u -b 100M &
                   iperf3 --bind-dev eth2 -c 162.0.0.1 -T eth2 -u -b 100M &
                   iperf3 --bind-dev eth3 -c 163.0.0.1 -T eth3 -u -b 100M &
                   iperf3 --bind-dev eth4 -c 164.0.0.1 -T eth4 -u -b 100M &
                   iperf3 --bind-dev eth5 -c 165.0.0.1 -T eth5 -u -b 100M &
                   iperf3 --bind-dev eth6 -c 166.0.0.1 -T eth6 -u -b 100M &
                   iperf3 --bind-dev eth7 -c 167.0.0.1 -T eth7 -u -b 100M &
                   iperf3 --bind-dev eth8 -c 168.0.0.1 -T eth8 -u -b 100M
         Throughput: 94 + 94 + 94 + 94 + 94 + 94 + 94 + 94 = 752 Mb/s

    Questions:

    1. Why can UDP run close to the theoretical speed (~95 Mbps per port), while the TCP bandwidth appears to be evenly divided among the eight ports (~20 Mbps each)?

    2. When we tested the two ports of CPSW2G, their speeds were not affected by each other.

    3. Is the current CPSW9G TCP throughput result reasonable, and can it be improved?
    3. Is current CPSW9G TCP throughput test result reasonable? can it be improved?

    Thanks