
Linux/AM5728: CPSW issues

Part Number: AM5728

Tool/software: Linux

Hi all,

We are using the AM5728 for network distribution, and we have two Ethernet interfaces, eth0 and eth1.

We have encountered an issue.

eth0 on its own can send or receive network traffic at 300 Mb/s and the Linux system works well.

eth1 on its own can also send or receive network traffic at 300 Mb/s and the Linux system works well.

However, if we use eth0 and eth1 together, with both interfaces sending or receiving at 200 Mb/s, eth1 crashes and drops packets, and the Linux system prints a watchdog timeout error like the one below:

root@am57xx-evm:~# [   22.078811] ------------[ cut here ]------------

[   22.083590] WARNING: CPU: 0 PID: 3 at net/sched/sch_generic.c:316 dev_watchdog+0x258/0x25c

[   22.091967] NETDEV WATCHDOG: eth1 (cpsw): transmit queue 0 timed out

[   22.098347] Modules linked in: bc_example(O) sha512_generic sha512_arm sha256_generic sha1_generic sha1_arm_neon sha1_arm md5 xfrm_user xfrm4_tunnel cbc ipcomp xfrm_ipcomp esp4 ah4 af_key xfrm_algo bluetooth rpmsg_proto xhci_plat_hcd xhci_hcd pru_rproc usbcore pruss_intc pruss rpmsg_rpc dwc3 udc_core usb_common snd_soc_simple_card snd_soc_simple_card_utils snd_soc_omap_hdmi_audio ahci_platform libahci_platform libahci pvrsrvkm(O) libata omap_aes_driver omap_sham pruss_soc_bus omap_wdt scsi_mod ti_vpe ti_sc ti_csc ti_vpdma rtc_omap dwc3_omap rtc_palmas extcon_palmas extcon_core rtc_ds1307 omap_des snd_soc_tlv320aic3x des_generic crypto_engine omap_remoteproc virtio_rpmsg_bus rpmsg_core remoteproc sch_fq_codel uio_module_drv(O) uio gdbserverproxy(O) cryptodev(O) cmemk(O)

[   22.167354] CPU: 0 PID: 3 Comm: ksoftirqd/0 Tainted: G           O    4.9.59-ga75d8e9305 #1

[   22.175739] Hardware name: Generic DRA74X (Flattened Device Tree)

[   22.181856] Backtrace:

[   22.184331] [<c020b29c>] (dump_backtrace) from [<c020b558>] (show_stack+0x18/0x1c)

[   22.191933]  r7:00000009 r6:600f0013 r5:00000000 r4:c1022668

[   22.197619] [<c020b540>] (show_stack) from [<c04cd680>] (dump_stack+0x8c/0xa0)

[   22.204875] [<c04cd5f4>] (dump_stack) from [<c022e3d4>] (__warn+0xec/0x104)

[   22.211865]  r7:00000009 r6:c0c0fddc r5:00000000 r4:ee8a5dc8

[   22.217548] [<c022e2e8>] (__warn) from [<c022e42c>] (warn_slowpath_fmt+0x40/0x48)

[   22.225063]  r9:ffffffff r8:c1002d00 r7:ee05e294 r6:ee05e800 r5:ee05e000 r4:c0c0fda0

[   22.232843] [<c022e3f0>] (warn_slowpath_fmt) from [<c07db6f0>] (dev_watchdog+0x258/0x25c)

[   22.241052]  r3:ee05e000 r2:c0c0fda0

[   22.244639]  r4:00000000

[   22.247185] [<c07db498>] (dev_watchdog) from [<c02919d8>] (call_timer_fn.constprop.2+0x30/0xa0)

[   22.255920]  r10:40000001 r9:ee05e000 r8:c07db498 r7:00000000 r6:c07db498 r5:00000100

[   22.263779]  r4:ffffe000

[   22.266323] [<c02919a8>] (call_timer_fn.constprop.2) from [<c0291ae8>] (expire_timers+0xa0/0xac)

[   22.275143]  r6:00000200 r5:ee8a5e78 r4:eed38480

[   22.279779] [<c0291a48>] (expire_timers) from [<c0291b94>] (run_timer_softirq+0xa0/0x18c)

[   22.287992]  r9:00000001 r8:c1002080 r7:eed38480 r6:c1002d00 r5:ee8a5e74 r4:00000001

[   22.295769] [<c0291af4>] (run_timer_softirq) from [<c0232fbc>] (__do_softirq+0xf8/0x234)

[   22.303892]  r7:00000100 r6:ee8a4000 r5:c1002084 r4:00000022

[   22.309576] [<c0232ec4>] (__do_softirq) from [<c0233138>] (run_ksoftirqd+0x40/0x4c)

[   22.317265]  r10:00000000 r9:00000000 r8:ffffe000 r7:c1013ff0 r6:00000001 r5:ee861540

[   22.325123]  r4:ee8a4000

[   22.327669] [<c02330f8>] (run_ksoftirqd) from [<c024ec74>] (smpboot_thread_fn+0x154/0x268)

[   22.335970] [<c024eb20>] (smpboot_thread_fn) from [<c024adb0>] (kthread+0x100/0x118)

[   22.343746]  r10:00000000 r9:00000000 r8:c024eb20 r7:ee861540 r6:ee8a4000 r5:ee861580

[   22.351606]  r4:00000000 r3:ee898c80

[   22.355197] [<c024acb0>] (kthread) from [<c0207c88>] (ret_from_fork+0x14/0x2c)

[   22.362448]  r8:00000000 r7:00000000 r6:00000000 r5:c024acb0 r4:ee861580

[   22.369202] ---[ end trace 175f3c4f1129894a ]---

Have you encountered this problem?

My hardware is a custom board; the AM5728 silicon revision is ES2.0.

My software is the Processor SDK 4.2 release.

  • Hi all,

    We also referred to the document below:

    processors.wiki.ti.com/.../Linux_Core_CPSW_User's_Guide

    My custom board uses dual standalone EMAC mode, and the CPDMA descriptors and skb buffers are shared between the two eth interfaces.

    So we wonder whether this has an effect.

    Thanks 

    regards

     

  • Hi,


    Could you please describe the network topology that your application is trying to implement? Are the two ports bridged? Are the two ports on independent subnets? Are you using iperf or a custom application to send and receive traffic?

    For the Linux kernel, could you please post the result of uname -a? I would like to know which kernel you are using. Are you using a custom kernel configuration? Is this issue occurring on a custom board or a TI EVM? I am assuming a custom board, but I wanted to be sure.

    Best Regards,
    Schuyler
  • Hi Schuyler,

    Thanks for your reply.

    1. Our application does stream forwarding, so it needs high network bandwidth.

    2. Are the two ports bridged?

    I think the two ports are not bridged, because my custom board uses dual standalone EMAC mode.

    3. Are the two ports on independent subnets?

    Yes, they are on different subnets: eth0 is 192.168.0.200 and eth1 is 192.168.1.200.

    4. Are you using iperf or a custom application to send and receive traffic?

    We use my custom application to test eth0 and iperf to test eth1.

    5. For the Linux kernel, could you please post the result of uname -a?

    My software is the Processor SDK 4.2 release; the uname -a output is:

    Linux am57xx-evm 4.9.59-ga75d8e9305 #5 SMP PREEMPT Fri Jan 26 11:07:32 CST 2018 armv7l GNU/Linux

    6. Which kernel are you using? Are you using a custom kernel configuration?

    We are using the config file arch/arm/configs/tisdk_am57xx-evm_defconfig and it has not been modified.

    7. Is this issue occurring on a custom board or a TI EVM?

    The issue occurs on my custom board; we do not know whether it also occurs on the TI EVM.

    My hardware is a custom board; the AM5728 silicon revision is ES2.0.

    8. Over the last few days we have run some tests.

    We referred to the document below:

    processors.wiki.ti.com/.../Linux_Core_CPSW_User's_Guide

    We changed ti_cpsw.descs_pool_size to 4096 (see the sketch at the end of this list for how the parameter can be passed) and got some new test results:

    We use eth0 for stream forwarding and eth1 (TCP mode) for iperf testing.

    Generally, when eth0 reaches 400 Mb/s, eth1's bandwidth decreases to about 320 Mb/s.

    When eth0 is at 300 Mb/s, eth1's bandwidth increases to about 420 Mb/s.

    That is to say, the total bandwidth of eth0 plus eth1 is 720 Mb/s; if one interface's bandwidth increases, the other's decreases.

    So my question is: how can we increase the total bandwidth to 900 Mb/s?
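
    For reference, a rough sketch of how the descs_pool_size parameter can be passed on the kernel command line (assuming a TI-style U-Boot environment where optargs is appended to bootargs; the exact variable name depends on the boot scripts):

    => setenv optargs ti_cpsw.descs_pool_size=4096
    => saveenv
    => boot

    # after boot, the active value should be readable (path may vary by kernel version):
    cat /sys/module/ti_cpsw/parameters/descs_pool_size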

    Thanks

    regards

     

  • Hi,

    How are you forwarding the traffic between the two ports? Are you setting up the routing table? Or is this done by your application?

    Are you running the TI SDK rootfs? Will your final application need the systemd-type user space?

    Best Regards,
    Schuyler
  • Hi Schuyler,

    Thanks for your reply.

    1. How are you forwarding the traffic between the two ports?

    We use my custom application to test eth0 and iperf to test eth1.

    My custom application does stream forwarding.

    It receives a web camera stream (8 Mb/s), copies it into many streams, and finally sends these streams to a PC.

    We use iperf to test eth1; our iperf instance is a TCP client that sends a stream to the iperf server.

    Generally, when eth0's send bandwidth reaches 400 Mb/s, eth1's bandwidth decreases to about 320 Mb/s.

    When eth0's send bandwidth is 300 Mb/s, eth1's bandwidth increases to about 420 Mb/s.

    That is to say, the total send bandwidth of eth0 plus eth1 is 720 Mb/s; if one interface's bandwidth increases, the other's decreases.

    2. Are you setting up the routing table?

    No, we do not set up the routing table.

    3. Are you running the TI SDK rootfs?

    Yes, we use the TI SDK rootfs and the filesystem has not been modified.

    4. Will your final application need the systemd-type user space?

    We just use systemd to set up the boot sequence.

  • Hi,

    Is the custom app running on eth0 always just the streaming app, and is it using TCP? You might try enabling interrupt pacing using ethtool; this allows the ACK responses to sent packets to be processed in bulk rather than as they arrive.
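
    For example, something like the commands below (a sketch; rx-usecs is the receive pacing interval in microseconds, and the best value is workload dependent):

    ethtool -C eth0 rx-usecs 250
    ethtool -C eth1 rx-usecs 250

    # the current coalescing settings can be read back with:
    ethtool -c eth0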

    Do you know how to boot to a shell? This bypasses all the services that systemd sets up, including some network daemons. It might be a good experiment to see whether the system services are taking up processor bandwidth.

    Best Regards,
    Schuyler
  • Hi Schuyler,

    Thanks for your reply.

    1. Is the custom app running on eth0 always just the streaming app, and is it using TCP?

    Yes, my custom app uses TCP for streaming, and it is always the only app on eth0.

    We have enabled interrupt pacing with ethtool, but it did not help.

    2. Do you know how to boot to a shell?

    Right now we start the streaming app and iperf manually after Linux boots up,

    and later we will use systemd to start the streaming app.

    3. We are still confused about the CPSW.

    My custom board uses dual standalone EMAC mode, and the CPDMA descriptors and skb buffers are shared between the two eth interfaces.

    So we wonder whether this has an effect.

    Thanks

    regards

  • Hi,

    We have a proposed patch (attached) that might solve the watchdog timeout you are seeing; you can apply it to your kernel and rebuild.

    Another question about the application. I want to make sure I understand what your TCP application is doing on eth0: is it receiving or sending? eth1 is running the iperf client, with the iperf server on the PC. Is this correct?

    What settings have you used with ethtool -C to configure interrupt pacing?
    Could you please provide the settings of the iperf command that you are using?

    When you are measuring the bandwidth of the two applications, what are you using to measure the bandwidth of the two Ethernet ports?

    Best Regards,
    Schuyler
  • Hi,

    I forgot to attach the patch:

    Best Regards,

    Schuyler

    From bff1236f42c224ee854d4f4bbc2a053c566149f1 Mon Sep 17 00:00:00 2001
    From: Grygorii Strashko <grygorii.strashko@ti.com>
    Date: Fri, 2 Feb 2018 10:23:58 -0600
    Subject: [PATCH] [debug] net watchdog timeout fix
    
    Signed-off-by: Grygorii Strashko <grygorii.strashko@ti.com>
    ---
     drivers/net/ethernet/ti/cpsw.c | 16 ++++++++++++++--
     1 file changed, 14 insertions(+), 2 deletions(-)
    
    diff --git a/drivers/net/ethernet/ti/cpsw.c b/drivers/net/ethernet/ti/cpsw.c
    index 39c39d6..f6c64a6 100644
    --- a/drivers/net/ethernet/ti/cpsw.c
    +++ b/drivers/net/ethernet/ti/cpsw.c
    @@ -1683,6 +1683,7 @@ static netdev_tx_t cpsw_ndo_start_xmit(struct sk_buff *skb,
     		q_idx = q_idx % cpsw->tx_ch_num;
     
     	txch = cpsw->txv[q_idx].ch;
    +	txq = netdev_get_tx_queue(ndev, q_idx);
     	ret = cpsw_tx_packet_submit(priv, skb, txch);
     	if (unlikely(ret != 0)) {
     		cpsw_err(priv, tx_err, "desc submit failed\n");
    @@ -1693,15 +1694,26 @@ static netdev_tx_t cpsw_ndo_start_xmit(struct sk_buff *skb,
     	 * tell the kernel to stop sending us tx frames.
     	 */
     	if (unlikely(!cpdma_check_free_tx_desc(txch))) {
    -		txq = netdev_get_tx_queue(ndev, q_idx);
     		netif_tx_stop_queue(txq);
    +
    +		/* Barrier, so that stop_queue visible to other cpus */
    +		smp_mb();
    +
    +		if (cpdma_check_free_tx_desc(txch))
    +			netif_tx_wake_queue(txq);
     	}
     
     	return NETDEV_TX_OK;
     fail:
     	ndev->stats.tx_dropped++;
    -	txq = netdev_get_tx_queue(ndev, skb_get_queue_mapping(skb));
     	netif_tx_stop_queue(txq);
    +
    +	/* Barrier, so that stop_queue visible to other cpus */
    +	smp_mb();
    +
    +	if (cpdma_check_free_tx_desc(txch))
    +		netif_tx_wake_queue(txq);
    +
     	return NETDEV_TX_BUSY;
     }
     
    -- 
    2.10.5
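
    To apply and rebuild, roughly (a sketch; the directory name, patch file name, and toolchain prefix below are placeholders for whatever you use on your side):

    cd linux-4.9.59                  # your kernel source tree
    git am cpsw-watchdog.patch       # the patch above, saved to a file
    make ARCH=arm CROSS_COMPILE=arm-linux-gnueabihf- zImage dtbs modules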
    
    

  • Hi Schuyler,

    Thanks for your reply.

    Right now the total bandwidth of eth0 and eth1 is 720 Mb/s, and we have tested for over 100 hours; the stability is OK.

    We will test the patch and then tell you the result.

    My custom TCP application receives a web camera stream (8 Mb/s), copies it into many streams (which together can reach up to 400 Mb/s), and finally sends these streams to a (Windows) PC.

    We use iperf to test eth1; my iperf instance is a TCP client that sends a stream to the iperf server, which is another AM5728 board.

    So we mainly test the sending performance of eth0 and eth1.

    The ethtool -C setting we used for interrupt pacing is below:

    ethtool -C eth0 rx-usecs 500

    The iperf commands we are using are below:

    iperf client:

    iperf -B 192.168.0.27 -c 192.168.0.77 -i 10 -t 999999 &

    iperf server:

    iperf -B 192.168.0.77 -s -i 1 &

    Regarding how we measure the bandwidth of the two Ethernet ports:

    As you know, iperf can measure the bandwidth of eth1, because it prints the bandwidth in real time.

    My custom application sends the streams to the Windows PC, and the Windows PC prints the receive bandwidth in real time.

    Thanks

    regards

    xixi

  • Hi xixi,

    Please try an rx-usecs value of 150 or 100 for interrupt pacing.
    I apologize, but I still do not understand how you are using eth0. Are you receiving an 8 Mbps web camera stream on eth0 and then copying that stream to several other devices connected to the eth0 subnet? Are these devices on a multicast address, or are there several unicast addresses for these devices?

    Right now the traffic on the two ports is de-coupled. Do you plan to synchronize them in the future? Another thing to look at is the available processor bandwidth under the current traffic load. Please run the command below for a few seconds and attach the results; it will show the core loading.

    mpstat -P ALL 2

    Best Regards,
    Schuyler
  • Dear Schuyler,

    Many thanks!

    1. Please try an rx-usecs value of 150 or 100 for interrupt pacing.

    We have tried these values and the result is the same as before.

    2. About my custom application:

    My custom application uses eth0 to receive an 8 Mb/s web camera stream, copies the stream, and finally sends it to a Windows PC connected to the eth0 subnet.

    Right now only one Windows PC receives these streams.

    In the future these streams will be sent to several devices with unicast addresses.

    3. Right now the traffic on the two ports is de-coupled. Do you plan to synchronize them in the future?

    No, we will not synchronize them in the future.

    4. The picture below shows my system load:

  • Dear Schuyler

    Do you have any other advice for me about this issue?

    Thanks

    regards

  • Hi,
    After reviewing this thread again, I believe the solution most likely lies in the application design, in terms of how traffic is balanced between the two ports, as the kernel will not do this to my knowledge. There are many possible limitations here whose cause and effect you will have to determine. One of them is that multiple unicast destinations will behave independently and at their own rates, which will affect your application's performance. That would be a network interaction issue and beyond the scope of this thread.

    Earlier in the thread it was stated that the two ports are de-coupled. If that is the case, does balancing the network load between the ports matter, or is achieving 900 Mbps more important?

    If I understand correctly, it is 900 Mbps of total send bandwidth that you are interested in? If so, perhaps try running iperf as two separate instances using different port numbers, so that sending on eth0 and eth1 can be done at the same time. Please report back the bandwidth seen on both ports.
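
    For example, something like this (a sketch; substitute your own addresses, and the port numbers are arbitrary):

    # on the AM5728 board: one client bound to each interface address, one port per stream
    iperf -B 192.168.0.200 -c 192.168.0.77 -p 5001 -i 10 -t 60 &
    iperf -B 192.168.1.200 -c 192.168.1.77 -p 5002 -i 10 -t 60 &

    # on the receiving side: one server per port
    iperf -s -B 192.168.0.77 -p 5001 -i 1 &
    iperf -s -B 192.168.1.77 -p 5002 -i 1 &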

    Best Regards,
    Schuyler
  • Hi Schuyler,

    I am very sorry for the late reply, because of the Chinese Spring Festival holidays.

    We are interested in 900 Mbps of total send bandwidth.

    We used a single iperf instance to test the eth0 performance.

    The result is below:

     

    My server command is:

    iperf -B 192.168.0.77 -s -i 1

    We also used two iperf processes to test eth0 and eth1.

    The result is below:

    My server commands are:

    iperf -B 192.168.0.77 -s -i 1 &

    iperf -B 192.168.1.77 -s -i 1 &

    According to the above tests, eth0 is about 400 Mb/s and eth1 is about 400 Mb/s, so the total bandwidth of eth0 and eth1 is 800 Mb/s.

    So how can we improve the bandwidth to 900 Mb/s?

    Thanks

    regards 

  • Hi Schuyler,

    Can you give us any other advice about this issue?

    Thanks

    Regards

  • Hi,
    900 Mbps may not be possible, based on the single-thread test earlier in the post. While the test is running, run top to see whether there is any additional processor bandwidth available at this bit rate. Hitting a limit in iperf is usually an indication that the processor is maxed out. If that is the case, then look at what is really needed and what could perhaps be turned off, such as other user services or daemons.

    The other test to try is running the test in a single-shell context. Here is a link that shows how to boot to a shell:

    processors.wiki.ti.com/.../5x

    This is a test technique and is not meant for a final product; it is only meant to show what is possible if the burden, if any, of a desktop-like user space is removed. In this type of environment you will have to set up IP addresses yourself, as the systemd network manager will not be running.
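
    For example, the interfaces can be brought up manually with something like this (a sketch; substitute your own addresses):

    ip addr add 192.168.0.200/24 dev eth0
    ip link set eth0 up
    ip addr add 192.168.1.200/24 dev eth1
    ip link set eth1 up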

    Best Regards,
    Schuyler
  • Dear Schuyler,

    Thanks for your reply.

    According to the website below, the eth0 Ethernet TCP performance can reach 1108.8 Mb/s. Why can my board not reach this bitrate?

  • Hi,

    That page used to state how the test was performed: the test is done using iperf in dual mode, which sends and receives simultaneously. The Ethernet on the EVM is set to run in full-duplex mode, and the table is the sum of both transmit and receive. The test was done using only a single transmit thread and a single receive thread; combined, they achieved a sum of 1109 Mbps.

    Your experiment is only sending, if I am correct; if you add -d to the iperf command line you might see these numbers. There is a chance, though, that the number will be slightly lower because you are using two ports.
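
    For example, reusing the client command from earlier in this thread (a sketch; -d runs the test in both directions simultaneously):

    iperf -B 192.168.0.27 -c 192.168.0.77 -d -i 10 -t 60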

    Best Regards,
    Schuyler
  • Hi Schuyler,

    OK, we understand now.

     Thanks

    Regards