AM3874 "NETDEV WATCHDOG: eth0 (cpsw): transmit queue 0 timed out"

Other Parts Discussed in Thread: AM3874, DM385

I'm experiencing a very consistent problem with the CPSW on an AM3874 when transmitting UDP packets.

PSP 04.01.00.06

The error shows up in syslog as follows:

------------[ cut here ]------------
WARNING: at net/sched/sch_generic.c:258 dev_watchdog+0x14c/0x234()
NETDEV WATCHDOG: eth0 (cpsw): transmit queue 0 timed out
Modules linked in: ipv6
Backtrace:
[<c003bb70>] (dump_backtrace+0x0/0x110) from [<c02cabc4>] (dump_stack+0x18/0x1c)
r7:c03bbe20 r6:c0251b54 r5:c039103b r4:00000102
[<c02cabac>] (dump_stack+0x0/0x1c) from [<c005e0c8>] (warn_slowpath_common+0x54/0x6c)
[<c005e074>] (warn_slowpath_common+0x0/0x6c) from [<c005e184>] (warn_slowpath_fmt+0x38/0x40)
r9:c041588c r8:c0415c8c r7:c03ba000 r6:c04265ac r5:00000000
r4:cf814000
[<c005e14c>] (warn_slowpath_fmt+0x0/0x40) from [<c0251b54>] (dev_watchdog+0x14c/0x234)
r3:cf814000 r2:c0391053
[<c0251a08>] (dev_watchdog+0x0/0x234) from [<c00689b4>] (run_timer_softirq+0x148/0x1e4)
r6:c0251a08 r5:00000100 r4:c0414e80
[<c006886c>] (run_timer_softirq+0x0/0x1e4) from [<c0063254>] (__do_softirq+0x80/0x108)
[<c00631d4>] (__do_softirq+0x0/0x108) from [<c0063324>] (irq_exit+0x48/0x94)
[<c00632dc>] (irq_exit+0x0/0x94) from [<c002d080>] (asm_do_IRQ+0x80/0xa0)
[<c002d000>] (asm_do_IRQ+0x0/0xa0) from [<c02ccc34>] (__irq_svc+0x34/0xa0)
Exception stack(0xc03bbf50 to 0xc03bbf98)
bf40: 81400181 40000013 c03bbf98 81400081
bf60: c03ba000 c03f3c40 c0029e34 c03be06c 80000000 413fc082 0000001f c03bbfa4
bf80: c03bbf98 c03bbf98 c0038f2c c0038f30 20000013 ffffffff
r5:fa200000 r4:ffffffff
[<c0038ee4>] (default_idle+0x0/0x54) from [<c00394d4>] (cpu_idle+0x50/0x90)
[<c0039484>] (cpu_idle+0x0/0x90) from [<c02c2068>] (rest_init+0x60/0x78)
r5:c03f3c40 r4:c0417ea8
[<c02c2008>] (rest_init+0x0/0x78) from [<c0008c6c>] (start_kernel+0x258/0x2ac)
[<c0008a14>] (start_kernel+0x0/0x2ac) from [<80008044>] (0x80008044)
r5:c03f3d5c r4:10c53c7d
---[ end trace d8282103f1c61381 ]---

To elaborate, my root filesystem is NFS-mounted, and heavy ICMP traffic from another host to this board works well for hours on end.  So far the problem only manifests itself after transmitting roughly 1 GB of UDP traffic.  Once this trace shows up, the system still responds to MDIO events and will show the link coming and going as the cable is plugged and unplugged, but I cannot get the device to reset and start sending again.

Has anybody seen an issue like this before I start digging into the cpsw driver?

Best Regards,

-Ryan

  • A little bit more information:

    Moved root filesystem to NAND which allowed some post mortem.

    I set the message level to an arbitrary value of 1000 with ethtool, and the log shows what was expected: the transmit DMA has timed out.

    # ethtool -s eth0 msglvl 1000

    Here is the syslog:

    net eth0: initializing cpsw version 1.10 (0)
    CPSW phy found : id is : 0x0
    CPSW phy found : id is : 0x0
    net eth0: submitted 64 rx descriptors
    ADDRCONF(NETDEV_UP): eth0: link is not ready
    PHY: 0:00 - Link is Up - 10/Half
    PHY: 0:01 - Link is Up - 10/Half
    ADDRCONF(NETDEV_CHANGE): eth0: link becomes ready
    PHY: 0:00 - Link is Up - 1000/Full
    PHY: 0:01 - Link is Up - 1000/Full
    net_ratelimit: 95755 callbacks suppressed
    eth0: no IPv6 routers present
    net_ratelimit: 233772 callbacks suppressed
    net_ratelimit: 184957 callbacks suppressed
    net eth0: desc submit failed
    net eth0: desc submit failed
    net eth0: desc submit failed
    net eth0: transmit timeout, restarting dma
    net eth0: desc submit failed
    net eth0: desc submit failed
    net eth0: desc submit failed
    net eth0: desc submit failed
    net eth0: desc submit failed
    net eth0: desc submit failed
    net eth0: transmit timeout, restarting dma

    At this point, bringing the device down and then back up allows another blast of UDP packets before another transmit timeout.

    -Ryan

  • I noticed in davinci_cpdma.c that when "net eth0: desc submit failed" is logged, the cpdma_desc_alloc() function is failing to return a descriptor from the pool when only 192 of the 256 are in use.  I've not figured out why that is yet, but I ran another test over the weekend to see what would happen with a larger amount of ICMP traffic running at the same time as the 1200-byte UDP packets.

    I used the following flood ping from the host that is receiving the udp packets.

    # ping -f -q -s 1200 -l 4  YYY.YYY.YYY.YYY

    This limits the UDP traffic to ~1/3 of the original rate.  Under this scenario the same messages:

    net eth0: desc submit failed
    net eth0: transmit timeout, restarting dma

    are logged; however, at some point after the timeout and stall occurred, the transmission started up again.

    This appears to me to be a race condition where the cpdma somehow gets kick-started again because of the ICMP traffic.

    -Ryan

  • Still working this issue and tried two things:

    1) Dumping the CPSW registers shows that TXCARRIERSENSEERRORS (0x460) is reporting a value of 2.  This seems very strange to me because the other side of the CPSW port is a 1000BaseT switch.

    Most of the ALE entries show the MCAST_FWD_STATE to be 0.  The sprugz7a.pdf says that 0 == forwarding, and 3 == forwarding.  It also states "The forward state test returns a true value if both the RX and TX ports are in the required state."  This makes me think the value should be 3.

    2) I tried the udp transmit application on a Mistral EVM8148, and the application ran perfectly.  On the EVM the app never showed a TXCARRIERSENSEERRORS value greater than 0 and the MCAST_FWD_STATE shows up as 3.  Here are the ALE differences:

    Custom Board with AM3874:

    ale_ent{0000}[1]: Unitcast Address Table Entry:
                  Port_Number=0, BLOCK=0, UNICAST_TYPE="NOT AGEABLE"
                  UNICAST_ADDRESS=40:5f:c2:6b:79:81
    ale_ent{0001}[1]: Multicast Address Table Entry
                  Port_Mask=7, SUPER=0, MCAST_FWD_STATE=0
                  MULTICAST_ADDRESS=ff:ff:ff:ff:ff:ff
    ale_ent{0002}[1]: Multicast Address Table Entry
                  Port_Mask=7, SUPER=0, MCAST_FWD_STATE=0
                  MULTICAST_ADDRESS=33:33:00:00:00:01
    ale_ent{0003}[1]: Multicast Address Table Entry
                  Port_Mask=7, SUPER=0, MCAST_FWD_STATE=0
                  MULTICAST_ADDRESS=01:00:5e:00:00:01
    ale_ent{0004}[1]: Multicast Address Table Entry
                  Port_Mask=7, SUPER=0, MCAST_FWD_STATE=0
                  MULTICAST_ADDRESS=33:33:ff:6b:79:81

    Mistral EVM8148 board with DM8148:

    ale_ent{0000}[1]: Unitcast Address Table Entry:
                  Port_Number=0, BLOCK=0, UNICAST_TYPE="NOT AGEABLE"
                  UNICAST_ADDRESS=40:5f:c2:48:23:15
    ale_ent{0001}[1]: Multicast Address Table Entry
                  Port_Mask=7, SUPER=0, MCAST_FWD_STATE=3
                  MULTICAST_ADDRESS=ff:ff:ff:ff:ff:ff
    ale_ent{0002}[1]: Multicast Address Table Entry
                  Port_Mask=7, SUPER=0, MCAST_FWD_STATE=3
                  MULTICAST_ADDRESS=01:00:5e:00:00:01
    ale_ent{0004}[1]: Multicast Address Table Entry
                  Port_Mask=7, SUPER=0, MCAST_FWD_STATE=3
                  MULTICAST_ADDRESS=33:33:00:00:00:01
    ale_ent{0005}[1]: Multicast Address Table Entry
                  Port_Mask=7, SUPER=0, MCAST_FWD_STATE=3
                  MULTICAST_ADDRESS=33:33:ff:48:23:15
    ale_ent{0061}[1]: Unitcast Address Table Entry:
                  Port_Number=2, BLOCK=0, UNICAST_TYPE="AGEABLE TOUCHED"
                  UNICAST_ADDRESS=00:1b:21:81:d1:06
    ale_ent{0062}[1]: Unitcast Address Table Entry:
                  Port_Number=2, BLOCK=0, UNICAST_TYPE="AGEABLE TOUCHED"
                  UNICAST_ADDRESS=00:13:21:fc:83:7c

    Any ideas why the ALE entries might be so vastly different?  Or why TXCARRIERSENSEERRORS is getting incremented when the EVM and my board are going into the same switch?

    One other thing of note is that my board is using the second slave port of the CPSW i.e. EMAC[1] with GMII, and the EVM is using RGMII.

    -Ryan

  • Hi Ryan,

    From what I understand, you're trying to send UDP packets from your target (AM3874) to your host (Desktop PC), is that correct?

    Are you using the "ping" command? If yes, in what format?

    BR

    Pavel

  • Hi Pavel,

    You are correct.  The fastest and easiest means to invoke the problem is to transmit UDP packets from the AM3874 target to my PC via a simple application that uses sendto() named "udpsource".  I am not using ping to transmit UDP packets, but I have used ping to create some ICMP traffic while trying to diagnose the problem.  The ping command I mentioned in posting 3 is:

    # ping -f -q -s 1200 -l 4  YYY.YYY.YYY.YYY

    In that posting I was using ping to cause ICMP traffic from the PC to the AM3874 target.  In this scenario the "udpsource" application would run indefinitely without manifesting the "NETDEV WATCHDOG: eth0 (cpsw): transmit queue 0 timed out" message where the ethernet interface will no longer transmit.

    Thanks,

    -Ryan

  • Hi Ryan,

    About the "udpsource" application, is it a test application available on the Internet? If yes, could you provide me a link?

    I need these details in order to stay as close as possible to your working environment.

    BR

    Pavel

  • Hi Pavel,

    I've attached two files below.  The first is udpsource.cpp, which I am running on the AM3874, and the second is udpsink.cpp, which I have running on a Linux PC.  To date most of the use of this application has been to send UDP packets that fit in a single 1500-byte Ethernet frame, so I invoke the two applications like so:

    $ ./udpsource xx.xx.xx.yy 10002 1200

    $ ./udpsink xx.xx.xx.zz  10001 10002 1200

    The first 2 arguments of udpsink do not matter but they must be present.

    3000.udpsource.cpp

    7824.udpsink.cpp
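
For readers without access to the attachments, here is a minimal C sketch of a sendto()-based sender of the kind described above. It is not the attached udpsource.cpp: the function names `udp_make_dest` and `udp_blast`, the fill byte, and the 1472-byte cap (the largest IPv4 UDP payload that fits in one standard 1500-byte Ethernet frame) are assumptions made for illustration.

```c
#include <arpa/inet.h>
#include <netinet/in.h>
#include <stddef.h>
#include <string.h>
#include <sys/socket.h>
#include <sys/types.h>

/* Hypothetical helper: fill in an IPv4 destination address.
 * Returns 0 on success, -1 if `ip` is not a dotted-quad address. */
int udp_make_dest(const char *ip, unsigned short port, struct sockaddr_in *dst)
{
    memset(dst, 0, sizeof(*dst));
    dst->sin_family = AF_INET;
    dst->sin_port = htons(port);
    return inet_pton(AF_INET, ip, &dst->sin_addr) == 1 ? 0 : -1;
}

/* Hypothetical helper: send `count` datagrams of `payload_len` bytes each.
 * Returns the number of datagrams sent before the first failure,
 * or -1 if the payload would not fit in a single Ethernet frame. */
long udp_blast(int fd, const struct sockaddr_in *dst,
               size_t payload_len, long count)
{
    char buf[1472]; /* 1500-byte MTU minus 20 (IPv4) minus 8 (UDP) */
    long sent = 0;

    if (payload_len > sizeof(buf))
        return -1;
    memset(buf, 0xA5, payload_len); /* arbitrary fill pattern */

    while (sent < count) {
        ssize_t n = sendto(fd, buf, payload_len, 0,
                           (const struct sockaddr *)dst, sizeof(*dst));
        if (n != (ssize_t)payload_len)
            break; /* stop at the first short or failed send */
        sent++;
    }
    return sent;
}
```

With a SOCK_DGRAM socket, calling `udp_blast(fd, &dst, 1200, n)` for a large `n` roughly corresponds to the `./udpsource xx.xx.xx.yy 10002 1200` invocation shown earlier.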

    Additionally, following the discovery in my third post that the ICMP traffic made a significant difference, I made a tcpsource/tcpsink version of the above application to see if the TCP ACK traffic would have the same effect as the ICMP traffic from the ping.  Unfortunately the TCP version ends the same way as the UDP version, with a transmit timeout.

    In my fifth posting I noted two things:

    1) The MCAST_FWD_STATE status indicator in the multicast address table entries shows a value of "0", while my EVM shows a value of "3".  The manual states that both of these values mean "forwarding".  Can you please explain to me the difference between MCAST_FWD_STATE==0 and MCAST_FWD_STATE==3?

    2) When the transmit error occurs I see a non-zero value for TXCARRIERSENSEERRORS (CPSW reg 0x460).  I am very surprised to see a non zero value because the carrier sense input from the PHY is tied low.  What would cause this register to count up?

    Best Regards,

    -Ryan

  • Hi Ryan,

    In one of your previous posts, you stated "One other thing of note is that my board is using the second slave port of the CPSW i.e. EMAC[1] with GMII, and the EVM is using RGMII." 

    I guess your board has GMII ports (GMII0 and GMII1) and GMII PHY modules? On the TI DM8148 EVM from Mistral, the Ethernet ports are RGMII0 and RGMII1, with RGMII PHYs (which support only RGMII/SGMII). My concern is that if you are reusing the same ports/PHYs as the TI EVM, this may be causing the problems.

    If you use the first slave port of the CPSW with GMII, EMAC[0], do you face the same problem?

    As everything works fine on the TI EVM board, maybe your custom board needs a double check of the hardware (PCB layout, connections, etc.).

    BR

    Pavel

  • Hi Pavel,

    The board we have built is indeed using the GMII[1] connections and the PHY in use is also a GMII phy.

    If I correctly understand your comments about the comparison to the EVM my answer is that we believe the hardware to be working based on the following:

    1) ICMP (ping) traffic works, i.e. the link integrity is high because the packet loss with ICMP is very low.  If there was an incorrect implementation of the hardware such that the PHY and 3874 were not both connected as GMII this would never work and we would see 100% packet loss.

    2) The UDP data does transmit from the 3874 to the PC for some length of time.  Again if we were experiencing a GMII versus RGMII mis-connected problem there should never be a packet received by the PC.

    Concerning the GMII[0] connections, it would be a great idea to try that interface.  Unfortunately, the PCB does not have traces connected to the GMII[0].

    On another note, we have performed a few more experiments.

    1) We implemented a loop-back after our 1000Base-X PHY to see if somehow the ethernet switch could be preventing us from transmitting.  This experiment was performed primarily because of the confusing error indication of a non-zero value in the TXCARRIERSENSEERRORS (CPSW reg 0x460) register.  This experiment produced no change in results, meaning the "NETDEV WATCHDOG: eth0 (cpsw): transmit queue 0 timed out" error message still appears and no further packets appear to transmit.

    2) Because our PHY is implemented in an FPGA we removed the ability of the RX GMII signals to toggle by driving them low, thinking this would remove any feedback to the AM3874 that might be causing the transmit to stop.  This experiment also produced no change in results.

    Best Regards,

    -Ryan

  • Hi Ryan,

    I am not an EMAC expert, so I have passed your questions (the two questions from your 17 July post) to the experts and am still waiting for feedback.

    As this bug (the tx queue 0 timeout) appears only on the custom board and not on the TI EVM that I have, I cannot be of much help to you.

    What I can propose is to:

    1. Compare the EMAC register settings on the TI EVM and the custom board and check that they match.

    2. Enlarge the transmit buffer (queue).

    3. Increase the EMAC clock (make it faster).

    4. Increase the NETDEV watchdog timeout (watchdog_timeo).

    BR

    Pavel

  • These are the answers from the EMAC experts:

    1. The MCAST_FWD_STATE status indicator in the multicast address table entries shows a value of "0" on the custom board, while the TI EVM shows a value of "3". The manual states that both of these values mean "forwarding". Can you please explain to me the difference between MCAST_FWD_STATE==0 and MCAST_FWD_STATE==3?

    Answer: They are the same.

    2. When the transmit error occurs he sees a non-zero value for TXCARRIERSENSEERRORS (CPSW reg 0x460). He is very surprised to see a non-zero value because the carrier sense input from the PHY is tied low. What would cause this register to count up?

    Answer:  If CRS is tied low then carrier sense errors will happen in half-duplex mode because carrier sense is not going high.

    BR

    Pavel

  • Pavel Botev said:

    Answer:  If CRS is tied low then carrier sense errors will happen in half-duplex mode because carrier sense is not going high.

    That answer is exactly what I needed. The fact that the carrier sense error counter is being incremented tells me that somehow we're getting into half-duplex.  I have modified the cpsw_adjust_link() method in drivers/net/cpsw.c so that the write to SL2_MACCONTROL never sets anything other than full-duplex.  Overnight testing looks very promising!

    Do you know what triggers cpsw_adjust_link() to execute?

    Thanks,

    -Ryan

  • Hi Pavel,

    I wanted to post the hack I made to the cpsw driver to avoid the half-duplex configuration of the MAC sliver.  As an aside, we have yet to see an MDIO response from the PHY that would explain why this piece of the driver switches into half-duplex mode.  Below is the change in cpsw.c.

    Best Regards,

    -Ryan

    static void _cpsw_adjust_link(struct cpsw_slave *slave,
                                  struct cpsw_priv *priv, bool *link)
    {
        struct phy_device *phy = slave->phy;
        u32 mac_control = 0;
        u32 slave_port;

        if (!phy)
            return;

        slave_port = cpsw_get_slave_port(priv, slave->slave_num);

        { /* RKM, force to always select:
           * GIG, FULLDUPLEX, GMII_EN, GIG_FORCE */
            if (phy->link) {
                /* enable forwarding */
                cpsw_ale_control_set(priv->ale, slave_port,
                                     ALE_PORT_STATE,
                                     priv->port_state[slave_port]);
                mac_control = (priv->data.mac_control &
                               ~(BIT(0) | BIT(7) | BIT(18)));

                mac_control |= (BIT(7) |   /* GIGABITEN */
                                BIT(0) |   /* FULLDUPLEXEN */
                                BIT(5) |   /* GMII_EN */
                                BIT(17));  /* GIG_FORCE */
                *link = true;

                if (mac_control != slave->mac_control) {
                    phy_print_status(phy);
                    __raw_writel(mac_control, &slave->sliver->mac_control);
                }

                slave->mac_control = mac_control;
            }
            else
                slave->mac_control = BIT(0); /* FULLDUPLEXEN */
        }

    #if 0
        if (phy->link) {
            /* enable forwarding */
            cpsw_ale_control_set(priv->ale, slave_port,
                                 ALE_PORT_STATE,
                                 priv->port_state[slave_port]);

            mac_control = (priv->data.mac_control &
                           ~(BIT(0) | BIT(7) | BIT(18)));

            if (phy->speed == 10)
                mac_control |= BIT(18);
            else if (phy->speed == 1000)
                mac_control |= BIT(7); /* GIGABITEN */
            if (phy->duplex)
                mac_control |= BIT(0); /* FULLDUPLEXEN */

            *link = true;
        } else {
            cpsw_ale_control_set(priv->ale, slave_port,
                                 ALE_PORT_STATE, ALE_PORT_STATE_DISABLE);
            mac_control = 0;
        }

        if (mac_control != slave->mac_control) {
            phy_print_status(phy);
            __raw_writel(mac_control, &slave->sliver->mac_control);
        }

        slave->mac_control = mac_control;
    #endif
    }
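
To make the effect of the change easier to see, the two code paths can be modeled in user space. This is only a sketch: the bit positions come from the code above, but the macro names and the helpers `mac_control_from_phy`, `mac_control_forced`, and `is_half_duplex` are hypothetical, and `MAC_10MBPS_BIT` is a placeholder name for bit 18, which the original code sets at 10 Mb/s without naming it.

```c
#include <stdint.h>

#define BIT(n) (UINT32_C(1) << (n))

/* Bit positions taken from the posted driver code; names are illustrative. */
#define MAC_FULLDUPLEX  BIT(0)
#define MAC_GMII_EN     BIT(5)
#define MAC_GIG         BIT(7)
#define MAC_GIG_FORCE   BIT(17)
#define MAC_10MBPS_BIT  BIT(18) /* placeholder name, set when speed == 10 */

/* Model of the original (#if 0) path: the value follows the PHY state. */
uint32_t mac_control_from_phy(uint32_t base, int speed, int duplex)
{
    uint32_t v = base & ~(MAC_FULLDUPLEX | MAC_GIG | MAC_10MBPS_BIT);

    if (speed == 10)
        v |= MAC_10MBPS_BIT;
    else if (speed == 1000)
        v |= MAC_GIG;
    if (duplex)
        v |= MAC_FULLDUPLEX;
    return v;
}

/* Model of the forced path: always gigabit, full duplex, GMII. */
uint32_t mac_control_forced(uint32_t base)
{
    uint32_t v = base & ~(MAC_FULLDUPLEX | MAC_GIG | MAC_10MBPS_BIT);

    return v | MAC_GIG | MAC_FULLDUPLEX | MAC_GMII_EN | MAC_GIG_FORCE;
}

/* Half duplex is simply the absence of the full-duplex bit. */
int is_half_duplex(uint32_t mac_control)
{
    return (mac_control & MAC_FULLDUPLEX) == 0;
}
```

In this model, a PHY state reported as 1000 Mb/s with duplex 0 yields a value without the full-duplex bit, which is the condition under which a CRS input tied low produces carrier sense errors. The forced path can never produce that value.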

  • Hi Ryan,

    I am also facing a similar kind of issue. Could you please explain the solution briefly?

    Regards,

    Tanaji N

  • Tanaji,

    I was seeing the _cpsw_adjust_link() method put the MAC sliver into half-duplex mode when the outbound traffic was much higher than the inbound traffic.  So what I did was to not allow the MAC sliver to be configured to anything other than full-duplex mode at gigabit speed.  This would probably not work well if you had to support auto-negotiation and all three speeds (10/100/1000).  Without this hack, for some reason the state that _cpsw_adjust_link() evaluates reports half-duplex, so the MAC sliver is configured that way, at which point the MAC sees a carrier sense error and will no longer send data.

    Ciao,

    -Ryan

  • Hi Ryan,

    We are facing a similar issue (NETDEV WATCHDOG: eth0 (cpsw): transmit queue 0 timed out) with external bd_ram.
    We are working with the DM385, which supports auto-negotiation and all three speeds (10/100/1000).
    We are running bidirectional iperf traffic from the host port to the LAN port and vice versa,
    with UDP at 1000 Mbps and a 64 B packet size.

    (Note: With external memory, this works fine up to 1000M with a 256 B packet size. If we reduce the packet size to 64 B,
    we hit the timeout issue from 100M of network traffic.)

    Client: iperf -c 10.110.1.33 -u -l 64B -b 1000M -t 5000 -d &
    Server: iperf -s -u &

    With the internal memory for bd_ram (8K), we never observed this error, but throughput was reduced.

    To increase throughput, we moved to external bd_ram (32K).
    Do you have any idea?

    Regards,
    Tanaji N

  • Hi.
    Did you manage to resolve the issue?
    I am facing a similar problem... The solution here doesn't seem to help...
  • Hey Ryan, how did the CRS errors cause a kernel oops for you?
    I am facing a very similar issue, but in my case I can see the rxalignerrors counter incrementing, and I can see the oops (tx transmit...).
    How did you solve the issue?
  • In my case the PHY status was incorrectly decoded, so the MAC thought the PHY was going into half duplex.  This in turn caused the outbound queue to fill up, and eventually the NETDEV WATCHDOG would fire.

    Because the PHY status was decoded incorrectly, I hacked up _cpsw_adjust_link() to force the mac_control like so:

    mac_control |= (BIT(7) |   /* GIGABITEN */
                    BIT(0) |   /* FULLDUPLEXEN */
                    BIT(5) |   /* GMII_EN */
                    BIT(17));  /* GIG_FORCE */

    I was able to do this because I only support a gigabit rate.

    Best,

    -Ryan