AM3359: CPSW issues

Part Number: AM3359
Other Parts Discussed in Thread: TLK110

Hi,

I'm working on the ICEv2.1 eval board.  I'm working with fairly high Ethernet traffic (25 Mb/s on one port, 5 Mb/s on the other) with the CPSW in dual MAC mode, but with the ALE configured to automatically forward multicast frames from port 2 to port 1.

I'm getting packet loss which I *think* is due to port FIFO overruns, but I'm looking at the statistics and seeing some very strange stuff.

1. I see collisions (single collisions, late collisions, deferred tx frames) which should be impossible because both ports are connected on full duplex links (one port is connected to a switch, the other to a PC, both in forced full duplex, both reporting full duplex, and both PHYs on the ICEv2 reporting full duplex).

2. On port 0, with only port 0 statistics enabled, I see many Rx SoF overruns and DMA overruns, but every other statistic is zero.  How can this be?  Many frames are received successfully, so I'd expect to see some data in Net octets.  I've attached a screenshot of this.  Even though the statistics say no transmissions were made, I was successfully receiving 500 frames per second and I could see this with Wireshark running on a PC.

3. Even though I see Rx SoF and DMA overruns, there are always at least 3 unused buffer descriptors in the DMA queue.

Can I trust the statistics?

  • It's my own software based on the LWIP StarterWare example running bare metal. I tried to keep these questions software-independent: e.g. regardless of the software, I can see frames being transmitted to a PC, yet the Tx counter is zero.
  • Mat,

    Which CCS version are you using? Did you try to read from CPSW_STATS memory @0x4A10_0900 instead of the CCS register window?

    Regards,
    Garrett
  • Hi Garrett, I'm using CCS 7.2.0.00013.

    I get the same values if I read the memory locations directly from code - see attached where I've made a structure pointer with the stats register addresses:
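
The attachment is not reproduced in this copy of the thread. As a rough illustration of what such a structure overlay might look like, here is a minimal sketch. The base address is the one Garrett quotes above; the field names and offsets are an assumption based on the usual CPSW statistics layout and should be checked against the TRM register table before use. The helper `cpsw_stats_show_collisions` is hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* CPSW_STATS base address as quoted in the thread. */
#define CPSW_STATS_BASE 0x4A100900u

/* ASSUMED register layout: verify every offset against the TRM. */
typedef struct {
    volatile uint32_t rx_good_frames;          /* 0x00 */
    volatile uint32_t rx_broadcast_frames;     /* 0x04 */
    volatile uint32_t rx_multicast_frames;     /* 0x08 */
    volatile uint32_t rx_pause_frames;         /* 0x0C */
    volatile uint32_t rx_crc_errors;           /* 0x10 */
    volatile uint32_t rx_align_code_errors;    /* 0x14 */
    volatile uint32_t rx_oversized_frames;     /* 0x18 */
    volatile uint32_t rx_jabber_frames;        /* 0x1C */
    volatile uint32_t rx_undersized_frames;    /* 0x20 */
    volatile uint32_t rx_fragments;            /* 0x24 */
    volatile uint32_t reserved[2];             /* 0x28..0x2C */
    volatile uint32_t rx_octets;               /* 0x30 */
    volatile uint32_t tx_good_frames;          /* 0x34 */
    volatile uint32_t tx_broadcast_frames;     /* 0x38 */
    volatile uint32_t tx_multicast_frames;     /* 0x3C */
    volatile uint32_t tx_pause_frames;         /* 0x40 */
    volatile uint32_t tx_deferred_frames;      /* 0x44 */
    volatile uint32_t tx_collision_frames;     /* 0x48 */
    volatile uint32_t tx_single_coll_frames;   /* 0x4C */
    volatile uint32_t tx_mult_coll_frames;     /* 0x50 */
    volatile uint32_t tx_excessive_collisions; /* 0x54 */
    volatile uint32_t tx_late_collisions;      /* 0x58 */
    volatile uint32_t tx_underrun;             /* 0x5C */
    volatile uint32_t tx_carrier_sense_errors; /* 0x60 */
    volatile uint32_t tx_octets;               /* 0x64 */
    volatile uint32_t octet_frames[6];         /* 0x68..0x7C, frame size buckets */
    volatile uint32_t net_octets;              /* 0x80 */
    volatile uint32_t rx_sof_overruns;         /* 0x84 */
    volatile uint32_t rx_mof_overruns;         /* 0x88 */
    volatile uint32_t rx_dma_overruns;         /* 0x8C */
} cpsw_stats_t;

/* Compile-time checks that the overlay matches the assumed offsets. */
static_assert(offsetof(cpsw_stats_t, net_octets) == 0x80, "net_octets offset");
static_assert(sizeof(cpsw_stats_t) == 0x90, "overlay size");

/* Only meaningful on the target, where the address maps to the CPSW. */
#define CPSW_STATS ((const volatile cpsw_stats_t *)(uintptr_t)CPSW_STATS_BASE)

/* Hypothetical helper: returns 1 if any collision-related Tx counter is nonzero. */
static int cpsw_stats_show_collisions(const volatile cpsw_stats_t *s)
{
    uint32_t any = s->tx_deferred_frames;
    any |= s->tx_collision_frames;
    any |= s->tx_single_coll_frames;
    any |= s->tx_mult_coll_frames;
    any |= s->tx_excessive_collisions;
    any |= s->tx_late_collisions;
    return any != 0u;
}
```

On the target this would be called as `cpsw_stats_show_collisions(CPSW_STATS)`. Taking a pointer argument also lets the helper be exercised on a host build against an ordinary struct in RAM.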

  • The main issue I have though is not the stats - I just wanted to use those to work out where I'm losing frames.

    In my application I am using dual MAC mode and I have set up the ALE to automatically forward multicast frames received on port 1 to ports 0 and 2.

    Every 25.6 milliseconds, I queue up about 11kbytes of data in a chain of buffer descriptors to be transmitted from port 0 to port 2 and write this queue to TX0_HDP. At the same time, I'm receiving 1064 byte multicast frames every 400 microseconds on port 1, which is forwarded to ports 0 and 2. I lose a small percentage of frames, which I suspect is because I'm overloading the port 2 Tx FIFO.
  • I think I've figured out the mystery. The TRM says "FIFO overruns (SOFOVERRUNS) are the only port 0 statistics that are enabled to be kept", and I worked out that the receive overruns in the stats above occur AFTER I've pressed the pause button in Code Composer Studio, i.e. while the processor transitions to debug mode, so the EMAC is still receiving frames but the code is not processing the receive buffer descriptors. So that isn't the problem.

    Please could you take a look at my previous description of my port configuration and data flow - would you expect the port 2 Tx FIFO to overrun here?

    Also item 1 in my original post still applies - I see collisions and deferred frames on full-duplex links, which should be impossible. Do you have any thoughts on this?
  • Mat,

    >>Every 25.6 milliseconds, I queue up about 11kbytes of data in a chain of buffer descriptors to be transmitted
    With this low data rate (~3.5 Mbps), the port 2 Tx FIFO really should not overrun. Also, with full duplex, collisions and deferred frames should not happen either. Why do you have to stick with StarterWare, which is no longer supported, rather than move to Processor SDK? Is it because of LWIP stack support?

    Regards,
    Garrett
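
The ~3.5 Mbps figure can be checked with a little arithmetic. The sketch below assumes "about 11kbytes" means 11000 bytes and counts payload bits only (no preamble or inter-frame gap); `avg_rate_bps` is a hypothetical helper name.

```c
#include <assert.h>
#include <stdint.h>

/* Average rate in bit/s for `bytes` sent once every `period_us` microseconds. */
static uint64_t avg_rate_bps(uint32_t bytes, uint32_t period_us)
{
    assert(period_us > 0u);
    return ((uint64_t)bytes * 8u * 1000000u) / period_us;
}

/* Figures taken from the posts above. */
static uint64_t host_tx_bps(void)      { return avg_rate_bps(11000u, 25600u); } /* 11 kB per 25.6 ms  */
static uint64_t multicast_rx_bps(void) { return avg_rate_bps(1064u, 400u); }    /* 1064 B per 400 us */
```

Under these assumptions the host transmit stream averages 3,437,500 bit/s (about 3.4 Mb/s) and the forwarded multicast stream 21,280,000 bit/s (about 21.3 Mb/s). If both leave through port 2, the combined average is roughly 24.7 Mb/s, which is in line with the 25 Mb/s mentioned in the first post and well below a 100 Mb/s link rate. An average says nothing about short bursts, though, since the 11 kB chain is queued all at once.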
  • Hi Garrett,

    It is difficult to use any third-party code in aerospace software, so we are not using Processor SDK or StarterWare. I just happened to use the LWIP code as the starting point for my own code. Now I'm controlling the buffer descriptors directly - I have a simple circular queue of 5 receive descriptors for reception, and a linear queue of 64 descriptors for transmission.

    Let me ask some specific questions:

    1. TRM section 14.3.2.10.1 describes "normal priority mode", where the Tx FIFO is split into separate priority queues. TRM section 14.3.2.10.2 describes "Dual Mac Mode", which does not mention the Tx FIFO. How is the Tx FIFO organised in Dual Mac Mode?

    2. TRM section 14.3.2.10.2 states "When operating in dual mac mode the intention is to transfer packets between ports 0 and 1 and ports 0 and 2, but not between ports 1 and 2.". I am using dual MAC mode, but I have also set up the ALE to forward multicast frames between ports 1 and 2. This works - but is it something that the CPSW is not designed to support, causing occasional packet loss?

    3. It's hard to figure out the flow control between the FIFOs. There is hardware flow control from port 1 and 2 FIFOs back to port 0 (TRM section 14.3.2.12.1 "CPPI transmit flow control is enabled by default on reset because host packets should not be dropped in any mode of operation."). If the ALE is set to route packets from port 1 to port 2, and port 2 Tx FIFO is full, what happens to the packets?

    4. Can you explain the mechanism that increments the collision and deferred frame counters within the CPSW? Is it based on the COL signal from the PHY? This should not be asserted on full duplex, so I could put a scope on there and see if the problem is in the processor or the PHY / network.

    Thanks!
  • Hi Mat,

    As you pointed out from the TRM, when the ALE is set up for dual MAC the intention is "not" to have packets exchanged between ports 1 and 2. Do you need traffic to flow between ports 1 and 2? In dual MAC mode the intention is for packets to go up to the ARM for routing out to the other port. Do you need the low latency of traffic between the two external ports?

    When the FIFOs are full for any port-to-port transfer, packets will be dropped on entry to the ALE, and on exit through port 0 if no Rx descriptors are available. The start-of-frame overruns and DMA overruns shown in the earlier post are indicative of this scenario: the internal Rx FIFOs are full. Until the CPDMA has an Rx descriptor, packets will continue to be dropped.

    In your setup, the queue of only 5 Rx descriptors is probably why packets are being dropped. How far these SOF and DMA overruns can be reduced depends heavily on the bit rate and the number of Rx descriptors available. On Linux the default is 128; this can be increased by allocating external memory for the additional Rx descriptors instead of using the CPPI buffer memory.

    Best Regards,
    Schuyler
  • Hi Schuyler,

    I altered my software to forward packets between port 1 and 2 via the ARM and increased the number of Rx descriptors to 16.  I want to keep the number as small as possible to simplify worst-case execution time analysis.  This should be more than enough - the Ethernet traffic consists of periodic frames every 400us and 25ms and I re-queue all the descriptors in every rx interrupt.

    With these changes, the problem appears to be fixed and I am not losing packets.

    However, I am still getting collisions reported in the stats for port 2 - see attached pic.  This should be impossible - the RJ45 on the ICE board is connected to a switch and I've verified that the switch reports full duplex links.  I've forced the PHYs to use full duplex.  I've monitored the COL pin on the TLK110 using an oscilloscope and it is not asserted.  So it seems that the stats in the CPSW are incorrectly reporting collisions.  Is this a known problem?  Ports 0 and 1 do not report any collisions, deferred frames or errors.

    Thanks,

    Mat
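
For sizing a small Rx descriptor ring against a periodic stream, a rough worst-case bound can be sketched as follows. It assumes strictly periodic arrivals, ignores the buffering inside the CPSW FIFOs and any other traffic, and uses a hypothetical helper name, so it is a starting point for analysis rather than a guarantee.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Worst-case number of frames from one strictly periodic stream that can
 * arrive during a window of `window_us` in which no descriptors are
 * re-queued (for example, the longest gap between Rx interrupt services).
 * A frame may arrive right at the start of the window, hence the +1.
 */
static uint32_t max_frames_in_window(uint32_t window_us, uint32_t period_us)
{
    assert(period_us > 0u);
    return window_us / period_us + 1u;
}
```

With the 400 us frame period from this thread, a ring of 5 descriptors is only enough while the service gap stays below 2 ms (`max_frames_in_window(1999, 400)` is 5), whereas 16 descriptors cover gaps of up to 6 ms (`max_frames_in_window(6000, 400)` is 16). The service-gap values here are illustrative, not measurements from the thread.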

  • Hi Mat,

    Based on previous debugging of the Linux driver, we found that setting the in-band mode bit (bit 18) in the MAC control register while the interface is in 10 Mbps mode could cause problems. The in-band bit was being set whenever the link was detected as running at 10 Mbps; this caused the collision count to increase and affected overall traffic. The in-band bit should only be set when the PHY is in RGMII mode and the speed is detected as 10 Mbps.

    Best Regards,
    Schuyler
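
The rule described in the last reply can be sketched as a small pure function. Only the bit position (18) comes from the thread; the enum, the macro name, and the function name are hypothetical, and all other bits of the MAC control value are deliberately left untouched.

```c
#include <stdint.h>

/* "In-band mode (bit 18)" of the MAC control register, per the reply above. */
#define MACCONTROL_INBAND (1u << 18)

/* Hypothetical description of how the PHY is attached to the MAC. */
typedef enum { PHY_IF_MII, PHY_IF_RMII, PHY_IF_RGMII } phy_if_t;

/*
 * Returns `mac_control` with the in-band bit set only when the PHY is in
 * RGMII mode and the link speed is 10 Mbps, and cleared in every other case.
 */
static uint32_t apply_inband_rule(uint32_t mac_control, phy_if_t phy_if,
                                  unsigned speed_mbps)
{
    if (phy_if == PHY_IF_RGMII && speed_mbps == 10u) {
        return mac_control | MACCONTROL_INBAND;
    }
    return mac_control & ~MACCONTROL_INBAND;
}
```

For bare-metal code, one way to use this would be to read the current MAC control value on each link change, pass it through `apply_inband_rule`, and write the result back. For an MII-attached PHY the bit is then cleared whatever the negotiated speed.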