CPSW misses 25% pings? No HW issue!

Hi all

I am facing a (weird?) problem:
I have a prototype board (similar to the BeagleBone) using an XAM3359 with an SMSC PHY. From an external company I got a Linux system on which Ethernet works properly, so the hardware is OK!

Now I tried to use the current kernel/drivers delivered with the latest SDK (05.07.00) / PSP 04.06.00.10, made the same changes to the pin muxing and the CPSW init that were made in the properly running Linux, and at first everything seemed to be OK.

But when pinging the board, about 25% of all pings are missed (response time too high); the others always show a response time of <1ms. Internet browsing also works, but is slow due to the packet loss. So the system is working, but only to, let's say, 75%.

Facts:
CPSW is configured as dual EMAC, using RMII, the SMSC-PHY (LAN8720A) is supported by the kernel. HW (PCB-tracing) is OK. ifconfig shows eth0 is properly UP. ethtool shows no irregularities. cat /proc/interrupts shows interrupts 40, 43, 93, 94 for "cpsw", only 93 and 94 counting up, which seems to be OK.

The only difference between the working and the partially non-working Linux is that ifconfig eth0 shows "interrupt: 40" on the working version and no interrupt on the partially non-working version. On both versions I see some dropped RX packets in ifconfig.

Any idea how to find the problem why one out of four pings is completely missed?

  • Some more data:

    ifconfig eth0:

    eth0      Link encap:Ethernet  HWaddr 00:18:32:68:11:A0  
              inet addr:192.168.0.60  Bcast:0.0.0.0  Mask:255.255.255.0
              UP BROADCAST RUNNING ALLMULTI MULTICAST  MTU:1500  Metric:1
              RX packets:210 errors:0 dropped:78 overruns:0 frame:0
              TX packets:3 errors:0 dropped:0 overruns:0 carrier:0
              collisions:0 txqueuelen:1000
              RX bytes:28951 (28.2 KiB)  TX bytes:1770 (1.7 KiB)

    ethtool eth0:

    Settings for eth0:
        Supported ports: [ TP MII ]
        Supported link modes:   10baseT/Half 10baseT/Full
                                100baseT/Half 100baseT/Full
        Supports auto-negotiation: Yes
        Advertised link modes:  10baseT/Half 10baseT/Full
                                100baseT/Half 100baseT/Full
        Advertised pause frame use: Symmetric Receive-only
        Advertised auto-negotiation: Yes
        Speed: 100Mb/s
        Duplex: Full
        Port: MII
        PHYAD: 0
        Transceiver: external
        Auto-negotiation: on
        Current message level: 0x00000000 (0)
                       
        Link detected: yes

    So, everything looks OK - but it isn't!!

  • Which refclk do you use?

  • The PHY is configured to be the source for REFCLK, so the PHY delivers the 50MHz-signal to the MAC (the cpsw)!

    I am using the standard init routine for the cpsw (with RMII) as it comes with the SDK (as used for the BeagleBone BEFORE A3: am33xx_cpsw_init(AM33XX_CPSW_MODE_RMII, NULL, NULL);)!

  • <so the PHY delivers the 50MHz-signal>

    Well, .. I venture to say - never!

    Either pll refclk (internal) or external 50MHz oscillator. The internal refclk does not work for RMII.


    This is what I wanted to suggest.

  • @sviss

    Don't know what you mean - of course the PHY is the SOURCE of the 50MHz signal. So for the MAC (cpsw) the source of the RMII-clock comes from a chip-pin (in this case the RMII_REFCLK/GPIO0_29, pin #H18). This can be done with the AM335x!

    Anyway, I have one advantage in searching for this problem - I have a kernel that is working properly on the same board. So, in the meantime I wrote a small executable which writes all cpsw and control-module registers to a file. I ran it on the working and on the non-working Linux, and that way I found a bad register setting:

    In the control-module "gmii_sel" register the rmii1_io_clk_en bit is set to "0", so the MAC does not "know" that the PHY delivers the 50MHz signal (as mentioned above) - so they are out of sync. Somewhat funny: this means that SVISS was right in his opinion that there is something wrong with the RMII clock!

    Next week I will try to find where this register is initialized and fix this. If it works then I will close this thread.
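For readers who want to repeat the register comparison described above, here is a minimal sketch in C. It assumes both dumps were already read into arrays covering the same offsets in the same order. The `reg_sample` struct and the `diff_reg_dumps` function are hypothetical names made up for this illustration; they are not the poster's tool and not part of any SDK.

```c
#include <assert.h>
#include <inttypes.h>
#include <stddef.h>
#include <stdint.h>
#include <stdio.h>

/* One dumped register: its offset inside the module and the value read.
 * (Hypothetical layout, chosen only for this sketch.) */
struct reg_sample {
    uint32_t offset;
    uint32_t value;
};

/* Compare two dumps taken at the same offsets in the same order.
 * Prints every mismatch, including the XOR of both values so the
 * differing bits are easy to spot, and returns the mismatch count. */
static size_t diff_reg_dumps(const struct reg_sample *good,
                             const struct reg_sample *bad,
                             size_t count)
{
    size_t mismatches = 0;

    assert(good != NULL && bad != NULL);

    for (size_t i = 0; i < count; i++) {
        /* Both dumps must describe the same register at index i. */
        assert(good[i].offset == bad[i].offset);

        if (good[i].value != bad[i].value) {
            printf("offset 0x%03" PRIx32 ": good=0x%08" PRIx32
                   " bad=0x%08" PRIx32 " (xor 0x%08" PRIx32 ")\n",
                   good[i].offset, good[i].value, bad[i].value,
                   good[i].value ^ bad[i].value);
            mismatches++;
        }
    }
    return mismatches;
}
```

Printing the XOR is a small convenience: a single differing bit, such as a clock-enable flag, shows up as a one-bit mask.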

  • No fun. Pure experience and doc reading.

    1) There is a TI note on a Sitara problem when using the internal refclk for RMII.

    2) There is my own experience of 25% Ethernet frame loss when using the PLL refclk.

    Could you please give us a reference confirming your statement: "of course the PHY is the SOURCE of the 50MHz signal"?

    Problem solved: after setting bit 6 (and possibly bit 7) in the control module's "gmii_sel" register (the datasheet says "Enable RMII clock to be sourced from chip pin"), it works.

    @sviss

    still not sure what you mean - as stated above we are using a HW-configuration where the RMII-clk is NOT generated in the AM335x but comes from the external PHY. In our case this is the SMSC LAN8720A, which is able to generate the 50MHz clock and deliver it via its nINT0/REFCLKO-pin to the MAC/CPSW - this is done by pulling the nINTSEL/LED2-pin of the LAN8720A to GND. Therefore the CPSW (in our case) does not use an AM335x-internal clock but uses an external 50MHz-clock-signal.
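The fix described in this post comes down to a read-modify-write of one register. A minimal sketch in C follows, assuming bit 6 belongs to RMII1 and bit 7 to RMII2 as the post implies. The macro and function names are invented for this illustration, and how the register gets mapped (kernel init code, or a userspace mapping) is deliberately left out.

```c
#include <assert.h>
#include <stdint.h>

/* Bit positions as given in the post: bit 6 (and possibly bit 7).
 * Assumption: bit 6 is for RMII1 and bit 7 for RMII2.
 * The macro names are hypothetical. */
#define GMII_SEL_RMII1_IO_CLK_EN (1u << 6)
#define GMII_SEL_RMII2_IO_CLK_EN (1u << 7)

/* Return gmii_sel with the RMII reference clock of the selected port(s)
 * sourced from the chip pin. All other bits are left untouched. */
static uint32_t gmii_sel_use_external_refclk(uint32_t gmii_sel,
                                             int port1, int port2)
{
    if (port1)
        gmii_sel |= GMII_SEL_RMII1_IO_CLK_EN;
    if (port2)
        gmii_sel |= GMII_SEL_RMII2_IO_CLK_EN;
    return gmii_sel;
}

/* Read-modify-write on a memory-mapped register. `reg` is assumed to
 * point at an already mapped gmii_sel register. */
static void gmii_sel_apply(volatile uint32_t *reg, int port1, int port2)
{
    assert(reg != 0);
    *reg = gmii_sel_use_external_refclk(*reg, port1, port2);
}
```

For example, a register value of 0x05 becomes 0x45 with only port 1 selected, and 0xC5 with both ports selected; the existing mode-select bits stay as they were.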

    I mean that datasheets should be read carefully. If you refer to the LAN8720A datasheet (I asked you for a reference), it says:

    "The REF_CLK Out Mode is not part of the RMII Specification. Timing in this mode is not compliant with the RMII specification. To ensure proper system operation, a timing analysis of the MAC and LAN8720 must be performed."

    An external refclk is recommended. Feel free to use the one generated by the LAN8720A, at your own risk.

  • @sviss

    Sorry, I did not know what you asked for, as you were always mentioning only the internal refclk of the AM335x.

    Anyway, the HW design was made by an external company; I hope THEY verified the timing of the RMII interface. Of course it is not part of the RMII specification that the PHY delivers the REFCLK, but if the PHY AND the MAC support this, why should it not work? It makes no difference where the REFCLK comes from (again, this is only true for this special configuration)!