Linux/AM5728: Ethernet PHY randomly disappears

Part Number: AM5728

Tool/software: Linux

My custom AM5728 board cannot connect to the network on about 2 out of 3 boots because the Ethernet PHY (BCM54610) disappears!

This is when it works:

U-Boot 2018.01-g62414e2c81 (Mar 25 2019 - 18:36:04 +0000)                                                                                                                                                                                                                          
                                                                                                                                                                                                                                                                                   
CPU  : DRA752-GP ES2.0                                                                                                                                                                                                                                                             
Model: TI AM5728 IDK                                                                                                                                                                                                                                                               
Board: Solix19 Dev                                                                                                                                                                                                                                                                 
DRAM:  2 GiB                                                                                                                                                                                                                                                                       
MMC:   OMAP SD/MMC: 0, OMAP SD/MMC: 1                                                                                                                                                                                                                                              
*** Warning - bad CRC, using default environment                                                                                                                                                                                                                                   
                                                                                                                                                                                                                                                                                   
SCSI:  SATA link 0 timeout.                                                                                                                                                                                                                                                        
AHCI 0001.0300 32 slots 1 ports 3 Gbps 0x1 impl SATA mode                                                                                                                                                                                                                          
flags: 64bit ncq stag pm led clo only pmp pio slum part ccc apst                                                       
scanning bus for devices...                                                                                             
Found 0 device(s).                                                  
Net:                                                                                              
Warning: ethernet@48484000 using MAC address from ROM                      
eth0: ethernet@48484000                                                                      
Hit any key to stop autoboot:  0 

root@solix:~# dmesg | grep -i eth                                                                                                                                                                                                                                       
[    2.151339] cpsw 48484000.ethernet: No slave[1] phy_id, phy-handle, or fixed-link property
[    2.159689] cpsw 48484000.ethernet: Detected MACID = 34:03:de:ca:d0:7a
[    2.166323] cpsw 48484000.ethernet: initialized cpsw ale version 1.4
[    2.172743] cpsw 48484000.ethernet: ALE Table size 1024
[    2.178036] cpsw 48484000.ethernet: cpts: overflow check period 500 (jiffies)
[    3.284323] systemd[1]: /lib/systemd/system/eth0-up.service:13: Unknown lvalue 'After' in section 'Install'
[    7.103159] net eth0: initializing cpsw version 1.15 (0)
[    7.156786] net eth0: phy "" not found on slave 1, err -19
[    7.168656] IPv6: ADDRCONF(NETDEV_UP): eth0: link is not ready
[    8.240135] cpsw 48484000.ethernet eth0: Link is Up - 1Gbps/Full - flow control rx/tx
[    8.252017] IPv6: ADDRCONF(NETDEV_CHANGE): eth0: link becomes ready
[    9.280086] cpsw 48484000.ethernet eth0: Link is Down
[   11.360152] cpsw 48484000.ethernet eth0: Link is Up - 100Mbps/Full - flow control rx/tx
root@solix:~# dmesg | grep -i phy                                                                                                                                                                                                                                       
[    0.000000] Booting Linux on physical CPU 0x0
[    0.000000] arch_timer: cp15 timer(s) running at 6.14MHz (phys).
[    2.056926] libphy: Fixed MDIO Bus: probed
[    2.130112] davinci_mdio 48485000.mdio: detected phy mask fffffffe
[    2.138647] libphy: 48485000.mdio: probed
[    2.142711] davinci_mdio 48485000.mdio: phy[0]: device 48485000.mdio:00, driver unknown
[    2.151339] cpsw 48484000.ethernet: No slave[1] phy_id, phy-handle, or fixed-link property
[    7.156655] Generic PHY 48485000.mdio:00: attached PHY driver [Generic PHY] (mii_bus:phy_addr=48485000.mdio:00, irq=POLL)
[    7.156781] libphy: PHY  not found
[    7.156786] net eth0: phy "" not found on slave 1, err -19

root@solix:~# ethtool eth0
Settings for eth0:
        Supported ports: [ TP AUI BNC MII FIBRE ]
        Supported link modes:   10baseT/Half 10baseT/Full 
                                100baseT/Half 100baseT/Full 
                                1000baseT/Half 1000baseT/Full 
        Supported pause frame use: Symmetric Receive-only
        Supports auto-negotiation: Yes
        Supported FEC modes: Not reported
        Advertised link modes:  100baseT/Full 
        Advertised pause frame use: No
        Advertised auto-negotiation: Yes
        Advertised FEC modes: Not reported
        Link partner advertised link modes:  10baseT/Half 10baseT/Full 
                                             100baseT/Half 100baseT/Full 
        Link partner advertised pause frame use: Symmetric
        Link partner advertised auto-negotiation: Yes
        Link partner advertised FEC modes: Not reported
        Speed: 100Mb/s
        Duplex: Full
        Port: MII
        PHYAD: 0
        Transceiver: internal
        Auto-negotiation: on
        Supports Wake-on: d
        Wake-on: d
        Current message level: 0x00000000 (0)
                               
        Link detected: yes



And this is when it doesn't:

U-Boot 2018.01-g62414e2c81 (Mar 25 2019 - 18:36:04 +0000)                    
                                                                                                                                                                                
CPU  : DRA752-GP ES2.0                                                             
Model: TI AM5728 IDK                                                                                                                          
Board: Solix19 Dev                                                                                                                            
DRAM:  2 GiB                                                                                                                                  
MMC:   OMAP SD/MMC: 0, OMAP SD/MMC: 1                                                                                                         
*** Warning - bad CRC, using default environment                                       
                                                                                                  
SCSI:  SATA link 0 timeout.                                                  
AHCI 0001.0300 32 slots 1 ports 3 Gbps 0x1 impl SATA mode                                              
flags: 64bit ncq stag pm led clo only pmp pio slum part ccc apst                                      
scanning bus for devices...                                                            
Found 0 device(s).                                                                                                      
Net:   Could not get PHY for ethernet@48484000: addr 0    

root@solix:~# dmesg | grep -i eth
[    2.163116] cpsw 48484000.ethernet: No slave[1] phy_id, phy-handle, or fixed-link property
[    2.171465] cpsw 48484000.ethernet: Detected MACID = 34:03:de:ca:d0:7a
[    2.178078] cpsw 48484000.ethernet: initialized cpsw ale version 1.4
[    2.184482] cpsw 48484000.ethernet: ALE Table size 1024
[    2.189775] cpsw 48484000.ethernet: cpts: overflow check period 500 (jiffies)
[    3.297448] systemd[1]: /lib/systemd/system/eth0-up.service:13: Unknown lvalue 'After' in section 'Install'
[    8.011871] net eth0: initializing cpsw version 1.15 (0)
[    8.161286] net eth0: phy "48485000.mdio:00" not found on slave 0, err -19
[    8.161305] net eth0: phy "" not found on slave 1, err -19
[    8.176401] IPv6: ADDRCONF(NETDEV_UP): eth0: link is not ready
[   53.657624] Bluetooth: BNEP (Ethernet Emulation) ver 1.3

root@solix:~# dmesg | grep -i phy
[    0.000000] Booting Linux on physical CPU 0x0
[    0.000000] arch_timer: cp15 timer(s) running at 6.14MHz (phys).
[    2.075205] libphy: Fixed MDIO Bus: probed
[    2.148541] davinci_mdio 48485000.mdio: no live phy, scanning all
[    2.163116] cpsw 48484000.ethernet: No slave[1] phy_id, phy-handle, or fixed-link property
[    8.035611] libphy: PHY 48485000.mdio:00 not found
[    8.161286] net eth0: phy "48485000.mdio:00" not found on slave 0, err -19
[    8.161301] libphy: PHY  not found
[    8.161305] net eth0: phy "" not found on slave 1, err -19

root@solix:~# ethtool eth0
Settings for eth0:
        Supports Wake-on: d
        Wake-on: d
        Current message level: 0x00000000 (0)
                               
        Link detected: no



My kernel is 4.14 from SDK 5.02. I have a batch of identical boards, but only 2 of them show this issue. This could indicate a PCB component issue, but I checked the boards and nothing really stands out.

  • Hi,

    Thanks for posting the log file of the two use cases.

    In the working case the link speed switches from 1Gbps to 100Mbps, which could indicate another issue. I recommend looking at the output of ethtool -S eth0 and checking whether Rx CRC Errors or any other hardware error counters are non-zero. Non-zero counts might point to a subtle board layout issue.
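A minimal sketch of that check, assuming the `ethtool -S eth0` output has been saved as text. The counter names are copied from the cpsw statistics posted in this thread; which of them to treat as "error" counters is a choice made for illustration, and the helper names are hypothetical.

```python
# Counter names as printed by the cpsw driver in this thread.
# Treating exactly these as error counters is an assumption.
ERROR_COUNTERS = (
    "Rx CRC Errors",
    "Rx Align/Code Errors",
    "Oversize Rx Frames",
    "Rx Jabbers",
    "Undersize (Short) Rx Frames",
    "Rx Fragments",
    "Excessive Collisions",
    "Late Collisions",
    "Tx Underrun",
    "Carrier Sense Errors",
    "Rx Start of Frame Overruns",
    "Rx Middle of Frame Overruns",
    "Rx DMA Overruns",
)


def parse_ethtool_stats(text):
    """Parse `ethtool -S` output into a {counter name: value} dict."""
    stats = {}
    for line in text.splitlines():
        # Split on the LAST colon, because names such as
        # "Rx DMA chan 0: head_enqueue" contain a colon themselves.
        name, sep, value = line.strip().rpartition(":")
        value = value.strip()
        if sep and value.isdigit():
            stats[name.strip()] = int(value)
    return stats


def nonzero_errors(stats):
    """Return only the error counters whose value is non-zero."""
    return {k: stats[k] for k in ERROR_COUNTERS if stats.get(k, 0) != 0}


if __name__ == "__main__":
    sample = """NIC statistics:
         Good Rx Frames: 533
         Rx CRC Errors: 0
         Rx DMA chan 0: tail_enqueue: 424
    """
    print(nonzero_errors(parse_ethtool_stats(sample)))
```

An empty result only says that the MAC did not count any of the listed errors; it does not rule out a problem on the PHY side of the link.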

    The non-working case shows err -19, which translates to "no such device". The error appears in both cases for slave 1, but in the second case it is also reported for slave 0, the PHY at MDIO address 0. For this I would look at the MDIO bus. You mentioned that you reviewed the boards showing this error; could you go into more detail about what that review covered? Are the voltage levels as expected, and is the clock in good shape? Questions about the PHY itself you will have to pursue with the PHY manufacturer.
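For reference, the number in `err -19` can be translated to its symbolic errno name with the Python standard library. This is only a lookup sketch: errno values are platform-specific, so it assumes it runs on Linux, and `describe_kernel_err` is a hypothetical helper name.

```python
import errno
import os


def describe_kernel_err(code):
    """Map a negative kernel return code (e.g. -19) to (name, message)."""
    n = abs(code)
    # errno.errorcode maps numeric values to names such as "ENODEV".
    return errno.errorcode.get(n, "UNKNOWN"), os.strerror(n)


if __name__ == "__main__":
    name, message = describe_kernel_err(-19)
    print(name, "-", message)
```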

    I agree that it is most likely something to do with the PCB.

    Best Regards,
    Schuyler
  • Yes, the working case switches from 1Gbps to 100Mbps because I had to force the lower speed of 100Mbps; otherwise it never acquires an IP address!

    So it is indeed another (possibly related) issue.



    I will get back with the results from ethtool -S eth0.
  • Here is the output from 'ethtool -S eth0':

    IP Address acquired at 100Mbps:

    # ethtool -S eth0
    NIC statistics:
         Good Rx Frames: 533
         Broadcast Rx Frames: 0
         Multicast Rx Frames: 498
         Pause Rx Frames: 0
         Rx CRC Errors: 0
         Rx Align/Code Errors: 0
         Oversize Rx Frames: 0
         Rx Jabbers: 0
         Undersize (Short) Rx Frames: 0
         Rx Fragments: 0
         Rx Octets: 49379
         Good Tx Frames: 90
         Broadcast Tx Frames: 5
         Multicast Tx Frames: 40
         Pause Tx Frames: 0
         Deferred Tx Frames: 0
         Collisions: 0
         Single Collision Tx Frames: 0
         Multiple Collision Tx Frames: 0
         Excessive Collisions: 0
         Late Collisions: 0
         Tx Underrun: 0
         Carrier Sense Errors: 0
         Tx Octets: 11121
         Rx + Tx 64 Octet Frames: 267
         Rx + Tx 65-127 Octet Frames: 289
         Rx + Tx 128-255 Octet Frames: 44
         Rx + Tx 256-511 Octet Frames: 23
         Rx + Tx 512-1023 Octet Frames: 0
         Rx + Tx 1024-Up Octet Frames: 0
         Net Octets: 60500
         Rx Start of Frame Overruns: 0
         Rx Middle of Frame Overruns: 0
         Rx DMA Overruns: 0
         Rx DMA chan 0: head_enqueue: 1
         Rx DMA chan 0: tail_enqueue: 424
         Rx DMA chan 0: pad_enqueue: 0
         Rx DMA chan 0: misqueued: 0
         Rx DMA chan 0: desc_alloc_fail: 0
         Rx DMA chan 0: pad_alloc_fail: 0
         Rx DMA chan 0: runt_receive_buf: 0
         Rx DMA chan 0: runt_transmit_bu: 0
         Rx DMA chan 0: empty_dequeue: 0
         Rx DMA chan 0: busy_dequeue: 285
         Rx DMA chan 0: good_dequeue: 297
         Rx DMA chan 0: requeue: 0
         Rx DMA chan 0: teardown_dequeue: 0
         Tx DMA chan 0: head_enqueue: 83
         Tx DMA chan 0: tail_enqueue: 7
         Tx DMA chan 0: pad_enqueue: 0
         Tx DMA chan 0: misqueued: 7
         Tx DMA chan 0: desc_alloc_fail: 0
         Tx DMA chan 0: pad_alloc_fail: 0
         Tx DMA chan 0: runt_receive_buf: 0
         Tx DMA chan 0: runt_transmit_bu: 6
         Tx DMA chan 0: empty_dequeue: 77
         Tx DMA chan 0: busy_dequeue: 0
         Tx DMA chan 0: good_dequeue: 90
         Tx DMA chan 0: requeue: 0
         Tx DMA chan 0: teardown_dequeue: 0
    

    No IP address acquired (at 1Gbps):

    # ethtool -S eth0                                                                                                                                                                                                                                                      
    NIC statistics:
         Good Rx Frames: 0
         Broadcast Rx Frames: 0
         Multicast Rx Frames: 0
         Pause Rx Frames: 0
         Rx CRC Errors: 0
         Rx Align/Code Errors: 0
         Oversize Rx Frames: 0
         Rx Jabbers: 0
         Undersize (Short) Rx Frames: 0
         Rx Fragments: 0
         Rx Octets: 0
         Good Tx Frames: 0
         Broadcast Tx Frames: 0
         Multicast Tx Frames: 0
         Pause Tx Frames: 0
         Deferred Tx Frames: 0
         Collisions: 0
         Single Collision Tx Frames: 0
         Multiple Collision Tx Frames: 0
         Excessive Collisions: 0
         Late Collisions: 0
         Tx Underrun: 0
         Carrier Sense Errors: 0
         Tx Octets: 0
         Rx + Tx 64 Octet Frames: 0
         Rx + Tx 65-127 Octet Frames: 0
         Rx + Tx 128-255 Octet Frames: 0
         Rx + Tx 256-511 Octet Frames: 0
         Rx + Tx 512-1023 Octet Frames: 0
         Rx + Tx 1024-Up Octet Frames: 0
         Net Octets: 0
         Rx Start of Frame Overruns: 0
         Rx Middle of Frame Overruns: 0
         Rx DMA Overruns: 0
         Rx DMA chan 0: head_enqueue: 1
         Rx DMA chan 0: tail_enqueue: 127
         Rx DMA chan 0: pad_enqueue: 0
         Rx DMA chan 0: misqueued: 0
         Rx DMA chan 0: desc_alloc_fail: 0
         Rx DMA chan 0: pad_alloc_fail: 0
         Rx DMA chan 0: runt_receive_buf: 0
         Rx DMA chan 0: runt_transmit_bu: 0
         Rx DMA chan 0: empty_dequeue: 0
         Rx DMA chan 0: busy_dequeue: 0
         Rx DMA chan 0: good_dequeue: 0
         Rx DMA chan 0: requeue: 0
         Rx DMA chan 0: teardown_dequeue: 0
         Tx DMA chan 0: head_enqueue: 0
         Tx DMA chan 0: tail_enqueue: 0
         Tx DMA chan 0: pad_enqueue: 0
         Tx DMA chan 0: misqueued: 0
         Tx DMA chan 0: desc_alloc_fail: 0
         Tx DMA chan 0: pad_alloc_fail: 0
         Tx DMA chan 0: runt_receive_buf: 0
         Tx DMA chan 0: runt_transmit_bu: 0
         Tx DMA chan 0: empty_dequeue: 0
         Tx DMA chan 0: busy_dequeue: 0
         Tx DMA chan 0: good_dequeue: 0
         Tx DMA chan 0: requeue: 0
         Tx DMA chan 0: teardown_dequeue: 0
    
  • Hi,
    Thanks for posting the ethtool statistics. In the second case, where no IP address is obtained, no packets are leaving the interface, so the address acquisition never gets past the MAC. If the earlier log still applies, a likely cause is the -19 (no such device) error on the console, which means the kernel could not find the PHY on the MDIO bus. Without a detected PHY, the CPSW driver cannot tell whether the PHY has established a link with the link partner. To summarize: the PHY has to be detected, and the PHY has to detect a link partner, before TX packets will leave the interface. If you run ethtool eth0 while the second case is present, I would expect either no link settings to be returned or the PHY to be seen but no link detected.

    For these boards I think there are two HW issues to resolve: why the PHY is not being detected by the kernel, and why the link is not being detected.
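Whether the kernel saw the PHY on the MDIO bus can be read from the `davinci_mdio` line in dmesg. The sketch below decodes the `detected phy mask` value under the assumption that a set bit means the address is excluded from scanning, so the clear bits are the addresses that answered. That matches the working log above, where mask `fffffffe` goes with a PHY at address 0, but please verify it against the driver source for your kernel. The function names are hypothetical.

```python
import re

# Matches e.g. "davinci_mdio 48485000.mdio: detected phy mask fffffffe"
MASK_RE = re.compile(r"davinci_mdio \S+: detected phy mask ([0-9a-fA-F]+)")


def live_phy_addresses(mask_hex):
    """Return MDIO addresses (0..31) whose bit is CLEAR in the mask.

    Assumption: a set bit masks an address out, a clear bit marks a
    PHY that responded.
    """
    mask = int(mask_hex, 16) & 0xFFFFFFFF
    return [addr for addr in range(32) if not (mask >> addr) & 1]


def phys_from_dmesg(dmesg_text):
    """Return the live PHY addresses, or None if no mask line is present."""
    match = MASK_RE.search(dmesg_text)
    if match is None:
        return None
    return live_phy_addresses(match.group(1))


if __name__ == "__main__":
    good = "[    2.130112] davinci_mdio 48485000.mdio: detected phy mask fffffffe"
    bad = "[    2.148541] davinci_mdio 48485000.mdio: no live phy, scanning all"
    print(phys_from_dmesg(good))
    print(phys_from_dmesg(bad))
```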

    Best Regards,
    Schuyler
  • Right, I suspected as much.

    1. Hardware Problem #1 - Ethernet at 1Gbps fails to acquire an IP address. Only 100Mbps works.

      This problem is consistent among the 10+ custom boards I have tried. It *could* still be a software issue.

    2. Hardware Problem #2 - Ethernet PHY disappears randomly (say, 2 out of 5 boots)
      This problem only occurs on a few (2 of 10+) custom boards. I just haven't figured out which variable would let me reliably reproduce the disappearance.
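One way to narrow down the variable is to save the dmesg output of every boot and tally how often the PHY is missing under each condition (cold vs. warm boot, supply, temperature, and so on). The sketch below classifies a saved log using the two `davinci_mdio` messages seen in this thread. How the logs get collected is left open, and the labels and function names are hypothetical.

```python
from collections import Counter


def classify_boot(dmesg_text):
    """Classify one boot log by the davinci_mdio probe message it contains."""
    if "no live phy, scanning all" in dmesg_text:
        return "phy_missing"
    if "detected phy mask" in dmesg_text:
        return "phy_found"
    return "unknown"


def tally_boots(logs):
    """Count outcomes over an iterable of dmesg texts, one per boot."""
    return Counter(classify_boot(log) for log in logs)


if __name__ == "__main__":
    logs = [
        "davinci_mdio 48485000.mdio: detected phy mask fffffffe",
        "davinci_mdio 48485000.mdio: no live phy, scanning all",
        "davinci_mdio 48485000.mdio: no live phy, scanning all",
    ]
    print(tally_boots(logs))
```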

    Will close this post for now. When I figure out the details, I will update. 

    Thanks for the insight!