AM5708: Ethernet interface down issue in AM5708

Part Number: AM5708

Hi:

I have a custom AM5708-based board running SDK 6.02. I have an Ethernet board with two interfaces (eth1 and eth2) over PRU_2, and the PHY is a DP83822.

The issue: after I bring either interface down ("ifconfig eth1 down" or "ifconfig eth2 down"), unplugging the cable and plugging it back in causes a crash (see attached file).

The crash seems to occur because the PHY raises an interrupt that nobody services. I would expect all of the interface's interrupts, including the PHY's, to be disabled while the interface is down.

If I configure the PHY in the dtb not to generate interrupts, the crash does not happen, but after several unplug/plug cycles the interface still receives frames and no longer sends any. In principle this should be unrelated to the PHY interrupts.


Regards

Billa

ethernet_redundancy_interface_down_issue.txt
root@predixedge:/mnt/data/APPROOT/bin# ifconfig
docker0   Link encap:Ethernet  HWaddr 02:42:8C:47:6D:E3  
          inet addr:172.17.0.1  Bcast:172.17.255.255  Mask:255.255.0.0
          UP BROADCAST MULTICAST  MTU:1500  Metric:1
          RX packets:0 errors:0 dropped:0 overruns:0 frame:0
          TX packets:0 errors:0 dropped:0 overruns:0 carrier:0
          collisions:0 txqueuelen:0 
          RX bytes:0 (0.0 B)  TX bytes:0 (0.0 B)

docker_gwbridge Link encap:Ethernet  HWaddr 02:42:34:AB:DD:FE  
          inet addr:172.18.0.1  Bcast:172.18.255.255  Mask:255.255.0.0
          inet6 addr: fe80::42:34ff:feab:ddfe/64 Scope:Link
          UP BROADCAST RUNNING MULTICAST  MTU:1500  Metric:1
          RX packets:0 errors:0 dropped:0 overruns:0 frame:0
          TX packets:13 errors:0 dropped:0 overruns:0 carrier:0
          collisions:0 txqueuelen:0 
          RX bytes:0 (0.0 B)  TX bytes:1006 (1006.0 B)

eth0      Link encap:Ethernet  HWaddr 00:A0:F4:DE:AD:BE  
          inet addr:10.3.33.187  Bcast:10.3.39.255  Mask:255.255.248.0
          inet6 addr: fe80::b555:b9f:1283:2946/64 Scope:Link
          UP BROADCAST RUNNING ALLMULTI MULTICAST  MTU:1500  Metric:1
          RX packets:1142 errors:0 dropped:0 overruns:0 frame:0
          TX packets:25 errors:0 dropped:0 overruns:0 carrier:0
          collisions:0 txqueuelen:1000 
          RX bytes:73162 (71.4 KiB)  TX bytes:1996 (1.9 KiB)
          Interrupt:90 

eth1      Link encap:Ethernet  HWaddr 00:A0:F4:DE:AD:CC  
          inet addr:0.0.0.1  Bcast:255.255.255.255  Mask:255.255.255.255
          UP BROADCAST RUNNING MULTICAST  MTU:1500  Metric:1
          RX packets:1121 errors:0 dropped:0 overruns:0 frame:0
          TX packets:12 errors:0 dropped:0 overruns:0 carrier:0
          collisions:0 txqueuelen:1000 
          RX bytes:67284 (65.7 KiB)  TX bytes:896 (896.0 B)

eth2      Link encap:Ethernet  HWaddr 00:A0:F4:DE:AD:CD  
          inet addr:0.0.0.1  Bcast:255.255.255.255  Mask:255.255.255.255
          UP BROADCAST RUNNING MULTICAST  MTU:1500  Metric:1
          RX packets:1121 errors:0 dropped:1 overruns:0 frame:0
          TX packets:12 errors:0 dropped:0 overruns:0 carrier:0
          collisions:0 txqueuelen:1000 
          RX bytes:67284 (65.7 KiB)  TX bytes:896 (896.0 B)

lo        Link encap:Local Loopback  
          inet addr:127.0.0.1  Mask:255.0.0.0
          inet6 addr: ::1/128 Scope:Host
          UP LOOPBACK RUNNING  MTU:65536  Metric:1
          RX packets:833 errors:0 dropped:0 overruns:0 frame:0
          TX packets:833 errors:0 dropped:0 overruns:0 carrier:0
          collisions:0 txqueuelen:1000 
          RX bytes:72294 (70.5 KiB)  TX bytes:72294 (70.5 KiB)

usb0      Link encap:Ethernet  HWaddr 00:1E:58:41:B8:78  
          inet addr:172.16.0.3  Bcast:172.16.0.255  Mask:255.255.255.0
          inet6 addr: fe80::45d5:14dc:4e91:c7c6/64 Scope:Link
          UP BROADCAST RUNNING MULTICAST  MTU:1500  Metric:1
          RX packets:51 errors:0 dropped:0 overruns:0 frame:0
          TX packets:25 errors:0 dropped:0 overruns:0 carrier:0
          collisions:0 txqueuelen:1000 
          RX bytes:7211 (7.0 KiB)  TX bytes:1854 (1.8 KiB)

veth52b4f5a Link encap:Ethernet  HWaddr 16:A2:A8:76:56:69  
          inet6 addr: fe80::14a2:a8ff:fe76:5669/64 Scope:Link
          UP BROADCAST RUNNING MULTICAST  MTU:1500  Metric:1
          RX packets:0 errors:0 dropped:0 overruns:0 frame:0
          TX packets:16 errors:0 dropped:0 overruns:0 carrier:0
          collisions:0 txqueuelen:0 
          RX bytes:0 (0.0 B)  TX bytes:1216 (1.1 KiB)

vetha5aaccf Link encap:Ethernet  HWaddr D2:EF:60:62:A1:21  
          inet6 addr: fe80::d0ef:60ff:fe62:a121/64 Scope:Link
          UP BROADCAST RUNNING MULTICAST  MTU:1500  Metric:1
          RX packets:0 errors:0 dropped:0 overruns:0 frame:0
          TX packets:26 errors:0 dropped:0 overruns:0 carrier:0
          collisions:0 txqueuelen:0 
          RX bytes:0 (0.0 B)  TX bytes:2012 (1.9 KiB)

root@predixedge:/mnt/data/APPROOT/bin# [  760.023941] kgoose: RAW socket ready for iface: eth1, index: 6, MAC: 0:a0:f4:de:ad:cc
ifconfig eth1 down
[  767.405506] prueth pruss2_eth eth1: Link is Down
[  767.418687] pruss 4b280000.pruss: unconfigured system_events[63-0] = 00000600,04500000
[  767.426722] pruss 4b280000.pruss: unconfigured host_intr = 0x00000155
[  767.433468] remoteproc remoteproc5: stopped remote processor 4b2b4000.pru
[  767.445174] net eth1: stopped
root@predixedge:/mnt/data/APPROOT/bin# [  773.963914] sched: RT throttling activated
[  774.700198] irq 197: nobody cared (try booting with the "irqpoll" option)
[  774.700206] CPU: 0 PID: 30 Comm: irq/41-48051000 Tainted: G           O      4.19.79-rt28-g5baf382c8f #1
[  774.700208] Hardware name: Generic DRA72X (Flattened Device Tree)
[  774.700211] Backtrace: 
[  774.700228] [<c020c4cc>] (dump_backtrace) from [<c020c804>] (show_stack+0x18/0x1c)
[  774.700234]  r7:000000c5 r6:00000000 r5:00000000 r4:cfbf2100
[  774.700243] [<c020c7ec>] (show_stack) from [<c0ae6780>] (dump_stack+0x24/0x28)
[  774.700252] [<c0ae675c>] (dump_stack) from [<c0269c5c>] (__report_bad_irq+0x40/0x10c)
[  774.700260] [<c0269c1c>] (__report_bad_irq) from [<c0269a64>] (note_interrupt+0x114/0x29c)
[  774.700266]  r9:60030013 r8:00000002 r7:000000c5 r6:00000000 r5:00000000 r4:cfbf2100
[  774.700274] [<c0269950>] (note_interrupt) from [<c0266f74>] (handle_irq_event_percpu+0x9c/0xa4)
[  774.700280]  r10:c1606888 r9:60030013 r8:00000002 r7:cfbf2100 r6:00000000 r5:c1606888
[  774.700282]  r4:00000000 r3:00000000
[  774.700289] [<c0266ed8>] (handle_irq_event_percpu) from [<c0266fe0>] (handle_irq_event+0x64/0xa4)
[  774.700293]  r8:ffffe000 r7:00000000 r6:fa05102c r5:00000001 r4:cfbf2100
[  774.700300] [<c0266f7c>] (handle_irq_event) from [<c026a964>] (handle_level_irq+0xd0/0x180)
[  774.700303]  r5:00000001 r4:cfbf2100
[  774.700310] [<c026a894>] (handle_level_irq) from [<c0265ea8>] (generic_handle_irq+0x2c/0x3c)
[  774.700313]  r5:00000001 r4:df236e40
[  774.700323] [<c0265e7c>] (generic_handle_irq) from [<c06a9890>] (omap_gpio_irq_handler+0x134/0x218)
[  774.700332] [<c06a975c>] (omap_gpio_irq_handler) from [<c02682b0>] (irq_forced_thread_fn+0x28/0xa0)
[  774.700338]  r9:00000000 r8:00000001 r7:c0268288 r6:ffffe000 r5:df239240 r4:df235600
[  774.700344] [<c0268288>] (irq_forced_thread_fn) from [<c02685f0>] (irq_thread+0x124/0x24c)
[  774.700348]  r7:c0268288 r6:ffffe000 r5:df235600 r4:df239240
[  774.700355] [<c02684cc>] (irq_thread) from [<c0248358>] (kthread+0x158/0x160)
[  774.700361]  r10:df089a38 r9:c02684cc r8:df239240 r7:df25a000 r6:00000000 r5:df239100
[  774.700363]  r4:df2396c0
[  774.700369] [<c0248200>] (kthread) from [<c02010e0>] (ret_from_fork+0x14/0x34)
[  774.700372] Exception stack(0xdf25bfb0 to 0xdf25bff8)
[  774.700376] bfa0:                                     00000000 00000000 00000000 00000000
[  774.700381] bfc0: 00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000
[  774.700386] bfe0: 00000000 00000000 00000000 00000000 00000013 00000000
[  774.700391]  r10:00000000 r9:00000000 r8:00000000 r7:00000000 r6:00000000 r5:c0248200
[  774.700393]  r4:df239100
[  774.700394] handlers:
[  774.700400] [<b762c41a>] irq_default_primary_handler threaded [<d253e99f>] phy_interrupt
[  774.700411] Disabling IRQ #197

  • Hi:

    I have seen a couple of things:

    1.- Interface crash after down:

    I have modified dp83822.c, as the function “dp83822_config_intr()” does not seem to disable interrupts correctly:

     

         } else {
                 err = phy_write(phydev, MII_DP83822_MISR1, 0);
                 if (err < 0)
                         return err;

    -            err = phy_write(phydev, MII_DP83822_MISR1, 0);
    +            err = phy_write(phydev, MII_DP83822_MISR2, 0);
                 if (err < 0)
                         return err;

                 physcr_status = phy_read(phydev, MII_DP83822_PHYSCR);
                 if (physcr_status < 0)
                         return physcr_status;

    -            physcr_status &= ~DP83822_PHYSCR_INTEN;
    +            physcr_status &= ~(DP83822_PHYSCR_INT_OE | DP83822_PHYSCR_INTEN);
         }

    With this modification, the PHY can run in IRQ mode instead of polling mode and the crash no longer occurs.
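
    To make the effect of the two changed lines easier to see, here is a small user-space sketch that replays the disable path against a fake register file. Everything in it is an assumption for illustration: the register addresses and bit positions are modelled on the mainline dp83822.c and should be checked against your kernel tree, the enable value 0x003f is arbitrary, and "fake_phy", "disable_intr()" and "leftover_enables()" are hypothetical names. It only tracks which bits each variant leaves set; it does not model how the real PHY drives its interrupt pin.

```c
#include <stdint.h>
#include <string.h>

/* Assumed values, modelled on mainline dp83822.c; verify in your tree. */
#define MII_DP83822_PHYSCR    0x11
#define MII_DP83822_MISR1     0x12
#define MII_DP83822_MISR2     0x13
#define DP83822_PHYSCR_INT_OE (1u << 0)
#define DP83822_PHYSCR_INTEN  (1u << 1)

/* Flags reported by leftover_enables() (hypothetical helper). */
#define LEFT_MISR1  (1u << 0)
#define LEFT_MISR2  (1u << 1)
#define LEFT_INT_OE (1u << 2)
#define LEFT_INTEN  (1u << 3)

/* Hypothetical stand-in for the PHY's MDIO register file. */
struct fake_phy { uint16_t regs[32]; };

static int fake_phy_write(struct fake_phy *phy, unsigned reg, uint16_t val)
{
    if (reg >= 32)
        return -1;
    phy->regs[reg] = val;
    return 0;
}

static int fake_phy_read(const struct fake_phy *phy, unsigned reg)
{
    return reg < 32 ? phy->regs[reg] : -1;
}

/* Start from a state where interrupts are fully enabled (arbitrary mask). */
static void fake_phy_enable_all(struct fake_phy *phy)
{
    memset(phy, 0, sizeof(*phy));
    phy->regs[MII_DP83822_MISR1] = 0x003f;
    phy->regs[MII_DP83822_MISR2] = 0x003f;
    phy->regs[MII_DP83822_PHYSCR] = DP83822_PHYSCR_INT_OE | DP83822_PHYSCR_INTEN;
}

/* Replay the disable branch: patched == 0 is the code before the change,
 * patched != 0 is the code with the two modified lines. */
static int disable_intr(struct fake_phy *phy, int patched)
{
    unsigned clear = patched ? (DP83822_PHYSCR_INT_OE | DP83822_PHYSCR_INTEN)
                             : DP83822_PHYSCR_INTEN;
    int physcr;

    if (fake_phy_write(phy, MII_DP83822_MISR1, 0) < 0)
        return -1;
    /* Before the change this second write hit MISR1 again. */
    if (fake_phy_write(phy, patched ? MII_DP83822_MISR2 : MII_DP83822_MISR1, 0) < 0)
        return -1;

    physcr = fake_phy_read(phy, MII_DP83822_PHYSCR);
    if (physcr < 0)
        return -1;
    return fake_phy_write(phy, MII_DP83822_PHYSCR,
                          (uint16_t)((unsigned)physcr & ~clear));
}

/* Report which enable bits are still set after the disable path ran. */
static unsigned leftover_enables(const struct fake_phy *phy)
{
    unsigned left = 0;

    if (phy->regs[MII_DP83822_MISR1])
        left |= LEFT_MISR1;
    if (phy->regs[MII_DP83822_MISR2])
        left |= LEFT_MISR2;
    if (phy->regs[MII_DP83822_PHYSCR] & DP83822_PHYSCR_INT_OE)
        left |= LEFT_INT_OE;
    if (phy->regs[MII_DP83822_PHYSCR] & DP83822_PHYSCR_INTEN)
        left |= LEFT_INTEN;
    return left;
}
```

    In this model, calling disable_intr(&phy, 0) after fake_phy_enable_all(&phy) leaves the MISR2 enables and INT_OE set, while disable_intr(&phy, 1) leaves nothing set. That matches the intent of the change, although which of the two leftover bits actually caused the unhandled IRQ on the board is not established here.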

     

    2.- I have a kernel socket for sending/receiving GOOSE messages.

    Under high GOOSE traffic, if I unplug the cable and plug it back in, that interface stops sending both GOOSE and normal messages but continues to receive all kinds of messages.

    Under high traffic that is not GOOSE (e.g. heavy broadcast traffic), the interface works fine.

    I instrumented “kgoose_socket_sendmsg()” with printk calls and saw that “kernel_sendmsg()” is still called even though no messages go out on the wire.

     

    Regards

    Billa

  • Hi:

    After some investigation, here are my comments:

    1.- This behavior does not happen on the GMAC port, which is not affected by high tx/rx traffic.

    2.- The PRU ports continue to receive messages under high traffic, but they no longer send anything.

    3.- The only way to recover the PRU ports is "ifconfig down" followed by "ifconfig up". This reloads the PRU firmware, so the ports start again from their initial state.

    4.- After debugging the code, the issue seems to be in the transmit path of "prueth.c". Below is the relevant code from "emac_ndo_start_xmit()":


        qid = prueth_get_tx_queue_id(emac->prueth, skb);
        if (emac->port_id == PRUETH_PORT_MII0) {
            /* packet sent on MII0 */
            ret = prueth_tx_enqueue(emac, skb, PRUETH_PORT_QUEUE_MII0,
                        qid);
        } else if (emac->port_id == PRUETH_PORT_MII1) {
            /* packet sent on MII1 */
            ret = prueth_tx_enqueue(emac, skb, PRUETH_PORT_QUEUE_MII1,
                        qid);
        } else {
            goto fail_tx; /* switch mode not supported yet */
        }

        if (ret) {
            if (ret != -ENOBUFS && ret != -EBUSY) {
                if (netif_msg_tx_err(emac) && net_ratelimit())
                    netdev_err(ndev,
                           "packet queue failed: %d\n", ret);
                goto fail_tx;
            } else {
                return NETDEV_TX_BUSY;
            }
        }

    Comments about this piece of code:

    1.- With this code as is, under high tx traffic, unplugging and replugging the cable makes the system very slow and the message "RT throttling activated" appears on the console. After a short time, Linux crashes.

    To avoid this I have to replace "return NETDEV_TX_BUSY" with "goto fail_tx".

    2.- When the problem occurs, "prueth_tx_enqueue()" returns "-ENOBUFS" and no messages are sent. Reception is unaffected; the interface continues to receive messages.

    The only way I have found to return to normal operation is to issue "ifconfig eth1 down" and then "ifconfig eth1 up" (likewise for eth2).
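
    One possible reading of comment 1 is that, once the tx queue stops draining, every "NETDEV_TX_BUSY" return makes the stack offer the same packet again, so the CPU spins on retries. The user-space sketch below only models that idea; it is not the prueth code. The names "model_xmit()", "model_stack()", the policies and the attempt budget are all hypothetical, and the queue is assumed never to drain, as if the consumer had stalled.

```c
#include <stddef.h>

enum tx_result   { TX_OK, TX_BUSY, TX_DROPPED };
enum full_policy { POLICY_BUSY, POLICY_DROP };

/* Model of a tx queue whose consumer has stalled: 'used' never decreases. */
struct tx_queue { size_t capacity; size_t used; };

struct tx_stats { size_t sent; size_t dropped; size_t attempts; };

/* Model of the xmit hook: enqueue if there is room, otherwise either
 * report "busy" (packet stays with the caller) or drop the packet. */
static enum tx_result model_xmit(struct tx_queue *q, enum full_policy policy)
{
    if (q->used < q->capacity) {
        q->used++;
        return TX_OK;
    }
    return policy == POLICY_BUSY ? TX_BUSY : TX_DROPPED;
}

/* Model of the caller: offer 'packets' packets one by one. A packet that
 * got TX_BUSY is offered again; the loop gives up after 'attempt_budget'
 * calls so the simulation always terminates. */
static struct tx_stats model_stack(struct tx_queue *q, enum full_policy policy,
                                   size_t packets, size_t attempt_budget)
{
    struct tx_stats s = { 0, 0, 0 };
    size_t done = 0;

    while (done < packets && s.attempts < attempt_budget) {
        enum tx_result r = model_xmit(q, policy);

        s.attempts++;
        if (r == TX_OK) {
            s.sent++;
            done++;
        } else if (r == TX_DROPPED) {
            s.dropped++;
            done++;
        }
        /* TX_BUSY: 'done' is unchanged, so the same packet is retried. */
    }
    return s;
}
```

    With a 4-slot queue and 10 packets, the busy policy consumes the whole attempt budget while still sending only 4 packets, whereas the drop policy finishes in 10 attempts with 6 drops. Kernel driver guidance is to stop the queue (netif_stop_queue()) when the ring fills and wake it once space is freed, rather than returning "NETDEV_TX_BUSY" repeatedly; whether prueth does so is not visible in the excerpt above, and neither policy explains why the queue stops draining after the cable is replugged.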


    Regards

    Billa

  • Hi Billa,

    Sorry for the delay, opening this thread after some time.

    Are you still facing this issue? Were you able to find the root cause?

    Regards

    Vineet