DRA821U: Plugging and unplugging the CPSWng cable during CPSW2g communication may cause interrupts to be missed

Part Number: DRA821U
Other Parts Discussed in Thread: J7200XSOMXEVM

Hi Team,

We have designed a custom board based on the J7200XSOMXEVM reference design.

We are developing the software for the custom board with PROCESSOR-SDK-LINUX-RT-J7200 (10.00.07.03).

The custom board uses three Ethernet ports: CPSW2g (MCU-RMII1) and CPSWng (RGMII2, RGMII3).

CPSW2g is used for EtherCAT communication, and CPSWng performs general-purpose IP communication.

During CPSW2g communication, if you disconnect and connect the cable in CPSWng, a frame loss may occur in EtherCAT communication.

Using ftrace, we captured trace data both when frame loss occurred and when it did not, and found the following differences.

[Image: left: frame loss occurred; right: frame loss did not occur]

Frame loss does not occur every time; it occurs roughly once every few attempts.

I plugged and unplugged the cable five times and attached the trace data from the fifth attempt, when a frame loss occurred.

err_sched_switch.dat

Could this be because CPSW2g and CPSWng use the same driver (am65-cpsw-nuss.c)?
Is there a way to solve this?

Regards,

mizutani

  • Hi,

    Could this be because CPSW2g and CPSWng use the same driver (am65-cpsw-nuss.c)?

    It should not be because the same driver is used for CPSW2G and CPSWnG.

    Can you check the CPSW statistics for any errors during cable unplug and plug?

    Best Regards,
    Sudheer
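
One way to carry out the statistics check suggested above is to capture `ethtool -S <interface>` before and after a cable unplug/replug and compare the counters. The following is a minimal Python sketch, not a TI-provided tool: it assumes the usual `name: value` layout of `ethtool -S` output, and the counter names in the sample text are hypothetical placeholders rather than actual am65-cpsw-nuss counter names.

```python
def parse_ethtool_stats(text):
    """Parse `ethtool -S` style output into a {counter_name: value} dict."""
    stats = {}
    for line in text.splitlines():
        name, sep, value = line.rpartition(":")
        value = value.strip()
        # Skip the "NIC statistics:" header and any non-numeric lines.
        if sep and value.isdigit():
            stats[name.strip()] = int(value)
    return stats


def changed_counters(before, after):
    """Return {name: delta} for counters whose value changed."""
    return {
        name: after[name] - before[name]
        for name in after
        if name in before and after[name] != before[name]
    }


# Hypothetical snapshots; the counter names are placeholders for illustration.
BEFORE = """NIC statistics:
     rx_good_frames: 1000
     rx_crc_errors: 0
     tx_good_frames: 1000
"""
AFTER = """NIC statistics:
     rx_good_frames: 1050
     rx_crc_errors: 2
     tx_good_frames: 1000
"""

delta = changed_counters(parse_ethtool_stats(BEFORE), parse_ethtool_stats(AFTER))
for name, diff in sorted(delta.items()):
    print(f"{name}: {diff:+d}")
```

On the board, the two snapshots would come from running `ethtool -S` on the CPSW2g interface before and after the cable event on CPSWng.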

  • Hi

    Further investigation revealed that CPSW2g is an EtherCAT main device, and the issue seems to be caused by the frame transmission process not running. However, the reason why the process stops running when the CPSWng cable is unplugged and plugged back in remains unclear.

    The results of tracing the preemptirq event are shown below.

    [Image: left: frame loss did not occur; right: frame loss occurred]

    Do these results tell us anything? And if so, how should we proceed with further investigation?

    Regards,

    mizutani

  • Hi,

    Have you observed any link down on CPSW2G? If so, that can stop the transmission.
    Otherwise, there is no reason for the CPSW2G transmission process to stop.

    Can you please share the Linux log captured during the above observation?

    Best Regards,
    Sudheer

  • Hi

    I recorded the kernel logs during the phenomenon. It appears that a link down of CPSW2G did not occur.

    Also, this phenomenon seems to occur both during link up and link down of CPSWng.

    The logs for each timing are shown below.

    Link up:

    *** Start plugging and unplugging the cable ***
    Dec 26 11:44:59 Device kernel: am65-cpsw-nuss c000000.ethernet ethernet: Link is Down
    Dec 26 11:44:59 Device systemd-networkd[639]: ethernet: Lost carrier
    Dec 26 11:44:59 Device systemd-networkd[639]: ethernet: DHCP lease lost
    Dec 26 11:44:59 Device avahi-daemon[596]: Withdrawing address record for 192.168.22.192 on ethernet.
    Dec 26 11:44:59 Device my_app1[660]: [2024/12/26 11:44:59:3668] N: rops_handle_POLLIN_netlink: DELADDR
    Dec 26 11:44:59 Device avahi-daemon[596]: Leaving mDNS multicast group on interface ethernet.IPv4 with address 192.168.22.192.
    Dec 26 11:44:59 Device avahi-daemon[596]: Interface ethernet.IPv4 no longer relevant for mDNS.
    Dec 26 11:44:59 Device my_app2[644]: [2024/12/26 11:44:59:5158] N: rops_handle_POLLIN_netlink: DELADDR
    Dec 26 11:45:03 Device kernel: am65-cpsw-nuss c000000.ethernet ethernet: Link is Up - 1Gbps/Full - flow control rx/tx
    Dec 26 11:45:03 Device systemd-networkd[639]: ethernet: Gained carrier
    Dec 26 11:45:03 Device systemd-networkd[639]: ethernet: DHCPv4 address 192.168.22.192/24, gateway 192.168.22.1 acquired from 192.168.22.1
    Dec 26 11:45:03 Device avahi-daemon[596]: Joining mDNS multicast group on interface ethernet.IPv4 with address 192.168.22.192.
    Dec 26 11:45:03 Device avahi-daemon[596]: New relevant interface ethernet.IPv4 for mDNS.
    Dec 26 11:45:03 Device avahi-daemon[596]: Registering new address record for 192.168.22.192 on ethernet.IPv4.
    Dec 26 11:45:07 Device kernel: am65-cpsw-nuss c000000.ethernet ethernet: Link is Down
    Dec 26 11:45:07 Device systemd-networkd[639]: ethernet: Lost carrier
    Dec 26 11:45:07 Device avahi-daemon[596]: Withdrawing address record for 192.168.22.192 on ethernet.
    Dec 26 11:45:07 Device my_app1[660]: [2024/12/26 11:45:07:5598] N: rops_handle_POLLIN_netlink: DELADDR
    Dec 26 11:45:07 Device systemd-networkd[639]: ethernet: DHCP lease lost
    Dec 26 11:45:07 Device avahi-daemon[596]: Leaving mDNS multicast group on interface ethernet.IPv4 with address 192.168.22.192.
    Dec 26 11:45:07 Device avahi-daemon[596]: Interface ethernet.IPv4 no longer relevant for mDNS.
    Dec 26 11:45:07 Device my_app2[644]: [2024/12/26 11:45:07:7086] N: rops_handle_POLLIN_netlink: DELADDR
    Dec 26 11:45:12 Device kernel: am65-cpsw-nuss c000000.ethernet ethernet: Link is Up - 1Gbps/Full - flow control rx/tx
    Dec 26 11:45:12 Device systemd-networkd[639]: ethernet: Gained carrier
    Dec 26 11:45:12 Device systemd-networkd[639]: ethernet: DHCPv4 address 192.168.22.192/24, gateway 192.168.22.1 acquired from 192.168.22.1
    Dec 26 11:45:12 Device avahi-daemon[596]: Joining mDNS multicast group on interface ethernet.IPv4 with address 192.168.22.192.
    Dec 26 11:45:12 Device avahi-daemon[596]: New relevant interface ethernet.IPv4 for mDNS.
    Dec 26 11:45:12 Device avahi-daemon[596]: Registering new address record for 192.168.22.192 on ethernet.IPv4.
    Dec 26 11:45:17 Device kernel: am65-cpsw-nuss c000000.ethernet ethernet: Link is Down
    Dec 26 11:45:17 Device systemd-networkd[639]: ethernet: Lost carrier
    Dec 26 11:45:17 Device systemd-networkd[639]: ethernet: DHCP lease lost
    Dec 26 11:45:17 Device my_app1[660]: [2024/12/26 11:45:17:7986] N: rops_handle_POLLIN_netlink: DELADDR
    Dec 26 11:45:17 Device avahi-daemon[596]: Withdrawing address record for 192.168.22.192 on ethernet.
    Dec 26 11:45:17 Device avahi-daemon[596]: Leaving mDNS multicast group on interface ethernet.IPv4 with address 192.168.22.192.
    Dec 26 11:45:17 Device avahi-daemon[596]: Interface ethernet.IPv4 no longer relevant for mDNS.
    Dec 26 11:45:17 Device my_app2[644]: [2024/12/26 11:45:17:9479] N: rops_handle_POLLIN_netlink: DELADDR
    Dec 26 11:45:20 Device kernel: am65-cpsw-nuss c000000.ethernet ethernet: Link is Up - 1Gbps/Full - flow control rx/tx
    Dec 26 11:45:20 Device systemd-networkd[639]: ethernet: Gained carrier
    Dec 26 11:45:20 Device systemd-networkd[639]: ethernet: DHCPv4 address 192.168.22.192/24, gateway 192.168.22.1 acquired from 192.168.22.1
    Dec 26 11:45:20 Device avahi-daemon[596]: Joining mDNS multicast group on interface ethernet.IPv4 with address 192.168.22.192.
    Dec 26 11:45:20 Device avahi-daemon[596]: New relevant interface ethernet.IPv4 for mDNS.
    Dec 26 11:45:20 Device avahi-daemon[596]: Registering new address record for 192.168.22.192 on ethernet.IPv4.
    Dec 26 11:45:24 Device kernel: am65-cpsw-nuss c000000.ethernet ethernet: Link is Down
    Dec 26 11:45:24 Device systemd-networkd[639]: ethernet: Lost carrier
    Dec 26 11:45:24 Device systemd-networkd[639]: ethernet: DHCP lease lost
    Dec 26 11:45:24 Device avahi-daemon[596]: Withdrawing address record for 192.168.22.192 on ethernet.
    Dec 26 11:45:24 Device my_app1[660]: [2024/12/26 11:45:24:9702] N: rops_handle_POLLIN_netlink: DELADDR
    Dec 26 11:45:24 Device avahi-daemon[596]: Leaving mDNS multicast group on interface ethernet.IPv4 with address 192.168.22.192.
    Dec 26 11:45:24 Device avahi-daemon[596]: Interface ethernet.IPv4 no longer relevant for mDNS.
    Dec 26 11:45:25 Device my_app2[644]: [2024/12/26 11:45:25:1188] N: rops_handle_POLLIN_netlink: DELADDR
    
    *** An EtherCAT communication error occurs after this point. ***
    Dec 26 11:45:31 Device kernel: am65-cpsw-nuss c000000.ethernet ethernet: Link is Up - 1Gbps/Full - flow control rx/tx
    Dec 26 11:45:31 Device systemd-networkd[639]: ethernet: Gained carrier
    Dec 26 11:45:31 Device systemd-networkd[639]: ethernet: DHCPv4 address 192.168.22.192/24, gateway 192.168.22.1 acquired from 192.168.22.1
    Dec 26 11:45:31 Device avahi-daemon[596]: Joining mDNS multicast group on interface ethernet.IPv4 with address 192.168.22.192.
    Dec 26 11:45:31 Device avahi-daemon[596]: New relevant interface ethernet.IPv4 for mDNS.
    Dec 26 11:45:31 Device avahi-daemon[596]: Registering new address record for 192.168.22.192 on ethernet.IPv4.
    
    

    Link down:

    *** Start plugging and unplugging the cable ***
    Dec 26 11:48:13 Device kernel: am65-cpsw-nuss c000000.ethernet ethernet: Link is Down
    Dec 26 11:48:13 Device systemd-networkd[639]: ethernet: Lost carrier
    Dec 26 11:48:13 Device systemd-networkd[639]: ethernet: DHCP lease lost
    Dec 26 11:48:13 Device avahi-daemon[596]: Withdrawing address record for 192.168.22.192 on ethernet.
    Dec 26 11:48:13 Device my_app1[660]: [2024/12/26 11:48:13:9323] N: rops_handle_POLLIN_netlink: DELADDR
    Dec 26 11:48:13 Device avahi-daemon[596]: Leaving mDNS multicast group on interface ethernet.IPv4 with address 192.168.22.192.
    Dec 26 11:48:13 Device avahi-daemon[596]: Interface ethernet.IPv4 no longer relevant for mDNS.
    Dec 26 11:48:14 Device my_app2[644]: [2024/12/26 11:48:14:0808] N: rops_handle_POLLIN_netlink: DELADDR
    Dec 26 11:48:20 Device kernel: am65-cpsw-nuss c000000.ethernet ethernet: Link is Up - 1Gbps/Full - flow control rx/tx
    Dec 26 11:48:20 Device systemd-networkd[639]: ethernet: Gained carrier
    Dec 26 11:48:20 Device systemd-networkd[639]: ethernet: DHCPv4 address 192.168.22.192/24, gateway 192.168.22.1 acquired from 192.168.22.1
    Dec 26 11:48:20 Device avahi-daemon[596]: Joining mDNS multicast group on interface ethernet.IPv4 with address 192.168.22.192.
    Dec 26 11:48:20 Device avahi-daemon[596]: New relevant interface ethernet.IPv4 for mDNS.
    Dec 26 11:48:20 Device avahi-daemon[596]: Registering new address record for 192.168.22.192 on ethernet.IPv4.
    Dec 26 11:48:24 Device kernel: am65-cpsw-nuss c000000.ethernet ethernet: Link is Down
    Dec 26 11:48:24 Device systemd-networkd[639]: ethernet: Lost carrier
    Dec 26 11:48:24 Device systemd-networkd[639]: ethernet: DHCP lease lost
    Dec 26 11:48:24 Device avahi-daemon[596]: Withdrawing address record for 192.168.22.192 on ethernet.
    Dec 26 11:48:24 Device my_app1[660]: [2024/12/26 11:48:24:1671] N: rops_handle_POLLIN_netlink: DELADDR
    Dec 26 11:48:24 Device avahi-daemon[596]: Leaving mDNS multicast group on interface ethernet.IPv4 with address 192.168.22.192.
    Dec 26 11:48:24 Device avahi-daemon[596]: Interface ethernet.IPv4 no longer relevant for mDNS.
    Dec 26 11:48:24 Device my_app2[644]: [2024/12/26 11:48:24:3159] N: rops_handle_POLLIN_netlink: DELADDR
    Dec 26 11:48:29 Device kernel: am65-cpsw-nuss c000000.ethernet ethernet: Link is Up - 1Gbps/Full - flow control rx/tx
    Dec 26 11:48:29 Device systemd-networkd[639]: ethernet: Gained carrier
    Dec 26 11:48:29 Device systemd-networkd[639]: ethernet: DHCPv4 address 192.168.22.192/24, gateway 192.168.22.1 acquired from 192.168.22.1
    Dec 26 11:48:29 Device avahi-daemon[596]: Joining mDNS multicast group on interface ethernet.IPv4 with address 192.168.22.192.
    Dec 26 11:48:29 Device avahi-daemon[596]: New relevant interface ethernet.IPv4 for mDNS.
    Dec 26 11:48:29 Device avahi-daemon[596]: Registering new address record for 192.168.22.192 on ethernet.IPv4.
    
    *** An EtherCAT communication error occurs after this point. ***
    Dec 26 11:48:35 Device kernel: am65-cpsw-nuss c000000.ethernet ethernet: Link is Down
    Dec 26 11:48:35 Device systemd-networkd[639]: ethernet: Lost carrier
    Dec 26 11:48:35 Device my_app1[660]: [2024/12/26 11:48:35:4340] N: rops_handle_POLLIN_netlink: DELADDR
    Dec 26 11:48:35 Device avahi-daemon[596]: Withdrawing address record for 192.168.22.192 on ethernet.
    Dec 26 11:48:35 Device systemd-networkd[639]: ethernet: DHCP lease lost
    Dec 26 11:48:35 Device avahi-daemon[596]: Leaving mDNS multicast group on interface ethernet.IPv4 with address 192.168.22.192.
    Dec 26 11:48:35 Device avahi-daemon[596]: Interface ethernet.IPv4 no longer relevant for mDNS.
    Dec 26 11:48:35 Device my_app2[644]: [2024/12/26 11:48:35:5828] N: rops_handle_POLLIN_netlink: DELADDR

    Regards,

    mizutani

  • Hi,

    From the log, it looks like systemd services are created for the "ethernet" interface.

    What is "ethernet" here? Is it CPSWnG?
    networkd-639 ?
    Dec 26 11:44:59 Device kernel: am65-cpsw-nuss c000000.ethernet ethernet: Link is Down
    Dec 26 11:44:59 Device systemd-networkd[639]: ethernet: Lost carrier
    Dec 26 11:44:59 Device systemd-networkd[639]: ethernet: DHCP lease lost

    If you use the same service file for both CPSW2G and CPSWnG, can you use a separate file for each of them?

    From the above, this does not look like a driver or link issue; some service might be blocking transmission on CPSW2G.

    Best Regards,
    Sudheer

  • Hi

    What is "ethernet" here? Is it CPSWnG?

    That's right.

    If you use the same service file for both CPSW2G and CPSWnG, can you use a separate file for each of them?

    Does this mean "/etc/systemd/network/*.network"?
    If so, we already define the different interfaces in different files.

    From the above, this does not look like a driver or link issue; some service might be blocking transmission on CPSW2G.

    In our system, the following kernel command-line parameters are set so that the EtherCAT application occupies CPU1.

    "nohz_full=1 isolcpus=1 rcu_nocbs=1 rcu_nocb_poll irqaffinity=0 nosoftlockup"
    Also, the CPSW2G interrupt is assigned to CPU1 via smp_affinity.
    With these configurations, we believe the EtherCAT application is isolated from potential interference from interfaces other than CPSW2G. Is our understanding accurate?

    Regards,

    mizutani
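
To make the intent of these boot parameters explicit, the following Python sketch parses the command line quoted above and lists which CPUs each isolation-related parameter names. It is only an illustration: the helper names are made up for this example, CPU list strides and `isolcpus` flags are not handled, and the result shows what was requested on the command line, not whether CPU1 is actually free of other work at run time.

```python
ISOLATION_KEYS = ("isolcpus", "nohz_full", "rcu_nocbs", "irqaffinity")


def expand_cpu_list(spec):
    """Expand a CPU list such as '1' or '0-2,4' into a set of CPU numbers."""
    cpus = set()
    for part in spec.split(","):
        first, _, last = part.partition("-")
        if first.isdigit() and (not last or last.isdigit()):
            cpus.update(range(int(first), int(last or first) + 1))
        # Anything else (for example isolcpus flags) is ignored in this sketch.
    return cpus


def isolation_summary(cmdline):
    """Map each isolation-related key=value parameter to its set of CPUs."""
    params = {}
    for token in cmdline.split():
        key, sep, value = token.partition("=")
        if sep and key in ISOLATION_KEYS:
            params[key] = expand_cpu_list(value)
    return params


# Command line as quoted in the post above.
CMDLINE = "nohz_full=1 isolcpus=1 rcu_nocbs=1 rcu_nocb_poll irqaffinity=0 nosoftlockup"

summary = isolation_summary(CMDLINE)
for key in ISOLATION_KEYS:
    print(f"{key}: CPUs {sorted(summary.get(key, set()))}")

overlap = summary["isolcpus"] & summary["irqaffinity"]
print("default IRQ affinity overlaps isolated CPUs:", bool(overlap))
```

On a running system the same function could be fed the contents of `/proc/cmdline` instead of the hard-coded string.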

  • Hi,

    If you use the same service file for both CPSW2G and CPSWnG, can you use a separate file for each of them?

    Does this mean "/etc/systemd/network/*.network"?
    If so, we already define the different interfaces in different files.

    Yes, I mean maintaining separate files for CPSW2G and CPSWnG.

    From the above, this does not look like a driver or link issue; some service might be blocking transmission on CPSW2G.

    In our system, the following kernel command-line parameters are set so that the EtherCAT application occupies CPU1.

    "nohz_full=1 isolcpus=1 rcu_nocbs=1 rcu_nocb_poll irqaffinity=0 nosoftlockup"
    Also, the CPSW2G interrupt is assigned to CPU1 via smp_affinity.
    With these configurations, we believe the EtherCAT application is isolated from potential interference from interfaces other than CPSW2G. Is our understanding accurate?

    Can you please share all the .network and .netdev files used by systemd, and also the journalctl logs for the issue case versus the non-issue case?

    Best Regards,
    Sudheer

  • Hi

    Attached below are all the .network files that systemd uses.

    7870.src.zip

    The journal log from when the problem occurred was already provided in my earlier reply, quoted here:

    I recorded the kernel logs during the phenomenon. It appears that a link down of CPSW2G did not occur.

    Also, this phenomenon seems to occur both during link up and link down of CPSWng.

    The logs for each timing are shown below.

    These journal logs represent the period from when '~# journalctl -f' was initiated to when the EtherCAT communication failure occurred after repeatedly unplugging and plugging back in the cable.

    Also, the CPSW2G & CPSWnG logs at boot time are as follows.

    CPSW2G:

    [    0.895145] davinci_mdio 46000f00.mdio: Configuring MDIO in manual mode
    [    1.039804] davinci_mdio 46000f00.mdio: davinci mdio revision 9.7, bus freq 1000000
    [    1.249449] davinci_mdio 46000f00.mdio: phy[0]: device 46000f00.mdio:00, driver TI DP83822
    [    1.249491] am65-cpsw-nuss 46000000.ethernet: initializing am65 cpsw nuss version 0x6BA02102, cpsw version 0x6BA82102 Ports: 2 quirks:00000000
    [    1.249676] am65-cpsw-nuss 46000000.ethernet: initialized cpsw ale version 1.4
    [    1.249679] am65-cpsw-nuss 46000000.ethernet: ALE Table size 64
    [    2.231653] davinci_mdio 46000f00.mdio: Configuring MDIO in manual mode
    [    2.265800] davinci_mdio 46000f00.mdio: davinci mdio revision 9.7, bus freq 1000000
    [    2.276619] davinci_mdio 46000f00.mdio: phy[0]: device 46000f00.mdio:00, driver TI DP83822
    [    2.284935] am65-cpsw-nuss 46000000.ethernet: initializing am65 cpsw nuss version 0x6BA02102, cpsw version 0x6BA82102 Ports: 2 quirks:00000000
    [    2.287967] am65-cpsw-nuss 46000000.ethernet: initialized cpsw ale version 1.4
    [    2.287972] am65-cpsw-nuss 46000000.ethernet: ALE Table size 64
    [    2.320505] am65-cpsw-nuss 46000000.ethernet: set new flow-id-base 48
    [    7.069612] am65-cpsw-nuss 46000000.ethernet ethercat: renamed from eth0
    [   11.832200] am65-cpsw-nuss 46000000.ethernet ethercat: PHY [46000f00.mdio:00] driver [TI DP83822] (irq=POLL)
    [   11.841805] am65-cpsw-nuss 46000000.ethernet ethercat: configuring for phy/rmii link mode
    [   13.920388] am65-cpsw-nuss 46000000.ethernet ethercat: Link is Up - 100Mbps/Full - flow control off
    [   14.883914] am65-cpsw-nuss 46000000.ethernet ethercat: entered promiscuous mode

    CPSWnG:

    [    1.534356] davinci_mdio c000f00.mdio: Configuring MDIO in manual mode
    [    1.695804] davinci_mdio c000f00.mdio: davinci mdio revision 9.7, bus freq 1000000
    [    1.883913] davinci_mdio c000f00.mdio: phy[0]: device c000f00.mdio:00, driver TI DP83867
    [    1.883948] am65-cpsw-nuss c000000.ethernet: initializing am65 cpsw nuss version 0x6BA02102, cpsw version 0x6BA82102 Ports: 5 quirks:00000000
    [    1.912011] am65-cpsw-nuss c000000.ethernet: initialized cpsw ale version 1.4
    [    1.912016] am65-cpsw-nuss c000000.ethernet: ALE Table size 512
    [    2.332718] davinci_mdio c000f00.mdio: Configuring MDIO in manual mode
    [    2.368804] davinci_mdio c000f00.mdio: davinci mdio revision 9.7, bus freq 1000000
    [    2.378959] davinci_mdio c000f00.mdio: phy[0]: device c000f00.mdio:00, driver TI DP83867
    [    2.387081] am65-cpsw-nuss c000000.ethernet: initializing am65 cpsw nuss version 0x6BA02102, cpsw version 0x6BA82102 Ports: 5 quirks:00000000
    [    2.400189] am65-cpsw-nuss c000000.ethernet: initialized cpsw ale version 1.4
    [    2.404802] am65-cpsw-nuss c000000.ethernet: ALE Table size 512
    [    2.423088] am65-cpsw-nuss c000000.ethernet: set new flow-id-base 60
    [    6.961344] am65-cpsw-nuss c000000.ethernet PC: renamed from eth2
    [    7.025114] am65-cpsw-nuss c000000.ethernet ethernet: renamed from eth1
    [   11.926422] am65-cpsw-nuss c000000.ethernet ethernet: PHY [c000f00.mdio:00] driver [TI DP83867] (irq=POLL)
    [   11.939825] am65-cpsw-nuss c000000.ethernet ethernet: configuring for phy/rgmii-rxid link mode
    [   11.974333] am65-cpsw-nuss c000000.ethernet PC: configuring for fixed/rgmii link mode
    [   11.983472] am65-cpsw-nuss c000000.ethernet PC: Link is Up - 1Gbps/Full - flow control off
    [   16.033311] am65-cpsw-nuss c000000.ethernet ethernet: Link is Up - 1Gbps/Full - flow control rx/tx

    Regards,

    mizutani

  • Hi,

    Let me check the configuration files and logs and get back to you soon.

    Best Regards,
    Sudheer

  • Hi,

    Systemd files are very simple and independent for each network interface.

    Let us check on our side with Linux-RT, and we will update you early next week.

    Best Regards,
    Sudheer

  • Hi,

    Do you have any update on this issue?

    Regards,

    mizutani

  • Hi, 

    Sorry for the delay. Due to other high-priority activities, I have not been able to check this.

    I will check this week and update you as early as possible.

    Best Regards,

    Sudheer

  • Hi,

    I apologize for the repeated follow-up, but could you please provide an update on this matter?

    Regards,

  • Hi,

    Sorry for the delay.

    I have tested the RT-Linux SDK with both CPSW2G and CPSW9G enabled from Linux. During the testing, I did not observe any traffic interruption on CPSW2G when disconnecting and reconnecting the cable to CPSW9G.

    I ran iperf on the CPSW2G interface and, while it was running, unplugged and reconnected the cable on the CPSW9G port.

    Best Regards,
    Sudheer

  • Hi

    In our evaluation environment, EtherCAT communication runs with a 1 ms cycle, and frame loss occurs when a cable is disconnected or reconnected.

    Looking at the kernel trace, the blocking time is only 3ms to 5ms, and then communication resumes.

    I have tested the RT-Linux SDK with both CPSW2G and CPSW9G enabled from Linux. During the testing, I did not observe any traffic interruption on CPSW2G when disconnecting and reconnecting the cable to CPSW9G.

    When you say no traffic interruptions, does that mean that even this slight blockage doesn't occur?

    Regards,
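
As a rough illustration of what a 3 ms to 5 ms stall means for a 1 ms cycle, the sketch below scans a list of cycle timestamps and reports gaps that are clearly longer than one cycle. It assumes timestamps in microseconds; the function name, the 50% tolerance, and the sample data are hypothetical and not taken from the attached trace.

```python
CYCLE_US = 1000  # 1 ms cycle, as described in the post above


def find_stalls(timestamps_us, cycle_us=CYCLE_US, tolerance=0.5):
    """Return (start_us, gap_us, missed_cycles) for gaps longer than one cycle."""
    stalls = []
    for prev, cur in zip(timestamps_us, timestamps_us[1:]):
        gap = cur - prev
        # A gap counts as a stall once it exceeds the cycle by the tolerance.
        if gap > cycle_us * (1 + tolerance):
            missed = round(gap / cycle_us) - 1
            stalls.append((prev, gap, missed))
    return stalls


# Hypothetical cycle timestamps in microseconds with one 4 ms gap.
SAMPLE_US = [0, 1000, 2000, 6000, 7000]

stalls = find_stalls(SAMPLE_US)
for start, gap, missed in stalls:
    print(f"stall after t={start} us: gap {gap} us, about {missed} missed cycles")
```

A throughput test such as iperf averages over a much longer window, so a stall of a few cycles like this one may not be visible in its results.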

  • Hi

    I have tested the RT-Linux SDK with both CPSW2G and CPSW9G enabled from Linux. During the testing, I did not observe any traffic interruption on CPSW2G when disconnecting and reconnecting the cable to CPSW9G.

    When you say no traffic interruptions, does that mean that even this slight blockage doesn't occur?

    I haven't observed any drop in iperf data.

    Looking at the kernel trace, the blocking time is only 3ms to 5ms, and then communication resumes.

    As the delay could be very small, it may not show up in my iperf data.
    Can you share how the tracing was done? We will check on our side.

    Best Regards,
    Sudheer

  • Hi

    After executing the following command, I unplugged and plugged the LAN cable.

    ~# trace-cmd record --file-version 6 -e sched_switch -o /tmp/sched_switch.dat

    As explained before, this phenomenon does not always occur when the cable is unplugged or plugged in. It occurs once every few times.
    We determine the occurrence of the phenomenon by an EtherCAT communication error and stop the kernel trace.

    The intervals between unplugging and plugging in the cable are as follows.
    1. Unplug the cable
    2. "kernel: am65-cpsw-nuss c000000.ethernet ethernet: Link is Down" is output
    3. Plug in the cable
    4. "kernel: am65-cpsw-nuss c000000.ethernet ethernet: Link is Up" is output

    Regards,
    mizutani

  • Hi,

    As explained before, this phenomenon does not always occur when the cable is unplugged or plugged in. It occurs once every few times.
    We determine the occurrence of the phenomenon by an EtherCAT communication error and stop the kernel trace.

    Do you mean the issue is due to an error in EtherCAT communication, and not because of the cable unplug and reconnect?

    Best Regards,
    Sudheer

  • Hi

    The problem is that an EtherCAT communication error occurs when the cable is unplugged and plugged back in.

    As explained before, this phenomenon does not always occur when the cable is unplugged or plugged in. It occurs once every few times.
    We determine the occurrence of the phenomenon by an EtherCAT communication error and stop the kernel trace.

    Do you mean the issue is due to an error in EtherCAT communication, and not because of the cable unplug and reconnect?

    No. That sentence only describes when we stop the kernel trace with Ctrl+C.

    Regards,
    mizutani

  • Hi,

    The problem is that an EtherCAT communication error occurs when the cable is unplugged and plugged back in.

    This is an application, right? Are you trying to connect to the CPSW9G interface?
    Are the PHYs used for CPSW2G and CPSW9G different? Are they handled via independent MDIO interfaces?

    Best Regards,
    Sudheer

  • Hi

    This is an application, right? Are you trying to connect to the CPSW9G interface?

    CPSW2g is used for EtherCAT communication, and CPSWng performs general-purpose IP communication.

    During CPSW2g communication, if you disconnect and connect the cable in CPSWng, a frame loss may occur in EtherCAT communication.

    The process for EtherCAT communication and the process for Ethernet communication are independent.

    Are the PHYs used for CPSW2G and CPSW9G different? Are they handled via independent MDIO interfaces?

    CPSW2g uses DP83822 as PHY, and CPSWng uses DP83867 as PHY.

    CPSW2g uses MCU_MDIO0, and CPSWng uses MDIO0.

    Regards,
    mizutani

  • Hi,

    Thank you for sharing the details. We understand that CPSW2G and CPSWnG are running independent use cases and that the MDIO interfaces used for the PHYs are also different.

    Let me check internally about this.

    Best Regards,
    Sudheer

  • Hello Mizutani,

    Can you try adding:
    RequiredForOnline=no
    within the [Link] section of:
    20-eth1.link
    30-eth2.link

    It might be possible that systemd-networkd is considering the network to be unreachable momentarily and therefore stopping the traffic from eth0.

    Regards,
    Siddharth.

  • Hi,

    I added the parameters you gave me, but the problem still occurs.

    Is there anything else I can try?

    Regards,
    mizutani

  • Also, the CPSW2G interrupt is assigned to CPU1 via smp_affinity.

    Have you ensured that all other interrupts have their affinity set to CPU0? Also, what is the value of effective_affinity? Can you please share the output of:
    for j in $(ls /proc/irq); do echo "${j}: $(cat /proc/irq/${j}/effective_affinity)"; done;

    Regards,
    Siddharth.

  • Hi,

    The interrupt information is given below.

    ~# cat /proc/interrupts
               CPU0       CPU1
     11:      95965     290886     GICv3  30 Level     arch_timer
     14:       2032          0     GICv3  69 Level     32c00000.mailbox thr_011
     19:      10990          0     GICv3  35 Level     mmc0
     33:        297          0     GICv3 928 Level     42120000.i2c
     34:     312441          0     GICv3 232 Level     2000000.i2c
     35:          0      75104  MSI-INTA 15401056 Level     46000000.ethernet-tx0
     44:          0     144746  MSI-INTA 15401065 Level     46000000.ethernet
     45:          0          0  MSI-INTA 15401066 Level     285c0000.dma-controller chan0
     46:          0          0  MSI-INTA 15401067 Level     285c0000.dma-controller chan1
     47:          0          0  MSI-INTA 15401068 Level     285c0000.dma-controller chan2
     67:       1024          0  MSI-INTA 13828216 Level     31150000.dma-controller chan0
     68:          2          0  MSI-INTA 13828217 Level     c000000.ethernet-tx0
     69:          6          0  MSI-INTA 13828218 Level     c000000.ethernet-tx1
     70:          2          0  MSI-INTA 13828219 Level     c000000.ethernet-tx2
     71:          5          0  MSI-INTA 13828220 Level     c000000.ethernet-tx3
     72:         11          0  MSI-INTA 13828221 Level     c000000.ethernet-tx4
     73:          0          0  MSI-INTA 13828222 Level     c000000.ethernet-tx5
     74:          9          0  MSI-INTA 13828223 Level     c000000.ethernet-tx6
     75:         32          0  MSI-INTA 13828224 Level     c000000.ethernet-tx7
     77:        171          0  MSI-INTA 13828226 Level     c000000.ethernet
     78:          0          0  MSI-INTA 13828227 Level     31150000.dma-controller chan1
    267:          0          0     GICv3 878 Level     40a00000.serial
    268:        747          0     GICv3 224 Level     2800000.serial
    269:          0          0     GICv3 225 Level     2810000.serial
    270:          0          0     GICv3 872 Level     47040000.spi
    361:          0          0      GPIO  84 Edge    -davinci_gpio  tps6594-0-0x48, tps6594-2-0x4c
    368:          0          0      GPIO   1 Edge    -davinci_gpio  powersignal
    369:          0          0      GPIO   2 Edge    -davinci_gpio  powersignal
    370:         39          0      GPIO   3 Edge    -davinci_gpio  degitalinput
    371:         38          0      GPIO   4 Edge    -davinci_gpio  degitalinput
    425:          0          0      GPIO  58 Edge    -davinci_gpio  pwm-fan
    522:          0          0  tps6594-0-0x48  88 Edge      alarm
    662:          0          0  MSI-INTA 13893765 Edge      31150000.dma-controller chan1
    704:          0          0     GICv3  36 Level     mmc1
    705:          0          0     GICv3 892 Level     TI-am335x-adc.10.auto
    709:          0          0     GICv3 128 Level     xhci-hcd:usb1
    711:          0          0     GICv3 152 Level     6000000.usb
    IPI0:       439     156065       Rescheduling interrupts
    IPI1:         0        726       Function call interrupts
    IPI2:         0          0       CPU stop interrupts
    IPI3:         0          0       CPU stop (for crash dump) interrupts
    IPI4:         0          0       Timer broadcast interrupts
    IPI5:       137     135316       IRQ work interrupts
    IPI6:         0

    Have you ensured that all other interrupts have their affinity set to CPU0? Also, what is the value of effective_affinity? Can you please share the output of:
    for j in $(ls /proc/irq); do echo "${j}: $(cat /proc/irq/${j}/effective_affinity)"; done;


    ~# for j in $(ls /proc/irq); do echo "${j}: $(cat /proc/irq/${j}/effective_affinity)"; done;
    1: 0
    10: 0
    11: 0
    12: 0
    13: 0
    14: 1
    19: 1
    2: 0
    267: 0
    268: 1
    269: 0
    270: 1
    3: 0
    33: 1
    34: 1
    35: 2
    36: 2
    361: 0
    368: 0
    369: 0
    37: 1
    370: 0
    371: 0
    38: 2
    39: 1
    4: 0
    40: 2
    41: 1
    42: 2
    425: 0
    44: 2
    45: 1
    46: 1
    47: 1
    5: 0
    522: 0
    6: 0
    662: 1
    67: 1
    68: 1
    69: 1
    7: 0
    70: 1
    704: 1
    705: 1
    709: 1
    71: 1
    711: 1
    72: 1
    73: 1
    74: 1
    75: 1
    77: 1
    78: 1
    8: 0
    9: 0

    Regards,
    mizutani
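
For readers following along: `effective_affinity` values are hexadecimal CPU bitmasks, so `1` selects CPU0, `2` selects CPU1, and `0` means no CPU is currently selected. The Python sketch below decodes a few of the values listed above; the helper name is made up for this example. Under this reading, IRQs 35 and 44 (`46000000.ethernet-tx0` and `46000000.ethernet` in the `/proc/interrupts` output) are routed to CPU1, while the `c000000.ethernet` IRQs (68 to 75 and 77) are routed to CPU0.

```python
def cpus_from_mask(mask_hex):
    """Decode a hex affinity mask (as in /proc/irq/N/effective_affinity)."""
    # Wide masks are printed in comma-separated 32-bit groups; drop the commas.
    mask = int(mask_hex.replace(",", ""), 16)
    return [cpu for cpu in range(mask.bit_length()) if (mask >> cpu) & 1]


# IRQ number -> effective_affinity value, copied from the output above.
SAMPLE = {"35": "2", "44": "2", "68": "1", "77": "1", "361": "0"}

for irq, mask in SAMPLE.items():
    cpus = cpus_from_mask(mask)
    print(f"IRQ {irq}: CPUs {cpus if cpus else 'none'}")
```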

  • When the issue occurs, the EtherCAT task is not running on CPU1, but there seems to be another process - possibly systemd-networkd - that is running on CPU1.

    To set the affinity of systemd-networkd to CPU0:
    1. Run the following command:
    systemctl edit systemd-networkd
    2. Add the following at the end of the file:
    CPUAffinity=0
    3. Save the file and then run:
    systemctl restart networkd-dispatcher.service
    4. Try to recreate the problem

    Regards,
    Siddharth.
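
One hedged way to confirm that the `CPUAffinity=` setting took effect is to look at the `Cpus_allowed_list` field in `/proc/<pid>/status` for the running systemd-networkd process. The Python sketch below parses that field from a status text; the sample text is hypothetical, and the helper name is made up for this example.

```python
def allowed_cpus(status_text):
    """Return the set of CPUs in Cpus_allowed_list, or None if absent."""
    for line in status_text.splitlines():
        if line.startswith("Cpus_allowed_list:"):
            spec = line.split(":", 1)[1].strip()
            cpus = set()
            for part in spec.split(","):
                first, _, last = part.partition("-")
                cpus.update(range(int(first), int(last or first) + 1))
            return cpus
    return None


# Hypothetical excerpt of /proc/<pid>/status for a process pinned to CPU0.
SAMPLE_STATUS = "Name:\tsystemd-network\nCpus_allowed:\t1\nCpus_allowed_list:\t0\n"
ISOLATED_CPU = 1  # CPU reserved for the EtherCAT application in this thread

cpus = allowed_cpus(SAMPLE_STATUS)
print("allowed CPUs:", sorted(cpus))
print("may run on isolated CPU:", ISOLATED_CPU in cpus)
```

On the target, the text would come from reading `/proc/<pid>/status`, with the PID taken from, for example, `systemctl show -p MainPID systemd-networkd`.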

  • Hi,

    I added "CPUAffinity=0" to the [Service] section as follows:

    ~# systemctl cat systemd-networkd
    # /usr/lib/systemd/system/systemd-networkd.service
    #  SPDX-License-Identifier: LGPL-2.1-or-later
    #
    #  This file is part of systemd.
    #
    #  systemd is free software; you can redistribute it and/or modify it
    #  under the terms of the GNU Lesser General Public License as published by
    #  the Free Software Foundation; either version 2.1 of the License, or
    #  (at your option) any later version.
    
    [Unit]
    Description=Network Configuration
    Documentation=man:systemd-networkd.service(8)
    Documentation=man:org.freedesktop.network1(5)
    ConditionCapability=CAP_NET_ADMIN
    DefaultDependencies=no
    # systemd-udevd.service can be dropped once tuntap is moved to netlink
    After=systemd-networkd.socket systemd-udevd.service network-pre.target systemd-sysusers.service systemd-sysctl.service
    Before=network.target multi-user.target shutdown.target initrd-switch-root.target
    Conflicts=shutdown.target initrd-switch-root.target
    Wants=systemd-networkd.socket network.target
    
    [Service]
    AmbientCapabilities=CAP_NET_ADMIN CAP_NET_BIND_SERVICE CAP_NET_BROADCAST CAP_NET_RAW
    BusName=org.freedesktop.network1
    CapabilityBoundingSet=CAP_NET_ADMIN CAP_NET_BIND_SERVICE CAP_NET_BROADCAST CAP_NET_RAW
    DeviceAllow=char-* rw
    ExecStart=!!/usr/lib/systemd/systemd-networkd
    FileDescriptorStoreMax=512
    LockPersonality=yes
    MemoryDenyWriteExecute=yes
    NoNewPrivileges=yes
    ProtectProc=invisible
    ProtectClock=yes
    ProtectControlGroups=yes
    ProtectHome=yes
    ProtectKernelLogs=yes
    ProtectKernelModules=yes
    ProtectSystem=strict
    Restart=on-failure
    RestartKillSignal=SIGUSR2
    RestartSec=0
    RestrictAddressFamilies=AF_UNIX AF_NETLINK AF_INET AF_INET6 AF_PACKET
    RestrictNamespaces=yes
    RestrictRealtime=yes
    RestrictSUIDSGID=yes
    RuntimeDirectory=systemd/netif
    RuntimeDirectoryPreserve=yes
    SystemCallArchitectures=native
    SystemCallErrorNumber=EPERM
    SystemCallFilter=@system-service
    Type=notify-reload
    User=systemd-network
    WatchdogSec=3min
    CPUAffinity=0
    
    [Install]
    WantedBy=multi-user.target
    Also=systemd-networkd.socket
    Alias=dbus-org.freedesktop.network1.service
    
    # The output from this generator is used by udevd and networkd. Enable it by
    # default when enabling systemd-networkd.service.
    Also=systemd-network-generator.service
    
    # We want to enable systemd-networkd-wait-online.service whenever this service
    # is enabled. systemd-networkd-wait-online.service has
    # WantedBy=network-online.target, so enabling it only has an effect if
    # network-online.target itself is enabled or pulled in by some other unit.
    Also=systemd-networkd-wait-online.service

    When I run "systemctl restart networkd-dispatcher.service", I get an error saying that the service does not exist.

    ~# systemctl restart networkd-dispatcher.service
    Failed to restart networkd-dispatcher.service: Unit networkd-dispatcher.service not found.

    Therefore, after adding the above parameter, I power-cycled the system, but the problem still occurs.

    I have attached the kernel trace data from when the problem occurred, so please check it.

    sched_switch_edit_systemd-networkd.dat

    Regards,
    mizutani

  • Hello,

    When I run "systemctl restart networkd-dispatcher.service", I get an error saying that the service does not exist.

    Can you share the output of:
    systemctl
    to identify the entire list of services running on your device? networkd-dispatcher.service might not have been the correct service and we need to modify the service-file for its equivalent service for the CPUAffinity parameter to take effect.

    Regards,
    Siddharth.

  • Hi,

    The output of the systemctl command is shown below.

    ~# systemctl
      UNIT                                                                                                      LOAD   ACTIVE SUB       DESCRIPTION
      sys-devices-platform-bus\x40100000-2800000.serial-tty-ttyS2.device                                        loaded active plugged   /sys/devices/platform/bus@100000/2800000.serial/tty/ttyS2
      sys-devices-platform-bus\x40100000-2810000.serial-tty-ttyS3.device                                        loaded active plugged   /sys/devices/platform/bus@100000/2810000.serial/tty/ttyS3
      sys-devices-platform-bus\x40100000-4f80000.mmc-mmc_host-mmc0-mmc0:0001-block-mmcblk0-mmcblk0boot0.device  loaded active plugged   /sys/devices/platform/bus@100000/4f80000.mmc/mmc_host/mmc0/mmc0:0001/block/mmcblk0/mmcblk0boot0
      sys-devices-platform-bus\x40100000-4f80000.mmc-mmc_host-mmc0-mmc0:0001-block-mmcblk0-mmcblk0boot1.device  loaded active plugged   /sys/devices/platform/bus@100000/4f80000.mmc/mmc_host/mmc0/mmc0:0001/block/mmcblk0/mmcblk0boot1
      sys-devices-platform-bus\x40100000-4f80000.mmc-mmc_host-mmc0-mmc0:0001-block-mmcblk0-mmcblk0p1.device     loaded active plugged   /sys/devices/platform/bus@100000/4f80000.mmc/mmc_host/mmc0/mmc0:0001/block/mmcblk0/mmcblk0p1
      sys-devices-platform-bus\x40100000-4f80000.mmc-mmc_host-mmc0-mmc0:0001-block-mmcblk0-mmcblk0p2.device     loaded active plugged   /sys/devices/platform/bus@100000/4f80000.mmc/mmc_host/mmc0/mmc0:0001/block/mmcblk0/mmcblk0p2
      sys-devices-platform-bus\x40100000-4f80000.mmc-mmc_host-mmc0-mmc0:0001-block-mmcblk0-mmcblk0p3.device     loaded active plugged   /sys/devices/platform/bus@100000/4f80000.mmc/mmc_host/mmc0/mmc0:0001/block/mmcblk0/mmcblk0p3
      sys-devices-platform-bus\x40100000-4f80000.mmc-mmc_host-mmc0-mmc0:0001-block-mmcblk0-mmcblk0p4.device     loaded active plugged   /sys/devices/platform/bus@100000/4f80000.mmc/mmc_host/mmc0/mmc0:0001/block/mmcblk0/mmcblk0p4
      sys-devices-platform-bus\x40100000-4f80000.mmc-mmc_host-mmc0-mmc0:0001-block-mmcblk0-mmcblk0p5.device     loaded active plugged   /sys/devices/platform/bus@100000/4f80000.mmc/mmc_host/mmc0/mmc0:0001/block/mmcblk0/mmcblk0p5
      sys-devices-platform-bus\x40100000-4f80000.mmc-mmc_host-mmc0-mmc0:0001-block-mmcblk0-mmcblk0p6.device     loaded active plugged   /sys/devices/platform/bus@100000/4f80000.mmc/mmc_host/mmc0/mmc0:0001/block/mmcblk0/mmcblk0p6
      sys-devices-platform-bus\x40100000-4f80000.mmc-mmc_host-mmc0-mmc0:0001-block-mmcblk0-mmcblk0p7.device     loaded active plugged   /sys/devices/platform/bus@100000/4f80000.mmc/mmc_host/mmc0/mmc0:0001/block/mmcblk0/mmcblk0p7
      sys-devices-platform-bus\x40100000-4f80000.mmc-mmc_host-mmc0-mmc0:0001-block-mmcblk0-mmcblk0p8.device     loaded active plugged   /sys/devices/platform/bus@100000/4f80000.mmc/mmc_host/mmc0/mmc0:0001/block/mmcblk0/mmcblk0p8
      sys-devices-platform-bus\x40100000-4f80000.mmc-mmc_host-mmc0-mmc0:0001-block-mmcblk0-mmcblk0p9.device     loaded active plugged   /sys/devices/platform/bus@100000/4f80000.mmc/mmc_host/mmc0/mmc0:0001/block/mmcblk0/mmcblk0p9
      sys-devices-platform-bus\x40100000-4f80000.mmc-mmc_host-mmc0-mmc0:0001-block-mmcblk0.device               loaded active plugged   /sys/devices/platform/bus@100000/4f80000.mmc/mmc_host/mmc0/mmc0:0001/block/mmcblk0
      sys-devices-platform-bus\x40100000-bus\x40100000:bus\x4028380000-40a00000.serial-tty-ttyS1.device         loaded active plugged   /sys/devices/platform/bus@100000/bus@100000:bus@28380000/40a00000.serial/tty/ttyS1
      sys-devices-platform-bus\x40100000-bus\x40100000:bus\x4028380000-46000000.ethernet-net-ethercat.device    loaded active plugged   /sys/devices/platform/bus@100000/bus@100000:bus@28380000/46000000.ethernet/net/ethercat
      sys-devices-platform-bus\x40100000-bus\x40100000:bus\x4028380000-47000000.bus-47040000.spi-spi_master-... loaded active plugged   /sys/devices/platform/bus@100000/bus@100000:bus@28380000/47000000.bus/47040000.spi/spi_master/spi0/spi...
      sys-devices-platform-bus\x40100000-c000000.ethernet-net-ethernet.device                                   loaded active plugged   /sys/devices/platform/bus@100000/c000000.ethernet/net/ethernet
      sys-devices-platform-bus\x40100000-c000000.ethernet-net-PC.device                                         loaded active plugged   /sys/devices/platform/bus@100000/c000000.ethernet/net/PC
      sys-devices-platform-serial8250-tty-ttyS0.device                                                          loaded active plugged   /sys/devices/platform/serial8250/tty/ttyS0
      sys-devices-platform-serial8250-tty-ttyS10.device                                                         loaded active plugged   /sys/devices/platform/serial8250/tty/ttyS10
      sys-devices-platform-serial8250-tty-ttyS11.device                                                         loaded active plugged   /sys/devices/platform/serial8250/tty/ttyS11
      sys-devices-platform-serial8250-tty-ttyS4.device                                                          loaded active plugged   /sys/devices/platform/serial8250/tty/ttyS4
      sys-devices-platform-serial8250-tty-ttyS5.device                                                          loaded active plugged   /sys/devices/platform/serial8250/tty/ttyS5
      sys-devices-platform-serial8250-tty-ttyS6.device                                                          loaded active plugged   /sys/devices/platform/serial8250/tty/ttyS6
      sys-devices-platform-serial8250-tty-ttyS7.device                                                          loaded active plugged   /sys/devices/platform/serial8250/tty/ttyS7
      sys-devices-platform-serial8250-tty-ttyS8.device                                                          loaded active plugged   /sys/devices/platform/serial8250/tty/ttyS8
      sys-devices-platform-serial8250-tty-ttyS9.device                                                          loaded active plugged   /sys/devices/platform/serial8250/tty/ttyS9
      sys-devices-virtual-misc-rfkill.device                                                                    loaded active plugged   /sys/devices/virtual/misc/rfkill
      sys-devices-virtual-net-tap0.device                                                                       loaded active plugged   /sys/devices/virtual/net/tap0
      sys-devices-virtual-tty-ttyp0.device                                                                      loaded active plugged   /sys/devices/virtual/tty/ttyp0
      sys-devices-virtual-tty-ttyp1.device                                                                      loaded active plugged   /sys/devices/virtual/tty/ttyp1
      sys-devices-virtual-tty-ttyp2.device                                                                      loaded active plugged   /sys/devices/virtual/tty/ttyp2
      sys-devices-virtual-tty-ttyp3.device                                                                      loaded active plugged   /sys/devices/virtual/tty/ttyp3
      sys-devices-virtual-tty-ttyp4.device                                                                      loaded active plugged   /sys/devices/virtual/tty/ttyp4
      sys-devices-virtual-tty-ttyp5.device                                                                      loaded active plugged   /sys/devices/virtual/tty/ttyp5
      sys-devices-virtual-tty-ttyp6.device                                                                      loaded active plugged   /sys/devices/virtual/tty/ttyp6
      sys-devices-virtual-tty-ttyp7.device                                                                      loaded active plugged   /sys/devices/virtual/tty/ttyp7
      sys-devices-virtual-tty-ttyp8.device                                                                      loaded active plugged   /sys/devices/virtual/tty/ttyp8
      sys-devices-virtual-tty-ttyp9.device                                                                      loaded active plugged   /sys/devices/virtual/tty/ttyp9
      sys-devices-virtual-tty-ttypa.device                                                                      loaded active plugged   /sys/devices/virtual/tty/ttypa
      sys-devices-virtual-tty-ttypb.device                                                                      loaded active plugged   /sys/devices/virtual/tty/ttypb
      sys-devices-virtual-tty-ttypc.device                                                                      loaded active plugged   /sys/devices/virtual/tty/ttypc
      sys-devices-virtual-tty-ttypd.device                                                                      loaded active plugged   /sys/devices/virtual/tty/ttypd
      sys-devices-virtual-tty-ttype.device                                                                      loaded active plugged   /sys/devices/virtual/tty/ttype
      sys-devices-virtual-tty-ttypf.device                                                                      loaded active plugged   /sys/devices/virtual/tty/ttypf
      sys-module-configfs.device                                                                                loaded active plugged   /sys/module/configfs
      sys-module-fuse.device                                                                                    loaded active plugged   /sys/module/fuse
      sys-subsystem-net-devices-ethercat.device                                                                 loaded active plugged   /sys/subsystem/net/devices/ethercat
      sys-subsystem-net-devices-ethernet.device                                                                 loaded active plugged   /sys/subsystem/net/devices/ethernet
      sys-subsystem-net-devices-PC.device                                                                       loaded active plugged   /sys/subsystem/net/devices/PC
      sys-subsystem-net-devices-tap0.device                                                                     loaded active plugged   /sys/subsystem/net/devices/tap0
      -.mount                                                                                                   loaded active mounted   Root Mount
      dev-hugepages.mount                                                                                       loaded active mounted   Huge Pages File System
      dev-mqueue.mount                                                                                          loaded active mounted   POSIX Message Queue File System
      opt-sys.mount                                                                                             loaded active mounted   /opt/sys
      opt-usr-tmp.mount                                                                                         loaded active mounted   /opt/usr/tmp
      opt-usr.mount                                                                                             loaded active mounted   /opt/usr
      run-user-0.mount                                                                                          loaded active mounted   /run/user/0
      sys-fs-fuse-connections.mount                                                                             loaded active mounted   FUSE Control File System
      sys-kernel-config.mount                                                                                   loaded active mounted   Kernel Configuration File System
      sys-kernel-debug.mount                                                                                    loaded active mounted   Kernel Debug File System
      sys-kernel-tracing.mount                                                                                  loaded active mounted   Kernel Trace File System
      tmp.mount                                                                                                 loaded active mounted   Temporary Directory /tmp
      var-cache.mount                                                                                           loaded active mounted   /var/cache
      var-lib.mount                                                                                             loaded active mounted   /var/lib
      var-log.mount                                                                                             loaded active mounted   /var/log
      var-spool.mount                                                                                           loaded active mounted   /var/spool
      var-volatile.mount                                                                                        loaded active mounted   /var/volatile
      systemd-ask-password-console.path                                                                         loaded active waiting   Dispatch Password Requests to Console Directory Watch
      systemd-ask-password-wall.path                                                                            loaded active waiting   Forward Password Requests to Wall Directory Watch
      init.scope                                                                                                loaded active running   System and Service Manager
      session-c1.scope                                                                                          loaded active running   Session c1 of User root
      avahi-daemon.service                                                                                      loaded active running   Avahi mDNS/DNS-SD Stack
      codesys.service                                                                                           loaded active running   Codesys Runtime Service
      dbus-broker.service                                                                                       loaded active running   D-Bus System Message Bus
      ethercat-settings.service                                                                                 loaded active exited    Configure the EtherCAT interface
      iptables.service                                                                                          loaded active exited    IPv4 Packet Filtering Framework
      kmod-static-nodes.service                                                                                 loaded active exited    Create List of Static Device Nodes
      serial-getty@ttyS2.service                                                                                loaded active running   Serial Getty on ttyS2
      status-monitor-ctrl.service                                                                               loaded active running   Status Monitor Contorol Service
      systemd-fsck-root.service                                                                                 loaded active exited    File System Check on Root Device
      systemd-journal-flush.service                                                                             loaded active exited    Flush Journal to Persistent Storage
      systemd-journald.service                                                                                  loaded active running   Journal Service
      systemd-logind.service                                                                                    loaded active running   User Login Management
      systemd-modules-load.service                                                                              loaded active exited    Load Kernel Modules
      systemd-network-generator.service                                                                         loaded active exited    Generate network units from Kernel command line
      systemd-networkd.service                                                                                  loaded active running   Network Configuration
      systemd-random-seed.service                                                                               loaded active exited    Load/Save OS Random Seed
      systemd-remount-fs.service                                                                                loaded active exited    Remount Root and Kernel File Systems
      systemd-resolved.service                                                                                  loaded active running   Network Name Resolution
      systemd-sysctl.service                                                                                    loaded active exited    Apply Kernel Variables
      systemd-tmpfiles-setup-dev-early.service                                                                  loaded active exited    Create Static Device Nodes in /dev gracefully
      systemd-tmpfiles-setup-dev.service                                                                        loaded active exited    Create Static Device Nodes in /dev
      systemd-tmpfiles-setup.service                                                                            loaded active exited    Create Volatile Files and Directories
      systemd-udev-trigger.service                                                                              loaded active exited    Coldplug All udev Devices
      systemd-udevd.service                                                                                     loaded active running   Rule-based Manager for Device Events and Files
      systemd-update-utmp.service                                                                               loaded active exited    Record System Boot/Shutdown in UTMP
      systemd-user-sessions.service                                                                             loaded active exited    Permit User Sessions
      systemd-userdbd.service                                                                                   loaded active running   User Database Manager
      systemd-vconsole-setup.service                                                                            loaded active exited    Virtual Console Setup
      user-runtime-dir@0.service                                                                                loaded active exited    User Runtime Directory /run/user/0
      user@0.service                                                                                            loaded active running   User Manager for UID 0
      var-volatile-cache.service                                                                                loaded active exited    Bind mount volatile /var/cache
      var-volatile-lib.service                                                                                  loaded active exited    Bind mount volatile /var/lib
      var-volatile-spool.service                                                                                loaded active exited    Bind mount volatile /var/spool
      -.slice                                                                                                   loaded active active    Root Slice
      system-modprobe.slice                                                                                     loaded active active    Slice /system/modprobe
      system-serial\x2dgetty.slice                                                                              loaded active active    Slice /system/serial-getty
      system.slice                                                                                              loaded active active    System Slice
      user-0.slice                                                                                              loaded active active    User Slice of UID 0
      user.slice                                                                                                loaded active active    User and Session Slice
      avahi-daemon.socket                                                                                       loaded active running   Avahi mDNS/DNS-SD Stack Activation Socket
      dbus.socket                                                                                               loaded active running   D-Bus System Message Bus Socket
      dropbear.socket                                                                                           loaded active listening dropbear.socket
      systemd-coredump.socket                                                                                   loaded active listening Process Core Dump Socket
      systemd-initctl.socket                                                                                    loaded active listening initctl Compatibility Named Pipe
      systemd-journald-audit.socket                                                                             loaded active running   Journal Audit Socket
      systemd-journald-dev-log.socket                                                                           loaded active running   Journal Socket (/dev/log)
      systemd-journald.socket                                                                                   loaded active running   Journal Socket
      systemd-networkd.socket                                                                                   loaded active running   Network Service Netlink Socket
      systemd-rfkill.socket                                                                                     loaded active listening Load/Save RF Kill Switch Status /dev/rfkill Watch
      systemd-udevd-control.socket                                                                              loaded active running   udev Control Socket
      systemd-udevd-kernel.socket                                                                               loaded active running   udev Kernel Socket
      systemd-userdbd.socket                                                                                    loaded active running   User Database Manager Socket
      basic.target                                                                                              loaded active active    Basic System
      getty.target                                                                                              loaded active active    Login Prompts
      local-fs-pre.target                                                                                       loaded active active    Preparation for Local File Systems
      local-fs.target                                                                                           loaded active active    Local File Systems
      multi-user.target                                                                                         loaded active active    Multi-User System
      network-pre.target                                                                                        loaded active active    Preparation for Network
      network.target                                                                                            loaded active active    Network
      nss-lookup.target                                                                                         loaded active active    Host and Network Name Lookups
      paths.target                                                                                              loaded active active    Path Units
      remote-fs.target                                                                                          loaded active active    Remote File Systems
      slices.target                                                                                             loaded active active    Slice Units
      sockets.target                                                                                            loaded active active    Socket Units
      swap.target                                                                                               loaded active active    Swaps
      sysinit.target                                                                                            loaded active active    System Initialization
      timers.target                                                                                             loaded active active    Timer Units
      systemd-tmpfiles-clean.timer                                                                              loaded active waiting   Daily Cleanup of Temporary Directories
    
    Legend: LOAD   -> Reflects whether the unit definition was properly loaded.
            ACTIVE -> The high-level unit activation state, i.e. generalization of SUB.
            SUB    -> The low-level unit activation state, values depend on unit type.
    
    141 loaded units listed. Pass --all to see loaded but inactive units, too.
    To show all installed unit files use 'systemctl list-unit-files'.

    Regards,
    mizutani

  • Hello,

    The correct service seems to be:
    systemd-networkd.service
    So the command to be run after updating the file corresponding to:
    systemctl edit systemd-networkd
    is
    systemctl restart systemd-networkd.service
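
    A minimal sketch (not part of the original reply) for confirming that the CPUAffinity setting actually took effect after the restart. It assumes the standard `Cpus_allowed_list` field in `/proc/<pid>/status`; the helper name `allowed_cpus` is hypothetical.

```shell
# allowed_cpus: print the Cpus_allowed_list value found in
# /proc/<pid>/status text supplied on stdin (hypothetical helper).
allowed_cpus() {
    sed -n 's/^Cpus_allowed_list:[[:space:]]*//p'
}

# Example on the target (assumes systemd-networkd.service is running):
#   pid=$(systemctl show -p MainPID --value systemd-networkd.service)
#   allowed_cpus < "/proc/$pid/status"
```

    If the setting was applied, the printed list should contain only CPU 0; any other value suggests the unit was not reloaded with the new parameter.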

    Can you please test the above, and in case the issue is still seen, please share the equivalent of the following visualization shared earlier:

    Could you also let me know which tool you are using to generate the above from the .dat file that was shared in your previous reply?

    Regards,
    Siddharth.

  • Hi,

    Editing systemd-networkd did not solve the problem.

    I am using KernelShark as a tool to visualize kernel traces.

    Regards,
    mizutani

  • In our system, the following kernel’s command-line parameters are set so that the EtherCAT application occupies CPU1.

    "nohz_full=1 isolcpus=1 rcu_nocbs=1 rcu_nocb_poll irqaffinity=0 nosoftlockup"
    Also CPSW2G interrupt is assigned to CPU1 by smp_affinity.
    With these configurations, we believe the EtherCAT application is isolated from potential interference from interfaces other than CPSW2G. Is our understanding accurate?

    While the EtherCAT application is running on CPU1 by isolating it, the CPSW2G driver is still running on CPU0 (apart from the TX Completion Interrupt handling which you have clarified as being assigned to CPU1). Therefore, events which occur on CPU0 including CPSW5G Link Down / Link Up events and other Systemd services, will affect the Linux Network Stack and the CPSW2G driver's ndo_xmit callback unless even they are bound to CPU1.
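
    As a sketch for double-checking the isolation settings quoted above (not part of the original exchange), the two hypothetical helpers below read a parameter from the kernel command line and build the hexadecimal bitmask that `/proc/irq/<N>/smp_affinity` expects. The IRQ number `<N>` is left as a placeholder; it would have to be looked up in `/proc/interrupts` on the target.

```shell
# cmdline_value NAME: print the value of NAME=... from kernel
# command-line text supplied on stdin (hypothetical helper).
cmdline_value() {
    tr ' ' '\n' | sed -n "s/^$1=//p"
}

# cpu_mask CPU...: print the hex bitmask with one bit set per CPU index,
# in the form written to /proc/irq/<N>/smp_affinity (hypothetical helper).
cpu_mask() {
    mask=0
    for cpu in "$@"; do
        mask=$(( mask | (1 << cpu) ))
    done
    printf '%x\n' "$mask"
}

# Example on the target (not run here; <N> is a placeholder IRQ number):
#   cmdline_value isolcpus < /proc/cmdline
#   cpu_mask 1 > /proc/irq/<N>/smp_affinity
```

    With `isolcpus=1`, the first helper prints `1`, and `cpu_mask 1` prints `2`, the mask that selects CPU1 only.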

    I installed KernelShark and was able to view the .dat file that you had shared in your previous reply. Could you let me know the process for identifying the sections in the Trace corresponding to the latency of 3ms?

    Regards,
    Siddharth.

  • Hi,

    Therefore, events which occur on CPU0 including CPSW5G Link Down / Link Up events and other Systemd services, will affect the Linux Network Stack and the CPSW2G driver's ndo_xmit callback unless even they are bound to CPU1.

    Regarding the above, could you please tell me if there is a way to have CPSW5G and CPSW2G processed by their own CPUs?

    I installed KernelShark and was able to view the .dat file that you had shared in your previous reply. Could you let me know the process for identifying the sections in the Trace corresponding to the latency of 3ms?

    This phenomenon occurs when systemd-networkd is processed.

    From the menu bar, select “Plots” -> “Tasks”, select “631 systemd-network” and “715 EtherCAT_Task”, and click “Apply” to display the processing timing of the two tasks.

    In the .dat file I sent, the issue occurred when the LAN cable was unplugged for the fifth time, so zoom in on the last systemd-network entry. To zoom, hold the left mouse button, drag the cursor across the region of interest, and release; the selected area is enlarged.
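
    As a text-based cross-check of the KernelShark steps above, here is a minimal sketch (not from the original thread) that scans `trace-cmd report` output for long gaps between consecutive events that mention a task. The function name `find_gaps` and the 2 ms threshold are illustrative, and the sketch assumes the usual report layout in which the timestamp is a field of the form `seconds.fraction:`.

```shell
# find_gaps TASK THRESHOLD_MS: read trace text on stdin and report every
# gap longer than THRESHOLD_MS between consecutive lines mentioning TASK.
find_gaps() {
    awk -v task="$1" -v thr="$2" '
        index($0, task) == 0 { next }          # skip lines without the task name
        {
            ts = ""
            for (i = 1; i <= NF; i++) {        # locate the "12345.678901:" field
                if ($i ~ /^[0-9]+\.[0-9]+:$/) {
                    ts = substr($i, 1, length($i) - 1)
                    break
                }
            }
            if (ts == "") next
            if (have && (ts - prev) * 1000 > thr)
                printf "%.3f ms gap ending at %s\n", (ts - prev) * 1000, ts
            prev = ts
            have = 1
        }'
}

# Example on a host with trace-cmd installed (not run here):
#   trace-cmd report -i sched_switch_edit_systemd-networkd.dat | find_gaps EtherCAT_Task 2
```

    Each reported timestamp can then be located in KernelShark to inspect what ran on both CPUs around the gap.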

    Regards,
    mizutani

  • Hello,

    Thank you for sharing the steps to identify the sections where the 3ms latency is observed. In the .dat file shared earlier:
    8883.sched_switch_edit_systemd-networkd.dat
    Even with all processes selected in the visualizer, there still seems to be a gap, i.e. no process shows up during the interval in which the EtherCAT_Task is delayed on CPU1. It is not clear to me whether certain processes are running on CPU1 without being captured in the trace, or whether the scheduler is failing to schedule the EtherCAT_Task within its deadline. Could you please share the section of the code in the EtherCAT_Task application which is responsible for scheduling itself every 1ms?

    Regards,
    Siddharth.

  • Hi,

    The EtherCAT control unit is third-party software, so I cannot share the source code.

    However, I did find something that concerns me.

    It was discovered that functions related to the Address Lookup Engine (ALE) were executed every 1 usec while plugging and unplugging the cable.

    I would like to confirm whether disabling the Address Lookup Engine (ALE) can prevent this issue. Could you please tell me how to disable it?

    Regards,
    mizutani

  • It was discovered that functions related to the Address Lookup Engine (ALE) were executed every 1 usec while plugging and unplugging the cable.

    The trace shows that they are executed on CPU0 whereas the EtherCAT_Task is being executed on CPU1. I don't see how they are related.

    The EtherCAT control unit is a third-party software, so I cannot share the source code.

    Is there a way to rule out the EtherCAT Software having a bug? How can we be certain that the manner in which it is being scheduled is unrelated to the Software? If not the entire source code, could you at least share the APIs used to schedule the task at 1 ms intervals?

    Regards,
    Siddharth.

  • Hi

    I previously received the following answer:

    While the EtherCAT application is running on CPU1 by isolating it, the CPSW2G driver is still running on CPU0 (apart from the TX Completion Interrupt handling which you have clarified as being assigned to CPU1).

    Could a high load on CPU0 therefore be affecting the CPSW2G driver's processing and causing the EtherCAT task to be blocked?

    It is a rather crude workaround, but the issue disappeared after I commented out the ALE processing part of the am65-cpsw-nuss driver:

    static void am65_cpsw_nuss_ndo_slave_set_rx_mode(struct net_device *ndev)
    {
    	struct am65_cpsw_common *common = am65_ndev_to_common(ndev);
    	struct am65_cpsw_port *port = am65_ndev_to_port(ndev);
    	u32 port_mask;
    	bool promisc;
    
    	promisc = !!(ndev->flags & IFF_PROMISC);
    	am65_cpsw_slave_set_promisc(port, promisc);
    
    	if (promisc)
    		return;
    
    	// /* Restore allmulti on vlans if necessary */
    	// cpsw_ale_set_allmulti(common->ale,
    	// 		      ndev->flags & IFF_ALLMULTI, port->port_id);
    
    	// port_mask = ALE_PORT_HOST;
    	// /* Clear all mcast from ALE */
    	// cpsw_ale_flush_multicast(common->ale, port_mask, -1);
    
    	// if (!netdev_mc_empty(ndev)) {
    	// 	struct netdev_hw_addr *ha;
    
    	// 	/* program multicast address list into ALE register */
    	// 	netdev_for_each_mc_addr(ha, ndev) {
    	// 		cpsw_ale_add_mcast(common->ale, ha->addr,
    	// 				   port_mask, 0, 0, 0);
    	// 	}
    	// }
    }

    I understand that this is not the right way to fix it. Could you please tell me how to optimize the ALE processing?

    Regards,
    mizutani

  • I understand that this is not the right way to fix it. Could you please tell me how to optimize the ALE processing?

    You could try to artificially reduce the number of ALE Entries using:

    diff --git a/drivers/net/ethernet/ti/cpsw_ale.c b/drivers/net/ethernet/ti/cpsw_ale.c
    index dc5e247ca5d1..4f85d59a8263 100644
    --- a/drivers/net/ethernet/ti/cpsw_ale.c
    +++ b/drivers/net/ethernet/ti/cpsw_ale.c
    @@ -1509,7 +1509,7 @@ struct cpsw_ale *cpsw_ale_create(struct cpsw_ale_params *params)
            if (!ale_dev_id)
                    return ERR_PTR(-EINVAL);
     
    -       params->ale_entries = ale_dev_id->tbl_entries;
    +       params->ale_entries = 64; //ale_dev_id->tbl_entries;
            params->nu_switch_ale = ale_dev_id->nu_switch_ale;
            params->reg_fields = ale_dev_id->reg_fields;
            params->num_fields = ale_dev_id->num_fields;

    That should reduce the time taken for executing the section you have pointed to in your reply.
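
    To illustrate why a smaller table helps: the multicast flush in that section walks the ALE table, so its cost grows with `params->ale_entries` rather than with the number of addresses actually stored. The sketch below is a simplified user-space model of that idea, not the driver code. `struct model_ale_entry` and `model_flush_multicast()` are hypothetical names, and the table sizes mentioned afterwards are illustrative only.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical, simplified stand-in for one ALE table entry. */
struct model_ale_entry {
	bool in_use;       /* slot holds an address entry */
	bool mcast;        /* entry is a multicast address */
	uint8_t port_mask; /* ports that receive this address */
};

/*
 * Model of a "flush multicast" pass. Every slot is visited whether it is
 * in use or not, so the cost is proportional to the table size.
 * Returns the number of slots read.
 */
static size_t model_flush_multicast(struct model_ale_entry *tbl,
				    size_t entries, uint8_t port_mask)
{
	size_t reads = 0;

	assert(tbl != NULL);

	for (size_t idx = 0; idx < entries; idx++) {
		reads++; /* on real hardware this is a register-based table read */

		if (!tbl[idx].in_use || !tbl[idx].mcast)
			continue;

		/* Remove the flushed ports from this multicast entry. */
		tbl[idx].port_mask &= (uint8_t)~port_mask;

		/* No ports left: free the slot. */
		if (tbl[idx].port_mask == 0)
			tbl[idx].in_use = false;
	}
	return reads;
}
```

    In this model a table of 512 slots costs 512 reads per pass, while a table capped at 64 slots costs 64, regardless of how many multicast addresses are stored. The trade-off is that the capped table can hold fewer address and VLAN entries, so the value 64 should be checked against the number of entries the application really needs.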

    Regards,
    Siddharth.

  • Hi,

    The problem stopped occurring after I reduced the number of ALE entries using the change above.

    Thank you very much for your cooperation.

    Regards,
    mizutani