Linux/PROCESSOR-SDK-AM57X: ACTIVE-BACKUP bonding driver with VLANs

Part Number: PROCESSOR-SDK-AM57X

Tool/software: Linux

Hi,

We are trying to set up the Linux bonding driver over the physical eth0/eth1 interfaces of the AM572x, with VLANs on top of the bonded interface. The AM57x is configured in dual standalone EMAC mode.

We are trying to achieve

"Bonding for High Availability" (see link)

and specifically

"High Availability in a Multiple Switch Topology" (see link)

In this mode one eth interface is active; the second is disabled and neither accepts nor sends packets.

After everything is set up, we see the following behavior:

ARP broadcasts sent to the backup (i.e. inactive) eth interface are forwarded to the active eth interface causing a loop in the network infrastructure.

Is there any configuration to disable this loop?

  • Hi,
    I will need some background information. What TI SDK version are you working with?

    Can you describe how you know the ARP packets are going from the inactive to the active link? Could you also describe the topology that you are using for bonding? Are you connected to a switch? How are the ARP packets isolated between the active and inactive link of the network?

    Are you using spanning tree or a variant of it in your network?

    Best Regards,
    Schuyler
  • Hi Schuyler,

    Sure!

    >I will need some background information. What TI SDK version are you working with?

    TI SDK 04.01

    >Can you describe how you know the ARP packets are going from the inactive to the active link? Could you also describe the topology that you are using for bonding? Are you connected to a switch? How are the ARP packets isolated between the active and inactive link of the network?

    The system gets flooded very quickly when the AM57x is connected to the network. Replacing the AM57x with a standard PC using the same bonding configuration does not produce any flooding or loop. The problem can be reproduced with the bonding setup active and two PCs, one on each interface of the AM57x: if you send a broadcast from the inactive leg (e.g. eth1), the PC on the active leg (e.g. eth0) sees the packet in Wireshark.

    The topology is shown in the previous message, see the second link. We are connected to two interlinked switches.

    Maybe the AM57x internal switch is missing some configuration? We are in DUAL EMAC mode and VLAN AWARE.

    >Are you using spanning tree or a variant of it in your network?

    Yes, spanning tree is active

    BR
    Marco
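
The reproduction described above (send a broadcast on the inactive leg and watch the active leg) can be sketched in Python. This is a minimal sketch, not the tool used in the thread: the MAC address, IP addresses and interface name are hypothetical placeholders, and sending requires Linux with root privileges (or CAP_NET_RAW).

```python
import socket
import struct


def build_arp_request(src_mac: bytes, src_ip: str, target_ip: str) -> bytes:
    """Build an Ethernet broadcast frame carrying an ARP who-has request."""
    broadcast_mac = b"\xff" * 6
    # Ethernet header: destination, source, EtherType 0x0806 (ARP)
    eth_header = broadcast_mac + src_mac + struct.pack("!H", 0x0806)
    arp_payload = struct.pack(
        "!HHBBH6s4s6s4s",
        1,                            # hardware type: Ethernet
        0x0800,                       # protocol type: IPv4
        6,                            # hardware address length
        4,                            # protocol address length
        1,                            # opcode: request
        src_mac,                      # sender MAC
        socket.inet_aton(src_ip),     # sender IP
        b"\x00" * 6,                  # target MAC (unknown in a request)
        socket.inet_aton(target_ip),  # target IP
    )
    return eth_header + arp_payload


def send_on_interface(ifname: str, frame: bytes) -> None:
    """Send a raw frame on the given interface (Linux only, needs root)."""
    with socket.socket(socket.AF_PACKET, socket.SOCK_RAW) as sock:
        sock.bind((ifname, 0))
        sock.send(frame)
```

On the PC attached to the inactive leg, one would build a frame with `build_arp_request(...)` and pass it to `send_on_interface(...)` with that PC's own interface name, then look for the frame in Wireshark on the PC attached to the active leg.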

  • Hi,

    Could you please provide the bonding setup commands that you are using? Also what are you using to flood the inactive leg?

    Best Regards,
    Schuyler
  • Hi,

    Bonding Configuration:

    # cat /etc/modprobe.d/bonding.conf
    options bonding mode=active-backup max_bonds=1 miimon=100

    And the Network config is attached.

    etc_systemd_network.tar.gz
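
With a configuration like the one above, the bonding driver reports its state in `/proc/net/bonding/<bond>`. The following is a small sketch for checking which slave is currently active before testing the inactive leg. The bond name `bond0` is an assumption, since the attached network configuration is not reproduced in the thread.

```python
def parse_bonding_status(text: str) -> dict:
    """Extract the mode, the active slave and each slave's MII status."""
    status = {"mode": None, "active_slave": None, "slaves": {}}
    current_slave = None
    for line in text.splitlines():
        key, sep, value = line.partition(":")
        if not sep:
            continue  # not a "Key: value" line
        key, value = key.strip(), value.strip()
        if key == "Bonding Mode":
            status["mode"] = value
        elif key == "Currently Active Slave":
            status["active_slave"] = value
        elif key == "Slave Interface":
            current_slave = value
            status["slaves"][current_slave] = None
        elif key == "MII Status" and current_slave is not None:
            # MII Status lines after a "Slave Interface" line belong to that slave
            status["slaves"][current_slave] = value
    return status


def read_bonding_status(bond: str = "bond0") -> dict:
    """Read and parse the procfs status of a bond (bond name is assumed)."""
    with open(f"/proc/net/bonding/{bond}") as f:
        return parse_bonding_status(f.read())
```

Calling `read_bonding_status()` on the board returns a dictionary whose `active_slave` entry names the leg that should be carrying traffic; the other slave is the one to inject test broadcasts on.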

  • I think, though, that this is not a Linux configuration issue. How can I be sure that the CPSW switch is really separating eth0 and eth1? We use the DUAL EMAC MODE.

    I believe we are also in the VLAN AWARE mode, but not sure how to check. Can you give me a hint on this?
  • Hi Schuyler,

    After some deep analysis, we have come to the conclusion that we are violating this constraint from the CPSW User's Guide:

    • While adding VLAN id to the eth interfaces, same VLAN id should not be added in both interfaces which will lead to VLAN forwarding and act as switch

    This is what is happening, since we have VLANs on top of the Bonding driver, and that adds the same VLANs to all the enslaved interfaces (i.e. eth0 and eth1 in our situation).
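
The effect of the quoted constraint can be illustrated with a toy model of a VLAN membership table. This is a deliberately simplified sketch, not the CPSW ALE or the driver code: the class, the port numbering (0 for the host port, 1 and 2 for the external ports) and the flooding rule are assumptions made for illustration only.

```python
HOST_PORT, PORT1, PORT2 = 0, 1, 2  # assumed numbering: 0 = CPU, 1/2 = external


class ToyVlanTable:
    """Toy VLAN table: one member set per VLAN id, shared by all ports."""

    def __init__(self):
        self.members = {}  # vid -> set of member ports

    def add_vlan(self, vid: int, ext_port: int) -> None:
        # Adding a VLAN on an interface makes the host port and that
        # interface's external port members of the VLAN.
        self.members.setdefault(vid, set()).update({HOST_PORT, ext_port})

    def flood(self, vid: int, ingress_port: int) -> list:
        # A broadcast goes to every member of the VLAN except the ingress port.
        return sorted(self.members.get(vid, set()) - {ingress_port})


# Distinct VLAN ids per interface: a broadcast only reaches the host.
separate = ToyVlanTable()
separate.add_vlan(10, PORT1)
separate.add_vlan(20, PORT2)
print(separate.flood(20, PORT2))

# Same VLAN id on both interfaces, as happens with VLANs on top of a bond:
# the broadcast is also forwarded to the other external port.
shared = ToyVlanTable()
shared.add_vlan(10, PORT1)
shared.add_vlan(10, PORT2)
print(shared.flood(10, PORT2))
```

In this model the shared VLAN id is what turns the two standalone ports into a forwarding path, which matches the switch-like behavior the User's Guide warns about.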

    We discovered that starting tcpdump on the board stops the ARP storm/loop, and we further discovered that the CPSW driver sets ALE_BYPASS in promiscuous mode (which tcpdump enables). If we set ALE_BYPASS via ioctls, the storm/loop stops.

    I can't find any documentation on this ALE_BYPASS configuration and what it does. Can you point me to it?

    BR
    Marco
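
The tcpdump observation can be reproduced without a capture by toggling promiscuous mode on the interface directly. The sketch below uses the standard Linux `SIOCGIFFLAGS`/`SIOCSIFFLAGS` ioctls and the `IFF_PROMISC` flag; it is not the ioctl sequence Marco used, and it does not touch ALE_BYPASS itself. Whether the driver then enables ALE_BYPASS is as reported in the thread, not something this code verifies. It needs root, and the interface name passed in is up to the caller.

```python
import fcntl
import socket
import struct

SIOCGIFFLAGS = 0x8913  # get interface flags
SIOCSIFFLAGS = 0x8914  # set interface flags
IFF_PROMISC = 0x100    # promiscuous-mode flag

# struct ifreq: 16-byte name, 16-bit flags, padded to 40 bytes
IFREQ_FMT = "16sH22x"


def with_promisc(flags: int, enable: bool) -> int:
    """Return flags with the IFF_PROMISC bit set or cleared."""
    return (flags | IFF_PROMISC) if enable else (flags & ~IFF_PROMISC)


def set_promiscuous(ifname: str, enable: bool) -> None:
    """Set or clear promiscuous mode on an interface (Linux, needs root)."""
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as sock:
        request = struct.pack(IFREQ_FMT, ifname.encode(), 0)
        reply = fcntl.ioctl(sock.fileno(), SIOCGIFFLAGS, request)
        flags = struct.unpack(IFREQ_FMT, reply)[1]
        request = struct.pack(IFREQ_FMT, ifname.encode(), with_promisc(flags, enable))
        fcntl.ioctl(sock.fileno(), SIOCSIFFLAGS, request)
```

Calling `set_promiscuous("eth0", True)` on the board and checking whether the storm stops, then `set_promiscuous("eth0", False)` to see whether it returns, would confirm that promiscuous mode alone is what changes the behavior.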

  • Hi Marco,

    If I am correct, ALE bypass turns off all the rules in the ALE so that all packets pass through, which is why promiscuous mode sets up the ALE this way. I will check whether there is additional information and post tomorrow.

    Best Regards,
    Schuyler
  • Could you check whether the patch below helps?
    lkml.org/.../756
    git.kernel.org/.../cpsw.c

    Some back-porting effort might be required.