Bond Interoperability with Cisco and Arista Switches

This knowledge base article discusses interoperability and troubleshooting in a mixed switch topology, featuring switches running Cumulus Linux on the same network as switches running Cisco and Arista network OSes. The article uses the syntax of ifupdown2.

Environment

  • Cumulus Linux 2.1 and later

Example Mixed Switch Topology

The following diagram is used for the three examples described below. All testing was performed on actual equipment. The same interfaces are used on both sides; for example, swp19 (switch port 19) on the Cumulus Linux switch is hooked up to g0/19 (Gigabit Ethernet 0/19) on the Cisco 3560.

Cumulus Linux and Cisco IOS

The following example utilizes slow LACPDUs (that is, bond-lacp-rate is set to 0):

Quanta LY2 w/Cumulus Linux 2.1.0:

auto bond1
iface bond1
	bond-slaves glob swp19-20
	bond-miimon 100
	bond-min-links 1
	bond-mode 802.3ad
	bond-xmit-hash-policy layer3+4
	bond-lacp-rate 0

auto vlan10
iface vlan10
	bridge-ports bond1.10
	address 10.10.10.11/24
	bridge-stp on

Cisco WS-C3560X-24 12.2(55)SE5:

vlan 10

interface GigabitEthernet0/19
 switchport trunk encapsulation dot1q
 switchport mode trunk
 channel-group 1 mode active
interface GigabitEthernet0/20
 switchport trunk encapsulation dot1q
 switchport mode trunk
 channel-group 1 mode active

interface Port-channel1
 switchport trunk encapsulation dot1q
 switchport mode trunk

interface Vlan10
 ip address 10.10.10.10 255.255.255.0

Cumulus Linux and Arista EOS

The following example utilizes fast LACPDUs (where bond-lacp-rate is set to 1):

Quanta LY2 w/Cumulus Linux 2.1.0:

auto bond2
iface bond2
	bond-slaves glob swp37-38
	bond-miimon 100
	bond-min-links 1
	bond-mode 802.3ad
	bond-xmit-hash-policy layer3+4
	bond-lacp-rate 1

auto vlan12
iface vlan12
	bridge-ports bond2.12
	address 12.12.12.11/24
	bridge-stp on

Arista DCS-7148S-R 4.13.5F:

interface Ethernet37
   switchport mode trunk
   channel-group 2 mode active
interface Ethernet38
   switchport mode trunk
   channel-group 2 mode active

interface Port-Channel2
   switchport trunk allowed vlan 12
   switchport mode trunk

interface Vlan12
   ip address 12.12.12.12/24


Cumulus Linux and Cisco NX-OS

The following example utilizes fast LACPDUs (where bond-lacp-rate is set to 1):

Quanta LY2 w/Cumulus Linux 2.1.0:

auto bond3
iface bond3
	bond-slaves glob swp39-40
	bond-miimon 100
	bond-min-links 1
	bond-mode 802.3ad
	bond-xmit-hash-policy layer3+4
	bond-lacp-rate 1

auto vlan14
iface vlan14
	bridge-ports bond3.14
	address 14.14.14.11/24
	bridge-stp on

Cisco Nexus3064 5.0(3)U2(2c):

feature interface-vlan
feature lacp

vlan 14

interface Ethernet1/39
  switchport mode trunk
  channel-group 3 mode active
interface Ethernet1/40
  switchport mode trunk
  channel-group 3 mode active


interface port-channel3
  switchport mode trunk

interface Vlan14
  no shutdown
  ip address 14.14.14.14/24

Troubleshooting Bond/EtherChannel/LACP Links

The three most common problems with EtherChannels are:

  • VLAN mismatches with L2 bonds
  • Mismatched fast vs slow LACPDU rates
  • Both sides using passive LACP mode instead of active LACP mode

Because Cumulus Linux is Linux, bonds use the same parameters as the Linux kernel bonding driver, which are described in the kernel.org documentation. The Cumulus Linux bonding documentation contains specific examples. The following guide compares the Cisco 3560 to the Quanta LY2 in the diagram and configuration above.

Parameters of a Bond

Here is the recommended way to configure a bond in Cumulus Linux:

auto bond0
iface bond0
    bond-slaves swp1 swp2
    bond-mode 802.3ad
    bond-miimon 100
    bond-lacp-rate 1
    bond-min-links 1
    bond-xmit-hash-policy layer3+4

  • bond-slaves lists the members of the bond. In this case, swp1 and swp2 are members of bond0.
  • bond-mode 802.3ad is required for an LACP bond.
  • bond-miimon 100 sets how often, in milliseconds, the link state of each member is inspected for failures. The default value is 0, which disables link monitoring, but Cumulus Networks recommends 100.
  • bond-lacp-rate 1 means fast LACP, which is discussed in the Fast vs Slow LACP Rates section below; Cumulus Networks recommends using fast LACP.
  • bond-min-links is followed by an integer indicating the minimum number of links that must be up for the bond to become active.
  • bond-xmit-hash-policy must be set to layer3+4 so that traffic is evenly distributed across the bond members.

For more information about the bond parameters, see the kernel.org documentation.
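To confirm that the running bond actually picked up these parameters, you can read them back from the bonding driver's sysfs files. The following is a minimal sketch, not an official Cumulus Linux tool: the function name show_bond_params and its optional second argument (an alternate sysfs root, useful for testing) are hypothetical, and the exact value format (for example, a name followed by a number) may vary by kernel version.

```shell
# Print the main parameters of a bond as reported by the kernel.
# $1: bond name (for example, bond0)
# $2: optional sysfs root; defaults to /sys/class/net (hypothetical knob for testing)
show_bond_params() {
    bond="$1"
    sysroot="${2:-/sys/class/net}"
    for param in mode miimon lacp_rate min_links xmit_hash_policy slaves; do
        # Each bonding option is exposed as a file under <bond>/bonding/
        printf '%-18s %s\n' "$param" "$(cat "$sysroot/$bond/bonding/$param")"
    done
}
```

For example, running show_bond_params bond0 on the switch prints one line per parameter, which you can compare against the stanza in /etc/network/interfaces.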

VLAN Mismatch

The following configuration has a VLAN mismatch. Can you find it?

Quanta LY2 w/Cumulus Linux 2.1.0:

auto bond1
iface bond1
	bond-slaves glob swp19-20
	bond-miimon 100
	bond-min-links 1
	bond-mode 802.3ad
	bond-xmit-hash-policy layer3+4
	bond-lacp-rate 0

auto vlan10
iface vlan10
	bridge-ports bond1.100
	address 10.10.10.11/24
	bridge-stp on

Cisco WS-C3560X-24 12.2(55)SE5:

vlan 10

interface GigabitEthernet0/19
 switchport trunk encapsulation dot1q
 switchport mode trunk
 channel-group 1 mode active
interface GigabitEthernet0/20
 switchport trunk encapsulation dot1q
 switchport mode trunk
 channel-group 1 mode active

interface Port-channel1
 switchport trunk encapsulation dot1q
 switchport mode trunk

interface Vlan10
 ip address 10.10.10.10 255.255.255.0

As illustrated above, the only member of the bridge named vlan10 is bond1.100. The bridge name has nothing to do with which 802.1Q tags the bridge carries: naming a bridge vlan10 does not mean its members are tagged with VLAN 10. The subinterface suffix .100 (bond1.100) means that ingress packets tagged with VLAN 100 are placed in the bridge named vlan10, while the Cisco side uses VLAN 10. This syntax is valid, but the result is probably not what you want. You could name the bridge anything, such as mgmt-bridge or outofband.

Unlike Cisco IOS, Cumulus Linux drops tagged packets unless you add the tagged subinterface to a bridge or configure it as an L3 interface. On Cisco IOS and many IOS look-alikes, a trunk carries all VLANs by default, and you restrict it with something like:

switchport trunk allowed vlan 5

This allows only VLAN 5 and nothing else. Cumulus Linux takes the opposite approach: it drops every tagged packet except those for VLANs you explicitly configure.
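One way to spot this kind of mismatch is to list every bridge member together with the VLAN tag implied by its subinterface suffix, then compare the tags against the VLANs configured on the other switch. The sketch below is an illustration under stated assumptions: the function name list_bridge_tags is hypothetical, it reads only a single file (by default /etc/network/interfaces), and it handles only simple bridge-ports lines like the ones in this article.

```shell
# For each bridge-ports entry, print: <bridge name> <member> <VLAN tag or "untagged">
# $1: optional path to the interfaces file; defaults to /etc/network/interfaces
list_bridge_tags() {
    awk '
        # Remember which iface stanza we are in
        $1 == "iface" { iface = $2 }
        $1 == "bridge-ports" || $1 == "bridge_ports" {
            for (i = 2; i <= NF; i++) {
                if ($i == "glob" || $i == "regex") continue
                # A member such as bond1.100 carries VLAN tag 100
                n = split($i, parts, "[.]")
                tag = (n == 2) ? parts[2] : "untagged"
                print iface, $i, tag
            }
        }
    ' "${1:-/etc/network/interfaces}"
}
```

Run against the mismatched configuration above, this prints vlan10 bond1.100 100, which makes it obvious that the bridge named vlan10 is actually carrying VLAN 100.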

You can find more information on configuring VLAN tagging in the Cumulus Linux user guide.

Fast vs Slow LACP Rates

The Cumulus Linux documentation recommends:

bond-lacp-rate 1

This means fast; according to the kernel.org documentation, it means "Request partner to transmit LACPDUs every 1 second."

In some cases, the other vendor's switch cannot send fast LACPDUs, or some other requirement calls for slow LACP. To configure the slow rate, use:

bond-lacp-rate 0

According to kernel.org, this means "Request partner to transmit LACPDUs every 30 seconds."

Troubleshooting Fast vs Slow

To see which LACP rate a bond is configured to use, run this command:

cat /proc/net/bonding/bond1

The following output is a snippet of the information received:

cumulus@switch:~$ cat /proc/net/bonding/bond1
Ethernet Channel Bonding Driver: v3.7.1 (April 27, 2011)

Bonding Mode: IEEE 802.3ad Dynamic link aggregation
Transmit Hash Policy: layer3+4 (1)
MII Status: up
MII Polling Interval (ms): 100
Up Delay (ms): 0
Down Delay (ms): 0

802.3ad info
LACP rate: slow
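
If you only need the rate, for example to check several bonds in a script, you can pull that one field out of the file. This is a minimal sketch assuming the "LACP rate:" line appears as in the output above; the function name lacp_rate and its optional second argument (an alternate directory, useful for testing) are hypothetical.

```shell
# Print the LACP rate ("slow" or "fast") that the bonding driver reports for a bond.
# $1: bond name (for example, bond1)
# $2: optional directory holding the bonding status files; defaults to /proc/net/bonding
lacp_rate() {
    bond="$1"
    procdir="${2:-/proc/net/bonding}"
    # Split each line on ": " and print the value of the first "LACP rate" line
    awk -F': ' '/LACP rate:/ { print $2; exit }' "$procdir/$bond"
}
```

With the output shown above, lacp_rate bond1 prints slow.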

On the Cisco switch, you check the port channel like this:

show etherchannel summary

The following output is a snippet of the information received:

Group  Port-channel  Protocol    Ports
------+-------------+-----------+-----------------------------------------------
1      Po1(SU)         LACP      Gi0/19(P)   Gi0/20(P)

Notice that the port channel is up and both ports are bundled (P). To see the LACPDU rate on the Cisco side, run the following command:

show etherchannel detail

The following output is a snippet of the information received:

Local information:
                            LACP port     Admin     Oper    Port        Port
Port      Flags   State     Priority      Key       Key     Number      State
Gi0/19    SA      bndl      32768         0x1       0x1     0x114       0x3D

Partner's information:

                  LACP port                        Admin  Oper   Port    Port
Port      Flags   Priority  Dev ID          Age    key    Key    Number  State
Gi0/19    SA      255       089e.01ce.e216   3s    0x0    0x11   0x1     0x3D

Where the SA flags mean:

A - Device is in active mode
S - Device is sending Slow LACPDUs
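
The Port State column in the output above (0x3D) carries the same information in numeric form. Assuming it is the standard LACP state byte defined by IEEE 802.3ad (now 802.1AX), bit 0 is LACP_Activity (1 = active, 0 = passive) and bit 1 is LACP_Timeout (1 = short timeout, that is, fast rate; 0 = long timeout, that is, slow rate). The following sketch decodes those two bits; the function name decode_lacp_state is hypothetical, and the remaining bits (aggregation, synchronization, collecting, distributing, defaulted, expired) are ignored.

```shell
# Decode the activity and timeout bits of an LACP port state byte.
# $1: state value in decimal or hex (for example, 0x3D)
# Prints "<active|passive> <fast|slow>"
decode_lacp_state() {
    state=$(( $1 ))
    # Bit 0: LACP_Activity
    if [ $(( state & 1 )) -ne 0 ]; then activity=active; else activity=passive; fi
    # Bit 1: LACP_Timeout (short timeout means fast LACPDUs are requested)
    if [ $(( state & 2 )) -ne 0 ]; then rate=fast; else rate=slow; fi
    echo "$activity $rate"
}
```

For the value shown above, decode_lacp_state 0x3D prints active slow, which agrees with the SA flags.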

Both sides must use the same rate for traffic to pass and for the bond to stay up and stable. In the case above, both sides were using slow LACPDUs. The following table will help you match them:

Cumulus Linux      Cisco flag   Rate
------------------+------------+-----------------
LACP rate: slow    S            every 30 seconds
LACP rate: fast    F            every second
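
The matching rule in this table can also be written as a small check. This is only a sketch: the function name rates_match is hypothetical, its first argument is the rate string from /proc/net/bonding on the Cumulus Linux switch, and its second is the Flags value copied by hand from show etherchannel detail on the Cisco switch.

```shell
# Succeed (exit status 0) when the Cumulus Linux LACP rate and the Cisco flags agree.
# $1: Cumulus Linux rate, "slow" or "fast"
# $2: Cisco flags, for example "SA" or "FA"
rates_match() {
    case "$1:$2" in
        # slow pairs with the S flag, fast pairs with the F flag
        slow:*S*|fast:*F*) return 0 ;;
        *) return 1 ;;
    esac
}
```

For example, rates_match slow SA succeeds, while rates_match fast SA fails and indicates that one side needs to be reconfigured.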

Active vs. Passive Modes

Cumulus Linux does not currently support passive mode. Because an active-mode port forms a bond with a partner in either active or passive mode, and Cumulus Linux has no setting to change the mode, there is no interoperability issue between switches running Cumulus Linux and switches from other network OS vendors.
