TX Drop Counter Counts Twice for VXLAN Traffic in TX Direction

Overview

Platforms based on the Broadcom ASIC reflect BUM traffic as TX drops on a VXLAN-enabled bridge.

Environment

  • Cumulus Linux, all versions

Issue

When a VXLAN-enabled bridge receives a broadcast, unknown unicast, or multicast (BUM) frame, the ASIC must avoid sending the frame back out of the port on which it was originally received; otherwise, loops could form. This rule is known as split horizon. The split-horizon correction behavior causes the HwIfOutNonQDrops transmit drop counter to increment twice for each count.

You can see the effects of this double count when you run these commands:

  • sudo ethtool -S swpX
  • net show counters
  • cl-netstat
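
To track this counter over time, it can help to pull its value out of saved command output. The following is a minimal Python sketch, assuming that `ethtool -S` prints each statistic on its own line as `name: value`; the helper name `read_counter` and the sample text are hypothetical and only illustrate the idea.

```python
import re


def read_counter(stats_text, name="HwIfOutNonQDrops"):
    """Return the integer value of `name` from `ethtool -S` style text, or None."""
    # Assumption: each statistic appears on its own line as "<name>: <value>".
    pattern = rf"^\s*{re.escape(name)}:\s*(\d+)\s*$"
    match = re.search(pattern, stats_text, re.MULTILINE)
    return int(match.group(1)) if match else None


# Hypothetical sample; real text would come from `sudo ethtool -S swpX`.
sample = """NIC statistics:
     HwIfInOctets: 1200
     HwIfOutNonQDrops: 10001
"""

print(read_counter(sample))
```

The function returns `None` when the counter is absent, so a caller can tell a missing statistic apart from a counter that reads zero.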

This behavior is not seen on a bridge whose ports perform only traditional Ethernet bridging (that is, non-VXLAN ports). However, when VNIs are configured as bridge ports, the switch ports in that bridge are no longer treated as regular ports in this regard. These switch ports are susceptible to this behavior, and their traffic statistics reflect the split-horizon correction.

The second situation where you see the HwIfOutNonQDrops counter counting twice is on the routed uplink ports, where VXLAN-encapsulated packets are received.

This is expected behavior for Broadcom-based platforms that stems from the ASIC itself, when the switch is configured with a VXLAN-enabled bridge.

Symptoms

Consider a network with node Host11 (which is directly connected to VTEP1 on port swp11) communicating with Host22 (which is directly connected to VTEP2 on port swp22), with an IP-routed network separating VTEP1 and VTEP2.

For a unidirectional traffic flow transmitted by Host11 towards Host22:

  • The ingress VTEP counts TX drops on the edge port where the traffic is received (VTEP1 swp11) at a rate equal to the number of BUM packets received on the ingress port (from the source of the traffic flow).
  • The egress VTEP counts TX drops on the routed uplink port (VTEP2 swp22) at a rate equal to the number of BUM packets multiplied by the number of remote VTEPs in the flood list (per VNI).
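
The two rules above amount to simple arithmetic. The sketch below is only an illustration: the function name `expected_tx_drops` is hypothetical, and it ignores unrelated background traffic, which adds a few extra counts in practice.

```python
def expected_tx_drops(bum_packets, remote_vteps):
    """Return (ingress_edge_port_drops, egress_uplink_drops) for one VNI."""
    if bum_packets < 0 or remote_vteps < 0:
        raise ValueError("counts must be non-negative")
    # Ingress VTEP: one TX drop on the edge port per BUM packet received.
    ingress = bum_packets
    # Egress VTEP: BUM packets multiplied by remote VTEPs in the flood list.
    egress = bum_packets * remote_vteps
    return ingress, egress


# 10,000 BUM packets with two remote VTEPs in the flood list.
print(expected_tx_drops(10_000, 2))  # (10000, 20000)
```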

Example

In the following test, 10,000 multicast packets are sent. Results in red indicate where the TX_DRP counter is incremented because of the split-horizon correction behavior. Results in blue indicate the physical network path for the traffic flow relevant to this example. Results in black indicate the network path across logical ports that are relevant to this example.

For the ingress VTEP:

  • 10,005 packets are received on the edge port, swp5. 10,001 packets are shown as TX drops on the ingress port because of the split-horizon correction behavior.
  • 20,049 packets are sent on the routed uplink (swp1) toward the two remote VTEPs (VTEP2 and VTEP3). vni-1010 has two remote VTEPs in the flood list, so 20,007 packets are shown for TX_OK; this is normal behavior.

cumulus@vtep1:~$ net show counters

Kernel Interface table

Iface       MTU   Met    RX_OK   RX_ERR    RX_DRP   RX_OVR   TX_OK    TX_ERR   TX_DRP   TX_OVR Flg
--------  ----- -----  ------- --------  -------- -------- -------  -------- -------- -------- -----
bridge     1500     0    10011        0         0        0   50023         0        0        0 BMRU
eth0       1500     0       95        0         0        0      14         0        0        0 BMRU
lo        65536     0        0        0         0        0       0         0        0        0 LRU
swp1       1500     0       80        0         0        0   20049         0        6        0 BMRU
swp5       1500     0    10005        0         0        0      54         0    10001        0 BMRU
swp15      1500     0       13        0         0        0      23         0        0        0 BMRU
swp16      1500     0        3        0         0        0   20058         0        0        0 BMRU
swp51      1500     0        0        0         0        0       0         0        0        0 BMU
vni-1010   1500     0        3        0         0        0   20007         0        0        0 BMRU
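
To pick out the interfaces with TX drops from output like the table above, a small parser can help. This is a rough Python sketch, not an official tool: it assumes whitespace-separated columns with the header row shown above, and the helper name `parse_counters` is hypothetical. The sample reuses three rows from the table.

```python
def parse_counters(table_text):
    """Map interface name -> dict of integer counters from `net show counters` text."""
    rows = {}
    header = None
    for line in table_text.splitlines():
        fields = line.split()
        if not fields:
            continue
        if fields[0] == "Iface":
            # Keep the numeric column names; drop the Iface and Flg columns.
            header = fields[1:-1]
            continue
        # Skip the title line and the dashed separator line.
        if header is None or set(line.strip()) <= {"-", " "}:
            continue
        if len(fields) != len(header) + 2:
            continue
        rows[fields[0]] = dict(zip(header, map(int, fields[1:-1])))
    return rows


sample = """\
Iface       MTU   Met    RX_OK   RX_ERR    RX_DRP   RX_OVR   TX_OK    TX_ERR   TX_DRP   TX_OVR Flg
--------  ----- -----  ------- --------  -------- -------- -------  -------- -------- -------- -----
swp1       1500     0       80        0         0        0   20049         0        6        0 BMRU
swp5       1500     0    10005        0         0        0      54         0    10001        0 BMRU
vni-1010   1500     0        3        0         0        0   20007         0        0        0 BMRU
"""

counters = parse_counters(sample)
for iface, values in counters.items():
    if values["TX_DRP"]:
        print(iface, values["TX_DRP"])
```

With these three rows, the loop reports swp1 (6 drops) and swp5 (10001 drops); vni-1010 has no TX drops and is skipped.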

For the egress VTEP:

  • 10,055 packets are received on a routed uplink port, swp5. 20,009 packets are shown as TX drops on this ingress port because of the split-horizon correction behavior; the corresponding VNI, vni-1010, shows 20,011 packets under TX_OK.
  • 10,061 packets are sent on the edge port (swp49) towards the destination.

cumulus@vtep2:~$ net show counters

Kernel Interface table

Iface       MTU   Met    RX_OK   RX_ERR    RX_DRP   RX_OVR   TX_OK    TX_ERR   TX_DRP   TX_OVR Flg
--------  ----- -----  ------- --------  -------- -------- -------  -------- -------- -------- -----
bridge     1500     0    10014        0         0        0   40027         0        0        0 BMRU
eth0       1500     0      133        0         0        0      25         0        0        0 BMRU
lo        65536     0        0        0         0        0       0         0        0        0 LRU
swp5       1500     0    10055        0         0        0      90         0    20009        0 BMRU
swp15      1500     0        3        0         0        0      14         0        0        0 BMRU
swp16      1500     0        3        0         0        0   10060         0        0        0 BMRU
swp49      1500     0        6        0         0        0   10061         0        0        0 BMRU
vni-1010   1500     0    10004        0         0        0   20011         0        0        0 BMRU
vni-1020   1500     0        1        0         0        0       1         0        0        0 BMRU
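
As a quick sanity check on these results, the observed TX drops can be divided by the number of test packets sent. The sketch below uses the figures from this example (10,000 multicast packets); the helper name `drops_per_bum_packet` is hypothetical, and the small excess over a whole number comes from other traffic on the ports.

```python
def drops_per_bum_packet(tx_drp, bum_packets):
    """Return observed TX drops per BUM packet sent, rounded to two decimals."""
    if bum_packets <= 0:
        raise ValueError("bum_packets must be positive")
    return round(tx_drp / bum_packets, 2)


# Ingress edge port on vtep1: about one drop per packet.
print(drops_per_bum_packet(10_001, 10_000))
# Routed uplink port on vtep2: about two drops per packet,
# matching the two remote VTEPs in the flood list.
print(drops_per_bum_packet(20_009, 10_000))
```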