Cumulus Linux 2.5.0 Release Notes

Overview

These release notes support Cumulus Linux 2.5.0 and describe currently available features and known issues.

Licensing

Cumulus Linux is licensed on a per-instance basis. Each network system is fully operational, so any capability can be used on the switch, with the exception of forwarding on switch panel ports. Only eth0 and console ports are activated on an unlicensed instance of Cumulus Linux. Enabling front panel ports requires a license.

You should have received a license key from Cumulus Networks or an authorized reseller. To install the license, read the Cumulus Linux quick start guide.

Installing Version 2.5.0

To install the software, choose one of the following methods. They are ordered from the most recommended method to least recommended.

  • Download Cumulus Linux 2.5.0 - Final Latest Version from the Downloads page of the Cumulus Networks website, then use cl-img-install to install the software.

    Warning: This method overwrites the target image slot, so if you want to preserve your configuration, you should create a persistent configuration on /mnt/persist.

  • Download Cumulus Linux 2.5.0 - Final Latest Version from the Downloads page of the Cumulus Networks website, then use ONIE to perform a complete install, following the instructions in the quick start guide.

    Warning: This method is destructive; any configuration files on the switch will not be saved, so please copy them to a different server before upgrading via ONIE.

Enabling Quagga

There is no SNMP support for Quagga in this release (see RN-88 below). Because of this, you must remove all references to smux from each of the configuration files listed below. You must also remove these references before upgrading Cumulus Linux using apt-get. If smux entries are present in the configuration files, the daemons in the Quagga version packaged with 2.5 will not start.

  1. cd /etc/quagga
  2. grep smux *
  3. Delete all lines in the config files containing the smux keyword.

The references to smux that must be removed are:

  • In bgpd.conf, remove this line:
    smux peer 1.3.6.1.4.1.3317.1.2.2 quagga_bgpd
  • In ospf6d.conf, remove this line:
    smux peer 1.3.6.1.4.1.3317.1.2.6 quagga_ospf6d
  • In ospfd.conf, remove this line:
    smux peer 1.3.6.1.4.1.3317.1.2.5 quagga_ospfd
  • In zebra.conf, remove this line:
    smux peer 1.3.6.1.4.1.3317.1.2.1 quagga_zebra
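
The cleanup steps above can be sketched as a small shell function. This is a minimal sketch, not a supported tool: the function name strip_smux is hypothetical, it assumes the configuration files end in .conf, and it keeps a .bak copy of every file it changes so the edit can be reviewed or reverted.

```shell
#!/bin/bash
# Hypothetical helper: delete every line containing "smux" from the
# *.conf files in a Quagga configuration directory. A .bak copy of each
# modified file is left next to it.
strip_smux() {
    local dir="$1" file
    for file in "$dir"/*.conf; do
        [ -f "$file" ] || continue          # skip if the glob matched nothing
        if grep -q 'smux' "$file"; then
            sed -i.bak '/smux/d' "$file"    # remove all lines mentioning smux
            echo "cleaned: $file"
        fi
    done
}

# Example (run as root on the switch):
#   strip_smux /etc/quagga
```

Running grep smux /etc/quagga/*.conf afterwards should print nothing if the cleanup succeeded.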

What's New in Cumulus Linux 2.5.0

Cumulus Linux 2.5.0 supports these new hardware platforms and features:

For a deeper overview and a presentation highlighting the major changes, see Cumulus Linux 2.5.0: What's New and Different.

Experimental Features

The following experimental features are included in Cumulus Linux 2.5.0:

Documentation

You can read the technical documentation here.

Issues Fixed in Cumulus Linux 2.5.0

The following is a list of issues fixed in Cumulus Linux 2.5.0 from earlier versions of Cumulus Linux.

Release Note ID Summary
RN-68
RN-103 In a VRR environment, the server that is bonded to the VRR switches could lose packets destined to the VRR's IP addresses for up to 15 seconds
RN-112 Enabling LACP support for non-L3/L4 modes
RN-116 Bridge driver issues affecting IGMP snooping behavior on STP topology change
RN-119 LLDP frames being reported as software RX drops when received on bridge interfaces
RN-150 Tagged packets have their 802.1p value set to 0
RN-153 BGP ECMP x64 topology is missing routes
RN-161 Packets on local ports get dropped on admin state change of VXLAN instance attached to bridge
RN-164 IFLA_VXLAN_SERVICE_NODE incompatible with upstream kernel
RN-165 Quanta LY6 switch has memory parity error _soc_mem_array_sbusdma_read: L2_ENTRY.ipipe0 failed(ERR)
RN-176 ipv6route only shows 2K routes; causes cl-route-check to fail incorrectly
RN-180 JDSU QSFP+ LR4 cable presence not detected on Edge-Core AS-6701 switch
RN-182 ICMP redirects occur on host while pinging bridge IP in VRR active-active topology (VRR and Host-MLAG)
RN-185 ethtool reports wrong values for QSFP alarm and warning thresholding flags on Agema 7448 switch
RN-186 Cannot configure 10M or 100M speeds using ethtool on 1G copper ports
RN-191 On Agema 7448 switch, QSFP vendor information is corrupted
RN-197 Host-MLAG: when a MAC entry is learned and shared with the dual-connected peer, an old entry from the peer switch overwrites the new entry
RN-201 On a Dell S6000-ON, running snmpwalk on the LM-SENSOR MIB times out
RN-202 Running ip link add type bond mode 802.3ad doesn't set bond mode attribute
RN-203 cl-acltool doesn't read policy.d directory files in alphabetical order

Known Issues in Cumulus Linux 2.5.0

Issues are categorized for easy review. Some issues are already fixed, but the fixes will be available in a later release.

Release Note ID Summary Description
RN-4 ifup/ifdown must be used for interfaces with IPv6 addresses defined in /etc/network/interfaces, otherwise the global IPv6 address will not be restored

Two scenarios are shown below: one with ifup/ifdown, the other with ifconfig down.

With ifup/ifdown:
 swp1 Link encap:Ethernet HWaddr 44:38:39:00:01:81
 inet addr:11.0.0.2 Bcast:11.0.0.255 Mask:255.255.255.0
 inet6 addr: fe80::4638:39ff:fe00:181/64 Scope:Link
 inet6 addr: fec0:1000:1000:1000::2/10 Scope:Site
 UP BROADCAST RUNNING MULTICAST MTU:1500 Metric:1
 RX packets:4231 errors:0 dropped:0 overruns:0 frame:0
 TX packets:4342 errors:0 dropped:0 overruns:0 carrier:0
 collisions:0 txqueuelen:500
 RX bytes:412115 (402.4 KiB) TX bytes:425688 (415.7 KiB)

cumulus@switch$ sudo ifdown swp1
cumulus@switch$ sudo ifconfig swp1
 swp1 Link encap:Ethernet HWaddr 44:38:39:00:01:81
 BROADCAST MULTICAST MTU:1500 Metric:1
 RX packets:4248 errors:0 dropped:0 overruns:0 frame:0
 TX packets:4356 errors:0 dropped:0 overruns:0 carrier:0
 collisions:0 txqueuelen:500
 RX bytes:413990 (404.2 KiB) TX bytes:427074 (417.0 KiB)
cumulus@dni-7448-13$ sudo ifup swp1
 ADDRCONF(NETDEV_UP): swp1: link is not ready
cumulus@switch$ sudo ifconfig swp1
 ADDRCONF(NETDEV_CHANGE): swp1: link becomes ready
 swp1 Link encap:Ethernet HWaddr 44:38:39:00:01:81
 inet addr:11.0.0.2 Bcast:11.0.0.255 Mask:255.255.255.0
 inet6 addr: fe80::4638:39ff:fe00:181/64 Scope:Link
 inet6 addr: fec0:1000:1000:1000::2/10 Scope:Site
 UP BROADCAST RUNNING MULTICAST MTU:1500 Metric:1
 RX packets:4250 errors:0 dropped:0 overruns:0 frame:0
 TX packets:4362 errors:0 dropped:0 overruns:0 carrier:0
 collisions:0 txqueuelen:500
 RX bytes:414178 (404.4 KiB) TX bytes:427610 (417.5 KiB)
cumulus@switch$

With ifconfig down:
 sudo ifconfig swp1
 swp1 Link encap:Ethernet HWaddr 44:38:39:00:01:81
 inet addr:11.0.0.2 Bcast:11.0.0.255 Mask:255.255.255.0
 inet6 addr: fe80::4638:39ff:fe00:181/64 Scope:Link 
 inet6 addr: fec0:1000:1000:1000::2/10 Scope:Site
 UP BROADCAST RUNNING MULTICAST MTU:1500 Metric:1
 RX packets:98 errors:0 dropped:0 overruns:0 frame:0
 TX packets:111 errors:0 dropped:0 overruns:0 carrier:0
 collisions:0 txqueuelen:500 
 RX bytes:13310 (12.9 KiB) TX bytes:12786 (12.4 KiB)

cumulus@switch$ sudo ifconfig swp1 down
cumulus@switch$ sudo ifconfig swp1
 swp1 Link encap:Ethernet HWaddr 44:38:39:00:01:81
 inet addr:11.0.0.2 Bcast:11.0.0.255 Mask:255.255.255.0
 BROADCAST MULTICAST MTU:1500 Metric:1
 RX packets:126 errors:0 dropped:0 overruns:0 frame:0
 TX packets:138 errors:0 dropped:0 overruns:0 carrier:0
 collisions:0 txqueuelen:500
 RX bytes:16998 (16.5 KiB) TX bytes:15998 (15.6 KiB)
cumulus@switch$ sudo ifconfig swp1 up
 ADDRCONF(NETDEV_UP): swp1: link is not ready
cumulus@switch$ sudo ifconfig swp1
 ADDRCONF(NETDEV_CHANGE): swp1: link becomes ready
 swp1 Link encap:Ethernet HWaddr 44:38:39:00:01:81
 inet addr:11.0.0.2 Bcast:11.0.0.255 Mask:255.255.255.0
 inet6 addr: fe80::4638:39ff:fe00:181/64 Scope:Link
 UP BROADCAST RUNNING MULTICAST MTU:1500 Metric:1
 RX packets:130 errors:0 dropped:0 overruns:0 frame:0
 TX packets:149 errors:0 dropped:0 overruns:0 carrier:0
 collisions:0 txqueuelen:500
 RX bytes:17474 (17.0 KiB) TX bytes:17154 (16.7 KiB)
RN-10 cl-phy-update doesn't support aggregated ports

Ports can be aggregated into a larger interface in Cumulus Linux. However, cl-phy-update does not yet support aggregated ports.

If any ports are ganged when you perform a software upgrade, it is recommended that you ungang them.
RN-52 Parameters like the router ID and DR priority cannot be changed while OSPFv2/v3 is running

The router ID and DR priority can only be changed by shutting down OSPFv2/v3, changing the value, and restarting the OSPF process.

A change to the DR priority may not properly be reflected in the LSAs that are still aging out.
RN-56 ipv4/ipv6 forwarding disabled mode not recognized

If either of the following is configured:

 net.ipv4.ip_forward == 0 

or:

 net.ipv6.conf.all.forwarding == 0 

The hardware still forwards packets if there is a neighbor table entry pointing to the destination.

RN-58 IPv6 route is installed and active in the routing table when the associated interface is down

If an IPv6 address is assigned to a "down" interface, the associated route is still installed into the route table.

Also, the type of IPv6 address doesn't matter. Link local, site local, and global all exhibit the same problem.

If the interface is bounced up and down, then the routes are no longer in the route table.
RN-61 BGP4 notifications missing for several conditions

In certain conditions, Quagga bgpd silently closes the peering without sending a notification, for example when BGP receives a message with an invalid message type or an invalid message length.

Ideally, in each of these cases, bgpd should send a notification message to the peer.

General functionality of BGP4 is not affected.
RN-64 Configuring route-reflector-client requires specific order

When configuring a router to be a route reflector client, the Quagga configuration must be specified in a specific order; otherwise, the router will not be a route reflector client.

The "neighbor <IPv4/IPv6> route-reflector-client" command must come after the "neighbor <IPv4/IPv6> activate" command; otherwise, the route-reflector-client command is ignored.

Sample configuration:
 router bgp 65000
 bgp router-id 0.0.0.4 
 bgp log-neighbor-changes
 no bgp default ipv4-unicast
 bgp cluster-id 0.0.0.4 
 bgp bestpath as-path multipath-relax 
 redistribute connected 
 neighbor 14.0.0.1 remote-as 65000 
 neighbor 14.0.0.1 route-reflector-client 
 neighbor 14.0.0.1 activate 
 neighbor 14.0.0.1 next-hop-self 
 neighbor 14.0.0.9 remote-as 65000 
 neighbor 14.0.0.9 activate 
 neighbor 14.0.0.9 next-hop-self 
 neighbor 2001:ded:beef::1 remote-as 65000 
 neighbor 2001:ded:beef:2::1 remote-as 65000 
 maximum-paths 4 
 maximum-paths ibgp 4 
 ! 
 address-family ipv6 
 redistribute connected 
 neighbor 2001:ded:beef::1 activate 
 neighbor 2001:ded:beef::1 next-hop-self 
 neighbor 2001:ded:beef:2::1 route-reflector-client 
 neighbor 2001:ded:beef:2::1 activate 
 neighbor 2001:ded:beef:2::1 next-hop-self 
 maximum-paths 4 
 maximum-paths ibgp 4 
 exit-address-family 

At runtime:
 cumulus@switch:$ show ip bgp neighbor 14.0.0.1 
 BGP neighbor is 14.0.0.1, remote AS 65000, local AS 65000, internal link
 BGP version 4, remote router ID 0.0.0.6 
 BGP state = Established, up for 00:23:49
 Last read 23:31:36, hold time is 180, keepalive interval is 60 seconds 
 Neighbor capabilities: 
 4 Byte AS: advertised and received 
 Route refresh: advertised and received(old & new)
 Address family IPv4 Unicast: advertised and received 
 Message statistics: 
 Inq depth is 0 
 Outq depth is 0
 Sent Rcvd 
 Opens: 2 0
 Notifications: 0 0
 Updates: 1 1
 Keepalives: 25 24 
 Route Refresh: 0 0
 Capability: 0 0 
 Total: 28 25 
 Minimum time between advertisement runs is 5 seconds
 For address family: IPv4 Unicast 
 >>>>>>>>>>>>>>>>>>>>>> ROUTE REFLECTOR CLIENT NOT DISPLAYED 
 NEXT_HOP is always this router 
 Community attribute sent to this neighbor(both) 
 6 accepted prefixes 
 Connections established 1; dropped 0 
 Last reset never 
 Local host: 14.0.0.2, Local port: 179 
 Foreign host: 14.0.0.1, Foreign port: 40290 
 Nexthop: 14.0.0.2 
 Nexthop global: 2001:ded:beef::2 
 Nexthop local: fe80::202:ff:fe00:4
 BGP connection: non shared network
 Read thread: on Write thread: off 
 cumulus@switch:$ 

Workaround:
 Define the commands in the following order:
 address-family ipv4 unicast
 neighbor 14.0.0.9 activate 
 neighbor 14.0.0.9 next-hop-self
 neighbor 14.0.0.9 route-reflector-client >>> Must be after Activate 
 exit-address-family 
 neighbor 2001:ded:beef:2::1 remote-as 65000
 address-family ipv6 unicast 
 redistribute connected
 maximum-paths 4 
 maximum-paths ibgp 4 
 neighbor 2001:ded:beef:2::1 activate 
 neighbor 2001:ded:beef:2::1 next-hop-self 
 neighbor 2001:ded:beef:2::1 route-reflector-client >>> Must be after activate 
 exit-address-family 
Runtime status after change:

 cumulus@switch:$ show ip bgp neighbors 14.0.0.9
 BGP neighbor is 14.0.0.9, remote AS 65000, local AS 65000, internal link
 BGP version 4, remote router ID 0.0.0.7
 BGP state = Established, up for 00:13:59
 Last read 22:35:13, hold time is 180, keepalive interval is 60 seconds
 Neighbor capabilities:
 4 Byte AS: advertised and received
 Route refresh: advertised and received(old & new)
 Address family IPv4 Unicast: advertised and received
 Message statistics:
 Inq depth is 0
 Outq depth is 0
 Sent Rcvd
 Opens: 1 1
 Notifications: 0 0
 Updates: 2 1
 Keepalives: 15 14
 Route Refresh: 0 0
 Capability: 0 0
 Total: 18 16
 Minimum time between advertisement runs is 5 seconds
 For address family: IPv4 Unicast
 Route-Reflector Client >>>>>>>>>> PLEASE NOTE ME
 NEXT_HOP is always this router
 Community attribute sent to this neighbor(both)
 6 accepted prefixes
 Connections established 1; dropped 0
 Last reset never
 Local host: 14.0.0.10, Local port: 38813
 Foreign host: 14.0.0.9, Foreign port: 179
 Nexthop: 14.0.0.10
 Nexthop global: 2001:ded:beef:2::2
 Nexthop local: fe80::202:ff:fe00:6
 BGP connection: non shared network
 Read thread: on Write thread: off
 cumulus@switch:$
RN-65 Virtual links in Quagga's OSPFv2 are non-operational

Cumulus Networks testing has identified too many issues with virtual link support in Quagga's OSPFv2. The feature is unsupported.
RN-70 ACL: Bridge traffic that matches a LOG ACTION rule is not logged in syslog

For example, a bridge is configured with switch ports swp1, swp2, and swp3 as bridge members, along with ACL rules to LOG and DROP ICMP traffic.

Ping requests are sent from host1 on swp1 to host3 on swp3, and the following is observed:
* Counters for both the LOG and DROP ACL rules increment properly, but the packets do not show up in /var/log/syslog.
* Packets that are copied to the CPU from hardware for the LOG rule are dropped due to the check in the kernel that disables software bridging for hardware-bridged packets.
RN-77 New routes/ECMPs can evict existing/installed routes

Cumulus Linux syncs routes between the kernel and the switching silicon. If the required resource pools in hardware fill up, new kernel routes can cause existing routes to move from being fully allocated to being partially allocated.

In order to avoid this, routes in the hardware should be monitored and kept below the ASIC limits.

For example, on systems with Trident+ chips, the limits are as follows:
 routes: 16384 <<<< if all routes are ipv4 
 long mask routes 256 <<<< i.e., routes with a mask longer than the route mask limit 
 route mask limit 64
 host_routes: 8192 
 ecmp_nhs: 4044 
 ecmp_nhs_per_route: 52 
That translates to about 77 routes with ECMP NHs, if every route has the maximum ECMP NHs.
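
The 77-route figure is the integer quotient of the two ECMP limits listed above. A quick shell check, using only those two numbers (the variable names are illustrative):

```shell
# Trident+ limits quoted in the list above
ecmp_nhs=4044            # total ECMP next hops available
ecmp_nhs_per_route=52    # maximum ECMP next hops per route

# Integer division: number of routes that fit when every route
# uses the maximum number of ECMP next hops
max_full_ecmp_routes=$(( ecmp_nhs / ecmp_nhs_per_route ))
echo "$max_full_ecmp_routes"   # prints 77
```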

Monitoring this in Cumulus Linux is performed via the cl-resource-query command:
 cumulus@switch:~# sudo cl-resource-query
 hosts : 3 
 all routes : 29 
 IP4 routes : 17 
 IP6 routes : 12 
 nexthops : 3 
 ecmp_groups : 0
 ecmp_nexthops : 0
 mac entries : 0 / 131072 
 bpdu entries : 500 / 512 
The resource to monitor is the ecmp_nexthops. If this count is close to 4044, new ECMPs may evict existing routes.
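
One way to watch this counter is to filter the cl-resource-query output shown above. The following is a sketch only: the function name check_ecmp_nexthops and the 90 percent warning threshold are assumptions, the default limit of 4044 is the Trident+ value quoted in this entry, and the parsing assumes the "name : value" layout of the sample output.

```shell
# Hypothetical helper: read cl-resource-query output on stdin and warn when
# ecmp_nexthops reaches a chosen percentage of the hardware limit.
# Exit status: 0 = below threshold, 1 = at or above threshold, 2 = not found.
check_ecmp_nexthops() {
    limit="${1:-4044}"       # ecmp_nhs limit (Trident+ value from this entry)
    threshold="${2:-90}"     # warning threshold in percent (an assumption)
    awk -F: -v limit="$limit" -v pct="$threshold" '
        $1 ~ /^[ \t]*ecmp_nexthops[ \t]*$/ {
            used = $2 + 0
            found = 1
            if (used * 100 >= limit * pct) {
                printf "WARNING: ecmp_nexthops %d of %d\n", used, limit
                status = 1
            } else {
                printf "OK: ecmp_nexthops %d of %d\n", used, limit
            }
        }
        END {
            if (!found) { print "ecmp_nexthops not found"; exit 2 }
            exit status
        }
    '
}

# Usage on the switch:
#   sudo cl-resource-query | check_ecmp_nexthops
```

On platforms with different limits, pass the limit as the first argument instead of relying on the default.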
RN-88 SNMP support for Quagga is NOT provided in Cumulus Linux

Cumulus Linux does not provide SNMP support for Quagga.
RN-120 ethtool LED blinking does not work with switch ports

Linux uses ethtool -p to identify the physical port backing an interface, or to identify the switch itself. Usually this identification is done by blinking the port LED until ethtool -p is stopped.

This feature does not apply to switch ports (swpX) in Cumulus Linux.
RN-121 PTMD: When a physical interface is in a PTM FAIL state, its subinterface still exchanges information

Issue:
When PTMD incorrectly reports a failure state and the Zebra interface is enabled, BGP sessions on the physical interface (PIF) do not establish routes, but sessions on the subinterfaces on top of it do.

If a subinterface is configured on the physical interface and the physical interface is incorrectly marked as being in a PTM FAIL state, routes on the physical interface are not processed in Quagga, but the subinterface keeps working.

Steps to reproduce:
cumulus@switch:$ sudo vtysh -c 'show int swp8' 
Interface swp8 is up, line protocol is up
PTM status: fail
index 10 metric 1 mtu 1500
flags: <UP,BROADCAST,RUNNING,MULTICAST>
HWaddr: 44:38:39:00:03:88
inet 12.0.0.225/30 broadcast 12.0.0.227
inet6 2001:cafe:0:38::1/64
inet6 fe80::4638:39ff:fe00:388/64
cumulus@switch:$ ip addr show | grep swp8
10: swp8: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc pfifo_fast state UP qlen 500
inet 12.0.0.225/30 brd 12.0.0.227 scope global swp8
104: swp8.2049@swp8: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc noqueue state UP
inet 12.0.0.229/30 brd 12.0.0.231 scope global swp8.2049
105: swp8.2050@swp8: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc noqueue state UP
inet 12.0.0.233/30 brd 12.0.0.235 scope global swp8.2050
106: swp8.2051@swp8: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc noqueue state UP
inet 12.0.0.237/30 brd 12.0.0.239 scope global swp8.2051
107: swp8.2052@swp8: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc noqueue state UP
inet 12.0.0.241/30 brd 12.0.0.243 scope global swp8.2052
108: swp8.2053@swp8: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc noqueue state UP
inet 12.0.0.245/30 brd 12.0.0.247 scope global swp8.2053
109: swp8.2054@swp8: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc noqueue state UP
inet 12.0.0.249/30 brd 12.0.0.251 scope global swp8.2054
110: swp8.2055@swp8: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc noqueue state UP
inet 12.0.0.253/30 brd 12.0.0.255 scope global swp8.2055
cumulus@switch:$
bgp sessions:
12.0.0.226 ,4 ,64057 , 958 , 1036 , 0 , 0 , 0 ,15:55:42, 0, 10472
12.0.0.230 ,4 ,64058 , 958 , 1016 , 0 , 0 , 0 ,15:55:46, 187, 10285
12.0.0.234 ,4 ,64059 , 958 , 1049 , 0 , 0 , 0 ,15:55:40, 187, 10285
12.0.0.238 ,4 ,64060 , 958 , 1039 , 0 , 0 , 0 ,15:55:45, 187, 10285
12.0.0.242 ,4 ,64061 , 958 , 1014 , 0 , 0 , 0 ,15:55:46, 187, 10285
12.0.0.246 ,4 ,64062 , 958 , 1016 , 0 , 0 , 0 ,15:55:46, 187, 10285
12.0.0.250 ,4 ,64063 , 958 , 1029 , 0 , 0 , 0 ,15:55:43, 187, 10285
12.0.0.254 ,4 ,64064 , 958 , 1036 , 0 , 0 , 0 ,15:55:44, 187, 10285


RN-125 Network LSA with an old router ID isn't flushed out by the originator

Issue:
When the router ID is changed, the router should remove the previous network LSA (link-state advertisement) that it generated based on the IP address on the interface in the Network LSA.

Resolution:
Cumulus Linux does not remove this LSA, so it ages out naturally.
RN-132 You must run "apt-get update" before running any apt-get commands or after changing sources.list

Before running any apt-get commands, or after changing the sources.list file in /etc/apt, you need to run apt-get update.

RN-133 Interface names in Cumulus Linux cannot exceed 15 characters

Device names, including interface names, in Cumulus Linux cannot exceed 16 characters – including the terminator. Cumulus Linux truncates longer interface names.

To avoid this issue, do not assign long names to your interfaces.

The following example configuration reproduces this issue:

cumulus@switch:/sys/class/net$ grep 'iface br' /etc/network/interfaces 
iface br2-pubmgmt inet static
iface br3-prvmgmt inet manual
iface br400-quarantine inet manual
iface br401-peering-1k5 inet manual
iface br402-peering-9k inet manual
iface br500-pi-exa inet manual
iface br501-akamai-exa inet manual
iface br502-exa-internetfactory inet manual
cumulus@switch:/sys/class/net$ brctl show | grep br
bridge name	bridge id	 STP enabled	interfaces
br2-pubmgmt	 8000.089e01cebe37	no	 bond0.2
br3-prvmgmt	 8000.089e01cebe3a	no	 bond0.3
br400-quarantin	 8000.089e01cebe37	no	 bond0.400
br401-peering-1	 8000.089e01cebe3a	no	 bond0.401 <<<
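
To catch names that would be truncated before they are applied, the interfaces file can be scanned for iface names longer than 15 characters. This is a minimal sketch; the function name find_long_ifnames is hypothetical, and it only looks at lines that begin with the iface keyword.

```shell
# Hypothetical helper: read an ifupdown-style interfaces file on stdin and
# print each iface name longer than 15 characters, followed by its length.
find_long_ifnames() {
    awk '$1 == "iface" && length($2) > 15 { print $2, length($2) }'
}

# Usage on the switch:
#   find_long_ifnames < /etc/network/interfaces
```

For the example configuration above, this reports br400-quarantine, br401-peering-1k5, br402-peering-9k, br501-akamai-exa and br502-exa-internetfactory.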
RN-134 Installing Chef under Cumulus Linux

The Cumulus Linux 2.2 repository contains two versions of Chef, the automation tool: 11.6.2 (the current version) and 10.30.4.

To install the latest version, connect to the switch and use apt-get:

cumulus@switch:~# sudo apt-get install chef

To install 10.30.4, connect to the switch and use apt-get:

cumulus@switch:~# sudo apt-get install chef=10.30.4-0.debian.7.3 
RN-162 Priority Flow Control doesn't work on Trident II switches

Priority Flow Control (PFC) configuration is not correct for switches on the Trident II platform. As a result, PFC doesn't work.

There is no workaround at this time.

RN-163 VXLAN: ovsdb-server cannot select loopback interface as source IP address, causing TOR registration to the controller to fail

In a VXLAN using VMware NSX, ovsdb-server cannot select the loopback interface as the source IP address. This causes TOR registration to the controller to fail.

To work around this issue, run:

cl-bgp redistribute add connected
RN-165 Quanta LY6 switch has memory parity error _soc_mem_array_sbusdma_read: L2_ENTRY.ipipe0 failed(ERR)

On a Quanta LY6 switch, you may see some memory parity errors in the switchd log that look like this:

switchd.log.1.gz:1402208356.479833 2014-06-08 06:19:16 
 sync.c:2803 IPv4 Route Summary (90) : 0 Added, 1 Deleted, 0 Updated in 30943 usecs
switchd.log.1.gz:1402208356.521537 2014-06-08 06:19:16 
 hal_acl_bcm.c:2352 ACL: installation succeeded, switched over
switchd.log.1.gz:1402208357.679543 2014-06-08 06:19:17 sync.c:2803 
 IPv4 Route Summary (91) : 1 Added, 299 Deleted, 0 Updated in 220336 usecs
switchd.log.1.gz:1402208357.719811 2014-06-08 06:19:17 
 hal_acl_bcm.c:2352 ACL: installation succeeded, switched over
switchd.log.1.gz:1402208359.486625 2014-06-08 06:19:19 sync.c:2803 
 IPv4 Route Summary (92) : 299 Added, 0 Deleted, 0 Updated in 200425 usecs
switchd.log.1.gz:1402208361.042769 2014-06-08 06:19:21 
 hal_acl_bcm.c:2352 ACL: installation succeeded, switched over
switchd.log.1.gz:1402208419.059525 2014-06-08 06:20:19 hal_bcm.c:408 
 caught a parity error of type parity data error: 0x4000001, 0x1c0043cd
switchd.log.1.gz:1402208419.059594 2014-06-08 06:20:19 switchd.c:533 
 No switchd restart: restart trigger has been disabled
switchd.log.1.gz:1402208419.061002 2014-06-08 06:20:19 hal_bcm.c:408 
 caught a parity error of type corrected data error: 0x7d6, 0x43cd
switchd.log.1.gz:1402208419.061032 2014-06-08 06:20:19 switchd.c:533 
 No switchd restart: restart trigger has been disabled
switchd.log.1.gz:1402208419.061091 2014-06-08 06:20:19 
 hal_bcm_console.c:169 WARN STATUS: 0x00000083
switchd.log.1.gz:OPCODE: 0x1c110200
switchd.log.1.gz:START ADDR: 0x04790180
switchd.log.1.gz:CUR ADDR: 0x1c004394
switchd.log.1.gz:_soc_mem_array_sbusdma_read: L2_ENTRY.ipipe0 failed(ERR)
switchd.log.1.gz:H/W received sbus nack with error bit set.
switchd.log.1.gz:Unit: 0 
switchd.log.1.gz:
switchd.log.1.gz:Mem: Parity error..
switchd.log.1.gz:Error in: SBUS transaction.
switchd.log.1.gz:Blk: 1, Pipe: 0, Address: 0x1c0043cd, base: 0x0, stage: 7, index: 17357

While troubleshooting this issue, the error occurred only once. If you encounter this error more than once, please submit a support request.

RN-179

10GTek 10G SR cables exhibit high rate of errors on Penguin Arctica 4804X switch

Some PHY-less Penguin Arctica 4804X platforms using 10GTek 10G MM SR cables exhibit high rates of errors and low bandwidth in one direction.

RN-181

ECMP paths not inserted for directly connected unnumbered neighbors

Cumulus Linux does not insert multiple paths for directly connected unnumbered neighbors into hardware; it inserts only one, as determined by cl-route-check -V.

This may present a problem in VXLAN configurations where a VTEP's neighbors are directly adjacent (that is, the spine switch is the VTEP) and you want to use ECMP for the tunneled traffic. If only your leaf switches are the VTEPs, this issue will not occur.

RN-183

Link not coming up (NO-CARRIER) on Penguin Arctica 4804xp with CAB-10GSFP-P9M 10Gtek 9 meter cable

The link does not come up on a Penguin Arctica 4804xp 720G PHY-less switch with 10Gtek 9 meter cables (CAB-10GSFP-P9M). You can determine this by running:

root@switch:~# ip link show swp15
17: swp15: <NO-CARRIER,BROADCAST,MULTICAST,UP> mtu 1500 qdisc pfifo_fast state DOWN mode DEFAULT qlen 500
    link/ether 44:38:39:00:70:a1 brd ff:ff:ff:ff:ff:ff

Cumulus Linux supports copper passive cables no longer than 7m.

RN-187

Rolling back installation causes Debian packages to be unusable

When you run apt-get install, Cumulus Linux creates a new snapshot automatically if at least one file is added to /etc. If you roll back the installation using cl-persistify, the installer reverts the /etc directory to the specific snapshot, but does not do anything with installed packages.

As a result, these packages will be unusable because of missing configuration files. The Cumulus Linux installer does not record which Debian packages changed after a rollback. As a workaround, you can try installing again.

RN-192

PTM may crash when a large topology file has a syntax error

If there is a syntax error in a large (~ 4000 entries) topology.dot file, PTM may crash while reading/parsing this file. This can occur because the libcgraph API that PTM uses for parsing the file cannot handle the error. You should check topology.dot for syntax errors using a Graphviz tool like dot or dotty that can identify the syntax error.

RN-196 For a VXLAN in NSX, ovsdb-server cannot select a loopback interface as the SRC IP

As a result, the TOR registration to the controller fails.

To work around this issue, run:

cl-bgp redistribute add connected
RN-198 Port LEDs behave differently on different switch models

It's been observed that port LEDs behave differently depending upon the make and model of the switch. For example:

  • Agema AG-7448CU: the LED is off when the link is up. It blinks on briefly when there is traffic.
  • Edge-Core AS4600-54T: the LED is off when the link is up. It blinks on briefly when there is traffic.
  • QuantaMesh T3048-LY2R: the LED is on when the link is up. It blinks off briefly when there is traffic.

Cumulus Networks is currently working to fix this issue.

RN-199 When a Quagga route-map is modified, the switch could use the partial map before edits are completed

Cumulus Linux triggers a route-map update before the user finishes editing the route map, resulting in an incorrect route map being used. The route-map update trigger should only occur when the user finishes editing the map.

Cumulus Networks is working to fix this issue.

RN-200 On an Edge-Core AS5610-52X switch with long reach QSFP cables, link stays down

To work around this issue, create a file in /etc/network/ called qsfphp, and populate it with the content below. Then run the following for each swp with these high-powered cables:

qsfphp swp#

The contents of qsfphp:

#!/bin/bash
# Software override of the power settings on a QSFP cable.

usage() {
    echo "Software override power settings for QSFP cables allowing high power"
    echo "(>1.5W) operation."
    echo "Usage: $0 swp#"
    exit 1
}

# Exactly one argument, the switch port name, is required.
if [[ "$#" -ne 1 ]]; then
    usage
fi

interface="$1"

if [[ ${interface:0:3} != "swp" ]]; then
    usage
fi

# Find the EEPROM device whose label matches this port number.
port=${interface:3}
eeprom_dev=$(grep -l "^port${port}\$" /sys/class/eeprom_dev/*/label)

if [[ -z "$eeprom_dev" ]]; then
    echo "no such interface: $interface"
    exit 1
fi

eeprom="$(dirname "$eeprom_dev")/device/eeprom"

# The first EEPROM byte is the module identifier; the script expects 0x0d.
identifier=$(dd if="$eeprom" bs=1 count=1 2>/dev/null | hexdump -e '/1 "0x%02x"')

if [[ "$identifier" != "0x0d" ]]; then
    echo "$interface is not a QSFP cable"
    exit 1
fi

# Write 0x01 to byte 93 of the EEPROM to apply the power override.
if echo -en '\x01' | dd of="$eeprom" seek=93 bs=1 count=1 2>/dev/null; then
    echo done
else
    echo failed
    exit 1
fi
RN-204 QuantaMesh T3048-LY8 switch can hang on reboot

During multiple reboot cycles, it's been observed under rare conditions that the QuantaMesh LY8 hangs on boot.

Cumulus Networks implemented a workaround to significantly reduce the probability of such a boot failure.

If the LY8 does hang on boot, a subsequent reboot will result in the switch booting correctly.

RN-210 Restarting or clearing BGP can lead to SYN flooding warning, triggering an unnecessary alarm

When you run clear bgp * or restart BGP, this message can appear on the console (in vtysh):

179 dynamic neighbor(s), limit 90
switch07# clear bgp *
switch07#TCP: Possible SYN flooding on port 179. Sending cookies. Check SNMP counters.

The warning also appears in syslog:

Sep 18 17:27:40 switch07 kernel: TCP: Possible SYN flooding on port 179. Sending cookies. Check SNMP counters.

This issue occurs when the maximum number of socket connections accepted on a listening socket is reached.

You can safely ignore this warning. To silence it, you can increase the number of socket connections by changing the value of net.core.somaxconn, which defaults to 128. If you increase net.core.somaxconn to a larger number, the issue no longer occurs.

RN-211 /etc/default/clagd file does not get generated if subinterface MTU setting is different than the parent interface setting

Consider this configuration for CLAG peerlink interfaces, defined in /etc/network/interfaces. Note that there is no MTU setting for peerlink, the parent interface:

auto peerlink 
iface peerlink
    bond-slaves swp47 swp48
    bond-mode 802.3ad
    bond-miimon 100
    bond-use-carrier 1
    bond-lacp-rate 1
    bond-min-links 1 
    bond-xmit-hash-policy layer3+4  

auto peerlink.4000 
iface peerlink.4000
    address 169.254.0.2/30
    mtu 9000
    clagd-peer-ip 169.254.0.1
    clagd-sys-mac 44:38:39:FF:00:00
    clagd-priority 8192 

When you try to bring up the subinterface, the following error occurs:

cumulus@leaf01:/etc/default$ sudo ifup peerlink.4000 
... 
error: peerlink.4000 : failed to execute cmd 'ip -force -batch - [link set dev peerlink.4000 mtu 9000 ]'(RTNETLINK answers: Numerical result out of range 
Command failed -:1) 

To work around this issue, make sure the interface peerlink also has mtu 9000 set. This allows for the creation of the clagd file.
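
A configuration can be checked for this mismatch before running ifup. The sketch below is illustrative only: the function name check_subif_mtu is hypothetical, it treats everything before the first dot in an interface name as the parent, and it assumes a parent without an explicit mtu line uses 1500.

```shell
# Hypothetical helper: read an ifupdown-style interfaces file on stdin and
# report subinterfaces whose mtu is larger than the parent interface mtu.
# Exit status is 1 if any mismatch is found, 0 otherwise.
check_subif_mtu() {
    awk '
        $1 == "iface" { cur = $2; seen[cur] = 1; next }   # start of a stanza
        $1 == "mtu" && cur != "" { mtu[cur] = $2 + 0 }    # mtu inside a stanza
        END {
            for (name in seen) {
                dot = index(name, ".")
                if (dot == 0 || !(name in mtu)) continue  # not a subinterface
                parent = substr(name, 1, dot - 1)
                # Assumption: a parent with no mtu line defaults to 1500
                pmtu = (parent in mtu) ? mtu[parent] : 1500
                if (mtu[name] > pmtu) {
                    printf "%s mtu %d exceeds %s mtu %d\n", name, mtu[name], parent, pmtu
                    bad = 1
                }
            }
            exit bad
        }
    '
}

# Usage on the switch:
#   check_subif_mtu < /etc/network/interfaces
```

For the peerlink configuration above, the check flags peerlink.4000; after mtu 9000 is added to the peerlink stanza it reports nothing.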

RN-212 clagd 'kill -9 `pgrep clagd` ' turns /bin/bridge monitor fdb into a runaway process

If the clagd daemon is killed with signal 9 (killall -9 clagd), the /bin/bridge process started by clagd will not be terminated, and must be terminated manually before restarting clagd. To find the appropriate bridge process, run:

cumulus@switch:~$ sudo ps -ef | grep 'bridge monitor'

You should see output similar to this:

root 6184 1 0 03:23 ? 00:00:00 /bin/bridge monitor fdb

Then run kill PID to kill this bridge process (in this example, 6184 is the PID of the /bin/bridge process above):

cumulus@switch:~$ sudo kill 6184 

Or, you can run killall bridge, if no other bridge processes are running except the one started by clagd.
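
The PID lookup can also be scripted. The sketch below filters ps -ef output on the command columns, so the grep process itself is never matched; the function name bridge_monitor_pids is hypothetical, and the column positions assume the ps -ef layout shown in the sample output above.

```shell
# Hypothetical helper: read `ps -ef` output on stdin and print the PID of
# every "/bin/bridge monitor fdb" process.
bridge_monitor_pids() {
    awk '$8 == "/bin/bridge" && $9 == "monitor" && $10 == "fdb" { print $2 }'
}

# Usage on the switch, before restarting clagd:
#   for pid in $(ps -ef | bridge_monitor_pids); do sudo kill "$pid"; done
```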

RN-214 ifupdown2: when empty bonds are part of a bridge port, the VLAN configuration is incomplete

When a switch port is set incorrectly and that swp is a member of a bond, the bond gets created as an empty bond. When an empty bond is added to a bridge, the bridge fails.

As a result, other interfaces like VLANs do not get configured fully. This blackholes traffic on the network.

RN-215 When bond members are added or deleted using the ip link command, the egress mask is not updated

When bond members are added or deleted using ip link commands, the egress mask (egr_mask) is not updated. To work around this issue, bring down the bond then bring it up manually.

cumulus@switch:~$ sudo ip link set spine1-2 down
cumulus@switch:~$
cumulus@switch:~$ sudo bcm dump chg egr_mask
cumulus@switch:~$
cumulus@switch:~$ sudo ip link set spine1-2 up
cumulus@switch:~$
cumulus@switch:~$ sudo bcm dump chg egr_mask
cumulus@switch:~$
cumulus@switch:~$ sudo bcm dump chg egr_mask
EGRESS_MASK.ipipe0[51]: <EGRESS_MASK_W1=0x60000,EGRESS_MASK=0x000006000000000000>
EGRESS_MASK.ipipe0[52]: <EGRESS_MASK_W1=0x60000,EGRESS_MASK=0x000006000000000000>
cumulus@switch:~$

RN-216 Broadcast traffic in a bridge gets sent to the CPU even if the bridge doesn't have SVI enabled

Any time a bridge or VLAN interface is created, like bridge br0 or VLAN br0.100, that interface enables a switch virtual interface (SVI), and broadcast traffic in the bridge is sent up to the CPU.

With a VLAN-unaware bridge, a bridge interface is necessary to do bridging. Thus, an SVI gets automatically created, and broadcast traffic is always sent to the CPU.

Normally, with a VLAN-aware bridge, the SVI gets enabled only when a VLAN (like br0.100) is created, so the broadcast traffic is not always sent to the CPU.

Note that for both VLAN-aware and VLAN-unaware bridges, this behavior applies only to broadcast traffic and nothing else.

However, there is a known issue where broadcast traffic will get sent to the CPU even if a VLAN interface is not configured in a VLAN-aware bridge. It will be fixed in a future Cumulus Linux release.

RN-217 LNV: Network restart removes vxsnd anycast IP address from loopback interface

Given the following conditions:

  • You have not configured a loopback anycast IP address in /etc/network/interfaces
  • You enabled the vxsnd (service node daemon) log to automatically add anycast IP addresses

When you restart networking (with service networking restart), the anycast IP address gets removed from the loopback interface.

To prevent this issue, specify an anycast IP address for the loopback interface in both /etc/network/interfaces and vxsnd.conf. This way, if vxsnd fails, you can withdraw the IP address.

RN-219 SNMP getting incorrect ifSpeed and ifHighSpeed from 40G interfaces; 10G interfaces are correct

RN-220 Error after running snmpwalk: OID not increasing: iso.3.6.1.2.1.4.24.4.1.1.100.1.0.0.255.255.0.0.0.0.0.0.0; unknown interface in /proc/net/ipv6_route (( '#020#021<P��~P#017~Z')

RN-221 BGP graceful restart, including helper mode, not fully supported

If you encounter issues with this, please submit a support request and include the output from cl-support with your ticket.

RN-222 lscpu command returns error on PowerPC switches

Running the lscpu -p command returns an error:

root@switch:~# lscpu -p
lscpu: error: cannot open /sys/devices/system/cpu/cpu0/cache/index2/size: No such file or directory

To work around this issue, use cat /proc/cpuinfo:

root@switch:~# cat /proc/cpuinfo 
processor       : 0
cpu             : e500v2
clock           : 1200.000000MHz
revision        : 5.1 (pvr 8021 1051)
physical id     : 0
core id         : 0
bogomips        : 150.00

processor       : 1
cpu             : e500v2
clock           : 1200.000000MHz
revision        : 5.1 (pvr 8021 1051)
physical id     : 0
core id         : 1
bogomips        : 150.00

total bogomips  : 300.00
timebase        : 75000000
platform        : Delta Networks, Inc  ET-7448
model           : dni,et-7448bf
Vendor          : Freescale Semiconductor
PVR             : 0x80211051
SVR             : 0x80ea0021
PLL setting     : 0x4
Memory          : 2048 MB
Memory          : 2048 MB
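
Scripts that relied on lscpu -p to count CPUs can read the same information from /proc/cpuinfo. The sketch below is one way to do that; the function name count_processors is hypothetical, and it assumes the processor : N line format shown in the output above.

```shell
# Hypothetical helper: reads /proc/cpuinfo-style text on stdin and prints
# the number of "processor : N" entries it contains.
count_processors() {
    # Split on ":" so that only lines whose key is exactly "processor"
    # (followed by padding) are counted.
    awk -F: '$1 ~ /^processor[ \t]*$/ { n++ } END { print n + 0 }'
}
```

For example, count_processors < /proc/cpuinfo on the switch shown above would count the two processor entries.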
RN-223 ifupdown2: For VRR, changing the primary IP address flips the order of primary and secondary addresses in the routing table

VRR requires that a physical interface be created before any virtual MAC-VLAN devices. If you change the IP address on a physical interface and run ifup bridge.xx, the device order changes and the switch uses the virtual MAC-VLAN device to resolve host ARP entries. In a CLAG setup, if a virtual device is used to resolve ARP, you may not receive an ARP reply, since the virtual MACs are the same on both switches.

To work around this issue, bring down the virtual MAC-VLAN device whenever its parent is down and bring it back up whenever its parent is up.

cumulus@switch:~$ sudo ip route | grep 10.0.0.
10.0.0.0/24 dev bridge.100  proto kernel  scope link  src 10.0.0.2 
10.0.0.0/24 dev bridge-100-v0  proto kernel  scope link  src 10.0.0.1 
cumulus@switch:~$ 
cumulus@switch:~$ sudo vi /etc/network/interfaces
cumulus@switch:~$  
cumulus@switch:~$ sudo ifup -a
warning: /etc/network/interfaces: cannot find source file /etc/network/interfaces.d/*.if
cumulus@switch:~$  
cumulus@switch:~$ sudo ip route | grep 10.0.0.
10.0.0.0/24 dev bridge-100-v0  proto kernel  scope link  src 10.0.0.1 
10.0.0.0/24 dev bridge.100  proto kernel  scope link  src 10.0.0.254 
cumulus@switch:~$  
cumulus@switch:~$ ping 10.0.0.4
PING 10.0.0.4 (10.0.0.4) 56(84) bytes of data.
From 10.0.0.1 icmp_seq=9 Destination Host Unreachable
From 10.0.0.1 icmp_seq=10 Destination Host Unreachable
From 10.0.0.1 icmp_seq=11 Destination Host Unreachable
From 10.0.0.1 icmp_seq=20 Destination Host Unreachable
From 10.0.0.1 icmp_seq=21 Destination Host Unreachable
From 10.0.0.1 icmp_seq=22 Destination Host Unreachable
^C
--- 10.0.0.4 ping statistics ---
25 packets transmitted, 0 received, +6 errors, 100% packet loss, time 24002ms
pipe 3
cumulus@switch:~$ 
cumulus@switch:~$ sudo ifdown bridge.100
cumulus@switch:~$ 
cumulus@switch:~$ sudo ip route | grep 10.0.0.
10.0.0.0/24 via 10.0.5.2 dev peer6.4000  proto zebra  metric 20 
cumulus@switch:~$ 
cumulus@switch:~$ sudo ifup bridge.100
warning: /etc/network/interfaces: cannot find source file /etc/network/interfaces.d/*.if
cumulus@switch:~$ 
cumulus@switch:~$ sudo ip route | grep 10.0.0.
10.0.0.0/24 dev bridge.100  proto kernel  scope link  src 10.0.0.254 
10.0.0.0/24 dev bridge-100-v0  proto kernel  scope link  src 10.0.0.1 
cumulus@switch:~$ 
cumulus@switch:~$ ping 10.0.0.4
PING 10.0.0.4 (10.0.0.4) 56(84) bytes of data.
64 bytes from 10.0.0.4: icmp_req=1 ttl=64 time=1.70 ms
64 bytes from 10.0.0.4: icmp_req=2 ttl=64 time=0.582 ms
64 bytes from 10.0.0.4: icmp_req=3 ttl=64 time=0.599 ms
64 bytes from 10.0.0.4: icmp_req=4 ttl=64 time=0.528 ms
^C
--- 10.0.0.4 ping statistics ---
4 packets transmitted, 4 received, 0% packet loss, time 3000ms
rtt min/avg/max/mdev = 0.528/0.853/1.706/0.494 ms
cumulus@switch:~$ 
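
To detect the flipped order without reading the routing table by eye, a sketch like the following prints the device of the first route listed for a prefix. The function name first_route_dev is hypothetical, and the sketch assumes the ip route output format shown in the session above.

```shell
# Hypothetical helper: reads `ip route` output on stdin and prints the device
# of the first directly connected route listed for the given prefix.
first_route_dev() {
    awk -v prefix="$1" '$1 == prefix && $2 == "dev" { print $3; exit }'
}
```

For example, sudo ip route | first_route_dev 10.0.0.0/24 prints bridge.100 when the order is correct. If it prints the virtual device (bridge-100-v0 in the session above), the order has flipped and the workaround applies.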

RN-224 bgpd crashes when unbinding a peer from its peer group

bgpd may encounter an exception if a peer is disassociated from its peer group. If you need to do this, Cumulus Networks recommends you delete the peer and then reconfigure it as a standalone peer.

RN-225 cl-bgp neighbor restart doesn't work for interface peer

For example:

cumulus@switch:~$ sudo cl-bgp summary
BGP router identifier 10.10.0.1, local AS number 100
Read-only mode update-delay limit: 60 seconds
  First neighbor established: 2015/01/07 02:55:45.117
          Best-paths resumed: 2015/01/07 02:55:46.115
        zebra update resumed: 2015/01/07 02:55:46.165
        peers update resumed: 2015/01/07 02:55:46.165
BGP table version 704
RIB entries 1407, using 165 KiB of memory
Peers 8, using 133 KiB of memory
Peer groups 2, using 112 bytes of memory

Neighbor        V    AS MsgRcvd MsgSent   TblVer  InQ OutQ Up/Down  State/PfxRcd
swp32s3         4   400     184     187        0    0    0 02:56:06      102
swp32s0         4   100     184     182        0    0    0 02:56:11      502
swp32s1         4   100     184     182        0    0    0 02:56:11      502
swp32s2         4   100     184     183        0    0    0 02:56:10      502
swp1s1          4   100     184     183        0    0    0 02:56:11      502
swp1s2          4   100     184     183        0    0    0 02:56:11      502
swp1s3          4   100     184     183        0    0    0 02:56:11      502
swp1s0          4   100     184     182        0    0    0 02:56:11      502

Total number of neighbors 8
cumulus@switch:~$ sudo cl-bgp neighbor restart swp1s0 
swp1s0 is not a valid IPv4/v6 address
cumulus@switch:~$ sudo cl-bgp neighbor restart ?      
all peer-group as external swp32s3 swp32s0 swp32s1 swp32s2 swp1s1 swp1s2 swp1s3 swp1s0 swp32s3 swp32s0 swp32s1 swp32s2 swp1s1 swp1s2 swp1s3 swp1s0
cumulus@switch:~$  

RN-226 cl-bgp distance doesn't accept add/del

RN-227 BGP dynamic capability is not supported

BGP peer sessions with dynamic capability are not supported under any version of Cumulus Linux at this time.

RN-228 ifupdown2 doesn't bring up loopback interface (lo) by default

ifupdown2 does not bring up the loopback interface (lo) by default.

One effect is that, because monit uses the loopback interface, monit commands stop working if you delete the lo interface from /etc/network/interfaces.

RN-229 When a bond subinterface that is part of a non-VLAN-aware bridge is brought down, it flaps that bridge

This issue has been encountered in environments where both VLAN-aware and non-VLAN-aware bridges are in use, where a non-VLAN-aware bridge has a subinterface of a bond that is present as a normal interface in a VLAN-aware bridge.

RN-230 bond-min-links defaults to 0, which causes the bond carrier to be incorrectly up when no slaves are active

Even though bond-min-links defaults to 0, Cumulus Networks recommends a minimum link setting of 1 or higher for bonds. Cumulus Linux issues a warning when the setting is 0.

RN-243 Links don't come up after rebooting a switch

The following layer 1 commands are not downloaded to the switch ports during bootup:

link-autoneg
link-duplex
link-speed

As a result, the affected ports don't come up.

To work around this issue, do one of the following:

  • Run ifreload -a to reload the configuration of all interfaces.
  • Run ifreload swpXX to reload the configuration only for the affected interface.

The ifreload command pushes the layer 1 configuration down to the hardware.

To have the workaround persist across reboots of the switch, add the following line to the /etc/rc.local file:

ifreload -a

The rc.local file is executed at the end of the boot procedure, so this forces a reload of the interface configurations and works around this issue.

RN-266 eBGP peers cannot set global next hop via outbound route-map

BGP next hop modification to a specific value using an outbound route-map does not take effect when sending updates to eBGP peers. iBGP peers are unaffected.

This issue is currently being investigated.

RN-267 BGP: When peering on link-local addresses with no global IPv6 address, a null next hop value is sent

When BGP peering is established using link-local IPv6 addresses and there are no global IPv6 addresses configured on the peering interface, the BGP speakers may exchange updates that include an unspecified next hop value in the MP_REACH_NLRI attribute. If the receiver of such a BGP update is a device that is not running Cumulus Linux, it may flag this as an error and treat the update as an implicit withdraw.

This issue is currently being investigated.

RN-268 bridge fdb add command IP address type options inconsistent with Debian

The bridge fdb add command handles the IP address type in a non-standard way:

  • bridge fdb add [MAC address] dev [INTERFACE] master defaults to a static address
  • bridge fdb add [MAC address] dev [INTERFACE] master temp maps to dynamic address
  • bridge fdb add [MAC address] dev [INTERFACE] master local maps to permanent address

This issue should be fixed in a future version of Cumulus Linux.
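
For scripts that need to account for this behavior, the mapping listed above can be written down as a small lookup. This is only a sketch: the function name fdb_entry_type is hypothetical, and it encodes nothing beyond the three cases this release note describes.

```shell
# Hypothetical helper: maps the keyword that follows
# `bridge fdb add [MAC address] dev [INTERFACE] master`
# to the entry type this release note says is installed.
fdb_entry_type() {
    case "${1:-}" in
        "")    echo static ;;     # no keyword: defaults to a static address
        temp)  echo dynamic ;;    # temp maps to a dynamic address
        local) echo permanent ;;  # local maps to a permanent address
        *)     echo unknown; return 1 ;;
    esac
}
```

For example, fdb_entry_type temp prints dynamic, and an unrecognized keyword returns a non-zero status.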

RN-270 inotify support

inotify is not supported by the overlayfs root filesystem on PowerPC platforms.


RN-372 (CM-9360) Security Update for CVE-2015-7547: glibc getaddrinfo Stack-based Buffer Overflow Vulnerability

For details on this issue and how to upgrade, read this article.