Cumulus VX 3.2.1 Release Notes

Follow

Overview

Cumulus VX is a community-supported virtual environment for cloud and network administrators to test the latest technology from Cumulus Networks, removing all organizational and economic barriers to getting started with open networking in your own time, at your own pace, and within your own environment.

The environment can be used to learn about, and evaluate, Cumulus Linux, anytime and anywhere, producing sandbox environments for prototype assessment, pre-production rollouts, and script development.

These release notes support Cumulus VX 3.2.1 and describe its features and known issues.

Stay up to date: Click Follow above so you can receive a notification when we update these release notes.

{{table_of_contents}}

What's New

Cumulus VX 3.2.1 includes the following improvement:

  • Network Command Line Utility: We've improved the syntax so it's even easier for network operators to configure Cumulus Linux with NCLU.

Cumulus VX 3.y.z is a significant departure from 2.y.z releases. See the user guide for details on new behaviors and functionality.

Downloading Cumulus VX

You can download any of the of the four Cumulus VX images: 

  • An OVA disk image for use with VirtualBox.
  • A VMware-specific OVA disk image.
  • A qcow2 disk image for use with KVM.
  • A Box image for use with Vagrant.

Configuration Notes

Keep in mind the following issues when you are running your Cumulus VX virtual machine.

SNMP Not Supported in Quagga

There is no SNMP support for Quagga in Cumulus VX. However, it's possible to get it via SNMP by:

  • Using Nagios
  • Writing a pass persist script in Perl or Python by filling in the OSPF or BGP (rfc) MIBs manually.
  • Creating your own private MIB for the information you need.

Due to this circumstance, you must remove all references to smux in each of the following configuration files. If the smux entries are present in the configuration files, the daemons in the 2.5 packaged version of Quagga will not start.

  1. cd /etc/quagga
  2. grep smux *
  3. Delete all lines in the config files containing the smux keyword.

The references to smux that must be removed are:

  • In bgpd.conf, remove this line:
    smux peer 1.3.6.1.4.1.3317.1.2.2 quagga_bgpd
  • In ospf6d.conf, remove this line:
    smux peer 1.3.6.1.4.1.3317.1.2.6 quagga_ospf6d
  • In ospfd.conf, remove this line:
    smux peer 1.3.6.1.4.1.3317.1.2.5 quagga_ospfd
  • In zebra.conf, remove this line:
    smux peer 1.3.6.1.4.1.3317.1.2.1 quagga_zebra

 Perl, Python and BDB Modules

Any Perl scripts that use the DB_File module or Python scripts that use the bsddb module won't run under Cumulus VX.

Documentation

You can read the technical documentation here.

Community Support

If you have any questions or feedback about Cumulus VX, visit the Cumulus VX community for further support. 

Issues Fixed in Cumulus VX 3.2.1

The following is a list of issues fixed in Cumulus VX 3.2.1 from earlier versions of Cumulus VX.

Release Note ID Summary Description

RN-546 (CM-14051)
netd crashes at "snapper list" after running "net show commit history"

This issue has been seen on switches that upgraded from a version of Cumulus VX earlier than 3.2.1. To work around the issue, install the cumulus-snapshot package on the switch. This activates the NCLU rollback capability.

cumulus@switch:~$ sudo apt-get install cumulus-snapshot

This issue has been fixed in Cumulus VX 3.2.1.


RN-547 (CM-14060)
First Quagga reload causes routing changes for an older configuration (earlier than version 3.0)

For older style Quagga configurations (from Cumulus VX versions earlier than 3.0), the first time Quagga is reloaded after upgrading to Cumulus VX 3.y.z, the configuration gets overwritten with the version 3.y.z defaults. This affects these two configuration settings specifically:

  • The bgp bestpath as-path multipath-relax no-as-set setting, which was replaced with no-as-set as the default, so it no longer needs to be specified. This causes all BGP sessions to be reset.
  • The IP import-table, which is used by redistribute neighbor. This causes Cumulus VX to unimport the routes and then add them back again, causing routing prefix changes throughout the network, potentially blackholing traffic.

This issue has been fixed in Cumulus VX 3.2.1.


RN-550 (CM-13674)
The ZTP daemon shuts itself down after 5 minutes of inactivity

The zero touch provisioning (ZTP) daemon ztpd shuts itself down after 5 minutes of inactivity but the service remains enabled for the next reboot.

This can affect deployments where a switch might be powered up in a remote data center for weeks without ever being configured. In such a case, there is no way to automatically initiate the ZTP process.

This issue has been fixed in Cumulus VX 3.2.1.


RN-551 (CM-14264)
Layer 3 egress rewrite information associated with wrong VLAN, causing uplinks to stop forwarding traffic toward the core

A race condition can occur where forwarding rewrite information may not get programmed correctly, when a port is configured as a bridge port and is then reconfigured as a layer 3 uplink port. In this scenario, the exact same neighbor is falsely being re-learned immediately on the reconfigured port, resulting in layer 3 egress rewrite pointing to the bridge, rather than the intended next hop.

This issue has been fixed in Cumulus VX 3.2.1.


RN-555 (CM-14069)
apt doesn't validate InRelease signatures correctly; DSA-3711, CVE-2016-1252

Jann Horn of Google Project Zero discovered that APT, the high level package manager, does not properly handle errors when validating signatures on InRelease files. An attacker able to man-in-the-middle HTTP requests to an apt repository that uses InRelease files (clearsigned Release files), can take advantage of this flaw to circumvent the signature of the InRelease file, leading to arbitrary code execution.

This issue has been fixed Debian Jessie version 1.0.9.8.4 and also in Cumulus VX 3.2.1.


RN-557 (CM-14157)
Security patch for CVE-2016-8655 af_packet.c namespace vulnerability

This is a a fix for security issue CVE-2016-8655.

It is a vulnerability that requires local access, so it's not remotely exploitable.

This issue has been fixed in Cumulus VX 3.2.1.


RN-558 (CM-14125)
Kernel panic in multicast_v4_queriers_show during ifreload -a

Cumulus VX wasn't checking for configured VLANs, which resulted in some corruption.

This issue has been fixed in Cumulus VX 3.2.1.


RN-560 (CM-13328)
Quagga sometimes installs a duplicate static route

On a Cumulus VX switch with VRF routes installed, once rebooted VRF routes are present in the kernel but are not installed into hardware.

Restarting quagga.service resolves the issue.

This issue has been fixed in Cumulus VX 3.2.1.


RN-561 (CM-13485)
Quagga does not reject subnet mask for "bgp neighbor update-source" command

The bgp neighbor update-source command is accepted without error. However, the BGP session with the neighbor is not built as no TCP/BGP packets are sent.

If the subnet mask is removed from the IP address specified in update-source, then peering forms as expected.

This issue has been fixed in Cumulus VX 3.2.1.


RN-562 (CM-13425)
BFD up/down status is not reflected in Quagga for non-default VRF single-hop BFD sessions

This issue occurs because PTM doesn’t keep track of a VRF for single-hop BFD sessions, as they are interface-based sessions, and the status up/down messages to Quagga for single-hop sessions do not contain VRF information.

The zebra daemon searches interfaces across all VRFs based on the interface name extracted from the BFD status message. So, the search does not fail in the zebra daemon. However, the bgpd/ospd interface search is done per VRF and uses the default VRF to search if no VRF is sent in the status message. So, the search fails and the BFD status changes are ignored.

This issue has been fixed in Cumulus VX 3.2.1.


RN-563 (CM-14501)
switchd fails to start if swp28=4x10G and swp33 is not explicitly disabled in ports.conf

A switchd error occurred in configurations that broke out swp28 to either 4x10GB or 4x25GB but had not explicitly disabled swp33 in ports.conf. This error led to switchd failing to start. swp33 has now been explicitly disabled in the default ports.conf configuration.

This issue has been fixed in Cumulus VX 3.2.1.


RN-565 (CM-14289)
dhcpd crash due to memory corruption

A dhcpd crash was caused by memory corruption that led to isc-dhcp restarting multiple times. This was caused by a race condition that led to stale pointer access.

This issue has been fixed in Cumulus VX 3.2.1.


RN-566 (CM-13816)
netshow interface doesn't display interfaces defined in interfaces.d/

Interfaces defined in files accessed via a source in /etc/network/interfaces are not printed when netshow interface is run.

This issue has been fixed in Cumulus VX 3.2.1.


RN-567 (CM-12685)
Breakout port speeds of 25G and 50G are incorrectly shown as (4x10G)

For a 100G port on a switch broken out as 2x50G or 4x25G, as defined in /etc/cumulus/ports.conf, running netshow interface shows the speed of the breakout ports as 50G(4x10G) or 25G(4x10G).

This issue has been fixed in Cumulus VX 3.2.1.


RN-569 (CM-13853)
Create /etc/default/isc-dhcp-relay6 by default for IPv6 support of DHCP relay

Users were previously required to create a service in order to enable IPv6 DHCP relay support. An empty /etc/default/isc-dhcp-relay6 file has been added to allow for IPv6 DHCP relay to be enabled without creating a service.

This issue has been fixed in Cumulus VX 3.2.1.


RN-571 (CM-14791)
Management VRF loses default gateway with gateway command

When a management VRF was configured using NCLU, the route to the default gateway failed and was considered invalid.

The issue was resolved by manually adding post-up ip route add vrf mgmt default via IP_ADDRESS to the /etc/network/interfaces file then running ifreload -a.

This issue has been fixed in Cumulus Linux 3.2.1.

Known Issues in Cumulus VX 3.2.1

Issues are categorized for easy review. Some issues are fixed but will be available in a later release.

Release Note ID Summary Description

RN-52 (CM-997,
CM-1013)
Parameters like the router ID and DR priority cannot be changed while OSPFv2/v3 is running Router ID and DR priority can only be changed by shutting down OSPFv2/v3, changing the ID, and restarting the OSPF process.

A change to the DR priority may not properly be reflected in the LSAs that are still aging out.

RN-56 (CM-343)
IPv4/IPv6 forwarding disabled mode not recognized

If either of the following is configured:

net.ipv4.ip_forward == 0 

or:

net.ipv6.conf.all.forwarding == 0 

The hardware still forwards packets if there is a neighbor table entry pointing to the destination.


RN-77 (CM-265)
New routes/ECMPs can evict existing/installed Cumulus VX syncs routes between the kernel and the switching silicon. If the required resource pools in hardware fill up, new kernel routes can cause existing routes to move from being fully allocated to being partially allocated.

In order to avoid this, routes in the hardware should be monitored and kept below the ASIC limits.

For example, on systems with Trident+ chips, the limits are as follows:
routes: 16384 <<<< if all routes are ipv4 
 long mask routes 256 <<<< i.e., routes with a mask longer 
       than the route mask limit 
 route mask limit 64
 host_routes: 8192 
 ecmp_nhs: 4044 
 ecmp_nhs_per_route: 52 
That translates to about 77 routes with ECMP NHs, if every route has the maximum ECMP NHs.

Monitoring this in Cumulus VX is performed via the cl-resource-query command:
cumulus@switch:~$ sudo cl-resource-query
 hosts : 3 
 all routes : 29 
 IP4 routes : 17 
 IP6 routes : 12 
 nexthops : 3 
 ecmp_groups : 0
 ecmp_nexthops : 0
 mac entries : 0 / 131072 
 bpdu entries : 500 / 512
The resource to monitor is the ecmp_nexthops. If this count is close to 4044, new ECMPs may evict existing routes.

RN-121 (CM-2123)
ptmd: When a physical interface is in a PTM FAIL state, its subinterface still exchanges information

When ptmd is incorrectly in a failure state and the Zebra interface is enabled, PIF BGP sessions are not establishing the route, but the subinterface on top of it does establish routes.

If the subinterface is configured on the physical interface and the physical interface is incorrectly marked as being in a PTM FAIL state, routes on the physical interface are not processed in Quagga, but the subinterface is working.

Steps to reproduce:

cumulus@switch:$ sudo vtysh -c 'show int swp8' 
Interface swp8 is up, line protocol is up 
PTM status: fail
index 10 metric 1 mtu 1500 
 flags: <UP,BROADCAST,RUNNING,MULTICAST>
 HWaddr: 44:38:39:00:03:88 
 inet 12.0.0.225/30 broadcast 12.0.0.227 
 inet6 2001:cafe:0:38::1/64 
 inet6 fe80::4638:39ff:fe00:388/64 
cumulus@switch:$ ip addr show | grep swp8 
 10: swp8: <BROADCAST,MULTICAST,UP,LOWER_UP> 
  mtu 1500 qdisc pfifo_fast state UP qlen 500 
  inet 12.0.0.225/30 brd 12.0.0.227 scope global swp8 
 104: swp8.2049@swp8: <BROADCAST,MULTICAST,UP,LOWER_UP> 
  mtu 1500 qdisc noqueue state UP 
  inet 12.0.0.229/30 brd 12.0.0.231 scope global swp8.2049 
 105: swp8.2050@swp8: <BROADCAST,MULTICAST,UP,LOWER_UP> 
  mtu 1500 qdisc noqueue state UP 
  inet 12.0.0.233/30 brd 12.0.0.235 scope global swp8.2050 
 106: swp8.2051@swp8: <BROADCAST,MULTICAST,UP,LOWER_UP> 
  mtu 1500 qdisc noqueue state UP 
  inet 12.0.0.237/30 brd 12.0.0.239 scope global swp8.2051 
 107: swp8.2052@swp8: <BROADCAST,MULTICAST,UP,LOWER_UP> 
  mtu 1500 qdisc noqueue state UP 
  inet 12.0.0.241/30 brd 12.0.0.243 scope global swp8.2052 
 108: swp8.2053@swp8: <BROADCAST,MULTICAST,UP,LOWER_UP>
  mtu 1500 qdisc noqueue state UP 
  inet 12.0.0.245/30 brd 12.0.0.247 scope global swp8.2053 
 109: swp8.2054@swp8: <BROADCAST,MULTICAST,UP,LOWER_UP> 
  mtu 1500 qdisc noqueue state UP 
  inet 12.0.0.249/30 brd 12.0.0.251 scope global swp8.2054
 110: swp8.2055@swp8: <BROADCAST,MULTICAST,UP,LOWER_UP>
  mtu 1500 qdisc noqueue state UP 
  inet 12.0.0.253/30 brd 12.0.0.255 scope global swp8.2055
cumulus@switch:$ bgp sessions: 
 12.0.0.226 ,4 ,64057 , 958 , 1036 , 0 , 0 , 0 ,15:55:42, 0, 10472 
 12.0.0.230 ,4 ,64058 , 958 , 1016 , 0 , 0 , 0 ,15:55:46, 187, 10285
 12.0.0.234 ,4 ,64059 , 958 , 1049 , 0 , 0 , 0 ,15:55:40, 187, 10285 
 12.0.0.238 ,4 ,64060 , 958 , 1039 , 0 , 0 , 0 ,15:55:45, 187, 10285 
 12.0.0.242 ,4 ,64061 , 958 , 1014 , 0 , 0 , 0 ,15:55:46, 187, 10285 
 12.0.0.246 ,4 ,64062 , 958 , 1016 , 0 , 0 , 0 ,15:55:46, 187, 10285 
 12.0.0.250 ,4 ,64063 , 958 , 1029 , 0 , 0 , 0 ,15:55:43, 187, 10285 
 12.0.0.254 ,4 ,64064 , 958 , 1036 , 0 , 0 , 0 ,15:55:44, 187, 10285 

RN-125 (CM-1576)
Network LSA with an old router ID isn't flushed out by the originator
When the router ID is changed, the router should remove the previous network LSA (link-state advertisement) that it generated based on the IP address on the interface in the Network LSA.

Cumulus Networks doesn't remove this LSA, so it will be naturally aged out.

RN-199 (CM-2624)
When a Quagga route-map is modified, the switch could use the partial map before edits are completed

Cumulus VX triggers a route-map update before the user finishes editing the route map, resulting in an incorrect route map being used. The route-map update trigger should only occur when user finishes editing the map.

Cumulus Networks is working to fix this issue.


RN-221 (CM-4501)
BGP graceful restart, including helper mode, not fully supported If you encounter issues with this, please submit a support request and include the output from cl-support with your ticket.

RN-227 (CM-3388)
BGP dynamic capability is not supported BGP peer sessions with dynamic capability are not supported under any version of Cumulus VX at this time.

RN-322 (CM-7387)
Interfaces disabled using iproute2 become enabled after restarting Quagga By default, all interfaces have a "no shutdown" associated with them in Quagga. Thus, when you restart Quagga, it enables the interfaces. This is expected behavior in Quagga. There is no workaround at this time.

RN-327 (CM-4290)
Changing the route-map parameter of the redistribute command in OSPF and BGP doesn't affect the state of the resulting redistribution in those protocols

To work around this issue, remove any old redistribute command configurations before adding a new one with or without route-map as a parameter.

For example, if OSPF has a redistribute configuration such as redistribute bgp route-map redist-map-name, you would enable redistribution without a route-map by following these steps in OSPF configuration mode:

  1. no redistribute bgp
  2. redistribute bgp

You would perform a similar sequence of commands for redistribution changes in BGP as well.


RN-351 (CM-7829)
Installing LNV

The LNV packages are not installed when you upgrade Cumulus VX. You can get the latest version of LNV for this release of Cumulus VX by installing the LNV packages for the registration and service node daemons using apt-get install vxfld-vxrd and/or apt-get install vxfld-vxsnd, depending upon how you intend to use LNV.


RN-355 (CM-7994)
OSPFv2 Area ID being implicitly translated from Integer format to dotted decimal format

While OSPF area ID configuration in Quagga allows for the value to be specified in either dotted decimal format, or as an integer, values specified as an integer will be converted into dotted decimal format when displayed, causing potential confusion for the operator.

This issue does not impact OSPF functionality; only the display output. However, it is recommended that the OSPF area ID is specified in dotted decimal format for consistency.

 

RN-382 (CM-6692)
Quagga: Removing bridge via ifupdown2 does not remove it from Quagga Removing a bridge using ifupdown2 does not remove it from the Quagga configuration files. This issue is being investigated; however, restarting Quagga will successfully remove the bridge.

RN-383 (CM-7196)
admin down of link deletes IPv6 nexthop static route entry, but not for IPv4

When a link is admin down and carrier is on, the IPv4 nexthop entry is marked dead, but the IPv6 nexthop entry is deleted, and will not be restored when the link is admin up. However, if carrier is off, the IPv6 nexthop is marked dead and not deleted. This inconsistency in admin down behavior is being investigated.


RN-384 (CM-7684)
Keeping VXLAN single-connected devices up on MLAG secondary node In the current MLAG secondary design, if the VXLAN device is not dual-connected, it is kept in a protodown state. You can keep them up with individual IP addresses rather than anycast IPs when the peerlink is down, so that all single-connected hosts will have connectivity. Further investigation regarding this issue is underway.

RN-387 (CM-8163)
Quagga appears to not honor passive interfaces if VRR is active

In a VRR configuration, any interface-specific routing configuration (e.g., OSPF mode of operation) specified on the subinterface having a virtual IP address does not take effect. This is because when an operator has specified a virtual IP on a bridge, the system creates another internal interface bridge with the virtual IP and MAC. These two interfaces are treated distinctly by Quagga, so any interface-specific routing configuration on the bridge does not get carried over to the second bridge.

In a VRR deployment needing any interface-specific routing configuration on the interface with a virtual IP address, the routing configuration has to be specified against the internally-created virtual interface also.


RN-389 (CM-8410)
switchd supports only port 4789 as the UDP port for VXLAN packets

switchd currently allows only the standard port 4789 as the UDP port for VXLAN packets. There are cases where a hypervisor could be using non-standard UDP port, which would cause VXLAN exchanges with the hardware VTEP to not work. In such a case, packets would not be terminated and encapsulated packets would be sent out on UDP port 4789.


RN-390 (CM-9055)
If you’re logged into the serial console and type reboot, the system may hang indefinitely

In Cumulus Linux, if the serial console terminal settings are changed and clocal (modem carrier) is turned off,  systemd may do all of the following:

  • Block and stop handling systemctl changes
  • Fail to start or restart services
  • Prevent reboot and shutdown

For example, running the /usr/bin/reset command may cause this problem when run on the serial console. Once in this state, a new login session will not be started on the serial console (/dev/ttyS0 or ttyS1) after logout, and systemd will not respond to systemctl commands, including reboot.

To work around this issue, whenever you are logged in to the serial console and run reset, run stty clocal afterwards.


RN-404 (CM-4407)
Aggregating routes in BGP with as-set can result in high CPU usage 

When BGP is configured with aggregate addresses with as-set configuration and there are many routes to be aggregated, the BGP process gets into high CPU usage.

To work around this issue, do not specify the as-set parameter for the aggregate-address configuration.


RN-409 (CM-10054)
BGP may show an inaccessible path as the best path

Existing BGP issues caused peering between a VRF device and a loopback BGP session to stay up if the loopback session doesn’t advertise its local address.

This issue will be fixed in a future release.


RN-446 (CM-10513)
Redistribute neighbor does not work with more than 1024 interfaces

The rdnbrd service crashes because it cannot work with more than 1024 interfaces.

This issue should be fixed in a future release of Cumulus VX.


RN-462 (CM-12631)
VX on Vbox - can't script the upgrade to 3.1.1 via apt-get update -y due to grub-pc dialog

Due to hypervisor behavior, upgrades to the grub-pc package may require manual intervention on VX platforms. This is specific to VX and will not occur on hardware platforms.

To resolve the problem, select "/dev/sda" from the interactive menu and continue installation.


RN-570 (CM-14499)
apt-get upgrade overwrites edits to TCAM and buffering profiles in datapath.conf without prompting

If you changed the buffering or TCAM profiles in either of the following files, the changes will be lost when you upgrade the cumulus-tools package:

  • /usr/lib/python2.7/dist-packages/cumulus/__chip_config/bcm/datapath.conf
  • /usr/lib/python2.7/dist-packages/cumulus/__chip_config/mlx/datapath.conf

Since the files are not marked as configuration files, they get overwritten without warning.

If you have changed either or both of these files, make sure to back them up before running apt-get upgrade or otherwise upgrading the cumulus-tools package, then re-apply your changes to the newly installed files after the upgrade.


RN-572 (CM-14844)
Invalid locale settings can prevent apt-get upgrade from completing

In some cases, if your locale information (language and/or character set) are invalid for Linux, you may encounter errors like the following when running apt-get upgrade when the upgrade snapshot is taken:

perl: warning: Setting locale failed.
perl: warning: Please check that your locale settings:
	LANGUAGE = (unset),
	LC_ALL = (unset),
	LC_CTYPE = "UTF-8",
	LANG = "en_US.UTF-8"
    are supported and installed on your system.
perl: warning: Falling back to a fallback locale ("en_US.UTF-8").
Creating pre-apt snapshot... Failed to set locale. Fix your system.
ERROR:/usr/lib/cumulus/apt-snapshot-hook: Unable to create pre snapshot
E: Problem executing scripts DPkg::Pre-Invoke '/usr/lib/cumulus/apt-snapshot-hook pre-invoke'
E: Sub-process returned an error code

This is an issue with the snapper application, which takes snapshots of the Cumulus Linux NOS. Cumulus Networks intends to update snapper in the future so this issue will not cause an error. 

To work around this error, set your locale information to valid settings, such as the following:

export LC_CTYPE=en_US.UTF-8

Then run apt-get upgrade again.


RN-576 (CM-14908)
TACACS sends authentication requests out of the default VRF, not the management VRF

If a management VRF if configured, TACACS won't send authentication requests out of the management VRF. Instead, it sends these requests out of the default VRF.

To work around this issue, run the following commands, which restrict inbound SSH to only the management VRF interface and disable inbound SSH via the switch ports. Note that using SSH via the front panel ports is not a workaround.

cumulus@switch:~$ sudo systemctl disable ssh.service
cumulus@switch:~$ sudo systemctl stop ssh.service
cumulus@switch:~$ sudo systemctl enable ssh@mgmt.service
cumulus@switch:~$ sudo systemctl start ssh@mgmt.service
Have more questions? Submit a request

Comments

Powered by Zendesk