Cumulus VX 3.3.1 Release Notes

Overview

Cumulus VX is a free virtual environment for cloud and network administrators to test the latest technology from Cumulus Networks, removing all organizational and economic barriers to getting started with open networking in your own time, at your own pace, and within your own environment.

The environment can be used to learn about, and evaluate, Cumulus Linux, anytime and anywhere, producing sandbox environments for prototype assessment, pre-production rollouts, and script development.

These release notes support Cumulus VX 3.3.1 and describe its features and known issues.

What's New

Cumulus VX 3.y.z is a significant departure from 2.y.z releases. See the user guide for details on new behaviors and functionality.

  • SNMP is enabled for routing protocols
  • Various security fixes (see below)

Early Access Features

The following early access features are included in Cumulus VX 3.3.1:

  • QinQ: For hybrid cloud connectivity.
  • VXLAN routing: For IP routing between VXLAN VNIs in an overlay network.

Note: The EA version of NetQ is not supported under Cumulus VX 3.3.1.

Downloading Cumulus VX

Refer to the Getting Started documentation to download and set up Cumulus VX instances.

Configuration Notes

Keep in mind the following issues when you are running your Cumulus VX virtual machine.

 Perl, Python and BDB Modules

Any Perl scripts that use the DB_File module or Python scripts that use the bsddb module won't run under Cumulus VX.

Documentation

Support

Cumulus Networks provides support for customers using Cumulus VX in testing and troubleshooting environments. For more information, refer to the Cumulus VX Support Policy.

If you have any questions or feedback about Cumulus VX, visit the Cumulus VX community for further support. 

Issues Fixed in Cumulus VX 3.3.1

The following is a list of issues fixed in Cumulus VX 3.3.1 from earlier versions of Cumulus VX.

Release Note ID Summary Description

RN-581 (CM-16142)
Update for security issue: libfreetype6 font vulnerability - DSA-3839 CVE-2016-10244 CVE-2017-8105 CVE-2017-8287  

Cumulus Networks does not include freetype in the Cumulus Linux repository; however, the repository does mirror libfreetype6, which comes from the freetype source package. The fixed package version for Jessie is 2.5.2-3+deb8u2.

This issue is tracked by Debian in security advisory DSA-3839 and in Debian bugs 856971, 861220, and 861308, as well as in CVE-2016-10244, CVE-2017-8105, and CVE-2017-8287.

Here is the content of the Debian security advisory:

Debian Security Advisory DSA-3839-1 security@debian.org
https://www.debian.org/security/ Salvatore Bonaccorso
April 28, 2017 https://www.debian.org/security/faq

*-------------------------------------------------------------------------

Package : freetype
CVE ID : CVE-2016-10244 CVE-2017-8105 CVE-2017-8287
Debian Bug : 856971 861220 861308

Several vulnerabilities were discovered in Freetype. Opening malformed
fonts may result in denial of service or the execution of arbitrary
code.

For the stable distribution (jessie), these problems have been fixed in
version 2.5.2-3+deb8u2.

We recommend that you upgrade your freetype packages.


RN-601 (CM-15926)
VRR breaks redistribute neighbor

When a neighbor is learned on an interface running VRR, a duplicate /32 entry is created in table 10, and Quagga stops redistributing it. However, restarting Quagga causes the routes to show up. This is caused by an expectation that there will be only one RIB entry from a route source for any prefix, and will be fixed in the next Cumulus VX release.

RN-609 (CM-16297)
Update for security issue: quagga_sudoers suggested entry allows unbounded quagga commands without password  

/etc/sudoers.d/quagga_sudoers has the following line, which is commented out:

Cmnd_Alias  VTY_SHOW   = /usr/bin/vtysh -c show *
# %quagga ALL = (root) NOPASSWD:NOEXEC: VTY_SHOW

vtysh allows multiple -c commands on a single command line, and sudoers cannot parse the line so as to filter out extra commands. So if an administrator uncomments the line, any user can create any Quagga configuration that is possible with a -c argument.
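The wildcard problem can be illustrated with a minimal sketch. It assumes that sudoers matches command arguments with shell-style (glob) wildcards and models that with Python's fnmatch module; it is a simplified model, not sudo's actual matching code. The AS number 65000 in the second command line is a hypothetical value chosen for illustration.

```python
from fnmatch import fnmatchcase

# Argument pattern taken from the VTY_SHOW alias quoted above.
VTY_SHOW_ARGS = "-c show *"


def matches_vty_show(args: str) -> bool:
    """Return True if the argument string matches the wildcard pattern.

    Simplified model: sudo's real matching has more rules than a bare glob.
    """
    return fnmatchcase(args, VTY_SHOW_ARGS)


# A harmless show command matches, as intended.
benign = "-c show ip route"

# Extra -c arguments ride along after the first "show", and the trailing
# wildcard accepts them too (65000 is a hypothetical AS number).
abusive = "-c show version -c configure terminal -c router bgp 65000"

print(matches_vty_show(benign))
print(matches_vty_show(abusive))
```

Both calls print True under this model: the trailing * matches everything after the first show, including any further -c commands.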

Cumulus Networks recommends you edit the /etc/sudoers.d/quagga_sudoers file and delete the two lines mentioned above along with the preceding block comment.

This issue is fixed in Cumulus VX 3.3.1.


RN-610 (CM-16341)
ifupdown2 does not apply `link-down yes` for bridge ports or bond slaves  

If link-down yes is configured for a swp interface, it does not take effect when ifreload is run if the switch port is already in an up state and is part of a bridge or bond.

This issue has been fixed in Cumulus VX 3.3.1.


RN-611 (CM-15813)
Bridge with `bridge-igmp-querier-src` configured still sources queries from 0.0.0.0  

When a bridge is configured with a VLAN interface (such as bridge1.10 below), and that interface has bridge-igmp-querier-src configured, IGMP queries generated from the bridge still source from 0.0.0.0:

auto bridge1
iface bridge1
 bridge-vlan-aware yes
 bridge-pvid 10
 bridge-ports swp52 swp49
 bridge-mcsnoop 1
 bridge-mcquerier 1
 bridge-mcqifaddr 1

auto bridge1.10
iface bridge1.10
 address 192.168.85.1/24
 bridge-igmp-querier-src 192.168.85.1

This issue is fixed in Cumulus VX 3.3.1.


RN-612 (CM-15950)
cl_drop_cntrs_pp.py error with subinterfaces configured, causes high CPU utilization  

In Cumulus VX, cl_drop_cntrs_pp.py has been updated to ignore interface names that include the @ character.

This prevents the following error, which caused high CPU utilization, from being reported when a subinterface is configured on the switch:

2017-04-18T02:28:45.178409+00:00 leaf-103-01 cl_drop_cntrs_pp.py: Error: ethtool EXCEPTION=Command '['/sbin/ethtool', '-S', 'swp6.2@swp6']' returned non-zero exit status 96
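The kind of filtering described above can be sketched as follows. This is an illustration only, not the actual code in cl_drop_cntrs_pp.py; the function name physical_interfaces is hypothetical.

```python
def physical_interfaces(names):
    """Return only the interface names that do not contain '@'.

    Names such as 'swp6.2@swp6' denote a subinterface and its parent;
    querying them with ethtool -S fails, so they are skipped.
    """
    return [name for name in names if "@" not in name]


# Hypothetical interface list; 'swp6.2@swp6' is the name from the log line above.
print(physical_interfaces(["swp6", "swp6.2@swp6", "eth0"]))
```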

This issue is fixed in Cumulus VX 3.3.1.


RN-615 (CM-16309)
switchd dumps core in sub_intf hash table when sw_sub_int_key_ht is NULL during neighbor sync

Syncing VRFs triggered a route sync without setting up all the needed hash tables.

This issue has been fixed in Cumulus VX 3.3.1.

 

RN-617 (CM-16413)
Complete loss of traffic on bond subinterface when member goes down

On a switch where a bond and a subinterface of that bond are configured, when a member of that bond goes down, all unicast IP traffic destined to the switch is not terminated.

This issue has been fixed in Cumulus VX 3.3.1.

Known Issues in Cumulus VX 3.3.1

Issues are categorized for easy review. Some issues are fixed, but the fixes will be available in a later release.

Release Note ID Summary Description

RN-52 (CM-997, CM-1013)
Parameters like the router ID and DR priority cannot be changed while OSPFv2/v3 is running

Router ID and DR priority can only be changed by shutting down OSPFv2/v3, changing the parameter, and restarting the OSPF process.

A change to the DR priority may not properly be reflected in the LSAs that are still aging out.

RN-56 (CM-343)
IPv4/IPv6 forwarding disabled mode not recognized

If either of the following is configured:

net.ipv4.ip_forward == 0 

or:

net.ipv6.conf.all.forwarding == 0 

The hardware still forwards packets if there is a neighbor table entry pointing to the destination.
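To check whether a switch is in this state, the two settings can be read from sysctl-style output. The sketch below assumes the usual "key = value" text format printed by sysctl; the helper names parse_sysctl and disabled_forwarding are hypothetical.

```python
# The two settings named in this release note.
FORWARDING_KEYS = ("net.ipv4.ip_forward", "net.ipv6.conf.all.forwarding")


def parse_sysctl(text):
    """Parse 'key = value' lines into a dict of strings."""
    settings = {}
    for line in text.splitlines():
        key, sep, value = line.partition("=")
        if sep:  # skip lines without an '=' sign
            settings[key.strip()] = value.strip()
    return settings


def disabled_forwarding(settings):
    """Return the forwarding keys that are explicitly set to 0."""
    return [key for key in FORWARDING_KEYS if settings.get(key) == "0"]


# Hypothetical sample input for illustration.
sample = "net.ipv4.ip_forward = 0\nnet.ipv6.conf.all.forwarding = 1\n"
print(disabled_forwarding(parse_sysctl(sample)))
```

If the returned list is not empty, the kernel setting is disabled, but per this release note packets may still be forwarded when a matching neighbor table entry exists.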


RN-77 (CM-265)
New routes/ECMPs can evict existing/installed

Cumulus VX syncs routes between the kernel and the switching silicon. If the required resource pools in hardware fill up, new kernel routes can cause existing routes to move from being fully allocated to being partially allocated.

In order to avoid this, routes in the hardware should be monitored and kept below the ASIC limits.

For example, on systems with Trident+ chips, the limits are as follows:

 routes: 16384 (if all routes are IPv4)
 long mask routes: 256 (routes with a mask longer than the route mask limit)
 route mask limit: 64
 host_routes: 8192
 ecmp_nhs: 4044
 ecmp_nhs_per_route: 52

That translates to about 77 routes with ECMP NHs, if every route has the maximum ECMP NHs.
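The figure of about 77 routes follows from dividing the ECMP next hop pool by the per-route maximum. A minimal worked example, using only the two Trident+ limits quoted above (the function name is hypothetical):

```python
ECMP_NHS = 4044           # ecmp_nhs limit quoted above
ECMP_NHS_PER_ROUTE = 52   # ecmp_nhs_per_route limit quoted above


def max_full_ecmp_routes(total_nhs, nhs_per_route):
    """Number of routes that fit if each one uses the maximum ECMP next hops."""
    return total_nhs // nhs_per_route


print(max_full_ecmp_routes(ECMP_NHS, ECMP_NHS_PER_ROUTE))  # 77
```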

Monitoring this in Cumulus VX is performed via the cl-resource-query command:
cumulus@switch:~$ sudo cl-resource-query
 hosts : 3 
 all routes : 29 
 IP4 routes : 17 
 IP6 routes : 12 
 nexthops : 3 
 ecmp_groups : 0
 ecmp_nexthops : 0
 mac entries : 0 / 131072 
 bpdu entries : 500 / 512
The resource to monitor is the ecmp_nexthops. If this count is close to 4044, new ECMPs may evict existing routes.
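A monitoring script could parse this output and flag when ecmp_nexthops approaches the limit. The sketch below assumes the output format shown in the sample above; the 90% warning threshold and the function names are assumptions made for illustration, not part of cl-resource-query.

```python
import re

ECMP_NH_LIMIT = 4044  # Trident+ ecmp_nhs limit quoted above

# Sample output copied from the cl-resource-query example above.
SAMPLE = """\
 hosts : 3
 all routes : 29
 IP4 routes : 17
 IP6 routes : 12
 nexthops : 3
 ecmp_groups : 0
 ecmp_nexthops : 0
 mac entries : 0 / 131072
 bpdu entries : 500 / 512
"""


def parse_resources(text):
    """Map each resource name to its current count (first integer after ':')."""
    counts = {}
    for line in text.splitlines():
        match = re.match(r"\s*(.+?)\s*:\s*(\d+)", line)
        if match:
            counts[match.group(1)] = int(match.group(2))
    return counts


def ecmp_near_limit(counts, limit=ECMP_NH_LIMIT, threshold=0.9):
    """Return True when ecmp_nexthops is at or above threshold * limit.

    The 0.9 threshold is an arbitrary choice for this sketch.
    """
    return counts.get("ecmp_nexthops", 0) >= limit * threshold


counts = parse_resources(SAMPLE)
print(counts["ecmp_nexthops"], ecmp_near_limit(counts))
```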

RN-121 (CM-2123)
ptmd: When a physical interface is in a PTM FAIL state, its subinterface still exchanges information

When ptmd is incorrectly in a failure state and the Zebra interface is enabled, BGP sessions on the physical interface (PIF) do not establish routes, but sessions on the subinterface on top of it do.

If the subinterface is configured on the physical interface and the physical interface is incorrectly marked as being in a PTM FAIL state, routes on the physical interface are not processed in Quagga, but the subinterface is working.

Steps to reproduce:

cumulus@switch:$ sudo vtysh -c 'show int swp8' 
Interface swp8 is up, line protocol is up 
PTM status: fail
index 10 metric 1 mtu 1500 
 flags: <UP,BROADCAST,RUNNING,MULTICAST>
 HWaddr: 44:38:39:00:03:88 
 inet 12.0.0.225/30 broadcast 12.0.0.227 
 inet6 2001:cafe:0:38::1/64 
 inet6 fe80::4638:39ff:fe00:388/64 
cumulus@switch:$ ip addr show | grep swp8 
 10: swp8: <BROADCAST,MULTICAST,UP,LOWER_UP> 
  mtu 1500 qdisc pfifo_fast state UP qlen 500 
  inet 12.0.0.225/30 brd 12.0.0.227 scope global swp8 
 104: swp8.2049@swp8: <BROADCAST,MULTICAST,UP,LOWER_UP> 
  mtu 1500 qdisc noqueue state UP 
  inet 12.0.0.229/30 brd 12.0.0.231 scope global swp8.2049 
 105: swp8.2050@swp8: <BROADCAST,MULTICAST,UP,LOWER_UP> 
  mtu 1500 qdisc noqueue state UP 
  inet 12.0.0.233/30 brd 12.0.0.235 scope global swp8.2050 
 106: swp8.2051@swp8: <BROADCAST,MULTICAST,UP,LOWER_UP> 
  mtu 1500 qdisc noqueue state UP 
  inet 12.0.0.237/30 brd 12.0.0.239 scope global swp8.2051 
 107: swp8.2052@swp8: <BROADCAST,MULTICAST,UP,LOWER_UP> 
  mtu 1500 qdisc noqueue state UP 
  inet 12.0.0.241/30 brd 12.0.0.243 scope global swp8.2052 
 108: swp8.2053@swp8: <BROADCAST,MULTICAST,UP,LOWER_UP>
  mtu 1500 qdisc noqueue state UP 
  inet 12.0.0.245/30 brd 12.0.0.247 scope global swp8.2053 
 109: swp8.2054@swp8: <BROADCAST,MULTICAST,UP,LOWER_UP> 
  mtu 1500 qdisc noqueue state UP 
  inet 12.0.0.249/30 brd 12.0.0.251 scope global swp8.2054
 110: swp8.2055@swp8: <BROADCAST,MULTICAST,UP,LOWER_UP>
  mtu 1500 qdisc noqueue state UP 
  inet 12.0.0.253/30 brd 12.0.0.255 scope global swp8.2055
cumulus@switch:$ bgp sessions: 
 12.0.0.226 ,4 ,64057 , 958 , 1036 , 0 , 0 , 0 ,15:55:42, 0, 10472 
 12.0.0.230 ,4 ,64058 , 958 , 1016 , 0 , 0 , 0 ,15:55:46, 187, 10285
 12.0.0.234 ,4 ,64059 , 958 , 1049 , 0 , 0 , 0 ,15:55:40, 187, 10285 
 12.0.0.238 ,4 ,64060 , 958 , 1039 , 0 , 0 , 0 ,15:55:45, 187, 10285 
 12.0.0.242 ,4 ,64061 , 958 , 1014 , 0 , 0 , 0 ,15:55:46, 187, 10285 
 12.0.0.246 ,4 ,64062 , 958 , 1016 , 0 , 0 , 0 ,15:55:46, 187, 10285 
 12.0.0.250 ,4 ,64063 , 958 , 1029 , 0 , 0 , 0 ,15:55:43, 187, 10285 
 12.0.0.254 ,4 ,64064 , 958 , 1036 , 0 , 0 , 0 ,15:55:44, 187, 10285 

RN-125 (CM-1576)
Network LSA with an old router ID isn't flushed out by the originator
When the router ID is changed, the router should remove the previous network LSA (link-state advertisement) that it generated based on the IP address on the interface in the Network LSA.

Cumulus Networks doesn't remove this LSA, so it will be naturally aged out.

RN-199 (CM-2624)
When a Quagga route-map is modified, the switch could use the partial map before edits are completed

Cumulus VX triggers a route-map update before the user finishes editing the route map, resulting in an incorrect route map being used. The route-map update trigger should only occur when the user finishes editing the map.

Cumulus Networks is working to fix this issue.


RN-221 (CM-4501)
BGP graceful restart, including helper mode, not fully supported

If you encounter issues with this, please submit a support request and include the output from cl-support with your ticket.

RN-227 (CM-3388)
BGP dynamic capability is not supported

BGP peer sessions with dynamic capability are not supported under any version of Cumulus VX at this time.

RN-322 (CM-7387)
Interfaces disabled using iproute2 become enabled after restarting Quagga

By default, all interfaces have a "no shutdown" associated with them in Quagga. Thus, when you restart Quagga, it enables the interfaces. This is expected behavior in Quagga. There is no workaround at this time.

RN-327 (CM-4290)
Changing the route-map parameter of the redistribute command in OSPF and BGP doesn't affect the state of the resulting redistribution in those protocols

To work around this issue, remove any old redistribute command configurations before adding a new one with or without route-map as a parameter.

For example, if OSPF has a redistribute configuration such as redistribute bgp route-map redist-map-name, you would enable redistribution without a route-map by following these steps in OSPF configuration mode:

  1. no redistribute bgp
  2. redistribute bgp

You would perform a similar sequence of commands for redistribution changes in BGP as well.
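The two-step workaround can also be scripted. The sketch below builds a vtysh command line that chains several -c arguments; the function name and its parameters are hypothetical, and the "configure terminal" and "router ospf" steps are assumed to be the way OSPF configuration mode is entered on your system. For BGP, the router stanza would also need the AS number.

```python
def reset_redistribute_argv(router_stanza="router ospf", source="bgp"):
    """Build a vtysh argv that removes and re-adds a redistribute statement.

    Illustrative only: this builds the argument list and does not run it.
    """
    steps = [
        "configure terminal",
        router_stanza,
        f"no redistribute {source}",  # step 1: remove the old configuration
        f"redistribute {source}",     # step 2: add it back without a route-map
    ]
    argv = ["vtysh"]
    for step in steps:
        argv += ["-c", step]
    return argv


print(" ".join(reset_redistribute_argv()))
```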


RN-351 (CM-7829)
Installing LNV

The LNV packages are not installed when you upgrade Cumulus VX. You can get the latest version of LNV for this release of Cumulus VX by installing the LNV packages for the registration and service node daemons using apt-get install vxfld-vxrd and/or apt-get install vxfld-vxsnd, depending upon how you intend to use LNV.


RN-355 (CM-7994)
OSPFv2 Area ID being implicitly translated from Integer format to dotted decimal format

While OSPF area ID configuration in Quagga allows for the value to be specified in either dotted decimal format, or as an integer, values specified as an integer will be converted into dotted decimal format when displayed, causing potential confusion for the operator.

This issue does not impact OSPF functionality; only the display output. However, it is recommended that the OSPF area ID is specified in dotted decimal format for consistency.
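The conversion between the two notations is the same as between a 32-bit integer and an IPv4 address. A minimal sketch using Python's standard ipaddress module (the function name is hypothetical):

```python
import ipaddress


def area_id_to_dotted(area):
    """Convert an OSPF area ID given as an integer or string to dotted decimal."""
    text = str(area)
    if text.isdigit():
        # Integer form: interpret as a 32-bit value.
        return str(ipaddress.IPv4Address(int(text)))
    # Already dotted decimal: validate and normalize.
    return str(ipaddress.IPv4Address(text))


print(area_id_to_dotted(1))           # 0.0.0.1
print(area_id_to_dotted(256))         # 0.0.1.0
print(area_id_to_dotted("0.0.0.10"))  # 0.0.0.10
```

An area configured as 256 is therefore displayed as 0.0.1.0, which is the source of the potential confusion described above.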

 

RN-382 (CM-6692)
Quagga: Removing bridge via ifupdown2 does not remove it from Quagga

Removing a bridge using ifupdown2 does not remove it from the Quagga configuration files. This issue is being investigated; however, restarting Quagga will successfully remove the bridge.

RN-384 (CM-7684)
Keeping VXLAN single-connected devices up on MLAG secondary node

In the current MLAG secondary design, if the VXLAN device is not dual-connected, it is kept in a protodown state. You can keep such devices up with individual IP addresses rather than anycast IPs when the peerlink is down, so that all single-connected hosts will have connectivity. Further investigation regarding this issue is underway.

RN-387 (CM-8163)
Quagga appears to not honor passive interfaces if VRR is active

In a VRR configuration, any interface-specific routing configuration (e.g., OSPF mode of operation) specified on the subinterface having a virtual IP address does not take effect. This is because when an operator has specified a virtual IP on a bridge, the system creates another internal interface bridge with the virtual IP and MAC. These two interfaces are treated distinctly by Quagga, so any interface-specific routing configuration on the bridge does not get carried over to the second bridge.

In a VRR deployment needing any interface-specific routing configuration on the interface with a virtual IP address, the routing configuration has to be specified against the internally-created virtual interface also.


RN-389 (CM-8410)
switchd supports only port 4789 as the UDP port for VXLAN packets

switchd currently allows only the standard port 4789 as the UDP port for VXLAN packets. There are cases where a hypervisor could be using a non-standard UDP port, which would cause VXLAN exchanges with the hardware VTEP to not work. In such a case, packets would not be terminated and encapsulated packets would be sent out on UDP port 4789.


RN-404 (CM-4407)
Aggregating routes in BGP with as-set can result in high CPU usage 

When BGP is configured with aggregate addresses with as-set configuration and there are many routes to be aggregated, the BGP process gets into high CPU usage.

To work around this issue, do not specify the as-set parameter for the aggregate-address configuration.


RN-409 (CM-10054)
BGP may show an inaccessible path as the best path

Existing BGP issues caused peering between a VRF device and a loopback BGP session to stay up if the loopback session doesn’t advertise its local address.

This issue will be fixed in a future release.


RN-446 (CM-10513)
Redistribute neighbor does not work with more than 1024 interfaces

The rdnbrd service crashes because it cannot work with more than 1024 interfaces.

This issue should be fixed in a future release of Cumulus VX.


RN-462 (CM-12631)
VX on Vbox - can't script the upgrade to 3.1.1 via apt-get update -y due to grub-pc dialog

Due to hypervisor behavior, upgrades to the grub-pc package may require manual intervention on VX platforms. This is specific to VX and will not occur on hardware platforms.

To resolve the problem, select "/dev/sda" from the interactive menu and continue installation.


RN-595 (CM-15934)
Error returned: Not all ping requests matched on the ingress ACL rule

When pinging between hosts, all the pings are successfully transmitted. However, the ACL egress packet count is incorrect for the VLAN in question, and fewer packets are counted. The following error can occur:

"2017-04-17 14:04:06,506:  ERROR: tests.acl.acl_svi_tests: Not all the ping requests are matched on the ACL rules. SVI: vlan100"
"2017-04-17 14:04:06,507:  ERROR: tests.acl.acl_svi_tests: Expected: 20 Got: in:20 out:10"
"2017-04-17 14:04:06,507:  ERROR: tests.lib.base: Not all ping requests matched on the ingress ACL rule. Exp: 20, Got: in:20 out:10"

This issue is being investigated.


RN-597 (CM-15705)
sFlow doesn't generate flow samples to sflowd on Tomahawk-based switches

At this time, sFlow is not supported on switches with Tomahawk ASICs. This is a known issue.

RN-598 (CM-15575)
CLAGD process restarts when updating backup-ip

An error was found when an accidental change was made to the backup IP, and then corrected. ifreload -a would restart the clagd process to invoke the daemon with the new backup IP, rather than updating the backup IP with the change.

This issue is being investigated.


RN-599 (CM-15949)
DHCRELAY automatically binds to eth0 when not specified in the configuration

dhcrelay listens on all interfaces that have an IP address, even if it is not configured to listen on that interface. This causes dhcrelay to bind to unspecified ports.

This behavior is expected, due to upstream configuration. The packet is dropped later in the process, as it is not coming from a configured port.


RN-601 (CM-15926)
VRR breaks redistribute neighbor

When a neighbor is learned on an interface running VRR, a duplicate /32 entry is created in table 10, and Quagga stops redistributing it. However, restarting Quagga causes the routes to show up. This is caused by an expectation that there will be only one RIB entry from a route source for any prefix, and will be fixed in the next Cumulus VX release.

RN-604 (CM-15959)
ARP suppression does not work well with VxLAN A-A

In some instances, ARP requests do not get suppressed (when they ought to be) in a VxLAN A-A scenario, but instead get flooded over VxLAN tunnels. This issue is caused because there is no "control plane" syncing the snooped local neighbor entries between the CLAG pair; CLAG does not perform this sync, and neither does EVPN.

This issue is being investigated.


RN-605 (CM-15515)
Unable to change the bond-modes using ifup or ifreload

When the bond mode is changed from 802.3ad to balance-xor or vice versa using ifup bondx or ifreload -a, the bond-mode does not change, and the following error is produced:

2017-03-23 21:39:37,495:  DEBUG:      autolib.netobjects: [cumulus@127.0.0.1:1042] sudo: ('ifup bond1',)
2017-03-23 21:39:37,926:  DEBUG:      autolib.netobjects: warning: error writing to file /sys/class/net/bond1/bonding/mode([Errno 39] Directory not empty)

This issue is being addressed in a later release.


RN-606 (CM-6366)
BGP: MD5 password is not enforced for dynamic neighbors

It was determined that the MD5 password configured against a BGP listen-range peer-group (used to accept and create dynamic BGP neighbors) is not enforced. This means that connections are accepted from peers that don't specify a password, and only from such peers.

This issue is being investigated.
