Cumulus Linux 3.5 Release Notes

Follow

Overview

These release notes support Cumulus Linux 3.5.0, 3.5.1, 3.5.2 and 3.5.3, and describe currently available features and known issues. 

Stay up to Date 

  • Please sign in and click Follow above so you can receive a notification when we update these release notes.
  • Subscribe to our product bulletin mailing list to receive important announcements and updates about issues that arise in our products.
  • Subscribe to our security announcement mailing list to receive alerts whenever we update our software for security issues.

What's New in Cumulus Linux 3.5

Cumulus Linux 3.5 contains the following new features, platforms and improvements:

  • New platforms include:
    • Accton OMP-800 chassis/Cumulus Express CX-10256-S (100G)
    • Delta 9032-v1 (100G Tomahawk) and AG7648 (10G Trident II)
    • Broadcom Maverick-based 10G switches, including Dell S4128F-ON
    • Edgecore AS5812 AC with 3Y PSU
    • Facebook Wedge-100S now generally available
    • Mellanox Spectrum A1 chipsets in the 2100, 2410 and 2700 models; Mellanox 2740 (100G) and 2740B (40G)
    • Quanta LY7 (10G)
    • 10GBASE-LR BiDi optics
  • Symmetric VXLAN routing
  • VLAN-aware bridge support for ovs-vtepd, for VXLAN solutions using controllers
  • OSPF is now VRF-aware
  • Voice VLAN
  • PIM now supports overlapping IP addresses and IP multicast boundaries
  • Bridge layer 2 protocol tunnels
  • The SNMP Cumulus-Counters-MIB file includes a new table pfcClCountersTable for link pause and priority flow control counters
  • The bridge MAC address is now set to the MAC address of eth0
  • See what's new and different with NCLU in this release

Note: The EA version of netq is not supported under Cumulus Linux 3.5.

Licensing

Cumulus Linux is licensed on a per-instance basis. Each network system is fully operational and every capability can be used on the switch, with the exception of forwarding on the switch panel ports. Only eth0 and the console ports are activated on an unlicensed instance of Cumulus Linux. Enabling the front panel ports requires a license.

You should have received a license key from Cumulus Networks or an authorized reseller. To install the license, read the Cumulus Linux Quick Start Guide.

Installing Version 3.5

If you are upgrading from version 3.0.0 or later, use apt-get to update the software.

Cumulus Networks recommends you use the -E option with sudo whenever you run any apt-get command. This option preserves your environment variables — such as HTTP proxies — before you install new packages or upgrade your distribution.

  1. Retrieve the new version packages: 
    cumulus@switch:~$ sudo -E apt-get update
  2. If you are using any early access features from an older release, remove them with:
    cumulus@switch:~$ sudo -E apt-get remove EA_PACKAGENAME
  3. Upgrade the release: 
    cumulus@switch:~$ sudo -E apt-get upgrade
  4. Reboot the switch:
    cumulus@switch:~$ sudo reboot

Note: If you see errors for expired GPG keys that prevent you from upgrading packages when upgrading to Cumulus Linux 3.5.2 or 3.5.3 from 3.5.1 or earlier, follow the steps in Upgrading Expired GPG Keys.

New Install or Upgrading from Versions Older than 3.0.0

If you are upgrading from a version older than 3.0.0, or installing Cumulus Linux for the first time, download the Cumulus Linux 3.5.0 installer for Broadcom or Mellanox switches from the Cumulus Networks website, then use ONIE to perform a complete install, following the instructions in the quick start guide.

Note: This method is destructive; any configuration files on the switch will not be saved, so please copy them to a different server before upgrading via ONIE.

Important! After you install, run apt-get update, then apt-get upgrade on your switch so that Cumulus Linux includes the latest package updates.
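
The choice between the two upgrade paths above depends only on the installed version. A minimal sketch of that decision follows. The name upgrade_method is a hypothetical helper, not a Cumulus Linux command, and the function looks only at the major version number.

```shell
#!/bin/sh
# Sketch only: pick the upgrade path described in these release notes.
# upgrade_method is a hypothetical helper name, not part of Cumulus Linux.
upgrade_method() {
    major="${1%%.*}"              # "3.4.2" -> "3"
    if [ "$major" -ge 3 ]; then
        echo "apt-get"            # 3.0.0 or later: upgrade packages in place
    else
        echo "onie"               # older than 3.0.0: full install through ONIE
    fi
}

upgrade_method 3.4.2
upgrade_method 2.5.12
```

On a switch you might pass in the installed version, for example by sourcing /etc/lsb-release and calling upgrade_method "$DISTRIB_RELEASE". This assumes that file defines DISTRIB_RELEASE on your release.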

Updating a Deployment that Has MLAG Configured

If you are using MLAG to dual connect two switches in your environment, and those switches are still running Cumulus Linux 2.5 ESR or any other release earlier than 3.0.0, the switches will not be dual-connected after you upgrade the first switch. To ensure a smooth upgrade, follow these steps:

  1. Disable clagd in the /etc/network/interfaces file (set clagd-enable to no), then restart the switchd, networking and FRR services.
    cumulus@switch:~$ sudo systemctl restart switchd.service
    cumulus@switch:~$ sudo systemctl restart networking.service
    cumulus@switch:~$ sudo systemctl restart frr.service
  2. If you are using BGP, notify the BGP neighbors that the switch is going down:
    cumulus@switch:~$ sudo vtysh -c "config t" -c "router bgp" -c "neighbor X.X.X.X shutdown"
  3. Stop the Quagga (if upgrading from a version earlier than 3.2.0) or FRR service (if upgrading from version 3.2.0 or later):
    cumulus@switch:~$ sudo systemctl stop [quagga|frr].service 
  4. Bring down all the front panel ports:
    cumulus@switch:~$ sudo ip link set swp<#> down
  5. Run cl-img-select -fr to boot the switch in the secondary role into ONIE, then reboot the switch.
  6. Install Cumulus Linux 3.5 onto the secondary switch using ONIE. At this time, all traffic is going to the switch in the primary role.
  7. After the install, copy the license file and all the configuration files you backed up, then restart the switchd, networking and Quagga services. All traffic is still going to the primary switch.
    cumulus@switch:~$ sudo systemctl restart switchd.service
    cumulus@switch:~$ sudo systemctl restart networking.service
    cumulus@switch:~$ sudo systemctl restart quagga.service
  8. Run cl-img-select -fr to boot the switch in the primary role into ONIE, then reboot the switch. Now, all traffic is going to the switch in the secondary role that you just upgraded to version 3.5.
  9. Install Cumulus Linux 3.5 onto the primary switch using ONIE. 
  10. After the install, copy the license file and all the configuration files you backed up.
  11. Follow the steps for upgrading from Quagga to FRRouting.
  12. Enable clagd again in the /etc/network/interfaces file (set clagd-enable to yes), then run ifreload -a.
    cumulus@switch:~$ sudo ifreload -a
  13. Bring up all the front panel ports:
    cumulus@switch:~$ sudo ip link set swp<#> up
  14. Now the two switches are dual-connected again and traffic flows to both switches.
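
Steps 4 and 13 apply the same ip link command to every front panel port. A minimal sketch of generating those commands follows. It assumes the front panel ports appear as swpN entries under /sys/class/net. The name front_panel_port_cmds is a hypothetical helper, not a Cumulus Linux tool. The function only prints the commands, so you can review them before running anything.

```shell
#!/bin/sh
# Sketch only: print (do not run) one "ip link set" command per front panel port.
# Assumption: front panel ports show up as swp* entries under /sys/class/net.
front_panel_port_cmds() {
    state="$1"                        # "up" or "down"
    netdir="${2:-/sys/class/net}"     # overridable, for example for a dry run
    for path in "$netdir"/swp*; do
        [ -e "$path" ] || continue    # no match: the glob stays literal, skip it
        name="${path##*/}"
        case "$name" in
            *.*) continue ;;          # skip VLAN subinterfaces such as swp1.100
        esac
        echo "ip link set $name $state"
    done
}

front_panel_port_cmds down
```

To apply the result, you could pipe it to a root shell, for example front_panel_port_cmds down | sudo sh.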

Perl, Python and BDB Modules

Any Perl scripts that use the DB_File module or Python scripts that use the bsddb module won't run under Cumulus Linux 3.5.

Documentation

You can read the technical documentation here.

Issues Fixed in Cumulus Linux 3.5.3

The following is a list of issues fixed in Cumulus Linux 3.5.3 from earlier versions of Cumulus Linux.

Release Note ID Summary Description

RN-804 (CM-19421)
Convergence after a spine goes down takes longer than expected

When a spine-to-leaf link is brought down or when a spine switch is powered down completely, traffic does not fail over to the remaining ECMP paths with minimal losses, and convergence takes longer than expected.

This issue is fixed in Cumulus Linux 3.5.3.


RN-815 (CM-19630)
Bridge MAC address clashing when eth0 is part of the same broadcast domain

Cumulus Linux 3.5.0 and above uses the eth0 MAC address as the MAC address for bridges. If eth0 is part of the same broadcast domain, you experience outages when upgrading to 3.5.x.

To work around this issue, manually change the bridge MAC address in the /etc/network/interfaces file.

This issue is fixed in Cumulus Linux 3.5.3.
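
Before upgrading to an affected 3.5.x release, it can help to check whether a bridge already shares its MAC address with eth0. A minimal sketch follows. The name macs_clash is a hypothetical helper, and the bridge name "bridge" is only an example. It reads the addresses that the kernel exposes under /sys/class/net.

```shell
#!/bin/sh
# Sketch only: succeed (exit 0) when two interfaces report the same MAC address.
# macs_clash is a hypothetical helper name; interface names are examples.
macs_clash() {
    sysdir="${3:-/sys/class/net}"             # overridable, for example for testing
    a=$(cat "$sysdir/$1/address") || return 2  # interface missing or unreadable
    b=$(cat "$sysdir/$2/address") || return 2
    [ "$a" = "$b" ]
}

if macs_clash eth0 bridge 2>/dev/null; then
    echo "eth0 and bridge share a MAC address"
fi
```

If the addresses match, apply the workaround above. One way to set the bridge address is the ifupdown2 hwaddress attribute in the bridge stanza. Treat that attribute name as an assumption and confirm it against the ifupdown2 documentation for your release.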


RN-817 (CM-19733)
The clagd service crashes when vxlan-local-tunnelip is not defined

If the local tunnel IP address is not defined for the VNI interface in the /etc/network/interfaces file, the clagd service crashes.

This issue is fixed in Cumulus Linux 3.5.3.


RN-818 (CM-19628)
On Mellanox switches, adding two default routes (ECMP) that point to interfaces instead of IP addresses crashes switchd

On a Mellanox switch, if you try to add two equal cost default routes that point to interfaces instead of IP addresses, switchd crashes.

This issue is fixed in Cumulus Linux 3.5.3.


RN-819 (CM-19419)
switchd crashes in libc_start_main

Under heavy load, switchd crashes due to an object reference count leak.

This issue is fixed in Cumulus Linux 3.5.3.

New Known Issues in Cumulus Linux 3.5.3

The following issues are new to Cumulus Linux and affect the current release.

Release Note ID Summary Description
 
RN-790 (CM-19014)
Configuring DHCP relay with VRR breaks ifreload

When you configure DHCP relay with VRR, the ifreload command does not work as expected; for example, the IP address might be removed from an SVI.

This issue is currently being investigated.


RN-820 (CM-19908)
RADIUS and TACACS Plus should use pam_syslog not openlog/syslog/closelog

The pam_syslog() interface is now being used to send messages to the system logger, which changes the message format. For example, with an incorrect password, the old message format for TACACS Plus is:

Feb 27 21:06:02 switch3 PAM-tacplus[17368]: auth failed 2

The new message format for TACACS Plus is:

Feb 27 21:04:08 switch3 sshd[16676]: pam_tacplus(sshd:auth): auth failed 2

This issue should be fixed in the next release of Cumulus Linux.
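
Log filters that match only the old PAM-tacplus tag miss messages in the new format. A minimal sketch of a filter that accepts both formats follows. It is built only from the two sample lines above, so other TACACS Plus messages may be formatted differently. The name tacplus_auth_failures is a hypothetical helper.

```shell
#!/bin/sh
# Sketch only: read syslog lines on stdin and keep TACACS Plus auth failures
# in either the old (PAM-tacplus[pid]) or new (pam_tacplus(service:type)) format.
tacplus_auth_failures() {
    grep -E '(PAM-tacplus\[[0-9]+\]|pam_tacplus\([^)]*\)): auth failed'
}

printf '%s\n' \
    'Feb 27 21:06:02 switch3 PAM-tacplus[17368]: auth failed 2' \
    'Feb 27 21:04:08 switch3 sshd[16676]: pam_tacplus(sshd:auth): auth failed 2' \
    | tacplus_auth_failures
```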


RN-821 (CM-19898)
The net show interface command output missing information

The net show interface command output is missing LACP, CLAG, VLAN, LLDP, and physical link failure information.

This issue should be fixed in the next release of Cumulus Linux.


RN-822 (CM-19788)
Using the same VLAN ID on a subinterface and bridge VIDs for a given port is not easily corrected

If you configure a VLAN under a VLAN-aware bridge and create a subinterface of the same VLAN on one of the bridge ports, the bridge and interface compete for the same VLAN and if the interface is flapped, it stops working. Correcting the configuration and running the ifreload command does not resolve the conflict. To work around this issue, correct the bridge VIDs and restart switchd or delete the subinterface.

This issue should be fixed in the next release of Cumulus Linux.


RN-823 (CM-19724)
Multicast control protocols are classified to the bulk queue by default

PIM and MSDP entries are set to the internal COS value of 6 so they are grouped together with the bulk traffic priority group in the default traffic.conf file. However, PIM, IGMP, and MSDP are considered control-plane and should be set to the internal COS value of 7.

This issue should be fixed in the next release of Cumulus Linux.


RN-824 (CM-19667)
The show ipv6 route ospf command results in an unknown route type

When you run the vtysh -c 'show ipv6 route ospf json' command to show IPv6 routes through OSPF, you see the error Unknown route type. To work around this issue, you must specify ospf6 in the command:

cumulus@switch:~$ vtysh -c 'show ipv6 route ospf6 json'

This issue should be fixed in the next release of Cumulus Linux.


RN-825 (CM-19633)
cl-netstat counters count twice for VXLAN traffic in TX direction

This is expected behavior. Multicast frames are being dropped at the transmit port of the same interface on which they are received. This is known as a split-horizon correction, which is required for multicast to operate correctly.

This issue should be fixed in the next release of Cumulus Linux.


RN-826 (CM-16865)
The compute unique hash seed default value is the same for each switch

The algorithm that calculates hashing is the same on every switch instead of being unique.

This issue should be fixed in the next release of Cumulus Linux.


RN-827 (CM-14300)
cl-acltool counters for implicit accept do not work for IPv4 on management (ethX) interfaces

The iptables counters for the default INPUT chain rule do not increment for packets ingressing ethX interfaces.

This issue should be fixed in the next release of Cumulus Linux.


RN-828 (CM-19748)
Security: Debian Security Advisory DSA-4110-1 for exim4 issue CVE-2018-6789

The following CVE was announced in Debian Security Advisory DSA-4110-1, and affects the exim4 package. While this package is no longer in the Cumulus Linux installation image, it is still in the repo3 repository. Cumulus Linux 3.5.3 is built on Debian Jessie.

This issue should be fixed in the next version of Cumulus Linux.

-------------------------------------------------------------------------
Debian Security Advisory DSA-4110-1 security@debian.org
https://www.debian.org/security/ Salvatore Bonaccorso
February 10, 2018 https://www.debian.org/security/faq
-------------------------------------------------------------------------
Package : exim4
CVE ID : CVE-2018-6789
Debian Bug : 890000
Meh Chang discovered a buffer overflow flaw in a utility function used in the SMTP listener of Exim, a mail transport agent. A remote attacker can take advantage of this flaw to cause a denial of service, or potentially the execution of arbitrary code via a specially crafted message.
For the oldstable distribution (jessie), this problem has been fixed in version 4.84.2-2+deb8u5.
For the stable distribution (stretch), this problem has been fixed in version 4.89-2+deb9u3.


RN-829 (CM-19660)
Security: Debian Security Advisory DSA-4052-1 for Bazaar issue CVE-2017-14176

The following CVE was announced in Debian Security Advisory DSA-4052-1, and affects the Bazaar version control system.

This issue should be fixed in the next version of Cumulus Linux.

-------------------------------------------------------------------------
Debian Security Advisory DSA-4052-1 security@debian.org
https://www.debian.org/security/ Salvatore Bonaccorso
November 29, 2017 https://www.debian.org/security/faq
-------------------------------------------------------------------------
Package : bzr
CVE ID : CVE-2017-14176
Debian Bug : 874429

Adam Collard discovered that Bazaar, an easy to use distributed version control system, did not correctly handle maliciously constructed bzr+ssh URLs, allowing a remote attacker to run an arbitrary shell command.

For the oldstable distribution (jessie), this problem has been fixed in version 2.6.0+bzr6595-6+deb8u1.

For the stable distribution (stretch), this problem has been fixed in version 2.7.0+bzr6619-7+deb9u1.


RN-830 (CM-19595)
Security: Debian Security Advisory DSA-4098-1 for curl issues CVE-2018-1000005 CVE-2018-1000007

The following CVEs were announced in Debian Security Advisory DSA-4098-1, and affect the curl package.

This issue should be fixed in the next version of Cumulus Linux.

-------------------------------------------------------------------------
Debian Security Advisory DSA-4098-1 security@debian.org
https://www.debian.org/security/ Alessandro Ghedini
January 26, 2018 https://www.debian.org/security/faq
-------------------------------------------------------------------------
Package : curl
CVE ID : CVE-2018-1000005 CVE-2018-1000007
Two vulnerabilities were discovered in cURL, a URL transfer library.

CVE-2018-1000005
Zhouyihai Ding discovered an out-of-bounds read in the code handling HTTP/2 trailers. This issue doesn't affect the oldstable distribution (jessie).

CVE-2018-1000007
Craig de Stigter discovered that authentication data might be leaked to third parties when following HTTP redirects.

For the oldstable distribution (jessie), these problems have been fixed in version 7.38.0-4+deb8u9.


RN-831 (CM-19507)
Security: Debian Security Advisory DSA-4091-1 for mysql issues CVE-2018-2562 CVE-2018-2622 CVE-2018-2640 CVE-2018-2665 CVE-2018-2668

The following CVEs were announced in Debian Security Advisory DSA-4091-1, and affect all mysql packages, including mysql-* and libmysql-*.

This issue should be fixed in the next version of Cumulus Linux.

-------------------------------------------------------------------------
Debian Security Advisory DSA-4091-1 security@debian.org
https://www.debian.org/security/ Salvatore Bonaccorso
January 18, 2018 https://www.debian.org/security/faq
-------------------------------------------------------------------------
Package : mysql-5.5
CVE ID : CVE-2018-2562 CVE-2018-2622 CVE-2018-2640 CVE-2018-2665 CVE-2018-2668

Several issues have been discovered in the MySQL database server. The vulnerabilities are addressed by upgrading MySQL to the new upstream version 5.5.59, which includes additional changes. Please see the MySQL 5.5 Release Notes and Oracle's Critical Patch Update advisory for further details:

https://dev.mysql.com/doc/relnotes/mysql/5.5/en/news-5-5-59.html
http://www.oracle.com/technetwork/security-advisory/cpujan2018-3236628.html

For the oldstable distribution (jessie), these problems have been fixed in version 5.5.59-0+deb8u1.


RN-832 (CM-19458)
Security: Debian Security Advisory DSA-4089-1 for bind9 issue CVE-2017-3145

The following CVE was announced in Debian Security Advisory DSA-4089-1, and affects the bind9 package.

This issue should be fixed in the next version of Cumulus Linux.

-------------------------------------------------------------------------
Debian Security Advisory DSA-4089-1 security@debian.org
https://www.debian.org/security/ Salvatore Bonaccorso
January 16, 2018 https://www.debian.org/security/faq
-------------------------------------------------------------------------
Package : bind9

CVE ID : CVE-2017-3145
Jayachandran Palanisamy of Cygate AB reported that BIND, a DNS server implementation, was improperly sequencing cleanup operations, leading in some cases to a use-after-free error, triggering an assertion failure and crash in named.

For the oldstable distribution (jessie), this problem has been fixed in version 1:9.9.5.dfsg-9+deb8u15.

For the stable distribution (stretch), this problem has been fixed in version 1:9.10.3.dfsg.P4-12.3+deb9u4.

We recommend that you upgrade your bind9 packages.


RN-833 (CM-19446)
Security: Debian Security Advisory DSA-4086 for libxml2 issue CVE-2017-15412

The following CVE was announced in Debian Security Advisory DSA-4086-1, and affects the libxml2 package.

This issue should be fixed in the next version of Cumulus Linux.

--------------------------------------------------------------------------
Debian Security Advisory DSA-4086-1 security@debian.org
https://www.debian.org/security/ Salvatore Bonaccorso
January 13, 2018 https://www.debian.org/security/faq
--------------------------------------------------------------------------

Package : libxml2
CVE ID : CVE-2017-15412
Debian Bug : 883790

Nick Wellnhofer discovered that certain function calls inside XPath
predicates can lead to use-after-free and double-free errors when
executed by libxml2's XPath engine via an XSLT transformation.

For the oldstable distribution (jessie), this problem has been fixed
in version 2.9.1+dfsg1-5+deb8u6.


RN-834 (CM-19385)
Security: Debian Security Advisory DSA-4082 for kernel issues CVE-2017-8824 CVE-2017-15868 CVE-2017-16538 CVE-2017-16939 CVE-2017-17448 CVE-2017-17449 CVE-2017-17450 CVE-2017-17558 CVE-2017-17741 CVE-2017-17805 and more

The following CVEs were announced in Debian Security Advisory DSA-4082-1, and affect the Linux kernel.

These issues should be fixed in the next version of Cumulus Linux.

--------------------------------------------------------------------------
Debian Security Advisory DSA-4082-1 security@debian.org
https://www.debian.org/security/ Salvatore Bonaccorso
January 09, 2018 https://www.debian.org/security/faq
--------------------------------------------------------------------------

Package : linux
CVE ID : CVE-2017-8824 CVE-2017-15868 CVE-2017-16538
CVE-2017-16939 CVE-2017-17448 CVE-2017-17449 CVE-2017-17450
CVE-2017-17558 CVE-2017-17741 CVE-2017-17805 CVE-2017-17806
CVE-2017-17807 CVE-2017-1000407 CVE-2017-1000410

Several vulnerabilities have been discovered in the Linux kernel that may lead to a privilege escalation, denial of service or information leaks.

CVE-2017-8824

Mohamed Ghannam discovered that the DCCP implementation did not correctly manage resources when a socket is disconnected and reconnected, potentially leading to a use-after-free. A local user could use this for denial of service (crash or data corruption) or possibly for privilege escalation. On systems that do not already have the dccp module loaded, this can be mitigated by disabling it:

echo >> /etc/modprobe.d/disable-dccp.conf install dccp false

CVE-2017-15868

Al Viro found that the Bluetooth Network Encapsulation Protocol (BNEP) implementation did not validate the type of the second socket passed to the BNEPCONNADD ioctl(), which could lead to memory corruption. A local user with the CAP_NET_ADMIN capability can use this for denial of service (crash or data corruption) or possibly for privilege escalation.

CVE-2017-16538

Andrey Konovalov reported that the dvb-usb-lmedm04 media driver did not correctly handle some error conditions during initialisation. A physically present user with a specially designed USB device can use this to cause a denial of service (crash).

CVE-2017-16939

Mohamed Ghannam reported (through Beyond Security's SecuriTeam Secure Disclosure program) that the IPsec (xfrm) implementation did not correctly handle some failure cases when dumping policy information through netlink. A local user with the CAP_NET_ADMIN capability can use this for denial of service (crash or data corruption) or possibly for privilege escalation.

CVE-2017-17448

Kevin Cernekee discovered that the netfilter subsystem allowed users with the CAP_NET_ADMIN capability in any user namespace, not just the root namespace, to enable and disable connection tracking helpers. This could lead to denial of service, violation of network security policy, or have other impact.

CVE-2017-17449

Kevin Cernekee discovered that the netlink subsystem allowed users with the CAP_NET_ADMIN capability in any user namespace to monitor netlink traffic in all net namespaces, not just those owned by that user namespace. This could lead to exposure of sensitive information.

CVE-2017-17450

Kevin Cernekee discovered that the xt_osf module allowed users with the CAP_NET_ADMIN capability in any user namespace to modify the global OS fingerprint list.

CVE-2017-17558

Andrey Konovalov reported that the USB core did not correctly handle some error conditions during initialisation. A physically present user with a specially designed USB device can use this to cause a denial of service (crash or memory corruption), or possibly for privilege escalation.

CVE-2017-17741

Dmitry Vyukov reported that the KVM implementation for x86 would over-read data from memory when emulating an MMIO write if the kvm_mmio tracepoint was enabled. A guest virtual machine might be able to use this to cause a denial of service (crash).

CVE-2017-17805

It was discovered that some implementations of the Salsa20 block cipher did not correctly handle zero-length input. A local user could use this to cause a denial of service (crash) or possibly have other security impact.

CVE-2017-17806

It was discovered that the HMAC implementation could be used with an underlying hash algorithm that requires a key, which was not intended. A local user could use this to cause a denial of service (crash or memory corruption), or possibly for privilege escalation.

CVE-2017-17807

Eric Biggers discovered that the KEYS subsystem lacked a check for write permission when adding keys to a process's default keyring. A local user could use this to cause a denial of service or to obtain sensitive information.

CVE-2017-1000407

Andrew Honig reported that the KVM implementation for Intel processors allowed direct access to host I/O port 0x80, which is not generally safe. On some systems this allows a guest VM to cause a denial of service (crash) of the host.

CVE-2017-1000410

Ben Seri reported that the Bluetooth subsystem did not correctly handle short EFS information elements in L2CAP messages. An attacker able to communicate over Bluetooth could use this to obtain sensitive information from the kernel.

For the oldstable distribution (jessie), these problems have been fixed in version 3.16.51-3+deb8u1.


RN-836 (CM-19353)
The `net del` and `net add bridge` commands do not work in the same net commit

If a bridge is previously configured and you run the net del all and the net add bridge commands in the same net commit, all bridge and VLAN commands fail and there will not be any bridge or VLAN configuration added to the switch.

This issue should be fixed in the next version of Cumulus Linux.


RN-1002 (CM-21566)
FRR next-hop resolution changes are not updated when applying VRF to an interface after routes are configured in FRR

When adding new SVIs and static VRF routes in FRR, the appropriate VRF is applied to the interface in the kernel after the static routes are configured in FRR. When the kernel interface changes to the appropriate VRF, FRR next-hop resolution is not updated with the valid connected next-hop interface.

To work around this issue, remove and re-add the static routes.

This issue is being investigated at this time.

Issues Fixed in Cumulus Linux 3.5.2

The following is a list of issues fixed in Cumulus Linux 3.5.2 from earlier versions of Cumulus Linux.

Release Note ID Summary Description

RN-800 (CM-19510)
Routed traffic received on an MLAG bond to VXLAN destination dropped over peerlink

In an MLAG environment, when both peer switches in an MLAG pair are enabled, intermittent connectivity resulted, with around 50% packet loss. This issue wasn't seen when shutting down the peerlink and traffic flowed only on one switch in the MLAG pair.

This issue is fixed in Cumulus Linux 3.5.2.


RN-801 (CM-19195)
In VXLAN routing, border leafs in MLAG use anycast IP address after FRR restart

For type-5 routes, when an MLAG pair is used as border leaf nodes, the MLAG primary and secondary nodes use their respective loopback IP addresses as the originator IP address to start, but switch to using the MLAG anycast IP address after an FRR restart.

This issue is fixed in Cumulus Linux 3.5.2.


RN-802 (CM-19343)
NCLU `net add` command for an existing configuration produces an error

When you use the net add command for a configuration that already exists, NCLU displays an error instead of a warning.

This issue is fixed in Cumulus Linux 3.5.2.


RN-803 (CM-19456)
EVPN and IPv4 routes change origin after redistribution

EVPN routes were getting re-injected back into EVPN as type-5 routes when a type-5 advertisement was enabled. This issue occurred when advertising different subnets from different VTEPs into a type-5 EVPN symmetric mode environment.

This issue is fixed in Cumulus Linux 3.5.2.


RN-816 (CM-19369)
Connected routes are not redistributed into a BGP VRF after restarting the networking service

When redistributing connected VRF routes into BGP, sometimes the routes are not redistributed into BGP after you restart the networking service or after certain interface flap conditions.

To work around this issue, restart FRR with the systemctl restart frr.service command.

This issue is fixed in Cumulus Linux 3.5.2.


RN-838 (CM-19897)
VXLAN flow dropped as ingress general discards on Mellanox switches

After creating a new VNI on a Mellanox switch, VXLAN encapsulated packets are dropped as ingress general discards.

To work around this issue, flap the newly created VNI using the sudo ifdown <vni> and sudo ifup <vni> commands, or trunk the associated VLAN on an interface that is not currently forwarding that VLAN.

This issue is fixed in Cumulus Linux 3.5.2.

New Known Issues in Cumulus Linux 3.5.2

The following issues are new to Cumulus Linux and affect the current release.

Release Note ID Summary Description

RN-806 (CM-19592)
FRR removes all static routes when service is stopped, including those created by ifupdown2

Whenever FRR is restarted, it deletes all routes in the kernel with a protocol type of BGP, ISIS, OSPF, and static. When you upgrade FRR and the service is stopped, the static routes defined in the interfaces file and installed using ifupdown2 are also removed.

To work around this issue, configure static routes in the /etc/network/interfaces file as follows:

post-up ip route add <prefix> via <next-hop address> proto kernel

For example:

auto swp2
iface swp2
    post-up ip route add 0.0.0.0/0 via 192.0.2.249 proto kernel

This issue should be fixed in the next release of Cumulus Linux.


RN-807 (CM-17159)
NCLU `net show interface <bond>` command shows interface counters that are not populated

The output of the NCLU net show interface <bond> command shows misleading and incorrect interface counters.

This issue is currently being investigated.


RN-808 (CM-15902)
In EVPN, sticky MAC addresses move from one bridge port to another

In EVPN environments, sticky MAC addresses move from one bridge port to another on soft nodes.

This issue is currently being investigated.


RN-809 (CM-19120)
`netshow lldp` command displays an error

When running the netshow lldp command, the output displays the following error:

cumulus@switch:~# netshow lldp
ERROR: The lldpd service is running, but '/usr/sbin/lldpctl -f xml' failed.

However, the NCLU net show lldp command works correctly.

This issue should be fixed in the next release of Cumulus Linux.


RN-810 (CM-19601)
Security: Wireshark vulnerabilities DSA-4101 CVE-2018-5334 CVE-2018-5335 CVE-2018-5336

The Wireshark security vulnerabilities were announced after the branch for Cumulus Linux 3.5.2 was frozen.

Wireshark is not installed in the base Cumulus Linux image; it is an optional package. And since these vulnerabilities are relatively minor, they will be fixed in the next Cumulus Linux release.

Following is the Debian security advisory DSA-4101-1:

--------------------------------------------------------------------------
Debian Security Advisory DSA-4101-1 security@debian.org
https://www.debian.org/security/ Moritz Muehlenhoff
January 28, 2018 https://www.debian.org/security/faq
--------------------------------------------------------------------------
Package : wireshark
CVE ID : CVE-2018-5334 CVE-2018-5335 CVE-2018-5336
It was discovered that wireshark, a network protocol analyzer, contained
several vulnerabilities in the dissectors/file parsers for IxVeriWave,
WCP, JSON, XML, NTP, XMPP and GDB, which could result in denial of
service or the execution of arbitrary code.
For the oldstable distribution (jessie), these problems have been fixed
in version 1.12.1+g01b65bf-4+deb8u13.
For the stable distribution (stretch), these problems have been fixed in
version 2.2.6+g32dac6a-2+deb9u2.

RN-811 (CM-18702)
BGP unnumbered neighbor stays down after reload and interface flap

The BGP unnumbered neighbor does not come back up after issuing the reload command or after an interface flap. This issue is specific to how RA suppression is disabled by FRRouting for BGP unnumbered interfaces. By default, RA suppression is enabled. However, when BGP unnumbered interfaces are configured, RA suppression is disabled.

Whether the issue arises depends on your FRR configuration. If you push an frr.conf file to the switch that does not include the no ipv6 nd suppress-ra option, FRR adds the option to the running configuration, but it is still not present in frr.conf. When you reload FRR, RA suppression is enabled again because the option is not in frr.conf. The next time an interface flaps, BGP unnumbered is unable to establish peering.

To work around this issue, after you push frr.conf and restart frr.service, perform an additional write in FRR; this syncs the saved and running configurations.

This issue should be fixed in the next release of Cumulus Linux.
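
To see whether a saved configuration is exposed to this issue, you can list the interface stanzas that lack the no ipv6 nd suppress-ra line. A minimal sketch follows. It assumes interface stanzas start with an unindented interface line and that their options are indented. The name ifaces_without_ra is a hypothetical helper. Not every interface it reports is a BGP unnumbered interface, so compare the output against your own neighbor configuration.

```shell
#!/bin/sh
# Sketch only: print interfaces in an frr.conf-style file whose stanza
# does not contain "no ipv6 nd suppress-ra".
ifaces_without_ra() {
    awk '
        function report() { if (name != "" && !seen) print name }
        /^interface / { report(); name = $2; seen = 0; next }
        /^[^ ]/       { report(); name = "" }   # another top-level line ends the stanza
        /^ *no ipv6 nd suppress-ra/ { seen = 1 }
        END { report() }
    ' "$1"
}

# Example call, assuming the saved configuration lives in /etc/frr/frr.conf.
[ -r /etc/frr/frr.conf ] && ifaces_without_ra /etc/frr/frr.conf
true
```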


RN-814 (CM-19658)
NCLU `net add VXLAN` commands out of order

When creating a VNI, the vxlan id parameter must be specified first to instantiate the VXLAN type interface object. When running net show configuration commands however, the commands are printed in a different order so a copy and paste will fail.

To work around this issue, manually reorder the commands so that vxlan id is executed before any other bridging or VXLAN parameters.
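
The manual reordering can be sketched as a small script. This is an illustration only, under the assumption that the copied commands have the shape "net add vxlan <name> vxlan id <vni>". The function name is hypothetical. It moves every vxlan id command to the front and keeps the relative order of everything else.

```python
def vxlan_id_first(commands):
    """Return the commands with every 'vxlan id' line moved to the front.

    The sort is stable, so commands keep their relative order within
    each of the two groups.
    """
    def is_vxlan_id(cmd):
        words = cmd.split()
        # Assumed shape: net add vxlan <name> vxlan id <vni>
        return words[:3] == ["net", "add", "vxlan"] and words[4:6] == ["vxlan", "id"]

    return sorted(commands, key=lambda cmd: 0 if is_vxlan_id(cmd) else 1)


if __name__ == "__main__":
    # Assumed sample output, for illustration only.
    pasted = [
        "net add vxlan vni10 bridge access 10",
        "net add vxlan vni10 vxlan local-tunnelip 10.0.0.1",
        "net add vxlan vni10 vxlan id 10",
    ]
    for line in vxlan_id_first(pasted):
        print(line)
```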

This issue should be fixed in the next release of Cumulus Linux.


RN-815 (CM-19630)
Bridge MAC address clashing when eth0 is part of the same broadcast domain

Cumulus Linux 3.5.0 and later use the eth0 MAC address as the MAC address for bridges. If eth0 is part of the same broadcast domain, the MAC addresses clash and you experience outages when upgrading to 3.5.x.

To work around this issue, manually change the bridge MAC address in the /etc/network/interfaces file.

This issue should be fixed in the next release of Cumulus Linux.

Issues Fixed in Cumulus Linux 3.5.1

The following is a list of issues fixed in Cumulus Linux 3.5.1 from earlier versions of Cumulus Linux.

Release Note ID Summary Description

RN-691 (CM-18647, CM-19279)
Configuring a DHCP relay on a VRR interface with NCLU causes errors

When configuring a DHCP relay on a VRR interface using the NCLU commands, errors are seen when running sudo ifreload -a and net commit.

cumulus@switch:~$ sudo ifreload -a
error: 'scope'

To work around this issue, edit the /etc/network/interfaces file and remove any vlanX-v0 stanzas, then run sudo ifreload -a again.

This issue is fixed in Cumulus Linux 3.5.1.


RN-695 (CM-19156)
When adding an OSPF passive interface to a VRF that does not exist, ospfd crashes

Adding an OSPF passive interface to an OSPF VRF that has not been defined previously causes ospfd to crash.

To work around this issue, create the VRF before you add the passive OSPF interface.

This issue is fixed in Cumulus Linux 3.5.1.


RN-732 (CM-16550)
With management VRF, the `net show time ntp servers` command shows empty output

With management VRF, the output of the NCLU command net show time ntp servers is empty.

This issue is fixed in Cumulus Linux 3.5.1.


RN-738 (CM-18709)
On Dell S4148T-ON switches with Maverick ASICs, configuring 1G or 100M speeds on 10G fixed copper ports requires a ports.conf workaround

1G and 100M speeds on SFP ports are not working on the Dell S4148T-ON.

To enable a speed lower than 10G on a port on the S4148T platform, you must dedicate an entire port group (four interfaces) to a lower speed setting. Within a port group, you can mix 1G and 100M speeds, if needed. You cannot mix 10G and lower speeds.

To work around this issue:

  1. In the /etc/cumulus/ports.conf file, add each of the four ports in the port group as 1G interfaces. You must set each of the ports in the port group to be 1G. Port groups are swp1-4, swp5-8, swp9-12, and so on, and starting with swp31-35 on the right half of the switch. For example, to enable ports swp5-swp8 to autonegotiate to 100M or 1G speeds, add the following to the ports.conf file:
    5=1G
    6=1G
    7=1G
    8=1G
  2. Restart switchd:
    cumulus@switch:~$ sudo systemctl reset-failed switchd; sudo systemctl restart switchd

    After this is done, ports swp5-8 can autonegotiate with the neighbor devices to 1G or 100M speeds.

As of 3.5.1, 1G interfaces are supported when using the ports.conf file workaround as described above. As of 3.6.0, editing the ports.conf file is no longer required.
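
Because every port in a group must be set together, it can help to generate the ports.conf lines from a single port number. The sketch below is an illustration with a hypothetical function name. It assumes the regular groups of four that start at swp1 (swp1-4, swp5-8, swp9-12) and does not model the port groups on the right half of the switch.

```python
def port_group_lines(port, speed="1G", group_size=4):
    """Return ports.conf lines for the whole port group that contains `port`.

    Assumes groups of `group_size` ports numbered from 1 (1-4, 5-8, 9-12).
    """
    if port < 1:
        raise ValueError("port numbers start at 1")
    # First port of the group that contains `port`.
    first = ((port - 1) // group_size) * group_size + 1
    return [f"{n}={speed}" for n in range(first, first + group_size)]


if __name__ == "__main__":
    # swp6 sits in the swp5-8 group, so all four ports are listed.
    print("\n".join(port_group_lines(6)))
```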


RN-748 (CM-19202)
The `link autoneg off` setting not applied to the last set of interfaces in a list if OFF already set on one of the interfaces

Using NCLU to assign the link autoneg off setting to a list of interfaces fails to complete the list if one of the interfaces in the list already has the link autoneg off setting.

This issue is fixed in Cumulus Linux 3.5.1.


RN-765 (CM-19133, CM-19139)
On Mellanox switches, unable to assign VLANs 1992 and above 

On Mellanox switches, you can have a maximum of 2048 VLANs, less the number of physical interfaces. Typically, you have no more than 48 physical interfaces, which leaves 2000 VLANs available. If you create more than 48 interfaces, there are only 1991 VLANs available.

This issue is fixed in Cumulus Linux 3.5.1.


RN-780 (CM-19193)
DC power supply on Edgecore AS5812 and AS5712 causes switch to be inoperable

An issue with the DC power supply on the Edgecore AS5812 and AS5712 models causes the switch to be inoperable.

This issue is fixed in Cumulus Linux 3.5.1.


RN-781 (CM-19067)
VXLAN symmetric routing: Packets are CPU forwarded after switchd restarts

When VXLAN symmetric routing is enabled, sometimes packets get forwarded to the CPU after switchd is restarted.

To work around this issue, restart the networking service:

cumulus@switch:~$ sudo systemctl restart networking

This issue is fixed in Cumulus Linux 3.5.1.


RN-786 (CM-19300)
NCLU `net show interface` command output for bridge interfaces is incorrect or missing

The output for the NCLU net show interface command for bridge interfaces is missing or incorrect. The interface mode does not show Bridge/L2 and the member interfaces are shown.

This issue was a regression of an earlier issue and has been fixed again in Cumulus Linux 3.5.1.


RN-789 (CM-19280)
CPU not put into the flood group with ARP suppression off

For VXLAN routing, when ARP suppression is disabled, the CPU is not placed in the flood group so ARP requests do not reach the CPU.

This issue is fixed in Cumulus Linux 3.5.1.


RN-791 (CM-19218)
DNS issues on boot up cause first ZTP fetch to fail

When the switch boots, DNS issues sometimes cause ZTP to fail when searching for an install script.

This issue is fixed in Cumulus Linux 3.5.1.


RN-794 (CM-19153)
NCLU `net show config` command output is incorrectly formatted

The output of the NCLU net show config commands is not formatted correctly. Trying to copy and paste the output produces an error.

This issue is fixed in Cumulus Linux 3.5.1.


RN-796 (CM-19045)
netd sometimes crashes with SNMP trap configuration

The netd service crashes if you issue the snmp-server trap-link-up command with a non-default snmpd.conf file. The configuration file is expected to include the following default configuration option:

monitor -r 60 -o laNames -o laErrMessage "laTable" laErrorFlag != 0

To work around this issue, you can manually edit the /etc/snmp/snmpd.conf file and add the missing default configuration option.

This issue is fixed in Cumulus Linux 3.5.1.


RN-797 (CM-18980)
NCLU needs support for multiple access client IP addresses associated with a single community

Previously, with NCLU, you were unable to add multiple IP addresses without defining a unique community for each. You can now add multiple access IP addresses to use the same community password.

This issue is fixed in Cumulus Linux 3.5.1.


RN-839 (CM-19038)
NCLU DecodeError: 'ascii' codec can't decode byte 0xc2

Certain Unicode characters in the /etc/network/interfaces.d/vlans.intf file, which is used by the interfaces file, cause a decode error when running the net show interface command.

This issue is fixed in Cumulus Linux 3.5.1.

New Known Issues in Cumulus Linux 3.5.1

The following issues are new to Cumulus Linux and affect the current release.

Release Note ID Summary Description

RN-785 (CM-19422)
NCLU `net show interface detail` command does not display detailed output

The net show interface swp# command returns the same output as net show interface swp# detail.

To view the additional information typically presented, use alternative commands. For example, to view the module information and statistics use ethtool swp# and ethtool -S swp#.

This issue is currently being investigated.


RN-787 (CM-19418)
NCLU: `net add hostname` creates an inconsistency between /etc/hostname and /etc/hosts files

Running net add hostname <hostname> updates both /etc/hostname and /etc/hosts. However, NCLU modifies the hostname value passed to /etc/hostname, removing certain characters and converting the hostname to lowercase, whereas the hostname passed to /etc/hosts is passed through as is, creating an inconsistency between the two files.

To work around this issue, manually set the hostname in both /etc/hostname and /etc/hosts using a text editor such as vi or nano.
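
After editing both files, a quick consistency check can confirm that they agree. The following is a minimal sketch with a hypothetical function name. It takes the contents of the two files as strings and only tests whether the name in /etc/hostname appears as a host name on some line of /etc/hosts. It does not reproduce the character filtering that NCLU applies.

```python
def hosts_file_mentions(hostname_text, hosts_text):
    """Return True if the name from /etc/hostname appears in /etc/hosts.

    hostname_text: contents of /etc/hostname.
    hosts_text:    contents of /etc/hosts.
    """
    name = hostname_text.strip()
    for raw in hosts_text.splitlines():
        # Drop comments, then split into address followed by host names.
        fields = raw.split("#", 1)[0].split()
        if name and name in fields[1:]:
            return True
    return False


if __name__ == "__main__":
    with open("/etc/hostname") as f_name, open("/etc/hosts") as f_hosts:
        print(hosts_file_mentions(f_name.read(), f_hosts.read()))
```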

This issue is currently being investigated.


RN-788 (CM-19381)
dhcrelay does not bind to interfaces that have names longer than 14 characters

The dhcrelay command does not bind to an interface if the interface's name is longer than 14 characters.

To work around this issue, change the interface name to be 14 or fewer characters if dhcrelay is required to bind to it.
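
To find the interfaces that need renaming, the length rule above can be applied to a list of names. This is a small sketch with hypothetical names, not part of Cumulus Linux. The 14-character limit comes from the description of this issue.

```python
# Longest interface name that dhcrelay binds to, per this release note.
DHCRELAY_MAX_NAME_LENGTH = 14


def too_long_for_dhcrelay(names, limit=DHCRELAY_MAX_NAME_LENGTH):
    """Return the interface names that exceed the dhcrelay length limit."""
    return [name for name in names if len(name) > limit]


if __name__ == "__main__":
    # Assumed sample interface names, for illustration only.
    print(too_long_for_dhcrelay(["swp1", "bond-server-03", "peerlink-uplink"]))
```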

This issue is currently being investigated.


RN-792 (CM-19323)
On Trident II+ switches with 3 meter 40G DACs, an auto-negotiation mismatch may prevent or delay the link from coming up

On a Trident II+ switch, it's been reported that when there is an auto-negotiation mismatch between the switch and the NIC, a 40G 3 meter DAC may not link up, or could take more than a minute to do so. The issue has not been seen with 1 meter DACs or any active cable.

To work around this issue, configure auto-negotiation for the interface to allow it to operate properly.

This issue is currently being investigated.


RN-793 (CM-19321)
FRR does not detect the bandwidth for 100G interfaces correctly

FRR correctly detects the bandwidth for both 10G interfaces and 40G interfaces. However, it does not do so for 100G interfaces. Setting link speed manually does not fix this issue.

To work around this issue, restart the FRR service:

cumulus@switch:~$ sudo systemctl restart frr.service

This issue is currently being investigated.


RN-795 (CM-19320)
`net add ospf auto-cost reference-bandwidth` command result applied to ospf6 instead of ospf

The results from running the net add ospf auto-cost reference-bandwidth command are applied to an OSPFv3 (net add ospf6) configuration even though it is expected to be applied to an OSPFv2 (net add ospf) configuration.

This issue should be fixed in the next release of Cumulus Linux.


RN-798 (CM-19257)
NCLU `net show config commands` doesn't parse the multiple forms for agentaddress in snmpd.conf

If you manually edit the snmpd.conf to specify the agentaddress, net show config commands outputs the command in a way that cannot be pasted back into the file.

For example, you can specify the agentaddress in any of the following ways:

agentaddress udp:1.1.1.1:161,2.2.2.2:171,3.3.3.3
agentaddress 4.4.4.4,5.5.5.5:171,6.6.6.6:161
agentaddress tcp:7.7.7.7
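
To make the three forms concrete, the sketch below splits an agentaddress line into its comma-separated endpoints. It is an illustration only, not NCLU's parser, and the function name is hypothetical. It handles IPv4 addresses like the ones above, records a transport prefix only on the entry where it is written, and leaves the port as None when none is given.

```python
def parse_agentaddress(line):
    """Split an agentaddress directive into (transport, host, port) tuples.

    transport is "udp", "tcp" or None; port is an int or None.
    IPv4 only: bracketed IPv6 addresses are not handled by this sketch.
    """
    keyword, _, spec = line.strip().partition(" ")
    if keyword != "agentaddress":
        raise ValueError("not an agentaddress directive")
    endpoints = []
    for item in spec.split(","):
        parts = item.strip().split(":")
        transport = None
        if parts[0] in ("udp", "tcp"):
            transport = parts.pop(0)
        host = parts[0]
        port = int(parts[1]) if len(parts) > 1 else None
        endpoints.append((transport, host, port))
    return endpoints


if __name__ == "__main__":
    for entry in parse_agentaddress("agentaddress udp:1.1.1.1:161,2.2.2.2:171,3.3.3.3"):
        print(entry)
```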

This issue is currently being investigated.


RN-799 (CM-16493)
NCLU and ifupdown2 cannot enable or disable the IPv6 link-local eui-64 format

You cannot use NCLU or ifupdown2 to enable or disable the IPv6 link-local eui-64 format.

To work around this limitation, you can use the following iproute2 command:

cumulus@switch:~$ sudo ip link set swp# addrgenmode {eui64|none}

Note that this command does not persist across a reboot of the switch.

This issue is currently being investigated.

Issues Fixed in Cumulus Linux 3.5.0

The following is a list of issues fixed in Cumulus Linux 3.5.0 from earlier versions of Cumulus Linux.

Release Note ID Summary Description

RN-125 (CM-1576)
Network LSA with an old router ID isn't flushed out by the originator

When the router ID is changed, the router should remove the previous network LSA (link-state advertisement) that it generated based on the IP address on the interface in the Network LSA.

Cumulus Linux did not remove the old LSA, so it would only change when it naturally aged out.

This issue is fixed in Cumulus Linux 3.5.0. The old network LSA is now flushed when the router ID changes, rather than waiting for the max age timer.


RN-387 (CM-8163)
Quagga appears to not honor passive interfaces if VRR is active

In a VRR configuration, any interface-specific routing configuration (e.g., OSPF mode of operation) specified on the subinterface having a virtual IP address does not take effect. This is because when an operator has specified a virtual IP on a bridge, the system creates another internal interface bridge with the virtual IP and MAC. These two interfaces are treated distinctly by Quagga, so any interface-specific routing configuration on the bridge does not get carried over to the second bridge.

As a workaround, in a VRR deployment that needs interface-specific routing configuration on the interface with a virtual IP address, also specify the routing configuration against the internally created virtual interface.

This issue is fixed in Cumulus Linux 3.5.0.


RN-448 (CM-11302)
Using the json option in the "show ip bgp" command causes peer session flaps

This issue causes peer session flaps on Penguin Arctica 4806XP and Supermicro SSE-X3648S switches. It occurs with 16K IPv4 prefixes and only when you run show ip bgp json.

However, on switches with Tomahawk ASICs, with 61K IPv4 prefixes and default timers, the same show ip bgp json command causes all peer sessions to go down.

This issue is fixed in Cumulus Linux 3.5.0.


RN-542 (CM-13461)
Polling the BGP RIB with "show ip bgp" causes the peer to flap if the RIB has more than 600K entries

This is a known issue that's currently being investigated. The Quagga log shows these commands taking a very long time to execute.

To work around this issue, Cumulus Networks recommends you use larger keepalive/hold timers — 60 and 180 seconds, respectively.

This issue is fixed in Cumulus Linux 3.5.0.


RN-598 (CM-15575)
clagd process restarts when updating backup-ip

An error was found when an accidental change was made to the backup IP, and then corrected. ifreload -a would restart the clagd process to invoke the daemon with the new backup IP, rather than updating the backup IP with the change.

This issue is fixed in Cumulus Linux 3.5.0.


RN-646 (CM-17704)
switchd crashes when auto-negotiation is enabled on 10G LR/SR interfaces  

When auto-negotiation is enabled on a 10G LR or SR interface, switchd might crash and cannot be restarted unless you reboot the whole switch.

This issue was a regression of an earlier issue and has been fixed again in Cumulus Linux 3.5.0.


RN-649 (CM-17778)
The clagd service fails to start if the backup IP is over a management VRF

The clagd service fails to start if the backup IP address is in a non-default VRF. For example, a configuration like clagd-backup-ip 192.1.1.1 vrf green results in a clagd startup failure.

This issue was a regression of an earlier issue and has been fixed again in Cumulus Linux 3.5.0.


RN-650 (CM-17843)
NCLU: cannot configure FRR if all FRR daemons are disabled

A regression occurred where upgraded instances did not keep previous Quagga configurations. This meant that once the instance booted into 3.4.0, FRR was not configured.

This issue was a regression of an earlier issue and has been fixed again in Cumulus Linux 3.5.0.


RN-653 (CM-17856) 
Enabling PFC on Mellanox switches may cause switchd to crash

On Cumulus Linux versions 3.3.0 and later, enabling priority flow control (PFC), explicit congestion notification (ECN) or link pause on Mellanox Spectrum-based switches may cause the switchd process to crash.

To work around this issue, populate the appropriate unlimited_egress_buffer_port_set parameter in the /etc/cumulus/datapath/traffic.conf file. The default range should be "swp<a>-swp<z>", where "swp<a>" is the first front panel port in /var/lib/cumulus/porttab and "swp<z>" is the last front panel port in the porttab file. For example, to configure this parameter for PFC, use:

# priority flow control
pfc.port_group_list = [pfc_port_group]
pfc.pfc_port_group.cos_list = [0]
pfc.pfc_port_group.port_set = swp1-swp5
pfc.pfc_port_group.port_buffer_bytes = 25000
pfc.pfc_port_group.xoff_size = 10000
pfc.pfc_port_group.xon_delta = 2000
pfc.pfc_port_group.tx_enable = true
pfc.pfc_port_group.rx_enable = true
pfc.pfc_port_group.unlimited_egress_buffer_port_set = swp1-swp16

For ECN, the parameter would be ecn.ecn_port_group.unlimited_egress_buffer_port_set = swp1-swp16.

For link pause, the parameter would be link_pause.pause_port_group.unlimited_egress_buffer_port_set = swp1-swp16.
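
The "swp<a>-swp<z>" value can be derived from the list of front panel ports. The following is a minimal sketch with hypothetical function names. It takes the port names as a list, in the order they appear in the porttab file, because the format of /var/lib/cumulus/porttab is not described here. As an assumption, only names of the form swp followed by digits count as front panel ports.

```python
import re


def front_panel_range(port_names):
    """Return 'swp<a>-swp<z>' from the first and last front panel port.

    port_names: names in the order they appear in the porttab file.
    Assumption: only names matching swp<digits> are front panel ports.
    """
    swps = [name for name in port_names if re.fullmatch(r"swp\d+", name)]
    if not swps:
        raise ValueError("no front panel ports found")
    return f"{swps[0]}-{swps[-1]}"


def pfc_buffer_line(port_names):
    """Build the PFC unlimited_egress_buffer_port_set line for traffic.conf."""
    return ("pfc.pfc_port_group.unlimited_egress_buffer_port_set = "
            + front_panel_range(port_names))


if __name__ == "__main__":
    ports = ["swp%d" % n for n in range(1, 17)]
    print(pfc_buffer_line(ports))
```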

This issue was a regression of an earlier issue and has been fixed again in Cumulus Linux 3.5.0.


RN-657 (CM-18080) 
All multicast traffic on Trident II+ switches is software forwarded when RIOT is enabled

When RIOT is enabled on a Trident II+ switch, all multicast traffic gets software forwarded.

To work around this issue, disable RIOT on the switch. Edit the /usr/lib/python2.7/dist-packages/cumulus/__chip_config/bcm/datapath.conf file and change the vxlan_routing_overlay.profile setting to disable:

vxlan_routing_overlay.profile = disable

Then restart switchd for the change to take effect. This causes all network ports to reset in addition to resetting the switch hardware configuration.

cumulus@switch:~$ sudo systemctl restart switchd.service

This issue is fixed in Cumulus Linux 3.5.0.


RN-658 (CM-17338) 
Power cycling a connected host may result in control plane traffic failure on a 10GBASE-T Trident II+ switch

Switches with the Trident II+ chipset running Cumulus Linux 3.3.0 or later may experience a failure to transmit frames from the control plane following a power-cycle of a device connected via 10GBASE-T. This can result in complete loss of connectivity from the switch control plane to connected devices.

To work around this issue, restart switchd with sudo systemctl restart switchd.

This issue was a regression of an earlier issue and has been fixed again in Cumulus Linux 3.5.0.


RN-672 (CM-18154)  
Redistribute neighbor service rdnbrd does not add zebra route if connected host moves to a different interface

This situation occurs if the host was reachable via a given port (say swp1), and then also becomes reachable via a second port (say swp2). In this case, the routing table entry gets updated to point to swp2, but the neighbor entry on swp1 remains reachable.

If the host stops responding on swp2, the neighbor entry on swp1 remains reachable and keeps getting refreshed. As the entry on swp2 transitions to a FAILED status, the rdnbrd service removes the route from table 10, but table 10 does not get notified of a neighbor change and thus doesn't have an entry for this connected neighbor.

The only workaround is to restart the rdnbrd service, but this is not advised, especially if the host moves around the network frequently, as would be the case if the host is a virtual machine.

This issue is fixed in Cumulus Linux 3.5.0.


RN-673 (CM-18254) 
On Mellanox switches, the switch ports are in operational DOWN status

This issue arose after upgrading to Cumulus Linux 3.4.2 and occurs under the following conditions:

  • Mellanox switch running Cumulus Linux 3.4.2 only
  • auto-negotiation is set to off
  • The remote side negotiates a different speed than what the switch uses as default speed setting

To work around this issue, set auto-negotiation to on for every affected interface at both ends:

cumulus@switch:~$ net add interface swp1-52 link autoneg on
cumulus@switch:~$ net pending
cumulus@switch:~$ net commit

This issue was a regression of an earlier issue and has been fixed again in Cumulus Linux 3.5.0.


RN-674 (CM-17577)
Cannot set the MTU for switch ports that is different than the MTU for eth0

You cannot set both a global MTU and an individual MTU in a policy file. For example, this configuration does not work:

root@leaf01:/home/cumulus# cat /etc/network/ifupdown2/policy.d/mtu.json
{
 "address": {"defaults": { "mtu": "9216" }},
 "ethtool": {"iface_defaults": {"eth0": {"mtu": "1500"}}}
}

This issue is fixed in Cumulus Linux 3.5.0.


RN-675 (CM-17735) 
When EVPN with ARP suppression is enabled, the total neighbor entries are limited by the RIOT profile, which defaults to 8K entries

When ARP suppression is turned on, a VXLAN SVI is configured on the bridge. In Cumulus Linux 3.4.z, VXLAN routing is enabled by default, so the neighbors are being learned and programmed in the Broadcom Trident II+ ASIC. The Trident II+ default profile supports up to 8K next hop entries, after which the following error messages are logged:

switchd[20053]: hal_bcm_l3.c:1283 CRIT bcm_l3_egress_create unit 0 mod 0 \
        port -2147483640 vlan 0 intf 10240 failed: Table full

These messages do not affect forwarding.

To work around this issue, disable VXLAN routing.

  1. Edit the /usr/lib/python2.7/dist-packages/cumulus/__chip_config/bcm/datapath.conf file and set vxlan_routing_overlay.profile to disable.
  2. Restart switchd:
    cumulus@switch:~$ sudo systemctl restart switchd.service

Note: Restarting switchd causes all network ports to reset in addition to resetting the switch hardware configuration.

This issue is fixed in Cumulus Linux 3.5.0.


RN-689 (CM-18369) 
On a Trident II+ switch, using VXLAN RIOT breaks the VRF setting in the L3_IIF table 

On a Trident II+ switch, VXLAN RIOT is enabled by default. However, a problem occurred with the VLAN-to-L3_IIF table setting that indicates the VRF. This caused switchd to fail to set the L3_IIF attribute in a new table entry, thus incorrect L3_IIF and profile attributes, including the VRF ID, were used for packet processing. This affected any routed interface, such as bonds, VLAN subinterfaces, and SVIs. As a result, you might be unable to ping the address of the WAN-facing interface on a border leaf switch.

The workaround involved disabling the VXLAN RIOT setting in the datapath.conf file.

This issue was a regression of an earlier issue and has been fixed again in Cumulus Linux 3.5.0.


RN-696 (CM-17040)
After rebooting a Cumulus Express 5812-54X switch, ports with 1000Base-T SFP are down when auto-negotiation is on

For 1000Base-T interfaces, auto-negotiation should be set to no. To work around this issue, disable auto-negotiation on these interfaces.

This issue is fixed in Cumulus Linux 3.5.0.


RN-698 (CM-17205)
When updating neighbor entries in hardware, a Mellanox switch returns "neigh_add failed. err: Entry Already Exists" error

This error occurs when VRR is configured.

This issue is fixed in Cumulus Linux 3.5.0.


RN-699 (CM-18951)
ifupdown2 policy applied incorrectly for eth0 

On Cumulus Linux, the ifupdown2 policy files stored in /etc/network/ifupdown2/policy.d/ might not be applied correctly to the eth0 interface.

This issue is fixed in Cumulus Linux 3.5.0.


RN-700 (CM-17209)
When both MLAG switches share the same IP address, it causes a loop

When configuring MLAG, if the clagd-peer-ip is the same as the switch's IP address, it causes the switch to peer with itself, resulting in a loop.

This issue is fixed in Cumulus Linux 3.5.0. The clagd-peer-ip cannot be the same as the peerlink subinterface IP address.


RN-701 (CM-17226)
MLAG clagd service exits due to misconfiguration

The switch stops the clagd.service when it detects a mismatched configuration, such as different sys-mac or clagd-vxlan-anycast-ip among the MLAG pair.

This issue is fixed in Cumulus Linux 3.5.0.


RN-703 (CM-17432)
An ACL fails to match traffic after an interface is bounced and the internal VLAN ID is changed

This issue is fixed in Cumulus Linux 3.5.0.


RN-705 (CM-17468)
If lacp-bypass-allow is configured, `net show config commands` displays a bond configuration incorrectly

If lacp-bypass-allow is configured on an interface, the output of net show configuration commands is displayed in an order that NCLU rejects if you try to copy and paste the commands and run them again. Consider the following command output:

cumulus@leaf03:~$ net show config commands
...
net add bridge bridge vlan-aware
net add bond server03 bond lacp-bypass-allow
net add bond server03 bond slaves swp1
net add bond server03 bridge access 20
net add bond server03 clag id 3
net add bond server03 stp bpduguard
net add hostname leaf03

Since net add bond server03 bond lacp-bypass-allow appears before the bond is defined with bond slaves, NCLU will reject the command.
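
The ordering problem can be detected mechanically before pasting. The sketch below is an illustration with a hypothetical function name. It walks a list of NCLU commands and reports any net add bond command that refers to a bond before its bond slaves line has appeared.

```python
def bonds_used_before_defined(commands):
    """Return 'net add bond' commands that precede the bond's 'bond slaves' line."""
    defined = set()
    offenders = []
    for cmd in commands:
        words = cmd.split()
        if words[:3] != ["net", "add", "bond"] or len(words) < 4:
            continue  # not a bond command
        name = words[3]
        if words[4:6] == ["bond", "slaves"]:
            defined.add(name)
        elif name not in defined:
            offenders.append(cmd)
    return offenders


if __name__ == "__main__":
    output = [
        "net add bridge bridge vlan-aware",
        "net add bond server03 bond lacp-bypass-allow",
        "net add bond server03 bond slaves swp1",
        "net add bond server03 bridge access 20",
    ]
    for line in bonds_used_before_defined(output):
        print("out of order:", line)
```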

This issue is fixed in Cumulus Linux 3.5.0.


RN-706 (CM-18771)
On Broadcom switches, IGMP snooping not working as expected 

Multicast traffic is flooded to all bridge ports even if there is a valid snooped (*,G) entry.

This issue is fixed in Cumulus Linux 3.5.0.


RN-707 (CM-17804)
MLAG goodbye message over peerlink not always sent

In an MLAG configuration, when the primary switch goes down, Cumulus Linux now sends a goodbye message over the backup link as well as over the peerlink.


RN-708 (CM-18749)
MLAG bridge mdb timer issue

MLAG does not sync the bridge mdb state between peers. 

This issue is fixed in Cumulus Linux 3.5.0.


RN-709 (CM-17839)
Mellanox switch returns parameter errors for bond configuration: "VLAN: Failure - port is LAG member"

A Mellanox switch returns the following errors in syslog for a bond configuration:

2017-08-28T10:06:17.596911-07:00 mlx-2700-01 sx_sdk: VLAN: Failure - port is LAG member (Parameter Error)
2017-08-28T10:06:17.616588-07:00 mlx-2700-01 sx_sdk: VLAN: Failure - port is LAG member (Parameter Error)
2017-08-28T10:06:17.619505-07:00 mlx-2700-01 sx_sdk: VLAN: Failure - port is LAG member (Parameter Error)

This issue is fixed in Cumulus Linux 3.5.0.


RN-710 (CM-18663)
Incorrect NCLU IPv6 SNMP configuration 

Valid IPv6 addresses cannot be bound by snmpd because the square brackets are missing in the configuration. These square brackets are required to distinguish between an IPv6 address and a port configuration (both contain a colon).
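
The need for brackets can be shown with a short formatter. This is a sketch with a hypothetical function name, not the code NCLU uses. It only builds the address and port portion of a listening address and does not add a transport prefix.

```python
import ipaddress


def format_listen_address(address, port):
    """Return 'address:port', wrapping IPv6 addresses in square brackets.

    Without the brackets, the colons inside an IPv6 address cannot be
    told apart from the colon that introduces the port.
    """
    ip = ipaddress.ip_address(address)
    if ip.version == 6:
        return f"[{ip}]:{port}"
    return f"{ip}:{port}"


if __name__ == "__main__":
    print(format_listen_address("2001:db8::1", 161))
    print(format_listen_address("192.0.2.1", 161))
```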

This issue is fixed in Cumulus Linux 3.5.0.


RN-711 (CM-17842)
NCLU net show lldp command reports wrong mode in LLDP output for Trunk/L2

The net show lldp command should display Access/L2 for the mode, but actually reports it as Trunk/L2.

This issue is fixed in Cumulus Linux 3.5.0.


RN-712 (CM-18634)
BGP IPv4 default-originate command fails next hop check when using unnumbered with IPv6 addresses 

BGP unnumbered does not support IPv6 GUA addresses on the interface which is peering IPv6. 

This issue is fixed in Cumulus Linux 3.5.0.


RN-713 (CM-18473)
New functionality within NCLU is enabled automatically after an upgrade 

All NCLU components are now enabled by default after an upgrade, unless explicitly disabled. If you edit the netd.conf file, you can keep your version of the file when performing an upgrade. 


RN-714 (CM-18458)
1G SFP ports flap when reloading settings with ifreload -a 

If a 1G fibre SFP is installed in a 10G SFP+ port and the port speed is not specified (auto-negotiation is on), reloading settings with the ifreload -a command causes the link to flap because of redundant ethtool set commands in ifupdown2.

This issue is fixed in Cumulus Linux 3.5.0.


RN-715 (CM-18012)
clagctl reports a host as single-attached when both MLAG peer switches are down

In an MLAG configuration, clagctl incorrectly reports offline servers as single connected when both of its MLAG switches are down. The proto-down reason should not indicate any active members.

This issue is fixed in Cumulus Linux 3.5.0.


RN-716 (CM-18433)
netd crashes if the default user cumulus is removed 

If you remove the default user cumulus from the system, netd fails to produce output and generates a traceback message when you run NCLU commands. Some commands return no output to the terminal screen, other commands indicate that netd is not working correctly. 

This issue is fixed in Cumulus Linux 3.5.0.


RN-717 (CM-18023)
NCLU does not add `ip igmp` before applying the `igmp join group` command

NCLU does not add ip igmp under an interface configuration before it applies the ip igmp join-group command, so a command like net add vlan 2 igmp join 192.0.2.0 0.0.0.0 silently fails.

This issue is fixed in Cumulus Linux 3.5.0.


RN-718 (CM-18031)
The NCLU OSPF message-digest-key command is incorrectly translated to the FRRouting configuration

The following NCLU command:

cumulus@switch:~$ net add vlan 501 ospf message-digest-key 7 md5 ospf

Gets incorrectly translated to the following in the FRRouting configuration, /etc/frr/frr.conf:

 ip ospf message-digest-key 7 md5 ip ospf

The correct syntax should be:

 ip ospf message-digest-key 7 md5 ospf

This issue is fixed in Cumulus Linux 3.5.0.


RN-719 (CM-18052)
After stopping the hsflowd service, sFlow continues to sample, causing buffer drops

If you stop the hsflowd service, the sFlow sampling appears to continue, sending the samples to the kernel. The sampled ports end up pushing a lot of traffic, and the added sFlow data causes buffer drops.

This issue is fixed in Cumulus Linux 3.5.0.


RN-720 (CM-18355)
Change in default multicast buffer size 

Sending multicast traffic to several interfaces while one interface is congested leads to dropped packets on all receivers. In Cumulus Linux 3.5.0, the default multicast buffer size has been changed so that the buffer size per port cannot be more than 128K (1024 cells).


RN-721 (CM-18069)
OSPFv3 (IPv6) does not install IPv6 prefix into the OSPFv3 RIB

This issue is fixed in Cumulus Linux 3.5.0.


RN-722 (CM-18318)
On Mellanox Spectrum switches, sFlow does not work when enabled on a bond member interface

The Mellanox Spectrum switch does not create or export flow samples when the sampled traffic flow is ingress AND egress on a bond interface. 

This issue is fixed in Cumulus Linux 3.5.0.


RN-723 (CM-18161)
Running ifreload bounces the loopback interface if an IPv6 address is defined before an IPv4 address

To work around this issue, edit the configuration in /etc/network/interfaces and move the IPv6 configuration after the IPv4 configuration.

This is incorrect:

auto lo 
iface lo inet loopback 
    address 2001:db8::1/128 
    address 192.0.2.1/32

This is correct:

auto lo 
iface lo inet loopback 
    address 192.0.2.1/32
    address 2001:db8::1/128 
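
A stanza can be checked for the problematic ordering with a few lines of code. The following is a minimal sketch with a hypothetical function name. It reads the address lines of a single stanza in order and reports whether any IPv4 address follows an IPv6 address.

```python
import ipaddress


def ipv6_before_ipv4(stanza_text):
    """Return True if an IPv4 'address' line follows an IPv6 one in the stanza."""
    seen_v6 = False
    for raw in stanza_text.splitlines():
        words = raw.split()
        if len(words) >= 2 and words[0] == "address":
            version = ipaddress.ip_interface(words[1]).version
            if version == 6:
                seen_v6 = True
            elif seen_v6:
                return True  # IPv4 address listed after an IPv6 address
    return False


if __name__ == "__main__":
    stanza = (
        "auto lo\n"
        "iface lo inet loopback\n"
        "    address 2001:db8::1/128\n"
        "    address 192.0.2.1/32\n"
    )
    print(ipv6_before_ipv4(stanza))
```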

This issue is fixed in Cumulus Linux 3.5.0.


RN-725 (CM-11824)
LACP protocol status flag in /proc/net/bonding/<name> output

A new status line is added to the output to indicate the LACP protocol status per member interface.


RN-727 (CM-14152)
ifreload not re-enabling IGMP snooping

ifupdown2 does not merge all the attributes defined in the policy files by default.

This issue is fixed in Cumulus Linux 3.5.0.


RN-728 (CM-14790)
No license error message from ifreload and NCLU commands

If a license file is not installed for switchd, ifreload and NCLU commands display an error on a setting that they can't apply (such as link speed).

You now see a warning message indicating that a license file is not installed.


RN-729 (CM-16099)
Logging for MLAG role change

It is not clear in MLAG logging what the switch's role is at any given time.

More logging is now added to specify what the role of the switch is. 


RN-731 (CM-16233)
netd crashes when configuring nameserver with no resolv.conf file

If you remove the /etc/resolv.conf file, then try to apply a name server configuration with NCLU, netd crashes.

This issue is fixed in Cumulus Linux 3.5.0.


RN-733 (CM-16612)
VXLAN interfaces stay down after ifreload -a 

After issuing the ifreload -a command when adding VXLAN interfaces, the new VXLAN interfaces remain in the NO CARRIER state. This issue occurs only if there is no MLAG peer connectivity.

This issue is fixed in Cumulus Linux 3.5.0.


RN-734 (CM-16716)
SPAN rules on a VXLAN VNI interface fail to install

Installing SPAN rules on a VXLAN VNI interface results in an installation error.

This issue is fixed in Cumulus Linux 3.5.0.


RN-735 (CM-16862)
Unable to start a service if VRF name contains a dash (-) 

If a VRF name contains a dash (-), any service you try to start fails with the message "Invalid VRF name." 

This issue is fixed in Cumulus Linux 3.5.0.


RN-736 (CM-18619)
Multiple DHCP relay forwarding requests overlap on outgoing interface 

Multiple DHCP relay forwarding requests are replicated erroneously to a server that does not serve that subnet.

This issue is fixed in Cumulus Linux 3.5.0.


RN-739 (CM-18790)
Confusing message received on IP unnumbered interface even though packet is forwarded

When DHCP relay is configured and a DHCP packet is received on an IP unnumbered interface, a Discard message is logged even though the DHCP packet is forwarded.

This issue is fixed in Cumulus Linux 3.5.0.


RN-740 (CM-18847)
Unreachable IPv6 route cache entries for connected network not removed when carrier restored

When traffic originating from the kernel is generated and destined to a connected VRF IPv6 global address while the connected interface is carrier-down, an unreachable route cache entry is created against the loopback interface:

cumulus@leaf01:~$ ip -6 ro ls cache table NAME
unreachable 2001:DB8::5 dev lo  metric 0 
    cache  error -101 pref medium

When the carrier is restored, this entry remains and subsequent route lookups continue to return unreachable results erroneously:

cumulus@leaf01:~$ sudo vrf task exec NAME ping6 2001:DB8::5
connect: Network is unreachable

This issue is fixed in Cumulus Linux 3.5.0.


RN-752 (CM-16683)
VXLAN MAC addresses change on reboot, which also affects the bridge MAC address

When you reboot the switch, VXLAN MAC addresses change. The bridge MAC address also changes and is set to the MAC address of eth0.

This issue is fixed in Cumulus Linux 3.5.0.


RN-767 (CM-17475)
Security: Linux kernel issues fixed in Cumulus Linux 3.5.0: DSA-3945-1 CVE-2017-7346 CVE-2017-7482 CVE-2017-7533 CVE-2017-7541 CVE-2017-7542 CVE-2017-9605 CVE-2017-10810 CVE-2017-10911 CVE-2017-11176 CVE-2017-1000365

The following CVEs that were announced in Debian Security Advisory DSA-3945-1 apply to packages maintained and built by Cumulus Networks. They have been fixed in Cumulus Linux 3.5.0 (package version 4.1.33-1+cl3u10):

--------------------------------------------------------------------------
Debian Security Advisory DSA-3945-1 security@debian.org
https://www.debian.org/security/ Salvatore Bonaccorso
August 17, 2017 https://www.debian.org/security/faq
--------------------------------------------------------------------------

Package : linux
CVE ID : CVE-2017-7346 CVE-2017-7482 CVE-2017-7533 CVE-2017-7541
CVE-2017-7542 CVE-2017-9605 CVE-2017-10810 CVE-2017-10911
CVE-2017-11176 CVE-2017-1000365

Several vulnerabilities have been discovered in the Linux kernel that may lead to a privilege escalation, denial of service or information leaks.

CVE-2017-7346
Li Qiang discovered that the DRM driver for VMware virtual GPUs does not properly check user-controlled values in the vmw_surface_define_ioctl() functions for upper limits. A local user can take advantage of this flaw to cause a denial of service.

CVE-2017-7482
Shi Lei discovered that RxRPC Kerberos 5 ticket handling code does not properly verify metadata, leading to information disclosure, denial of service or potentially execution of arbitrary code.

CVE-2017-7533
Fan Wu and Shixiong Zhao discovered a race condition between inotify events and VFS rename operations allowing an unprivileged local attacker to cause a denial of service or escalate privileges.

CVE-2017-7541
A buffer overflow flaw in the Broadcom IEEE802.11n PCIe SoftMAC WLAN driver could allow a local user to cause kernel memory corruption, leading to a denial of service or potentially privilege escalation.

CVE-2017-7542
An integer overflow vulnerability in the ip6_find_1stfragopt() function was found allowing a local attacker with privileges to open raw sockets to cause a denial of service.

CVE-2017-9605
Murray McAllister discovered that the DRM driver for VMware virtual GPUs does not properly initialize memory, potentially allowing a local attacker to obtain sensitive information from uninitialized kernel memory via a crafted ioctl call.

CVE-2017-10810
Li Qiang discovered a memory leak flaw within the VirtIO GPU driver resulting in denial of service (memory consumption).

CVE-2017-10911 / XSA-216
Anthony Perard of Citrix discovered an information leak flaw in Xen blkif response handling, allowing a malicious unprivileged guest to obtain sensitive information from the host or other guests.

CVE-2017-11176
It was discovered that the mq_notify() function does not set the sock pointer to NULL upon entry into the retry logic. An attacker can take advantage of this flaw during a user-space close of a Netlink socket to cause a denial of service or potentially cause other impact.

CVE-2017-1000365
It was discovered that argument and environment pointers are not taken properly into account to the imposed size restrictions on arguments and environmental strings passed through RLIMIT_STACK/RLIMIT_INFINITY. A local attacker can take advantage of this flaw in conjunction with other flaws to execute arbitrary code.

For the oldstable distribution (jessie), these problems have been fixed in version 3.16.43-2+deb8u3.


RN-768 (CM-18121)
Security: Linux kernel issues fixed in Cumulus Linux 3.5.0: DSA-3981-1, CVE-2017-7518, 7558, 10661, 11600, 12134, 12146, 12153, 12154, 14106, 14140, 14156, 14340, 14489, 14497, 1000111, 1000112, 1000251, 1000252, 1000370, 1000371, 1000380

The following CVEs that were announced in Debian Security Advisory DSA-3981-1 apply to packages maintained and built by Cumulus Networks. They have been fixed in Cumulus Linux 3.5.0 (package version 4.1.33-1+cl3u10):

--------------------------------------------------------------------------
Debian Security Advisory DSA-3981-1 security@debian.org
https://www.debian.org/security/ Salvatore Bonaccorso
September 20, 2017 https://www.debian.org/security/faq
--------------------------------------------------------------------------

Package : linux
CVE ID : CVE-2017-7518 CVE-2017-7558 CVE-2017-10661 CVE-2017-11600
CVE-2017-12134 CVE-2017-12146 CVE-2017-12153 CVE-2017-12154
CVE-2017-14106 CVE-2017-14140 CVE-2017-14156 CVE-2017-14340
CVE-2017-14489 CVE-2017-14497 CVE-2017-1000111 CVE-2017-1000112
CVE-2017-1000251 CVE-2017-1000252 CVE-2017-1000370 CVE-2017-1000371
CVE-2017-1000380

Several vulnerabilities have been discovered in the Linux kernel that may lead to privilege escalation, denial of service or information leaks.

CVE-2017-7518
Andy Lutomirski discovered that KVM is prone to an incorrect debug exception (#DB) error occurring while emulating a syscall instruction. A process inside a guest can take advantage of this flaw for privilege escalation inside a guest.

CVE-2017-10661 (jessie only)
Dmitry Vyukov of Google reported that the timerfd facility does not properly handle certain concurrent operations on a single file descriptor. This allows a local attacker to cause a denial of service or potentially execute arbitrary code.

CVE-2017-11600
Bo Zhang reported that the xfrm subsystem does not properly validate one of the parameters to a netlink message. Local users with the CAP_NET_ADMIN capability can use this to cause a denial of service or potentially to execute arbitrary code.

CVE-2017-12134 / #866511 / XSA-229
Jan H. Schoenherr of Amazon discovered that when Linux is running in a Xen PV domain on an x86 system, it may incorrectly merge block I/O requests. A buggy or malicious guest may trigger this bug in dom0 or a PV driver domain, causing a denial of service or potentially execution of arbitrary code.

This issue can be mitigated by disabling merges on the underlying back-end block devices, e.g.:
echo 2 > /sys/block/nvme0n1/queue/nomerges

CVE-2017-12153
Bo Zhang reported that the cfg80211 (wifi) subsystem does not properly validate the parameters to a netlink message. Local users with the CAP_NET_ADMIN capability (in any user namespace with a wifi device) can use this to cause a denial of service.

CVE-2017-12154
Jim Mattson of Google reported that the KVM implementation for Intel x86 processors did not correctly handle certain nested hypervisor configurations. A malicious guest (or nested guest in a suitable L1 hypervisor) could use this for denial of service.

CVE-2017-14106
Andrey Konovalov discovered that a user-triggerable division by zero in the tcp_disconnect() function could result in local denial of service.

CVE-2017-14140
Otto Ebeling reported that the move_pages() system call performed insufficient validation of the UIDs of the calling and target processes, resulting in a partial ASLR bypass. This made it easier for local users to exploit vulnerabilities in programs installed with the set-UID permission bit set.

CVE-2017-14156
"sohu0106" reported an information leak in the atyfb video driver. A local user with access to a framebuffer device handled by this driver could use this to obtain sensitive information.

CVE-2017-14340
Richard Wareing discovered that the XFS implementation allows the creation of files with the "realtime" flag on a filesystem with no realtime device, which can result in a crash (oops). A local user with access to an XFS filesystem that does not have a realtime device can use this for denial of service.

CVE-2017-14489
ChunYu Wang of Red Hat discovered that the iSCSI subsystem does not properly validate the length of a netlink message, leading to memory corruption. A local user with permission to manage iSCSI devices can use this for denial of service or possibly to execute arbitrary code.

CVE-2017-14497 (stretch only)
Benjamin Poirier of SUSE reported that vnet headers are not properly handled within the tpacket_rcv() function in the raw packet (af_packet) feature. A local user with the CAP_NET_RAW capability can take advantage of this flaw to cause a denial of service (buffer overflow, and disk and memory corruption) or have other impact.

Cumulus Linux is not vulnerable. The vulnerable code is not present in the Cumulus Linux kernel.

CVE-2017-1000111
Andrey Konovalov of Google reported a race condition in the raw packet (af_packet) feature. Local users with the CAP_NET_RAW capability can use this for denial of service or possibly to execute arbitrary code.

CVE-2017-1000112
Andrey Konovalov of Google reported a race condition flaw in the UDP Fragmentation Offload (UFO) code. A local user can use this flaw for denial of service or possibly to execute arbitrary code.

CVE-2017-1000251 / #875881
Armis Labs discovered that the Bluetooth subsystem does not properly validate L2CAP configuration responses, leading to a stack buffer overflow. This is one of several vulnerabilities dubbed "Blueborne". A nearby attacker can use this to cause a denial of service or possibly to execute arbitrary code on a system with Bluetooth enabled.

CVE-2017-1000252 (stretch only)
Jan H. Schoenherr of Amazon reported that the KVM implementation for Intel x86 processors did not correctly validate interrupt injection requests. A local user with permission to use KVM could use this for denial of service.

Cumulus Linux does not enable KVM functionality, and therefore is not vulnerable.

CVE-2017-1000370
The Qualys Research Labs reported that a large argument or environment list can result in ASLR bypass for 32-bit PIE binaries.

CVE-2017-1000371
The Qualys Research Labs reported that a large argument or environment list can result in a stack/heap clash for 32-bit PIE binaries.

CVE-2017-1000380
Alexander Potapenko of Google reported a race condition in the ALSA (sound) timer driver, leading to an information leak. A local user with permission to access sound devices could use this to obtain sensitive information.

Debian disables unprivileged user namespaces by default, but if they are enabled (via the kernel.unprivileged_userns_clone sysctl) then CVE-2017-11600, CVE-2017-14497 and CVE-2017-1000111 can be exploited by any local user.

For the oldstable distribution (jessie), these problems have been fixed in version 3.16.43-2+deb8u5.


RN-769 (CM-18624)
Security: FRR and Quagga issue fixed in Cumulus Linux 3.5.0: DSA-4011-1 CVE-2017-16227

The following CVEs that were announced in Debian Security Advisory DSA-4011-1 apply to the FRRouting package and upstream Quagga package. They have been fixed in Cumulus Linux 3.5.0 (package version 3.1+cl3u1 and 3.1+cl3u3):

--------------------------------------------------------------------------
Debian Security Advisory DSA-4011-1 security@debian.org
https://www.debian.org/security/ Salvatore Bonaccorso
October 30, 2017 https://www.debian.org/security/faq
--------------------------------------------------------------------------

Package : quagga
CVE ID : CVE-2017-16227

It was discovered that the bgpd daemon in the Quagga routing suite does not properly calculate the length of multi-segment AS_PATH UPDATE messages, causing bgpd to drop a session and potentially resulting in loss of network connectivity.

For the oldstable distribution (jessie), this problem has been fixed in version 0.99.23.1-1+deb8u4. For the stable distribution (stretch), this problem has been fixed in version 1.1.1-3+deb9u1.


RN-770 (CM-18462)
Security: mysql issues fixed in Cumulus Linux 3.5.0: DSA-4002-1 CVE-2017-10268 CVE-2017-10378 CVE-2017-10379 CVE-2017-10384

The following security issues announced in DSA-4002-1 apply to Debian packages distributed as part of Cumulus Linux. They have been fixed in the Cumulus Linux 3.5.0 release (version 5.5.58-0+deb8u1 of the mysql package):

-------------------------------------------------------------------------
Debian Security Advisory DSA-4002-1 security@debian.org
https://www.debian.org/security/ Salvatore Bonaccorso
October 19, 2017 https://www.debian.org/security/faq
-------------------------------------------------------------------------

Package : mysql-5.5
CVE ID : CVE-2017-10379 CVE-2017-10378 CVE-2017-10268 CVE-2017-10384

Several issues have been discovered in the MySQL database server. The vulnerabilities are addressed by upgrading MySQL to the new upstream version 5.5.58, which includes additional changes, such as performance improvements, bug fixes, new features, and possibly incompatible changes. Please see the MySQL 5.5 Release Notes and Oracle's Critical Patch Update advisory for further details:

https://dev.mysql.com/doc/relnotes/mysql/5.5/en/news-5-5-58.html
http://www.oracle.com/technetwork/security-advisory/cpuoct2017-3236626.html

For the oldstable distribution (jessie), these problems have been fixed in version 5.5.58-0+deb8u1.


RN-771 (CM-18606)
Security: curl issue fixed in Cumulus Linux 3.5.0: DSA-4007-1 CVE-2017-1000257

The following security issues announced in DSA-4007-1 apply to Debian packages distributed as part of Cumulus Linux. They have been fixed in the Cumulus Linux 3.5.0 release (version 7.38.0-4+deb8u8 of the curl package).

--------------------------------------------------------------------------
Debian Security Advisory DSA-4007-1 security@debian.org
https://www.debian.org/security/ Alessandro Ghedini
October 27, 2017 https://www.debian.org/security/faq
---------------------------------------------------------------------------

Package : curl
CVE ID : CVE-2017-1000257

Brian Carpenter, Geeknik Labs and 0xd34db347 discovered that cURL, an URL transfer library, incorrectly parsed an IMAP FETCH response with size 0, leading to an out-of-bounds read.

For the oldstable distribution (jessie), this problem has been fixed in version 7.38.0-4+deb8u7.


RN-772 (CM-19011)
Security: libcurl issue fixed in Cumulus Linux 3.5.0: DSA-4051 CVE-2017-8816 CVE-2017-8817

The following security issues announced in DSA-4051-1 apply to Debian packages distributed as part of Cumulus Linux. They have been fixed in the Cumulus Linux 3.5.0 release (version 7.38.0-4+deb8u8 of the curl and libcurl3 packages).

--------------------------------------------------------------------------
Debian Security Advisory DSA-4051-1 security@debian.org
https://www.debian.org/security/ Yves-Alexis Perez
November 29, 2017 https://www.debian.org/security/faq
--------------------------------------------------------------------------

Package : curl
CVE ID : CVE-2017-8816 CVE-2017-8817

Two vulnerabilities were discovered in cURL, an URL transfer library.

CVE-2017-8816
Alex Nichols discovered a buffer overrun flaw in the NTLM authentication code which can be triggered on 32bit systems where an integer overflow might occur when calculating the size of a memory allocation.

CVE-2017-8817
Fuzzing by the OSS-Fuzz project led to the discovery of a read out of bounds flaw in the FTP wildcard function in libcurl. A malicious server could redirect a libcurl-based client to an URL using a wildcard pattern, triggering the out-of-bound read.

For the oldstable distribution (jessie), these problems have been fixed in version 7.38.0-4+deb8u8.


RN-773 (CM-18609)
Security: wget issue fixed in Cumulus Linux 3.5.0: DSA-4008-1 CVE-2017-13089 CVE-2017-13090

The following security issues announced in DSA-4008-1 apply to Debian packages distributed as part of Cumulus Linux. They have been fixed in the Cumulus Linux 3.5.0 release (version 1.16-1+deb8u4 of the wget package).

--------------------------------------------------------------------------
Debian Security Advisory DSA-4008-1 security@debian.org
https://www.debian.org/security/ Moritz Muehlenhoff
October 28, 2017 https://www.debian.org/security/faq
--------------------------------------------------------------------------

Package : wget
CVE ID : CVE-2017-13089 CVE-2017-13090

Antti Levomaeki, Christian Jalio, Joonas Pihlaja and Juhani Eronen discovered two buffer overflows in the HTTP protocol handler of the Wget download tool, which could result in the execution of arbitrary code when connecting to a malicious HTTP server.

For the oldstable distribution (jessie), these problems have been fixed in version 1.16-1+deb8u4.


RN-774 (CM-18676)
Security: openssl issue fixed in Cumulus Linux 3.5.0: DSA-4017-1 CVE-2017-3735 CVE-2017-3736

The following security issues announced in DSA-4017-1 apply to Debian packages distributed as part of Cumulus Linux. They have been fixed in the Cumulus Linux 3.5.0 release (version 1.0.1t-1+deb8u7 of the openssl package).

--------------------------------------------------------------------------
Debian Security Advisory DSA-4017-1 security@debian.org
https://www.debian.org/security/ Salvatore Bonaccorso
November 03, 2017 https://www.debian.org/security/faq
--------------------------------------------------------------------------

Package : openssl1.0
CVE ID : CVE-2017-3735 CVE-2017-3736

Multiple vulnerabilities have been discovered in OpenSSL, a Secure Sockets Layer toolkit. The Common Vulnerabilities and Exposures project identifies the following issues:

CVE-2017-3735
It was discovered that OpenSSL is prone to a one-byte buffer overread while parsing a malformed IPAddressFamily extension in an X.509 certificate.

Details can be found in the upstream advisory: https://www.openssl.org/news/secadv/20170828.txt

CVE-2017-3736
It was discovered that OpenSSL contains a carry propagation bug in the x86_64 Montgomery squaring procedure.

Details can be found in the upstream advisory: https://www.openssl.org/news/secadv/20171102.txt


RN-775 (CM-18752)
Security: postgresql-common issue fixed in Cumulus Linux 3.5.0: DSA-4029-1 CVE-2017-8806

The following security issues announced in DSA-4029-1 apply to Debian packages distributed as part of Cumulus Linux. They have been fixed in the Cumulus Linux 3.5.0 release (version 165+deb8u3 of the postgresql-common package).

--------------------------------------------------------------------------
Debian Security Advisory DSA-4029-1 security@debian.org
https://www.debian.org/security/ Moritz Muehlenhoff
November 09, 2017 https://www.debian.org/security/faq
--------------------------------------------------------------------------

Package : postgresql-common
CVE ID : CVE-2017-8806

It was discovered that the pg_ctlcluster, pg_createcluster and pg_upgradecluster commands handled symbolic links insecurely which could result in local denial of service by overwriting arbitrary files.

For the oldstable distribution (jessie), this problem has been fixed in version 165+deb8u3.


RN-776 (CM-18763)
Security: postgresql issue fixed in Cumulus Linux 3.5.0: DSA-4027-1 CVE-2017-15098

The following security issues announced in DSA-4027-1 apply to Debian packages distributed as part of Cumulus Linux. They have been fixed in the Cumulus Linux 3.5.0 release (version 9.4.15-0+deb8u1 of the postgresql-9.4 package).

--------------------------------------------------------------------------
Debian Security Advisory DSA-4027-1 security@debian.org
https://www.debian.org/security/ Moritz Muehlenhoff
November 09, 2017 https://www.debian.org/security/faq
--------------------------------------------------------------------------

Package : postgresql-9.4
CVE ID : CVE-2017-15098

A vulnerability has been found in the PostgreSQL database system: Denial of service and potential memory disclosure in the json_populate_recordset() and jsonb_populate_recordset() functions.

For the oldstable distribution (jessie), this problem has been fixed in version 9.4.15-0+deb8u1.


RN-777 (CM-18907)
Security: libxml-libxml-perl issue fixed in Cumulus Linux 3.5.0: DSA-4042 CVE-2017-10672 

The following security issues announced in DSA-4042-1 apply to Debian packages distributed as part of Cumulus Linux. They have been fixed in the Cumulus Linux 3.5.0 release (version 2.0116+dfsg-1+deb8u2 of the libxml-libxml-perl package).

--------------------------------------------------------------------------
Debian Security Advisory DSA-4042-1 security@debian.org
https://www.debian.org/security/ Salvatore Bonaccorso
November 19, 2017 https://www.debian.org/security/faq
--------------------------------------------------------------------------

Package : libxml-libxml-perl
CVE ID : CVE-2017-10672

A use-after-free vulnerability was discovered in XML::LibXML, a Perl interface to the libxml2 library, allowing an attacker to execute arbitrary code by controlling the arguments to a replaceChild() call.

For the oldstable distribution (jessie), this problem has been fixed in version 2.0116+dfsg-1+deb8u2.


RN-779 (CM-19181)
Active cables (10G fiber, 1G fiber, sometimes 1G RJ45) not working on Dell S4148F-ON and S4128F-ON

On Dell S4148F-ON and S4128F-ON switches, the following cables do not work on SFP ports:

  • 10G optical modules (10G SR, LR, AOC)
  • 1G optical modules (1G SX, LX, AOC)
  • 1G copper RJ45 modules might fail, depending on how the tx_enable signal is used

This issue is fixed in Cumulus Linux 3.5.0.

There is an additional issue that prevents 1G interfaces from working on Dell 4148F and 4128F switches. For further details and the workaround for 1G SFP ports on Dell 4148F and 4128F switches, see RN-778.


RN-840 (CM-18641)
The Mellanox kernel driver does not handle bond sample packets correctly

A Mellanox switch cannot create and export flow samples when the sampled traffic flow is both ingress and egress on a bond interface. This affects topologies where bonded hosts are transiting the switch to a bonded uplink.

This issue is fixed in Cumulus Linux 3.5.0.

New Known Issues in Cumulus Linux 3.5.0

The following issues are new to Cumulus Linux and affect Cumulus Linux 3.5.0.

Release Note ID Summary Description

RN-691 (CM-18647)
Configuring a DHCP relay on a VRR interface with NCLU causes errors

When configuring a DHCP relay on a VRR interface using the NCLU commands, errors are seen when running sudo ifreload -a and net commit.

cumulus@switch:~$ sudo ifreload -a
error: 'scope'

To work around this issue, edit the /etc/network/interfaces file and remove any vlanX-v0 stanzas, then run sudo ifreload -a again.
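
The workaround above can be sketched as a text filter. This is a minimal sketch, not a Cumulus-provided tool: strip_vrr_stanzas is a hypothetical helper name, and it assumes that every stanza begins with an auto or iface line at the start of a line and that the stanzas to remove are named vlan<ID>-v0. Anything between a matching auto/iface line and the next stanza header (including comments and blank lines) is dropped as well.

```shell
#!/bin/sh
# Hypothetical filter: copy an interfaces file from stdin to stdout, dropping
# every stanza whose interface name looks like vlan<ID>-v0.
# Assumption: each stanza starts with "auto NAME" or "iface NAME" in column 0.
strip_vrr_stanzas() {
    awk '
        # A new stanza header decides whether the following lines are skipped.
        /^(auto|iface)[ \t]/ { skip = ($2 ~ /^vlan[0-9]+-v0$/) }
        !skip { print }
    '
}
```

One possible use is to write the result to a scratch file, review it, and only then replace the original, for example: strip_vrr_stanzas < /etc/network/interfaces > /tmp/interfaces.new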

This issue is fixed in Cumulus Linux 3.5.1.


RN-695 (CM-19156)
When adding an OSPF passive interface to a VRF that does not exist, ospfd crashes

Adding an OSPF passive interface to an OSPF VRF that has not been defined previously causes ospfd to crash.

To work around this issue, create the VRF before you add the passive OSPF interface.

This issue is fixed in Cumulus Linux 3.5.1.


RN-704 (CM-18886, CM-20027)
ifreload causes MTU to drop on bridge SVIs

When you run the ifreload command on a bridge SVI with an MTU higher than 1500, the MTU gets reset to 1500 after the initial ifreload -a, then resets to its original value when running ifreload -a a second time.
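
To see whether an SVI has been affected, you can compare its current MTU with the configured value. The following is a minimal sketch: mtu_matches is a hypothetical helper name, and the optional third argument exists only so the check can be pointed at a different directory for testing. It reads the standard Linux sysfs attribute /sys/class/net/<interface>/mtu.

```shell
#!/bin/sh
# Hypothetical check: succeed only if the interface's current MTU equals the
# expected value.
# $1 = interface name, $2 = expected MTU,
# $3 = sysfs root (optional, defaults to /sys/class/net)
mtu_matches() {
    current=$(cat "${3:-/sys/class/net}/$1/mtu") || return 2
    [ "$current" -eq "$2" ]
}
```

For example, mtu_matches vlan100 9000 || echo "MTU was reset" reports a reset on an SVI named vlan100 (an illustrative name) that is configured with an MTU of 9000.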

This is a known issue and should be fixed in the next release of Cumulus Linux.


RN-732 (CM-16550)
With management VRF, the `net show time ntp servers` command shows empty output

With management VRF, the output of the NCLU command net show time ntp servers is empty.

This issue is fixed in Cumulus Linux 3.5.1.


RN-738 (CM-18709)
On Dell S4148T-ON switches with Maverick ASICs, configuring 1G or 100M speeds on 10G fixed copper ports requires a ports.conf workaround

1G and 100M speeds on SFP ports are not working on the Dell S4148T-ON.

To enable a speed lower than 10G on a port on the S4148T platform, you must dedicate an entire port group (four interfaces) to a lower speed setting. Within a port group, you can mix 1G and 100M speeds, if needed. You cannot mix 10G and lower speeds.

To work around this issue:

  1. In the /etc/cumulus/ports.conf file, add each of the four ports in the port group as 1G interfaces. You must set each of the ports in the port group to be 1G. Port groups are swp1-4, swp5-8, swp9-12, and so on, and starting with swp31-35 on the right half of the switch. For example, to enable ports swp5-swp8 to autonegotiate to 100M or 1G speeds, add the following to the ports.conf file:
    5=1G
    6=1G
    7=1G
    8=1G
  2. Restart switchd:
    cumulus@switch:~$ sudo systemctl reset-failed switchd; sudo systemctl restart switchd

    After switchd restarts, ports swp5-8 can autonegotiate with neighboring devices to 1G or 100M speeds.
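
The four ports.conf entries for a port group can also be generated rather than typed by hand. This is a minimal sketch: port_group_lines is a hypothetical helper name, and you must pass the first port of a valid port group yourself, because the script does not know the platform's port group layout.

```shell
#!/bin/sh
# Hypothetical helper: print ports.conf entries for one four-port group.
# $1 = first port number of the group (for example, 5 for swp5-8)
# $2 = speed to assign (optional, defaults to 1G)
port_group_lines() {
    first=$1
    speed=${2:-1G}
    i=0
    while [ "$i" -lt 4 ]; do
        echo "$(( first + i ))=$speed"
        i=$(( i + 1 ))
    done
}

# Prints the entries 5=1G through 8=1G, one per line.
port_group_lines 5
```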

As of 3.5.1, 1G interfaces are supported when using the ports.conf file workaround as described above. As of 3.6.0, editing the ports.conf file is no longer required.


RN-742 (CM-18742)
VRR interface MAC is missing in fdb table causing duplicate packets

When running VXLAN routing, the VRR interface MAC address is missing in the fdb table (permanent entry), which causes duplicate packets.

If you encounter this issue, update the interface configuration on all gateway VTEPs using the ifreload -a -X eth0 command.


RN-744 (CM-18986)
Unable to modify BGP ASN for a VRF associated with layer 3 VNI

After editing the frr.conf file to modify the BGP ASN for a VRF associated with a layer 3 VNI, the change is not applied.

To work around this issue, first delete the layer 3 VNI, then try to modify the BGP VRF instance.


RN-745 (CM-19033)
On Dell Z9100, 4x10G breakout not working, shows as 25G

On a Dell Z9100, the 4x10G breakout ports do not work and the speed remains at the default of 25G.

To work around this issue, change the interface speed for each port in the /etc/cumulus/ports.conf file to 10G, then restart switchd.


RN-748 (CM-19202)
The `link autoneg off` setting not applied to the last set of interfaces in a list if OFF already set on one of the interfaces

Using NCLU to assign the link autoneg off setting to a list of interfaces fails to complete the list if one of the interfaces in the list already has the link autoneg off setting.

This issue is fixed in Cumulus Linux 3.5.1.


RN-756 (CM-19134)
Out of memory issues when running net show bgp ipv4 unicast json

When you run the net show bgp ipv4 unicast json command on a large configuration (for example, 64K routes from each of a dozen or more peers), an out of memory issue occurs.

Cumulus Networks is currently working to fix this issue. Avoid running this command on large configurations. Instead, you can run the command vtysh -c 'show bgp ipv4 unicast json'.


RN-765 (CM-19133, 19139)
On Mellanox switches, unable to assign VLANs 1992 and above 

On Mellanox switches, you can have a maximum of 2048 VLANs, less the number of physical interfaces. Typically, you have no more than 48 physical interfaces, which leaves 2000 VLANs available. If you create more than 48 interfaces, there are only 1991 VLANs available.
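
The typical figure quoted above is a simple subtraction. As a quick check (the variable names are illustrative only):

```shell
#!/bin/sh
max_vlans=2048          # Mellanox limit stated above
physical_interfaces=48  # typical interface count stated above

# Each physical interface consumes one entry from the VLAN pool.
available_vlans=$(( max_vlans - physical_interfaces ))
echo "$available_vlans"   # prints 2000
```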

This issue is fixed in Cumulus Linux 3.5.1.


RN-766 (CM-19006)
On the Broadcom Trident II+ and Maverick platform, in an external VXLAN routing environment, the switch doesn't rewrite MAC addresses and TTL, so packets are dropped by the next hop

On the Broadcom Trident II+ and Maverick based switch, in an external VXLAN routing environment, when a lookup is done on the external-facing switch (exit/border leaf) after VXLAN decapsulation, the switch doesn't rewrite the MAC addresses and TTL. So for through traffic, packets are dropped by the next hop instead of correctly routing from a VXLAN overlay network into a non-VXLAN external network (for example, to the Internet).

This issue affects all traffic from VXLAN overlay hosts that need to be routed after VXLAN decapsulation on an exit/border leaf, including:

  • Traffic destined to external networks (through traffic)
  • Traffic destined to the exit leaf SVI address

This issue should be fixed in the Trident III ASIC.

To work around this issue, modify the external-facing interface for each VLAN sub-interface by creating a temporary VNI and associating it with the existing VLAN ID.

For example, if the expected interface configuration is:

auto swp3.2001
iface swp3.2001
    vrf vrf1
    address 45.0.0.2/24
# where swp3 is the external facing port and swp3.2001 is the VLAN sub-interface

auto bridge
iface bridge
    bridge-vlan-aware yes
    bridge-ports vx-4001
    bridge-vids 4001

auto vx-4001
iface vx-4001
    vxlan-id 4001
    <... usual vxlan config ...>
    bridge-access 4001
# where vnid 4001 represents the L3 VNI

auto vlan4001
iface vlan4001
    vlan-id 4001
    vlan-raw-device bridge
    vrf vrf1

You would modify the configuration as follows:

auto swp3
iface swp3
    bridge-access 2001
# associate the port (swp3) with bridge 2001

auto bridge
iface bridge
    bridge-vlan-aware yes
    bridge-ports swp3 vx-4001 vx-16000000
    bridge-vids 4001 2001
# where vx-4001 is the existing VNI and vx-16000000 is a new temporary VNI
# this is now bridging the port (swp3), the VNI (vx-4001),
# and the new temporary VNI (vx-16000000)
# the bridge VLAN IDs are now 4001 and 2001

auto vlan2001
iface vlan2001
    vlan-id 2001
    vrf vrf1
    address 45.0.0.2/24
    vlan-raw-device bridge
# create a VLAN 2001 with the associated VRF and IP address

auto vx-16000000
iface vx-16000000
    vxlan-id 16000000
    bridge-access 2001
    <... usual vxlan config ...>
# associate the temporary VNI (vx-16000000) with bridge 2001

auto vx-4001
iface vx-4001
    vxlan-id 4001
    <... usual vxlan config ...>
    bridge-access 4001
# where vnid 4001 represents the L3 VNI

auto vlan4001
iface vlan4001
    vlan-id 4001
    vlan-raw-device bridge
    vrf vrf1

RN-780 (CM-19193)
DC power supply on Edgecore AS5812 and AS5712 causes switch to be inoperable

An issue with the DC power supply on the Edgecore AS5812 and AS5712 models causes the switch to be inoperable.

This issue is fixed in Cumulus Linux 3.5.1.


RN-781 (CM-19067)
VXLAN symmetric routing: Packets are CPU forwarded after switchd restarts

When VXLAN symmetric routing is enabled, sometimes packets get forwarded to the CPU after switchd is restarted.

To work around this issue, restart the networking service:

cumulus@switch:~$ sudo systemctl restart networking

This issue is fixed in Cumulus Linux 3.5.1.

Previously Known Issues in Cumulus Linux 3.5.0

The following issues also affect the current release.

Release Note ID Summary Description

RN-52 (CM-997, CM-1013)
Parameters like the router ID and DR priority cannot be changed while OSPFv2/v3 is running

Router ID and DR priority can only be changed by shutting down OSPFv2/v3, changing the ID, and restarting the OSPF process.

A change to the DR priority may not properly be reflected in the LSAs that are still aging out.

RN-56 (CM-343)
IPv4/IPv6 forwarding disabled mode not recognized

If either of the following is configured:

net.ipv4.ip_forward == 0

or:

net.ipv6.conf.all.forwarding == 0

The hardware still forwards packets if there is a neighbor table entry pointing to the destination.


RN-77 (CM-265)
New routes/ECMPs can evict existing/installed

Cumulus Linux syncs routes between the kernel and the switching silicon. If the required resource pools in hardware fill up, new kernel routes can cause existing routes to move from being fully allocated to being partially allocated.

To avoid this, routes in the hardware should be monitored and kept below the ASIC limits.

For example, on systems with Trident+ chips, the limits are as follows:
routes: 16384 <<<< if all routes are ipv4 
 long mask routes 256 <<<< i.e., routes with a mask longer 
       than the route mask limit 
 route mask limit 64
 host_routes: 8192 
 ecmp_nhs: 4044 
 ecmp_nhs_per_route: 52
That translates to about 77 routes with ECMP NHs, if every route has the maximum ECMP NHs.
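
The figure of 77 comes from dividing the ecmp_nhs limit by the ecmp_nhs_per_route limit listed above and rounding down. A minimal shell sketch of that arithmetic follows; the function name is hypothetical and is not part of Cumulus Linux:

```shell
# Hypothetical helper: how many routes fit in hardware if every route
# uses the maximum number of ECMP next hops?
max_full_ecmp_routes() {
    local ecmp_nhs=$1 nhs_per_route=$2
    echo $(( ecmp_nhs / nhs_per_route ))   # integer division rounds down
}

# Trident+ limits quoted above: 4044 ECMP next hops, 52 per route.
max_full_ecmp_routes 4044 52   # prints 77
```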

Monitoring this in Cumulus Linux is performed via the cl-resource-query command:
cumulus@switch:~$ sudo cl-resource-query
 hosts : 3 
 all routes : 29 
 IP4 routes : 17 
 IP6 routes : 12 
 nexthops : 3 
 ecmp_groups : 0
 ecmp_nexthops : 0
 mac entries : 0 / 131072 
 bpdu entries : 500 / 512
The resource to monitor is the ecmp_nexthops. If this count is close to 4044, new ECMPs may evict existing routes.
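
One way to automate this check is to parse the ecmp_nexthops line of the cl-resource-query output and compare it with the limit. The sketch below assumes the "name : value" output layout shown above; the function name and the 90 percent warning threshold are illustrative assumptions, not Cumulus Linux defaults:

```shell
# Hypothetical helper: reads cl-resource-query output on stdin and reports
# whether ecmp_nexthops has reached a chosen percentage of the ASIC limit.
# Arguments: limit (default 4044, the Trident+ value) and warning percentage
# (default 90, an assumed threshold).
check_ecmp_nexthops() {
    local limit=${1:-4044} warn_pct=${2:-90}
    awk -F: -v limit="$limit" -v pct="$warn_pct" '
        $1 ~ /^[ \t]*ecmp_nexthops[ \t]*$/ {
            used = $2 + 0          # numeric prefix of the value field
            found = 1
            if (used * 100 >= limit * pct) {
                printf "WARNING: ecmp_nexthops %d of %d\n", used, limit
                status = 1
            } else {
                printf "OK: ecmp_nexthops %d of %d\n", used, limit
            }
        }
        END {
            if (!found) { print "ecmp_nexthops not found"; exit 2 }
            exit status
        }
    '
}
```

On a switch, this could be used as: sudo cl-resource-query | check_ecmp_nexthops 4044 90. The function returns 0 when usage is below the threshold, 1 when it is at or above it, and 2 when the counter is missing from the input.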

RN-198 (CM-3290)
Port LEDs behave differently on different switch models

It's been observed that port LEDs behave differently depending upon the make and model of the switch. For example:

  • Agema AG-7448CU: the LED is off when the link is up. It blinks on briefly when there is traffic.
  • Edge-Core AS4600-54T: the LED is off when the link is up. It blinks on briefly when there is traffic.
  • QuantaMesh T3048-LY2R: the LED is on when the link is up. It blinks off briefly when there is traffic.

Cumulus Networks is currently working to fix this issue.


RN-199 (CM-2624)
When a Quagga route-map is modified, the switch could use the partial map before edits are completed

Cumulus Linux triggers a route-map update before the user finishes editing the route map, resulting in an incorrect route map being used. The route-map update trigger should only occur when user finishes editing the map.

Cumulus Networks is working to fix this issue.


RN-221 (CM-3926, CM-4501)
BGP graceful restart, including helper mode, not fully supported

If you encounter issues with this, please submit a support request and include the output from cl-support with your ticket.

RN-327 (CM-4290)
Changing the route-map parameter of the redistribute command in OSPF and BGP doesn't affect the state of the resulting redistribution in those protocols

To work around this issue, remove any old redistribute command configurations before adding a new one with or without route-map as a parameter.

For example, if OSPF has a redistribute configuration such as redistribute bgp route-map redist-map-name, you would enable redistribution without a route-map by following these steps in OSPF configuration mode:

  1. no redistribute bgp
  2. redistribute bgp

You would perform a similar sequence of commands for redistribution changes in BGP as well.


RN-382 (CM-6692)
Quagga: Removing bridge via ifupdown2 does not remove it from Quagga

Removing a bridge using ifupdown2 does not remove it from the Quagga configuration files. This issue is being investigated; however, restarting Quagga will successfully remove the bridge.

RN-384 (CM-7684)
Keeping VXLAN single-connected devices up on MLAG secondary node

In the current MLAG secondary design, if the VXLAN device is not dual-connected, it is kept in a protodown state. You can keep them up with individual IP addresses rather than anycast IPs when the peerlink is down, so that all single-connected hosts will have connectivity. Further investigation regarding this issue is underway.

RN-389 (CM-8410)
switchd supports only port 4789 as the UDP port for VXLAN packets

switchd currently allows only the standard port 4789 as the UDP port for VXLAN packets. There are cases where a hypervisor could be using a non-standard UDP port, which would cause VXLAN exchanges with the hardware VTEP to not work. In such a case, packets would not be terminated and encapsulated packets would be sent out on UDP port 4789.


RN-404 (CM-4407)
Aggregating routes in BGP with as-set can result in high CPU usage

When BGP is configured with aggregate addresses with as-set configuration and there are many routes to be aggregated, the BGP process gets into high CPU usage.

To work around this issue, do not specify the as-set parameter for the aggregate-address configuration.


RN-406 (CM-9895)
Mellanox SN2700 power off issues

On the Mellanox SN2700 and SN2700B switches, if either of the following occurs:

  • A shutdown or poweroff command is executed
  • A temperature sensor hits a critical value and shuts down the box

then, once a PDU power cycle is issued, the box appears to be dead for at least 3 minutes.

This issue is currently being investigated.


RN-409 (CM-10054)
BGP may show an inaccessible path as the best path

Existing BGP issues caused peering between a VRF device and a loopback BGP session to stay up if the loopback session doesn’t advertise its local address.

This issue will be fixed in a future release.


RN-526 (CM-13037)
Upgrading the clag package fails when logrotate contains an invalid date

While upgrading from Cumulus Linux 3.0.1 to 3.1.z, the clag package does not update if the logrotate file contains an invalid date. This can occur due to bad batteries in the switch or RTC clock chip issues. The error may look like the following:

error: bad year 1929 for file /var/log/boot.log in state file /var/lib/logrotate/status
dpkg: error processing package clag (--configure):
 subprocess installed post-installation script returned error exit status 1
Errors were encountered while processing:
 clag
E: Sub-process /usr/bin/dpkg returned an error code (1)

You can work around this issue by removing the /var/lib/logrotate/status file, and forcing a logrotate. After you do this, the upgrade should be successful.
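
The workaround could be scripted roughly as follows. This is a sketch only: the function name is hypothetical, and /etc/logrotate.conf is assumed to be the main logrotate configuration file (the usual Debian location); only the state file path comes from the error message above.

```shell
# Hypothetical helper: remove the logrotate state file that holds the
# invalid date, then force a rotation so that a fresh state file is written.
reset_logrotate_state() {
    local state=${1:-/var/lib/logrotate/status}
    local conf=${2:-/etc/logrotate.conf}    # assumed config location
    sudo rm -f "$state"
    sudo logrotate -f "$conf"               # -f forces rotation
}
```

After calling reset_logrotate_state with no arguments, retry the package upgrade.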


RN-537 (CM-12967)
Pause frames sent by a Tomahawk switch are not honored by the upstream switch

An issue exists when link pause or priority flow control (PFC) is enabled on a Broadcom Tomahawk-based switch, and there is over-subscription on a link, where the ASIC sends pause frames aggressively, causing the upstream switch not to throttle enough.

If you need link pause or PFC functionality, then to work around this issue, you must use a switch that does not use the Tomahawk ASIC.


RN-540 (CM-13428)
Quagga reload fills FIB after renaming the table map

Trying to rename a table map — essentially deleting the current table map and adding a new one — causes Quagga to try to install all the routes again, which fills the FIB. The table map begins functioning again on its own after some time and without any intervention, and the FIB usage returns to a normal level.

This issue is currently being investigated.


RN-545 (CM-13800)
OSPFv3 redistribute connected with route-map broken at reboot (or ospf6d start)

This issue only affects OSPFv3 (IPv6) and is being investigated at this time.

RN-548 (CM-14061)
ethtool --show-fec does not reflect correct state

The ethtool --show-fec command does not accurately report the current link state. The cause of this issue is being investigated.

RN-602 (CM-15094)
sFlow interface speed incorrect in counter samples

Counter samples exported from the switch show an incorrect interface speed.

This issue is being investigated at this time.


RN-603 (CM-16012, CM-15739)
Broadcom Tomahawk ASIC doesn't support a mix of 10G and 25G speeds within the same warp core

On a switch with a Broadcom Tomahawk ASIC, you cannot mix 10G and 25G speeds within the same warp core (a warp core is a cluster of 4 sequential switch ports, such as swp1 through swp4 or swp29 through swp32).

On a 100G switch, each of the 4 physical ports in the core can be broken out into 4x10G or each of the 4 ports can be broken out into 4x25G speeds; you cannot have a mix of speeds.

On a 25G switch, each port in the warp core must be configured for the same speed. You can configure all ports in the warp core for 10G speed, but you cannot have a mix of 25G and 10G speeds.

These restrictions apply to SFP modules; for QSFP modules, the restriction applies if you want to break out one of the ports to 4x10G or 4x25G.

If you want to modify the speed of the ports in a warp core, you must hand edit the /etc/cumulus/ports.conf file. You cannot use NCLU to configure the port speeds on a Tomahawk switch.

This issue is being investigated at this time.
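
Before restarting switchd with a hand-edited ports.conf, the file can be checked for warp cores that mix 10G and 25G speeds. The sketch below assumes that each port appears as a "number=speed" line (for example 1=4x10G or 29=25G) and that warp cores are the groups of four sequential ports described above; the function name is hypothetical.

```shell
# Hypothetical helper: list warp cores (groups of 4 sequential ports) that
# contain both 10G and 25G settings. Reads ports.conf-style "N=SPEED" lines
# from the named file(s), or from stdin when no file is given.
find_mixed_warp_cores() {
    awk -F= '
        BEGIN { max = 0 }
        /^[0-9]+=/ {
            core = int(($1 - 1) / 4)        # swp1-4 -> 0, swp5-8 -> 1, ...
            speed = $2
            sub(/[ \t\r]+$/, "", speed)     # drop trailing whitespace
            sub(/^[0-9]+x/, "", speed)      # 4x10G -> 10G, 4x25G -> 25G
            if (speed == "10G") has10[core] = 1
            if (speed == "25G") has25[core] = 1
            if (core > max) max = core
        }
        END {
            for (c = 0; c <= max; c++)
                if (has10[c] && has25[c])
                    printf "swp%d-swp%d\n", c * 4 + 1, c * 4 + 4
        }
    ' "$@"
}
```

Running find_mixed_warp_cores /etc/cumulus/ports.conf prints one line per offending warp core, such as swp1-swp4, and prints nothing when no warp core mixes the two speeds.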


RN-604 (CM-15959)
ARP suppression does not work well with VXLAN active-active mode

In some instances, ARP requests do not get suppressed (when they ought to be) in a VXLAN active-active scenario, but instead get flooded over VXLAN tunnels. This issue is caused because there is no control plane syncing the snooped local neighbor entries between the MLAG pair; MLAG does not perform this sync, and neither does EVPN.

This issue is being investigated at this time.


RN-606 (CM-6366)
BGP: MD5 password is not enforced for dynamic neighbors

It was determined that the MD5 password configured against a BGP listen-range peer-group (used to accept and create dynamic BGP neighbors) is not enforced. This means that connections are accepted from peers that don't specify a password; and only if they don't.

This issue is being investigated.


RN-608 (CM-16145)
Buffer monitoring default port group discards_pg only accepts packet collection type

The default port group discards_pg does not accept packet_extended or packet_all collection types.

This issue is being investigated at this time.


RN-640 (CM-16461)
Cumulus VX OVA image for VMware reboots due to critical readings from sensors

It has been verified that, after booting a Cumulus VX virtual machine running the VMware OVA image, sometimes messages from sensors appear, saying the "Avg state" is critical, with all values displayed as 100.0; then it generates a cl-support.

This issue is being investigated at this time.


RN-656 (CM-17617)
The switchd heartbeat fails on Tomahawk switches with VXLAN scale configuration (512 VXLAN interfaces)

When a Tomahawk switch has 512 VXLAN interfaces configured, the switchd heartbeat fails. This can cause switchd to dump core.

To work around this issue, disable VXLAN statistics in switchd. Edit /etc/cumulus/switchd.conf and comment out the following line:

cumulus@switch:~$ sudo nano /etc/cumulus/switchd.conf

...

#stats.vxlan.member = BRIEF

...

Then restart switchd for the change to take effect. This causes all network ports to reset in addition to resetting the switch hardware configuration.

cumulus@switch:~$ sudo systemctl restart switchd.service
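
As an alternative to editing the file in nano, the line can be commented out non-interactively. The following is a sketch under the assumption that the setting appears once, uncommented, at the start of a line; the function name is hypothetical.

```shell
# Hypothetical helper: comment out the stats.vxlan.member setting.
# A backup of the original file is kept with a .bak suffix.
disable_vxlan_stats() {
    local conf=${1:-/etc/cumulus/switchd.conf}
    # Prefix the active setting with "#"; lines that are already
    # commented out do not match and are left unchanged.
    sed -i.bak 's/^[[:space:]]*stats\.vxlan\.member[[:space:]]*=/#&/' "$conf"
}
```

Writing to /etc/cumulus/switchd.conf requires root privileges, so call the function from a root shell and then restart switchd as described above.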
 

RN-671 (CM-17062)
On a Wedge-100 switch, you cannot change advertisement settings on eth0

On a Facebook Wedge-100 switch, you cannot change the advertisement settings on eth0 and still have a link. eth0 only operates at 1000BASE-T. This is a hardware limitation and there is no workaround.


RN-743 (CM-18612)
Routes learned through BGP unnumbered become unusable

In certain scenarios, the routes learned through BGP unnumbered become unusable. The BGP neighbor relationships remain but the routes cannot be forwarded due to a failure in layer 2 and layer 3 next hop/MAC address resolution.

To work around this issue, restart FRR.


RN-750 (CM-17457)
On Maverick switches, multicast traffic limited by lowest speed port in the group

The Maverick switch limits multicast traffic by the lowest speed port that has joined a particular group.

This issue is being investigated at this time.


RN-751 (CM-17157)
Pull source-node replication schema patch from upstream

The upstream OVSDB VTEP schema has been updated multiple times and now contains a patch to support source-node replication. This patch is not included with the latest version of Cumulus Linux.

Cumulus Networks is currently working to fix this issue.


RN-753 (CM-18170)
MLAG neighbor entries deleted on link down, but ARP table out of sync when bond comes back up and system MAC address changed

The MLAG neighbor entries are deleted when the switch goes down; however, the ARP table is out of sync when the bond comes back up and the system MAC address is changed.

To work around this issue, ping the SVI address of the MLAG switch or issue an arping command to the host from the broken switch.


RN-754 (CM-15812)
Multicast forwarding fails for IP addresses whose DMAC overlaps with reserved DIPs

Multicast forwarding fails for IP addresses whose DMAC overlaps with reserved DIPs.

This issue is being investigated at this time.


RN-755 (CM-16855)
Auto-negotiation ON sometimes results in NO-CARRIER

If the two nodes on either side of a link both change from auto-negotiation off to auto-negotiation on within a short interval (around one second), the link might start flapping or stay down.

To work around this issue and stop the flapping, turn the link down on the switch with the command ifdown swpX, wait a few seconds, then bring the link back up with the command ifup swpX. Repeat this on the other side if necessary.
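
The down, wait, up sequence can be wrapped in a small function. This is only a sketch: the function name is hypothetical, and the 5-second default is an assumed value for "a few seconds".

```shell
# Hypothetical helper: bring a switch port down, pause, and bring it back up.
flap_link() {
    local port=$1 wait_secs=${2:-5}   # 5 seconds is an assumed default
    sudo ifdown "$port"
    sleep "$wait_secs"
    sudo ifup "$port"
}
```

For example, flap_link swp1 flaps swp1 with the default pause, and flap_link swp1 10 waits 10 seconds instead.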


RN-757 (CM-18537)
On Mellanox switches, congestion drops not counted

On the Mellanox switch, packet drops due to congestion are not counted.

To work around this issue, run the command sudo ethtool -S swp1 to collect interface traffic statistics.


RN-758 (CM-17557)
If sFlow is enabled, some sampled packets (such as multicast) are forwarded twice

When sFlow is enabled, some sampled packets, such as IPMC, are forwarded twice (in the ASIC and then again through the kernel networking stack).

This issue is being investigated at this time.


RN-759 (CM-18401)
The output for the NCLU net show config command is incorrect

The output for the NCLU net show config command is incorrect.

This issue is being investigated at this time.


RN-760 (CM-18682)
smonctl utility JSON parsing error

There is a parsing error with the smonctl utility. In some cases when JSON output is chosen, the smonctl utility crashes. The JSON output is necessary to make the information available through SNMP.

This issue should be fixed in the next release of Cumulus Linux.


RN-761 (CM-19081)
The vxlan-ageing and bridge-ageing timers do not align

For static tunnels, the vxlan-ageing command defaults to 300 seconds, which does not align with the bridge-ageing timer of 1800 seconds. As a result, silent hosts learned over VXLAN static tunnels time out after five minutes, which causes the VTEPs to flood all traffic.

To work around this issue, configure the vxlan-ageing timer to align with the bridge-ageing timer.


RN-762 (CM-15677)
SBUS error warnings on Tomahawk switches

SBUS error warnings display on Tomahawk switches.

This issue is being investigated at this time.


RN-763 (CM-16139)
OSPFv3 does not handle ECMP properly

IPv6 ECMP does not work as expected in OSPFv3.

This issue is being investigated at this time.


RN-764 (CM-17434)
On Broadcom switches, all IP multicast traffic uses only queue 0 

On Broadcom switches, IPv4 and IPv6 multicast traffic always maps into queue 0.

This issue is being investigated at this time.


RN-778 (CM-19203)
On Dell S4148F-ON and S4128F-ON switches with Maverick ASICs, configuring 1G or 100M speeds requires a ports.conf workaround

1G and 100M speeds on SFP ports do not work automatically on Dell S4148F-ON and S4128F-ON switches.

To enable a speed lower than 10G on a port on the S4148F and S4128F platforms, you must dedicate an entire port group (four interfaces) to a lower speed setting. Within a port group, you can mix 1G and 100M speeds, if needed. You cannot mix 10G and lower speeds.

To work around this issue:

  1. In the /etc/cumulus/ports.conf file, set each of the four ports in the port group to 1G. Port groups are swp1-4, swp5-8, swp9-12, and so on, and start with swp31-35 on the right half of the switch. For example, to enable ports swp5-swp8 to link up at 100M or 1G speeds, add the following to the ports.conf file:
    5=1G
    6=1G
    7=1G
    8=1G
  2. Restart switchd:
    cumulus@switch:~$ sudo systemctl reset-failed switchd; sudo systemctl restart switchd
  3. Configure the interfaces.

    On RJ45 SFPs (1G-BaseT), set the link speed to 1000 for 1G or 100 for 100M and turn off auto-negotiation for each of the four ports in the port group, as shown in the example commands below. Note that auto-negotiation still functions internally on the RJ45 side from within the 1G-BaseT SFP PHY to the neighboring NIC.

    cumulus@switch:~$ net add interface swpXX
    cumulus@switch:~$ net add interface swpXX link speed 1000
    cumulus@switch:~$ net add interface swpXX link autoneg off
    cumulus@switch:~$ net commit

    These commands create the following configuration in the /etc/network/interfaces file:

    auto swpXX
    iface swpXX
     link-speed 1000
     link-duplex full
     link-autoneg off

    On Fiber SFPs (1G-BaseSX, 1G-BaseLX), enable auto-negotiation for each of the four ports in the port group, as shown in the example commands below. Auto-negotiation is not required but allows unidirectional fiber link detection.

    cumulus@switch:~$ net add interface swpXX
    cumulus@switch:~$ net add interface swpXX link autoneg on
    cumulus@switch:~$ net commit

    These commands create the following configuration in the /etc/network/interfaces file:

    auto swpXX
    iface swpXX
      link-autoneg on

RN-804 (CM-19421)
Convergence after a spine goes down takes longer than expected

When a spine-to-leaf link is brought down or when a spine switch is powered down completely, traffic does not fail over to the remaining ECMP paths with minimal losses, and convergence takes longer than expected.


RN-837 (CM-19919)
PCIe bus error (Malformed TLP) on the Dell Z9100 switch

Certain Dell Z9100 switches running Cumulus Linux 3.5.2 have a different string coded in the Manufacturer field of the SMBIOS/DMI information. This discrepancy sometimes causes a problem with timing during the boot sequence that leaves switchd in a failed state.

To work around this issue, perform a cold reboot or power cycle the switch.

This issue should be fixed in an upcoming version of Cumulus Linux.


RN-838 (CM-19897)
VXLAN flow dropped as ingress general discards on Mellanox switches

After creating a new VNI on a Mellanox switch, VXLAN encapsulated packets are dropped as ingress general discards.

To work around this issue, flap the newly created VNI using the sudo ifdown <vni> and sudo ifup <vni> commands, or trunk the associated VLAN on an interface that is not currently forwarding that VLAN.

This issue is fixed in Cumulus Linux 3.5.2.


RN-839 (CM-19038)
NCLU DecodeError: 'ascii' codec can't decode byte 0xc2

Certain unicode characters in the /etc/network/interfaces.d/vlans.intf file, which is used by the interfaces file, cause a decode error when running the net show interface command.

This issue is fixed in Cumulus Linux 3.5.1.
