Cumulus Linux 3.1.0 Release Notes



These release notes support Cumulus Linux 3.1.0 and describe currently available features and known issues.

Stay up to Date 

  • Subscribe to our product bulletin mailing list to receive important announcements and updates about issues that arise in our products.
  • Subscribe to our security announcement mailing list to receive alerts whenever we update our software for security issues.


What's New in Cumulus Linux 3.1.0

Cumulus Linux 3.1.0 includes these new features:

And these new platforms:

Early Access Features

The following early access features are included in Cumulus Linux 3.1.0:

To install an early access feature, do the following:

  1. After you upgrade to Cumulus Linux 3.1.0, make sure you reboot the switch.
  2. Open the /etc/apt/sources.list file in a text editor.
  3. Uncomment the early access repository lines and save the file:

    #deb CumulusLinux-3-early-access cumulus
    #deb-src CumulusLinux-3-early-access cumulus
  4. Run the following commands in a terminal to install an early access package:

    cumulus@switch:~$ sudo apt-get update
    cumulus@switch:~$ sudo apt-get install <early access package>

    Where <early access package> is any of the following:

    • Network command line utility: nclu
    • netq: cumulus-netq
    • OpenStack ML2: python-falcon and python-cumulus-restapi
    • PIM: cumulus-pim
    • TACACS+: tacplus-client
  5. If you're installing the cumulus-pim package, complete the installation by running:
    cumulus@switch:~$ sudo apt-get upgrade
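Step 3 above can also be scripted. The following is a minimal sketch, assuming the two repository lines begin with #deb or #deb-src and contain the string early-access, as in the listing above. The function name is illustrative and not part of Cumulus Linux. It prints the edited file to standard output instead of changing it in place, so the result can be reviewed first.

```shell
# Print a sources.list (read from stdin) with the early access
# repository lines uncommented. All other lines pass through unchanged.
# Assumption: the lines to enable begin with "#deb" and contain
# the string "early-access".
enable_early_access() {
    sed -e '/early-access/ s/^#\(deb\)/\1/'
}
```

For example, running enable_early_access < /etc/apt/sources.list > /tmp/sources.list.new writes the edited copy to a temporary file (the path is arbitrary), which you can inspect before copying it into place with sudo.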


Cumulus Linux is licensed on a per-instance basis. Each network system is fully operational and every capability can be used on the switch, with the exception of forwarding on the front panel ports. Only eth0 and the console ports are activated on an unlicensed instance of Cumulus Linux. Enabling front panel ports requires a license.

You should have received a license key from Cumulus Networks or an authorized reseller. To install the license, read the Cumulus Linux Quick Start Guide.

Installing Version 3.1.0

If you are upgrading from version 3.0.0, use apt-get to update the software. Version 3.1.0 introduces a new meta-package of kernel and platform modules, which requires you to run apt-get upgrade a second time.

  1. Run apt-get update.
  2. Run apt-get upgrade.
  3. To install the meta-package, run apt-get upgrade a second time.
  4. Reboot the switch.

To confirm the upgrade was successful, run: 

dpkg -l | grep linux-image

The output should include linux-image-4.1.0-cl-2-amd64 4.1.25-1+cl3u4, where cl-2 is the new kernel ABI version, and linux-image-amd64 4.1+63+cl3u1, which indicates that the meta-package is installed.
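The check can be automated along these lines. This is a sketch only: the function name is made up for illustration, and it assumes the default dpkg -l column layout, in which the first column is the status flag and the second is the package name.

```shell
# Succeed (exit status 0) if the dpkg listing read from stdin contains an
# installed package named exactly linux-image-amd64, the meta-package.
# Assumption: input uses the default "dpkg -l" layout, with the status
# flag in column 1 and the package name in column 2.
meta_package_installed() {
    awk '$1 == "ii" && $2 == "linux-image-amd64" { found = 1 }
         END { exit !found }'
}
```

A typical call would be dpkg -l | meta_package_installed && echo "meta-package installed".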

New Install or Upgrading from Versions Older than 3.0.0

If you are upgrading from a version older than 3.0.0, or installing Cumulus Linux for the first time, download the Cumulus Linux 3.1.0 installer for Broadcom or Mellanox switches from the Cumulus Networks website, then use ONIE to perform a complete install, following the instructions in the quick start guide.

Note: This method is destructive; any configuration files on the switch will not be saved, so please copy them to a different server before upgrading via ONIE.

Important! After you install, run apt-get update, then apt-get upgrade on your switch to make sure Cumulus Linux includes the latest package updates.

Updating a Deployment that Has MLAG Configured

If you are using MLAG to dual connect two switches in your environment, and those switches are still running Cumulus Linux 2.5 ESR or any other release earlier than 3.0.0, the switches will not be dual-connected after you upgrade the first switch. To ensure a smooth upgrade, follow these steps:

  1. Run cl-img-select -fr to boot the switch in the secondary role into ONIE, then reboot the switch.
  2. Install Cumulus Linux 3.1.0 onto the secondary switch using ONIE. At this time, all traffic is going to the switch in the primary role.
  3. After the install, copy the license file and all the configuration files you backed up, then restart the switchd and networking services.
    cumulus@switch:~$ sudo systemctl restart switchd.service
    cumulus@switch:~$ sudo systemctl restart networking.service
    All traffic is still going to the primary switch.
  4. Run cl-img-select -fr to boot the switch in the primary role into ONIE, then reboot the switch. Now, all traffic is going to the switch in the secondary role that you just upgraded to version 3.1.0.
  5. Install Cumulus Linux 3.1.0 onto the primary switch using ONIE. 
  6. After the install, copy the license file and all the configuration files you backed up, then restart the switchd and networking services.
    cumulus@switch:~$ sudo systemctl restart switchd.service
    cumulus@switch:~$ sudo systemctl restart networking.service
  7. Now the two switches are dual-connected again and traffic flows to both switches.

SNMP Not Supported in Quagga

There is no SNMP support for Quagga in Cumulus Linux. However, it's possible to get Quagga information via SNMP by:

  • Using Nagios
  • Writing a pass_persist script in Perl or Python that fills in the OSPF or BGP (RFC) MIBs manually.
  • Creating your own private MIB for the information you need.

Because Quagga has no SNMP support, you must remove all references to smux from each of the configuration files listed below, and you must do so before upgrading Cumulus Linux using apt-get. If smux entries are present in the configuration files, the daemons in the 2.5 packaged version of Quagga will not start.

  1. cd /etc/quagga
  2. grep smux *
  3. Delete all lines in the config files containing the smux keyword.

The references to smux that must be removed are:

  • In bgpd.conf, remove this line:
    smux peer quagga_bgpd
  • In ospf6d.conf, remove this line:
    smux peer quagga_ospf6d
  • In ospfd.conf, remove this line:
    smux peer quagga_ospfd
  • In zebra.conf, remove this line:
    smux peer quagga_zebra
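The cleanup steps above could be scripted roughly as follows. This is a sketch under stated assumptions: the function name and the .bak backup suffix are illustrative choices, not Cumulus Linux conventions, and the function removes every line that contains the string smux.

```shell
# Delete every line containing the smux keyword from one configuration
# file, leaving the original in <file>.bak (the backup suffix is an
# assumption made for this sketch).
remove_smux() {
    conf="$1"
    cp "$conf" "$conf.bak" || return 1
    # grep -v exits with status 1 when it prints nothing, which is fine
    # here (a file made up only of smux lines); higher statuses are errors.
    grep -v 'smux' "$conf.bak" > "$conf" || [ "$?" -eq 1 ]
}
```

Run it as root from /etc/quagga once per file, for example remove_smux bgpd.conf, then run grep smux * again to confirm that nothing is left.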

Perl, Python and BDB Modules

Any Perl scripts that use the DB_File module or Python scripts that use the bsddb module won't run under Cumulus Linux 3.1.0.


You can read the technical documentation here.

Issues Fixed in Cumulus Linux 3.1.0

The following is a list of issues fixed in Cumulus Linux 3.1.0 from earlier versions of Cumulus Linux.

Release Note ID Summary Description

RN-134 (CM-2303)
Installing Chef under Cumulus Linux

The Cumulus Linux repository contains two versions of Chef, the automation tool: 11.6.2 (the current version) and 10.30.4.

To install the latest version, connect to the switch and use apt-get:

cumulus@switch:~$ sudo apt-get install chef

To install 10.30.4, connect to the switch and use apt-get:

cumulus@switch:~$ sudo apt-get install chef=10.30.4-0.debian.7.3 

RN-179 (CM-3410)

10GTek 10G SR cables exhibit high rate of errors on Penguin Arctica 4804X switch

Some PHY-less Penguin Arctica 4804X platforms using 10GTek 10G MM SR cables exhibit high rates of errors and low bandwidth in one direction.

RN-281 (CM-5118)
Default route not removed on ifup after removing gateway statement from eth0 configuration

If you try to remove the default route from eth0 (either by commenting out or removing the gateway statement in the eth0 configuration), the route remains after running ifup.

To work around this issue, first run ifdown, then ifup on the interface via the console. After this, the route disappears.

RN-315 (CM-7318)
Dell S4048-ON: Various power off commands render the switch unusable

Issuing a poweroff or shutdown command with certain options shuts down the switch, but it cannot be powered on again. The options that cause this issue are:

  • shutdown -h, shutdown -P
  • poweroff -n, poweroff -d, poweroff -f, poweroff -i, poweroff -h, poweroff -p
  • halt -n, halt -d, halt -f, halt -i, halt -h, halt -p
  • init 0

Issuing a reboot command, or using other options, does not trigger this issue.

This issue affects only Dell S4048-ON switches with BIOS version or earlier. To determine the BIOS version of the switch, run:

cumulus@switch:~$ sudo dmidecode -s system-version

This issue has been fixed. Please contact Cumulus Networks support for assistance in resolving it.

RN-324 (CM-7228)
On Edge-Core AS4610-54P, the Power over Ethernet poectl service reports 5V difference in power consumption

The voltage reported by the poectl -i command differs by 5V from the voltage measured through a power meter connected to the device. The current and power readings are correct and show no difference.

RN-381 (CM-6307)
Implement IPv6 initial neighbor discovery process to speed peer startup

If the IPv6 nd ra-interval <interval> command is not run, the default max value of 600 seconds is used. This can delay peer discovery for up to 10 minutes for some peers. The ra-interval must be set to avoid this issue.

RN-402 (CM-9627)
smond PSU warnings on Supermicro X3648S

The Supermicro X3648S uses a different PSU (DPS-550) than the equivalent Penguin Arctica 4806XP switch (DPS-460). Cumulus Linux was written to work with the DPS-460, but the DPS-550 behaves differently.

On the Penguin Arctica 4806XP, a PWM value is written to the DPS-460 fan, and the power supply fan remains at that setting. But on the Supermicro X3648S, when a PWM value is written to the DPS-550 fan, the power supply fan initially goes to the setting; however, after five seconds, it may revert back to its internal setting.

As a result, you will see smond warning messages that the fan input RPM is lower than expected, which can be safely ignored.

RN-403 (CM-10968) is set to

The for systemd is mistakenly set to in this release, instead of You may see this message in the journal or syslog at system boot:

systemd[1]: Cannot add dependency job for unit display-manager.service, ignoring: Unit display-manager.service failed to load: No such file or directory.

You should ignore this message, as the correct systemd state is reached, since the is a prerequisite of

This issue will be fixed in a future release of Cumulus Linux.

RN-407 (CM-11103)
When the sx_sdk service is restarted manually or during a package upgrade, switchd receives "Invalid Handle" errors

When the sx_sdk service is restarted manually, or when the sx_sdk Debian package is upgraded, switchd receives "Invalid Handle" errors:

2016-05-20T20:05:15.144736+00:00 mlx-2700-01 switchd[14396]: hal_mlx.c:4620 [SX_API_INTERNAL     ]: Invalid handle: handle is not valid.
2016-05-20T20:05:15.145099+00:00 mlx-2700-01 switchd[14396]: hal_mlx_port.c:1789 ERR port_pfc_stats_get failed for lid 0x13c00 prio 2: Invalid Handle
2016-05-20T20:05:15.145460+00:00 mlx-2700-01 switchd[14396]: hal_mlx.c:4620 [SX_API_INTERNAL     ]: Invalid handle: handle is not valid.
2016-05-20T20:05:15.145847+00:00 mlx-2700-01 switchd[14396]: hal_mlx_port.c:1789 ERR port_pfc_stats_get failed for lid 0x13c00 prio 3: Invalid Handle

These error messages are being investigated. In the meantime, restart switchd whenever you restart sx_sdk, so that each service recognizes that the other has been restarted.

RN-410 (CM-10215)
Mellanox SN2700 breakout cables always report errors/packets in pause

On a Mellanox SN2700 switch, any port set to breakout mode on boot reports errors or pause packets in its counters. A cable does not need to be plugged in; setting the port to breakout mode and then restarting switchd is enough to trigger the issue.

This issue will be fixed in the next Mellanox SDK update.

RN-425 (CM-11308)
While configuring LNV, an incomplete LNV configuration causes a hang during boot up

If you configure the LNV registration node in /etc/network/interfaces for a switch running Cumulus Linux 3.0 then reboot the switch, the switch does not boot again.

To work around this issue, configure the registration node in /etc/vxrd.conf instead.

RN-426 (CM-5970)
isc-dhcp-relay must be restarted after flapping an interface or if logical interface is down

There are two known issues regarding isc-dhcp-relay:

  • If isc-dhcp-relay is already running and a host-facing interface is flapped (brought down then up using ifupdown2), the DHCP relay will not work for that interface until isc-dhcp-relay is restarted.
  • If a logical interface defined in /etc/default/isc-dhcp-relay is down when isc-dhcp-relay is started or restarted, then isc-dhcp-relay will not start; thus, the DHCP relay will not work for any interfaces.

To work around the issue, for each interface listed in INTERFACES in /etc/default/isc-dhcp-relay, add the following line to the corresponding iface stanza in /etc/network/interfaces:

post-up test -e /var/run/boot.done && service isc-dhcp-relay restart
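To see which iface stanzas need the post-up line, the interface list can be read out of the defaults file. The helper below is a hypothetical sketch: it assumes /etc/default/isc-dhcp-relay uses shell variable syntax (INTERFACES="..."), and it sources the file, so it should only be pointed at a file you trust.

```shell
# Print, one per line, the interfaces named in the INTERFACES variable of
# an isc-dhcp-relay defaults file (shell variable syntax is assumed).
# The file is sourced in a subshell so its variables do not leak out.
# Pass a path containing a slash, so that "." does not search PATH.
list_relay_interfaces() {
    (
        INTERFACES=""
        . "$1" || exit 1
        for ifname in $INTERFACES; do
            printf '%s\n' "$ifname"
        done
    )
}
```

Calling list_relay_interfaces /etc/default/isc-dhcp-relay prints each interface whose stanza in /etc/network/interfaces should carry the workaround.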

RN-456 (CM-11587)
netshow does not display VLAN information for VLAN-aware bridge


VLAN information for VLAN-aware bridges was not displayed previously, due to an error in the interface configuration. This has been corrected.

RN-457 (CM-11431)
Added MLAG bonds to isolated switch remain in init


When a single clagd switch is present, MLAG bonds are taken out of protodown state after the reload timer expires. However, clagd bonds created after the reload timer expired were still initialized in protodown and never taken out of that state, because the timer had already expired. The initial state of the clagd bond has been changed so that it is set to protodown only if the peer has not been heard from and the reload timer hasn't expired. This corrects the issue.

RN-458 (CM-11172)
On Mellanox SN2700, configuring 1G links through /etc/network/interfaces requires a link down/up


On Mellanox SN2700 switches, configuring 1G link speed in /etc/network/interfaces requires the link to be brought down/up after booting, as ifupdown2 configures the port speed after boot. In later releases of Cumulus Linux, ifupdown2 configures the port settings prior to the link being up.

RN-460 (CM-8764)
cl-acltool -i errors with hardware sync failed (sync_acl hardware installation failed)


Previously, cl-acltool -i install failures due to a lack of hardware resources failed to provide a meaningful error message. The error messages have been updated to show where the failure happened, and which resources failed.

Known Issues in Cumulus Linux 3.1.0

Issues are categorized for easy review. Some issues are already fixed, but the fixes will only be available in a later release.

Release Note ID Summary Description

RN-52 (CM-997,
Parameters like the router ID and DR priority cannot be changed while OSPFv2/v3 is running

Router ID and DR priority can only be changed by shutting down OSPFv2/v3, changing the ID, and restarting the OSPF process.

A change to the DR priority may not properly be reflected in the LSAs that are still aging out.

RN-56 (CM-343)
IPv4/IPv6 forwarding disabled mode not recognized

If either of the following is configured:

net.ipv4.ip_forward == 0

net.ipv6.conf.all.forwarding == 0

the hardware still forwards packets if there is a neighbor table entry pointing to the destination.

RN-77 (CM-265)
New routes/ECMPs can evict existing/installed routes

Cumulus Linux syncs routes between the kernel and the switching silicon. If the required resource pools in hardware fill up, new kernel routes can cause existing routes to move from being fully allocated to being partially allocated.

In order to avoid this, routes in the hardware should be monitored and kept below the ASIC limits.

For example, on systems with Trident+ chips, the limits are as follows:
 routes: 16384            <<<< if all routes are IPv4
 long mask routes: 256    <<<< i.e., routes with a mask longer than the route mask limit
 route mask limit: 64
 host_routes: 8192
 ecmp_nhs: 4044
 ecmp_nhs_per_route: 52
That translates to about 77 routes with ECMP NHs, if every route has the maximum ECMP NHs.

Monitoring this in Cumulus Linux is performed via the cl-resource-query command:
cumulus@switch:~$ sudo cl-resource-query
 hosts : 3 
 all routes : 29 
 IP4 routes : 17 
 IP6 routes : 12 
 nexthops : 3 
 ecmp_groups : 0
 ecmp_nexthops : 0
 mac entries : 0 / 131072 
 bpdu entries : 500 / 512
The resource to monitor is the ecmp_nexthops. If this count is close to 4044, new ECMPs may evict existing routes.
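The monitoring described above could be scripted as in the sketch below. The function names and the 90 percent default threshold are assumptions made for illustration. The 4044 limit is the Trident+ value quoted above, so other ASICs would need a different number. The parser assumes the ecmp_nexthops line has the form shown in the sample output.

```shell
# Trident+ ECMP next hop limit quoted in the release note above.
ECMP_NH_LIMIT=4044

# Read cl-resource-query output on stdin and print the ecmp_nexthops count.
# Assumption: the line looks like " ecmp_nexthops : <count>" as in the
# sample above; "$2 + 0" keeps only the leading number of the value.
ecmp_nexthops_in_use() {
    awk -F: '$1 ~ /ecmp_nexthops/ { print $2 + 0; exit }'
}

# Succeed if usage (read from stdin) is at or above the given percentage
# of ECMP_NH_LIMIT. The 90 percent default is an arbitrary choice.
ecmp_near_limit() {
    pct="${1:-90}"
    used="$(ecmp_nexthops_in_use)"
    [ "$(( ${used:-0} * 100 ))" -ge "$(( $ECMP_NH_LIMIT * $pct ))" ]
}
```

For example, sudo cl-resource-query | ecmp_near_limit 80 && echo "ECMP next hops above 80% of the limit" could be run periodically from cron.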

RN-120 (CM-477)
ethtool LED blinking does not work with switch ports

Linux uses ethtool -p to identify the physical port backing an interface, or to identify the switch itself. Usually this identification is by blinking the port LED until ethtool -p is stopped.

This feature does not apply to switch ports (swpX) in Cumulus Linux.

RN-121 (CM-2123)
PTMD: When a physical interface is in a PTM FAIL state, its subinterface still exchanges information

Issue:
When PTMD incorrectly reports a failure state and the Zebra interface is enabled, BGP sessions on the physical interface (PIF) do not establish routes, but sessions on the subinterface on top of it do.

If the subinterface is configured on the physical interface and the physical interface is incorrectly marked as being in a PTM FAIL state, routes on the physical interface are not processed in Quagga, but the subinterface keeps working.

Steps to reproduce:
cumulus@switch:$ sudo vtysh -c 'show int swp8' 
Interface swp8 is up, line protocol is up 
PTM status: fail
index 10 metric 1 mtu 1500 
 HWaddr: 44:38:39:00:03:88 
 inet broadcast 
 inet6 2001:cafe:0:38::1/64 
 inet6 fe80::4638:39ff:fe00:388/64 
cumulus@switch:$ ip addr show | grep swp8 
  mtu 1500 qdisc pfifo_fast state UP qlen 500 
  inet brd scope global swp8 
 104: swp8.2049@swp8: <BROADCAST,MULTICAST,UP,LOWER_UP> 
  mtu 1500 qdisc noqueue state UP 
  inet brd scope global swp8.2049 
 105: swp8.2050@swp8: <BROADCAST,MULTICAST,UP,LOWER_UP> 
  mtu 1500 qdisc noqueue state UP 
  inet brd scope global swp8.2050 
 106: swp8.2051@swp8: <BROADCAST,MULTICAST,UP,LOWER_UP> 
  mtu 1500 qdisc noqueue state UP 
  inet brd scope global swp8.2051 
 107: swp8.2052@swp8: <BROADCAST,MULTICAST,UP,LOWER_UP> 
  mtu 1500 qdisc noqueue state UP 
  inet brd scope global swp8.2052 
 108: swp8.2053@swp8: <BROADCAST,MULTICAST,UP,LOWER_UP>
  mtu 1500 qdisc noqueue state UP 
  inet brd scope global swp8.2053 
 109: swp8.2054@swp8: <BROADCAST,MULTICAST,UP,LOWER_UP> 
  mtu 1500 qdisc noqueue state UP 
  inet brd scope global swp8.2054
 110: swp8.2055@swp8: <BROADCAST,MULTICAST,UP,LOWER_UP>
  mtu 1500 qdisc noqueue state UP 
  inet brd scope global swp8.2055
cumulus@switch:$ bgp sessions:
 ,4 ,64057 , 958 , 1036 , 0 , 0 , 0 ,15:55:42, 0, 10472
 ,4 ,64058 , 958 , 1016 , 0 , 0 , 0 ,15:55:46, 187, 10285
 ,4 ,64059 , 958 , 1049 , 0 , 0 , 0 ,15:55:40, 187, 10285
 ,4 ,64060 , 958 , 1039 , 0 , 0 , 0 ,15:55:45, 187, 10285
 ,4 ,64061 , 958 , 1014 , 0 , 0 , 0 ,15:55:46, 187, 10285
 ,4 ,64062 , 958 , 1016 , 0 , 0 , 0 ,15:55:46, 187, 10285
 ,4 ,64063 , 958 , 1029 , 0 , 0 , 0 ,15:55:43, 187, 10285
 ,4 ,64064 , 958 , 1036 , 0 , 0 , 0 ,15:55:44, 187, 10285

RN-125 (CM-1576)
Network LSA with an old router ID isn't flushed out by the originator
When the router ID is changed, the router should remove the previous network LSA (link-state advertisement) that it generated based on the IP address on the interface in the Network LSA.

Cumulus Linux doesn't remove this LSA, so it is left to age out naturally.

RN-132 (CM-2272)
You must run "apt-get update" before running any apt-get commands or after changing sources.list

Before running any apt-get commands or after changing the sources.list file in /etc/apt, you need to run apt-get update.

RN-133 (CM-2273)
Interface names in Cumulus Linux cannot exceed 15 characters

Device names, including interface names, in Cumulus Linux cannot exceed 16 characters – including the terminator. Cumulus Linux truncates longer interface names.

To avoid this issue, do not assign long names to your interfaces.

The following example configuration reproduces this issue:

cumulus@switch:/sys/class/net$ grep 'iface br' /etc/network/interfaces 
iface br2-pubmgmt inet static
iface br3-prvmgmt inet manual
iface br400-quarantine inet manual
iface br401-peering-1k5 inet manual
iface br402-peering-9k inet manual
iface br500-pi-exa inet manual
iface br501-akamai-exa inet manual
iface br502-exa-internetfactory inet manual
cumulus@switch:/sys/class/net$ brctl show | grep br
bridge name	bridge id	 STP enabled	interfaces
br2-pubmgmt	 8000.089e01cebe37	no	 bond0.2
br3-prvmgmt	 8000.089e01cebe3a	no	 bond0.3
br400-quarantin	 8000.089e01cebe37	no	 bond0.400
br401-peering-1	 8000.089e01cebe3a	no	 bond0.401 <<<
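A configuration can be checked for over-long names before it is applied. The following is a minimal sketch, assuming interfaces are declared on lines of the form "iface <name> ..." as in the example above. The function name is illustrative.

```shell
# Longest interface name the kernel accepts (16 bytes minus the terminator).
MAX_IFNAME_LEN=15

# Print every interface name declared with "iface <name> ..." (read from
# stdin) that is longer than MAX_IFNAME_LEN and would be truncated.
long_interface_names() {
    awk -v max="$MAX_IFNAME_LEN" \
        '$1 == "iface" && length($2) > max { print $2 }'
}
```

Running long_interface_names < /etc/network/interfaces against the example configuration would list names such as br400-quarantine and br401-peering-1k5, which are 16 and 17 characters long.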

RN-198 (CM-3290)
Port LEDs behave differently on different switch models

It's been observed that port LEDs behave differently depending upon the make and model of the switch. For example:

  • Agema AG-7448CU: the LED is off when the link is up. It blinks on briefly when there is traffic.
  • Edge-Core AS4600-54T: the LED is off when the link is up. It blinks on briefly when there is traffic.
  • QuantaMesh T3048-LY2R: the LED is on when the link is up. It blinks off briefly when there is traffic.

Cumulus Networks is currently working to fix this issue.

RN-199 (CM-2624)
When a Quagga route-map is modified, the switch could use the partial map before edits are completed

Cumulus Linux triggers a route-map update before the user finishes editing the route map, resulting in an incorrect route map being used. The route-map update trigger should only occur when the user finishes editing the map.

Cumulus Networks is working to fix this issue.

RN-221 (CM-4501)
BGP graceful restart, including helper mode, not fully supported

If you encounter issues with this, please submit a support request and include the output from cl-support with your ticket.

RN-227 (CM-3388)
BGP dynamic capability is not supported

BGP peer sessions with dynamic capability are not supported under any version of Cumulus Linux at this time.

RN-275 (CM-5794)
BGP import-check fails for IPv6 route if static routes to null0 are used

The path that Cumulus Linux originates should not be invalid since there is a matching route in the RIB. The import check works fine for IPv4 routes.

RN-322 (CM-7387)
Interfaces disabled using iproute2 become enabled after restarting Quagga

By default, all interfaces have a "no shutdown" associated with them in Quagga. Thus, when you restart Quagga, it enables the interfaces. This is expected behavior in Quagga. There is no workaround at this time.

RN-327 (CM-4290)
Changing the route-map parameter of the redistribute command in OSPF and BGP doesn't affect the state of the resulting redistribution in those protocols

To work around this issue, remove any old redistribute command configurations before adding a new one with or without route-map as a parameter.

For example, if OSPF has a redistribute configuration such as redistribute bgp route-map redist-map-name, you would enable redistribution without a route-map by following these steps in OSPF configuration mode:

  1. no redistribute bgp
  2. redistribute bgp

You would perform a similar sequence of commands for redistribution changes in BGP as well.

RN-355 (CM-7994)
OSPFv2 Area ID being implicitly translated from Integer format to dotted decimal format

While OSPF area ID configuration in Quagga allows for the value to be specified in either dotted decimal format, or as an integer, values specified as an integer will be converted into dotted decimal format when displayed, causing potential confusion for the operator.

This issue does not impact OSPF functionality; only the display output. However, it is recommended that the OSPF area ID is specified in dotted decimal format for consistency.


RN-380 (CM-6110)
ifupdown2: adjust VLAN subinterface MTU based on MTU settings specified under lowerdev by the user

The following kernel error occurs when the MTU is specified under a subinterface rather than under the VLAN interface:

root@dell-s3000-04:~# ifreload -a -X eth0
warning: failed to execute cmd 'ip -force -batch - [addr add dev swp52.100
link set dev swp52.100 mtu 9000 
]'(RTNETLINK answers: Numerical result out of range
Command failed -:2)

This issue is being investigated.

RN-382 (CM-6692)
Quagga: Removing bridge via ifupdown2 does not remove it from Quagga

Removing a bridge using ifupdown2 does not remove it from the Quagga configuration files. This issue is being investigated; however, restarting Quagga will successfully remove the bridge.

RN-383 (CM-7196)
admin down of link deletes IPv6 nexthop static route entry, but not for IPv4

When a link is admin down and carrier is on, the IPv4 nexthop entry is marked dead, but the IPv6 nexthop entry is deleted, and will not be restored when the link is admin up. However, if carrier is off, the IPv6 nexthop is marked dead and not deleted. This inconsistency in admin down behavior is being investigated.

RN-384 (CM-7684)
Keeping VXLAN single-connected devices up on MLAG secondary node

In the current MLAG secondary design, if the VXLAN device is not dual-connected, it is kept in a protodown state. You can keep such devices up with individual IP addresses rather than anycast IPs when the peerlink is down, so that all single-connected hosts will have connectivity. Further investigation regarding this issue is underway.

RN-387 (CM-8163)
Quagga appears to not honor passive interfaces if VRR is active

In a VRR configuration, any interface-specific routing configuration (e.g., OSPF mode of operation) specified on the subinterface having a virtual IP address does not take effect. This is because when an operator has specified a virtual IP on a bridge, the system creates another internal interface bridge with the virtual IP and MAC. These two interfaces are treated distinctly by Quagga, so any interface-specific routing configuration on the bridge does not get carried over to the second bridge.

In a VRR deployment needing any interface-specific routing configuration on the interface with a virtual IP address, the routing configuration has to be specified against the internally-created virtual interface also.

RN-389 (CM-8410)
switchd supports only port 4789 as the UDP port for VXLAN packets

switchd currently allows only the standard port 4789 as the UDP port for VXLAN packets. There are cases where a hypervisor could be using a non-standard UDP port, which would cause VXLAN exchanges with the hardware VTEP to fail. In such a case, packets would not be terminated and encapsulated packets would be sent out on UDP port 4789.

RN-390 (CM-9055)
If you’re logged into the serial console and type reboot, the system may hang indefinitely

In Cumulus Linux 3.1.0, systemd may block and stop handling systemctl changes, and fail to start or restart services, if the serial console columns or rows are changed. For example, running stty rows 30 columns 96 may cause this. In this state, a new login session will not be started on the serial console (/dev/ttyS0) after logout.

To verify whether systemd is hung in this manner, run cat /proc/1/stack; you should see output similar to the following:

[<ffffffff81466f5d>] tty_port_block_til_ready+0x1bd/0x330
[<ffffffff81093e90>] ? wait_woken+0x90/0x90
[<ffffffff8147abbb>] ? uart_startup.part.16+0xbb/0x1f0
[<ffffffff8147adef>] uart_open+0xff/0x180
[<ffffffff8145f025>] tty_open+0xf5/0x660
[<ffffffff811a3ad8>] chrdev_open+0xa8/0x1a0
[<ffffffff811a3a30>] ? cdev_put+0x30/0x30
[<ffffffff8119cc79>] do_dentry_open.isra.15+0x159/0x310
[<ffffffff8119e183>] vfs_open+0x53/0x60
[<ffffffff811acfd9>] do_last+0x249/0x11f0
[<ffffffff811af7c0>] path_openat+0x80/0x5f0
[<ffffffff811ec90e>] ? locks_dispose_list+0x3e/0x50
[<ffffffff811edae0>] ? __posix_lock_file+0xe0/0x630
[<ffffffff811b129a>] do_filp_open+0x3a/0xb0
[<ffffffff813cb8aa>] ? find_next_zero_bit+0x1a/0x30
[<ffffffff811bd8fe>] ? __alloc_fd+0x7e/0x120
[<ffffffff8119e52c>] do_sys_open+0x12c/0x220
[<ffffffff8119e63e>] SyS_open+0x1e/0x20
[<ffffffff816fdc57>] system_call_fastpath+0x12/0x6a

At this point it is necessary to power cycle or otherwise reset the switch to recover. A reboot or shutdown command will block, because systemd is hung.
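The check for this hang can be wrapped in a small helper, for example for use in a monitoring script. This is a sketch, not a supported tool: the function name is made up, and it only looks for the tty_port_block_til_ready frame that appears at the top of the stack trace above.

```shell
# Succeed if the kernel stack of PID 1 shows it blocked opening a tty.
# An optional first argument overrides the stack file (default
# /proc/1/stack), which is useful for checking a saved copy.
# Reading /proc/1/stack requires root.
systemd_blocked_on_tty() {
    grep -q 'tty_port_block_til_ready' "${1:-/proc/1/stack}"
}
```

Run as root, systemd_blocked_on_tty && echo "systemd appears hung on the serial console" reports the condition without printing the whole stack.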

RN-391 (CM-9631)
Dell S4048 unresponsive after TX Unit Hang detected

After booting a Dell S4048 switch, the switch becomes unresponsive and errors like the following appear in the console log:

[ 1206.440277] igb 0000:00:14.0: Detected Tx Unit Hang
[ 1206.440277]   Tx Queue             <0>
[ 1206.440277]   TDH                  <2d>
[ 1206.440277]   TDT                  <2e>
[ 1206.440277]   next_to_use          <2e>
[ 1206.440277]   next_to_clean        <2d>
[ 1206.440277] buffer_info[next_to_clean]
[ 1206.440277]   time_stamp           <1000dcd20>
[ 1206.440277]   next_to_watch        <ffff88007d81b2d0>
[ 1206.440277]   jiffies              <1000dd5d4>
[ 1206.440277]   desc.status          <300000>
[ 1208.439856] igb 0000:00:14.0: Detected Tx Unit Hang
[ 1208.439856]   Tx Queue             <0>
[ 1208.439856]   TDH                  <2d>
[ 1208.439856]   TDT                  <2e>
[ 1208.439856]   next_to_use          <2e>
[ 1208.439856]   next_to_clean        <2d>
[ 1208.439856] buffer_info[next_to_clean]
[ 1208.439856]   time_stamp           <1000dcd20>
[ 1208.439856]   next_to_watch        <ffff88007d81b2d0>
[ 1208.439856]   jiffies              <1000ddda4>
[ 1208.439856]   desc.status          <300000>
[ 1210.439414] igb 0000:00:14.0: Detected Tx Unit Hang
[ 1210.439414]   Tx Queue             <0>
[ 1210.439414]   TDH                  <2d>
[ 1210.439414]   TDT                  <2e>
[ 1210.439414]   next_to_use          <2e>
[ 1210.439414]   next_to_clean        <2d>
[ 1210.439414] buffer_info[next_to_clean]
[ 1210.439414]   time_stamp           <1000dcd20>
[ 1210.439414]   next_to_watch        <ffff88007d81b2d0>
[ 1210.439414]   jiffies              <1000de574>
[ 1210.439414]   desc.status          <300000>
[ 1212.438966] igb 0000:00:14.0: Detected Tx Unit Hang
[ 1212.438966]   Tx Queue             <0>
[ 1212.438966]   TDH                  <2d>
[ 1212.438966]   TDT                  <2e>
[ 1212.438966]   next_to_use          <2e>
[ 1212.438966]   next_to_clean        <2d>
[ 1212.438966] buffer_info[next_to_clean]
[ 1212.438966]   time_stamp           <1000dcd20>
[ 1212.438966]   next_to_watch        <ffff88007d81b2d0>
[ 1212.438966]   jiffies              <1000ded44>
[ 1212.438966]   desc.status          <300000>
[ 1212.490329] igb 0000:00:14.0 eth0: Reset adapter

Rebooting the switch again stops the behavior.

RN-404 (CM-4407)
Aggregating routes in BGP with as-set can result in high CPU usage

When BGP is configured with aggregate addresses that use the as-set option and there are many routes to aggregate, the BGP process consumes a high amount of CPU.

To work around this issue, do not specify the as-set parameter for the aggregate-address configuration.

RN-409 (CM-10054)
BGP may show an inaccessible path as the best path

Existing BGP issues caused peering between a VRF device and a loopback BGP session to stay up if the loopback session doesn’t advertise its local address.

This issue will be fixed in a future release.

RN-412 (CM-11162)
Spectrum switches have a CPU RX burst size of 128 packets per queue

On switches with Mellanox Spectrum ASICs the burst size for packets received at the CPU is limited to 128 packets per queue. Larger bursts can result in packet drops until packets are drained from the queue.


RN-414 (CM-11175)
Kernel source not added to Cumulus Networks repository

Kernel source (linux<vers>.orig.tar.xz and linux<vers>.debian.tar.xz) files are not being added properly to the Cumulus Networks repository.

You can retrieve these packages manually with apt-get source <package name>. For example, to retrieve the Cumulus Networks linux-image source package, run:

cumulus@switch:~$ sudo apt-get install linux-source-4.1

The archive file is stored at /usr/src/linux-source-4.1.tar.xz.

RN-445 (CM-12370)
An interface cannot have both inet and inet6 DHCP configurations

If you configure an interface so it can obtain both IPv4 and IPv6 addresses via DHCP, ifupdown2 will honor only the first configuration and ignore the second.

In the following example configuration, ifupdown2 will request only an IPv4 DHCP address for swp1, not an IPv6 address.

auto swp1
iface swp1 inet dhcp
    link-speed 10000
    link-duplex full
    link-autoneg off

auto swp1
iface swp1 inet6 dhcp

This is a known issue that should be fixed in a future release of Cumulus Linux.
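
To check whether an existing configuration is affected, you could look for interfaces that have both kinds of DHCP stanza. This is a minimal sketch: dual_dhcp_ifaces is a hypothetical helper name, and it reads only the file you pass to it, so it does not follow any source lines that include other files.

```shell
# dual_dhcp_ifaces FILE
# Hypothetical helper: print each interface that has both an
# "inet dhcp" and an "inet6 dhcp" iface line in FILE.
dual_dhcp_ifaces() {
    awk '
        $1 == "iface" && $4 == "dhcp" && ($3 == "inet" || $3 == "inet6") {
            seen[$2, $3] = 1     # remember the (interface, family) pair
            names[$2] = 1
        }
        END {
            for (n in names)
                if ((n, "inet") in seen && (n, "inet6") in seen)
                    print n
        }
    ' "$1" | sort
}
```

For example, dual_dhcp_ifaces /etc/network/interfaces prints one interface name per line, and prints nothing if no interface is configured for both.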

RN-446 (CM-10513)
Redistribute neighbor does not work with more than 1024 interfaces

The rdnbrd service crashes because it cannot work with more than 1024 interfaces.

This issue should be fixed in a future release of Cumulus Linux.
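
To see how close a switch is to this limit, you could count its network interfaces. This sketch assumes that the limit applies to every interface the kernel knows about, which these notes do not confirm, and iface_count is a hypothetical helper name. If the bonding driver is loaded, /sys/class/net also contains a bonding_masters file, so the result can be one too high.

```shell
# iface_count [DIR]
# Hypothetical helper: count the entries under DIR (default /sys/class/net),
# where the Linux kernel exposes one entry per network interface.
iface_count() {
    dir="${1:-/sys/class/net}"
    ls -1 "$dir" | wc -l | tr -d ' '
}
```

Running iface_count with no argument prints the count for the running system.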

RN-447 (CM-11280)
"portwd: invalid SFF identifier: 0x0c" messages appear continuously in syslog

The following SFF message appears every 5 seconds in syslog:

cumulus@switch:~$ tail -f /var/log/syslog 
2016-08-06T12:18:56.095606-04:00 cumulus portwd: invalid SFF identifier: 0x0c
2016-08-06T12:19:01.113397-04:00 cumulus portwd: invalid SFF identifier: 0x0c
2016-08-06T12:19:01.121068-04:00 cumulus portwd: invalid SFF identifier: 0x0c
2016-08-06T12:19:01.121698-04:00 cumulus portwd: invalid SFF identifier: 0x0c
2016-08-06T12:19:06.139373-04:00 cumulus portwd: invalid SFF identifier: 0x0c
2016-08-06T12:19:06.147045-04:00 cumulus portwd: invalid SFF identifier: 0x0c
2016-08-06T12:19:06.147677-04:00 cumulus portwd: invalid SFF identifier: 0x0c
2016-08-06T12:19:11.165355-04:00 cumulus portwd: invalid SFF identifier: 0x0c
2016-08-06T12:19:11.173134-04:00 cumulus portwd: invalid SFF identifier: 0x0c
2016-08-06T12:19:11.173747-04:00 cumulus portwd: invalid SFF identifier: 0x0c
2016-08-06T12:19:16.191418-04:00 cumulus portwd: invalid SFF identifier: 0x0c
2016-08-06T12:19:16.199154-04:00 cumulus portwd: invalid SFF identifier: 0x0c
2016-08-06T12:19:16.199805-04:00 cumulus portwd: invalid SFF identifier: 0x0c

This is a known issue that should be fixed in a future version of Cumulus Linux.

RN-448 (CM-11302)
Using the json option in the "show ip bgp" command causes peer session flaps

This issue causes peer session flaps on Penguin Arctica 4806XP and Supermicro SSE-X3648S switches. It occurs with 16K IPv4 prefixes and only when you run show ip bgp json.

However, on switches with Tomahawk ASICs, with 61K IPv4 prefixes and default timers, the same show ip bgp json command causes all peer sessions to go down.

This is a known issue that should be fixed in a future release of Cumulus Linux.

RN-449 (CM-11584)
In traditional bridge mode, clagd syncs MAC addresses in the wrong VLAN when the peerlink is tagged and the bond is native

When a traditional mode bridge is configured and the peerlink is tagged but the clagd bonds use the native VLAN, clagd appears to try to sync the MAC addresses it learned using the VLAN tag from the peerlink.

This causes the MAC address not to be synced correctly on the peer.

This is a known issue that should be fixed in a future release of Cumulus Linux.

RN-450 (CM-12252)
802.1p remark in traffic.conf behaves differently on Mellanox vs. Broadcom switches

The 802.1p remark defined in traffic.conf acts differently on a Mellanox switch when compared to a Broadcom switch.

On the Mellanox platform, the remark defined in the traffic.conf file takes precedence even if there is an ACL rule that is matched.

On the Broadcom platform, the ACL rule takes precedence over the remark defined in the traffic.conf file.

RN-451 (CM-12344)
Mellanox switch rejects SPAN ACL rule for an output interface that is a subinterface

This is a known issue at this time.

RN-452 (CM-12363)
Mellanox switch rejects ACL rule in ebtables with both -d and -o options

On a Mellanox switch, cl-acltool rejects an ebtables rule that contains both the destination MAC address and the output interface.

This is a known issue.
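
To find rules that will be rejected before you install them, you could scan your ACL policy files for ebtables rules that use both options. This is a minimal sketch under stated assumptions: ebtables_d_and_o is a hypothetical helper name, the rules are assumed to sit under an [ebtables] section header in each file, and only the short options -d and -o are matched, not their long forms.

```shell
# ebtables_d_and_o FILE...
# Hypothetical helper: print "file:line: rule" for each rule in an
# [ebtables] section that contains both -d and -o.
ebtables_d_and_o() {
    awk '
        FNR == 1 { in_eb = 0 }                             # reset per file
        /^\[/    { in_eb = ($0 ~ /^\[ebtables\]/); next }  # section header
        in_eb && / -d / && / -o / { print FILENAME ":" FNR ": " $0 }
    ' "$@"
}
```

For example, assuming your policy files are in /etc/cumulus/acl/policy.d/, run ebtables_d_and_o /etc/cumulus/acl/policy.d/*.rules.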

RN-453 (CM-12564)
Default routes learned via DHCP are moved to the management VRF even if they are not in the management VRF

The mgmt-vrf package has a dhclient exit hook that incorrectly assumes that DHCP is used only with the management interface. If management VRF is enabled, it inserts default routes from the DHCP server into the management table.

Until this issue is resolved, do not use DHCP with the front panel (switch) ports. If you need DHCP for a switch port, disable Management VRF and assign the port to a VRF.

RN-454 (CM-12565)
When VRF is enabled, the ICMP "need to fragment" is not respected

Cumulus Linux defaults to using PMTU to discover if a path has a lower MTU than what is set on its local interfaces. If an ICMP - Fragmentation Needed packet is received on an interface associated with a VRF, the MTU change for the remote address is not properly handled. The end result is that packets exceeding the MTU going through that network segment are dropped.

To work around this issue until it is resolved, do one of the following:

  • Disable PMTU discovery. With discovery disabled, the "don't fragment" bit is not set in the IP header of outgoing packets, which allows nodes along the path to fragment the packet if needed.
    cumulus@switch:~$ sudo sysctl -w net.ipv4.ip_no_pmtu_disc=1
  • Manually reduce the MTU on the interface. Add mtu <value> to the stanza for that interface in /etc/network/interfaces.
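
The sysctl -w command in the first workaround changes only the running kernel, so the setting is lost when the switch reboots. One way to keep it, sketched here as an assumption rather than a documented procedure, is to write it to a drop-in file under /etc/sysctl.d. The helper name persist_no_pmtu_disc and the default file name are hypothetical.

```shell
# persist_no_pmtu_disc [FILE]
# Write the setting to a sysctl drop-in file so that it is reapplied at boot.
# The default file name is hypothetical; writing it requires root.
persist_no_pmtu_disc() {
    file="${1:-/etc/sysctl.d/99-no-pmtu-disc.conf}"
    printf 'net.ipv4.ip_no_pmtu_disc = 1\n' > "$file"
}
```

Run the helper from a root shell, then load the file without rebooting by running sudo sysctl -p /etc/sysctl.d/99-no-pmtu-disc.conf.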

RN-455 (CM-12578)
The cumulus-poe package is not installed on ARM switches after upgrading to version 3.1

After upgrading Cumulus Linux to version 3.1, the cumulus-poe package does not get installed on ARM-based switches. The following message appears after you reboot the switch:

[ OK ] Stopped Cumulus Linux POE Daemon.
Starting Cumulus Linux POE Daemon...
[FAILED] Failed to start Cumulus Linux POE Daemon.

In order to use Power over Ethernet (PoE) on an ARM switch, you need to install the cumulus-poe package:

cumulus@switch:~$ sudo apt-get update
cumulus@switch:~$ sudo apt-get install cumulus-poe
cumulus@switch:~$ sudo apt-get upgrade
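
To confirm that the package is now present, you could query the dpkg database. This is a minimal sketch, and pkg_installed is a hypothetical helper name.

```shell
# pkg_installed NAME
# Hypothetical helper: succeed only if dpkg reports package NAME
# as fully installed.
pkg_installed() {
    dpkg-query -W -f='${Status}' "$1" 2>/dev/null | grep -q 'install ok installed'
}
```

For example, pkg_installed cumulus-poe && echo "cumulus-poe is installed" prints the message only when the installation succeeded.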

RN-489 (CM-12643)
IGMP snooping doesn't prevent flooding of unregistered multicast data packets

When broadcast video is sent over multicast with a number of senders running, the resulting flooding can saturate the available bandwidth on an individual port, which then starts dropping packets. This issue is being investigated further.

RN-525 (CM-12715)
On a Quanta IX1 switch, 100G OSI LR4 module (QSFP28-LR4-OSI) doesn't advertise 40G in transceiver codes, so cannot be used at 40G speed

Cumulus Linux can support only the speeds that a module advertises in its transceiver codes. You can determine this by running ethtool -m on the interface where the module is installed:

cumulus@switch:~$ sudo ethtool -m <interface>

    Transceiver codes                       : 0x80 0x00 0x00 0x00 0x00 0x00 0x00 0x00
    Transceiver type                        : 100G Ethernet: 100G Base-LR4


The example above shows that this module supports only 100G; it cannot support 40G speeds. Similarly, if a module advertises 40G support only, it cannot support 100G speeds.
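
To run this check from a script, you could filter the ethtool -m output for the speed you need. This is a minimal sketch: module_advertises is a hypothetical helper name, and it assumes that each supported speed appears on a "Transceiver type" line in the same "<speed> Ethernet" form as the 100G line in the example output above.

```shell
# module_advertises SPEED
# Hypothetical helper: read "ethtool -m" output on standard input and
# succeed if a "Transceiver type" line lists SPEED (for example 40G).
module_advertises() {
    speed="$1"
    grep 'Transceiver type' | grep -q ": *${speed} Ethernet"
}
```

For example, with swp1 standing in for your own interface name: sudo ethtool -m swp1 | module_advertises 40G && echo "40G is advertised".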

RN-526 (CM-13037)
Upgrading the clag package fails when logrotate contains an invalid date

While upgrading from Cumulus Linux 3.0.1 to 3.1.z, the clag package does not update if the logrotate state file contains an invalid date. This can occur due to bad batteries in the switch or issues with the real-time clock (RTC) chip. The error may look like the following:

error: bad year 1929 for file /var/log/boot.log in state file /var/lib/logrotate/status
dpkg: error processing package clag (--configure):
 subprocess installed post-installation script returned error exit status 1
Errors were encountered while processing:
E: Sub-process /usr/bin/dpkg returned an error code (1)

You can work around this issue by removing the /var/lib/logrotate/status file and then forcing a log rotation. After you do this, the upgrade should be successful.
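
The workaround could be carried out as follows. This is a minimal sketch: reset_logrotate_state is a hypothetical helper name, and /etc/logrotate.conf is assumed to be the main logrotate configuration file on the switch.

```shell
# reset_logrotate_state [STATE_FILE] [CONFIG_FILE]
# Hypothetical helper: remove the logrotate state file, then force a
# rotation so that logrotate writes a fresh state file with current dates.
reset_logrotate_state() {
    state="${1:-/var/lib/logrotate/status}"
    conf="${2:-/etc/logrotate.conf}"
    rm -f "$state" && logrotate -f "$conf"
}
```

Run reset_logrotate_state from a root shell, then run the upgrade again.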

RN-527 (CM-13588)
Cannot set MTU to 9216 on a Mellanox switch

Setting the MTU to 9216 on a Mellanox switch returns the following error after running ifreload -a to reload the updated configuration:

error: swp1: cmd 'ip -force -batch - [link set dev swp1 mtu 9216
]' failed: returned 1 (RTNETLINK answers: Invalid argument
Command failed -:1

This issue is currently being investigated.

RN-528 (CM-13254)
Links don't come up with 100G AOC cables connected to QLogic 100G NIC (QL45611HLCU-CK) on Tomahawk-based switches

QLogic only supports DAC and SR optics with these NICs; the QLogic QL45611HLCU-CK NIC does not support AOC and LR optics.

RN-529 (CM-13243)
Links don't come up between Intel XL710 40G NICs and Tomahawk-based switches while using non-Intel-branded 40G-SR4 optics

If you see this error: "there was an unqualified module detected on the server", please refer to the list of supported fiber modules for the XL710-QDA2 NIC.

Links work fine between XL710 NICs and Tomahawk-based switches with any DAC cables.

RN-530 (CM-13255)
Links don't come up between Tomahawk switches and ConnectX-4 NICs while using 100G LR4 optics

Mellanox ConnectX-4 NICs do not support LR4 optics when connected to third party hardware, such as a Broadcom ASIC.

RN-550 (CM-13674)
The ZTP daemon shuts itself down after 5 minutes of inactivity

The zero touch provisioning (ZTP) daemon, ztpd, shuts itself down after 5 minutes of inactivity, but the service remains enabled for the next reboot.

This can affect deployments where a switch might be powered up in a remote data center for weeks without ever being configured. In such a case, there is no way to automatically initiate the ZTP process.

This is a known issue that will be fixed in a future release of Cumulus Linux.
