Cumulus Linux 2.5.2 Release Notes

Overview

These release notes support Cumulus Linux 2.5.2 and describe currently available features and known issues.

Licensing

Cumulus Linux is licensed on a per-instance basis. Each network system is fully operational, and any capability can be used on the switch, with the exception of forwarding on the front panel switch ports. Only eth0 and the console ports are active on an unlicensed instance of Cumulus Linux. Enabling front panel ports requires a license.

You should have received a license key from Cumulus Networks or an authorized reseller. To install the license, read the Cumulus Linux quick start guide.

Installing Version 2.5.2

If you are upgrading from version 2.5.0 or 2.5.1, use apt-get to update the software.

  1. Run apt-get update.
  2. Run apt-get install jdoo.
  3. Run apt-get upgrade.

Caution: This method doesn't overwrite the target image slot, but the disk image does occupy a lot of the disk space shared by both Cumulus Linux image slots.

New Install or Upgrading from Versions Older than 2.5.0

If you are upgrading from a version older than 2.5.0, or installing Cumulus Linux for the first time, choose one of the following methods. They are ordered from the most recommended method to least recommended.

  • Download Cumulus Linux 2.5.2 - Final Latest Version from the Downloads page of the Cumulus Networks website, then use cl-img-install to install the software.

    Warning: This method overwrites the target image slot, so if you want to preserve your configuration, you should create a persistent configuration on /mnt/persist.

  • Download Cumulus Linux 2.5.2 - Final Latest Version from the Downloads page of the Cumulus Networks website, then use ONIE to perform a complete install, following the instructions in the quick start guide.

    Warning: This method is destructive; any configuration files on the switch will not be saved, so please copy them to a different server before upgrading via ONIE.

Enabling Quagga

There is no SNMP support for Quagga in this release (see RN-88 below). For this reason, you must remove all references to smux from each of the following configuration files, and you must do so before upgrading Cumulus Linux using apt-get. If smux entries are present in the configuration files, the daemons in the Quagga version packaged with 2.5 will not start.

  1. cd /etc/quagga
  2. grep smux *
  3. Delete all lines in the config files containing the smux keyword.

The references to smux that must be removed are:

  • In bgpd.conf, remove this line:
    smux peer 1.3.6.1.4.1.3317.1.2.2 quagga_bgpd
  • In ospf6d.conf, remove this line:
    smux peer 1.3.6.1.4.1.3317.1.2.6 quagga_ospf6d
  • In ospfd.conf, remove this line:
    smux peer 1.3.6.1.4.1.3317.1.2.5 quagga_ospfd
  • In zebra.conf, remove this line:
    smux peer 1.3.6.1.4.1.3317.1.2.1 quagga_zebra
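
The manual steps above can be scripted. The following is a minimal sketch, not a Cumulus-provided tool: the function name strip_smux and the .bak backup suffix are assumptions made for illustration, and the function only looks at files ending in .conf.

```shell
# strip_smux [DIR]: delete every line containing "smux" from DIR/*.conf.
# DIR defaults to /etc/quagga. A .bak copy is kept for each file that changes.
# Hypothetical helper for illustration; not part of Cumulus Linux.
strip_smux() {
    dir="${1:-/etc/quagga}"
    for f in "$dir"/*.conf; do
        [ -f "$f" ] || continue
        if grep -q smux "$f"; then
            cp "$f" "$f.bak"
            # Rewrite the file in place so its owner and mode are preserved.
            grep -v smux "$f.bak" > "$f" || :
        fi
    done
    return 0
}
```

Run it as root before the upgrade, for example with `strip_smux /etc/quagga`, then confirm that `grep smux /etc/quagga/*.conf` prints nothing.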

Perl, Python and BDB Modules

Any Perl scripts that use the DB_File module or Python scripts that use the bsddb module won't run under Cumulus Linux 2.5.2.

What's New in Cumulus Linux 2.5.2

Cumulus Linux 2.5.2 supports these new features:

  • Resilient hashing, which ensures that when a member in an ECMP link group fails, flows on that link are moved to the unaffected links in the group, and existing flows on the unaffected links remain.
  • jdoo replaces monit, for monitoring hardware. jdoo forks monit version 5.2.5. Read this knowledge base article for more information about upgrading to jdoo.
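
To illustrate the resilient hashing behavior described above, here is a toy model in shell. It is only a sketch of the general idea (a fixed table of hash buckets in which only the buckets that pointed at the failed member are rewritten). The bucket count, the function names and the round-robin assignment are assumptions made for the example; they do not describe how the switch hardware implements the feature.

```shell
# build_table BUCKETS HOP...: print one next hop per bucket, assigned round-robin.
# Toy model only; bucket count and hop names are illustrative.
build_table() {
    buckets="$1"; shift
    echo "$@" | awk -v buckets="$buckets" '{
        for (i = 0; i < buckets; i++) {
            idx = (i % NF) + 1
            print $idx
        }
    }'
}

# fail_member FAILED SURVIVOR...: read a bucket table on stdin and rewrite only
# the buckets that pointed at FAILED, spreading them over the survivors.
# Every other bucket is printed unchanged, so flows hashed to it do not move.
fail_member() {
    failed="$1"; shift
    awk -v failed="$failed" -v survivors="$*" '
        BEGIN { n = split(survivors, s, " "); k = 0 }
        $1 == failed { print s[(k++ % n) + 1]; next }
        { print }
    '
}
```

For example, `build_table 8 swp1 swp2 swp3 swp4 | fail_member swp2 swp1 swp3 swp4` changes only the two buckets that previously mapped to swp2.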

Experimental Features

The following experimental features are included in Cumulus Linux 2.5.2:

Documentation

You can read the technical documentation here.

Issues Fixed in Cumulus Linux 2.5.2

The following is a list of issues fixed in Cumulus Linux 2.5.2 from earlier versions of Cumulus Linux.

Release Note ID Summary
RN-259 HwIfInErrors encountered, link stays up
RN-261 Quagga/Zebra: Interface remains DOWN while kernel says it is UP
RN-262 Buffer utilization monitoring not implemented in Cumulus Linux on Trident II platforms
RN-263 Onlink route does not get installed when one link is down
RN-264 BGP advertises prefix for down interface with import-check enabled if default route exists
RN-265 ledasm error encountered when booting a QuantaMesh T5048-LY9

Known Issues in Cumulus Linux 2.5.2

Issues are categorized for easy review. Some issues are already fixed, but the fixes will be available only in a later release.

Release Note ID Summary Description
RN-4 ifup/ifdown must be used for interfaces with IPv6 addresses defined in /etc/network/interfaces, otherwise the global IPv6 address will not be restored

Two scenarios are shown below; one with ifup/ifdown, the other with ifconfig down.

With ifup/ifdown:
 swp1 Link encap:Ethernet HWaddr 44:38:39:00:01:81
 inet addr:11.0.0.2 Bcast:11.0.0.255 Mask:255.255.255.0
 inet6 addr: fe80::4638:39ff:fe00:181/64 Scope:Link
 inet6 addr: fec0:1000:1000:1000::2/10 Scope:Site
 UP BROADCAST RUNNING MULTICAST MTU:1500 Metric:1
 RX packets:4231 errors:0 dropped:0 overruns:0 frame:0
 TX packets:4342 errors:0 dropped:0 overruns:0 carrier:0
 collisions:0 txqueuelen:500
 RX bytes:412115 (402.4 KiB) TX bytes:425688 (415.7 KiB)

cumulus@switch$ sudo ifdown swp1
cumulus@switch$ sudo ifconfig swp1
 swp1 Link encap:Ethernet HWaddr 44:38:39:00:01:81
 BROADCAST MULTICAST MTU:1500 Metric:1
 RX packets:4248 errors:0 dropped:0 overruns:0 frame:0
 TX packets:4356 errors:0 dropped:0 overruns:0 carrier:0
 collisions:0 txqueuelen:500
 RX bytes:413990 (404.2 KiB) TX bytes:427074 (417.0 KiB)
cumulus@dni-7448-13$ sudo ifup swp1
ADDRCONF(NETDEV_UP): swp1: link is not ready
cumulus@switch$ sudo ifconfig swp1
ADDRCONF(NETDEV_CHANGE): swp1: link becomes ready
 swp1 Link encap:Ethernet HWaddr 44:38:39:00:01:81
 inet addr:11.0.0.2 Bcast:11.0.0.255 Mask:255.255.255.0
 inet6 addr: fe80::4638:39ff:fe00:181/64 Scope:Link
 inet6 addr: fec0:1000:1000:1000::2/10 Scope:Site
 UP BROADCAST RUNNING MULTICAST MTU:1500 Metric:1
 RX packets:4250 errors:0 dropped:0 overruns:0 frame:0
 TX packets:4362 errors:0 dropped:0 overruns:0 carrier:0
 collisions:0 txqueuelen:500
 RX bytes:414178 (404.4 KiB) TX bytes:427610 (417.5 KiB)
cumulus@switch$

With ifconfig down:
cumulus@switch$ sudo ifconfig swp1
 swp1 Link encap:Ethernet HWaddr 44:38:39:00:01:81
 inet addr:11.0.0.2 Bcast:11.0.0.255 Mask:255.255.255.0
 inet6 addr: fe80::4638:39ff:fe00:181/64 Scope:Link 
 inet6 addr: fec0:1000:1000:1000::2/10 Scope:Site
 UP BROADCAST RUNNING MULTICAST MTU:1500 Metric:1
 RX packets:98 errors:0 dropped:0 overruns:0 frame:0
 TX packets:111 errors:0 dropped:0 overruns:0 carrier:0
 collisions:0 txqueuelen:500 
 RX bytes:13310 (12.9 KiB) TX bytes:12786 (12.4 KiB)

cumulus@switch$ sudo ifconfig swp1 down
cumulus@switch$ sudo ifconfig swp1
 swp1 Link encap:Ethernet HWaddr 44:38:39:00:01:81
 inet addr:11.0.0.2 Bcast:11.0.0.255 Mask:255.255.255.0
 BROADCAST MULTICAST MTU:1500 Metric:1
 RX packets:126 errors:0 dropped:0 overruns:0 frame:0
 TX packets:138 errors:0 dropped:0 overruns:0 carrier:0
 collisions:0 txqueuelen:500
 RX bytes:16998 (16.5 KiB) TX bytes:15998 (15.6 KiB)
cumulus@switch$ sudo ifconfig swp1 up
ADDRCONF(NETDEV_UP): swp1: link is not ready
cumulus@switch$ sudo ifconfig swp1
ADDRCONF(NETDEV_CHANGE): swp1: link becomes ready
 swp1 Link encap:Ethernet HWaddr 44:38:39:00:01:81
 inet addr:11.0.0.2 Bcast:11.0.0.255 Mask:255.255.255.0
 inet6 addr: fe80::4638:39ff:fe00:181/64 Scope:Link
 UP BROADCAST RUNNING MULTICAST MTU:1500 Metric:1
 RX packets:130 errors:0 dropped:0 overruns:0 frame:0
 TX packets:149 errors:0 dropped:0 overruns:0 carrier:0
 collisions:0 txqueuelen:500
 RX bytes:17474 (17.0 KiB) TX bytes:17154 (16.7 KiB)
RN-10 cl-phy-update doesn't support aggregated ports

Ports can be aggregated into a larger interface in Cumulus Linux. However, cl-phy-update does not yet support aggregated ports.

If there are any ganged ports during a software upgrade, it is recommended that you ungang these ports.
RN-52 Parameters like the router ID and DR priority cannot be changed while OSPFv2/v3 is running

Router ID and DR priority can only be changed by shutting down OSPFv2/v3, changing the value, and restarting the OSPF process.

A change to the DR priority may not properly be reflected in the LSAs that are still aging out.
RN-56 IPv4/IPv6 forwarding disabled mode not recognized

If either of the following is configured:

 net.ipv4.ip_forward == 0 

or:

 net.ipv6.conf.all.forwarding == 0 

The hardware still forwards packets if there is a neighbor table entry pointing to the destination.

RN-58 IPv6 route is installed and active in the routing table when the associated interface is down

If an IPv6 address is assigned to a "down" interface, the associated route is still installed into the route table.

Also, the type of IPv6 address doesn't matter. Link local, site local, and global all exhibit the same problem.

If the interface is bounced up and down, then the routes are no longer in the route table.
RN-61 BGP4 notifications missing for several conditions

In certain conditions, Quagga bgpd silently closes the peering without sending a notification, for example, when BGP receives a message with an invalid message type or an invalid message length.

Ideally, in any of these cases, bgpd should send a notification message to the peer.

General functionality of BGP4 is not affected.
RN-64 Configuring route-reflector-client requires specific order

When configuring a router to be a route reflector client, the Quagga configuration must be specified in a specific order; otherwise, the router will not be a route reflector client.

The "neighbor <IPv4/IPv6> route-reflector-client" command must be entered after the "neighbor <IPv4/IPv6> activate" command; otherwise, the route-reflector-client command is ignored.

Sample configuration:
 router bgp 65000
 bgp router-id 0.0.0.4 
 bgp log-neighbor-changes
 no bgp default ipv4-unicast
 bgp cluster-id 0.0.0.4 
 bgp bestpath as-path multipath-relax 
 redistribute connected 
 neighbor 14.0.0.1 remote-as 65000 
 neighbor 14.0.0.1 route-reflector-client 
 neighbor 14.0.0.1 activate 
 neighbor 14.0.0.1 next-hop-self 
 neighbor 14.0.0.9 remote-as 65000 
 neighbor 14.0.0.9 activate 
 neighbor 14.0.0.9 next-hop-self 
 neighbor 2001:ded:beef::1 remote-as 65000 
 neighbor 2001:ded:beef:2::1 remote-as 65000 
 maximum-paths 4 
 maximum-paths ibgp 4 
 ! 
 address-family ipv6 
 redistribute connected 
 neighbor 2001:ded:beef::1 activate 
 neighbor 2001:ded:beef::1 next-hop-self 
 neighbor 2001:ded:beef:2::1 route-reflector-client 
 neighbor 2001:ded:beef:2::1 activate 
 neighbor 2001:ded:beef:2::1 next-hop-self 
 maximum-paths 4 
 maximum-paths ibgp 4 
 exit-address-family 

At runtime:
 cumulus@switch:$ show ip bgp neighbor 14.0.0.1 
 BGP neighbor is 14.0.0.1, remote AS 65000, local AS 65000, internal link
 BGP version 4, remote router ID 0.0.0.6 
 BGP state = Established, up for 00:23:49
 Last read 23:31:36, hold time is 180, keepalive interval is 60 seconds 
 Neighbor capabilities: 
 4 Byte AS: advertised and received 
 Route refresh: advertised and received(old & new)
 Address family IPv4 Unicast: advertised and received 
 Message statistics: 
 Inq depth is 0 
 Outq depth is 0
 Sent Rcvd 
 Opens: 2 0
 Notifications: 0 0
 Updates: 1 1
 Keepalives: 25 24 
 Route Refresh: 0 0
 Capability: 0 0 
 Total: 28 25 
 Minimum time between advertisement runs is 5 seconds
 For address family: IPv4 Unicast 
 >>>>>>>>>>>>>>>>>>>>>> ROUTE REFLECTOR CLIENT NOT DISPLAYED 
 NEXT_HOP is always this router 
 Community attribute sent to this neighbor(both) 
 6 accepted prefixes 
 Connections established 1; dropped 0 
 Last reset never 
 Local host: 14.0.0.2, Local port: 179 
 Foreign host: 14.0.0.1, Foreign port: 40290 
 Nexthop: 14.0.0.2 
 Nexthop global: 2001:ded:beef::2 
 Nexthop local: fe80::202:ff:fe00:4
 BGP connection: non shared network
 Read thread: on Write thread: off 
 cumulus@switch:$ 

Workaround:
 Define the commands in the following order:
 address-family ipv4 unicast
 neighbor 14.0.0.9 activate 
 neighbor 14.0.0.9 next-hop-self
 neighbor 14.0.0.9 route-reflector-client >>> Must be after Activate 
 exit-address-family 
 neighbor 2001:ded:beef:2::1 remote-as 65000
 address-family ipv6 unicast 
 redistribute connected
 maximum-paths 4 
 maximum-paths ibgp 4 
 neighbor 2001:ded:beef:2::1 activate 
 neighbor 2001:ded:beef:2::1 next-hop-self 
 neighbor 2001:ded:beef:2::1 route-reflector-client >>> Must be after activate 
 exit-address-family 
Runtime status after change:

 cumulus@switch:$ show ip bgp neighbors 14.0.0.9
 BGP neighbor is 14.0.0.9, remote AS 65000, local AS 65000, internal link
 BGP version 4, remote router ID 0.0.0.7
 BGP state = Established, up for 00:13:59
 Last read 22:35:13, hold time is 180, keepalive interval is 60 seconds
 Neighbor capabilities:
 4 Byte AS: advertised and received
 Route refresh: advertised and received(old & new)
 Address family IPv4 Unicast: advertised and received
 Message statistics:
 Inq depth is 0
 Outq depth is 0
 Sent Rcvd
 Opens: 1 1
 Notifications: 0 0
 Updates: 2 1
 Keepalives: 15 14
 Route Refresh: 0 0
 Capability: 0 0
 Total: 18 16
 Minimum time between advertisement runs is 5 seconds
 For address family: IPv4 Unicast
 Route-Reflector Client >>>>>>>>>> PLEASE NOTE ME
 NEXT_HOP is always this router
 Community attribute sent to this neighbor(both)
 6 accepted prefixes
 Connections established 1; dropped 0
 Last reset never
 Local host: 14.0.0.10, Local port: 38813
 Foreign host: 14.0.0.9, Foreign port: 179
 Nexthop: 14.0.0.10
 Nexthop global: 2001:ded:beef:2::2
 Nexthop local: fe80::202:ff:fe00:6
 BGP connection: non shared network
 Read thread: on Write thread: off
 cumulus@switch:$
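
The ordering rule for RN-64 can be checked mechanically. The sketch below is an illustration only: check_rr_order is a hypothetical helper, not a Quagga or Cumulus Linux command. It assumes a plain-text Quagga BGP configuration file in which each command sits on its own line, and it treats every "router" and "address-family" block as a separate context.

```shell
# check_rr_order FILE: report each neighbor whose route-reflector-client line
# appears before its activate line in the same block of the configuration.
# Hypothetical helper for illustration; prints nothing when the order is fine.
check_rr_order() {
    awk '
        $1 == "router" || $1 == "address-family" || $1 == "exit-address-family" {
            split("", rr)   # new block: forget earlier route-reflector-client lines
        }
        $1 == "neighbor" && $3 == "route-reflector-client" { rr[$2] = NR }
        $1 == "neighbor" && $3 == "activate" && ($2 in rr) {
            printf "%s: route-reflector-client (line %d) precedes activate (line %d)\n", $2, rr[$2], NR
        }
    ' "$1"
}
```

Running it against a file that contains the sample configuration for this issue would report neighbor 14.0.0.1, whose route-reflector-client line comes before its activate line.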
RN-68 Blackhole/Unreachable/Prohibit route addition in IPv6 returns corresponding error codes

IPv6 route operations indicate the destination action via returned error codes. In the example shown below where an unreachable route is being added, the return code is:
 #define ENETUNREACH 101 /* Network is unreachable */
cumulus@switch:$ sudo ip addr add 9000:1000:1000:1000::1/80 dev lo
cumulus@switch:$ sudo ip -6 route
unreachable 9000:1000:1000:1000::/80 dev lo proto kernel metric 256 error -101
RN-70 ACL: Bridge traffic that matches a LOG ACTION rule is not logged in syslog

For example, a bridge with switch ports swp1, swp2, swp3 as bridge members is configured. ACL rules to LOG and DROP for icmp traffic are configured.

Ping requests are sent from host1 on swp1 to host3 on swp3, and the following was observed:
* Counters for both LOG and DROP ACL rules are incrementing properly, but the packets are not showing up on /var/log/syslog.
* Packets that are copied to the CPU from hardware for the LOG rule are dropped due to the check in kernel to disable software bridging for hardware bridged packets.
RN-77 New routes/ECMPs can evict existing/installed

Cumulus Linux syncs routes between the kernel and the switching silicon. If the required resource pools in hardware fill up, new kernel routes can cause existing routes to move from being fully allocated to being partially allocated.

In order to avoid this, routes in the hardware should be monitored and kept below the ASIC limits.

For example, on systems with Trident+ chips, the limits are as follows:
 routes: 16384 <<<< if all routes are ipv4 
 long mask routes 256 <<<< i.e., routes with a mask longer than the route mask limit 
 route mask limit 64
 host_routes: 8192 
 ecmp_nhs: 4044 
 ecmp_nhs_per_route: 52 
That translates to about 77 routes with ECMP NHs, if every route has the maximum ECMP NHs.

Monitoring this in Cumulus Linux is performed via the cl-resource-query command:
 cumulus@switch:~# sudo cl-resource-query
 hosts : 3 
 all routes : 29 
 IP4 routes : 17 
 IP6 routes : 12 
 nexthops : 3 
 ecmp_groups : 0
 ecmp_nexthops : 0
 mac entries : 0 / 131072 
 bpdu entries : 500 / 512 
The resource to monitor is the ecmp_nexthops. If this count is close to 4044, new ECMPs may evict existing routes.
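
The figure of about 77 routes comes from dividing the ECMP next hop limit by the per-route maximum: 4044 / 52 is 77 when rounded down. The sketch below shows one way to watch the remaining headroom. It is an illustration only: ecmp_headroom is a hypothetical helper, and it assumes the "name : value" output layout shown in the cl-resource-query sample above.

```shell
# Routes that fit if every route uses the maximum number of ECMP next hops.
echo $(( 4044 / 52 ))

# ecmp_headroom LIMIT: read cl-resource-query output on stdin and report how
# many ECMP next hops remain before LIMIT is reached.
# Hypothetical helper; exits non-zero if no ecmp_nexthops line is found.
ecmp_headroom() {
    awk -v limit="$1" -F: '
        $1 ~ /^[ \t]*ecmp_nexthops[ \t]*$/ {
            used = $2 + 0
            printf "ecmp_nexthops used=%d limit=%d free=%d\n", used, limit, limit - used
            found = 1
        }
        END { if (!found) exit 1 }
    '
}
```

On a Trident+ switch this could be run as `sudo cl-resource-query | ecmp_headroom 4044`.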
RN-88 SNMP support for Quagga is NOT provided in Cumulus Linux

Cumulus Linux does not provide SNMP support for Quagga.
RN-120 ethtool LED blinking does not work with switch ports

Linux uses ethtool -p to identify the physical port backing an interface, or to identify the switch itself. Usually this identification is done by blinking the port LED until ethtool -p is stopped.

This feature does not apply to switch ports (swpX) in Cumulus Linux.
RN-121 PTMD: When a physical interface is in a PTM FAIL state, its subinterface still exchanges information

Issue:
When PTMD incorrectly reports a failure state for a physical interface (PIF) and the Zebra interface is enabled, BGP sessions on the PIF do not establish routes, but the subinterfaces on top of it do.

If the subinterface is configured on the physical interface and the physical interface is incorrectly marked as being in a PTM FAIL state, routes on the physical interface are not processed in Quagga, but the subinterface is working.

Steps to reproduce:
cumulus@switch:$ sudo vtysh -c 'show int swp8' 
Interface swp8 is up, line protocol is up
PTM status: fail
index 10 metric 1 mtu 1500
flags: <UP,BROADCAST,RUNNING,MULTICAST>
HWaddr: 44:38:39:00:03:88
inet 12.0.0.225/30 broadcast 12.0.0.227
inet6 2001:cafe:0:38::1/64
inet6 fe80::4638:39ff:fe00:388/64
cumulus@switch:$ ip addr show | grep swp8
10: swp8: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc pfifo_fast state UP qlen 500
inet 12.0.0.225/30 brd 12.0.0.227 scope global swp8
104: swp8.2049@swp8: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc noqueue state UP
inet 12.0.0.229/30 brd 12.0.0.231 scope global swp8.2049
105: swp8.2050@swp8: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc noqueue state UP
inet 12.0.0.233/30 brd 12.0.0.235 scope global swp8.2050
106: swp8.2051@swp8: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc noqueue state UP
inet 12.0.0.237/30 brd 12.0.0.239 scope global swp8.2051
107: swp8.2052@swp8: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc noqueue state UP
inet 12.0.0.241/30 brd 12.0.0.243 scope global swp8.2052
108: swp8.2053@swp8: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc noqueue state UP
inet 12.0.0.245/30 brd 12.0.0.247 scope global swp8.2053
109: swp8.2054@swp8: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc noqueue state UP
inet 12.0.0.249/30 brd 12.0.0.251 scope global swp8.2054
110: swp8.2055@swp8: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc noqueue state UP
inet 12.0.0.253/30 brd 12.0.0.255 scope global swp8.2055
cumulus@switch:$
bgp sessions:
12.0.0.226 ,4 ,64057 , 958 , 1036 , 0 , 0 , 0 ,15:55:42, 0, 10472
12.0.0.230 ,4 ,64058 , 958 , 1016 , 0 , 0 , 0 ,15:55:46, 187, 10285
12.0.0.234 ,4 ,64059 , 958 , 1049 , 0 , 0 , 0 ,15:55:40, 187, 10285
12.0.0.238 ,4 ,64060 , 958 , 1039 , 0 , 0 , 0 ,15:55:45, 187, 10285
12.0.0.242 ,4 ,64061 , 958 , 1014 , 0 , 0 , 0 ,15:55:46, 187, 10285
12.0.0.246 ,4 ,64062 , 958 , 1016 , 0 , 0 , 0 ,15:55:46, 187, 10285
12.0.0.250 ,4 ,64063 , 958 , 1029 , 0 , 0 , 0 ,15:55:43, 187, 10285
12.0.0.254 ,4 ,64064 , 958 , 1036 , 0 , 0 , 0 ,15:55:44, 187, 10285


RN-125 Network LSA with an old router ID isn't flushed out by the originator

Issue:
When the router ID is changed, the router should remove the previous network LSA (link-state advertisement) that it generated based on the IP address on the interface in the Network LSA.

Resolution:
Cumulus Linux doesn't remove this LSA, so it ages out naturally.
RN-132 You must run "apt-get update" before running any apt-get commands or after changing sources.list

Before running any apt-get commands or after changing the sources.list file in /etc/apt, you need to run apt-get update.

RN-133 Interface names in Cumulus Linux cannot exceed 15 characters

Device names, including interface names, in Cumulus Linux cannot exceed 16 characters, including the NUL terminator, which leaves 15 usable characters. Cumulus Linux truncates longer interface names.

To avoid this issue, do not assign long names to your interfaces.

The following example configuration reproduces this issue:

cumulus@switch:/sys/class/net$ grep 'iface br' /etc/network/interfaces 
iface br2-pubmgmt inet static
iface br3-prvmgmt inet manual
iface br400-quarantine inet manual
iface br401-peering-1k5 inet manual
iface br402-peering-9k inet manual
iface br500-pi-exa inet manual
iface br501-akamai-exa inet manual
iface br502-exa-internetfactory inet manual
cumulus@switch:/sys/class/net$ brctl show | grep br
bridge name	bridge id	 STP enabled	interfaces
br2-pubmgmt	 8000.089e01cebe37	no	 bond0.2
br3-prvmgmt	 8000.089e01cebe3a	no	 bond0.3
br400-quarantin	 8000.089e01cebe37	no	 bond0.400
br401-peering-1	 8000.089e01cebe3a	no	 bond0.401 <<<
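
Names that would be truncated can be caught before they are applied. The following is a minimal sketch, assuming the interfaces are declared with "iface NAME ..." lines as in the example above; long_ifaces is a hypothetical helper name, not a Cumulus Linux command.

```shell
# long_ifaces FILE: list interface names longer than 15 characters found on
# "iface" lines, followed by their length. Hypothetical helper for illustration.
long_ifaces() {
    awk '$1 == "iface" && length($2) > 15 { print $2, length($2) }' "$1"
}
```

For example, `long_ifaces /etc/network/interfaces` prints nothing when every name fits.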
RN-134 Installing Chef under Cumulus Linux

The Cumulus Linux 2.2 repository contains two versions of Chef, the automation tool: 11.6.2 (the current version) and 10.30.4.

To install the latest version, connect to the switch and use apt-get:

cumulus@switch:~# sudo apt-get install chef

To install 10.30.4, connect to the switch and use apt-get:

cumulus@switch:~# sudo apt-get install chef=10.30.4-0.debian.7.3 
RN-162 Priority Flow Control doesn't work on Trident II switches

Priority Flow Control (PFC) configuration is not correct for switches on the Trident II platform. As a result, PFC doesn't work.

There is no workaround at this time.

RN-163 VXLAN: ovsdb-server cannot select loopback interface as source IP address, causing TOR registration to the controller to fail

In a VXLAN using VMware NSX, ovsdb-server cannot select the loopback interface as the source IP address. This causes TOR registration to the controller to fail.

To work around this issue, run:

cl-bgp redistribute add connected
RN-165 Quanta LY6 switch has memory parity error _soc_mem_array_sbusdma_read: L2_ENTRY.ipipe0 failed(ERR)

On a Quanta LY6 switch, you may see some memory parity errors in the switchd log that look like this:

switchd.log.1.gz:1402208356.479833 2014-06-08 06:19:16 
 sync.c:2803 IPv4 Route Summary (90) : 0 Added, 1 Deleted, 0 Updated in 30943 usecs
switchd.log.1.gz:1402208356.521537 2014-06-08 06:19:16 
 hal_acl_bcm.c:2352 ACL: installation succeeded, switched over
switchd.log.1.gz:1402208357.679543 2014-06-08 06:19:17 sync.c:2803 
 IPv4 Route Summary (91) : 1 Added, 299 Deleted, 0 Updated in 220336 usecs
switchd.log.1.gz:1402208357.719811 2014-06-08 06:19:17 
 hal_acl_bcm.c:2352 ACL: installation succeeded, switched over
switchd.log.1.gz:1402208359.486625 2014-06-08 06:19:19 sync.c:2803 
 IPv4 Route Summary (92) : 299 Added, 0 Deleted, 0 Updated in 200425 usecs
switchd.log.1.gz:1402208361.042769 2014-06-08 06:19:21 
 hal_acl_bcm.c:2352 ACL: installation succeeded, switched over
switchd.log.1.gz:1402208419.059525 2014-06-08 06:20:19 hal_bcm.c:408 
 caught a parity error of type parity data error: 0x4000001, 0x1c0043cd
switchd.log.1.gz:1402208419.059594 2014-06-08 06:20:19 switchd.c:533 
 No switchd restart: restart trigger has been disabled
switchd.log.1.gz:1402208419.061002 2014-06-08 06:20:19 hal_bcm.c:408 
 caught a parity error of type corrected data error: 0x7d6, 0x43cd
switchd.log.1.gz:1402208419.061032 2014-06-08 06:20:19 switchd.c:533 
 No switchd restart: restart trigger has been disabled
switchd.log.1.gz:1402208419.061091 2014-06-08 06:20:19 
 hal_bcm_console.c:169 WARN STATUS: 0x00000083
switchd.log.1.gz:OPCODE: 0x1c110200
switchd.log.1.gz:START ADDR: 0x04790180
switchd.log.1.gz:CUR ADDR: 0x1c004394
switchd.log.1.gz:_soc_mem_array_sbusdma_read: L2_ENTRY.ipipe0 failed(ERR)
switchd.log.1.gz:H/W received sbus nack with error bit set.
switchd.log.1.gz:Unit: 0 
switchd.log.1.gz:
switchd.log.1.gz:Mem: Parity error..
switchd.log.1.gz:Error in: SBUS transaction.
switchd.log.1.gz:Blk: 1, Pipe: 0, Address: 0x1c0043cd, base: 0x0, stage: 7, index: 17357

While troubleshooting this issue, the error occurred only once. If you encounter this error more than once, please submit a support request.

RN-179

10GTek 10G SR cables exhibit high rate of errors on Penguin Arctica 4804X switch

Some PHY-less Penguin Arctica 4804X platforms using 10GTek 10G MM SR cables exhibit high rates of errors and low bandwidth in one direction.

RN-183

Link not coming up (NO-CARRIER) on Penguin Arctica 4804xp with CAB-10GSFP-P9M 10Gtek 9 meter cable

The link does not come up on a Penguin Arctica 4804xp 720G PHY-less switch with 10Gtek 9 meter cables (CAB-10GSFP-P9M). You can determine this by running:

root@switch:~# ip link show swp15
17: swp15: <NO-CARRIER,BROADCAST,MULTICAST,UP> mtu 1500 qdisc pfifo_fast state DOWN mode DEFAULT qlen 500
    link/ether 44:38:39:00:70:a1 brd ff:ff:ff:ff:ff:ff

Cumulus Linux supports copper passive cables no longer than 7m.

RN-187

Rolling back installation causes Debian packages to be unusable

When you run apt-get install, Cumulus Linux creates a new snapshot automatically if at least one file is added to /etc. If you roll back the installation using cl-persistify, the installer reverts the /etc directory to the specific snapshot, but does not do anything with installed packages.

As a result, these packages will be unusable because of missing configuration files. The Cumulus Linux installer does not record which Debian packages changed after a rollback. As a workaround, you can try installing again.

RN-192

PTM may crash when a large topology file has a syntax error

If there is a syntax error in a large (~ 4000 entries) topology.dot file, PTM may crash while reading/parsing this file. This can occur because the libcgraph API that PTM uses for parsing the file cannot handle the error. You should check topology.dot for syntax errors using a Graphviz tool like dot or dotty that can identify the syntax error.

RN-196 For a VXLAN in NSX, ovsdb-server cannot select a loopback interface as the SRC IP

As a result, the TOR registration to the controller fails.

To work around this issue, run:

cl-bgp redistribute add connected
RN-198 Port LEDs behave differently on different switch models

It's been observed that port LEDs behave differently depending upon the make and model of the switch. For example:

  • Agema AG-7448CU: the LED is off when the link is up. It blinks on briefly when there is traffic.
  • Edge-Core AS4600-54T: the LED is off when the link is up. It blinks on briefly when there is traffic.
  • QuantaMesh T3048-LY2R: the LED is on when the link is up. It blinks off briefly when there is traffic.

Cumulus Networks is currently working to fix this issue.

RN-199 When a Quagga route-map is modified, the switch could use the partial map before edits are completed

Cumulus Linux triggers a route-map update before the user finishes editing the route map, resulting in an incorrect route map being used. The route-map update trigger should only occur when the user finishes editing the map.

Cumulus Networks is working to fix this issue.

RN-200 On an Edge-Core AS5610-52X switch with long reach QSFP cables, link stays down

To work around this issue, create a file in /etc/network/ called qsfphp, and populate it with the content below. Then run the following for each swp with these high-powered cables:

qsfphp swp#

The contents of qsfphp:

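#!/bin/bash
# qsfphp: apply the software power override to a QSFP module so that it can
# operate in high power (>1.5W) mode.

usage() {
    echo "Software override power settings for QSFP cables allowing high power"
    echo "(>1.5W) operation."
    echo "Usage: $0 swp#"
    exit 1
}

# Exactly one argument, the switch port name, is required.
if [[ "$#" -ne 1 ]]; then
    usage
fi

interface="$1"

# Only switch ports (swpN) are accepted.
if [[ ${interface:0:3} != "swp" ]]; then
    usage
fi

# Find the EEPROM device whose label matches this port number.
port=${interface:3}
eeprom_dev=$(grep -l "^port${port}\$" /sys/class/eeprom_dev/*/label)

if [[ -z "$eeprom_dev" ]]; then
    echo "no such interface: $interface"
    exit 1
fi

eeprom="$(dirname "$eeprom_dev")/device/eeprom"

# Byte 0 of the module EEPROM is the identifier; 0x0d is treated as QSFP here.
identifier=$(dd if="$eeprom" bs=1 count=1 2>/dev/null | hexdump -e '/1 "0x%02x"')

if [[ "$identifier" != "0x0d" ]]; then
    echo "$interface is not a QSFP cable"
    exit 1
fi

# Write 0x01 to byte 93 of the EEPROM to apply the power override.
if echo -en '\x01' | dd of="$eeprom" seek=93 bs=1 count=1 2>/dev/null; then
    echo done
else
    echo failed
    exit 1
fi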
RN-210 Restarting or clearing BGP can lead to SYN flooding warning, triggering an unnecessary alarm

When you run clear bgp * or restart BGP, this message can appear on the console (in vtysh):

179 dynamic neighbor(s), limit 90
switch07# clear bgp *
switch07#TCP: Possible SYN flooding on port 179. Sending cookies. Check SNMP counters.

The warning also appears in syslog:

Sep 18 17:27:40 switch07 kernel: TCP: Possible SYN flooding on port 179. Sending cookies. Check SNMP counters.

This issue occurs when the maximum number of socket connections that can be accepted on a listening socket is reached.

You can safely ignore this warning. To silence it, you can increase the number of socket connections by changing the value of net.core.somaxconn, which defaults to 128. If you increase net.core.somaxconn to a larger number, the issue no longer occurs.
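
As a sketch of that change, the helper below reads the current limit and prints the sysctl command that would raise it, without changing anything itself. The function name suggest_somaxconn and the target value 1024 used in the example are assumptions for illustration; these release notes do not recommend a specific value.

```shell
# suggest_somaxconn TARGET [PROCFILE]: compare the current listen backlog limit
# with TARGET and print the command that would raise it, if needed.
# PROCFILE defaults to the kernel's setting; the override exists for testing.
suggest_somaxconn() {
    target="$1"
    procfile="${2:-/proc/sys/net/core/somaxconn}"
    current=$(cat "$procfile") || return 1
    if [ "$current" -ge "$target" ]; then
        echo "net.core.somaxconn is already $current"
    else
        echo "sudo sysctl -w net.core.somaxconn=$target"
    fi
}
```

For example, `suggest_somaxconn 1024` prints the command to run. A value set with sysctl -w lasts only until the next reboot.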

RN-216 Broadcast traffic in a bridge gets sent to the CPU even if the bridge doesn't have SVI enabled

Any time a bridge or VLAN interface is created, like bridge br0 or VLAN br0.100, that interface enables a switch virtual interface (SVI) and broadcast traffic in the bridge will be sent up to CPU.

With a VLAN-unaware bridge, a bridge interface is necessary to do bridging. Thus, an SVI gets automatically created, and broadcast traffic is always sent to the CPU.

Normally, with a VLAN-aware bridge, the SVI gets enabled only when a VLAN (like br0.100) is created, so the broadcast traffic is not always sent to the CPU.

Note that for both VLAN-aware and VLAN-unaware bridges, this behavior applies only to broadcast traffic and nothing else.

However, there is a known issue where broadcast traffic will get sent to the CPU even if a VLAN interface is not configured in a VLAN-aware bridge. It will be fixed in a future Cumulus Linux release.

RN-217 LNV: Network restart removes vxsnd anycast IP address from loopback interface

Given the following conditions:

  • You have not configured a loopback anycast IP address in /etc/network/interfaces
  • You enabled the vxsnd (service node daemon) log to automatically add anycast IP addresses

When you restart networking (with service networking restart), the anycast IP address gets removed from the loopback interface.

To prevent this issue from occurring, you should specify an anycast IP address for the loopback interface in both /etc/network/interfaces and vxsnd.conf. This way, in case the vxsnd fails, you can withdraw the IP address.

RN-218 On Quanta T5048-LY8 and T3048-LY9 switches, "Operation timed out" error occurs while removing and reinserting QSFP module

The QSFPx2 module cannot be removed while the switch is powered on, as it is not hot-swappable.

RN-221 BGP graceful restart, including helper mode, not fully supported

If you encounter issues with this, please submit a support request and include the output from cl-support with your ticket.

RN-227 BGP dynamic capability is not supported

BGP peer sessions with dynamic capability are not supported under any version of Cumulus Linux at this time.
RN-228 ifupdown2 doesn't bring up loopback interface (lo) by default

ifupdown2 doesn't bring up loopback interface (lo) by default.

One effect is that since monit uses loopback interfaces, if you delete the lo interface in /etc/network/interfaces, monit commands will not work.

RN-229 When a bond subinterface that is part of a non-VLAN-aware bridge is brought down, it flaps that bridge

This issue has been encountered in environments where both VLAN-aware and non-VLAN-aware bridges are in use, where a non-VLAN-aware bridge has a subinterface of a bond that is present as a normal interface in a VLAN-aware bridge.

RN-230 bond-min-links defaults to 0, which causes the bond carrier to be incorrectly up when no slaves are active

Even though bond-min-links defaults to 0, Cumulus Networks recommends a minimum link setting of 1 or higher for bonds. Cumulus Linux issues a warning when the setting is 0.
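
To find bonds that still rely on the default described in RN-230, a configuration file can be scanned for bond stanzas without a bond-min-links setting of 1 or higher. This is a rough sketch under stated assumptions: bonds_without_min_links is a hypothetical helper, and it assumes that each bond is declared in an "iface NAME" stanza that contains a bond-slaves line.

```shell
# bonds_without_min_links FILE: print the name of every bond stanza in FILE
# that has no bond-min-links line or sets it below 1.
# Hypothetical helper for illustration only.
bonds_without_min_links() {
    awk '
        function report() {
            if (name != "" && is_bond && min_links < 1) print name
        }
        $1 == "iface" { report(); name = $2; is_bond = 0; min_links = 0 }
        $1 == "bond-slaves" { is_bond = 1 }
        $1 == "bond-min-links" { min_links = $2 + 0 }
        END { report() }
    ' "$1"
}
```

For example, `bonds_without_min_links /etc/network/interfaces` lists the bonds that would benefit from an explicit bond-min-links 1 line.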
RN-244 Amphenol 1 or 2 meter DACs report linkFlapErrDisabled error

This error has been noticed when Cumulus Linux switches are connected to a Cisco Nexus 3000 series switch.

To work around this issue, disable auto-negotiation for the 40G DAC QSFP on the Cisco switch:

dev-nx3064-01(config-if)# no negotiate auto

RN-245 Copper SFP ports remain up when cable is removed

When a copper SFP 1G link is disconnected from the remote port, the link status remains up even though the link is actually down.

There is no workaround at this time.

RN-246 10GTek QSFP-SR4 optical cable on QuantaMesh BMS T3048-LY9 switches exhibits low traffic rate

For QuantaMesh BMS T3048-LY9 switches, the 10GTek QSFP-SR4 optical cable is not supported in Cumulus Linux 2.5.2.

Use the JDSU JQP-04SWAA1 or comparable Multi-Mode QSFP+ cable instead.

Cumulus Networks is working to support this transceiver in a future release.

RN-248 Dual-connected bonds never come up on both CLAG switches after restarting clagd

This issue has been seen when the clagd service is restarted while the peerlink is down on the switch in the primary role. The dual-connected bonds never come up on both CLAG switches.

To work around this issue, bring up the peerlink interface on the primary node, then restart clagd again.
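The workaround can be sketched as two commands run on the primary node. The function name recover_clag, the RUN dry-run variable, and the interface name peerlink are assumptions for illustration; substitute the actual peerlink interface name used on your switch.

```shell
#!/bin/bash
# Sketch (hypothetical helper): bring the peerlink up, then restart clagd.
# Set RUN=echo to print the commands instead of executing them.

recover_clag() {
    local peerlink="${1:-peerlink}"    # assumed interface name
    ${RUN:-} ip link set dev "$peerlink" up || return 1
    ${RUN:-} service clagd restart
}

# Dry run:
#   RUN=echo recover_clag peerlink
# Real run (as root, on the switch in the primary role):
#   recover_clag peerlink
```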

RN-249 x86 switch stuck in RC.Script failure during switchd init time

Under rare and unusual conditions on various x86 Trident II platforms (such as the Penguin Computing Arctica 4806XP), switchd does not start after system reboot and displays this error message:

MiimTimeOut:soc_miim_write, timeout (id=0x1da addr=0x1f data=0x8630)
system_init: Device reset failed: Operation timed out
PORT: Error: bcm ports not initialized
Error: file /etc/bcm.d/rc.ports_0: line 42 (error code -1): script terminated
ERROR loading rc script on unit 0
loading of rc script failed, aborting!

This condition is extremely rare, and does not occur if the switch booted correctly and is operational.

RN-250 switchd error on Quanta QuantaMesh LY9: WARN PHY8481 firmware handshake failed: u=0 p=35 status=0x0002

When switchd comes up on a Quanta QuantaMesh LY9 switch, this error appears in the switchd log:

WARN PHY8481 firmware handshake failed: u=0 p=35 status=0x0002 

This issue is currently being investigated.

RN-252 Accton AS5610-52X LR QSFP link stays down

This issue has been seen on Accton AS5610-52X switches with QSFP ports using high power optics (that is, most LR4 cables).

To work around this issue, create the following script and run it against your QSFP switch ports. For example, if you called the script qsfpfix and want to fix swp49, run:

qsfpfix swp49

Here is the script:

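#!/bin/bash
# Set the software power override on a QSFP port so that a high power
# (>1.5W) module can operate.

usage() {
    echo "Software override power settings for QSFP cables allowing high power"
    echo "(>1.5W) operation."
    echo "Usage: $0 <interface>"
    exit 1
}

# Exactly one argument is required: the switch port name, for example swp49.
if [[ "$#" -ne 1 ]]; then
    usage
fi

interface="$1"

if [[ "${interface:0:3}" != "swp" ]]; then
    usage
fi

# Find the EEPROM device whose label matches the port number.
port="${interface:3}"
eeprom_dev=$(grep -l "^port${port}\$" /sys/class/eeprom_dev/*/label)

if [[ -z "$eeprom_dev" ]]; then
    echo "no such interface: $interface"
    exit 1
fi

eeprom="$(dirname "$eeprom_dev")/device/eeprom"

# Byte 0 of the module EEPROM is the identifier; 0x0d indicates QSFP+.
identifier=$(dd if="$eeprom" bs=1 count=1 2>/dev/null | hexdump -e '/1 "0x%02x"')

if [[ "$identifier" != "0x0d" ]]; then
    echo "$interface is not a QSFP cable"
    exit 1
fi

# Write 0x01 to byte 93 of the EEPROM to enable the power override.
if echo -en '\x01' | dd of="$eeprom" seek=93 bs=1 count=1 2>/dev/null; then
    echo done
else
    echo failed
    exit 1
fi
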
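To confirm that the write was retained, the same byte can be read back. The sketch below is a hypothetical companion helper, not part of the script above; it uses od instead of hexdump, and expects the same EEPROM path that the script computes in its eeprom variable.

```shell
#!/bin/bash
# Sketch (hypothetical helper): print one byte of an EEPROM file as 0xNN.
# Arguments: path to the eeprom file, then the byte offset (93 for the
# byte written by the script above).

read_eeprom_byte() {
    local eeprom="$1" offset="$2" hex
    hex=$(od -An -tx1 -j "$offset" -N 1 "$eeprom" | tr -d ' \n')
    [ -n "$hex" ] || return 1
    echo "0x$hex"
}

# Example call, with the path left as a placeholder:
#   read_eeprom_byte "$eeprom" 93
```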
RN-266 eBGP peers cannot set global next hop via outbound route-map

BGP next hop modification to a specific value using an outbound route-map does not take effect when sending updates to eBGP peers. iBGP peers are unaffected.

This issue is currently being investigated.

RN-267 BGP: When peering on link-local addresses with no global IPv6 address, a null next hop value is sent

When BGP peering is established using link-local IPv6 addresses and there are no global IPv6 addresses configured on the peering interface, the BGP speakers may exchange updates that include an unspecified next hop value in the MP_REACH_NLRI attribute. If the receiver of such a BGP update is a device that is not running Cumulus Linux, it may flag this as an error and treat the update as an implicit withdraw.

This issue is currently being investigated.

RN-268 bridge fdb add command IP address type options inconsistent with Debian

The bridge fdb add command handles the IP address type in a non-standard way:

  • bridge fdb add [MAC address] dev [INTERFACE] master defaults to a static address
  • bridge fdb add [MAC address] dev [INTERFACE] master temp maps to dynamic address
  • bridge fdb add [MAC address] dev [INTERFACE] master local maps to permanent address

This issue should be fixed in a future version of Cumulus Linux.
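For scripts that need to account for this behavior, the keyword mapping listed for RN-268 can be captured in a small lookup. This is only a restatement of the three cases above; fdb_entry_type is a hypothetical helper and does not call the bridge command.

```shell
#!/bin/bash
# Sketch (hypothetical helper): given the keyword that follows "master" in
# a "bridge fdb add" command, print the entry type described in RN-268.
# An empty argument means no keyword was given.

fdb_entry_type() {
    case "${1:-}" in
        "")    echo static ;;
        temp)  echo dynamic ;;
        local) echo permanent ;;
        *)     echo "unknown keyword: $1" >&2; return 1 ;;
    esac
}
```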

RN-269 cl-img-install -u upgrade issues

The cl-img-install -u feature has an issue that, under certain conditions, can lead to undesirable results while upgrading software. Cumulus Networks recommends you use apt-get to upgrade Cumulus Linux.

This issue is currently being investigated.

RN-270 inotify support

inotify is not supported by the overlayfs root filesystem on PowerPC platforms.

RN-280 L2 learning disabled on peer bond when the bond subinterface is administratively brought down then up

In this situation, the kernel has the correct setting but the hardware doesn't. This can be caused by administratively bringing the bond down then up, which overwrites the learn setting on the bond member ports.

This issue will be fixed in a future release of Cumulus Linux.


RN-372 (CM-9360)
Security Update for CVE-2015-7547: glibc getaddrinfo Stack-based Buffer Overflow Vulnerability

For details on this issue and how to upgrade, read this article.

RN-402 (CM-9627)
smond PSU warnings on Supermicro X3648S

The Supermicro X3648S uses a different PSU (DPS-550) than the equivalent Penguin Arctica 4806XP switch (DPS-460). Cumulus Linux was written to work with the DPS-460, but the DPS-550 behaves differently. 

On the Penguin Arctica 4806XP, a PWM value is written to the DPS-460 fan, and the power supply fan remains at that setting. On the Supermicro X3648S, when a PWM value is written to the DPS-550 fan, the power supply fan initially goes to that setting; however, after five seconds, it may revert to its internal setting.

As a result, you will see smond warning messages that the fan input RPM is lower than expected, which can be safely ignored.
