Issue
The following error from smond is observed in /var/log/syslog on an Edge-Core AS-4600-54T switch:
2016-01-20T20:53:58.406544+09:00 switch : /usr/sbin/smond : : Temp2(P2020 CPU die sensor): state changed from OK to BAD
2016-01-20T20:53:58.406707+09:00 switch : /usr/sbin/smond : : Temp2(P2020 CPU die sensor): Following outliers were found: [16315.0] C
2016-01-20T20:55:39.117503+09:00 switch : /usr/sbin/smond : : Temp2(P2020 CPU die sensor): state changed from BAD to OK
Similarly, high readings may occasionally be observed from the same sensor, named adt7473-i2c-1-2e, under libsensors.
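To check how often the errant reading occurs, the outlier messages can be pulled out of the log with a short script. The following is a minimal sketch: the regular expression is inferred only from the sample lines shown above and is not an official smond log format, and the names OUTLIER_RE, find_outliers, and SAMPLE are made up for this illustration.

```python
import re

# Pattern inferred from the sample log lines in this article only;
# it is an assumption, not a documented smond log format.
OUTLIER_RE = re.compile(
    r"^(?P<timestamp>\S+) .*?/usr/sbin/smond\s*:\s*:\s*"
    r"(?P<sensor>\w+)\((?P<description>[^)]*)\): "
    r"Following outliers were found: \[(?P<values>[^\]]*)\] C"
)


def find_outliers(lines):
    """Return (timestamp, sensor, readings) for each outlier message found."""
    results = []
    for line in lines:
        match = OUTLIER_RE.search(line)
        if match:
            readings = [
                float(value)
                for value in match.group("values").split(",")
                if value.strip()
            ]
            results.append(
                (match.group("timestamp"), match.group("sensor"), readings)
            )
    return results


# The three log lines quoted in this article, used as sample input.
SAMPLE = [
    "2016-01-20T20:53:58.406544+09:00 switch : /usr/sbin/smond : : "
    "Temp2(P2020 CPU die sensor): state changed from OK to BAD",
    "2016-01-20T20:53:58.406707+09:00 switch : /usr/sbin/smond : : "
    "Temp2(P2020 CPU die sensor): Following outliers were found: [16315.0] C",
    "2016-01-20T20:55:39.117503+09:00 switch : /usr/sbin/smond : : "
    "Temp2(P2020 CPU die sensor): state changed from BAD to OK",
]

if __name__ == "__main__":
    for timestamp, sensor, readings in find_outliers(SAMPLE):
        print(timestamp, sensor, readings)
```

On a switch, the same function could be fed the lines of /var/log/syslog instead of SAMPLE, assuming the log lines there have the same shape as the ones quoted above.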
Environment
- Cumulus Linux 2.5.z, versions 2.5.3a through 2.5.9
- Edge-Core AS-4600-54T
Resolution
This error can safely be ignored: smond excludes the errant reading from thermal management considerations.
Note: This issue is fixed in Cumulus Linux 2.5.10.
Root Cause
Readings of approximately 16315 C are occasionally observed from Temp2 (P2020 CPU die sensor) on the AS-4600-54T. A filter was introduced in Cumulus Linux 2.5.3a to exclude these readings from thermal management considerations and to log the errant reading instead.
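The general idea of such a filter can be sketched as follows. This is an illustration only, not smond's actual implementation: the plausibility ceiling MAX_PLAUSIBLE_C and the function names split_readings and evaluate are assumptions made for this example, and the real criterion smond uses to classify a reading as an outlier is not described in this article.

```python
# Hypothetical plausibility ceiling in degrees C. The criterion smond
# really uses is not documented here; this value is an assumption.
MAX_PLAUSIBLE_C = 150.0


def split_readings(readings, max_plausible=MAX_PLAUSIBLE_C):
    """Separate plausible temperature readings from implausible outliers."""
    valid = [r for r in readings if r <= max_plausible]
    outliers = [r for r in readings if r > max_plausible]
    return valid, outliers


def evaluate(sensor, readings):
    """Return the highest plausible reading and any log messages to emit.

    Outliers are reported in a log message but never contribute to the
    temperature used for thermal decisions.
    """
    valid, outliers = split_readings(readings)
    messages = []
    if outliers:
        messages.append(
            f"{sensor}: Following outliers were found: {outliers} C"
        )
    highest = max(valid) if valid else None
    return highest, messages


if __name__ == "__main__":
    temperature, log_lines = evaluate("Temp2", [41.0, 16315.0, 42.5])
    print(temperature)
    for line in log_lines:
        print(line)
```

In this sketch the 16315.0 reading is logged but ignored when choosing the temperature to act on, which mirrors the behavior described above: the errant value is recorded without triggering a thermal response.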