check_cpu_perf.sh plugin script showing wrong cpu idle value

Post by kaushalshriyan »

Hi,

I am facing a strange issue: the check_cpu_perf.sh plugin reports CPU idle as 7.22%, but sar 1 10 on the same host shows about 98% idle. The details are below:

Code: Select all

[root@plugins]# ./check_cpu_perf.sh 20 10
CRITICAL: CPU Idle = 7.22% | CpuUser=69.44; CpuNice=0.00; CpuSystem=23.21; CpuIowait=0.10; CpuSteal=0.03; CpuIdle=7.22:20:10
[root@plugins]# sar 1 10
Linux 3.10.0-514.16.1.el7.x86_64 (tapzo-ds-spark-01) 	06/26/17 	_x86_64_	(4 CPU)

16:16:30        CPU     %user     %nice   %system   %iowait    %steal     %idle
16:16:31        all      0.25      0.00      0.76      0.00      0.25     98.73
16:16:32        all      2.29      0.00      3.05      0.00      0.25     94.40
16:16:33        all      0.00      0.00      0.26      0.00      0.00     99.74
16:16:34        all      0.25      0.00      0.25      0.00      0.25     99.24
16:16:35        all      0.26      0.00      0.51      0.00      0.26     98.98
16:16:36        all      0.00      0.00      0.26      0.00      0.00     99.74
16:16:37        all      0.25      0.00      0.25      0.00      0.25     99.24
16:16:38        all      0.00      0.00      0.00      0.00      0.26     99.74
16:16:39        all      0.25      0.00      0.51      0.00      0.00     99.24
16:16:40        all      0.26      0.00      0.77      0.00      0.26     98.72
Average:        all      0.38      0.00      0.66      0.00      0.18     98.78
[root@plugins]# cat /etc/redhat-release
CentOS Linux release 7.3.1611 (Core)
[root@plugins]# cat check_cpu_perf.sh
#!/bin/bash
#
# Check CPU Performance plugin for Nagios
#
# Licence : GPL - http://www.fsf.org/licenses/gpl.txt
#
# Author        : Luke Harris
# version       : 2011090802
# Creation date : 1 October 2010
# Revision date : 8 September 2011
# Description   : Nagios plugin to check CPU performance statistics.
#               This script has been tested on the following Linux and Unix platforms:
#		RHEL 4, RHEL 5, RHEL 6, CentOS 4, CentOS 5, CentOS 6, SUSE, Ubuntu, Debian, FreeBSD 7, AIX 5, AIX 6, and Solaris 8 (Solaris 9 & 10 *should* work too)
#               The script is used to obtain key CPU performance statistics by executing the sar command, eg. user, system, iowait, steal, nice, idle
#		The Nagios Threshold test is based on CPU idle percentage only, this is NOT CPU used.
#		Support has been added for Nagios Plugin Performance Data for integration with Splunk, NagiosGrapher, PNP4Nagios,
#		opcp, NagioStat, PerfParse, fifo-rrd, rrd-graph, etc
#
# USAGE         : ./check_cpu_perf.sh {warning} {critical}
#
# Example: ./check_cpu_perf.sh 20 10
# OK: CPU Idle = 84.10% | CpuUser=12.99; CpuNice=0.00; CpuSystem=2.90; CpuIowait=0.01; CpuSteal=0.00; CpuIdle=84.10:20:10
#
# Note: the option exists to NOT test for a threshold. Specifying 0 (zero) for both warning and critical will always return an exit code of 0.


#Ensure warning and critical limits are passed as command-line arguments
if [ -z "$1" -o -z "$2" ]
then
 echo "Please include two arguments, eg."
 echo "Usage: $0 {warning} {critical}"
 echo "Example :-"
 echo "$0 20 10"
exit 3
fi

#Disable nagios alerts if warning and critical limits are both set to 0 (zero)
if [ $1 -eq 0 ]
 then
  if [ $2 -eq 0 ]
   then
    ALERT=false
  fi
fi

#Ensure warning is greater than critical limit
if [ $1 -lt $2 ]
 then
  echo "Please ensure warning is greater than critical, eg."
  echo "Usage: $0 20 10"
  exit 3
fi

#Detect which OS and if it is Linux then it will detect which Linux Distribution.
OS=`uname -s`

GetVersionFromFile()
{
	VERSION=`cat $1 | tr "\n" ' ' | sed s/.*VERSION.*=\ // `
}

if [ "${OS}" = "SunOS" ] ; then
	OS=Solaris
	DIST=Solaris
	ARCH=`uname -p`
elif [ "${OS}" = "AIX" ] ; then
	DIST=AIX
elif [ "${OS}" = "FreeBSD" ] ; then
	DIST=BSD
elif [ "${OS}" = "Linux" ] ; then
	KERNEL=`uname -r`
	if [ -f /etc/redhat-release ] ; then
		DIST='RedHat'
	elif [ -f /etc/SuSE-release ] ; then
		DIST=`cat /etc/SuSE-release | tr "\n" ' '| sed s/VERSION.*//`
	elif [ -f /etc/mandrake-release ] ; then
		DIST='Mandrake'
	elif [ -f /etc/debian_version ] ; then
		DIST="Debian `cat /etc/debian_version`"
	fi
	if [ -f /etc/UnitedLinux-release ] ; then
		DIST="${DIST}[`cat /etc/UnitedLinux-release | tr "\n" ' ' | sed s/VERSION.*//`]"
	fi
fi

#Define package format
case "`echo ${DIST}|awk '{print $1}'`" in
'RedHat')
PACKAGE="rpm"
;;
'SUSE')
PACKAGE="rpm"
;;
'Mandrake')
PACKAGE="rpm"
;;
'Debian')
PACKAGE="dpkg"
;;
'UnitedLinux')
PACKAGE="rpm"
;;
'Solaris')
PACKAGE="pkginfo"
;;
'AIX')
PACKAGE="lslpp"
;;
'BSD')
PACKAGE="pkg_info"
;;
esac

#Define locale to ensure time is in 24 hour format
LC_MONETARY=en_AU.UTF-8
LC_NUMERIC=en_AU.UTF-8
LC_ALL=en_AU.UTF-8
LC_MESSAGES=en_AU.UTF-8
LC_COLLATE=en_AU.UTF-8
LANG=en_AU.UTF-8
LC_TIME=en_AU.UTF-8

#Collect sar output
case "$PACKAGE" in
'rpm')
SARCPU=`/usr/bin/sar -P ALL|grep all|grep -v Average|tail -1`
SYSSTATRPM=`rpm -q sysstat|awk -F\- '{print $2}'|awk -F\. '{print $1}'`
if [ $SYSSTATRPM -gt 5 ]
 then
  SARCPUIDLE=`echo ${SARCPU}|awk '{print $8}'|awk -F. '{print $1}'`
  CPU=`echo ${SARCPU}|awk '{print "CPU Idle = " $8 "% | " "CpuUser=" $3 "; CpuNice=" $4 "; CpuSystem=" $5 "; CpuIowait=" $6 "; CpuSteal=" $7 "; CpuIdle=" $8":20:10"}'`
 else
  SARCPUIDLE=`echo ${SARCPU}|awk '{print $7}'|awk -F. '{print $1}'`
  CPU=`echo ${SARCPU}|awk '{print "CPU Idle = " $7 "% | " "CpuUser=" $3 "; CpuNice=" $4 "; CpuSystem=" $5 "; CpuIowait=" $6 "; CpuIdle=" $7":20:10"}'`
fi
;;
'dpkg')
SARCPU=`/usr/bin/sar -P ALL|grep all|grep -v Average|tail -1`
SYSSTATDPKG=`dpkg -l sysstat|grep sysstat|awk '{print $3}'|awk -F\. '{print $1}'`
if [ $SYSSTATDPKG -gt 5 ]
 then
  SARCPUIDLE=`echo ${SARCPU}|awk '{print $8}'|awk -F. '{print $1}'`
  CPU=`echo ${SARCPU}|awk '{print "CPU Idle = " $8 "% | " "CpuUser=" $3 "; CpuNice=" $4 "; CpuSystem=" $5 "; CpuIowait=" $6 "; CpuSteal=" $7 "; CpuIdle=" $8":20:10"}'`
 else
  SARCPUIDLE=`echo ${SARCPU}|awk '{print $7}'|awk -F. '{print $1}'`
  CPU=`echo ${SARCPU}|awk '{print "CPU Idle = " $7 "% | " "CpuUser=" $3 "; CpuNice=" $4 "; CpuSystem=" $5 "; CpuIowait=" $6 "; CpuIdle=" $7":20:10"}'`
fi
;;
'lslpp')
SARCPU=`/usr/sbin/sar -P ALL|grep "\-"|grep -v U|tail -2|head -1`
SYSSTATLSLPP=`lslpp -l bos.acct|tail -1|awk '{print $2}'|awk -F\. '{print $1}'`
if [ $SYSSTATLSLPP -gt 4 ]
 then
  CpuPhysc=`echo ${SARCPU}|awk '{print $6}'`
  LPARCPU=`/usr/bin/lparstat -i | grep "Maximum Capacity" | awk '{print $4}' |head -1`
  SARCPUIDLE=`echo "scale=2;100-(${CpuPhysc}/${LPARCPU}*100)" | bc | awk -F. '{print $1}'`
  PERFDATA=`echo ${SARCPU}|awk '{print "CpuUser=" $2 "; CpuSystem=" $3 "; CpuIowait=" $4 "; CpuPhysc=" $6 "; CpuEntc=" $7 "; CpuIdle=" $5":20:10"}'`
  CPU=`echo "CPU Idle = "${SARCPUIDLE}"% |" ${PERFDATA}"; LparCpuIdle="${SARCPUIDLE}"; LparCpuTotal="$LPARCPU`
 else
  echo "AIX $SYSSTATLSLPP Not Supported"
  exit 3
fi
;;
'pkginfo')
SARCPU=`/usr/bin/sar -u|grep -v Average|tail -2|head -1`
SYSSTATPKGINFO=`pkginfo -l SUNWaccu|grep VERSION|awk '{print $2}'|awk -F\. '{print $1}'`
if [ $SYSSTATPKGINFO -ge 11 ]
 then
  SARCPUIDLE=`echo ${SARCPU}|awk '{print $5}'`
  CPU=`echo ${SARCPU}|awk '{print "CPU Idle = " $5 "% | " "CpuUser=" $2 "; CpuSystem=" $3 "; CpuIowait=" $4 "; CpuIdle=" $5":20:10"}'`
 else
  echo "Solaris $SYSSTATPKGINFO Not Supported"
  exit 3
fi
;;
'pkg_info')
SARCPU=`/usr/local/bin/bsdsar -u|tail -1`
SYSSTATPKGINFO=`pkg_info | grep ^bsdsar | awk -F\- '{print $2}' | awk -F\. '{print $1}'`
if [ $SYSSTATPKGINFO -ge 1 ]
 then
  SARCPUIDLE=`echo ${SARCPU}|awk '{print $6}'`
  CPU=`echo ${SARCPU}|awk '{print "CPU Idle = " $6 "% | " "CpuUser=" $2 "; CpuSystem=" $3 "; CpuNice=" $4 "; CpuIntrpt=" $5 "; CpuIdle=" $6":20:10"}'`
 else
  echo "BSD $SYSSTATPKGINFO Not Supported"
  exit 3
fi
;;
esac

#Display CPU Performance without alert
if [ "$ALERT" == "false" ]
 then
		echo "$CPU"
		exit 0
 else
        ALERT=true
fi

#Display CPU Performance with alert
if [ ${SARCPUIDLE} -lt $2 ]
 then
		echo "CRITICAL: $CPU"
		exit 2
 elif [ $SARCPUIDLE -lt $1 ]
		 then
		  echo "WARNING: $CPU"
		  exit 1
         else
		  echo "OK: $CPU"
		  exit 0
fi

[root@plugins]#
Any help would be highly appreciated.

Post by tgriep »

It is strange that the numbers do not match more closely. Can you run the following as root on the Nagios server and post the output?

Code: Select all

./check_cpu_perf.sh 20 10 ; /usr/bin/sar -P ALL|grep all|grep -v Average|tail -1
rpm -q sysstat|awk -F\- '{print $2}'|awk -F\. '{print $1}'
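
The second command matters because the plugin picks the %idle column from the sysstat major version. Here is a minimal sketch of that decision, assuming an rpm package string of the form name-version-release; the function names sysstat_major and idle_column are hypothetical and not part of the plugin, and the package string is only an illustration:

```shell
#!/bin/bash
# Hypothetical helpers (not part of check_cpu_perf.sh).

# Extract the major version from an rpm package string, using the same
# awk pipeline as the plugin: 2nd "-" field, then 1st "." field.
sysstat_major() {
    echo "$1" | awk -F- '{print $2}' | awk -F. '{print $1}'
}

# Choose the %idle column the way the plugin's rpm branch does:
# column 8 when the major version is greater than 5 (the branch that
# also reports CpuSteal from column 7), otherwise column 7.
idle_column() {
    if [ "$(sysstat_major "$1")" -gt 5 ]; then
        echo 8
    else
        echo 7
    fi
}

sysstat_major "sysstat-10.1.5-11.el7.x86_64"   # prints 10
idle_column "sysstat-10.1.5-11.el7.x86_64"     # prints 8
```

If the major version came back as 5 or lower on a sar that prints a %steal column, the plugin would read the wrong field, so this is worth ruling out first.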

Post by kaushalshriyan »

Hi tgriep,

Please find the details below:

Code: Select all

[root@plugins]# ./check_cpu_perf.sh 20 10 ; /usr/bin/sar -P ALL|grep all|grep -v Average|tail -1
CRITICAL: CPU Idle = 6.35% | CpuUser=69.60; CpuNice=0.00; CpuSystem=23.86; CpuIowait=0.16; CpuSteal=0.03; CpuIdle=6.35:20:10
23:50:01        all     69.60      0.00     23.86      0.16      0.03      6.35
[root@plugins]# rpm -q sysstat|awk -F\- '{print $2}'|awk -F\. '{print $1}'
10
[root@plugins]#
[root@plugins]# rpm -qa | grep sysstat
sysstat-10.1.5-11.el7.x86_64
[root@plugins]# rpm -qil sysstat
Name        : sysstat
Version     : 10.1.5
Release     : 11.el7
Architecture: x86_64
Install Date: Tue Apr 25 13:53:25 2017
Group       : Applications/System
Size        : 1158918
License     : GPLv2+
Signature   : RSA/SHA256, Mon Nov 21 02:19:58 2016, Key ID 24c6a8a7f4a80eb5
Source RPM  : sysstat-10.1.5-11.el7.src.rpm
Build Date  : Sat Nov  5 23:54:24 2016
Build Host  : worker1.bsys.centos.org
Relocations : (not relocatable)
Packager    : CentOS BuildSystem <http://bugs.centos.org>
Vendor      : CentOS
URL         : http://sebastien.godard.pagesperso-orange.fr/
Summary     : Collection of performance monitoring tools for Linux
Description :
The sysstat package contains sar, sadf, mpstat, iostat, pidstat, nfsiostat-sysstat,
tapestat, cifsiostat and sa tools for Linux.
The sar command collects and reports system activity information. This
information can be saved in a file in a binary format for future inspection. The
statistics reported by sar concern I/O transfer rates, paging activity,
process-related activities, interrupts, network activity, memory and swap space
utilization, CPU utilization, kernel activities and TTY statistics, among
others. Both UP and SMP machines are fully supported.
The sadf command may be used to display data collected by sar in various formats
(CSV, XML, etc.).
The iostat command reports CPU utilization and I/O statistics for disks.
The tapestat command reports statistics for tapes connected to the system.
The mpstat command reports global and per-processor statistics.
The pidstat command reports statistics for Linux tasks (processes).
The nfsiostat-sysstat command reports I/O statistics for network file systems.
The cifsiostat command reports I/O statistics for CIFS file systems.
/etc/cron.d/sysstat
/etc/sysconfig/sysstat
/etc/sysconfig/sysstat.ioconf
/usr/bin/cifsiostat
/usr/bin/iostat
/usr/bin/mpstat
/usr/bin/nfsiostat-sysstat
/usr/bin/pidstat
/usr/bin/sadf
/usr/bin/sar
/usr/bin/tapestat
/usr/lib/systemd/system/sysstat.service
/usr/lib64/sa
/usr/lib64/sa/sa1
/usr/lib64/sa/sa2
/usr/lib64/sa/sadc
/usr/share/doc/sysstat-10.1.5
/usr/share/doc/sysstat-10.1.5/CHANGES
/usr/share/doc/sysstat-10.1.5/COPYING
/usr/share/doc/sysstat-10.1.5/CREDITS
/usr/share/doc/sysstat-10.1.5/FAQ
/usr/share/doc/sysstat-10.1.5/README
/usr/share/doc/sysstat-10.1.5/sysstat-10.1.5.lsm
/usr/share/locale/af/LC_MESSAGES/sysstat.mo
/usr/share/locale/cs/LC_MESSAGES/sysstat.mo
/usr/share/locale/da/LC_MESSAGES/sysstat.mo
/usr/share/locale/de/LC_MESSAGES/sysstat.mo
/usr/share/locale/eo/LC_MESSAGES/sysstat.mo
/usr/share/locale/es/LC_MESSAGES/sysstat.mo
/usr/share/locale/eu/LC_MESSAGES/sysstat.mo
/usr/share/locale/fi/LC_MESSAGES/sysstat.mo
/usr/share/locale/fr/LC_MESSAGES/sysstat.mo
/usr/share/locale/hr/LC_MESSAGES/sysstat.mo
/usr/share/locale/id/LC_MESSAGES/sysstat.mo
/usr/share/locale/it/LC_MESSAGES/sysstat.mo
/usr/share/locale/ja/LC_MESSAGES/sysstat.mo
/usr/share/locale/ky/LC_MESSAGES/sysstat.mo
/usr/share/locale/lv/LC_MESSAGES/sysstat.mo
/usr/share/locale/mt/LC_MESSAGES/sysstat.mo
/usr/share/locale/nb/LC_MESSAGES/sysstat.mo
/usr/share/locale/nl/LC_MESSAGES/sysstat.mo
/usr/share/locale/nn/LC_MESSAGES/sysstat.mo
/usr/share/locale/pl/LC_MESSAGES/sysstat.mo
/usr/share/locale/pt/LC_MESSAGES/sysstat.mo
/usr/share/locale/pt_BR/LC_MESSAGES/sysstat.mo
/usr/share/locale/ro/LC_MESSAGES/sysstat.mo
/usr/share/locale/ru/LC_MESSAGES/sysstat.mo
/usr/share/locale/sk/LC_MESSAGES/sysstat.mo
/usr/share/locale/sr/LC_MESSAGES/sysstat.mo
/usr/share/locale/sv/LC_MESSAGES/sysstat.mo
/usr/share/locale/uk/LC_MESSAGES/sysstat.mo
/usr/share/locale/vi/LC_MESSAGES/sysstat.mo
/usr/share/locale/zh_CN/LC_MESSAGES/sysstat.mo
/usr/share/locale/zh_TW/LC_MESSAGES/sysstat.mo
/usr/share/man/man1/cifsiostat.1.gz
/usr/share/man/man1/iostat.1.gz
/usr/share/man/man1/mpstat.1.gz
/usr/share/man/man1/nfsiostat-sysstat.1.gz
/usr/share/man/man1/pidstat.1.gz
/usr/share/man/man1/sadf.1.gz
/usr/share/man/man1/sar.1.gz
/usr/share/man/man1/tapestat.1.gz
/usr/share/man/man5/sysstat.5.gz
/usr/share/man/man8/sa1.8.gz
/usr/share/man/man8/sa2.8.gz
/usr/share/man/man8/sadc.8.gz
/var/log/sa
[root@plugins]#
Regards,

Kaushal

Post by tgriep »

The check_cpu_perf.sh script seems to be working correctly in the test that I had you run.
The script runs the sar command and takes the idle time from the 8th column of the last "all" record in the sar output. In your test the plugin output matches that record exactly (6.35% idle at 23:50:01).
Is the plugin showing the wrong output when the Nagios process is running it?
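
To make the difference concrete, here is a rough sketch of what the plugin's rpm branch does with a single sar line. The function names are hypothetical; the two sample lines are copied from the output posted in this thread. As far as I know, sar run without an interval and count prints records already saved in the daily activity file rather than taking a live sample, which would explain why the plugin sees the 23:50:01 record while sar 1 10 shows the current load:

```shell
#!/bin/bash
# Hypothetical helper names; the logic mirrors the plugin's rpm branch.

# Integer part of %idle: 8th whitespace-separated field of a sar "all" line.
sar_idle_int() {
    echo "$1" | awk '{print $8}' | awk -F. '{print $1}'
}

# Map an idle value ($1) to a Nagios state given the warning ($2)
# and critical ($3) limits, in the same order the plugin tests them.
classify_idle() {
    if [ "$1" -lt "$3" ]; then
        echo CRITICAL
    elif [ "$1" -lt "$2" ]; then
        echo WARNING
    else
        echo OK
    fi
}

# Record the plugin picked up (sar -P ALL | grep all | tail -1).
logged='23:50:01        all     69.60      0.00     23.86      0.16      0.03      6.35'
# One of the live samples from "sar 1 10".
live='16:16:40        all      0.26      0.00      0.77      0.00      0.26     98.72'

classify_idle "$(sar_idle_int "$logged")" 20 10   # prints CRITICAL
classify_idle "$(sar_idle_int "$live")" 20 10     # prints OK
```

Both results are consistent with what you posted, so the threshold logic itself looks fine; the open question is which sar record the plugin ends up reading.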