Getting incorrect CPU load values from RHEL 7 systems

Posted: Thu Nov 01, 2018 3:28 pm
by TDBruno
I'll preface this by saying that the Nagios Core version is very old: 3.2.0. I am hoping I will not need to update it for this, but maybe?

The issue is that CPU_Load is wildly incorrect on all RHEL 7 systems.

On RHEL6 systems, there is no issue.
Nagios is just using the standard command: /usr/local/nagios/libexec/check_nrpe -H x.x.x.x -p 5666 -c check_load

On an RHEL6 server, the CPU load value presented by running htop on the server itself and the check_nrpe command on the Nagios server are a perfect match.

On an RHEL7 server, the values vary wildly.
As I was typing this, one server with a load average of 3.42 3.72 3.99 was reported by check_nrpe as 0.05 0.06 0.06.

Running yum install nagios-plugins on the RHEL 7 server shows they are up to date (version 2.2.1)

I have tried searching online but don't see this issue reported anywhere. Any ideas what I could check?

Re: Getting incorrect CPU load values from RHEL 7 systems

Posted: Fri Nov 02, 2018 12:50 pm
by npolovenko
@TDBruno, Please run the following command locally on the RHEL 7 server.
/usr/local/nagios/libexec/check_load -r -w 0.15,0.10,0.05 -c 0.30,0.25,0.20
Also run the command you've used to get the average CPU load and post its output as well.

Re: Getting incorrect CPU load values from RHEL 7 systems

Posted: Fri Nov 02, 2018 2:37 pm
by scottwilkerson
Can you share the check_load command definition from your nrpe.cfg on the system that is reporting in error?

Re: Getting incorrect CPU load values from RHEL 7 systems

Posted: Mon Nov 05, 2018 1:04 pm
by TDBruno
@npolovenko

This is the result of the command on the local RHEL7 system:
/usr/lib64/nagios/plugins/check_load -r -w 0.15,0.10,0.05 -c 0.30,0.25,0.20
OK - load average per CPU: 0.08, 0.05, 0.05|load1=0.075;0.150;0.300;0; load5=0.053;0.100;0.250;0; load15=0.049;0.050;0.200;0;
The only other command I use to get the CPU load is simply htop. Its output is:
Load average: 3.17 3.19 3.10

@scottwilkerson
This is the definition:
command[check_load]=/usr/lib64/nagios/plugins/check_load -r -w 19,18,17 -c 30,28,26

Re: Getting incorrect CPU load values from RHEL 7 systems

Posted: Mon Nov 05, 2018 2:46 pm
by scottwilkerson
TDBruno wrote:

command[check_load]=/usr/lib64/nagios/plugins/check_load -r -w 19,18,17 -c 30,28,26
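You have the -r flag in your command, which divides the load by the number of CPUs on the system. That is why check_load reports per-CPU values that are far lower than the raw load average htop shows.

Remove -r from the command, then restart NRPE, and it should report correctly.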
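
To illustrate the arithmetic, here is a minimal Python sketch of what -r does to the numbers. The function name per_cpu_load and the sample values are hypothetical, chosen only for illustration; this is not the plugin's actual code. It simply divides each raw load average by the CPU count, which is why a busy many-core server can show a per-CPU load close to zero.

```python
import os


def per_cpu_load(load_averages, cpu_count):
    """Divide each raw load average by the CPU count.

    Hypothetical helper mirroring the effect of check_load -r;
    it is not taken from the plugin source.
    """
    if cpu_count < 1:
        raise ValueError("cpu_count must be at least 1")
    return tuple(load / cpu_count for load in load_averages)


if __name__ == "__main__":
    # Raw 1, 5 and 15 minute load averages, the same figures htop displays.
    raw = os.getloadavg()
    # os.cpu_count() can return None, so fall back to 1.
    cpus = os.cpu_count() or 1
    print("raw load:    ", raw)
    print("CPU count:   ", cpus)
    print("per-CPU load:", per_cpu_load(raw, cpus))
```

For example, per_cpu_load((4.0, 2.0, 1.0), 4) gives (1.0, 0.5, 0.25). With -r in place, the -w and -c thresholds are compared against these per-CPU values too, so thresholds written for raw load (such as 19,18,17) would rarely trigger.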

Re: Getting incorrect CPU load values from RHEL 7 systems

Posted: Mon Nov 05, 2018 3:54 pm
by TDBruno
Thanks, that was it!

Re: Getting incorrect CPU load values from RHEL 7 systems

Posted: Mon Nov 05, 2018 4:02 pm
by scottwilkerson
TDBruno wrote: Thanks, that was it!
Great!

Locking thread