I'll preface this by saying that the Nagios Core version is very old: 3.2.0. I am hoping I will not need to update it for this, but maybe?
The issue is that CPU_Load is wildly incorrect on all RHEL 7 systems.
On RHEL6 systems, there is no issue.
Nagios is just using the standard command: /usr/local/nagios/libexec/check_nrpe -H x.x.x.x -p 5666 -c check_load
On a RHEL6 server, the CPU load value shown by running htop on the server itself and the value returned by the check_nrpe command on the Nagios server are a perfect match.
On a RHEL7 server, the two values differ widely.
As I was typing this, one server with a load average of 3.42 3.72 3.99 was reported by check_nrpe as 0.05 0.06 0.06.
Running yum install nagios-plugins on the RHEL 7 server shows the plugins are up to date (version 2.2.1).
I have tried searching online but don't see this issue reported anywhere. Any ideas what I could check?
Getting incorrect CPU load values from RHEL 7 systems
-
npolovenko
- Support Tech
- Posts: 3457
- Joined: Mon May 15, 2017 5:00 pm
Re: Getting incorrect CPU load values from RHEL 7 systems
@TDBruno, please run the following command locally on the RHEL 7 server:
/usr/local/nagios/libexec/check_load -r -w 0.15,0.10,0.05 -c 0.30,0.25,0.20
Also run the command you've used to get the average CPU load, and upload the output as well.
-
scottwilkerson
- DevOps Engineer
- Posts: 19396
- Joined: Tue Nov 15, 2011 3:11 pm
- Location: Nagios Enterprises
Re: Getting incorrect CPU load values from RHEL 7 systems
Can you share the check_load command definition from your nrpe.cfg on the system that is reporting in error?
Re: Getting incorrect CPU load values from RHEL 7 systems
@npolovenko
This is the result of the command on the local RHEL7 system:
/usr/lib64/nagios/plugins/check_load -r -w 0.15,0.10,0.05 -c 0.30,0.25,0.20
OK - load average per CPU: 0.08, 0.05, 0.05|load1=0.075;0.150;0.300;0; load5=0.053;0.100;0.250;0; load15=0.049;0.050;0.200;0;
The only other command I use to get the CPU load is simply htop. Output of that is:
Load average: 3.17 3.19 3.10
@scottwilkerson
This is the definition:
command[check_load]=/usr/lib64/nagios/plugins/check_load -r -w 19,18,17 -c 30,28,26
-
scottwilkerson
- DevOps Engineer
- Posts: 19396
- Joined: Tue Nov 15, 2011 3:11 pm
- Location: Nagios Enterprises
Re: Getting incorrect CPU load values from RHEL 7 systems
TDBruno wrote:
command[check_load]=/usr/lib64/nagios/plugins/check_load -r -w 19,18,17 -c 30,28,26
You have the -r flag in your command, which divides the load by the number of CPUs on the system.
Remove that from the command, then restart NRPE, and it should report correctly.
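For anyone landing here later, the effect of -r can be sketched in a few lines of Python. This is only an illustration of the arithmetic, not the plugin's actual code. The function names are made up for the example, and the CPU count of 40 in the usage note is a hypothetical value, since the thread never states how many CPUs the server has.

```python
import os


def per_cpu_load(load_averages, cpu_count):
    """Divide each raw load average by the CPU count (the scaling -r applies)."""
    if cpu_count < 1:
        raise ValueError("cpu_count must be at least 1")
    return tuple(load / cpu_count for load in load_averages)


def local_per_cpu_load():
    """Read this host's 1/5/15-minute load averages and scale them per CPU.

    Hypothetical helper for illustration; os.getloadavg() is Unix-only.
    """
    cpus = os.cpu_count() or 1  # fall back to 1 if the count is unknown
    return per_cpu_load(os.getloadavg(), cpus)
```

Calling per_cpu_load((3.17, 3.19, 3.10), 40) with the htop figures from above and the assumed 40 CPUs gives values a little under 0.08, the same order of magnitude as what check_load -r reported. That is why thresholds such as 19,18,17 can effectively never trigger while -r is present.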
Re: Getting incorrect CPU load values from RHEL 7 systems
Thanks, that was it!
-
scottwilkerson
- DevOps Engineer
- Posts: 19396
- Joined: Tue Nov 15, 2011 3:11 pm
- Location: Nagios Enterprises
Re: Getting incorrect CPU load values from RHEL 7 systems
TDBruno wrote:Thanks, that was it!
Great!
Locking thread