
Getting load warnings when load OK

Posted: Thu Sep 27, 2012 10:36 am
by indy500
Each day, while our app server is running a batch job, Nagios sends out load warnings even though the check is configured for much higher loads than the system actually experiences. Why am I being alerted for loads well below the threshold?

Relevant /etc/nagios/nrpe.cfg line on monitored host:
command[check_load]=/usr/lib/nagios/plugins/check_load -w 16,10,4 -c 32,24,20

sar output from host at time of warning:

09:05:01 AM   runq-sz  plist-sz   ldavg-1   ldavg-5  ldavg-15
09:50:01 AM         2       201      0.39      0.61      0.53
09:55:01 AM         2       203      0.60      0.54      0.52
10:05:02 AM         4       230      1.82      1.43      0.97
10:10:01 AM         3       230      1.32      1.55      1.16
10:15:01 AM         1       229      1.09      1.29      1.15
10:20:01 AM         1       232      1.09      1.16      1.12
10:25:01 AM         1       236      1.66      1.28      1.16
10:30:01 AM         1       236      1.26      1.24      1.17
10:35:01 AM         1       236      1.24      1.22      1.18
10:40:01 AM         1       234      1.24      1.22      1.18
10:45:01 AM         1       229      1.01      1.12      1.15
10:50:01 AM         0       222      0.47      0.78      1.00
Average:            0       200      0.24      0.22      0.19

***** Nagios *****
Notification Type: PROBLEM

Service: Processor Load
Host: App 3.0
Address: 10.0.0.74
State: WARNING

Date/Time: Thu Sept 27 10:16:44 EDT 2012

Additional Info:

WARNING - load average: 1.16, 1.24, 1.15
++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
***** Nagios *****
Notification Type: PROBLEM
Service: Processor Load
Host: App 3.0
Address: 10.0.0.74
State: WARNING

Date/Time: Thu Sept 27 10:46:44 EDT 2012

Additional Info:

WARNING - load average: 1.00, 1.09, 1.13
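
To make the mismatch concrete, here is a rough Python sketch that compares the two reported readings against the thresholds from the nrpe.cfg line above. It is not the plugin's code. The helper names parse_triplet and load_state are made up for illustration, and the comparison rule (a state is raised when any of the 1, 5 or 15 minute averages is strictly greater than its threshold) is an assumption about how check_load behaves.

```python
from typing import Sequence, Tuple


def parse_triplet(spec: str) -> Tuple[float, ...]:
    """Parse a 'load1,load5,load15' threshold string such as '16,10,4'."""
    values = tuple(float(part) for part in spec.split(","))
    if len(values) != 3:
        raise ValueError(f"expected three comma-separated values, got {spec!r}")
    return values


def load_state(loads: Sequence[float], warn: str, crit: str) -> str:
    """Return OK, WARNING or CRITICAL for a (1, 5, 15 minute) load reading.

    Assumed rule: any single average above its critical threshold gives
    CRITICAL, otherwise any single average above its warning threshold
    gives WARNING, otherwise OK.
    """
    warn_levels = parse_triplet(warn)
    crit_levels = parse_triplet(crit)
    if any(load > limit for load, limit in zip(loads, crit_levels)):
        return "CRITICAL"
    if any(load > limit for load, limit in zip(loads, warn_levels)):
        return "WARNING"
    return "OK"


# The two readings from the notifications, checked against -w 16,10,4 -c 32,24,20.
for reading in [(1.16, 1.24, 1.15), (1.00, 1.09, 1.13)]:
    print(reading, "->", load_state(reading, "16,10,4", "32,24,20"))
```

Under those assumptions both readings come out as OK, so the WARNING state does not match the thresholds shown in the nrpe.cfg line.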

Re: Getting load warnings when load OK

Posted: Mon Oct 01, 2012 10:39 am
by mguthrie
This could be a float vs. integer issue. What happens when you enter the thresholds as floats?

-w 16.0,10.0,4.0 -c 32.0,24.0,20.0

Re: Getting load warnings when load OK

Posted: Mon Oct 01, 2012 10:54 am
by indy500
Changed the command to use float values. Will see what happens.

nrpe.cfg:command[check_load]=/usr/lib/nagios/plugins/check_load -w 2.0,1.5,1.0 -c 10.0,5.0,3.0

Thanks for the response.