Getting load warnings when load is OK
Posted: Thu Sep 27, 2012 10:36 am
Each day, while our app server is working through a batch, Nagios sends out load warnings even though the check has been configured for much higher loads than the system ever experiences. Why am I being alerted for loads well below the threshold?
Relevant /etc/nagios/nrpe.cfg line on monitored host:
command[check_load]=/usr/lib/nagios/plugins/check_load -w 16,10,4 -c 32,24,20
sar output from host at time of warning:
09:05:01 AM   runq-sz  plist-sz   ldavg-1   ldavg-5  ldavg-15
09:50:01 AM         2       201      0.39      0.61      0.53
09:55:01 AM         2       203      0.60      0.54      0.52
10:05:02 AM         4       230      1.82      1.43      0.97
10:10:01 AM         3       230      1.32      1.55      1.16
10:15:01 AM         1       229      1.09      1.29      1.15
10:20:01 AM         1       232      1.09      1.16      1.12
10:25:01 AM         1       236      1.66      1.28      1.16
10:30:01 AM         1       236      1.26      1.24      1.17
10:35:01 AM         1       236      1.24      1.22      1.18
10:40:01 AM         1       234      1.24      1.22      1.18
10:45:01 AM         1       229      1.01      1.12      1.15
10:50:01 AM         0       222      0.47      0.78      1.00
Average:            0       200      0.24      0.22      0.19
***** Nagios *****
Notification Type: PROBLEM
Service: Processor Load
Host: App 3.0
Address: 10.0.0.74
State: WARNING
Date/Time: Thu Sept 27 10:16:44 EDT 2012
Additional Info:
WARNING - load average: 1.16, 1.24, 1.15
++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
***** Nagios *****
Notification Type: PROBLEM
Service: Processor Load
Host: App 3.0
Address: 10.0.0.74
State: WARNING
Date/Time: Thu Sept 27 10:46:44 EDT 2012
Additional Info:
WARNING - load average: 1.00, 1.09, 1.13
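To show why these alerts look wrong, here is a minimal Python sketch of the comparison I would expect check_load to make with the -w and -c values above. It is not the plugin's actual code: the helper names (parse_triplet, classify) are made up for illustration, and it assumes that a state is raised when any one of the 1-, 5-, or 15-minute averages is strictly greater than its matching threshold.

```python
# Illustrative only: parse_triplet and classify are hypothetical helpers,
# not part of check_load or NRPE.

def parse_triplet(text):
    """Turn a string like "16,10,4" into (1-min, 5-min, 15-min) floats."""
    parts = [float(p) for p in text.split(",")]
    if len(parts) != 3:
        raise ValueError("expected three comma-separated values: %r" % text)
    return tuple(parts)


def classify(loads, warn, crit):
    """Return OK, WARNING, or CRITICAL for a load-average triplet.

    Assumption: a threshold is breached when the load is strictly greater
    than it, and a breach in any one of the three windows is enough.
    """
    if any(load > limit for load, limit in zip(loads, crit)):
        return "CRITICAL"
    if any(load > limit for load, limit in zip(loads, warn)):
        return "WARNING"
    return "OK"


# Thresholds copied from the nrpe.cfg line above.
WARN = parse_triplet("16,10,4")
CRIT = parse_triplet("32,24,20")

# Load averages copied from the two notifications above.
for loads in [(1.16, 1.24, 1.15), (1.00, 1.09, 1.13)]:
    print(loads, classify(loads, WARN, CRIT))
```

Under those assumptions both reported triplets come out as OK, so the WARNING state does not seem to be explained by the thresholds in this nrpe.cfg line.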