Nagios Not Functioning
Posted: Wed Sep 12, 2012 2:25 pm
Nagios has become unstable and is exhibiting a number of issues:
1. Notifications are failing at random with an error:
[1347476154] Warning: Contact 'cgraham' service notification command '/usr/bin/printf "%b" "[email warning]" [email protected]' timed out after 30 seconds
2. Getting fork errors:
Warning: The check of service '[service check]' could not be performed due to a fork() error: 'Resource temporarily unavailable'. The check will be rescheduled.
3. Getting pipe errors:
[1347475829] HOST ALERT: [host name];DOWN;SOFT;1;Could not open pipe: /bin/ping -n -U -w 30 -c 5 [ip address]
4. /tmp is full of check entries (over 365K)
5. Seeing a block of retries in the logs:
[1347476391] Warning: The check of host '[hostname]' looks like it was orphaned (results never came back). I'm scheduling an immediate check of the host...
Any ideas where to start on this? My search of the various issues turned up nothing promising.