I have recently set up a new instance of Nagios Core 4.2.4 on a new server and have started to receive a few alert emails indicating that some servers are down, but only sporadically. The servers AREN'T down, incidentally. I suspect the ping is occasionally taking a little longer than expected from the new Nagios master, as the same alerts aren't being sent out from my 4.0.8 instance.
On a few occasions this morning, I received alerts for a small number of servers (all from the same hostgroup), then a bit later on, a few more, but at no point were any of the servers down.
The servers in question are Linux boxes, in a different OU from my other Linux and Windows servers, for which I'm (correctly) not receiving any ping alerts at all from either the 4.0.8 or the 4.2.4 instance.
Here are the check-host-alive and check_ping commands from my commands.cfg (the config files are exactly as they were copied from my 4.0.8 instance):
Code: Select all
define command{
command_name check-host-alive
command_line $USER1$/check_ping -H $HOSTADDRESS$ -w 3000.0,80% -c 5000.0,100% -p 5
}
define command{
command_name check_ping
command_line $USER1$/check_ping -H $HOSTADDRESS$ -w $ARG1$ -c $ARG2$ -p 5
}
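To make the thresholds above concrete, here is a rough Python sketch of how I understand the -w and -c values in check-host-alive to be applied. The function names are mine, and the "at or above" comparison is my assumption about check_ping's behaviour, not something I have confirmed against the plugin itself.

```python
def parse_threshold(spec):
    """Split a check_ping threshold like '3000.0,80%' into (rta_ms, loss_pct)."""
    rta, loss = spec.split(",")
    return float(rta), float(loss.rstrip("%"))


def classify(rta_ms, loss_pct, warn="3000.0,80%", crit="5000.0,100%"):
    """Classify one ping result against the check-host-alive thresholds.

    Assumption: a value at or above either part of a threshold trips that level.
    """
    w_rta, w_loss = parse_threshold(warn)
    c_rta, c_loss = parse_threshold(crit)
    if rta_ms >= c_rta or loss_pct >= c_loss:
        return "CRITICAL"
    if rta_ms >= w_rta or loss_pct >= w_loss:
        return "WARNING"
    return "OK"


if __name__ == "__main__":
    print(classify(250.0, 0))     # well below the 3000 ms warning threshold
    print(classify(3500.0, 20))   # round trip above the warning threshold
    print(classify(0.0, 100))     # every packet lost
```

If that assumption holds, a reply would have to take 3 seconds, or 80% of the packets would have to be lost, before even a warning is raised with these values.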
Code: Select all
define host{
name linux-server-xxx; The name of this host template
use generic-host ; This template inherits other values from the generic-host template
check_period 24x7 ; By default, Linux hosts are checked round the clock
check_interval 1 ; Actively check the host every 1 time unit (1 minute with the default interval_length)
retry_interval 1 ; Schedule host check retries at 1 minute intervals
max_check_attempts 10 ; Check each Linux host 10 times (max)
check_command check-host-alive; Default command to check Linux hosts
notification_period 24x7 ; Linux admins hate to be woken up, so we only notify during the day
; Note that the notification_period variable is being overridden from
; the value that is inherited from the generic-host template!
notification_options d,u,r ; Only send notifications for specific host states
notification_interval 10 ; XX Minutes or 0 to only send the FIRST notification
contacts my_id
contact_groups group_id ;Notifications get sent to the admins by default
#hostgroups linux-servers ; Host groups that Windows servers should be a member of
register 0 ; DONT REGISTER THIS DEFINITION - ITS NOT A REAL HOST, JUST A TEMPLATE!
}
Code: Select all
Notification Type: PROBLEM
Host: MyServerID
State: DOWN
Address: MyServerID
Info: (No output on stdout) stderr: execvp(/usr/local/nagios/libexec/check_ping, ...) failed. errno is 2: No such file or directory
Other Linux hosts that use the same template and commands are working fine too.
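Since the alert text reports errno 2 for the plugin path, here is a minimal Python sketch of a check that could be run on the new master to see whether that path exists and is executable at a given moment. The function name describe_plugin is purely illustrative, and the path is the one quoted in the alert; it only inspects the file and does not run the plugin.

```python
import os


def describe_plugin(path):
    """Return a short status string for a plugin path (illustrative helper)."""
    if not os.path.exists(path):
        return "missing"
    if not os.path.isfile(path):
        return "not a regular file"
    if not os.access(path, os.X_OK):
        return "not executable"
    return "ok"


if __name__ == "__main__":
    # Path taken from the alert email above.
    print(describe_plugin("/usr/local/nagios/libexec/check_ping"))
```

Running it as the same user the Nagios daemon runs as would give the most representative answer, since the result depends on that user's permissions.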
Thanks in advance for your help.
neworderfac33