time outs while doing check on specific devices
Posted: Thu Mar 26, 2015 8:03 am
by bosecorp
I am getting timeouts/alerts when Nagios runs the check on specific devices. When I ping them from my desktop or from the worker, it works fine. This only seems to be happening for 4 specific devices.
At first I thought it had something to do with the devices themselves. However, these devices are in different geographic locations, and in the same locations I have hundreds of devices that are not having any kind of problem.
Re: time outs while doing check on specific devices
Posted: Thu Mar 26, 2015 9:14 am
by tmcdonald
What type of device is this?
What type of check?
Can you show the specific timeout message?
Re: time outs while doing check on specific devices
Posted: Thu Mar 26, 2015 10:12 am
by bosecorp
It just happens with a device. It seems to happen with network devices.
"mydevice name" (this host is flapping) Down 38s 1/15 2015-03-26 11:10:52 CRITICAL - 10.28.128.7: rta nan, lost 100%
Re: time outs while doing check on specific devices
Posted: Thu Mar 26, 2015 10:30 am
by cmerchant
What check are you using on the devices?
Can you run the check from the command line on the XI server?
Are these checks done against a specific IP address or against a hostname resolved by DNS?
Re: time outs while doing check on specific devices
Posted: Thu Mar 26, 2015 12:18 pm
by bosecorp
check_icmp
If I run it from the command line, it works.
It's done by IP.
Re: time outs while doing check on specific devices
Posted: Thu Mar 26, 2015 1:31 pm
by tgriep
Can you run check_icmp from the command line on the server that the check is running on and post the output here?
Also, can you post how the check_icmp command is configured?
Re: time outs while doing check on specific devices
Posted: Thu Mar 26, 2015 6:05 pm
by bosecorp
This is how it is configured:
$USER1$/check_ping -H $HOSTADDRESS$ -w $ARG1$ -c $ARG2$
Re: time outs while doing check on specific devices
Posted: Fri Mar 27, 2015 10:52 am
by cmerchant
Could you try adding a timeout parameter -t 30:
$USER1$/check_ping -H $HOSTADDRESS$ -w 3000.0,80% -c 5000.0,100% -p 5 -t 30
Re: time outs while doing check on specific devices
Posted: Fri Mar 27, 2015 10:59 am
by bosecorp
What are those things for?
Also, it did not work.
I just saw a device go to Down when it is not actually down. However, it comes right back up.
Re: time outs while doing check on specific devices
Posted: Fri Mar 27, 2015 11:20 am
by cmerchant
I should have left the hard-coded thresholds out of the example and kept the $ARG1$ and $ARG2$ macros. The -t 30 timeout parameter gives the check more time to complete, which should reduce false positives:
$USER1$/check_ping -H $HOSTADDRESS$ -w $ARG1$ -c $ARG2$ -t 30
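As for what the options are for, here is a minimal sketch of how that command line might sit in a Nagios object definition, with each option annotated. The command name check_ping_t30, the host name mydevice, and the generic-host template are assumptions for illustration and do not come from this thread. The threshold values are the ones from the earlier example.

```
# Hypothetical command definition; the name check_ping_t30 is an assumption.
define command {
    command_name    check_ping_t30
    # -H      : address of the host to ping
    # -w, -c  : warning / critical thresholds, written as <rta in ms>,<packet loss>%
    # -t 30   : plugin timeout in seconds (check_ping defaults to 10)
    command_line    $USER1$/check_ping -H $HOSTADDRESS$ -w $ARG1$ -c $ARG2$ -t 30
}

# Hypothetical host using that command. The values after each "!" become
# $ARG1$ (warning) and $ARG2$ (critical) in the command line above.
define host {
    use             generic-host    ; assumed template name
    host_name       mydevice
    address         10.28.128.7
    check_command   check_ping_t30!3000.0,80%!5000.0,100%
}
```

With these thresholds the check would go to WARNING at a round-trip average of 3000 ms or 80% packet loss, and to CRITICAL at 5000 ms or 100% loss. The -p 5 in the earlier example sets the number of ICMP echo packets sent per check.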