time outs while doing check on specific devices

Posted: Thu Mar 26, 2015 8:03 am
by bosecorp
I am getting timeouts/alerts when Nagios runs checks on specific devices. When I ping them from my desktop or from the worker, it works fine. This only seems to be happening for 4 specific devices.

At first I thought it had something to do with the devices themselves. However, these devices are in different geographic locations, and in the same locations I have hundreds of devices that are not having any problems.

Re: time outs while doing check on specific devices

Posted: Thu Mar 26, 2015 9:14 am
by tmcdonald
What type of device is this?

What type of check?

Can you show the specific timeout message?

Re: time outs while doing check on specific devices

Posted: Thu Mar 26, 2015 10:12 am
by bosecorp
It just happens with a device; it seems to happen with network devices.

Host: "mydevice name" (this host is flapping)
Status: Down
Duration: 38s
Attempt: 1/15
Last check: 2015-03-26 11:10:52
Status information: CRITICAL - 10.28.128.7: rta nan, lost 100%

Re: time outs while doing check on specific devices

Posted: Thu Mar 26, 2015 10:30 am
by cmerchant
What check are you using on the devices?

Can you run the check from the command line on the XI server?

Are these checks done against a specific IP address or a hostname resolved by DNS?

Re: time outs while doing check on specific devices

Posted: Thu Mar 26, 2015 12:18 pm
by bosecorp
check_icmp

If I run it from the command line, it works.

It's done by IP.

Re: time outs while doing check on specific devices

Posted: Thu Mar 26, 2015 1:31 pm
by tgriep
Can you run check_icmp from the command line on the server that the check is running on and post the output here?
Also, can you post how the check_icmp command is configured?
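
A minimal sketch of running the plugin by hand so the raw output and exit code can be posted. The plugin directory `/usr/local/nagios/libexec`, the helper names `build_check_command` and `run_check`, and the threshold values are assumptions for illustration; substitute whatever the command definition actually uses.

```python
import subprocess

# Assumed plugin directory; a common default, but it may differ on your system.
PLUGIN_DIR = "/usr/local/nagios/libexec"


def build_check_command(plugin, host, warn, crit, plugin_dir=PLUGIN_DIR):
    """Return the argument list for a manual run of a ping-style plugin."""
    return [f"{plugin_dir}/{plugin}", "-H", host, "-w", warn, "-c", crit]


def run_check(argv):
    """Run the plugin and return (exit_code, first-hand plugin output)."""
    result = subprocess.run(argv, capture_output=True, text=True)
    return result.returncode, result.stdout.strip()
```

For example, `run_check(build_check_command("check_icmp", "10.28.128.7", "3000.0,80%", "5000.0,100%"))` returns the plugin's exit code and status line. Nagios plugins exit with 0 for OK, 1 for WARNING, 2 for CRITICAL and 3 for UNKNOWN. Running this as the same user the monitoring process uses (often `nagios`, which is an assumption here) keeps the test closest to the real check.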

Re: time outs while doing check on specific devices

Posted: Thu Mar 26, 2015 6:05 pm
by bosecorp
This is how it is configured:

$USER1$/check_ping -H $HOSTADDRESS$ -w $ARG1$ -c $ARG2$

Re: time outs while doing check on specific devices

Posted: Fri Mar 27, 2015 10:52 am
by cmerchant
Could you try adding a timeout parameter -t 30:

$USER1$/check_ping -H $HOSTADDRESS$ -w 3000.0,80% -c 5000.0,100% -p 5 -t 30
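
For reference, in check_ping the `-w` and `-c` values are the warning and critical thresholds, each written as `<round trip average in ms>,<packet loss>%`; `-p` is the number of ICMP packets to send and `-t` is the plugin timeout in seconds. The sketch below is a simplified model of how such thresholds map to a state, not the plugin's actual source; the function names are hypothetical and the plugin's exact comparison rules may differ.

```python
def parse_threshold(text):
    """Split a '<rta>,<loss>%' threshold into (rta_ms, loss_percent)."""
    rta, loss = text.split(",")
    return float(rta), float(loss.rstrip("%"))


def classify(rta_ms, loss_pct, warn, crit):
    """Return the state a ping-style check would report for one result."""
    w_rta, w_loss = parse_threshold(warn)
    c_rta, c_loss = parse_threshold(crit)
    # Comparisons against NaN are always False, so a result like
    # "rta nan" can only be flagged through its packet loss.
    if loss_pct >= c_loss or rta_ms >= c_rta:
        return "CRITICAL"
    if loss_pct >= w_loss or rta_ms >= w_rta:
        return "WARNING"
    return "OK"
```

Under this model, a result of "rta nan, lost 100%" is CRITICAL with the thresholds above because 100% loss meets the critical loss threshold, regardless of the timeout value.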

Re: time outs while doing check on specific devices

Posted: Fri Mar 27, 2015 10:59 am
by bosecorp
What are those things for?

And it did not work.

I just saw a device go to down when it is not actually down. However, it comes right back up.

Re: time outs while doing check on specific devices

Posted: Fri Mar 27, 2015 11:20 am
by cmerchant
I should have left the hard-coded values out of the example and kept the $ARG1$ and $ARG2$ macros. The timeout parameter is meant to help eliminate false positives:

$USER1$/check_ping -H $HOSTADDRESS$ -w $ARG1$ -c $ARG2$ -t 30