Plugin time out error
Hi guys, bit of a newbie question please go easy on me…
Nagios was installed and configured (on our Unix servers) by someone who’s not with us anymore and I’m now looking after it with limited knowledge (but I’m getting better).
We have a Linux server that has only one check, which is to see whether it is up. We use the supplied plugin “check_ping”, run with the following parameters…
check_ping -H <host_address> -w 100.0,20% -c 500.0,60% -L
For some reason it fails with an error of “CRITICAL - Plugin timed out after 10 seconds”
The server is up and I can ping it OK. Just to confuse matters, when I run the script from the command line it works fine. I have also written my own very simple plugin that just does a ping, and that works both from the command line and through the GUI.
The same script (check_ping) is run on every server on site and doesn’t fail, so I would naturally think that it is an issue with this one server, but it never used to do this and I don’t know where to look.
Any ideas why the job is failing through the GUI but not the command line?
Sorry if this is a bit daft/obvious but any pointers/advice would be very welcome
Cheers
Re: Plugin time out error
How long does a normal ping to the system take - longer than 10 seconds?
Former Nagios employee
"It is turtles. All. The. Way. Down. . . .and maybe an elephant or two."
VI VI VI - The editor of the Beast!
Come to the Dark Side.
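To put a number on that question, a small wrapper can report how long a check takes and which exit code it returns. This is only a sketch: `time_check` is a made-up helper name, not part of Nagios or the plugins package, and it measures whole seconds only.

```shell
# Hypothetical helper: run any check command, print its output,
# then print the exit code and the elapsed time in whole seconds.
# Nagios plugins signal state by exit code:
# 0=OK, 1=WARNING, 2=CRITICAL, 3=UNKNOWN.
time_check() {
    start=$(date +%s)
    output=$("$@" 2>&1)      # capture stdout and stderr of the check
    status=$?                # exit code of the check itself
    end=$(date +%s)
    printf '%s\n' "$output"
    printf 'exit=%s elapsed=%ss\n' "$status" "$((end - start))"
    return "$status"
}
```

From the plugin directory it could be called as `time_check ./check_ping -H <host_address> -w 100.0,20% -c 500.0,60% -L`, ideally while logged in as the account Nagios runs under (often `nagios`, but that is an assumption about this install), since a check can behave differently for a different user.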
Re: Plugin time out error
abrist wrote: How long does a normal ping to the system take - longer than 10 seconds?
Hi abrist, I ran some tests and from the command line every time it was approx. 4 seconds, returning the following output...
./check_ping -H nile -w 3000.0,80% -c 5000.0,100% -p 10 -t 50
PING OK - Packet loss = 0%, RTA = 0.46 ms|rta=0.459500ms;3000.000000;5000.000000;0.000000 pl=0%;80;100;0
however Service State Information from the failed job in the GUI displays...
Host Status: DOWN (for 6d 6h 17m 32s)
Status Information: PING CRITICAL - Packet loss = 100%
Performance Data: rta=5000.000000ms;3000.000000;5000.000000;0.000000 pl=100%;80;100;0
Current Attempt: 1/10 (HARD state)
Last Check Time: 16-01-2014 08:38:02
Check Type: ACTIVE
Check Latency / Duration: 2.968 / 14.196 seconds
As I said before, this never used to happen, but I don't understand why it isn't failing on the command line as well.
I've had a look for any logs that could throw some light on this but I can't find any.
Any pointers? Thanks in advance.
Re: Plugin time out error
Can you copy the host definition to which that ping service is attached and share it here?
Re: Plugin time out error
slansing wrote: Can you copy the host definition to which that ping service is attached and share it here?
Hi slansing,
thanks for the reply. I'm not allowed to post the server name and IP address of the server on a public forum.
Back to square one, I'm afraid.
Cheers
Re: Plugin time out error
Well of course you can edit out the IP in the form of xxx.xxx.xxx.xxx and change the server name to <server name>, so back to whatever square we were just in.
Former Nagios employee
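The masking suggested above can also be done mechanically before pasting a config extract. The following is a rough sketch, not a vetted anonymiser: `redact_config` is a hypothetical helper, it only catches dotted IPv4 addresses, and it treats the host name as a plain pattern, so it assumes the name contains no slashes or regex special characters and does not occur inside other words.

```shell
# Hypothetical helper: read config text on stdin, mask IPv4 addresses
# and one given host name, and write the result to stdout.
# $1 = the real host name to hide (assumed to be a simple word).
redact_config() {
    sed -E \
        -e 's/([0-9]{1,3}\.){3}[0-9]{1,3}/xxx.xxx.xxx.xxx/g' \
        -e "s/$1/<server name>/g"
}
```

Something like `redact_config myserver < hosts.cfg` would then print a copy that is safer to post, where `myserver` stands in for the real host name. The output is still worth reading through by eye before it goes on a public forum.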
Re: Plugin time out error
tmcdonald wrote: Well of course you can edit out the IP in the form of xxx.xxx.xxx.xxx and change the server name to <server name>, so back to whatever square we were just in.
OK, is this what you mean...
Extract from hosts.cfg
Code: Select all
define host{
    use         linux-server
    host_name   nile
    alias       nile
    address     xxxx.xxxx.xxxx.xxxx
    parents     pollux
}
Code: Select all
define service{
    use                  local-service
    host_name            *
    service_description  Ping Check
    check_command        check_ping!100.0,20%!500.0,60%
}
Code: Select all
# 'check-host-alive' command definition
define command{
    command_name  check-host-alive
    command_line  $USER1$/check_ping -H $HOSTADDRESS$ -w 3000.0,80% -c 5000.0,100% -p 5
}
is that any use to you?
thanks in advance
Re: Plugin time out error
Can you post the output of the following commands in code-wrap? I have used your post above as an example:
Code: Select all
ls -la /app/nagios/libexec/
Re: Plugin time out error
slansing wrote: Can you post the output of the following commands in code-wrap? I have used your post above as an example:
Code: Select all
ls -la /app/nagios/libexec/
Hi slansing, before I do that, do you need the entire list of output? There are a lot of bespoke scripts in that directory developed by our site and it will make for a rather large post.
Would it be easier if I narrowed the list down to something specific?
thanks
Re: Plugin time out error
I just wanted to verify the plugins were actually installed to that directory. Can you run the following and show the output?
Code: Select all
ls -la /app/nagios/libexec/check_ping
Code: Select all
/app/nagios/libexec/check_ping -H <hostaddress> -w 100.0,20% -c 500.0,60% -L