Plugin time out error

barney
Posts: 10
Joined: Wed Jan 15, 2014 5:42 am

Plugin time out error

Post by barney »

Hi guys, bit of a newbie question, so please go easy on me…

Nagios was installed and configured (on our Unix servers) by someone who’s not with us anymore and I’m now looking after it with limited knowledge (but I’m getting better).

We have a Linux server that has only one check, which is to see whether it is up. We use the supplied plugin “check_ping”, run with the following parameters…

check_ping -H <host_address> -w 100.0,20% -c 500.0,60% -L

For some reason it fails with an error of “CRITICAL - Plugin timed out after 10 seconds”.

The server is up and I can ping it OK. Just to confuse matters, when I run the plugin from the command line it works fine. I have also written my own very simple plugin that just does a ping, and that works both from the command line and through the GUI.

The same plugin (check_ping) is run against every server on site and doesn’t fail, so I would naturally think that it is an issue with this one server. But it never used to do this, and I don’t know where to look.

Any ideas why the job is failing through the GUI but not the command line?

Sorry if this is a bit daft/obvious, but any pointers/advice would be very welcome.

Cheers
abrist
Red Shirt
Posts: 8334
Joined: Thu Nov 15, 2012 1:20 pm

Re: Plugin time out error

Post by abrist »

How long does a normal ping to the system take - longer than 10 seconds?
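
One way to answer this is to time the check itself rather than a bare ping. The sketch below is only an illustration: `time_check` is a hypothetical helper name, and `sleep 1` stands in for the plugin so the snippet runs anywhere. The 10 seconds in the error is commonly the plugin's own default timeout (the `-t` option), so a run approaching that length would be worth noting.

```shell
#!/bin/sh
# time_check (hypothetical helper): run a command, discard its output, and
# report its exit status plus elapsed wall-clock time in whole seconds.
time_check() {
    start=$(date +%s)
    "$@" >/dev/null 2>&1
    status=$?
    end=$(date +%s)
    echo "status=$status elapsed=$((end - start))s"
}

# Stand-in command so the sketch runs without Nagios installed. On the
# Nagios server you would pass the real check instead, for example:
#   time_check /app/nagios/libexec/check_ping -H <host_address> -w 100.0,20% -c 500.0,60%
time_check sleep 1
```

The comparison is only meaningful if the arguments match the ones Nagios is configured to use.
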
Former Nagios employee
barney
Posts: 10
Joined: Wed Jan 15, 2014 5:42 am

Re: Plugin time out error

Post by barney »

abrist wrote: How long does a normal ping to the system take - longer than 10 seconds?
Hi abrist, I ran some tests from the command line. Every run took approx. 4 seconds and returned the following output...

./check_ping -H nile -w 3000.0,80% -c 5000.0,100% -p 10 -t 50
PING OK - Packet loss = 0%, RTA = 0.46 ms|rta=0.459500ms;3000.000000;5000.000000;0.000000 pl=0%;80;100;0
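
For readers comparing this with the GUI values further down: everything after the `|` is performance data, where each item has the form `label=value;warn;crit;min`. A minimal sketch for pulling one value out of such a line follows; `perf_value` is a hypothetical helper name, not part of Nagios.

```shell
#!/bin/sh
# perf_value (hypothetical helper): print the value of one perfdata label.
#   $1 = full plugin output line, $2 = label (e.g. rta or pl)
perf_value() {
    printf '%s\n' "$1" |
        sed 's/^[^|]*|//' |      # drop the human-readable part before the pipe
        tr ' ' '\n' |            # one perfdata item per line
        sed -n "s/^$2=\([^;]*\);.*/\1/p"
}

# The output line quoted above, copied verbatim.
line='PING OK - Packet loss = 0%, RTA = 0.46 ms|rta=0.459500ms;3000.000000;5000.000000;0.000000 pl=0%;80;100;0'

perf_value "$line" rta   # prints 0.459500ms
perf_value "$line" pl    # prints 0%
```

The warn and crit fields (3000 and 5000 for `rta`, 80 and 100 for `pl`) are the thresholds that were passed with `-w` and `-c`.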


However, the Service State Information from the failed job in the GUI displays...

Host Status: DOWN (for 6d 6h 17m 32s)
Status Information: PING CRITICAL - Packet loss = 100%
Performance Data: rta=5000.000000ms;3000.000000;5000.000000;0.000000 pl=100%;80;100;0
Current Attempt: 1/10 (HARD state)
Last Check Time: 16-01-2014 08:38:02
Check Type: ACTIVE
Check Latency / Duration: 2.968 / 14.196 seconds


As I said before, this never used to happen, and I don't understand why it isn't failing on the command line as well.

I've had a look for any logs that could throw some light on this, but I can't find any.

Any pointers? Thanks in advance.
slansing
Posts: 7698
Joined: Mon Apr 23, 2012 4:28 pm
Location: Travelling through time and space...

Re: Plugin time out error

Post by slansing »

Can you copy the host definition to which that ping service is attached and share it here?
barney
Posts: 10
Joined: Wed Jan 15, 2014 5:42 am

Re: Plugin time out error

Post by barney »

slansing wrote: Can you copy the host definition to which that ping service is attached and share it here?
Hi slansing,

Thanks for the reply, but I'm not allowed to post the server name and IP address of the server on a public forum.

Back to square one, I'm afraid.

Cheers
tmcdonald
Posts: 9117
Joined: Mon Sep 23, 2013 8:40 am

Re: Plugin time out error

Post by tmcdonald »

Well of course you can edit out the IP in the form of xxx.xxx.xxx.xxx and change the server name to <server name>, so back to whatever square we were just in.
Former Nagios employee
barney
Posts: 10
Joined: Wed Jan 15, 2014 5:42 am

Re: Plugin time out error

Post by barney »

tmcdonald wrote: Well of course you can edit out the IP in the form of xxx.xxx.xxx.xxx and change the server name to <server name>, so back to whatever square we were just in.
OK, is this what you mean...

Extract from hosts.cfg

define host{
        use             linux-server
        host_name       nile
        alias           nile
        address         xxxx.xxxx.xxxx.xxxx
        parents         pollux
        }
Extract from services.cfg

define service{
        use                     local-service
        host_name               *
        service_description     Ping Check
        check_command           check_ping!100.0,20%!500.0,60%
        }
Extract from commands.cfg

# 'check-host-alive' command definition
define command{
        command_name    check-host-alive
        command_line    $USER1$/check_ping -H $HOSTADDRESS$ -w 3000.0,80% -c 5000.0,100% -p 5
        }
N.B. $USER1$=/app/nagios/libexec
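
Nagios substitutes the `$...$` macros before it runs a command, so it can help to spell out the expanded command line when comparing it with a manual test. The sketch below does that substitution for the check-host-alive definition above. `expand_macros` is a hypothetical helper, the `$USER1$` value is the one from the N.B., and `192.0.2.10` is a documentation-range placeholder because the real address is redacted.

```shell
#!/bin/sh
# expand_macros (hypothetical helper): substitute the two macros used in the
# check-host-alive definition.
#   $1 = command_line, $2 = value of $USER1$, $3 = host address
expand_macros() {
    printf '%s\n' "$1" |
        sed -e 's|\$USER1\$|'"$2"'|g' -e 's|\$HOSTADDRESS\$|'"$3"'|g'
}

# command_line copied from commands.cfg above; single quotes keep the
# dollar signs literal.
cmd='$USER1$/check_ping -H $HOSTADDRESS$ -w 3000.0,80% -c 5000.0,100% -p 5'

# 192.0.2.10 is a placeholder, not the real (redacted) address.
expand_macros "$cmd" /app/nagios/libexec 192.0.2.10
# prints /app/nagios/libexec/check_ping -H 192.0.2.10 -w 3000.0,80% -c 5000.0,100% -p 5
```

Note that this expanded line uses `-p 5` and no `-t`, whereas the manual test earlier in the thread used `-p 10 -t 50`.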

Is that any use to you?

Thanks in advance
slansing
Posts: 7698
Joined: Mon Apr 23, 2012 4:28 pm
Location: Travelling through time and space...

Re: Plugin time out error

Post by slansing »

Can you post the output of the following commands in code-wrap? I have used your post above as an example:

ls -la /app/nagios/libexec/
barney
Posts: 10
Joined: Wed Jan 15, 2014 5:42 am

Re: Plugin time out error

Post by barney »

slansing wrote: Can you post the output of the following commands in code-wrap? I have used your post above as an example:

ls -la /app/nagios/libexec/
Hi slansing, before I do that, do you need the entire listing? There are a lot of bespoke scripts in that directory, developed by our site, and it would make for a rather large post.

Would it be easier if I narrowed the list down to something specific?

thanks
slansing
Posts: 7698
Joined: Mon Apr 23, 2012 4:28 pm
Location: Travelling through time and space...

Re: Plugin time out error

Post by slansing »

I just wanted to verify that the plugins were actually installed in that directory. Can you run the following and show the output?

ls -la /app/nagios/libexec/check_ping

/app/nagios/libexec/check_ping -H <hostaddress> -w 100.0,20% -c 500.0,60% -L
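
When posting the output of a run like this, the exit status is worth including too, because Nagios derives the state from it: 0 is OK, 1 is WARNING, 2 is CRITICAL and 3 is UNKNOWN. A minimal sketch of that mapping follows. `state_name` is a hypothetical helper, and `sh -c 'exit 2'` stands in for the plugin so the snippet runs anywhere.

```shell
#!/bin/sh
# state_name (hypothetical helper): translate a plugin exit status into the
# state name Nagios reports for it.
state_name() {
    case "$1" in
        0) echo OK ;;
        1) echo WARNING ;;
        2) echo CRITICAL ;;
        3) echo UNKNOWN ;;
        *) echo "UNEXPECTED($1)" ;;
    esac
}

# Stand-in for a plugin run that ends with exit status 2.
sh -c 'exit 2'
state_name $?    # prints CRITICAL

# On the Nagios server the real check would go in place of the stand-in.
# Assumption: the daemon runs as an account named "nagios" and sudo is
# available; running the check as that account is closer to what the GUI sees:
#   sudo -u nagios /app/nagios/libexec/check_ping -H <hostaddress> -w 100.0,20% -c 500.0,60% -L
#   state_name $?
```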