Nagios/NRPE Return code 127 error

Post by 00_kl250 »

First I just wanted to say thank you in advance to anyone that reads this and is able to provide some guidance.

Over the past few days I've been beating my head against the desk trying to get Nagios Core 3.3.1, running on CentOS 6, to monitor a remote machine running SLES 11 SP2 via NRPE. Here are all the details:

Nagios server running Core 3.3.1. IP address is 192.168.0.29. The firewall is turned off on this machine. When I go to the /usr/local/nagios/libexec directory and run ./check_nrpe -H 192.168.0.154 (the client I'm trying to monitor), I get the correct return value, NRPE v2.14.

From my nagios server I can telnet to the client over port 5666.

My client is running SLES 11 SP2 with the firewall turned off. nrpe-2.14 and nagios plugins 1.4.16 are installed. From the client I can ping the server (192.168.0.29) and telnet to it on port 80, but I can't telnet to it on port 5666.

On the Nagios server I'm trying to monitor two things on the client, "Apache Check" and "total processes". On the client, I have these defined in the nrpe.cfg file located at /usr/local/nagios/etc/nrpe.cfg:

NRPE.CFG
command[check_apache]=/usr/local/nagios/libexec/check_procs -c 1:30 -C httpd2-prefork
command[check_total_procs]=/usr/local/nagios/libexec/check_procs -w 150 -c 200

On the server I have the following commands defined in the commands.cfg file:

# 'check_apache' command definition
define command{
command_name check_apache
command_line check_nrpe!check_apache
}

# 'check_processes' command definition
define command{
command_name check_total_procs
command_line check_nrpe!check_total_procs
}

In my windows.cfg file, I have the services defined:


define service{
use generic-service
hostgroup_name suse-servers
service_description apache check
check_command check_apache
}

define service{
use generic-service
hostgroup_name suse-servers
service_description total processes
check_command check_total_procs
}

When I run the pre-flight check I get 0 errors and 0 warnings. In my Nagios dashboard, the two services "Apache Check" and "Total Processes" show the following in the status information column: (Return code of 127 is out of bounds - plugin may be missing).

I have googled the crap out of the internet and tried everything I've come across. I'm almost to the point of wiping the client machine and starting from scratch. The only thing that jumps out at me is that I cannot telnet to port 5666 on the server from the client, even though the firewall is off on my Nagios server (running CentOS 6).

Any help would be much appreciated.

Post by slansing »

When you run:

/usr/local/nagios/libexec/check_procs -c 1:30 -C httpd2-prefork

locally on the SLES machine, what does it return?

Have you made sure the plugins you are using are executable?
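
A quick way to tell "plugin missing" from "plugin not executable" is the exit status the shell hands back. The sketch below is only an illustration, assuming a POSIX-style shell with mktemp available; fake_plugin and missing_plugin are made-up stand-ins, not real Nagios plugins.

```shell
# Scratch directory holding a stand-in "plugin" (hypothetical, for illustration only).
tmpdir=$(mktemp -d)
printf '#!/bin/sh\necho "PROCS OK"\nexit 0\n' > "$tmpdir/fake_plugin"

# Case 1: the file exists but has no execute bit; the shell reports 126.
chmod a-x "$tmpdir/fake_plugin"
rc_not_executable=0
"$tmpdir/fake_plugin" >/dev/null 2>&1 || rc_not_executable=$?

# Case 2: the file does not exist at all; the shell reports 127.
rc_missing=0
"$tmpdir/missing_plugin" >/dev/null 2>&1 || rc_missing=$?

# Case 3: the file is executable; the status comes from the plugin itself (0 = OK).
chmod +x "$tmpdir/fake_plugin"
rc_ok=0
"$tmpdir/fake_plugin" >/dev/null 2>&1 || rc_ok=$?

echo "not executable: $rc_not_executable, missing: $rc_missing, ok: $rc_ok"
rm -rf "$tmpdir"
```

A 127 therefore points at a command that could not be found rather than one with the wrong permissions, which would normally show up as 126.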

Post by 00_kl250 »

Thanks for the response. When I run the command I get the following:

PROCS OK: 6 processes with command name 'httpd2-prefork'

The plugins do seem to be executable, as they run fine locally.

Post by gshergill »

Hi 00_kl250,

Just something to mention: I can't telnet from any of my remote servers (Windows or Linux) to my Nagios machines, but my Nagios machines can telnet to them (all over 5666). Can your Nagios machine telnet to your remote server?

If not, maybe the firewall is blocking it on your remote server?

From the command line on the Nagios server, run:

/usr/local/nagios/libexec/check_nrpe -H <remote server ip>

This should return the NRPE version number. Make sure to change the path to the plugin as appropriate.

Good luck!

Kind Regards,

Gary Shergill

Post by 00_kl250 »

Hi gshergill,

I can telnet to port 5666 from the server to the client.

I CAN'T telnet to port 5666 from the client to the server. I can however telnet to other ports on the server from the client such as port 80.

BTW, when I run the following command on the server:

/usr/local/nagios/libexec/check_nrpe -H 192.168.0.154

I get NRPE v2.14, which is what it should return.

Post by 00_kl250 »

I don't know if this helps at all, but on the Nagios server, in /usr/local/nagios/var, I ran tail -f nagios.log. When I do a "force check on all services" in the Nagios GUI, I get this output in the log file:

[1366122754] EXTERNAL COMMAND: SCHEDULE_FORCED_HOST_SVC_CHECKS;SLES-TEST2;1366122754
[1366122763] Warning: Return code of 127 for check of service 'total processes' on host 'SLES-TEST2' was out of bounds. Make sure the plugin you're trying to run actually exists.
[1366122763] Warning: Return code of 127 for check of service 'apache check' on host 'SLES-TEST2' was out of bounds. Make sure the plugin you're trying to run actually exists.
[1366122763] Warning: Return code of 127 for check of service 'Disk Check' on host 'SLES-TEST2' was out of bounds. Make sure the plugin you're trying to run actually exists.


I do know the plugins exist on the client because I was able to run them in the post above.

Post by gshergill »

Hi 00_kl250,

Sorry, my bad. I didn't realise you had answered my other questions in your first post; I totally missed it!

I was able to reproduce your issue, the following should fix it:

commands.cfg (change path as required):

define command {
       command_name     check_nrpe
       command_line     /usr/local/nagios/libexec/check_nrpe -H $HOSTADDRESS$ -t 30 -c $ARG1$
}

services.cfg:

define service{
use generic-service
hostgroup_name suse-servers
service_description apache check
check_command check_nrpe!check_apache
}

As far as I know, the command_line can't use the "!" syntax that check_command uses to separate arguments. I'm guessing Nagios tried to run a plugin literally called "check_nrpe!check_apache", which of course doesn't exist. I confirmed this by trying your command_line on my test system (which has NRPE working).
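
To illustrate where the 127 comes from, here is a minimal sketch, assuming a POSIX shell and assuming nothing named "check_nrpe!check_apache" exists on the PATH. It simply hands the broken command_line string to a shell as a single command name and prints the status that comes back.

```shell
# Run the original command_line text as-is. The shell treats the whole string
# 'check_nrpe!check_apache' as one command name (assumed not to exist on PATH).
rc=0
sh -c 'check_nrpe!check_apache' 2>/dev/null || rc=$?

# A shell reports "command not found" with status 127.
echo "exit code: $rc"
```

Nagios only accepts plugin return codes 0 through 3, so a 127 from the shell gets reported as "out of bounds".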

Kind Regards,

Gary Shergill

Post by slansing »

If you were to call the check remotely from Nagios, how would you define the check? I.e.:

/usr/local/nagios/libexec/check_nrpe -H 192.168.0.154 -p 5666 -c check_procs '-w 150 -c 200'

Edit: Good catch Gary. Yes, the command_line can't use "!" as it is not valid there. If you separate anything with "!" in the check_command, you need a matching $ARGn$ macro in the command_line, at a ratio of one "!" to one $ARGn$: as Gary showed, the value after the first "!" is passed as $ARG1$, the value after the second as $ARG2$, and so on. This should help in the future as well!
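
The "!" to $ARGn$ mapping can be sketched in a few lines of shell. This is only an illustration of the idea: Nagios does the expansion internally, and the variable names and the use of cut and sed here are my own assumptions, not anything Nagios runs.

```shell
# Values taken from the configs in this thread; single quotes keep "$" and "!" literal.
check_command='check_nrpe!check_apache'
command_line='/usr/local/nagios/libexec/check_nrpe -H $HOSTADDRESS$ -t 30 -c $ARG1$'
host_address='192.168.0.154'

# The field before the first "!" is the command name; the next field becomes $ARG1$.
command_name=$(printf '%s\n' "$check_command" | cut -d '!' -f 1)
arg1=$(printf '%s\n' "$check_command" | cut -d '!' -f 2)

# Substitute the macros into the command_line template (illustration only).
expanded=$(printf '%s\n' "$command_line" \
  | sed -e 's|\$HOSTADDRESS\$|'"$host_address"'|' \
        -e 's|\$ARG1\$|'"$arg1"'|')

echo "command name: $command_name"
echo "expanded:     $expanded"
```

With the values above, the expanded line is the same check_nrpe call that already works from the command line, with "-c check_apache" at the end.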

Post by gshergill »

Hi 00_kl250,

Alternatively, you can use the following:

define command {
       command_name     check_apache
       command_line     /usr/local/nagios/libexec/check_nrpe -H $HOSTADDRESS$ -t 30 -c check_apache
}

define command {
       command_name     check_total_procs
       command_line     /usr/local/nagios/libexec/check_nrpe -H $HOSTADDRESS$ -t 30 -c check_total_procs
}

define service{
use generic-service
hostgroup_name suse-servers
service_description apache check
check_command check_apache
}

define service{
use generic-service
hostgroup_name suse-servers
service_description total processes
check_command check_total_procs
}

Both should work fine, but this one is more in line with what you currently have.

Kind Regards,

Gary Shergill

Post by 00_kl250 »

This works perfectly! Thanks so much for your help!

One question: what does the "-t 30" argument do?