Nagios/NRPE Return code 127 error
Posted: Tue Apr 16, 2013 7:44 am
First, thank you in advance to anyone who reads this and is able to provide some guidance.
Over the past few days I've been beating my head against the desk trying to get Nagios Core 3.3.1, running on CentOS 6, to monitor a remote machine running SLES 11 SP2 via NRPE. Here are all the details:
The Nagios server is running Core 3.3.1. Its IP address is 192.168.0.29, and the firewall is turned off on this machine. When I go to the /usr/local/nagios/libexec directory and run the following command, ./check_nrpe -H 192.168.0.154 - (this is the client I'm trying to monitor), I get the correct return value, NRPE v2.14.
From my Nagios server I can telnet to the client on port 5666.
My client is running SLES 11 SP2 with the firewall turned off. nrpe-2.14 and Nagios plugins 1.4.16 are installed. From the client I can ping the server (192.168.0.29) and telnet to its port 80, but I can't telnet to its port 5666.
On the Nagios server I'm trying to monitor two things on the client, "Apache Check" and "Total Processes". On the client, I have these defined in the nrpe.cfg file located at /usr/local/nagios/etc/nrpe.cfg:
NRPE.CFG
command[check_apache]=/usr/local/nagios/libexec/check_procs -c 1:30 -C httpd2-prefork
command[check_total_procs]=/usr/local/nagios/libexec/check_procs -w 150 -c 200
On the server I have the following commands defined in the commands.cfg file:
# 'check_apache' command definition
define command{
        command_name    check_apache
        command_line    check_nrpe!check_apache
        }
# 'check_processes' command definition
define command{
        command_name    check_total_procs
        command_line    check_nrpe!check_total_procs
        }
In my windows.cfg file, I have the services defined:
define service{
        use                     generic-service
        hostgroup_name          suse-servers
        service_description     apache check
        check_command           check_apache
        }
define service{
        use                     generic-service
        hostgroup_name          suse-servers
        service_description     total processes
        check_command           check_total_procs
        }
When I run the pre-flight check I get 0 errors and 0 warnings. In my Nagios dashboard, both services, "Apache Check" and "Total Processes", show the following in the Status Information column: (Return code of 127 is out of bounds - plugin may be missing).
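For reference, 127 is the exit status a POSIX shell returns when it cannot find the command it was asked to run. The sketch below is only an illustration of that: it hands the command_line string from commands.cfg to a shell exactly as written. It assumes Nagios passes the string to a shell unchanged, and that no executable literally named "check_nrpe!check_apache" exists on the PATH; both are assumptions, not something confirmed on this setup.

```shell
#!/bin/sh
# Run the command_line value from commands.cfg as a plain shell command.
# Assumption: nothing named "check_nrpe!check_apache" is on PATH, so the
# shell cannot locate it and reports "command not found".
sh -c 'check_nrpe!check_apache' 2>/dev/null
status=$?

# Print the exit status so it can be compared with the dashboard message.
echo "exit status: $status"
```

If this prints 127 on the Nagios server, the "plugin may be missing" message may be about the server failing to find the command it runs locally rather than about anything on the client.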
I have googled this extensively and tried everything I've come across. I'm almost at the point of wiping the client machine and starting from scratch. The only thing that jumps out at me is that I cannot telnet to port 5666 on the server from the client, even though the firewall is off on my Nagios server (running CentOS 6).
Any help would be much appreciated.