check_nrpe returning error code 255 when connection refused

cgoerner
Posts: 2
Joined: Wed Apr 09, 2014 1:11 am

Post by cgoerner »

I have been trialling the Nagios XI appliance (2014R2.7) and have noticed that check_nrpe returns error code 255 if the remote nrpe agent is not running. I would have expected this to be a 2 (CRITICAL).

For example:

[root@nagiosxi-64 ~]# /usr/local/nagios/libexec/check_nrpe -H MYIPADDRESS
connect to address MYIPADDRESS port 5666: Connection refused
connect to host MYIPADDRESS port 5666: Connection refused

[root@nagiosxi-64 ~]# echo $?
255

MYIPADDRESS is a server with the nrpe agent installed but *not running*. When the agent is running, it works as expected. The problem is specifically that when check_nrpe gets "connection refused", it returns 255.

When Nagios executes this check, it isn't able to interpret 255, so the check status is:
(Return code of 255 is out of bounds)
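For context, Nagios plugins are expected to exit with 0 (OK), 1 (WARNING), 2 (CRITICAL), or 3 (UNKNOWN), which is why 255 is reported as out of bounds. A minimal sketch of a wrapper that maps any other exit code to UNKNOWN is shown here; the function names `normalize_status` and `run_normalized` are hypothetical and not part of check_nrpe or Nagios XI.

```shell
#!/bin/bash
# Hypothetical wrapper sketch: map out-of-range plugin exit codes to UNKNOWN (3).
# Valid Nagios plugin states: 0=OK, 1=WARNING, 2=CRITICAL, 3=UNKNOWN.

# Print the state that Nagios should see for a given raw exit code.
normalize_status() {
    local rc="$1"
    case "$rc" in
        0|1|2|3) echo "$rc" ;;  # already a valid plugin state
        *)       echo 3 ;;      # anything else (e.g. 255) becomes UNKNOWN
    esac
}

# Run an arbitrary plugin command line and return its normalized state.
run_normalized() {
    local rc
    "$@"
    rc=$?
    return "$(normalize_status "$rc")"
}
```

In a wrapper script this might be called as `run_normalized /usr/local/nagios/libexec/check_nrpe -H MYIPADDRESS; exit $?`, with the wrapper registered as the check command in place of check_nrpe. Mapping to 3 rather than 2 is a choice made for this sketch; change the fallback if CRITICAL is preferred.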

The version of check_nrpe included with the appliance is 2.15.

It looks like this problem has been addressed in this pull request from October 2014:
https://github.com/NagiosEnterprises/nrpe/pull/13

Is this fix likely to be incorporated into the version of check_nrpe that is shipped with Nagios XI?

Many thanks.

--
Chris
abrist
Red Shirt
Posts: 8334
Joined: Thu Nov 15, 2012 1:20 pm

Re: check_nrpe returning error code 255 when connection refu

Post by abrist »

cgoerner wrote: Is this fix likely to be incorporated into the version of check_nrpe that is shipped with Nagios XI?
I am not sure if it is, but it can easily be patched:

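# Fetch and unpack the current NRPE source
cd /tmp
wget https://github.com/NagiosEnterprises/nrpe/archive/master.zip
unzip master.zip
# Apply the patch from inside src/ (patch without -p matches files by basename)
cd nrpe-master/src/
wget https://github.com/SteveLowe/nrpe/commit/a788d94b0c9dd4e130c1b06339f947774a798560.patch -O 255.patch
patch < 255.patch
# Build and install from the top of the source tree
cd ..
./configure
make
make install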
jdalrymple
Skynet Drone
Posts: 2620
Joined: Wed Feb 11, 2015 1:56 pm

Re: check_nrpe returning error code 255 when connection refu

Post by jdalrymple »

This was just discussed in another thread in the customer-only forum, so the knowledge is very fresh in my brain.

It's very likely that 255 was used for a reason. That reason is likely that check_nrpe is not intended to check whether the NRPE service itself is running. Its intent is to check disk, memory, or something else on a remote host. So, while this is more philosophical in nature, some might argue that because NRPE is down, any result we gave back about the state of the disk or memory check would be invalid.

Personally, I think it should return UNKNOWN, and I've modified the code in my personal repositories to do that. However, we all have our own opinions. What I'm saying is that I wouldn't count on that pull request being included, and furthermore, who knows when the next NRPE update will be. If you want the change made, I would consider making it yourself and then using your own NRPE source for your systems.

FWIW, the alternative that our customer chose was to add a check_tcp service and make the check_nrpe services depend on it through a service dependency. This is also a very logical solution, just a bit of legwork.
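As a rough illustration of that dependency approach, the underlying Nagios object definitions might look something like the following. The host name, service descriptions, and `generic-service` template are placeholders, and the sketch assumes a `check_tcp` command that takes the port as its first argument; in Nagios XI the same objects would normally be created through the configuration interface rather than edited by hand.

```
# Hypothetical object definitions; host and service names are placeholders.
define service {
    use                     generic-service     ; assumed template
    host_name               remote-host
    service_description     NRPE Port
    check_command           check_tcp!5666      ; assumes the port is passed as $ARG1$
}

define servicedependency {
    host_name                       remote-host
    service_description             NRPE Port
    dependent_host_name             remote-host
    dependent_service_description   Disk Usage
    execution_failure_criteria      c,u     ; skip the NRPE check while the port check is CRITICAL or UNKNOWN
    notification_failure_criteria   c,u     ; and suppress its notifications
}
```

With this in place, a stopped NRPE agent shows up as one CRITICAL "NRPE Port" service, and the dependent NRPE checks are not executed while the port is down.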
cgoerner
Posts: 2
Joined: Wed Apr 09, 2014 1:11 am

Re: check_nrpe returning error code 255 when connection refu

Post by cgoerner »

Thanks abrist and jdalrymple.

That philosophy makes some sense, but UNKNOWN does seem to be a better choice.

It sounds like a workaround is going to be the best solution.

Thanks for your help. Much appreciated.