For some weird reason, check_nrpe 2.15 returns exit code 255 when it gets a "connection refused" because the NRPE daemon is not running on the other end.
This is really bad because it results in false negatives: the Nagios server shows the check as OK.
See this example:
# with check_nrpe 2.12:
$ /usr/local/nagios/libexec/check_nrpe.2.12 -n -H somehost
Connection refused by host
$ echo $?
2 # NOTE : CORRECT RESPONSE, "2" OR CRITICAL
# with check_nrpe 2.15:
$ /usr/local/nagios/libexec/check_nrpe -n -H somehost
connect to address 192.168.1.5 port 5666: Connection refused
connect to host 192.168.1.5 port 5666: Connection refused
$ echo $?
255 #NOTE: NAGIOS SERVER REGISTERS THIS OK AND GREEN (false negative).
Many thanks for your help!
CP
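For readers following the transcript above, here is a minimal sketch of how the exit statuses map to plugin states. It assumes the usual plugin convention (0 = OK, 1 = WARNING, 2 = CRITICAL, 3 = UNKNOWN). The helper name plugin_state_name and the label OUT_OF_RANGE are made up for illustration and are not part of NRPE or Nagios.

```shell
#!/bin/sh
# plugin_state_name is a hypothetical helper for illustration only.
# It prints the state name for a plugin exit status, assuming the usual
# convention: 0 = OK, 1 = WARNING, 2 = CRITICAL, 3 = UNKNOWN.
# Anything else (such as 255) is labeled OUT_OF_RANGE, a made-up label.
plugin_state_name() {
    case "$1" in
        0) echo "OK" ;;
        1) echo "WARNING" ;;
        2) echo "CRITICAL" ;;
        3) echo "UNKNOWN" ;;
        *) echo "OUT_OF_RANGE" ;;
    esac
}
```

With this loaded, `plugin_state_name 2` prints CRITICAL, while `plugin_state_name 255` prints OUT_OF_RANGE, which is the case this thread is about.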
check_nrpe 2.15 returns
- rexconsulting
- Posts: 60
- Joined: Fri May 04, 2012 4:27 pm
- Location: Oakland, CA
--
Chris Paul
Rex Consulting, Inc
5652 Florence Terrace, Oakland, CA 94611
email: [email protected]
web: http://www.rexconsulting.net
phone, toll-free: +1 (888) 403-8996 ext 1
Re: check_nrpe 2.15 returns
I just tested with NRPE 3 and it works properly:
[root@ssc66c libexec]# ./check_nrpe -H 192.168.4.124
connect to address 192.168.4.124 port 5666: Connection refused
connect to host 192.168.4.124 port 5666: Connection refused
[root@ssc66c libexec]# echo $?
2
Please try that and let us know the results.
Thank you
- rexconsulting
Re: check_nrpe 2.15 returns
Thanks. I checked, and the problem does not exist with check_nrpe 3.0. Along the way I recompiled 2.15 and cannot reproduce the problem with my newly compiled 2.15, which is really odd. Something must have been different in the original compile. Note line 158 of the 2.15 version of "check_nrpe.c": this is the line with "exit (255)", which produces the exit code 255. Why does it even do that? The number 255 is not a valid return code for the Nagios server. It should only be 0, 1, 2, or 3, right?
There is also a higher-level issue: Nagios is OK with this non-zero return code. Of course a lot of people are going to want to set up servicedependencies so that NRPE-based checks depend on NRPE. So if NRPE is down and exit code 255 is returned, then NRPE and all the services that depend on it show OK/green. They are not being checked since NRPE is not running, but there is no alarm or notification.
CP
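As a possible stopgap for the out-of-range exit code described above, a small wrapper could pass statuses 0 through 3 through unchanged and report anything else as UNKNOWN. This is only a sketch: run_normalized is a hypothetical name, and mapping to 3 (UNKNOWN) rather than 2 (CRITICAL) is an assumption that may not suit every setup.

```shell
#!/bin/sh
# run_normalized is a hypothetical wrapper, not part of NRPE or Nagios.
# It runs the given command and passes exit statuses 0-3 through unchanged.
# Any other status (for example 255) is returned as 3 (UNKNOWN); that choice
# is an assumption, and 2 (CRITICAL) may be preferable in some setups.
run_normalized() {
    "$@"
    rc=$?
    case "$rc" in
        0|1|2|3) return "$rc" ;;
        *) return 3 ;;
    esac
}
```

For example, `run_normalized /usr/local/nagios/libexec/check_nrpe -n -H somehost; echo $?` would print a value between 0 and 3 whatever the wrapped binary returns. The plugin's own output is left untouched.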
- rexconsulting
Re: check_nrpe 2.15 returns
PS: The server is Nagios Core 4.1.1.
CP
Re: check_nrpe 2.15 returns
rexconsulting wrote: This is the line with "exit (255)" which renders the exit code 255. Why does it even do that? The number "255" is not a valid response for the Nagios server. It should only be 0, 1, 2, or 3 right?
Correct. 255 is an internal exit code that correlates with an NRPE error. It's not something that is strictly associated with Nagios Core.
rexconsulting wrote: And another issue on a higher level how Nagios is OK with a non-zero return code.
This is a bigger issue for sure. I would suggest raising an issue on the Nagios Core GitHub (best way to get things fixed):
https://github.com/NagiosEnterprises/nagioscore
Former Nagios employee
https://www.mcapra.com/
- rexconsulting
Re: check_nrpe 2.15 returns
FYI: I opened nagioscore ticket 144 on GitHub for this: https://github.com/NagiosEnterprises/na ... issues/144. - CP
Re: check_nrpe 2.15 returns
Thanks for the update! Is it alright if we lock this thread and mark the issue as resolved?