NRPE 2 versus 3
Posted: Mon Mar 06, 2017 5:20 pm
by SteveBeauchemin
I think I'd like to use nrpe 3.0.1
The problem with that is the incompatibility between versions 2 and 3. Version 3 is not a simple drop-in replacement. If I have nrpe 3 on a remote host, do I also have to run check_nrpe 3 on my core Nagios host, or use different parameters? If I use check_nrpe 3 to query nrpe 2, do I need to use different parameters again? Does this mean that I have to know the difference between a client running 2 versus 3? Do I have to double up the Service tests, and host groups, in order to move from one version to another, one a version 2 and one a version 3 for the same test? I am very confused.
Has anyone developed a Migration process to move from NRPE 2.1.2 to NRPE 3.0.1 yet? If so, please chime in.
Would it be possible to get the Version 2 and 3 to be compatible so we can easily migrate? Do I have to compile nrpe 1500 times?
I think I'd like to use nrpe 3.0.1 - but I fear that I'll be on nrpe 2 for a very long time because they are incompatible. And I am not even asking about NSClient and check_nrpe3.
Here's a rock - here I am - here's a hard place. Help, I'm stuck! Please advise...
Steve B
Re: NRPE 2 versus 3
Posted: Tue Mar 07, 2017 1:04 pm
by WillemDH
Steve,
I built a check_nrpe_v3 and I'm working on migrating my Windows clients automatically with a Rundeck job.
Does this mean that I have to know the difference between a client running 2 versus 3
Yes. As the command is different, you will need to know which version the client is using in order to use the correct command.
Do I have to double up the Service tests, and host groups, in order to move from one version to another, one a version 2 and one a version 3 for the same test? I am very confused.
Yes, you will need different templates / commands for NRPE v3.
Would it be possible to get the Version 2 and 3 to be compatible so we can easily migrate?
It would be nice, but as Nagios and NSClient are not playing along very well, support is limited. So I guess this might work one day for Linux and the NRPE agent, but not for NSClient.
Do I have to compile nrpe 1500 times?
It should be possible to automate this?
I'll see if I can pass my function to detect the NSClient version for you.
Grtz
Re: NRPE 2 versus 3
Posted: Tue Mar 07, 2017 1:36 pm
by tmcdonald
I spoke with our NRPE dev to confirm, and he said that:
v3 check_nrpe will talk to a v2 server (it will try v3 protocol first, so there's a little delay) and a v3 server will talk to a v2 check_nrpe.
and from the changelog:
- Added support for version 3 variable sized packets up to 64KB. nrpe will
accept either version from check_nrpe. check_nrpe will try to send a
version 3 packet first, and fall back to version 2. check_nrpe can be forced
to only send version 2 packets if the switch `-2` is used. (John Frickson)
Have you tested this? If it is not working then it might be a bug and we can probably fix that for you a lot easier than migrating all of your checks.
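As a rough illustration of the `-2` switch described in the changelog, the sketch below wraps check_nrpe so that hosts known to run an NRPE 2 agent get `-2`, while all other hosts are left to check_nrpe's own version 3 first, version 2 fallback behaviour. The `run_nrpe` name, the `CHECK_NRPE` variable, and the protocol argument are assumptions made for this example and are not part of NRPE; the default binary path is the one that appears elsewhere in this thread.

```shell
# Path to the version 3 check_nrpe binary (assumed location; override as needed).
CHECK_NRPE=${CHECK_NRPE:-/usr/local/nagios/libexec/check_nrpe}

# run_nrpe HOST PROTOCOL [extra check_nrpe arguments...]
# (hypothetical helper, not part of NRPE)
# PROTOCOL "2" adds the -2 switch so only version 2 packets are sent.
# Any other value lets check_nrpe try version 3 first and fall back itself.
run_nrpe() {
    host=$1
    protocol=$2
    shift 2
    if [ "$protocol" = "2" ]; then
        "$CHECK_NRPE" -2 -H "$host" "$@"
    else
        "$CHECK_NRPE" -H "$host" "$@"
    fi
}
```

For example, `run_nrpe 192.0.2.10 2 -c check_load` would query a known version 2 agent, where `check_load` stands in for whatever command the agent actually defines.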
Re: NRPE 2 versus 3
Posted: Tue Mar 07, 2017 1:55 pm
by SteveBeauchemin
Let me get some more specific and repeatable data for you. I did try to just drop in nrpe3 but the nsclient tests failed to run properly so I had to step back from that.
To be fair, I will get more specific data.
Also, I use mod_gearman, and have a slightly mixed setup right now. My Nagios core is totally updated and running Nagios XI 5.4.2 on Red Hat 7, but still on nrpe 2.1.2. The Gearman workers are still Red Hat 6. I am trying to build a Red Hat 7 gearman worker and that is what triggered this question. I would like to build a new mod_gearman with nrpe 3, but initial research gave me a hard stop on that.
I will add to this post with real data soon.
Thanks
Steve B
Re: NRPE 2 versus 3
Posted: Tue Mar 07, 2017 2:46 pm
by dwhitfield
SteveBeauchemin wrote: I did try to just drop in nrpe3 but the nsclient tests failed to run properly so I had to step back from that.
As @WillemDH already alluded to, NSClient does have some incompatibilities with NRPE, but I thought they had broken everything for NRPE 2.x, not 3.
Since you are thinking of a "rewrite" anyway, have you considered NCPA?
We await more specific data.

Re: NRPE 2 versus 3
Posted: Tue Mar 07, 2017 3:16 pm
by SteveBeauchemin
Tested using check_nrpe from my core server to my NSClient systems. I have 3 versions of NSClient++ in use.
It seems that a system running the 0.3.9 NSClient++ agent is where I saw bad stuff happen, and that caused me to NOT use check_nrpe 3 on my core system. At the time, I was dealing with migrating to a new OS and a new Nagios XI version. When I saw the problem, I just reverted the core server version back to 2.15. Now, looking again and taking some time to analyze, I see that only the very old NSClient agent has a problem.
Code:
./check_nrpe-Version-2.15 -H [various hosts with Different NSClient versions]
I (0.3.9.328 2011-08-16) seem to be doing fine...
I (0.4.4.19 2015-12-08) seem to be doing fine...
I (0.5.0.62 2016-09-14) seem to be doing fine...
./check_nrpe-Version-3.0.1 -H [Same hosts same order]
Could not construct return packet in NRPE handler check client side (nsclient.log) logs...
I (0.4.4.19 2015-12-08) seem to be doing fine...
I (0.5.0.62 2016-09-14) seem to be doing fine...
Now that I know better, to solve this, I simply changed 4 command definitions to use the version 2.15 check_nrpe. For the small handful of hosts with Old agents, they now look like this:
Code:
$USER1$/check_nrpe-Version-2.15 -H $HOSTADDRESS$ ...
So they are using the old binary. The rest of my tests on my core server are now using the check_nrpe version 3. So far so good.
Damn, I wish I knew this a couple of weeks ago. So it is a drop-in replacement, as long as you are not using pre-historical code on remote systems.
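The binary choice described above could be scripted roughly as follows. This is only a sketch: it assumes the agent's reply has the `I (version date) seem to be doing fine...` shape shown in the test output, and it assumes every 0.3.x agent behaves like the 0.3.9 one that failed here. The function names `nsclient_version` and `pick_check_nrpe` are made up for illustration. Because check_nrpe 3.0.1 got an error instead of a version string from the 0.3.9 agent, the probe reply would have to come from the 2.15 binary.

```shell
# nsclient_version REPLY
# Print the version token from a reply such as
#   "I (0.4.4.19 2015-12-08) seem to be doing fine..."
# Prints nothing when the reply does not have that shape.
nsclient_version() {
    printf '%s\n' "$1" | sed -n 's/^I (\([0-9][0-9.]*\) .*/\1/p'
}

# pick_check_nrpe REPLY
# Print the check_nrpe binary name to use for the agent that sent REPLY.
# Returns 1 when no version could be read from the reply.
pick_check_nrpe() {
    version=$(nsclient_version "$1")
    case "$version" in
        0.3.*) echo "check_nrpe-Version-2.15" ;;   # old agent: keep the old binary
        "")    return 1 ;;                          # unrecognised reply
        *)     echo "check_nrpe-Version-3.0.1" ;;   # newer agent: version 3 binary
    esac
}
```

A wrapper could then prefix the result with `$USER1$/` in the command definition, or simply be used once to sort hosts into the two groups.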
Willem - you have version 0.3.9 deployed? Moving from 0.3.9 to new version did require me to spend a lot of time changing test syntax. I can jump start you by sending some examples if you like. You're probably okay already, but let me know.
It would be so much easier if we did stuff like Management thinks. Don't you just push a button and it works? When it is not working they ask - Did you push the button?
Well, now, yes. I pushed the button. And the migration methodology is easy: post something stupid to a forum, then actually do some research and testing.
Thanks
Steve B
(close this if you like...)
Re: NRPE 2 versus 3
Posted: Tue Mar 07, 2017 4:17 pm
by WillemDH
Steve,
This is the function I use in my Rundeck scripts to determine the version of NRPE:
Code:
# WriteLog <Debug|Verbose|Output|path-to-logfile> <severity> <message>
WriteLog () {
    ScriptName=$(basename "${BASH_SOURCE}")
    if [ ! -z "$ScriptName" ] ; then
        ScriptName="$(basename "$(readlink -f "$0")")"
    fi
    # All three parameters are required.
    if [ -z "$1" ] ; then
        echo "WriteLog: Log parameter #1 is zero length. Please debug..."
        exit 1
    else
        if [ -z "$2" ] ; then
            echo "WriteLog: Severity parameter #2 is zero length. Please debug..."
            exit 1
        else
            if [ -z "$3" ] ; then
                echo "WriteLog: Message parameter #3 is zero length. Please debug..."
                exit 1
            fi
        fi
    fi
    Now=$(date '+%Y-%m-%d %H:%M:%S,%3N')
    # Debug and Verbose messages are printed only when $Debug / $Verbose is 1.
    # If parameter #1 is an existing file, the message is appended to it.
    if [ "$1" = "Debug" ] && [ "$Debug" = 1 ] ; then
        echo "$Now: $ScriptName: $2: Debug: $3 "
    elif [ "$1" = "Verbose" ] && [ "$Verbose" = 1 ] ; then
        echo "$Now: $ScriptName: $2: $3"
    elif [ "$1" = "Output" ] ; then
        echo "${Now}: $ScriptName: $2: $3"
    elif [ -f "$1" ] ; then
        echo "${Now}: $ScriptName: $2: $3" >> "$1"
    fi
    # When $LogLocal is set, also append the message to that file.
    if [ ! -z "$LogLocal" ] ; then
        if [ "$1" = "Debug" ] && [ "$Debug" = 1 ] ; then
            echo "$Now: $ScriptName: $2: Debug: $3 " >> "$LogLocal"
        elif [ "$1" = "Verbose" ] && [ "$Verbose" = 1 ] ; then
            echo "$Now: $ScriptName: $2: $3" >> "$LogLocal"
        elif [ "$1" = "Output" ] ; then
            echo "${Now}: $ScriptName: $2: $3" >> "$LogLocal"
        fi
    fi
}
# Probe a Windows host and set NrpeCom to the check_nrpe command that works for it.
CheckNrpeWindows () {
    WinHost=$1
    NrpeTest="$(/usr/local/nagios/libexec/check_nrpe_v3 -2 -P 10240 -H $WinHost)"
    WriteLog Debug Info "NrpeTest $WinHost: $NrpeTest"
    if [[ $NrpeTest = *"Receive underflow"* ]] ; then
        NrpeCom="/usr/local/nagios/libexec/check_nrpe"
    elif [[ $NrpeTest = *"seem to be doing fine"* ]] ; then
        NrpeCom="/usr/local/nagios/libexec/check_nrpe_v3 -2 -P 10240"
    elif [[ $NrpeTest = *"Could not complete SSL handshake"* ]] ; then
        WriteLog Output Error "NRPE: Could not complete SSL handshake. Please debug. "
        exit 2
    else
        WriteLog Output Error "NRPE detection failed. Please debug. "
        exit 2
    fi
}
When [...] is detected in the output of the following command, the client is using NRPE 2.15:
Code:
/usr/local/nagios/libexec/check_nrpe_v3 -2 -P 10240 -H $WinHost
I'm migrating from 0.4.1.105 to 0.5.0.62 and at the same time upgrading NRPE by setting the [...] in nsclient.ini.
I never used 0.3.x NSClient versions.
I have a test group of 12 Windows servers which have been migrated, with 580 more to do. I've implemented it so that, with the help of some Rundeck jobs, I can switch quickly and easily between NSClient 0.4.1.105 with NRPE 2.15 and NSClient 0.5.0.62 with NRPE v3.
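For trying out the detection logic without touching a real host, the pattern matching in CheckNrpeWindows can be pulled into a small function that only looks at the probe output. The sketch below is an assumption about how one might do that, not the code used in the Rundeck jobs: the name `nrpe_command_for` is invented, it uses a POSIX `case` instead of `[[ ]]`, and it returns 2 instead of calling `exit 2` so the caller decides what to do on failure.

```shell
# nrpe_command_for PROBE_OUTPUT   (hypothetical helper)
# Print the check_nrpe command line that matches the probe output,
# using the same patterns and commands as CheckNrpeWindows.
# Returns 2 (and prints a message on stderr) when detection fails.
nrpe_command_for() {
    case "$1" in
        *"Receive underflow"*)
            echo "/usr/local/nagios/libexec/check_nrpe" ;;
        *"seem to be doing fine"*)
            echo "/usr/local/nagios/libexec/check_nrpe_v3 -2 -P 10240" ;;
        *"Could not complete SSL handshake"*)
            echo "NRPE: Could not complete SSL handshake. Please debug." >&2
            return 2 ;;
        *)
            echo "NRPE detection failed. Please debug." >&2
            return 2 ;;
    esac
}
```

The probe itself would still be the `check_nrpe_v3 -2 -P 10240 -H $WinHost` call from the function above, with its output passed in as the single argument.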
Grtz
Re: NRPE 2 versus 3
Posted: Tue Mar 07, 2017 5:06 pm
by cdienger
Hi Steve & Willem, I just want to get a consensus here before locking the thread. Are there any questions regarding the code Willem provided, or would either of you like to keep the thread open a bit longer?
Re: NRPE 2 versus 3
Posted: Wed Mar 08, 2017 10:41 am
by SteveBeauchemin
I am okay to close this. If and when I get Rundeck, I will probably look here for the data Willem provided.
Thanks.
Steve B