nrpe.cfg reconfig?

xpac
Posts: 54
Joined: Mon Aug 25, 2014 3:43 pm

Post by xpac »

So I would like to use a plugin for monitoring Cassandra. The only issue is that it requires support for command arguments in the NRPE daemon, and when I originally configured NRPE I didn't use "--enable-command-args", which apparently you need to do.

So, can I simply reconfigure NRPE on each remote host with "--enable-command-args", or will this break something else? (I also have to set dont_blame_nrpe=1 in the nrpe.cfg file, but that's simple enough.)
jdalrymple
Skynet Drone
Posts: 2620
Joined: Wed Feb 11, 2015 1:56 pm

Re: nrpe.cfg reconfig?

Post by jdalrymple »

If you use the same configure options that you used originally (usually just "--with-nagios-user" and "--with-nagios-group") and add "--enable-command-args", you should be good to go. It sounds like you have a good handle on the process. It's probably best to run through the process on a test/non-prod system first, just for safety's sake.

Let us know if anything goes wrong.
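
For readers following along, the rebuild described above might look roughly like the sketch below. The user and group value ("nagios") is an assumption and should be replaced with whatever was passed to the original ./configure run. The script only assembles and prints the command; the build steps themselves are left as comments because they have to be run from the NRPE source directory.

```shell
# Assumption: the original build used these two options with the value "nagios".
orig_flags="--with-nagios-user=nagios --with-nagios-group=nagios"

# Same options as before, plus the one that compiles in argument support.
configure_cmd="./configure $orig_flags --enable-command-args"
echo "$configure_cmd"

# From the NRPE source directory, something along these lines:
#   $configure_cmd && make all && make install-daemon
# Afterwards set dont_blame_nrpe=1 in nrpe.cfg on that host.
```

Keeping the original options in one variable makes it harder to forget one of them when repeating the rebuild on each remote host.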
xpac
Posts: 54
Joined: Mon Aug 25, 2014 3:43 pm

Re: nrpe.cfg reconfig?

Post by xpac »

Thanks, I got it to "work" at least in theory. I used these instructions in case anybody is interested:

http://labs.nagios.com/2014/08/21/monit ... ase-nodes/

However, it really doesn't work. Unfortunately I can't find anybody else on these forums that has tried to set up Cassandra monitoring via Nagios core. I wonder if the problem is that I'm using Nagios Core instead of XI (although that really doesn't make any sense as far as I can tell)?

The problem is that I can get the command to run remotely via NRPE without an error, but the output is completely useless: no matter what, it returns a value of 0 live database nodes, which is completely incorrect. The first time I saw the output it gave me quite a scare :lol: :lol:

It's so frustrating, because the script runs fine locally on the Cassandra node, but when I try to run it remotely it just gives me the erroneous information.
jdalrymple
Skynet Drone
Posts: 2620
Joined: Wed Feb 11, 2015 1:56 pm

Re: nrpe.cfg reconfig?

Post by jdalrymple »

This is almost always troubleshootable and fixable.

Can you give us a more verbose indication of the error? What happens if you run the check_cassandra script/bin as the nagios user?
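
When running the plugin by hand as the nagios user, it is worth looking at the exit status as well as the text, since Nagios and NRPE act on the status. Below is a small sketch of the standard plugin return codes (0 OK, 1 WARNING, 2 CRITICAL, 3 UNKNOWN). The function name status_name is made up for illustration, and the plugin path in the comment is an assumption about the install location.

```shell
status_name() {
    # Translate a plugin exit status into the state name Nagios uses.
    case "$1" in
        0) echo "OK" ;;
        1) echo "WARNING" ;;
        2) echo "CRITICAL" ;;
        3) echo "UNKNOWN" ;;
        *) echo "UNKNOWN (non-standard exit status $1)" ;;
    esac
}

# Possible use on the monitored node (path is an assumption):
#   sudo -u nagios /usr/local/nagios/libexec/check_cassandra_cluster.sh -H localhost -P 7199 -w 1 -c 0
#   status_name $?
status_name 2
```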
xpac
Posts: 54
Joined: Mon Aug 25, 2014 3:43 pm

Re: nrpe.cfg reconfig?

Post by xpac »

Ok, first I should mention that there was a permissions error which did not allow the nagios user to run the "nodetool" command, which is a VERY important part of this script :lol:

That said, I fixed it, and it still gives me garbage info.

Here's the correct info, when I run the script remotely from the Nagios server via ssh (as nagios user):

ssh mytestcassandranode /usr/local/nagios/libexec/check_cassandra_cluster.sh -H localhost -P 7199 -w 1 -c 0

WARNING - Live Node:1 - 127.0.0.1:Normal,101.6,KB100.00%,-1405306065600736406 | Load_127.0.0.1=KB100.00% Owns_127.0.0.1=-1405306065600736406

This is the correct output.

However, if I try to run the command remotely using NRPE I get this incorrect info:

./check_nrpe -H mytestcassandranode -c check_cassandra_cluster -a '-H mytestcassandranode -P 7199 -w 1 -c 0'

CRITICAL - Live Node:0 - |

The command should take 5-10 seconds to run remotely, as it does when I run it using ssh. However, when I run it using NRPE it returns that incorrect info immediately.
jdalrymple
Skynet Drone
Posts: 2620
Joined: Wed Feb 11, 2015 1:56 pm

Re: nrpe.cfg reconfig?

Post by jdalrymple »

Are you sure you have the argument passing bit working properly? To troubleshoot that I'd statically configure the args in the nrpe.cfg and run it from the Nagios server with just

Code: Select all

check_nrpe -H HOST -t 60 -c check_cassandra
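
A statically configured command could look something like the sketch below. It writes a sample fragment to a temporary file instead of touching the real nrpe.cfg. The command name check_cassandra matches the call above; the script path and arguments are copied from the ssh test earlier in the thread, and are assumptions about what the final definition should contain.

```shell
# Write a sample nrpe.cfg fragment to a temporary file for inspection.
cfg_fragment=$(mktemp)
cat > "$cfg_fragment" <<'EOF'
# Static variant: every argument is hard-coded, so check_nrpe needs no -a.
command[check_cassandra]=/usr/local/nagios/libexec/check_cassandra_cluster.sh -H localhost -P 7199 -w 1 -c 0
EOF

cat "$cfg_fragment"
```

With a definition like this in nrpe.cfg on the Cassandra node, argument passing is taken out of the picture, so any remaining difference in output has to come from somewhere else.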
xpac
Posts: 54
Joined: Mon Aug 25, 2014 3:43 pm

Re: nrpe.cfg reconfig?

Post by xpac »

jdalrymple wrote:To troubleshoot that I'd statically configure the args in the nrpe.cfg and run it from the Nagios server

Unfortunately that just gives me the same output as trying to run it from the Nagios server with arguments.

Also as another test, I was able to run the basic check_users command from the Nagios server, both using the hardcoded arguments and then I did it with argument passing and both worked fine.

The only other thing I haven't mentioned yet is that I'm testing this on some VirtualBox instances, but I don't see where this would cause any issue.

Out of ideas for the moment :oops:
jdalrymple
Skynet Drone
Posts: 2620
Joined: Wed Feb 11, 2015 1:56 pm

Re: nrpe.cfg reconfig?

Post by jdalrymple »

xpac wrote:Here's the correct info, when I run the script remotely from the Nagios server via ssh (as nagios user):

ssh mytestcassandranode /usr/local/nagios/libexec/check_cassandra_cluster.sh -H localhost -P 7199 -w 1 -c 0

WARNING - Live Node:1 - 127.0.0.1:Normal,101.6,KB100.00%,-1405306065600736406 | Load_127.0.0.1=KB100.00% Owns_127.0.0.1=-1405306065600736406

This is the correct output.

However, if I try to run the command remotely using NRPE I get this incorrect info:

./check_nrpe -H mytestcassandranode -c check_cassandra_cluster -a '-H mytestcassandranode -P 7199 -w 1 -c 0'

This is a stretch, but I am noticing that you're not putting localhost into the check args when using NRPE, but you are when using ssh. Is it possible that the node isn't resolving its own hostname properly?
xpac
Posts: 54
Joined: Mon Aug 25, 2014 3:43 pm

Re: nrpe.cfg reconfig?

Post by xpac »

jdalrymple wrote:This is a stretch, but I am noticing that you're not putting localhost into the check args when using NRPE, but you are when using ssh. Is it possible that the node isn't resolving its own hostname properly?

Unfortunately that isn't it.

So what I've done now is located an even simpler script that really only checks to verify that my nodes are alive (just to simplify things). I don't even have to provide the arguments as they're hard coded into the script. So, I ran the script from the host, and also ran the script locally via NRPE on the host (since localhost is allowed in the xinetd.d/nrpe conf file).

Same problem, running the script locally was just fine - when I ran the command via NRPE (locally) I got the same error/output.

It seems as though NRPE and/or xinetd is causing the problem, just not sure how or why.
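
One way to probe that suspicion: a process started by xinetd typically inherits far fewer environment variables than a login or ssh shell, so it can help to run the script with an emptied environment and see whether the output changes. This is only a hypothesis about the cause, not something confirmed in this thread. run_clean is a made-up helper name, and the plugin path in the comment is the one from the ssh test above.

```shell
run_clean() {
    # env -i runs the given command with an empty environment,
    # so PATH and similar variables are not inherited from this shell.
    env -i "$@"
}

# Demonstration: a variable exported in this shell is not visible inside.
export DEMO_VAR=visible
run_clean /bin/sh -c 'echo "DEMO_VAR=${DEMO_VAR:-unset}"'

# Possible use on the Cassandra node:
#   sudo -u nagios env -i /usr/local/nagios/libexec/check_cassandra_cluster.sh -H localhost -P 7199 -w 1 -c 0
```

If the script also reports zero live nodes when run this way, the difference between the ssh run and the NRPE run would point at the environment rather than at NRPE's argument handling.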
jdalrymple
Skynet Drone
Posts: 2620
Joined: Wed Feb 11, 2015 1:56 pm

Re: nrpe.cfg reconfig?

Post by jdalrymple »

Just run check_nrpe -H hostname from the Nagios host. Let's see if that even works.

Code: Select all

[jdalrymple@localhost libexec]$ ./check_nrpe -H 127.0.0.1
NRPE v2.15
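
As the output above shows, check_nrpe called without -c simply reports the daemon's version, which makes it a quick connectivity test. Here is a sketch that interprets that reply; nrpe_reachable is a hypothetical helper, and the sample string is the one from this post.

```shell
nrpe_reachable() {
    # A reply starting with "NRPE v" means the daemon answered.
    case "$1" in
        "NRPE v"*) return 0 ;;
        *)         return 1 ;;
    esac
}

reply="NRPE v2.15"   # in practice: reply=$(./check_nrpe -H 127.0.0.1)
if nrpe_reachable "$reply"; then
    echo "daemon answered: $reply"
else
    echo "no usable reply: $reply"
fi
```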