check_nrpe -c check_procs Not Working

This support forum board is for support questions relating to Nagios XI, our flagship commercial network monitoring solution.
wrobj0
Posts: 17
Joined: Fri Dec 20, 2019 2:47 pm

check_nrpe -c check_procs Not Working

Post by wrobj0 »

I've read through at least ten different threads, and I cannot get this to work.

In the UI, the check is configured as such:

$USER1$/check_nrpe -2 -H $HOSTADDRESS$ -t 30 -c $ARG1$
ARG1 = check_procs -a '-c 1:1024 -Csshd'

This was working for a couple of years; I have no idea when it stopped working, but I discovered it today when trying to add a check for ns-slapd. It doesn't matter what process I choose, sshd, httpd, ntpd, etc., they all fail.

When I run the command from the UI, it returns nothing (just the full command, no error or anything). When I run it from the CLI, I get this:

Code: Select all

# /usr/local/nagios/libexec/check_nrpe -H <TARGET HOST> -t 30 -c check_procs -a "'-C sshd' '-c 1:'"
Usage:
check_procs -w <range> -c <range> [-m metric] [-s state] [-p ppid]
 [-u user] [-r rss] [-z vsz] [-P %cpu] [-a argument-array]
 [-C command] [-k] [-t timeout] [-v]
These are all the combinations I tried:

Code: Select all

# /usr/local/nagios/libexec/check_nrpe -H <TARGET_HOST> -c check_procs -a "'sshd' '1:'"
CHECK_NRPE: Receive header underflow - only 0 bytes received (4 expected).

# /usr/local/nagios/libexec/check_nrpe -H <TARGET_HOST> -c check_procs -a '"sshd" "1:"'
CHECK_NRPE: Receive header underflow - only 0 bytes received (4 expected).

# /usr/local/nagios/libexec/check_nrpe -H <TARGET_HOST> -c check_procs -a '"-C sshd" "-c 1:"'
CHECK_NRPE: Receive header underflow - only 0 bytes received (4 expected).

# /usr/local/nagios/libexec/check_nrpe -H <TARGET_HOST> -c check_procs -a '"-C sshd" "-c 1:'
CHECK_NRPE: Receive header underflow - only 0 bytes received (4 expected).

# /usr/local/nagios/libexec/check_nrpe -H <TARGET_HOST> -c check_procs -a "'-C sshd' '-c 1:'"
CHECK_NRPE: Receive header underflow - only 0 bytes received (4 expected).


# /usr/local/nagios/libexec/check_nrpe -H <TARGET_HOST> -c check_procs -a ''-C sshd' '-c 1:''
Usage:
check_procs -w <range> -c <range> [-m metric] [-s state] [-p ppid]
 [-u user] [-r rss] [-z vsz] [-P %cpu] [-a argument-array]
 [-C command] [-k] [-t timeout] [-v]
In the client's /etc/nagios/nrpe.cfg, I have this:

Code: Select all

command[check_procs]=/usr/lib64/nagios/plugins/check_procs -C $ARG1$ -c $ARG2$
If I change the client's nrpe.cfg file to this:

Code: Select all

command[check_procs]=/usr/lib64/nagios/plugins/check_procs -C $ARG1$
The command works.

Code: Select all

# /usr/local/nagios/libexec/check_nrpe -H <TARGET HOST> -c check_procs -a "sshd"
PROCS OK: 3 processes with command name 'sshd' | procs=3;;;0;

Why?
Last edited by wrobj0 on Wed Aug 19, 2020 11:22 am, edited 1 time in total.
benjaminsmith
Posts: 5324
Joined: Wed Aug 22, 2018 4:39 pm
Location: saint paul

Re: check_nrpe -c check_procs Not Working

Post by benjaminsmith »

Hi,

Not quite sure why it was working before or what changes were made, but the following error means the arguments are not being passed correctly.

Code: Select all

 /usr/local/nagios/libexec/check_nrpe -H <TARGET HOST> -t 30 -c check_procs -a "'-C sshd' '-c 1:'"
Usage:
check_procs -w <range> -c <range> [-m metric] [-s state] [-p ppid]
[-u user] [-r rss] [-z vsz] [-P %cpu] [-a argument-array]
[-C command] [-k] [-t timeout] [-v]
One way to simplify this is to use a single argument in your command definition and pass all the options, in quotes, to the check_procs command. For example, change the command in nrpe.cfg to this:

Code: Select all

command[check_procs]=/usr/lib64/nagios/plugins/check_procs $ARG1$
And then set up your command as follows:

Code: Select all

./check_nrpe -H 192.168.23.144 -c check_procs '-a  -C sshd -c 1:'
If you make any changes to nrpe.cfg, be sure to restart the nrpe service or xinetd, depending on how it was installed. Let me know if that works for you.
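To illustrate the point about arguments not getting passed correctly, here is a minimal sketch in plain shell. It assumes the NRPE daemon simply drops each argument into the matching $ARGn$ macro. expand_two_macro_template is a hypothetical helper, not part of NRPE; it only prints the command line that the original two-macro definition would produce.

```shell
#!/bin/sh
# Hypothetical helper: mimics filling the original definition
#   command[check_procs]=/usr/lib64/nagios/plugins/check_procs -C $ARG1$ -c $ARG2$
# with whatever arrives as the first and second argument.
expand_two_macro_template() {
    printf '/usr/lib64/nagios/plugins/check_procs -C %s -c %s\n' "$1" "${2-}"
}

# The whole quoted string is ONE shell word, so only the first macro is filled
# and the trailing -c is left without a value.
expand_two_macro_template "'-C sshd' '-c 1:'"

# Two separate words fill both macros.
expand_two_macro_template sshd 1:
```

With a single word after -a there is nothing left for $ARG2$, which would leave -c without a value and is consistent with the plugin printing its usage text.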
As of May 25th, 2018, all communications with Nagios Enterprises and its employees are covered under our new Privacy Policy.

Be sure to check out our Knowledgebase for helpful articles and solutions!
wrobj0
Posts: 17
Joined: Fri Dec 20, 2019 2:47 pm

Re: check_nrpe -c check_procs Not Working

Post by wrobj0 »

Unfortunately, I still can't quite get this working.

I forgot to put this in the original post, but here's the relevant version information:
Nagios Server:
Red Hat Enterprise Linux Server release 7.8 (Maipo)
Nagios XI 5.6.14

TARGET_HOSTs:
NRPE: Version: 2.15 OS: RHEL 6.10 and RHEL 7.7
NRPE: Version: 4.0.3 OS: CentOS8

Our Nagios configuration has this in it from some time long before I joined the team. If I add a new Linux host using the configuration wizard, it uses this check_nrpe command (I added the -2 a few months back because it was filling our logs with version errors, and I found some thread on here that said to do that).

Code: Select all

define command {
    command_name    check_nrpe
    command_line    $USER1$/check_nrpe -2 -H $HOSTADDRESS$ -t 30 -c $ARG1$
}
$ARG1$ = check_$CMD '-$OPTS'

In this specific instance, ARG1 = check_procs -a '-c 1:1024 -Csshd', but the vast majority of our services are configured this way; for example, the time drift check is ARG1 = check_time -a '-w 15 -c 30'.

Like I said, that worked for years, and I have no idea why or when it stopped. This is particularly worrisome because it was by chance that I caught this. That all services report OK in Nagios even though almost none of the commands work is a massive failure on Nagios's part. My boards should be glowing red, and my email should be flooded with service failures, but it took me adding and testing a new check to find this. I'm not sure if you work for them, or if I should submit this feedback another way, but this needs to get fixed even if the syntax I'm using is no longer valid.

Some relevant excerpts from TARGET_HOST's nrpe config file
dont_blame_nrpe=1
include_dir=/etc/nrpe.d/


Contents of the config file in /etc/nrpe.d

Code: Select all

command[check_swap]=/usr/lib64/nagios/plugins/check_swap $ARG1$
command[check_disk]=/usr/lib64/nagios/plugins/check_disk $ARG1$
command[check_load]=/usr/lib64/nagios/plugins/check_load $ARG1$
command[check_procs]=/usr/lib64/nagios/plugins/check_procs $ARG1$
command[check_time]=/usr/lib64/nagios/plugins/check_ntp_time -H net-ntp00.uchicago.edu $ARG1$
command[check_zombie_procs]=/usr/lib64/nagios/plugins/check_procs -s Z $ARG1$
command[check_mem]=/etc/nagios/scripts/check_mem.sh $ARG1$
command[check_mount]=/etc/nagios/scripts/check_mount.sh
command[check_disk_stat]=/etc/nagios/scripts/check_diskstat.sh $ARG1$
command[check_network_stats]=/etc/nagios/scripts/stat_net.pl
command[check_cpu_stats]=/etc/nagios/scripts/check_cpu_stats.sh $ARG1$
command[check_openmanage]=/etc/nagios/scripts/check_openmanage $ARG1$
Trying your suggestion from the CLI:

Code: Select all

[root@nagios]#/usr/local/nagios/libexec/check_nrpe -H $TARGET_HOST -c check_procs '-a -C sshd -c 1:'
PROCS WARNING: 1 process with args '-C' | procs=1;sshd;1:;0;
This is incorrect, though. Actual sshd processes running on TARGET_HOST

Code: Select all

[root@TARGET_HOST]# pgrep sshd | wc -l
3
Figuring it probably sees options as arguments, I moved the first apostrophe over, and it works.

Code: Select all

[root@nagios]#/usr/local/nagios/libexec/check_nrpe -H $TARGET_HOST -c check_procs -a '-C sshd -c 1:'
PROCS OK: 3 processes with command name 'sshd' | procs=3;;1:;0;
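The difference between the two invocations can be reproduced without NRPE at all. The sketch below uses a made-up helper, show_args, that prints the words the shell would hand to any program; it says nothing about how check_nrpe treats them afterwards.

```shell
#!/bin/sh
# Hypothetical helper: print each word exactly as the shell delivers it.
show_args() {
    printf '%d word(s):' "$#"
    for word in "$@"; do
        printf ' [%s]' "$word"
    done
    printf '\n'
}

# -a inside the quotes: the option and its value arrive glued into one word.
show_args -c check_procs '-a -C sshd -c 1:'
# prints: 3 word(s): [-c] [check_procs] [-a -C sshd -c 1:]

# -a outside the quotes: -a is its own word, followed by one argument.
show_args -c check_procs -a '-C sshd -c 1:'
# prints: 4 word(s): [-c] [check_procs] [-a] [-C sshd -c 1:]
```

Only in the second form does the receiving program get -a as a separate option.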
Replicating this in the WebUI:

This fails no matter how I try to structure the syntax using this command

Code: Select all

define command {
    command_name    check_nrpe
    command_line    $USER1$/check_nrpe -2 -H $HOSTADDRESS$ -t 30 -c $ARG1$
}
So I created a new command called check_procs. I did this through the WebUI, after which I applied the configuration.

Code: Select all

define command {
    command_name    check_procs
    command_line    $USER1$/check_nrpe -H $HOSTADDRESS$ -c check_procs $ARG1$ 
}
ARG1 = '-a -C sshd -c 1:'

Code: Select all

SQL Error [nagiosxi] : ERROR:  syntax error at or near "sshd"
LINE 1: ...TARGET_HOST -c check_procs \'-a -C sshd -c 1:...
                                                             ^

[nagios@nagios ~]$ /usr/local/nagios/libexec/check_nrpe -H aaaTARGET_HOST -c check_procs '-a -C sshd -c 1:'
Error submitting command.

ARG1 = -a '-C sshd -c 1:'

Code: Select all

SQL Error [nagiosxi] : ERROR:  syntax error at or near "sshd"
LINE 1: ...TARGET_HOST -c check_procs -a \'-C sshd -c 1:...
                                                             ^

[nagios@nagios ~]$ /usr/local/nagios/libexec/check_nrpe -H aaaTARGET_HOST -c check_procs -a '-C sshd -c 1:'
Error submitting command.

So I modified this in TARGET_HOST's nrpe.cfg

Code: Select all

command[check_procs]=/usr/lib64/nagios/plugins/check_procs -a $ARG1$
I then updated the command in Nagios and applied the config again.

Code: Select all

define command {
    command_name    check_procs
    command_line    $USER1$/check_nrpe -H $HOSTADDRESS$ -c check_procs -a $ARG1$ 
}

ARG1 = -C sshd -c 1:

Code: Select all

[nagios@nagios ~]$ /usr/local/nagios/libexec/check_nrpe -H aaaTARGET_HOST -c check_procs -a -C sshd -c 1:
    
ARG1 = '-C sshd -c 1:'

Code: Select all

SQL Error [nagiosxi] : ERROR:  syntax error at or near "sshd"
LINE 1: ...TARGET_HOST -c check_procs -a \'-C sshd -c 1:...
                                                             ^

[nagios@nagios ~]$ /usr/local/nagios/libexec/check_nrpe -H aaaTARGET_HOST -c check_procs -a '-C sshd -c 1:'
Error submitting command.
That's broken from the CLI, too, so I guess it's just not good syntax.

Code: Select all

[root@nagios]#/usr/local/nagios/libexec/check_nrpe -H $TARGET_HOST -c check_procs -a '-C sshd -c 1:'
PROCS CRITICAL: 0 processes with args '-Csshd' | procs=0;;1:;0;
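One possible reading of the "with args" results above: the usage text lists check_procs's own -a (argument-array), which is a different option from check_nrpe's -a. If an -a ends up in front of the plugin's options, the next word is taken as its value. The sketch below imitates that with the shell's getopts; parse_like_check_procs is a hypothetical stand-in, and the real plugin's parser may treat the later words differently.

```shell
#!/bin/sh
# Hypothetical stand-in for the plugin's option parsing, using POSIX getopts.
parse_like_check_procs() {
    OPTIND=1
    args_filter=''
    command_name=''
    critical=''
    while getopts 'a:C:c:' opt; do
        case $opt in
            a) args_filter=$OPTARG ;;   # -a takes the NEXT word as its value
            C) command_name=$OPTARG ;;
            c) critical=$OPTARG ;;
            *) return 2 ;;
        esac
    done
    printf 'args_filter=[%s] command_name=[%s]\n' "$args_filter" "$command_name"
}

# With a leading -a, the word "-C" is consumed as the filter string.
parse_like_check_procs -a -C sshd -c 1:
# prints: args_filter=[-C] command_name=[]

# Without it, -C is parsed as the command-name option.
parse_like_check_procs -C sshd -c 1:
# prints: args_filter=[] command_name=[sshd]
```

Under that assumption, keeping -a out of the nrpe.cfg definition and passing it only to check_nrpe avoids the collision.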
wrobj0
Posts: 17
Joined: Fri Dec 20, 2019 2:47 pm

Re: check_nrpe -c check_procs Not Working

Post by wrobj0 »

I was trying to track down the date on which this stopped working, and I discovered that I received a critical notification on August 9.

I then decided to test alerting on a non-critical service, and found that the services are being monitored, it's just the check command part that's broken.
benjaminsmith
Posts: 5324
Joined: Wed Aug 22, 2018 4:39 pm
Location: saint paul

Re: check_nrpe -c check_procs Not Working

Post by benjaminsmith »

Hi,

It's very close, but there are a few things to modify in the syntax. On the host side, in nrpe.cfg, set your command as follows:

Code: Select all

command[check_procs]=/usr/local/nagios/libexec/check_procs $ARG1$
Next, in Nagios XI, change the check_nrpe command back to the default. You can add the -2 option to force version 2 packets in case you are setting up checks with systems running older versions of the NRPE agent. For example:

Code: Select all

$USER1$/check_nrpe -2 -H $HOSTADDRESS$ -t 30 -c $ARG1$ $ARG2$
We tested the command with the following two syntaxes, and both work.

Code: Select all

[root@main-nagios-xi libexec]# ./check_nrpe -H <ip address> -c check_procs -a '-C sshd -c 1:'
PROCS OK: 2 processes with command name 'sshd' | procs=2;;1:;0;

Code: Select all

[root@main-nagios-xi libexec]# ./check_nrpe -H <ip address> -c check_procs -a '-C sshd' -c 1:
PROCS OK: 2 processes with command name 'sshd' | procs=2;;;0;
In the Nagios XI CCM, the arguments should look like this:
check-procs.png
Let us know if you get it working.
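The screenshot is not visible in this copy of the thread. As a hedged sketch only, assuming ARG1 holds the NRPE command name and ARG2 holds the -a option exactly as in the CLI test above, the two command_line styles in this thread expand to the same text. one_macro and two_macro are made-up helpers that do plain string substitution, and HOST is a placeholder.

```shell
#!/bin/sh
# Hypothetical helpers: imitate macro substitution for the two command_line styles.
# Style from the original post:  ... -c $ARG1$
one_macro() {
    printf 'check_nrpe -2 -H HOST -t 30 -c %s\n' "$1"
}
# Style suggested in this reply: ... -c $ARG1$ $ARG2$
two_macro() {
    printf 'check_nrpe -2 -H HOST -t 30 -c %s %s\n' "$1" "$2"
}

# Everything packed into ARG1.
one_macro "check_procs -a '-C sshd -c 1:'"

# Command name in ARG1, the -a option in ARG2 (assumed CCM layout).
two_macro check_procs "-a '-C sshd -c 1:'"
```

Both calls print the same line, so under this assumption the split into two arguments changes how the service is entered in the CCM, not the command that is finally run.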
wrobj0
Posts: 17
Joined: Fri Dec 20, 2019 2:47 pm

Re: check_nrpe -c check_procs Not Working

Post by wrobj0 »

It appears there's either a bug in the "Run Check Command" function causing it to fail when I use it, or there's something else somewhere in the configurations creating this problem.

I modified the command to work the way you suggested, and I get the same error.

Code: Select all

SQL Error [nagiosxi] : ERROR:  syntax error at or near "sshd"
LINE 1: ...2 -H TARGET_HOST -t 30 -c check_procs  -a \'-C sshd -c 1:...
                                                             ^

[nagios@nagios ~]$ /usr/local/nagios/libexec/check_nrpe -2 -H TARGET_HOST -t 30 -c check_procs  -a '-C sshd -c 1:'
Error submitting command.
So our existing monitoring is fine, at least in terms of service checks (our check_total_procs check is definitely broken on some hosts, but that check is pointless in our environment).

It's still super inconvenient to go back and forth between the WebUI and CLI, though, so what might cause the check command function to fail, despite identical configurations to yours?
lmiltchev
Bugs find me
Posts: 13589
Joined: Mon May 23, 2011 12:15 pm

Re: check_nrpe -c check_procs Not Working

Post by lmiltchev »

I am not sure why this command is not working for you. It works for me just fine with the same arguments.

On the target machine, I have check_procs command defined as this:

Code: Select all

command[check_procs]=/usr/local/nagios/libexec/check_procs $ARG1$
Testing the plugin on the remote box:

Code: Select all

/usr/local/nagios/libexec/check_procs -C sshd -c 1:
PROCS OK: 2 processes with command name 'sshd' | procs=2;;1:;0;

Code: Select all

pgrep sshd | wc -l
2
On the Nagios XI, I have:

Code: Select all

define command {
    command_name    check_nrpe
    command_line    $USER1$/check_nrpe -2 -u -H $HOSTADDRESS$ -t 60 -c $ARG1$ $ARG2$
}
Testing from the CLI:

Code: Select all

/usr/local/nagios/libexec/check_nrpe -2 -u -H 192.168.x.x -t 60 -c check_procs -a '-C sshd -c 1:'
PROCS OK: 2 processes with command name 'sshd' | procs=2;;1:;0;
In the GUI:
example-01.jpg
example-02.jpg
Try the following:

1. Check your db log for errors/crashed tables, and repair the db.
https://assets.nagios.com/downloads/nag ... tabase.pdf

2. Make sure your plugin works locally (on the target machine):

Code: Select all

/usr/lib64/nagios/plugins/check_procs -C sshd -c 1:
3. Make sure that your command is defined as this:

Code: Select all

command[check_procs]=/usr/lib64/nagios/plugins/check_procs $ARG1$
and there are no other definitions of the same command in the include directory (/etc/nrpe.d) or any other place.

4. Restart the NRPE daemon on the target machine after making the changes. If NRPE is not running as a standalone daemon, you may need to restart xinetd instead.

5. Test your command from the CLI and the GUI on the Nagios XI server.
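For step 3, a quick way to look for duplicate definitions is to search the config file and the include directory. This is a minimal sketch: find_command_defs is a hypothetical helper built on grep -rn, and the paths in the usage comment are the ones mentioned in this thread, so adjust them to your install.

```shell
#!/bin/sh
# Hypothetical helper: list every line that defines command[NAME] under the given paths.
# NAME is used inside a regular expression, so keep it to plain characters.
find_command_defs() {
    name=$1
    shift
    grep -rn "^[[:space:]]*command\[$name]=" "$@"
}

# Example usage on the target host (paths taken from this thread):
#   find_command_defs check_procs /etc/nagios/nrpe.cfg /etc/nrpe.d/
```

More than one line of output would mean the command is defined in several places, and it may not be obvious which definition the daemon ends up using.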

If you continue to have issues, we may need to move this to our ticketing system, and schedule a remote session.