check_nrpe problems: Unable to read output and seteuid(0)

This support forum board is for support questions relating to Nagios XI, our flagship commercial network monitoring solution.
markgreene
Posts: 11
Joined: Mon Jun 17, 2019 9:44 am

Post by markgreene »

I recently set up a new XI system using the offline tarball install. The installation ran without errors, and adding hosts and services to monitor is mostly going OK, except for check_nrpe.

Host being monitored:
cat /etc/redhat-release
Red Hat Enterprise Linux Server release 7.6 (Maipo)

rpm -qa |egrep -i "nagios|nrpe"
nagios-plugins-ssh-2.2.1-16.20180725git3429dad.el7.x86_64
nrpe-3.2.1-8.el7.x86_64
nagios-plugins-2.2.1-16.20180725git3429dad.el7.x86_64
nagios-plugins-swap-2.2.1-16.20180725git3429dad.el7.x86_64
nagios-plugins-perl-2.2.1-16.20180725git3429dad.el7.x86_64
nagios-plugins-load-2.2.1-16.20180725git3429dad.el7.x86_64
nagios-plugins-http-2.2.1-16.20180725git3429dad.el7.x86_64
nagios-plugins-ntp-2.2.1-16.20180725git3429dad.el7.x86_64
nagios-plugins-nagios-2.2.1-16.20180725git3429dad.el7.x86_64
nagios-common-4.4.3-1.el7.x86_64
nagios-plugins-nrpe-3.2.1-8.el7.x86_64
nagios-plugins-disk-2.2.1-16.20180725git3429dad.el7.x86_64

Nagios system:

cat /etc/redhat-release
Red Hat Enterprise Linux Server release 7.6 (Maipo)

rpm -qa |egrep -i "nagios|nrpe"
nagiosxi-pnp-5.6.3-1.el7.x86_64
nagiosxi-nagiosplugins-5.6.3-1.el7.x86_64
nagiosxi-nrpe-5.6.3-1.el7.x86_64
nagiosxi-nsca-5.6.3-1.el7.x86_64
perl-Nagios-Monitoring-Plugin-0.51-1.el7.noarch
nagiosxi-nxti-5.6.3-1.el7.x86_64
nagiosxi-ndoutils-5.6.3-1.el7.x86_64
nagiosxi-wkhtmltox-5.6.3-1.el7.x86_64
nagios-repo-7-3.el7.noarch
nagiosxi-nagvis-5.6.3-1.el7.x86_64
nagiosxi-shellinabox-5.6.3-1.el7.x86_64
nagiosxi-nrds-5.6.3-1.el7.x86_64
nagiosxi-wmic-5.6.3-1.el7.x86_64
nagiosxi-5.6.3-1.el7.x86_64
nagiosxi-mrtg-5.6.3-1.el7.x86_64
nagiosxi-nagioscore-5.6.3-1.el7.x86_64
nagiosxi-nagiosmobile-5.6.3-1.el7.x86_64

There is no proxy between the Nagios system and the monitored hosts.
Both the Nagios system and the monitored hosts are VMware virtual Linux systems.

tail -f /var/log/messages
Jun 17 10:12:29 cliplsat01 nrpe[20535]: CONN_CHECK_PEER: checking if host is allowed: 172.20.132.62 port 57483
Jun 17 10:12:29 cliplsat01 nrpe[20535]: is_an_allowed_host (AF_INET): is host >172.20.132.62< an allowed host >172.20.132.62<
Jun 17 10:12:29 cliplsat01 nrpe[20535]: is_an_allowed_host (AF_INET): is host >172.20.132.62< an allowed host >172.20.132.62<
Jun 17 10:12:29 cliplsat01 nrpe[20535]: is_an_allowed_host (AF_INET): host is in allowed host list!
Jun 17 10:12:29 cliplsat01 nrpe[20536]: WARNING: my_system() seteuid(0): Operation not permitted

I can run the plugin on the system at the command line just fine:
$ ./check_mem.py -w10 -c5
OK: Free memory percentage is 57% (18378 MB)

And as the "nrpe" userID:
$ sudo -u nrpe /usr/lib64/nagios/plugins/check_mem.py -w 10 -c 5
OK: Free memory percentage is 57% (18377 MB)

NRPE runs as user "nrpe":
ps -ef |grep nrpe
nrpe 17549 1 0 09:57 ? 00:00:00 /usr/sbin/nrpe -c /etc/nagios/nrpe.cfg -f -n

The host IP address has been added to "allowed_hosts" in /etc/nagios/nrpe.cfg;
"dont_blame_nrpe" has been set to 1; and the command has been added to the config file like this:
command[check_mem.py]=/usr/lib64/nagios/plugins/check_mem.py -c $ARG1$ -w $ARG2$

The custom plugin check_mem.py reads /proc/meminfo, which has this ownership and permissions:
ls -l /proc/meminfo
-r--r--r-- 1 root root 0 Jun 17 10:17 /proc/meminfo

So no setuid should be required to read that.

Running the check remotely from the Nagios system, I get this:
$ /usr/local/nagios/libexec/check_nrpe -n -E -g /root/nrpe_check.log -H cliplsat01 -c check_mem.py -a -w10 -c5
NRPE: Unable to read output

(I had to pass "-n" to check_nrpe and set "-n" in /etc/sysconfig/nrpe to get rid of the "Could not complete SSL handshake" error.)

And in the log file /usr/local/nagios/var/nagios.log I get this:
[1560777848] SERVICE NOTIFICATION: nagiosadmin;cliplsat01.pcc.int;RAM;UNKNOWN;xi_service_notification_handler;NRPE: Unable to read output

What config am I missing, and is there a way to turn on debug logging on the Nagios system so I can get more informative error messages?

Thanks,
mark greene
scottwilkerson
DevOps Engineer
Posts: 19396
Joined: Tue Nov 15, 2011 3:11 pm
Location: Nagios Enterprises

Re: check_nrpe problems: Unable to read output and seteuid(0

Post by scottwilkerson »

I think your arguments look incorrect; they should be

-a "10 5"
Can you run the following on the remote machine (cliplsat01)?

su nagios -c "/usr/lib64/nagios/plugins/check_mem.py -c 10 -w 5"
Also, I noticed that your check_nrpe command is writing to a log file in /root; this could be a problem.

Maybe try

/usr/local/nagios/libexec/check_nrpe -n -E -g /tmp/nrpe_check.log -H cliplsat01 -c check_mem.py -a "10 5"
Former Nagios employee
Creator:
ahumandesign.com
enneagrams.com

Re: check_nrpe problems: Unable to read output and seteuid(0

Post by scottwilkerson »

Also, what are the permissions of the following on the remote system?

ls -ld /proc
ls -al /usr/lib64/nagios/plugins/check_mem.py

Re: check_nrpe problems: Unable to read output and seteuid(0

Post by markgreene »

This still fails running on the Nagios system:

$ /usr/local/nagios/libexec/check_nrpe -n -H cliplsat01 -c check_mem.py -a "10 5"
NRPE: Unable to read output

Also, the plugin will fail if both the 'w' and the 'c' are not explicitly included:

$ /usr/lib64/nagios/plugins/check_mem.py 10 5
UNKNOWN: Missing critical threshold value.

$ ls -ld /proc
dr-xr-xr-x 228 root root 0 Jun 13 15:58 /proc

$ ls -l /usr/lib64/nagios/plugins/check_mem.py
-rwxr-xr-x 1 root nrpe 2808 Jun 14 15:34 /usr/lib64/nagios/plugins/check_mem.py

Re: check_nrpe problems: Unable to read output and seteuid(0

Post by markgreene »

Changing the ownership on the plugin to be exclusively nrpe doesn't produce new results; it still runs on the host:

ls -l /usr/lib64/nagios/plugins/check_mem.py
-rwxr-xr-x 1 nrpe nrpe 2808 Jun 14 15:34 /usr/lib64/nagios/plugins/check_mem.py

[root@cliplsat01 plugins]$ /usr/lib64/nagios/plugins/check_mem.py -w 10 -c 5
OK: Free memory percentage is 57% (18361 MB)

It still fails when run remotely with the same errors: "NRPE: Unable to read output" at the command line and the seteuid(0) warning in the log. I also verified that user "nrpe" can change directories all the way from / to /usr/lib64/nagios/plugins without a problem.

mark

Re: check_nrpe problems: Unable to read output and seteuid(0

Post by scottwilkerson »

scottwilkerson wrote:Can you run the following on the remote machine (cliplsat01)?

su nagios -c "/usr/lib64/nagios/plugins/check_mem.py -c 10 -w 5"
I wanted to see it run as the nagios user, but in your case it looks like your NRPE user is nrpe, so:

su nrpe -c "/usr/lib64/nagios/plugins/check_mem.py -c 10 -w 5"
markgreene wrote:Also, the plugin will fail if both the 'w' and the 'c' are not explicitly included:

$ /usr/lib64/nagios/plugins/check_mem.py 10 5
UNKNOWN: Missing critical threshold value.
This I know, but you already have -c and -w defined in the NRPE command, so they shouldn't be passed again:
markgreene wrote:command[check_mem.py]=/usr/lib64/nagios/plugins/check_mem.py -c $ARG1$ -w $ARG2$
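
To make the point about the command definition concrete, here is a minimal Python sketch of positional macro substitution. It assumes NRPE simply replaces $ARG1$, $ARG2$, ... with the arguments given after -a, in order; expand_command is a made-up helper for illustration, not NRPE's actual code.

```python
def expand_command(template, args):
    """Replace $ARG1$, $ARG2$, ... with the matching positional argument.

    Simplified model only (an assumption, not NRPE source code).
    """
    expanded = template
    for index, value in enumerate(args, start=1):
        expanded = expanded.replace("$ARG%d$" % index, value)
    return expanded


# Command definition quoted from nrpe.cfg in this thread.
TEMPLATE = "/usr/lib64/nagios/plugins/check_mem.py -c $ARG1$ -w $ARG2$"

# Flags passed again through -a, as in the first failing call (-a -w10 -c5).
print(expand_command(TEMPLATE, ["-w10", "-c5"]))

# Bare values only, one per macro.
print(expand_command(TEMPLATE, ["5", "10"]))
```

Under that assumption, the first call expands to a command line with the flags doubled up (-c -w10 -w -c5), while the second gives the intended -c 5 -w 10.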

Re: check_nrpe problems: Unable to read output and seteuid(0

Post by scottwilkerson »

It would be useful if you could attach your check_mem.py plugin

Re: check_nrpe problems: Unable to read output and seteuid(0

Post by markgreene »

scottwilkerson wrote:this I know but you have the -c and -w defined in the nrpe command so they shouldn't be passed again
Ah, thanks for that, I clearly missed it. Forest vs. trees problem, and I'm target-fixated on specific trees. Pulling back....

This fails:

$ su nrpe -c "/usr/lib64/nagios/plugins/check_mem.py -c 10 -w 5"
This account is currently not available.

due to nrpe being set up with /sbin/nologin as its shell. This works:

sudo -u nrpe /usr/lib64/nagios/plugins/check_mem.py -w 10 -c 5
OK: Free memory percentage is 57% (18375 MB)

I can also run it as my login:
$ whoami; /usr/lib64/nagios/plugins/check_mem.py -w 10 -c 5
greenemj
OK: Free memory percentage is 57% (18385 MB)

Here is the plugin code:

#!/usr/bin/env python

"""

    Nagios plugin to report Memory usage by parsing /proc/meminfo

    by L.S. Keijser <keijser@stone-it.com>

    This script takes Cached memory into consideration by adding that
    to the total MemFree value.

"""

from optparse import OptionParser
import sys

checkmemver = '0.1'

# Parse commandline options:
parser = OptionParser(usage="%prog -w <warning threshold> -c <critical threshold> [ -h ]",version="%prog " + checkmemver)
parser.add_option("-w", "--warning", action="store", type="string", dest="warn_threshold", help="Warning threshold in percentage")
parser.add_option("-c", "--critical", action="store", type="string", dest="crit_threshold", help="Critical threshold in percentage")
(options, args) = parser.parse_args()


def readLines(filename):
    # Close the file handle once the lines have been read.
    with open(filename, "r") as f:
        return f.readlines()

def readMemValues():
    global memTotal, memCached, memFree, memBuffers
    for line in readLines('/proc/meminfo'):
        if line.split()[0] == 'MemTotal:':
            memTotal = line.split()[1]
        if line.split()[0] == 'MemFree:':
            memFree = line.split()[1]
        if line.split()[0] == 'Cached:':
            memCached = line.split()[1]
        if line.split()[0] == 'Buffers:':
            memBuffers = line.split()[1]

def percMem():
    readMemValues()
    return (((int(memFree) + int(memCached) + int(memBuffers)) * 100) / int(memTotal))

def realMem():
    readMemValues()
    return (int(memFree) + int(memCached) + int(memBuffers)) / 1024

def go():
    if not options.crit_threshold:
        print "UNKNOWN: Missing critical threshold value."
        sys.exit(3)
    if not options.warn_threshold:
        print "UNKNOWN: Missing warning threshold value."
        sys.exit(3)
    if int(options.crit_threshold) >= int(options.warn_threshold):
        print "UNKNOWN: Critical percentage can't be equal to or bigger than warning percentage."
        sys.exit(3)
    trueFree = percMem()
    trueMemFree = realMem()
    if int(trueFree) <= int(options.crit_threshold):
        print "CRITICAL: Free memory percentage is less than or equal to " + options.crit_threshold + "%: " + str(trueFree) + "% (" + str(trueMemFree) + " MB)"
        sys.exit(2)
    if int(trueFree) <= int(options.warn_threshold):
        print "WARNING: Free memory percentage is less than or equal to " + options.warn_threshold + "%: " + str(trueFree) + "% (" + str(trueMemFree) + " MB)"
        sys.exit(1)
    else:
        print "OK: Free memory percentage is " + str(trueFree) + "% (" + str(trueMemFree) +" MB)"
        sys.exit(0)

if __name__ == '__main__':
    try:
        go()
    except Exception, err:
        sys.stderr.write("Fatal memory check error: %s\n" % str(err))
        sys.exit(2)

sys.exit(0)
# end of program #
I run this code just fine on RHEL6 and 7 systems with my old Nagios system which is Core v4.4.3.
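
One detail in the plugin above: the except branch reports its error on stderr, not stdout. As I understand it, "NRPE: Unable to read output" generally means no text came back from the plugin, so it can help to look at stdout, stderr and the exit code separately. A rough sketch follows; run_and_report is a hypothetical helper, and the child command is only a stand-in that writes to stderr and exits with status 2, the way the plugin's except branch does.

```python
import subprocess
import sys


def run_and_report(argv):
    """Run a command and return (exit_code, stdout_text, stderr_text)."""
    proc = subprocess.Popen(argv, stdout=subprocess.PIPE, stderr=subprocess.PIPE)
    out, err = proc.communicate()
    return (proc.returncode,
            out.decode("utf-8", "replace"),
            err.decode("utf-8", "replace"))


# Stand-in child process (an assumption for the demo): prints nothing on
# stdout, writes an error to stderr, and exits with status 2.
child = [sys.executable, "-c",
         "import sys; sys.stderr.write('boom\\n'); sys.exit(2)"]

code, out, err = run_and_report(child)
print("exit code: %d" % code)
print("stdout: %r" % out)
print("stderr: %r" % err)
```

To try it against the real plugin, replace child with the argv list for check_mem.py, using the exact arguments NRPE would end up passing.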

mark

Re: check_nrpe problems: Unable to read output and seteuid(0

Post by scottwilkerson »

Hmm, I tested this on my system and it worked, but I did first get

UNKNOWN: Critical percentage can't be equal to or bigger than warning percentage.
so I had to change the arguments to

/usr/local/nagios/libexec/check_nrpe -n -E -g /tmp/nrpe_check.log -H cliplsat01 -c check_mem.py -a "5 10"
Are you seeing any errors in the syslog on the remote machine when trying to execute the above?
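
The ordering issue follows from the plugin's go() function, which returns UNKNOWN whenever the critical value is greater than or equal to the warning value. A small sketch that mirrors those comparisons (classify is a made-up name, and 57 is just the free percentage from the output posted earlier):

```python
def classify(free_pct, warn, crit):
    """Mirror the threshold comparisons in check_mem.py's go().

    Returns (exit_code, label) using the usual Nagios exit codes.
    """
    if crit >= warn:
        return 3, "UNKNOWN"
    if free_pct <= crit:
        return 2, "CRITICAL"
    if free_pct <= warn:
        return 1, "WARNING"
    return 0, "OK"


# The command is defined as "-c $ARG1$ -w $ARG2$", so the first value is critical.
print(classify(57, warn=5, crit=10))   # arguments "10 5" -> (3, 'UNKNOWN')
print(classify(57, warn=10, crit=5))   # arguments "5 10" -> (0, 'OK')
```

Because the plugin measures free memory, the critical threshold has to be the smaller number.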

Re: check_nrpe problems: Unable to read output and seteuid(0

Post by scottwilkerson »

Also, can you show the output of the following on the remote system?

/usr/sbin/nrpe -h
We need to make sure it was compiled with argument support