check_nrpe!check_cpu and others not working, weird issue

amityweb
Posts: 7
Joined: Sun Jun 22, 2014 2:50 am

check_nrpe!check_cpu and others not working, weird issue

Post by amityweb »

Hi

I am really struggling with an issue I have...

Running this on the remote machine works fine:

Code: Select all

[root@mail ~]# /usr/lib64/nagios/plugins/check_cpu
OK: CPU is 98% idle | cpu=2%;101;102;0;100
Running it on the nagios server produced an error:

Code: Select all

root@monitor:~# /usr/lib/nagios/plugins/check_nrpe -H myhost.com -c check_cpu
NRPE: Unable to read output
I have tried changing permissions, sudoers and shell settings, along with other fixes people report online for what looks like a very common error. But here's the thing: if I replace the contents of check_cpu with a simple echo "hi!" message, it works fine from the server.

Code: Select all

root@monitor:~# /usr/lib/nagios/plugins/check_nrpe -H myhost.com -c check_cpu
Hi
Does anyone know why this is? Maybe python can't be executed remotely, but bash can?


I get similar issues with other scripts, such as check_load. It works locally on the remote machine, but not from the Nagios server:

Code: Select all

root@monitor:~# /usr/lib/nagios/plugins/check_nrpe -H myhost.com -c check_load -a '-w 15,10,5 -c 30,25,20'
Warning threshold must be float or float triplet!

Usage:
check_load [-r] -w WLOAD1,WLOAD5,WLOAD15 -c CLOAD1,CLOAD5,CLOAD15
slansing
Posts: 7698
Joined: Mon Apr 23, 2012 4:28 pm
Location: Travelling through time and space...

Re: check_nrpe!check_cpu and others not working, weird issue

Post by slansing »

Can you show us your nrpe.cfg file from that remote server? Thanks!
amityweb
Posts: 7
Joined: Sun Jun 22, 2014 2:50 am

Re: check_nrpe!check_cpu and others not working, weird issue

Post by amityweb »

I was just about to say I have dont_blame_nrpe=1

Here's the file:

Code: Select all

#############################################################################
# Sample NRPE Config File 
# Written by: Ethan Galstad (nagios@nagios.org)
# 
# Last Modified: 11-23-2007
#
# NOTES:
# This is a sample configuration file for the NRPE daemon.  It needs to be
# located on the remote host that is running the NRPE daemon, not the host
# from which the check_nrpe client is being executed.
#############################################################################


# LOG FACILITY
# The syslog facility that should be used for logging purposes.

log_facility=daemon



# PID FILE
# The name of the file in which the NRPE daemon should write it's process ID
# number.  The file is only written if the NRPE daemon is started by the root
# user and is running in standalone mode.

pid_file=/var/run/nrpe.pid



# PORT NUMBER
# Port number we should wait for connections on.
# NOTE: This must be a non-priviledged port (i.e. > 1024).
# NOTE: This option is ignored if NRPE is running under either inetd or xinetd

server_port=5666



# SERVER ADDRESS
# Address that nrpe should bind to in case there are more than one interface
# and you do not want nrpe to bind on all interfaces.
# NOTE: This option is ignored if NRPE is running under either inetd or xinetd

#server_address=127.0.0.1



# NRPE USER
# This determines the effective user that the NRPE daemon should run as.  
# You can either supply a username or a UID.
# 
# NOTE: This option is ignored if NRPE is running under either inetd or xinetd

nrpe_user=nrpe



# NRPE GROUP
# This determines the effective group that the NRPE daemon should run as.  
# You can either supply a group name or a GID.
# 
# NOTE: This option is ignored if NRPE is running under either inetd or xinetd

nrpe_group=nrpe



# ALLOWED HOST ADDRESSES
# This is an optional comma-delimited list of IP address or hostnames 
# that are allowed to talk to the NRPE daemon. Network addresses with a bit mask
# (i.e. 192.168.1.0/24) are also supported. Hostname wildcards are not currently 
# supported.
#
# Note: The daemon only does rudimentary checking of the client's IP
# address.  I would highly recommend adding entries in your /etc/hosts.allow
# file to allow only the specified host to connect to the port
# you are running this daemon on.
#
# NOTE: This option is ignored if NRPE is running under either inetd or xinetd

allowed_hosts=127.0.0.1,myhost.com
 


# COMMAND ARGUMENT PROCESSING
# This option determines whether or not the NRPE daemon will allow clients
# to specify arguments to commands that are executed.  This option only works
# if the daemon was configured with the --enable-command-args configure script
# option.  
#
# *** ENABLING THIS OPTION IS A SECURITY RISK! *** 
# Read the SECURITY file for information on some of the security implications
# of enabling this variable.
#
# Values: 0=do not allow arguments, 1=allow command arguments

dont_blame_nrpe=1



# COMMAND PREFIX
# This option allows you to prefix all commands with a user-defined string.
# A space is automatically added between the specified prefix string and the
# command line from the command definition.
#
# *** THIS EXAMPLE MAY POSE A POTENTIAL SECURITY RISK, SO USE WITH CAUTION! ***
# Usage scenario: 
# Execute restricted commmands using sudo.  For this to work, you need to add
# the nagios user to your /etc/sudoers.  An example entry for alllowing 
# execution of the plugins from might be:
#
# nagios          ALL=(ALL) NOPASSWD: /usr/lib/nagios/plugins/
#
# This lets the nagios user run all commands in that directory (and only them)
# without asking for a password.  If you do this, make sure you don't give
# random users write access to that directory or its contents!

# command_prefix=/usr/bin/sudo 



# DEBUGGING OPTION
# This option determines whether or not debugging messages are logged to the
# syslog facility.
# Values: 0=debugging off, 1=debugging on

debug=0



# COMMAND TIMEOUT
# This specifies the maximum number of seconds that the NRPE daemon will
# allow plugins to finish executing before killing them off.

command_timeout=60



# CONNECTION TIMEOUT
# This specifies the maximum number of seconds that the NRPE daemon will
# wait for a connection to be established before exiting. This is sometimes
# seen where a network problem stops the SSL being established even though
# all network sessions are connected. This causes the nrpe daemons to
# accumulate, eating system resources. Do not set this too low.

connection_timeout=300



# WEEK RANDOM SEED OPTION
# This directive allows you to use SSL even if your system does not have
# a /dev/random or /dev/urandom (on purpose or because the necessary patches
# were not applied). The random number generator will be seeded from a file
# which is either a file pointed to by the environment valiable $RANDFILE
# or $HOME/.rnd. If neither exists, the pseudo random number generator will
# be initialized and a warning will be issued.
# Values: 0=only seed from /dev/[u]random, 1=also seed from weak randomness

#allow_weak_random_seed=1



# INCLUDE CONFIG FILE
# This directive allows you to include definitions from an external config file.

#include=<somefile.cfg>



# INCLUDE CONFIG DIRECTORY
# This directive allows you to include definitions from config files (with a
# .cfg extension) in one or more directories (with recursion).

include_dir=/etc/nrpe.d/



# COMMAND DEFINITIONS
# Command definitions that this daemon will run.  Definitions
# are in the following format:
#
# command[<command_name>]=<command_line>
#
# When the daemon receives a request to return the results of <command_name>
# it will execute the command specified by the <command_line> argument.
#
# Unlike Nagios, the command line cannot contain macros - it must be
# typed exactly as it should be executed.
#
# Note: Any plugins that are used in the command lines must reside
# on the machine that this daemon is running on!  The examples below
# assume that you have plugins installed in a /usr/local/nagios/libexec
# directory.  Also note that you will have to modify the definitions below
# to match the argument format the plugins expect.  Remember, these are
# examples only!


# The following examples use hardcoded command arguments...

#command[check_users]=/usr/libexec/nagios/plugins/check_users -w 5 -c 10
#command[check_load]=/usr/libexec/nagios/plugins/check_load -w 15,10,5 -c 30,25,20
#command[check_hda1]=/usr/libexec/nagios/plugins/check_disk -w 20% -c 10% -p /dev/hda1
#command[check_zombie_procs]=/usr/libexec/nagios/plugins/check_procs -w 5 -c 10 -s Z
#command[check_total_procs]=/usr/libexec/nagios/plugins/check_procs -w 150 -c 200 


# The following examples allow user-supplied arguments and can
# only be used if the NRPE daemon was compiled with support for 
# command arguments *AND* the dont_blame_nrpe directive in this
# config file is set to '1'.  This poses a potential security risk, so
# make sure you read the SECURITY file before doing this.


command[check_cpu]=sudo /usr/lib64/nagios/plugins/check_cpu -w $ARG1$ -c $ARG2$
command[check_users]=sudo /usr/lib64/nagios/plugins/check_users -w $ARG1$ -c $ARG2$
command[check_load]=sudo /usr/lib64/nagios/plugins/check_load -w $ARG1$ -c $ARG2$
command[check_disk]=sudo /usr/lib64/nagios/plugins/check_disk -w $ARG1$ -c $ARG2$ -p $ARG3$
command[check_procs]=sudo /usr/lib64/nagios/plugins/check_procs -w $ARG1$ -c $ARG2$ -s $ARG3$
command[check_all_disks]=sudo /usr/lib64/nagios/plugins/check_disk -w $ARG1$ -c $ARG2$ -p $ARG3$
I have also added this to sudoers

Code: Select all

nagios          ALL=(ALL) NOPASSWD: /usr/lib64/nagios/plugins/
Edit: I believe I also installed NRPE using yum.
Box293
Too Basu
Posts: 5126
Joined: Sun Feb 07, 2010 10:55 pm
Location: Deniliquin, Australia

Re: check_nrpe!check_cpu and others not working, weird issue

Post by Box293 »

There are a couple of things here which we need to do to get you working.
amityweb wrote:Running this on the remote machine works fine:

Code: Select all

[root@mail ~]# /usr/lib64/nagios/plugins/check_cpu
OK: CPU is 98% idle | cpu=2%;101;102;0;100
Running it on the nagios server produced an error:

Code: Select all

root@monitor:~# /usr/lib/nagios/plugins/check_nrpe -H myhost.com -c check_cpu
NRPE: Unable to read output
And the line from your config file is as follows:

Code: Select all

command[check_cpu]=sudo /usr/lib64/nagios/plugins/check_cpu -w $ARG1$ -c $ARG2$
Firstly, we don't need to sudo the command to make it work, so change the line in your config file to:

Code: Select all

command[check_cpu]=/usr/lib64/nagios/plugins/check_cpu -w $ARG1$ -c $ARG2$
Now restart NRPE on your remote server.

Also, your command in nrpe.cfg expects warning and critical values.

So when you test it from the Nagios host like this:

Code: Select all

/usr/lib/nagios/plugins/check_nrpe -H myhost.com -c check_cpu
What is being executed on the remote host is:

Code: Select all

/usr/lib64/nagios/plugins/check_cpu -w -c
This is why it is failing: you are not providing any values for warning and critical.

So you must test from Nagios like this:

Code: Select all

/usr/lib/nagios/plugins/check_nrpe -H myhost.com -c check_cpu -a warning_value critical_value
Notice that we don't specify the -w and -c flags.
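
To make the substitution concrete, here is a minimal Python sketch of how the $ARGn$ macros in an nrpe.cfg command line could be filled in from whatever follows -a. It is a simplified model, not NRPE's actual code: the function name expand_nrpe_command is hypothetical, and a missing argument is assumed to expand to an empty string.

```python
import re

def expand_nrpe_command(template, args):
    """Replace each $ARGn$ macro with the n-th client argument.

    Assumption: an argument that was not supplied expands to an empty string.
    """
    def substitute(match):
        index = int(match.group(1)) - 1
        return args[index] if index < len(args) else ""
    return re.sub(r"\$ARG(\d+)\$", substitute, template)

# Command line taken from the nrpe.cfg discussed in this thread (sudo removed).
TEMPLATE = "/usr/lib64/nagios/plugins/check_load -w $ARG1$ -c $ARG2$"

# check_nrpe ... -c check_load            (no -a at all)
no_args = expand_nrpe_command(TEMPLATE, [])

# check_nrpe ... -a '-w 15,10,5 -c 30,25,20'   (one quoted argument)
one_quoted_arg = expand_nrpe_command(TEMPLATE, ["-w 15,10,5 -c 30,25,20"])

# check_nrpe ... -a 15,10,5 30,25,20      (two separate arguments)
two_args = expand_nrpe_command(TEMPLATE, ["15,10,5", "30,25,20"])

for line in (no_args, one_quoted_arg, two_args):
    print(repr(line))
```

In this model the quoted form lands entirely in $ARG1$, so the plugin would be called with -w -w 15,10,5 -c 30,25,20 -c, which is consistent with the "Warning threshold must be float or float triplet!" error seen earlier. Only the last form produces the intended -w 15,10,5 -c 30,25,20.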



Using check_load as another example.

Change the line in your config file to:

Code: Select all

command[check_load]=/usr/lib64/nagios/plugins/check_load -w $ARG1$ -c $ARG2$
Now restart NRPE on your remote server.

Now test from Nagios:

Code: Select all

/usr/lib/nagios/plugins/check_nrpe -H myhost.com -c check_load -a 15,10,5 30,25,20
Which should output something like:

Code: Select all

OK - load average: 0.00, 0.00, 0.00|load1=0.000;15.000;30.000;0; load5=0.000;10.000;25.000;0; load15=0.000;5.000;20.000;0;
My test system has no load :lol:

Once you understand how NRPE passes values to the client and how the client uses the values in the command it should become a lot easier!

Does this help you out?
amityweb
Posts: 7
Joined: Sun Jun 22, 2014 2:50 am

Re: check_nrpe!check_cpu and others not working, weird issue

Post by amityweb »

Ahhhhhh... Not passing warning and critical values when running the plugin locally on the remote machine still works, so I assumed that not passing them from the Nagios server would work too! I thought it would run the same command. I didn't expect it to pass an empty -w -c. I couldn't find the commands in any logs, so I had no way to see that the values were not being passed.

Secondly, when I did pass values from the Nagios server, I included -w and -c as part of the arguments. I thought the -a flag took the entire argument string to pass, so '-w 20 -c 10'.

As for sudo in the config file, I read somewhere online to do this. I've tried quite a lot of different things!

So running the command from the host is now working and fetching the data! Thank you!!

But the Nagios web interface is still showing the errors I was getting on the command line, but armed with the above info I am better equipped to investigate!

Thanks a lot, this is a huge step forward for me.

Edit: The command line passed by the host is:

Code: Select all

/usr/lib/nagios/plugins/check_nrpe -H $HOSTADDRESS$ -c check_cpu -a 20
and it states

Code: Select all

unused:	$ARG3$=10
So it seems $ARG1$ is check_cpu and $ARG2$ is 20, so my -c value is not being passed because $ARG3$ is not used.

This is my service definition

Code: Select all

define service{
	use                     generic-service		; Name of service template to use
	hostgroup_name          servers
	service_description     CPU
	check_command           check_nrpe!check_cpu!85!95
}
Edit: I think I have done it... Based on your explanations, I tried separating the warning and critical values with a space instead of an exclamation mark, so that both are passed together as $ARG2$, which I guess in turn become $ARG1$ and $ARG2$ on the remote machine. I think it's working now.

Code: Select all

define service{
	use                     generic-service		; Name of service template to use
	hostgroup_name          servers
	service_description     CPU
	check_command           check_nrpe!check_cpu!20 10
}
So the command being executed is:

Code: Select all

/usr/lib/nagios/plugins/check_nrpe -H mail.amitywebsolutions.co.uk -c check_cpu -a 20 10
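
To illustrate why the space works where the exclamation mark did not, here is a rough Python sketch of the Nagios side. It assumes the check_nrpe command on the Nagios server is defined as `check_nrpe -H $HOSTADDRESS$ -c $ARG1$ -a $ARG2$`, which is inferred from the command lines shown above rather than confirmed. The function name build_command_line is hypothetical and this is only a model of the macro expansion, not Nagios code.

```python
import re

# Assumed command definition on the Nagios server (inferred, not confirmed).
CHECK_NRPE_DEFINITION = (
    "/usr/lib/nagios/plugins/check_nrpe -H $HOSTADDRESS$ -c $ARG1$ -a $ARG2$"
)

def build_command_line(check_command, host_address,
                       definition=CHECK_NRPE_DEFINITION):
    """Split check_command on '!' and expand $ARGn$ in the definition.

    Returns the expanded command line plus any arguments that the
    definition never referenced.
    """
    _name, *args = check_command.split("!")
    used = set()

    def substitute(match):
        index = int(match.group(1))
        used.add(index)
        return args[index - 1] if index <= len(args) else ""

    line = re.sub(r"\$ARG(\d+)\$", substitute, definition)
    line = line.replace("$HOSTADDRESS$", host_address)
    unused = {f"$ARG{i}$": value
              for i, value in enumerate(args, start=1) if i not in used}
    return line, unused

# '!' between the thresholds: the second value becomes $ARG3$ and is dropped.
with_bang = build_command_line("check_nrpe!check_cpu!20!10", "myhost.com")

# Space between the thresholds: both values travel together inside $ARG2$.
with_space = build_command_line("check_nrpe!check_cpu!20 10", "myhost.com")

print(with_bang)
print(with_space)
```

Under that assumed definition, the first form leaves $ARG3$ unused, while the second places both thresholds after -a, where check_nrpe passes them on as separate arguments.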
Edit: One last question... surely this has to be very very common, monitoring remote machines, but why can't I find this documented anywhere? Or is there another way that is documented, and the above is not correct?
Last edited by slansing on Tue Jun 24, 2014 12:17 pm, edited 1 time in total.
Reason: Combined all of your posts into one, please edit your previous post to add additional information if you are the last poster. Thanks!
tmcdonald
Posts: 9117
Joined: Mon Sep 23, 2013 8:40 am

Re: check_nrpe!check_cpu and others not working, weird issue

Post by tmcdonald »

Thank you again Box for helping us out!
amityweb wrote:Edit: One last question... surely this has to be very very common, monitoring remote machines, but why can't I find this documented anywhere? Or is there another way that is documented, and the above is not correct?
@amityweb - Have you seen our NRPE documentation? It is a bit old but still accurate.
Former Nagios employee
amityweb
Posts: 7
Joined: Sun Jun 22, 2014 2:50 am

Re: check_nrpe!check_cpu and others not working, weird issue

Post by amityweb »

Hi @tmcdonald

I don't think I saw this specific one. To be honest, I don't think the documentation is very user-friendly for newbies. There is a lot of digging around, so I just use other people's blogs with easy-to-follow installation guides (all of which are different, which makes it worse!).

BUT... that NRPE PDF does not even explain how to pass arguments when using check_nrpe, i.e. that they need spaces instead of !. All other documentation says to use ! when not using check_nrpe, which completely threw me, and it is why I am surprised about this issue considering how many people must be using check_nrpe! If that PDF is where people turn to, how have other people figured out to use spaces? I had to interpret the errors and use trial and error. And I couldn't find this info on other people's blogs either.

Box's advice above was a complete eye-opener and easy to understand, and with it I quickly realised I needed spaces in the arguments instead of !

Thanks
eloyd
Cool Title Here
Posts: 2129
Joined: Thu Sep 27, 2012 9:14 am
Location: Rochester, NY

Re: check_nrpe!check_cpu and others not working, weird issue

Post by eloyd »

FYI, Nagios Core is free, with public support (such as this).

Nagios XI is paid software, with premium and (if I do say so myself) top-notch support directly from Nagios Enterprises. XI also includes... oh, I'll go with lots of add-ons, customization, reporting, management and wizard capabilities that make it a lot easier for people who don't have a sysadmin background to start monitoring things.
Eric Loyd • http://everwatch.global • 844.240.EVER • @EricLoyd
I'm a Nagios Fanatic!
Box293
Too Basu
Posts: 5126
Joined: Sun Feb 07, 2010 10:55 pm
Location: Deniliquin, Australia

Re: check_nrpe!check_cpu and others not working, weird issue

Post by Box293 »

Great stuff, glad I could help you learn Nagios :)

It's good to get some honest feedback on the documentation. We were all newbies at one stage, and we can sometimes forget how it feels to go through the initial learning process.
amityweb
Posts: 7
Joined: Sun Jun 22, 2014 2:50 am

Re: check_nrpe!check_cpu and others not working, weird issue

Post by amityweb »

@eloyd, yeah, I'm aware it's free, and the support I got here for this was great. I'm aware of the commercial version, but the price is out of my league, to be honest. It's clearly aimed at large enterprises. I run a small web development company with 11 servers, and $2000 is too much for this (especially as there look to be so many more commercial add-ons; would this base price even be enough?). I would be happy to pay a few hundred pounds for something like this, but not $2000. But we need a monitoring system. I pay a cloud provider (CopperEgg) approx. £80 per month at the moment, so I am looking to eliminate this cost by using Nagios (or one of the other self-hosted systems). I wouldn't do it for $2000, though, as I'd still have to invest all the time too!

When my company grows to about 10 times the size, I will certainly buy it :)