Monitor Nagios XI and remote Linux servers - Nagios Support Forum


xlin125: Posts: 172; Joined: Mon Jan 19, 2015 6:01 pm

Monitor Nagios XI and remote Linux servers

Post by xlin125 » Tue Jul 07, 2015 5:25 pm

We plan to create a common set of services to monitor system resources such as CPU, swap, memory, the root file system, crond, etc. on both the Nagios XI server and remote Linux servers using check_nrpe. This works fine on the remote Linux servers. However, when we assign such a service to "localhost" (the host on which Nagios XI is installed), we receive an "Unknown" service status under Service Details with the message "CHECK_NRPE: Received 0 bytes from daemon. Check the remote server logs for error messages.". If we assign the service to the Nagios XI server's own host entry instead of localhost in the Manage Hosts list, we get "CHECK_NRPE: Error - Could not complete SSL handshake". After we added the Nagios XI server's host name to /etc/xinetd.d/nrpe, to allow the Nagios XI server to access the NRPE agent running on the same machine, we got a service status of "Unknown" again. Can check_nrpe work for the Nagios XI server itself? It seems this would just let the Nagios XI server talk back to the NRPE agent that co-resides with it. Of course, we can always use something like check_local_load, check_local_swap and check_local_procs for a similar purpose, but then we would not be able to share a "common" set of services between the Nagios XI server and the remote Linux servers. Thanks.

Box293: Too Basu; Posts: 5126; Joined: Sun Feb 07, 2010 10:55 pm; Location: Deniliquin, Australia; Contact:

Re: Monitor Nagios XI and remote Linux servers

Post by Box293 » Tue Jul 07, 2015 8:58 pm

xlin125 wrote:Can check_nrpe work for Nagios XI server itself?

Yes it can. The things you need to consider are:

xlin125 wrote:Then we add the Nagios XI server host name to /etc/xinetd.d/nrpe to allow the Nagios XI server on the system to access the NRPE agent running on the same Nagios XI server, we get an service status of "Unknown" again.

Try the IP Address instead of the hostname. Off the top of my head 127.0.0.1 and the eth0 IP address should probably be added.

The best communication test is to just do a check on itself without using -c

Code: Select all

./check_nrpe -H 127.0.0.1
NRPE v2.15

Code: Select all

./check_nrpe -H 10.25.5.2
CHECK_NRPE: Error - Could not complete SSL handshake.

I had to edit /etc/xinetd.d/nrpe and add the eth0 IP address to the only_from line, so that it reads: only_from = 127.0.0.1 10.25.5.2

Then restart xinetd

Code: Select all

service xinetd restart

Code: Select all

./check_nrpe -H 10.25.5.2
NRPE v2.15

All working
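
The only_from edit above can be sanity-checked before restarting xinetd. The following is a minimal sketch, not part of any Nagios tooling: `only_from_has` is a hypothetical helper name, and it only does an exact string match on each entry, so network ranges or `only_from +=` lines would need extra handling.

```shell
#!/bin/sh
# Hypothetical helper: succeed if ADDRESS appears as an entry on the
# only_from line of an xinetd service file (exact match only).
# Usage: only_from_has FILE ADDRESS
only_from_has() {
    file=$1
    addr=$2
    # Print everything after "only_from =" and split it on whitespace.
    for entry in $(sed -n 's/^[[:space:]]*only_from[[:space:]]*=//p' "$file"); do
        [ "$entry" = "$addr" ] && return 0
    done
    return 1
}
```

For example, `only_from_has /etc/xinetd.d/nrpe 10.25.5.2 || echo "10.25.5.2 missing from only_from"` would flag the situation that produced the SSL handshake error above.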

Also, the firewall rules might be preventing the 5666 port inbound to itself.

Code: Select all

iptables -I INPUT -p tcp --destination-port 5666 -j ACCEPT
service iptables save

Does this help?

xlin125: Posts: 172; Joined: Mon Jan 19, 2015 6:01 pm

Re: Monitor Nagios XI and remote Linux servers

Post by xlin125 » Wed Jul 08, 2015 1:27 pm

Thanks for the quick response.

I added the Nagios XI server IP address next to "127.0.0.1" on the "only_from" line in /etc/xinetd.d/nrpe. This resolved the SSL handshake issue. However, even though I added two new rules to iptables (one for port 5666/nrpe, another for 5667/nsca) to allow inbound traffic on port 5666 to the server itself, I still get the "Unknown" status with the error "CHECK_NRPE: Received 0 bytes from daemon. Check the remote server logs for error messages." on the Service Status page. The following log messages are found in the /usr/local/nagios/var/nagios.log file:

"[1436378251] SERVICE ALERT: localhost;Swap_util_host_xi;UNKNOWN;HARD;1;CHECK_NRPE: Received 0 bytes from daemon. Check the remote server logs for error messages.
[1436378251] SERVICE NOTIFICATION: nagiosadmin;localhost;Swap_util_host_xi;UNKNOWN;xi_service_notification_handler;CHECK_NRPE: Received 0 bytes from daemon. Check the remote server logs for error messages."

Here is the current INPUT iptables status (port 5666/nrpe and 5667/nsca):
# service iptables status
Table: filter
Chain INPUT (policy ACCEPT)
num target prot opt source destination
1 ACCEPT all -- 0.0.0.0/0 0.0.0.0/0 state RELATED,ESTABLISHED
2 ACCEPT icmp -- 0.0.0.0/0 0.0.0.0/0
3 ACCEPT all -- 0.0.0.0/0 0.0.0.0/0
4 ACCEPT tcp -- 0.0.0.0/0 0.0.0.0/0 state NEW tcp dpt:22
5 ACCEPT tcp -- 0.0.0.0/0 0.0.0.0/0 state NEW tcp dpt:80
6 ACCEPT tcp -- 0.0.0.0/0 0.0.0.0/0 state NEW tcp dpt:443
7 ACCEPT tcp -- 0.0.0.0/0 0.0.0.0/0 state NEW tcp dpt:5666
8 ACCEPT tcp -- 0.0.0.0/0 0.0.0.0/0 state NEW tcp dpt:5667
9 REJECT all -- 0.0.0.0/0 0.0.0.0/0 reject-with icmp-host-prohibited

Any suggested solutions/tips to make this work (Nagios XI calls check_nrpe to itself)? Thanks in advance!

jolson: Attack Rabbit; Posts: 2560; Joined: Thu Feb 12, 2015 12:40 pm

Re: Monitor Nagios XI and remote Linux servers

Post by jolson » Wed Jul 08, 2015 2:43 pm

The best way to approach this will be one step at a time. First, let's run check_nrpe from the command line against localhost to ensure that the nagios server can talk to itself:

Code: Select all

./check_nrpe -H 127.0.0.1

Code: Select all

./check_nrpe -H 192.168.x.x

where 192.168.x.x is your private IP.

Did either of those work for you? If so, let's take the one that succeeded and work with it. Can you then run check commands against localhost?

Code: Select all

./check_nrpe -H 127.0.0.1 -c check_users

-or-

Code: Select all

./check_nrpe -H 192.168.x.x -c check_users
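
The two steps above (connect first, then run a defined command) can be wrapped in a small script. This is a sketch under stated assumptions: `nrpe_stepwise` is a hypothetical helper name, the default path assumes check_nrpe sits in /usr/local/nagios/libexec, and treating exit status 3 or higher as "the check did not run" is a simplification.

```shell
#!/bin/sh
# Assumption: check_nrpe lives in the usual Nagios XI plugin directory.
# Override CHECK_NRPE if your install differs.
CHECK_NRPE=${CHECK_NRPE:-/usr/local/nagios/libexec/check_nrpe}

# nrpe_stepwise HOST [COMMAND]   (hypothetical helper name)
# Prints the result of each step; returns non-zero at the first failing step.
nrpe_stepwise() {
    host=$1
    cmd=${2:-check_users}

    # Step 1: no -c, so a healthy daemon only reports its version.
    if out=$("$CHECK_NRPE" -H "$host" 2>&1); then
        echo "step 1 ok: $out"
    else
        echo "step 1 (connection) failed for $host: $out"
        return 1
    fi

    # Step 2: run a command that is defined in nrpe.cfg on the daemon side.
    if out=$("$CHECK_NRPE" -H "$host" -c "$cmd" 2>&1); then
        rc=0
    else
        rc=$?
    fi
    # Assumption: status 3 (UNKNOWN) or higher means the check itself did not run.
    if [ "$rc" -ge 3 ]; then
        echo "step 2 ($cmd) failed for $host: $out"
        return 1
    fi
    echo "step 2 ok: $out"
}
```

Calling `nrpe_stepwise 127.0.0.1` and then `nrpe_stepwise 192.168.x.x` (with your private IP filled in) shows which address and which step is the first to break.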

xlin125: Posts: 172; Joined: Mon Jan 19, 2015 6:01 pm

Re: Monitor Nagios XI and remote Linux servers

Post by xlin125 » Thu Jul 09, 2015 12:43 am

I followed your suggestion to test it one step at a time. The results were:
1) Whether I use localhost (127.0.0.1) or the IP address of the Nagios XI server, the result is the same: both succeed or both fail.
2) Running the monitoring plugin directly, without check_nrpe, works. Running it through check_nrpe fails when arguments are passed with -a, but succeeds when no arguments are passed.

For examples:
[nagios@mtovis02 libexec]$ ./check_nrpe -H 127.0.0.1 -t 30 -c check_load -a '-w 70,60,50 -c 90,80,70'
CHECK_NRPE: Received 0 bytes from daemon. Check the remote server logs for error messages.
[nagios@mtovis02 libexec]$ ./check_load -w 70,60,50 -c 90,80,70
OK - load average: 0.08, 0.07, 0.01|load1=0.080;70.000;90.000;0; load5=0.070;60.000;80.000;0; load15=0.010;50.000;70.000;0;

[nagios@mtovis02 libexec]$ ./check_nrpe -H 127.0.0.1 -t 30 -c check_swap -a '-w 40 -c 20'
CHECK_NRPE: Received 0 bytes from daemon. Check the remote server logs for error messages.
[nagios@mtovis02 libexec]$ ./check_swap -w 40 -c 20
SWAP OK - 100% free (11999 MB out of 11999 MB) |swap=11999MB;0;0;0;11999

[nagios@mtovis02 libexec]$ ./check_nrpe -H 127.0.01 -t 30 -c check_users
USERS OK - 1 users currently logged in |users=1;5;10;0

What went wrong?

Box293: Too Basu; Posts: 5126; Joined: Sun Feb 07, 2010 10:55 pm; Location: Deniliquin, Australia; Contact:

Re: Monitor Nagios XI and remote Linux servers

Post by Box293 » Thu Jul 09, 2015 2:28 am

Can you please post these files from your XI server:

/usr/local/nagios/etc/nrpe.cfg
/usr/local/nagios/etc/nrpe/common.cfg

If you don't have /usr/local/nagios/etc/nrpe/common.cfg then I suspect I know what is going on.

The XI server may not have the same full install as the linux-nrpe-agent on other servers.

xlin125: Posts: 172; Joined: Mon Jan 19, 2015 6:01 pm

Re: Monitor Nagios XI and remote Linux servers

Post by xlin125 » Thu Jul 09, 2015 9:25 am

The /usr/local/nagios/etc/nrpe directory does not exist on the Nagios XI server. So there is no /usr/local/nagios/etc/nrpe/common.cfg file at all (see below). You are right that this is not a complete NRPE agent installed on the Nagios XI server. Is there any way to make this work (allow Nagios XI to call check_nrpe to monitor itself or to monitor another Nagios XI server system)? Or, have we concluded that this way will not work, period?

$ pwd
/usr/local/nagios/etc
$ ls -l nrpe
ls: cannot access nrpe: No such file or directory

The /usr/local/nagios/etc/nrpe.cfg file is the original file that was put in place when I installed Nagios XI on this box.

Code: Select all

$ pwd
/usr/local/nagios/etc
$ ls -l nrpe.cfg
-rw-rw-r--. 1 apache nagios 7988 May  7 11:35 nrpe.cfg
$ cat nrpe.cfg
#############################################################################
# Sample NRPE Config File
# Written by: Ethan Galstad (nagios@nagios.org)
#
# Last Modified: 11-23-2007
#
# NOTES:
# This is a sample configuration file for the NRPE daemon.  It needs to be
# located on the remote host that is running the NRPE daemon, not the host
# from which the check_nrpe client is being executed.
#############################################################################


# LOG FACILITY
# The syslog facility that should be used for logging purposes.

log_facility=daemon



# PID FILE
# The name of the file in which the NRPE daemon should write it's process ID
# number.  The file is only written if the NRPE daemon is started by the root
# user and is running in standalone mode.

pid_file=/var/run/nrpe.pid



# PORT NUMBER
# Port number we should wait for connections on.
# NOTE: This must be a non-priviledged port (i.e. > 1024).
# NOTE: This option is ignored if NRPE is running under either inetd or xinetd

server_port=5666



# SERVER ADDRESS
# Address that nrpe should bind to in case there are more than one interface
# and you do not want nrpe to bind on all interfaces.
# NOTE: This option is ignored if NRPE is running under either inetd or xinetd

#server_address=127.0.0.1



# NRPE USER
# This determines the effective user that the NRPE daemon should run as.
# You can either supply a username or a UID.
#
# NOTE: This option is ignored if NRPE is running under either inetd or xinetd

nrpe_user=nagios



# NRPE GROUP
# This determines the effective group that the NRPE daemon should run as.
# You can either supply a group name or a GID.
#
# NOTE: This option is ignored if NRPE is running under either inetd or xinetd

nrpe_group=nagios



# ALLOWED HOST ADDRESSES
# This is an optional comma-delimited list of IP address or hostnames
# that are allowed to talk to the NRPE daemon. Network addresses with a bit mask
# (i.e. 192.168.1.0/24) are also supported. Hostname wildcards are not currently
# supported.
#
# Note: The daemon only does rudimentary checking of the client's IP
# address.  I would highly recommend adding entries in your /etc/hosts.allow
# file to allow only the specified host to connect to the port
# you are running this daemon on.
#
# NOTE: This option is ignored if NRPE is running under either inetd or xinetd

allowed_hosts=127.0.0.1



# COMMAND ARGUMENT PROCESSING
# This option determines whether or not the NRPE daemon will allow clients
# to specify arguments to commands that are executed.  This option only works
# if the daemon was configured with the --enable-command-args configure script
# option.
#
# *** ENABLING THIS OPTION IS A SECURITY RISK! ***
# Read the SECURITY file for information on some of the security implications
# of enabling this variable.
#
# Values: 0=do not allow arguments, 1=allow command arguments

dont_blame_nrpe=0



# BASH COMMAND SUBTITUTION
# This option determines whether or not the NRPE daemon will allow clients
# to specify arguments that contain bash command substitutions of the form
# $(...).  This option only works if the daemon was configured with both
# the --enable-command-args and --enable-bash-command-substitution configure
# script options.
#
# *** ENABLING THIS OPTION IS A HIGH SECURITY RISK! ***
# Read the SECURITY file for information on some of the security implications
# of enabling this variable.
#
# Values: 0=do not allow bash command substitutions,
#         1=allow bash command substitutions

allow_bash_command_substitution=0



# COMMAND PREFIX
# This option allows you to prefix all commands with a user-defined string.
# A space is automatically added between the specified prefix string and the
# command line from the command definition.
#
# *** THIS EXAMPLE MAY POSE A POTENTIAL SECURITY RISK, SO USE WITH CAUTION! ***
# Usage scenario:
# Execute restricted commmands using sudo.  For this to work, you need to add
# the nagios user to your /etc/sudoers.  An example entry for alllowing
# execution of the plugins from might be:
#
# nagios          ALL=(ALL) NOPASSWD: /usr/lib/nagios/plugins/
#
# This lets the nagios user run all commands in that directory (and only them)
# without asking for a password.  If you do this, make sure you don't give
# random users write access to that directory or its contents!

# command_prefix=/usr/bin/sudo



# DEBUGGING OPTION
# This option determines whether or not debugging messages are logged to the
# syslog facility.
# Values: 0=debugging off, 1=debugging on

debug=0



# COMMAND TIMEOUT
# This specifies the maximum number of seconds that the NRPE daemon will
# allow plugins to finish executing before killing them off.

command_timeout=60



# CONNECTION TIMEOUT
# This specifies the maximum number of seconds that the NRPE daemon will
# wait for a connection to be established before exiting. This is sometimes
# seen where a network problem stops the SSL being established even though
# all network sessions are connected. This causes the nrpe daemons to
# accumulate, eating system resources. Do not set this too low.

connection_timeout=300



# WEEK RANDOM SEED OPTION
# This directive allows you to use SSL even if your system does not have
# a /dev/random or /dev/urandom (on purpose or because the necessary patches
# were not applied). The random number generator will be seeded from a file
# which is either a file pointed to by the environment valiable $RANDFILE
# or $HOME/.rnd. If neither exists, the pseudo random number generator will
# be initialized and a warning will be issued.
# Values: 0=only seed from /dev/[u]random, 1=also seed from weak randomness

#allow_weak_random_seed=1



# INCLUDE CONFIG FILE
# This directive allows you to include definitions from an external config file.

#include=<somefile.cfg>



# INCLUDE CONFIG DIRECTORY
# This directive allows you to include definitions from config files (with a
# .cfg extension) in one or more directories (with recursion).

#include_dir=<somedirectory>
#include_dir=<someotherdirectory>



# COMMAND DEFINITIONS
# Command definitions that this daemon will run.  Definitions
# are in the following format:
#
# command[<command_name>]=<command_line>
#
# When the daemon receives a request to return the results of <command_name>
# it will execute the command specified by the <command_line> argument.
#
# Unlike Nagios, the command line cannot contain macros - it must be
# typed exactly as it should be executed.
#
# Note: Any plugins that are used in the command lines must reside
# on the machine that this daemon is running on!  The examples below
# assume that you have plugins installed in a /usr/local/nagios/libexec
# directory.  Also note that you will have to modify the definitions below
# to match the argument format the plugins expect.  Remember, these are
# examples only!


# The following examples use hardcoded command arguments...

command[check_users]=/usr/local/nagios/libexec/check_users -w 5 -c 10
command[check_load]=/usr/local/nagios/libexec/check_load -w 15,10,5 -c 30,25,20
command[check_hda1]=/usr/local/nagios/libexec/check_disk -w 20% -c 10% -p /dev/hda1
command[check_zombie_procs]=/usr/local/nagios/libexec/check_procs -w 5 -c 10 -s Z
command[check_total_procs]=/usr/local/nagios/libexec/check_procs -w 150 -c 200


# The following examples allow user-supplied arguments and can
# only be used if the NRPE daemon was compiled with support for
# command arguments *AND* the dont_blame_nrpe directive in this
# config file is set to '1'.  This poses a potential security risk, so
# make sure you read the SECURITY file before doing this.

#command[check_users]=/usr/local/nagios/libexec/check_users -w $ARG1$ -c $ARG2$
#command[check_load]=/usr/local/nagios/libexec/check_load -w $ARG1$ -c $ARG2$
#command[check_disk]=/usr/local/nagios/libexec/check_disk -w $ARG1$ -c $ARG2$ -p $ARG3$
#command[check_procs]=/usr/local/nagios/libexec/check_procs -w $ARG1$ -c $ARG2$ -s $ARG3$


lmiltchev: Bugs find me; Posts: 13589; Joined: Mon May 23, 2011 12:15 pm

Re: Monitor Nagios XI and remote Linux servers

Post by lmiltchev » Thu Jul 09, 2015 10:03 am

You can define commands in either nrpe.cfg or common.cfg. The "common.cfg" file gets installed only when you run our official "Linux agent" installer script. You don't need it in this case, as you can define commands in nrpe.cfg. All of your current commands are "hardcoded":

Code: Select all

command[check_users]=/usr/local/nagios/libexec/check_users -w 5 -c 10
command[check_load]=/usr/local/nagios/libexec/check_load -w 15,10,5 -c 30,25,20
command[check_hda1]=/usr/local/nagios/libexec/check_disk -w 20% -c 10% -p /dev/hda1
command[check_zombie_procs]=/usr/local/nagios/libexec/check_procs -w 5 -c 10 -s Z
command[check_total_procs]=/usr/local/nagios/libexec/check_procs -w 150 -c 200

So, in your case, you can simply run:

Code: Select all

./check_nrpe -H 127.0.0.1 -c <command name>

where you substitute <command name> with the actual command name, e.g. check_users, check_load, etc.

Or, if you don't want these commands to be "hardcoded", you can use args, for example:

Code: Select all

command[check_users]=/usr/local/nagios/libexec/check_users $ARG1$

Restart xinetd, so that the changes take effect:

Code: Select all

service xinetd restart

then test it:

Code: Select all

./check_nrpe -H 127.0.0.1 -c check_users -a '-w 5 -c 10'
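
Before testing with -a, it can help to confirm which active command definitions actually reference an $ARGn$ macro, since the stock nrpe.cfg ships those examples commented out. A minimal sketch, assuming the `command[name]=...` format shown above; `commands_taking_args` is a hypothetical helper name.

```shell
#!/bin/sh
# Hypothetical helper: print the names of active (uncommented) command
# definitions in an NRPE config file whose command line uses $ARGn$.
# Usage: commands_taking_args FILE
commands_taking_args() {
    # Lines starting with "#" never match because the pattern is anchored
    # at "command[" (after optional leading whitespace).
    sed -n 's/^[[:space:]]*command\[\([^]]*\)]=.*\$ARG[0-9][0-9]*\$.*/\1/p' "$1"
}
```

Running `commands_taking_args /usr/local/nagios/etc/nrpe.cfg` against the file posted earlier in this thread would print nothing, because every $ARGn$ example in it is still commented out.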

Why do you want to use "check_nrpe" locally on the XI box anyway?

xlin125: Posts: 172; Joined: Mon Jan 19, 2015 6:01 pm

Re: Monitor Nagios XI and remote Linux servers

Post by xlin125 » Thu Jul 09, 2015 2:23 pm

As I observed and reported before, when directly running the monitoring script without invoking check_nrpe from the Nagios XI, it works:
# ./check_load -w 70,60,50 -c 90,80,70
OK - load average: 0.00, 0.00, 0.00|load1=0.000;70.000;90.000;0; load5=0.000;60.000;80.000;0; load15=0.000;50.000;70.000;0;

When running it through check_nrpe, it fails if arguments are passed to the plugin, but succeeds if no arguments are passed:
# ./check_nrpe -H 127.0.0.1 -c check_users
USERS OK - 1 users currently logged in |users=1;5;10;0

# ./check_nrpe -H 127.0.0.1 -c check_users -a '-w 5 -c 10'
CHECK_NRPE: Received 0 bytes from daemon. Check the remote server logs for error messages.

# ./check_nrpe -H 127.0.0.1 -t 30 -c check_load -a '-w 70,60,50 -c 90,80,70'
CHECK_NRPE: Received 0 bytes from daemon. Check the remote server logs for error messages.

We are building a set of common services and want to use them to monitor both remote Linux servers with the NRPE agent installed and other Nagios XI servers, for system resource usage and process checks. We wanted to see whether a Nagios XI server can be treated like an NRPE agent-installed server, so that one Nagios XI server can monitor another at the system level using "check_nrpe". This is why we want to use "check_nrpe" locally on the Nagios XI server and remotely against other Nagios XI servers.

It seems we cannot use "check_nrpe" to monitor a Nagios XI server. Or, what changes need to be made to get it working? Any help will be greatly appreciated.

jolson: Attack Rabbit; Posts: 2560; Joined: Thu Feb 12, 2015 12:40 pm

Re: Monitor Nagios XI and remote Linux servers

Post by jolson » Thu Jul 09, 2015 2:40 pm

On your XI box you will need to change the 'dont_blame_nrpe=0' setting to 'dont_blame_nrpe=1' to allow for arguments to be passed through properly.

Code: Select all

vi /usr/local/nagios/etc/nrpe.cfg

Code: Select all

service xinetd restart
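
The edit described above can also be scripted instead of done by hand in vi. This is a minimal sketch: `set_dont_blame_nrpe` is a hypothetical helper name, and it keeps a `.bak` copy next to the file. Note that, per the comments in the nrpe.cfg posted earlier, the setting only has an effect if the daemon was built with --enable-command-args, and the command definition itself still has to use $ARGn$ macros.

```shell
#!/bin/sh
# Hypothetical helper: set dont_blame_nrpe to 0 or 1 in an NRPE config file,
# keeping a backup of the original as FILE.bak.
# Usage: set_dont_blame_nrpe FILE VALUE
set_dont_blame_nrpe() {
    file=$1
    value=$2
    cp "$file" "$file.bak" || return 1
    # Read from the backup and write into the existing file, so the file
    # keeps its owner and permissions and no "sed -i" is needed.
    sed "s/^[[:space:]]*dont_blame_nrpe[[:space:]]*=.*/dont_blame_nrpe=$value/" \
        "$file.bak" > "$file"
}
```

On the XI box this would be `set_dont_blame_nrpe /usr/local/nagios/etc/nrpe.cfg 1`, followed by the xinetd restart shown above.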

Give the above a shot - I have a hunch that it will work for you.
