In Nagios GUI getting Critical errors

Support forum for Nagios Core, Nagios Plugins, NCPA, NRPE, NSCA, NDOUtils and more. Engage with the community of users including those using the open source solutions.
dheerushops
Posts: 47
Joined: Fri Dec 06, 2019 7:44 am

Re: In Nagios GUI getting Critical errors

Post by dheerushops »

As you suggested, I have added the entry below to nrpe.cfg on nagioshost.
command[check_ssh]=/usr/local/nagios/libexec/check_ssh -H <nagiosserverIP>


But for one server, we just upgraded from RHEL 6.10 to 7.6.
Since then, we are seeing the errors below in the Nagios monitoring GUI.


CPU Load

CRITICAL 12-11-2019 13:38:02 0d 2h 31m 7s 3/3 (Return code of 255 for service 'CPU Load' on host 'nagioshost1' was out of bounds)

SSH Monitoring

CRITICAL 12-11-2019 13:36:28 8d 0h 32m 37s 3/3 (Return code of 255 for service 'SSH Monitoring' on host 'nagioshost1' was out of bounds)

Total Processes

CRITICAL 12-11-2019 13:37:43 8d 0h 31m 22s 3/3 (Return code of 255 for service 'Total Processes' on host 'nagioshost1' was out of bounds)
lmiltchev
Former Nagios Staff
Posts: 13587
Joined: Mon May 23, 2011 12:15 pm

Post by lmiltchev »

What are you trying to accomplish with the check_ssh command? Are you running check_nrpe on your Nagios server against a client machine, which in turn runs check_ssh against another Nagios server? Can you elaborate on your setup?

Also, it doesn't help when you show us only the output...
CRITICAL 12-11-2019 13:36:28 8d 0h 32m 37s 3/3 (Return code of 255 for service 'SSH Monitoring' on host 'nagioshost1' was out of bounds)
Please show us the actual command that you are running from the command line along with the output of it. If you have several clients (hosts that you are monitoring), show us how the "check_ssh" command is configured on this specific machine.
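For context on the message itself: Nagios plugins are expected to exit with 0 (OK), 1 (WARNING), 2 (CRITICAL) or 3 (UNKNOWN), and Nagios reports any other exit code as "out of bounds". Below is a minimal sketch for reading the exit code when a check is run by hand. The function name `state_name` is purely illustrative, and the commented invocation uses the plugin path from this thread with a placeholder address.

```shell
# Map a plugin exit code to the Nagios state it represents.
# Codes outside 0-3 are what Nagios reports as "out of bounds".
state_name() {
  case "$1" in
    0) echo "OK" ;;
    1) echo "WARNING" ;;
    2) echo "CRITICAL" ;;
    3) echo "UNKNOWN" ;;
    *) echo "OUT OF BOUNDS" ;;
  esac
}

# Example usage (placeholder address; run on the Nagios server):
#   /usr/local/nagios/libexec/check_nrpe -H <ip address of 'nagioshost1'> -c check_load
#   rc=$?
#   echo "exit code $rc => $(state_name "$rc")"
```

Posting both the plugin output and the exit code makes it easier to tell a real CRITICAL result (exit code 2) from a check that failed to run at all.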
Be sure to check out our Knowledgebase for helpful articles and solutions!

Post by dheerushops »

Please check the errors below. We will get back to you on the SSH alert.
CPU Load

CRITICAL 12-11-2019 13:38:02 0d 2h 31m 7s 3/3 (Return code of 255 for service 'CPU Load' on host 'nagioshost1' was out of bounds)

Total Processes

CRITICAL 12-11-2019 13:37:43 8d 0h 31m 22s 3/3 (Return code of 255 for service 'Total Processes' on host 'nagioshost1' was out of bounds)

Post by lmiltchev »

We don't have sufficient information to start troubleshooting your issue. Please show us the following for each failing check:

1. Actual command, run from the command line along with the output of it
2. Service definitions for these checks (from the nagios server)
3. nrpe.cfg file from the host that you are monitoring
4. system log from the remote machine

Post by dheerushops »

1. Actual command, run from the command line along with the output of it
[root@nagioshost1 ~]# /usr/local/nagios/libexec/check_nrpe -H localhost -c check_load
OK - load average: 0.00, 0.01, 0.05|load1=0.000;15.000;30.000;0; load5=0.010;10.000;25.000;0; load15=0.050;5.000;20.000;0;
[root@nagioshost1 ~]#

2. Service definitions for these checks (from the nagios server)

[root@nagiosserver ~]# /usr/local/nagios/libexec/check_load nagioshost1
OK - load average: 0.00, 0.00, 0.00|load1=0.000;0.000;0.000;0; load5=0.000;0.000;0.000;0; load15=0.000;0.000;0.000;0;
[root@nagiosserver ~]#

3. nrpe.cfg file from the host that you are monitoring

command[check_users]=/usr/local/nagios/libexec/check_users -w 5 -c 10
command[check_load]=/usr/local/nagios/libexec/check_load -w 15,10,5 -c 30,25,20
command[check_hda1]=/usr/local/nagios/libexec/check_disk -w 20% -c 10% -p /dev/hda1
command[check_zombie_procs]=/usr/local/nagios/libexec/check_procs -w 5 -c 10 -s Z
command[check_total_procs]=/usr/local/nagios/libexec/check_procs -w 150 -c 200



4. system log from the remote machine
Can you please tell me which log you are referring to?

Post by lmiltchev »

1. Actual command, run from the command line along with the output of it
[root@nagioshost1 ~]# /usr/local/nagios/libexec/check_nrpe -H localhost -c check_load
OK - load average: 0.00, 0.01, 0.05|load1=0.000;15.000;30.000;0; load5=0.010;10.000;25.000;0; load15=0.050;5.000;20.000;0;
[root@nagioshost1 ~]#

This is good: it means you can run the command "locally" on the remote box.

2. Service definitions for these checks (from the nagios server)

[root@nagiosserver ~]# /usr/local/nagios/libexec/check_load nagioshost1
OK - load average: 0.00, 0.00, 0.00|load1=0.000;0.000;0.000;0; load5=0.000;0.000;0.000;0; load15=0.000;0.000;0.000;0;
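[root@nagiosserver ~]#

This is incorrect. check_load is a local plugin, so run this way it reports the load on 'nagiosserver' itself, not on 'nagioshost1'. To monitor your 'nagioshost1' host, you would need to run this: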

/usr/local/nagios/libexec/check_nrpe -H <ip address of 'nagioshost1'> -c check_load

We still need to see the actual config files for the "CPU Load", "SSH Monitoring", and "Total Processes" services (from 'nagiosserver').

3. nrpe.cfg file from the host that you are monitoring

command[check_users]=/usr/local/nagios/libexec/check_users -w 5 -c 10
command[check_load]=/usr/local/nagios/libexec/check_load -w 15,10,5 -c 30,25,20
command[check_hda1]=/usr/local/nagios/libexec/check_disk -w 20% -c 10% -p /dev/hda1
command[check_zombie_procs]=/usr/local/nagios/libexec/check_procs -w 5 -c 10 -s Z
command[check_total_procs]=/usr/local/nagios/libexec/check_procs -w 150 -c 200

This is not the entire nrpe.cfg file. Can you upload the whole file to the forum?

Anyway, you have "check_load" and "check_total_procs" defined. I don't see any definitions for "SSH" checks though...

4. system log from the remote machine
Can you please tell me which log you are referring to?

This depends on the OS, e.g. /var/log/messages or /var/log/syslog. Let's check the configs first, before going into the log. You can still review it to see if there are any NRPE-related errors in it.
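A minimal sketch of that log review, assuming a RHEL-style host where syslog messages land in /var/log/messages. The helper name `nrpe_log_lines` is made up for illustration, and the `journalctl` unit name `nrpe` in the comment is an assumption that depends on how NRPE was installed.

```shell
# Print the most recent NRPE-related lines from a syslog file.
# $1 = log file (e.g. /var/log/messages), $2 = how many lines to show (default 20)
nrpe_log_lines() {
  grep -i 'nrpe' "$1" | tail -n "${2:-20}"
}

# Example usage on the monitored host (as root):
#   nrpe_log_lines /var/log/messages
# On systemd hosts the journal may hold the same messages,
# assuming the unit is named "nrpe":
#   journalctl -u nrpe --since "1 hour ago"
```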

Post by dheerushops »

[root@nagioshost1 ~]# cat /usr/local/nagios/etc/nrpe.cfg
#############################################################################
# Sample NRPE Config File
# Written by: Ethan Galstad (nagios@nagios.org)
#
# Last Modified: 11-23-2007
#
# NOTES:
# This is a sample configuration file for the NRPE daemon. It needs to be
# located on the remote host that is running the NRPE daemon, not the host
# from which the check_nrpe client is being executed.
#############################################################################


# LOG FACILITY
# The syslog facility that should be used for logging purposes.

log_facility=daemon



# PID FILE
# The name of the file in which the NRPE daemon should write it's process ID
# number. The file is only written if the NRPE daemon is started by the root
# user and is running in standalone mode.

pid_file=/var/run/nrpe.pid



# PORT NUMBER
# Port number we should wait for connections on.
# NOTE: This must be a non-priviledged port (i.e. > 1024).
# NOTE: This option is ignored if NRPE is running under either inetd or xinetd

server_port=5666



# SERVER ADDRESS
# Address that nrpe should bind to in case there are more than one interface
# and you do not want nrpe to bind on all interfaces.
# NOTE: This option is ignored if NRPE is running under either inetd or xinetd

#server_address=127.0.0.1



# NRPE USER
# This determines the effective user that the NRPE daemon should run as.
# You can either supply a username or a UID.
#
# NOTE: This option is ignored if NRPE is running under either inetd or xinetd

nrpe_user=nagios



# NRPE GROUP
# This determines the effective group that the NRPE daemon should run as.
# You can either supply a group name or a GID.
#
# NOTE: This option is ignored if NRPE is running under either inetd or xinetd

nrpe_group=nagios



# ALLOWED HOST ADDRESSES
# This is an optional comma-delimited list of IP address or hostnames
# that are allowed to talk to the NRPE daemon. Network addresses with a bit mask
# (i.e. 192.168.1.0/24) are also supported. Hostname wildcards are not currently
# supported.
#
# Note: The daemon only does rudimentary checking of the client's IP
# address. I would highly recommend adding entries in your /etc/hosts.allow
# file to allow only the specified host to connect to the port
# you are running this daemon on.
#
# NOTE: This option is ignored if NRPE is running under either inetd or xinetd

allowed_hosts=127.0.0.1 localhost nagiosserver




# COMMAND ARGUMENT PROCESSING
# This option determines whether or not the NRPE daemon will allow clients
# to specify arguments to commands that are executed. This option only works
# if the daemon was configured with the --enable-command-args configure script
# option.
#
# *** ENABLING THIS OPTION IS A SECURITY RISK! ***
# Read the SECURITY file for information on some of the security implications
# of enabling this variable.
#
# Values: 0=do not allow arguments, 1=allow command arguments

dont_blame_nrpe=0



# BASH COMMAND SUBTITUTION
# This option determines whether or not the NRPE daemon will allow clients
# to specify arguments that contain bash command substitutions of the form
# $(...). This option only works if the daemon was configured with both
# the --enable-command-args and --enable-bash-command-substitution configure
# script options.
#
# *** ENABLING THIS OPTION IS A HIGH SECURITY RISK! ***
# Read the SECURITY file for information on some of the security implications
# of enabling this variable.
#
# Values: 0=do not allow bash command substitutions,
# 1=allow bash command substitutions

allow_bash_command_substitution=0



# COMMAND PREFIX
# This option allows you to prefix all commands with a user-defined string.
# A space is automatically added between the specified prefix string and the
# command line from the command definition.
#
# *** THIS EXAMPLE MAY POSE A POTENTIAL SECURITY RISK, SO USE WITH CAUTION! ***
# Usage scenario:
# Execute restricted commmands using sudo. For this to work, you need to add
# the nagios user to your /etc/sudoers. An example entry for alllowing
# execution of the plugins from might be:
#
# nagios ALL=(ALL) NOPASSWD: /usr/lib/nagios/plugins/
#
# This lets the nagios user run all commands in that directory (and only them)
# without asking for a password. If you do this, make sure you don't give
# random users write access to that directory or its contents!

# command_prefix=/usr/bin/sudo



# DEBUGGING OPTION
# This option determines whether or not debugging messages are logged to the
# syslog facility.
# Values: 0=debugging off, 1=debugging on

debug=0



# COMMAND TIMEOUT
# This specifies the maximum number of seconds that the NRPE daemon will
# allow plugins to finish executing before killing them off.

command_timeout=60



# CONNECTION TIMEOUT
# This specifies the maximum number of seconds that the NRPE daemon will
# wait for a connection to be established before exiting. This is sometimes
# seen where a network problem stops the SSL being established even though
# all network sessions are connected. This causes the nrpe daemons to
# accumulate, eating system resources. Do not set this too low.

connection_timeout=300



# WEEK RANDOM SEED OPTION
# This directive allows you to use SSL even if your system does not have
# a /dev/random or /dev/urandom (on purpose or because the necessary patches
# were not applied). The random number generator will be seeded from a file
# which is either a file pointed to by the environment valiable $RANDFILE
# or $HOME/.rnd. If neither exists, the pseudo random number generator will
# be initialized and a warning will be issued.
# Values: 0=only seed from /dev/random, 1=also seed from weak randomness

#allow_weak_random_seed=1



# INCLUDE CONFIG FILE
# This directive allows you to include definitions from an external config file.

#include=<somefile.cfg>



# INCLUDE CONFIG DIRECTORY
# This directive allows you to include definitions from config files (with a
# .cfg extension) in one or more directories (with recursion).

#include_dir=<somedirectory>
#include_dir=<someotherdirectory>



# COMMAND DEFINITIONS
# Command definitions that this daemon will run. Definitions
# are in the following format:
#
# command[<command_name>]=<command_line>
#
# When the daemon receives a request to return the results of <command_name>
# it will execute the command specified by the <command_line> argument.
#
# Unlike Nagios, the command line cannot contain macros - it must be
# typed exactly as it should be executed.
#
# Note: Any plugins that are used in the command lines must reside
# on the machine that this daemon is running on! The examples below
# assume that you have plugins installed in a /usr/local/nagios/libexec
# directory. Also note that you will have to modify the definitions below
# to match the argument format the plugins expect. Remember, these are
# examples only!


# The following examples use hardcoded command arguments...

command[check_users]=/usr/local/nagios/libexec/check_users -w 5 -c 10
command[check_load]=/usr/local/nagios/libexec/check_load -w 15,10,5 -c 30,25,20
command[check_hda1]=/usr/local/nagios/libexec/check_disk -w 20% -c 10% -p /dev/hda1
command[check_zombie_procs]=/usr/local/nagios/libexec/check_procs -w 5 -c 10 -s Z
command[check_total_procs]=/usr/local/nagios/libexec/check_procs -w 150 -c 200


# The following examples allow user-supplied arguments and can
# only be used if the NRPE daemon was compiled with support for
# command arguments *AND* the dont_blame_nrpe directive in this
# config file is set to '1'. This poses a potential security risk, so
# make sure you read the SECURITY file before doing this.

#command[check_users]=/usr/local/nagios/libexec/check_users -w $ARG1$ -c $ARG2$
#command[check_load]=/usr/local/nagios/libexec/check_load -w $ARG1$ -c $ARG2$
#command[check_disk]=/usr/local/nagios/libexec/check_disk -w $ARG1$ -c $ARG2$ -p $ARG3$
#command[check_procs]=/usr/local/nagios/libexec/check_procs -w $ARG1$ -c $ARG2$ -s $ARG3$
[root@nagioshost1 ~]#

Post by lmiltchev »

First off, your allowed_hosts need to be comma-delimited, not space-delimited...

allowed_hosts=127.0.0.1,localhost,nagiosserver
Once you fix that, you need to restart nrpe on 'nagioshost1'.
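One possible way to apply that fix, sketched under a few assumptions: sed with extended-regex support (`-E`) is available, the config lives at /usr/local/nagios/etc/nrpe.cfg as in the post above, and NRPE runs as a systemd service named `nrpe`. It may instead run under xinetd or an init script, in which case the restart step differs. The function name `fix_allowed_hosts` is only illustrative.

```shell
# Print the contents of an nrpe.cfg with the allowed_hosts line rewritten so
# that runs of spaces and/or commas become single commas.
# All other lines are passed through unchanged.
fix_allowed_hosts() {
  sed -E '/^allowed_hosts=/ { s/[[:space:]]+$//; s/[[:space:],]+/,/g; }' "$1"
}

# Example usage on 'nagioshost1' (as root); review the result before relying on it:
#   cfg=/usr/local/nagios/etc/nrpe.cfg
#   cp "$cfg" "$cfg.bak"
#   fix_allowed_hosts "$cfg.bak" > "$cfg"
#   grep '^allowed_hosts=' "$cfg"
#   systemctl restart nrpe    # assumption: NRPE is managed by systemd under this name
```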

We still haven't seen your "CPU Load", "SSH Monitoring", and "Total Processes" service configs, so we don't know if these services are configured properly. Please post the configs on the forum.
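If it is unclear where those service definitions live, a recursive search of the configuration directory on 'nagiosserver' usually turns them up. This is a small sketch, assuming the object configs sit somewhere under /usr/local/nagios/etc (the install prefix used elsewhere in this thread) and that the service_description values match the names shown in the GUI. `find_service_defs` is a hypothetical helper name.

```shell
# Show each place a service_description is defined, with surrounding lines
# so the host_name and check_command of the same block are visible.
# $1 = config directory, $2 = service description as shown in the GUI
find_service_defs() {
  grep -rn -B 4 -A 4 -E "service_description[[:space:]]+$2" "$1"
}

# Example usage on the Nagios server:
#   for svc in "CPU Load" "SSH Monitoring" "Total Processes"; do
#     find_service_defs /usr/local/nagios/etc "$svc"
#   done
```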

Please run the following command on the nagios server (after you fix the allowed_hosts issue) and show the output:

/usr/local/nagios/libexec/check_nrpe -H <ip address of 'nagioshost1'> -c check_load
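To cover the other failing checks in one pass, the same call can be wrapped in a small loop. This is only a sketch: `nrpe_check_all` is a made-up helper, `CHECK_NRPE` defaults to the plugin path used in this thread, and the command names in the usage comment are the ones defined in the posted nrpe.cfg.

```shell
# Path to check_nrpe; override CHECK_NRPE if the plugin lives elsewhere.
CHECK_NRPE="${CHECK_NRPE:-/usr/local/nagios/libexec/check_nrpe}"

# Run each named NRPE command against one host and print its exit code
# after the plugin output.
# $1 = host address, remaining arguments = NRPE command names
nrpe_check_all() {
  host="$1"
  shift
  for cmd in "$@"; do
    rc=0
    "$CHECK_NRPE" -H "$host" -c "$cmd" || rc=$?
    echo "$cmd exit code: $rc"
  done
}

# Example usage on the Nagios server:
#   nrpe_check_all <ip address of 'nagioshost1'> check_load check_total_procs
```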

Post by dheerushops »

[root@nagiosserver ~]# /usr/local/nagios/libexec/check_nrpe -H <ip address of 'nagioshost1'> -c check_load
OK - load average: 0.00, 0.01, 0.05|load1=0.000;15.000;30.000;0; load5=0.010;10.000;25.000;0; load15=0.050;5.000;20.000;0;
[root@nagiosserver ~]#

Issue resolved. Thank you lmiltchev.

Post by dheerushops »

Can you please share a link to the official documentation for configuring Nagios on the server and for connecting a monitored host (nagioshost) to the Nagios server?