command not defined
Posted: Fri Nov 29, 2013 9:49 am
by arenist
Hi folks,
I have a Nagios server (v3.5) that monitors several clients (NRPE v2.14). A week ago I added some commands to nrpe.cfg. Since then, two clients don't respond correctly: when I start Nagios on the server, I receive mails telling me that a command is not defined. But I know that this command IS defined in /usr/local/nagios/etc/nrpe.cfg.
Let me show you an example:
I get a mail with the following content:
***** Nagios *****
Notification Type: PROBLEM
Service: Free Space KFSDB-Online
Host: VBGMADB11
Address: 194.59.101.179
State: CRITICAL
Date/Time: Fri Nov 29 14:42:52 CET 2013
Additional Info:
NRPE: Command check_kfsdb_online_redo not defined
So I have a look at my config-file on the client:
Code:
[nagios@vbgmadb11 nagios]$ pwd
/usr/local/nagios
[nagios@vbgmadb11 nagios]$ grep check_kfsdb_online_redo etc/nrpe.cfg
command[check_kfsdb_online_redo]=/usr/local/nagios/libexec/check_disk -w 20% -c 10% -p /oradata/kfsdb01/redo
Hmm, things look okay. So I try to run the command manually from the CLI of my nagios-server as OS-user nagios:
Code:
nagios@madpbk51:~> id
uid=200(nagios) gid=200(nagios) Gruppen=200(nagios),210(nagcmd),504(oinstall)
nagios@madpbk51:~> /usr/local/nagios/libexec/check_nrpe -H VBGMADB11 -c check_kfsdb_online_redo
DISK OK - free space: /oradata/kfsdb01/redo 294848 MB (21% inode=81%);| /oradata/kfsdb01/redo=1094310MB;1111326;1250242;0;1389158
Surprise, I get an answer. By now I have no idea what is going wrong. I don't understand why the command returns a result when I run it from the CLI but produces an error message when the Nagios server runs it. I've tried restarting xinetd on the remote machine, but it had no effect. I also deleted objects.cache on all servers, still with no effect.
In the syslog of the server I find:
Code:
Nov 29 14:42:52 madpbk51 nagios: SERVICE NOTIFICATION: stl;vbgmadb11;Free Space KFSDB-Online;CRITICAL;notify-service-by-email;NRPE: Command check_kfsdb_online_redo not defined
The nagios log says:
Code:
[1385645911] SERVICE NOTIFICATION: nagiosadmin;vbgmadb11;Free Space KFSDB-Online;CRITICAL;notify-service-by-email;NRPE: Command check_kfsdb_online_redo not defined
I hope one of you can point me to a solution. I'm out of ideas...
Thanks & best regards,
Re: command not defined
Posted: Mon Dec 02, 2013 11:47 am
by sreinhardt
Could you post the Nagios service configuration too, please? From what you have posted, I agree that it looks correct from the CLI and remote agent perspective, so let's look at the next piece of the puzzle.
Re: command not defined
Posted: Tue Dec 03, 2013 1:24 am
by arenist
Hi Spenser,
here is the configuration of my service:
Code:
# define service{
# use generic-service
# host_name vbgmadb11,vbgmadb71
# service_description Free Space KFSDB-Online
# contact_groups admins,db-admins
# normal_check_interval 30
# notification_interval 240
# check_command check_nrpe!check_kfsdb_online_redo
# }
It's currently commented out because I'd receive too many error messages and mails otherwise. None of the newly added commands on those two hosts works correctly when called automatically. I'm quite sure there is another config file somewhere, maybe a cached one, that Nagios uses when it calls the commands automatically. The "old" commands still work fine. I remember a similar problem some time ago: I added some contacts, but they didn't get any mails from Nagios. After I deleted /usr/local/nagios/var/retention.dat on the client host, they did.
Thanks for having a look at my posting. Best regards,
Werner
Re: command not defined
Posted: Tue Dec 03, 2013 11:01 am
by slansing
You need to add a -c in front of the command you are calling through NRPE unless it is in the command definition:
Code:
check_command check_nrpe!-c check_kfsdb_online_redo
How is check_nrpe defined in your commands.cfg?
Re: command not defined
Posted: Wed Dec 04, 2013 1:40 am
by arenist
Hi slansing,
that was not a good idea. I replaced each check_command check_nrpe!<nrpe_command> with check_command check_nrpe!-c <nrpe_command> in services.cfg on the Nagios server, and now I get mail after mail looking like this:
***** Nagios *****
Notification Type: PROBLEM
Service: Checkpoint-Errors MP01
Host: vMACLDB91
Address: 194.59.106.167
State: CRITICAL
Date/Time: Wed Dec 4 07:22:05 CET 2013
Additional Info:
NRPE: Command -c not defined
My nagios documentation tells me to add services like this:
The following service will monitor the free drive space on /dev/hda1 on the remote host.
define service{
use generic-service
host_name remotehost
service_description /dev/hda1 Free Space
check_command check_nrpe!check_hda1
}
The following service will monitor the number of zombie processes on the remote host.
define service{
use generic-service
host_name remotehost
service_description Zombie Processes
check_command check_nrpe!check_zombie_procs
}
Regards, arenist
Re: command not defined
Posted: Wed Dec 04, 2013 11:24 am
by slansing
You also need to post your commands.cfg definition for check_nrpe... we had no way of knowing that you already had "-c" in that command definition, which is likely the case. My mistake.

Re: command not defined
Posted: Fri Dec 06, 2013 1:45 am
by arenist
Hi slansing,
here's my definition for check_nrpe from commands.cfg:
Code:
# 'check_nrpe' command definition
define command{
command_name check_nrpe
command_line $USER1$/check_nrpe -H $HOSTADDRESS$ -c $ARG1$
}
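Expanding the macros in this command_line by hand suggests why the extra -c led to "NRPE: Command -c not defined". The sketch below only approximates what Nagios does: expand is a hypothetical helper, the value of $USER1$ is assumed to be /usr/local/nagios/libexec (it comes from resource.cfg, which was not posted), and the address is the one from the first notification mail.

```shell
#!/bin/sh
# Hypothetical helper: substitute $USER1$, $HOSTADDRESS$ and $ARG1$ in a
# command_line, roughly the way Nagios does before it runs a check.
# Usage: expand <command_line> <user1> <hostaddress> <arg1>
expand() {
    printf '%s\n' "$1" | sed \
        -e 's|\$USER1\$|'"$2"'|' \
        -e 's|\$HOSTADDRESS\$|'"$3"'|' \
        -e 's|\$ARG1\$|'"$4"'|'
}

CMDLINE='$USER1$/check_nrpe -H $HOSTADDRESS$ -c $ARG1$'
LIBEXEC=/usr/local/nagios/libexec   # assumed value of $USER1$
ADDR=194.59.101.179                 # address from the notification mail

# check_command check_nrpe!check_kfsdb_online_redo
expand "$CMDLINE" "$LIBEXEC" "$ADDR" 'check_kfsdb_online_redo'

# check_command check_nrpe!-c check_kfsdb_online_redo
expand "$CMDLINE" "$LIBEXEC" "$ADDR" '-c check_kfsdb_online_redo'
```

The first call prints a line ending in "-c check_kfsdb_online_redo"; the second ends in "-c -c check_kfsdb_online_redo", so check_nrpe presumably takes the second -c as the command name. Note also that the expanded line targets the address, while the manual test used the host name VBGMADB11; if the two do not point at the same machine, the results can differ.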
Regards, arenist
Re: command not defined
Posted: Fri Dec 06, 2013 10:42 am
by slansing
Can you post the nrpe.cfg from the remote system you are checking against? If it contains any information that should be hidden from public forum eyes, you are encouraged to redact it!
Re: command not defined
Posted: Fri Dec 06, 2013 2:21 pm
by arenist
This is my nrpe.cfg from the remote server. Entries marked with "# new" were added recently and are reported as "not defined" when Nagios calls them automatically. They work only when called via the CLI. The other entries work fine both ways.
Code:
[nagios@vbgmadb11 ~]$ cd /usr/local/nagios/etc/
[nagios@vbgmadb11 etc]$ cat nrpe.cfg
#############################################################################
# Sample NRPE Config File
# Written by: Ethan Galstad (nagios@nagios.org)
#
# Last Modified: 11-23-2007
#
# NOTES:
# This is a sample configuration file for the NRPE daemon. It needs to be
# located on the remote host that is running the NRPE daemon, not the host
# from which the check_nrpe client is being executed.
#############################################################################
# LOG FACILITY
# The syslog facility that should be used for logging purposes.
log_facility=daemon
# PID FILE
# The name of the file in which the NRPE daemon should write it's process ID
# number. The file is only written if the NRPE daemon is started by the root
# user and is running in standalone mode.
pid_file=/var/run/nrpe.pid
# PORT NUMBER
# Port number we should wait for connections on.
# NOTE: This must be a non-priviledged port (i.e. > 1024).
# NOTE: This option is ignored if NRPE is running under either inetd or xinetd
server_port=5666
# SERVER ADDRESS
# Address that nrpe should bind to in case there are more than one interface
# and you do not want nrpe to bind on all interfaces.
# NOTE: This option is ignored if NRPE is running under either inetd or xinetd
#server_address=127.0.0.1
# NRPE USER
# This determines the effective user that the NRPE daemon should run as.
# You can either supply a username or a UID.
#
# NOTE: This option is ignored if NRPE is running under either inetd or xinetd
nrpe_user=nagios
# NRPE GROUP
# This determines the effective group that the NRPE daemon should run as.
# You can either supply a group name or a GID.
#
# NOTE: This option is ignored if NRPE is running under either inetd or xinetd
nrpe_group=nagios
# ALLOWED HOST ADDRESSES
# This is an optional comma-delimited list of IP address or hostnames
# that are allowed to talk to the NRPE daemon.
#
# Note: The daemon only does rudimentary checking of the client's IP
# address. I would highly recommend adding entries in your /etc/hosts.allow
# file to allow only the specified host to connect to the port
# you are running this daemon on.
#
# NOTE: This option is ignored if NRPE is running under either inetd or xinetd
allowed_hosts=127.0.0.1
# COMMAND ARGUMENT PROCESSING
# This option determines whether or not the NRPE daemon will allow clients
# to specify arguments to commands that are executed. This option only works
# if the daemon was configured with the --enable-command-args configure script
# option.
#
# *** ENABLING THIS OPTION IS A SECURITY RISK! ***
# Read the SECURITY file for information on some of the security implications
# of enabling this variable.
#
# Values: 0=do not allow arguments, 1=allow command arguments
dont_blame_nrpe=0
# COMMAND PREFIX
# This option allows you to prefix all commands with a user-defined string.
# A space is automatically added between the specified prefix string and the
# command line from the command definition.
#
# *** THIS EXAMPLE MAY POSE A POTENTIAL SECURITY RISK, SO USE WITH CAUTION! ***
# Usage scenario:
# Execute restricted commmands using sudo. For this to work, you need to add
# the nagios user to your /etc/sudoers. An example entry for alllowing
# execution of the plugins from might be:
#
# nagios ALL=(ALL) NOPASSWD: /usr/lib/nagios/plugins/
#
# This lets the nagios user run all commands in that directory (and only them)
# without asking for a password. If you do this, make sure you don't give
# random users write access to that directory or its contents!
# command_prefix=/usr/bin/sudo
# DEBUGGING OPTION
# This option determines whether or not debugging messages are logged to the
# syslog facility.
# Values: 0=debugging off, 1=debugging on
debug=1
# COMMAND TIMEOUT
# This specifies the maximum number of seconds that the NRPE daemon will
# allow plugins to finish executing before killing them off.
command_timeout=60
# CONNECTION TIMEOUT
# This specifies the maximum number of seconds that the NRPE daemon will
# wait for a connection to be established before exiting. This is sometimes
# seen where a network problem stops the SSL being established even though
# all network sessions are connected. This causes the nrpe daemons to
# accumulate, eating system resources. Do not set this too low.
connection_timeout=300
# WEEK RANDOM SEED OPTION
# This directive allows you to use SSL even if your system does not have
# a /dev/random or /dev/urandom (on purpose or because the necessary patches
# were not applied). The random number generator will be seeded from a file
# which is either a file pointed to by the environment valiable $RANDFILE
# or $HOME/.rnd. If neither exists, the pseudo random number generator will
# be initialized and a warning will be issued.
# Values: 0=only seed from /dev/[u]random, 1=also seed from weak randomness
#allow_weak_random_seed=1
# INCLUDE CONFIG FILE
# This directive allows you to include definitions from an external config file.
# include=/usr/local/nagios/etc/nrpe.cfg
# INCLUDE CONFIG DIRECTORY
# This directive allows you to include definitions from config files (with a
# .cfg extension) in one or more directories (with recursion).
#include_dir=<somedirectory>
#include_dir=<someotherdirectory>
# COMMAND DEFINITIONS
# Command definitions that this daemon will run. Definitions
# are in the following format:
#
# command[<command_name>]=<command_line>
#
# When the daemon receives a request to return the results of <command_name>
# it will execute the command specified by the <command_line> argument.
#
# Unlike Nagios, the command line cannot contain macros - it must be
# typed exactly as it should be executed.
#
# Note: Any plugins that are used in the command lines must reside
# on the machine that this daemon is running on! The examples below
# assume that you have plugins installed in a /usr/local/nagios/libexec
# directory. Also note that you will have to modify the definitions below
# to match the argument format the plugins expect. Remember, these are
# examples only!
# The following examples use hardcoded command arguments...
# Partitionen ueberwachen
command[check_hda1]=/usr/local/nagios/libexec/check_disk -w 20% -c 10% -p /dev/xvda3
command[check_var]=/usr/local/nagios/libexec/check_disk -w 20% -c 10% -p /var # new
command[check_opt_oracle]=/usr/local/nagios/libexec/check_disk -w 20% -c 10% -p /opt/oracle # new
command[check_qc_fs]=/usr/local/nagios/libexec/check_disk -w 20% -c 10% -p /oradata/qcprod # new
command[check_qc_archive_log]=/usr/local/nagios/libexec/check_disk -w 20% -c 10% -p /oradata/qcprod/archive_log # new
command[check_qc_online_redo]=/usr/local/nagios/libexec/check_disk -w 20% -c 10% -p /oradata/qcprod/online_redo # new
command[check_aris_fs]=/usr/local/nagios/libexec/check_disk -w 20% -c 10% -p /oradata/arissvr/dbf # new
command[check_aris_archive_log]=/usr/local/nagios/libexec/check_disk -w 20% -c 10% -p /oradata/arissvr/archive_log # new
command[check_aris_online_redo]=/usr/local/nagios/libexec/check_disk -w 20% -c 10% -p /oradata/arissvr/online_redo # new
command[check_arisdb_fs]=/usr/local/nagios/libexec/check_disk -w 20% -c 10% -p /oradata/arispub/dbf # new
command[check_arisdb_archive_log]=/usr/local/nagios/libexec/check_disk -w 20% -c 10% -p /oradata/arispub/archive_log # new
command[check_arisdb_online_redo]=/usr/local/nagios/libexec/check_disk -w 20% -c 10% -p /oradata/arispub/online_redo # new
command[check_sfkarte_fs]=/usr/local/nagios/libexec/check_disk -w 20% -c 10% -p /oradata/sfkarte # new
command[check_kfsdb_fs]=/usr/local/nagios/libexec/check_disk -w 20% -c 10% -p /oradata/kfsdb01/db # new
command[check_kfsdb_archive_log]=/usr/local/nagios/libexec/check_disk -w 20% -c 10% -p /oradata/kfsdb01/archive # new
command[check_kfsdb_online_redo]=/usr/local/nagios/libexec/check_disk -w 20% -c 10% -p /oradata/kfsdb01/redo # new
command[check_dd_fs]=/usr/local/nagios/libexec/check_disk -w 20% -c 10% -p /oradata/ddprod/dbf # new
command[check_dd_archive_log]=/usr/local/nagios/libexec/check_disk -w 20% -c 10% -p /oradata/ddprod/archive # new
command[check_dd_online_redo1]=/usr/local/nagios/libexec/check_disk -w 20% -c 10% -p /oradata/ddprod/online01 # new
command[check_dd_online_redo2]=/usr/local/nagios/libexec/check_disk -w 20% -c 10% -p /oradata/ddprod/online02 # new
# Netzkarten
command[check_eth0]=/usr/local/nagios/libexec/check_network_device.bsh -E 0
# System-Log
command[check_messages_err]=/usr/local/nagios/libexec/check_log -F /var/log/messages -O /usr/local/nagios/libexec/messages.err -q "error"
command[check_messages_prob]=/usr/local/nagios/libexec/check_log -F /var/log/messages -O /usr/local/nagios/libexec/messages.problem -q "problem"
command[check_messages_fail]=/usr/local/nagios/libexec/check_log -F /var/log/messages -O /usr/local/nagios/libexec/messages.fail -q "fail"
command[check_messages_down]=/usr/local/nagios/libexec/check_log -F /var/log/messages -O /usr/local/nagios/libexec/messages.down -q "down"
# DB, alertlog & Listener
command[check_qc]=/usr/local/nagios/libexec/check_oracle.bsh --db qcprod # new
command[check_QC_alert]=/usr/local/nagios/libexec/check_alertlog -F /opt/oracle/base/diag/rdbms/qcprod/qcprod/trace/alert_qcprod.log -O /usr/local/nagios/libexec/alert_QC.log # new
command[check_QC_alert_cp]=/usr/local/nagios/libexec/check_alertlog_checkpoint -F /opt/oracle/base/diag/rdbms/qcprod/qcprod/trace/alert_qcprod.log -O /usr/local/nagios/libexec/alert_QC_cp.log # new
command[check_ARIS]=/usr/local/nagios/libexec/check_oracle.bsh --db ARIS # new
command[check_ARIS_alert]=/usr/local/nagios/libexec/check_alertlog -F /opt/oracle/base/diag/rdbms/aris/ARIS/trace/alert_ARIS.log -O /usr/local/nagios/libexec/alert_ARIS.log # new
command[check_ARIS_alert_cp]=/usr/local/nagios/libexec/check_alertlog_checkpoint -F /opt/oracle/base/diag/rdbms/aris/ARIS/trace/alert_ARIS.log -O /usr/local/nagios/libexec/alert_ARIS_cp.log # new
command[check_ARISBP]=/usr/local/nagios/libexec/check_oracle.bsh --db ARISBP # new
command[check_ARISBP_alert]=/usr/local/nagios/libexec/check_alertlog -F /opt/oracle/base/diag/rdbms/arisbp/ARISBP/trace/alert_ARISBP.log -O /usr/local/nagios/libexec/alert_ARISBP.log # new
command[check_ARISBP_alert_cp]=/usr/local/nagios/libexec/check_alertlog_checkpoint -F /opt/oracle/base/diag/rdbms/arisbp/ARISBP/trace/alert_ARISBP.log -O /usr/local/nagios/libexec/alert_ARISBP_cp.log # new
command[check_sfkarte]=/usr/local/nagios/libexec/check_oracle.bsh --db sfkarteprod # new
command[check_sfk_alert]=/usr/local/nagios/libexec/check_alertlog -F /opt/oracle/base/diag/rdbms/sfkarteprod/sfkarteprod/trace/alert_sfkarteprod.log -O /usr/local/nagios/libexec/alert_sfk.log # new
command[check_sfk_alert_cp]=/usr/local/nagios/libexec/check_alertlog_checkpoint -F /opt/oracle/base/diag/rdbms/sfkarteprod/sfkarteprod/trace/alert_sfkarteprod.log -O /usr/local/nagios/libexec/alert_sfk_cp.log # new
command[check_dd]=/usr/local/nagios/libexec/check_oracle.bsh --db ddprod # new
command[check_dd_alert]=/usr/local/nagios/libexec/check_alertlog -F /opt/oracle/base/diag/rdbms/ddprod/ddprod/trace/alert_ddprod.log -O /usr/local/nagios/libexec/alert_dd.log # new
command[check_dd_alert_cp]=/usr/local/nagios/libexec/check_alertlog_checkpoint -F /opt/oracle/base/diag/rdbms/ddprod/ddprod/trace/alert_ddprod.log -O /usr/local/nagios/libexec/alert_dd_cp.log # new
command[check_kfsdb]=/usr/local/nagios/libexec/check_oracle.bsh --db KFSDB01 # new
command[check_kfs_alert]=/usr/local/nagios/libexec/check_alertlog -F /opt/oracle/base/diag/rdbms/kfsdb01/KFSDB01/trace/alert_KFSDB01.log -O /usr/local/nagios/libexec/alert_kfs.log # new
command[check_kfs_alert_cp]=/usr/local/nagios/libexec/check_alertlog_checkpoint -F /opt/oracle/base/diag/rdbms/kfsdb01/KFSDB01/trace/alert_KFSDB01.log -O /usr/local/nagios/libexec/alert_kfs_cp.log # new
command[check_lsnr]=/usr/local/nagios/libexec/check_listener.bsh
# Diverses
command[check_users]=/usr/local/nagios/libexec/check_users -w 5 -c 10
command[check_load2]=/usr/local/nagios/libexec/check_load -w 15,10,5 -c 30,25,20 -r
command[check_zombie_procs]=/usr/local/nagios/libexec/check_procs -w 5 -c 10 -s Z
command[check_total_procs]=/usr/local/nagios/libexec/check_procs -w 200 -c 250
command[check_time]=/usr/local/nagios/libexec/check_ntp_time -H ntp.dpma.de -w ~0.4:0.4 -c ~0.5:0.5
command[check_cores]=/usr/local/nagios/libexec/check_cores.bsh
command[check_swap]=/usr/local/nagios/libexec/check_swap -w 20% -c 10%
# The following examples allow user-supplied arguments and can
# only be used if the NRPE daemon was compiled with support for
# command arguments *AND* the dont_blame_nrpe directive in this
# config file is set to '1'. This poses a potential security risk, so
# make sure you read the SECURITY file before doing this.
#command[check_users]=/usr/local/nagios/libexec/check_users -w $ARG1$ -c $ARG2$
#command[check_load]=/usr/local/nagios/libexec/check_load -w $ARG1$ -c $ARG2$
#command[check_disk]=/usr/local/nagios/libexec/check_disk -w $ARG1$ -c $ARG2$ -p $ARG3$
command[check_procs]=/usr/local/nagios/libexec/check_procs -w $ARG1$ -c $ARG2$ -s $ARG3$
Re: command not defined
Posted: Fri Dec 06, 2013 3:07 pm
by lmiltchev
Are you running NRPE under xinetd or as a standalone daemon?
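The reason this matters: under xinetd, NRPE reads whatever config file follows -c in the server_args line of the xinetd service file, and that is not necessarily the nrpe.cfg that was edited. A rough sketch for checking this is below. It assumes the service file lives at /etc/xinetd.d/nrpe, which may differ on your system, and nrpe_cfg_from_xinetd is a hypothetical helper, not part of NRPE.

```shell
#!/bin/sh
# Hypothetical helper: print the config file that xinetd hands to NRPE,
# i.e. the path following "-c" on the server_args line of a service file.
# Usage: nrpe_cfg_from_xinetd <xinetd-service-file>
nrpe_cfg_from_xinetd() {
    sed -n 's/^[[:space:]]*server_args[[:space:]]*=.*[[:space:]]-c[[:space:]]\{1,\}\([^[:space:]]\{1,\}\).*/\1/p' "$1"
}

# Assumed location of the service file; adjust to the actual installation.
XINETD_FILE=/etc/xinetd.d/nrpe

if [ -r "$XINETD_FILE" ]; then
    echo "xinetd starts NRPE with: $(nrpe_cfg_from_xinetd "$XINETD_FILE")"
else
    echo "no readable $XINETD_FILE, NRPE may be running standalone"
fi

# A standalone daemon, if any, shows its own -c argument in the process list.
ps -ef | grep '[n]rpe' || echo "no nrpe process found"
```

If the path printed here differs from /usr/local/nagios/etc/nrpe.cfg, or a long-running nrpe process shows up alongside xinetd, that would be worth mentioning in your reply.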