nrpe not working

Posted: Thu Jul 27, 2017 11:32 am
by john.1209
Error: Service check command 'check_nrpe!check_load' specified in service 'check_load' for host 'myhost' not defined anywhere!

I'm getting the above error when I try to start nagios.

This is a new configuration running on CentOS 7.

I'm adding the first client.

I can run the following on the command line:

[root@netmon objects]# /usr/local/nagios/libexec/check_nrpe -H 192.168.1.62 -c check_users
USERS OK - 1 users currently logged in |users=1;5;10;0

But on startup, I get the error below:

Error: Service check command 'check_nrpe!check_load' specified in service 'check_load' for host 'myhost' not defined anywhere!

Here are some contents of the commands.cfg file:

# 'check_local_load' command definition
define command{
command_name check_local_load
command_line $USER1$/check_load -w $ARG1$ -c $ARG2$
}

Here are contents of the host.cfg file causing the error:

define service{
use generic-service ; Name of service template to use
host_name myhost.mydomain.com
service_description check_load
check_command check_nrpe!check_load
notifications_enabled 1
}

Any idea why?

Thanks,

Re: nrpe not working

Posted: Thu Jul 27, 2017 11:58 am
by kyang
Hello,

Have you checked whether the Nagios server's IP address is listed under allowed_hosts in /usr/local/nagios/etc/nrpe.cfg on 'myhost'?

Re: nrpe not working

Posted: Thu Jul 27, 2017 3:11 pm
by john.1209
Yeah, actually that is in there....

[root@trasher etc]# grep allowed_hosts nrpe.cfg
allowed_hosts=127.0.0.1,192.168.1.58
[root@trasher etc]#

Here is a thought. Check out the following:

[root@netmon objects]# pwd
/usr/local/nagios/etc/objects

[root@netmon objects]# ls
commands.cfg contacts.cfg localhost.cfg nrln printer.cfg switch.cfg templates.cfg timeperiods.cfg windows.cfg

[root@netmon objects]# grep check_load *
commands.cfg: command_line $USER1$/check_load -w $ARG1$ -c $ARG2$

In the /usr/local/nagios/etc folder on the Nagios server host, is there supposed to be a "linux" template file such as linux.cfg? The only place I'm seeing check_load is in the commands.cfg file. Am I missing any files in the /usr/local/nagios/etc/objects folder? I'm just thinking out loud here.

TIA.

Re: nrpe not working

Posted: Thu Jul 27, 2017 3:36 pm
by lmiltchev
When do you see the error message - when you try to start/restart nagios?

Can you post the entire nrpe.cfg file from the client (remote machine)?

Does check_nrpe work correctly when you test it from the nagios server?

Code:

/usr/local/nagios/libexec/check_nrpe -H <client ip>

The proper nrpe syntax is:

Code:

/usr/local/nagios/libexec/check_nrpe -H <client ip> -c <command> -a <args>

where "<command>" is defined in the nrpe.cfg file on the remote box.

Re: nrpe not working

Posted: Thu Jul 27, 2017 6:49 pm
by john.1209
NRPE check works fine from the server when checked from the command line. Please note:

Code:

[root@netmon objects]# /usr/local/nagios/libexec/check_nrpe -H 192.168.1.62 -c check_users
USERS OK - 1 users currently logged in |users=1;5;10;0
[root@netmon objects]#

The nrpe.cfg file from the client is posted below.

Code:

[root@trasher etc]# grep -v "^#" nrpe.cfg  | grep -v "^$"
log_facility=daemon
debug=0
pid_file=/usr/local/nagios/var/nrpe.pid
server_port=5666
nrpe_user=nagios
nrpe_group=nagios
allowed_hosts=127.0.0.1,192.168.1.58
dont_blame_nrpe=0
allow_bash_command_substitution=0
command_timeout=60
connection_timeout=300
command[check_users]=/usr/local/nagios/libexec/check_users -w 5 -c 10
command[check_load]=/usr/local/nagios/libexec/check_load -r -w .15,.10,.05 -c .30,.25,.20
command[check_root]=/usr/local/nagios/libexec/check_disk -w 20% -c 10% -p /dev/mapper/cl-root
command[check_zombie_procs]=/usr/local/nagios/libexec/check_procs -w 5 -c 10 -s Z
command[check_total_procs]=/usr/local/nagios/libexec/check_procs -w 150 -c 200

The error log after starting nagios is posted below:

Code:

[root@netmon objects]# journalctl -xe
Jul 27 19:41:29 netmon.mydomain.com systemd[1]: nagios.service: control process exited, code=exited status=8
Jul 27 19:41:29 netmon.mydomain.com nagios[29349]: the HTML documentation regarding the config files, as well as the
Jul 27 19:41:29 netmon.mydomain.com nagios[29349]: 'Whats New' section to find out what has changed.
Jul 27 19:41:29 netmon.mydomain.com systemd[1]: Failed to start LSB: Starts and stops the Nagios monitoring server.
-- Subject: Unit nagios.service has failed
-- Defined-By: systemd
-- Support: http://lists.freedesktop.org/mailman/listinfo/systemd-devel
--
-- Unit nagios.service has failed.
--
-- The result is failed.
Jul 27 19:41:29 netmon.mydomain.com systemd[1]: Unit nagios.service entered failed state.
Jul 27 19:41:29 netmon.mydomain.com systemd[1]: nagios.service failed.
Jul 27 19:41:29 netmon.mydomain.com polkitd[666]: Unregistered Authentication Agent for unix-process:29344:29580184 (system bus name :1.1195, object path /org/freedesktop/PolicyKit1/AuthenticationAgent, locale en_US.UTF-8)
Jul 27 19:43:01 netmon.mydomain.com polkitd[666]: Registered Authentication Agent for unix-process:29391:29589353 (system bus name :1.1196 [/usr/bin/pkttyagent --notify-fd 5 --fallback], object path /org/freedesktop/Policy
Jul 27 19:43:01 netmon.mydomain.com systemd[1]: Starting LSB: Starts and stops the Nagios monitoring server...
-- Subject: Unit nagios.service has begun start-up
-- Defined-By: systemd
-- Support: http://lists.freedesktop.org/mailman/listinfo/systemd-devel
--
-- Unit nagios.service has begun starting up.
Jul 27 19:43:01 netmon.mydomain.com nagios[29416]: Nagios 4.3.2 starting... (PID=29416)
Jul 27 19:43:01 netmon.mydomain.com nagios[29416]: Local time is Thu Jul 27 19:43:01 EDT 2017
Jul 27 19:43:01 netmon.mydomain.com nagios[29396]: Starting nagios: done.
Jul 27 19:43:01 netmon.mydomain.com nagios[29416]: LOG VERSION: 2.0
Jul 27 19:43:01 netmon.mydomain.com nagios[29416]: qh: Socket '/usr/local/nagios/var/rw/nagios.qh' successfully initialized
Jul 27 19:43:01 netmon.mydomain.com nagios[29416]: qh: core query handler registered
Jul 27 19:43:01 netmon.mydomain.com nagios[29416]: nerd: Channel hostchecks registered successfully
Jul 27 19:43:01 netmon.mydomain.com nagios[29416]: nerd: Channel servicechecks registered successfully
Jul 27 19:43:01 netmon.mydomain.com nagios[29416]: nerd: Channel opathchecks registered successfully
Jul 27 19:43:01 netmon.mydomain.com nagios[29416]: nerd: Fully initialized and ready to rock!
Jul 27 19:43:01 netmon.mydomain.com systemd[1]: Started LSB: Starts and stops the Nagios monitoring server.
-- Subject: Unit nagios.service has finished start-up
-- Defined-By: systemd
-- Support: http://lists.freedesktop.org/mailman/listinfo/systemd-devel
--
-- Unit nagios.service has finished starting up.
--
-- The start-up result is done.
Jul 27 19:43:01 netmon.mydomain.com nagios[29416]: wproc: Successfully registered manager as @wproc with query handler
Jul 27 19:43:01 netmon.mydomain.com nagios[29416]: wproc: Registry request: name=Core Worker 29418;pid=29418
Jul 27 19:43:01 netmon.mydomain.com nagios[29416]: wproc: Registry request: name=Core Worker 29420;pid=29420
Jul 27 19:43:01 netmon.mydomain.com nagios[29416]: wproc: Registry request: name=Core Worker 29421;pid=29421
Jul 27 19:43:01 netmon.mydomain.com polkitd[666]: Unregistered Authentication Agent for unix-process:29391:29589353 (system bus name :1.1196, object path /org/freedesktop/PolicyKit1/AuthenticationAgent, locale en_US.UTF-8)
Jul 27 19:43:01 netmon.mydomain.com nagios[29416]: wproc: Registry request: name=Core Worker 29419;pid=29419
Jul 27 19:43:01 netmon.mydomain.com nagios[29416]: WARNING: The normal_check_interval attribute is deprecated and will be removed in future versions. Please use check_interval instead.
Jul 27 19:43:01 netmon.mydomain.com nagios[29416]: WARNING: The normal_retry_interval attribute is deprecated and will be removed in future versions. Please use retry_interval instead.
Jul 27 19:43:01 netmon.mydomain.com nagios[29416]: WARNING: The normal_check_interval attribute is deprecated and will be removed in future versions. Please use check_interval instead.
Jul 27 19:43:01 netmon.mydomain.com nagios[29416]: WARNING: The normal_retry_interval attribute is deprecated and will be removed in future versions. Please use retry_interval instead.
Jul 27 19:43:01 netmon.mydomain.com nagios[29416]: Successfully launched command file worker with pid 29422
Jul 27 19:44:12 netmon.mydomain.com sshd[29433]: Connection closed by 127.0.0.1 [preauth]
Jul 27 19:44:20 netmon.mydomain.com polkitd[666]: Registered Authentication Agent for unix-process:29438:29597212 (system bus name :1.1197 [/usr/bin/pkttyagent --notify-fd 5 --fallback], object path /org/freedesktop/Policy
Jul 27 19:44:20 netmon.mydomain.com polkitd[666]: Unregistered Authentication Agent for unix-process:29438:29597212 (system bus name :1.1197, object path /org/freedesktop/PolicyKit1/AuthenticationAgent, locale en_US.UTF-8)
[root@netmon objects]#

Re: nrpe not working

Posted: Fri Jul 28, 2017 10:28 am
by lmiltchev
Can you post the entire commands.cfg file from the nagios server? You must be missing a command.

Re: nrpe not working

Posted: Fri Jul 28, 2017 1:37 pm
by john.1209
Here are the contents of commands.cfg:

Code:

[root@netmon objects]# cat commands.cfg
###############################################################################
# COMMANDS.CFG - SAMPLE COMMAND DEFINITIONS FOR NAGIOS 4.2.4
#
#
# NOTES: This config file provides you with some example command definitions
#        that you can reference in host, service, and contact definitions.
#
#        You don't need to keep commands in a separate file from your other
#        object definitions.  This has been done just to make things easier to
#        understand.
#
###############################################################################


################################################################################
#
# SAMPLE NOTIFICATION COMMANDS
#
# These are some example notification commands.  They may or may not work on
# your system without modification.  As an example, some systems will require
# you to use "/usr/bin/mailx" instead of "/usr/bin/mail" in the commands below.
#
################################################################################


# 'notify-host-by-email' command definition
define command{
        command_name    notify-host-by-email
        command_line    /usr/bin/printf "%b" "***** Nagios *****\n\nNotification Type: $NOTIFICATIONTYPE$\nHost: $HOSTNAME$\nState: $HOSTSTATE$\nAddress: $HOSTADDRESS$\nInfo: $HOSTOUTPUT$\n\nDate/Time: $LONGDATETIME$\n" | /bin/mail -s "** $NOTIFICATIONTYPE$ Host Alert: $HOSTNAME$ is $HOSTSTATE$ **" $CONTACTEMAIL$
        }

# 'notify-service-by-email' command definition
define command{
        command_name    notify-service-by-email
        command_line    /usr/bin/printf "%b" "***** Nagios *****\n\nNotification Type: $NOTIFICATIONTYPE$\n\nService: $SERVICEDESC$\nHost: $HOSTALIAS$\nAddress: $HOSTADDRESS$\nState: $SERVICESTATE$\n\nDate/Time: $LONGDATETIME$\n\nAdditional Info:\n\n$SERVICEOUTPUT$\n" | /bin/mail -s "** $NOTIFICATIONTYPE$ Service Alert: $HOSTALIAS$/$SERVICEDESC$ is $SERVICESTATE$ **" $CONTACTEMAIL$
        }





################################################################################
#
# SAMPLE HOST CHECK COMMANDS
#
################################################################################


# This command checks to see if a host is "alive" by pinging it
# The check must result in a 100% packet loss or 5 second (5000ms) round trip
# average time to produce a critical error.
# Note: Five ICMP echo packets are sent (determined by the '-p 5' argument)

# 'check-host-alive' command definition
define command{
        command_name    check-host-alive
        command_line    $USER1$/check_ping -H $HOSTADDRESS$ -w 3000.0,80% -c 5000.0,100% -p 5
        }




################################################################################
#
# SAMPLE SERVICE CHECK COMMANDS
#
# These are some example service check commands.  They may or may not work on
# your system, as they must be modified for your plugins.  See the HTML
# documentation on the plugins for examples of how to configure command definitions.
#
# NOTE:  The following 'check_local_...' functions are designed to monitor
#        various metrics on the host that Nagios is running on (i.e. this one).
################################################################################

# 'check_local_disk' command definition
define command{
        command_name    check_local_disk
        command_line    $USER1$/check_disk -w $ARG1$ -c $ARG2$ -p $ARG3$
        }


# 'check_local_load' command definition
define command{
        command_name    check_local_load
        command_line    $USER1$/check_load -w $ARG1$ -c $ARG2$
        }


# 'check_local_procs' command definition
define command{
        command_name    check_local_procs
        command_line    $USER1$/check_procs -w $ARG1$ -c $ARG2$ -s $ARG3$
        }


# 'check_local_users' command definition
define command{
        command_name    check_local_users
#         command_name    check_users
        command_line    $USER1$/check_users -w $ARG1$ -c $ARG2$
        }


# 'check_local_swap' command definition
define command{
        command_name    check_local_swap
        command_line    $USER1$/check_swap -w $ARG1$ -c $ARG2$
        }


# 'check_local_mrtgtraf' command definition
define command{
        command_name    check_local_mrtgtraf
        command_line    $USER1$/check_mrtgtraf -F $ARG1$ -a $ARG2$ -w $ARG3$ -c $ARG4$ -e $ARG5$
        }


################################################################################
# NOTE:  The following 'check_...' commands are used to monitor services on
#        both local and remote hosts.
################################################################################

# 'check_ftp' command definition
define command{
        command_name    check_ftp
        command_line    $USER1$/check_ftp -H $HOSTADDRESS$ $ARG1$
        }


# 'check_hpjd' command definition
define command{
        command_name    check_hpjd
        command_line    $USER1$/check_hpjd -H $HOSTADDRESS$ $ARG1$
        }


# 'check_snmp' command definition
define command{
        command_name    check_snmp
        command_line    $USER1$/check_snmp -H $HOSTADDRESS$ $ARG1$
        }


# 'check_http' command definition
define command{
        command_name    check_http
        command_line    $USER1$/check_http -I $HOSTADDRESS$ $ARG1$
        }


# 'check_ssh' command definition
define command{
        command_name    check_ssh
        command_line    $USER1$/check_ssh $ARG1$ $HOSTADDRESS$
        }


# 'check_dhcp' command definition
define command{
        command_name    check_dhcp
        command_line    $USER1$/check_dhcp $ARG1$
        }


# 'check_ping' command definition
define command{
        command_name    check_ping
        command_line    $USER1$/check_ping -H $HOSTADDRESS$ -w $ARG1$ -c $ARG2$ -p 5
        }


# 'check_pop' command definition
define command{
        command_name    check_pop
        command_line    $USER1$/check_pop -H $HOSTADDRESS$ $ARG1$
        }


# 'check_imap' command definition
define command{
        command_name    check_imap
        command_line    $USER1$/check_imap -H $HOSTADDRESS$ $ARG1$
        }


# 'check_smtp' command definition
define command{
        command_name    check_smtp
        command_line    $USER1$/check_smtp -H $HOSTADDRESS$ $ARG1$
        }


# 'check_tcp' command definition
define command{
        command_name    check_tcp
        command_line    $USER1$/check_tcp -H $HOSTADDRESS$ -p $ARG1$ $ARG2$
        }


# 'check_udp' command definition
define command{
        command_name    check_udp
        command_line    $USER1$/check_udp -H $HOSTADDRESS$ -p $ARG1$ $ARG2$
        }


# 'check_nt' command definition
define command{
        command_name    check_nt
        command_line    $USER1$/check_nt -H $HOSTADDRESS$ -p 12489 -v $ARG1$ $ARG2$
        }



################################################################################
#
# SAMPLE PERFORMANCE DATA COMMANDS
#
# These are sample performance data commands that can be used to send performance
# data output to two text files (one for hosts, another for services).  If you
# plan on simply writing performance data out to a file, consider using the
# host_perfdata_file and service_perfdata_file options in the main config file.
#
################################################################################


# 'process-host-perfdata' command definition
define command{
        command_name    process-host-perfdata
        command_line    /usr/bin/printf "%b" "$LASTHOSTCHECK$\t$HOSTNAME$\t$HOSTSTATE$\t$HOSTATTEMPT$\t$HOSTSTATETYPE$\t$HOSTEXECUTIONTIME$\t$HOSTOUTPUT$\t$HOSTPERFDATA$\n" >> /usr/local/nagios/var/host-perfdata.out
        }


# 'process-service-perfdata' command definition
define command{
        command_name    process-service-perfdata
        command_line    /usr/bin/printf "%b" "$LASTSERVICECHECK$\t$HOSTNAME$\t$SERVICEDESC$\t$SERVICESTATE$\t$SERVICEATTEMPT$\t$SERVICESTATETYPE$\t$SERVICEEXECUTIONTIME$\t$SERVICELATENCY$\t$SERVICEOUTPUT$\t$SERVICEPERFDATA$\n" >> /usr/local/nagios/var/service-perfdata.out
        }


[root@netmon objects]#

Re: nrpe not working

Posted: Fri Jul 28, 2017 1:59 pm
by lmiltchev
Your service is defined as such:

define service{
use generic-service ; Name of service template to use
host_name myhost.mydomain.com
service_description check_load
check_command check_nrpe!check_load
notifications_enabled 1
}

However, you don't have "check_nrpe" defined in your commands. That's why you are getting the error you showed us:

Error: Service check command 'check_nrpe!check_load' specified in service 'check_load' for host 'myhost' not defined anywhere!

Have you installed NRPE on the Nagios server? Do you have the check_nrpe plugin in the plugins directory?
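
One quick way to confirm which command names are actually defined is to search the object configuration for "command_name" lines. The following is only a minimal sketch: the function name has_command_def is hypothetical (it is not part of Nagios), and it assumes the command definitions live in *.cfg files directly inside a single directory, such as the /usr/local/nagios/etc/objects folder shown earlier in the thread.

```shell
#!/bin/sh
# has_command_def DIR NAME
# Hypothetical helper (not part of Nagios): exits 0 if any *.cfg file directly
# inside DIR has an uncommented "command_name NAME" line, non-zero otherwise.
# NAME is used inside a regular expression, so this sketch assumes it contains
# only letters, digits, "_" and "-".
has_command_def() {
    dir=$1
    name=$2
    # Anchor the pattern so that "#   command_name NAME" (commented out) and
    # longer names such as NAME_extra do not count as a match.
    grep -qE -- "^[[:space:]]*command_name[[:space:]]+${name}[[:space:]]*(;.*)?\$" \
        "$dir"/*.cfg 2>/dev/null
}

# Example usage (path taken from the thread):
#   if has_command_def /usr/local/nagios/etc/objects check_nrpe; then
#       echo "check_nrpe is defined"
#   else
#       echo "check_nrpe is NOT defined"
#   fi
```

Against the commands.cfg posted above, this would report check_local_load as defined and check_nrpe as not defined, which matches the startup error.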

Re: nrpe not working

Posted: Fri Jul 28, 2017 3:35 pm
by john.1209
I've installed it. Do I need to install it again?

Note...

Code:

[root@netmon libexec]# ls check_nrpe
check_nrpe

Should I run the NRPE install again?

Re: nrpe not working

Posted: Fri Jul 28, 2017 4:26 pm
by lmiltchev
If you already have the check_nrpe plugin in the libexec directory, add the check_nrpe command to the commands.cfg file:

Example:

Code:

define command {
       command_name                             check_nrpe
       command_line                             $USER1$/check_nrpe -H $HOSTADDRESS$ -t 30 -c $ARG1$ $ARG2$
}

Next, verify the configuration:

Code:

/usr/local/nagios/bin/nagios -v /usr/local/nagios/etc/nagios.cfg

If there are no errors, start the nagios service.
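
For reference, Nagios splits a check_command value on "!": the first field is the command name, and the remaining fields become $ARG1$, $ARG2$, and so on. The sketch below only imitates that substitution, to show roughly what the example check_nrpe definition above would run for check_nrpe!check_load. The function name expand_check_command is hypothetical, the address is the client IP used earlier in the thread, and this is not how Nagios itself is implemented.

```shell
#!/bin/sh
# expand_check_command CHECK_COMMAND HOSTADDRESS
# Illustrative only: fills $HOSTADDRESS$, $ARG1$ and $ARG2$ into the example
# command_line "$USER1$/check_nrpe -H $HOSTADDRESS$ -t 30 -c $ARG1$ $ARG2$".
expand_check_command() {
    check_command=$1
    hostaddress=$2

    # Everything after the first "!" holds the arguments (may be empty).
    case $check_command in
        *!*) args=${check_command#*!} ;;
        *)   args= ;;
    esac

    # First argument: text up to the next "!".
    arg1=${args%%!*}

    # Second argument: text between the next two "!" separators, if present.
    case $args in
        *!*) arg2=${args#*!}; arg2=${arg2%%!*} ;;
        *)   arg2= ;;
    esac

    # $USER1$ is left unexpanded here; it normally points at the plugin directory.
    line="\$USER1\$/check_nrpe -H $hostaddress -t 30 -c $arg1 $arg2"

    # Drop the trailing space left behind when $ARG2$ is empty.
    printf '%s\n' "${line% }"
}

expand_check_command 'check_nrpe!check_load' 192.168.1.62
# prints: $USER1$/check_nrpe -H 192.168.1.62 -t 30 -c check_load
```

So the "check_load" after the "!" in the service definition is not looked up in commands.cfg on the Nagios server; it is passed to the client, where it has to match a command[check_load] entry in nrpe.cfg.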