Fixing damaged and/or partial installs of Nagios

agenerette
Posts: 50
Joined: Wed Jul 25, 2012 5:09 pm

Re: Fixing damaged and/or partial installs of Nagios

Post by agenerette »

No. The only change I've made in about six months is adding the apt-if-appropriate code, which installs apt only on Debian/Ubuntu VMs.

Do you happen to know what steps I would need to follow to get the check_disk command set up? If I can just get that down to a manual process, I'll be able to add it to my Chef config.

The primary problem, at this point, is the "NRPE: Command 'check_disk' not defined" error that's showing on a number of the nodes (see the attached screen-shot). The timeout and ssh-handshake alerts, I believe I'll be able to take care of.

-Anthony
Attachments
Screen Shot 2014-09-03 at 11.30.38 AM.png
eloyd
Cool Title Here
Posts: 2190
Joined: Thu Sep 27, 2012 9:14 am
Location: Rochester, NY

Re: Fixing damaged and/or partial installs of Nagios

Post by eloyd »

"check_disk" is a pretty standard NRPE check.

In commands.cfg:

Code: Select all

define command{
        command_name    check_nrpe
        command_line    $USER1$/check_nrpe -H $HOSTADDRESS$ -c $ARG1$ -a "$ARG2$"
}
In services.cfg:

Code: Select all

define service{
        use                     nrpe-service
        service_description     Root Partition
        hostgroups              private
        servicegroups           System,NRPE
        check_command           check_nrpe!check_disk!-w 20% -c 10% -p /
}
(Note, we use hostgroups to associate service checks, so you may want to use a hostname instead.)

On the client, somewhere in nrpe.cfg:

Code: Select all

command[check_disk]=/usr/local/nagios/libexec/check_disk $ARG1$
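For anyone following along, here is a rough sketch of how the three pieces above fit together: Nagios splits the service's check_command on "!" into $ARG1$ and $ARG2$ and substitutes them into the check_nrpe command_line. The helper name expand_check_command, the example address 192.0.2.10, and the plugin directory argument (standing in for $USER1$) are assumptions for illustration, not part of Nagios.

```shell
# Hypothetical helper: mimic how Nagios expands the service's check_command
# into the check_nrpe command_line from commands.cfg.
# $1 = check_command value (assumed to contain at least one "!")
# $2 = host address, $3 = plugin dir (stands in for $USER1$)
expand_check_command() {
    check_command=$1
    host=$2
    plugin_dir=$3

    # Everything before the first "!" is the command name.
    name=${check_command%%!*}
    rest=${check_command#*!}

    # The next field is $ARG1$. As a simplification, everything after the
    # second "!" is treated as $ARG2$.
    arg1=${rest%%!*}
    case $rest in
        *!*) arg2=${rest#*!} ;;
        *)   arg2= ;;
    esac

    printf '%s/%s -H %s -c %s -a "%s"\n' "$plugin_dir" "$name" "$host" "$arg1" "$arg2"
}
```

Calling expand_check_command 'check_nrpe!check_disk!-w 20% -c 10% -p /' 192.0.2.10 /usr/lib/nagios/plugins prints /usr/lib/nagios/plugins/check_nrpe -H 192.0.2.10 -c check_disk -a "-w 20% -c 10% -p /". The NRPE daemon on the client then looks up command[check_disk] in nrpe.cfg and fills its own $ARG1$ with the string passed after -a.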
Eric Loyd • http://everwatch.global • 844.240.EVER • @EricLoyd
I'm a Nagios Fanatic! • Join our public Nagios Discord Server!
agenerette
Posts: 50
Joined: Wed Jul 25, 2012 5:09 pm

Re: Fixing damaged and/or partial installs of Nagios

Post by agenerette »

After looking over your last post, I ran this, on the Nagios server:

root@ip-10-244-20-90:/etc# grep -R check_nrpe *
nagios3/conf.d/commands.cfg: command_line $USER1$/check_nrpe -H $HOSTADDRESS$ -c check_nagios -t 20
nagios3/conf.d/commands.cfg: command_name check_nrpe_alive
nagios3/conf.d/commands.cfg: command_line $USER1$/check_nrpe -H $HOSTADDRESS$ -t 20
nagios3/conf.d/commands.cfg: command_name check_nrpe
nagios3/conf.d/commands.cfg: command_line $USER1$/check_nrpe -H $HOSTADDRESS$ -c $ARG1$ -t 20
nagios3/conf.d/services.cfg:# check_command check_nrpe!check_smtp
nagios3/conf.d/services.cfg: check_command check_nrpe!check_disk
nagios3/conf.d/services.cfg: check_command check_nrpe!check_disk
nagios3/conf.d/services.cfg.bak:# check_command check_nrpe!check_smtp
nagios-plugins/config/check_nrpe.cfg: command_name check_nrpe
nagios-plugins/config/check_nrpe.cfg: command_line /usr/lib/nagios/plugins/check_nrpe -H $HOSTADDRESS$ -c $ARG1$ -a $ARG2$
nagios-plugins/config/check_nrpe.cfg: command_name check_nrpe_1arg
nagios-plugins/config/check_nrpe.cfg: command_line /usr/lib/nagios/plugins/check_nrpe -H $HOSTADDRESS$ -c $ARG1$

So, I started looking through the commands.cfg and services.cfg files... It looks like the check_nrpe command and two check_disk services are already in place. On the portal-production node, though, the directory /usr/local/nagios/libexec/ doesn't exist. On that machine, I find:

[root@ip-10-160-23-32 objects]# find / -name check_disk -print
/var/chef/cache/nagios-plugins-1.4.16/plugins/check_disk
/usr/lib64/nagios/plugins/check_disk
[root@ip-10-160-23-32 objects]#

I tried adding

command[check_disk]=/usr/lib64/nagios/plugins/check_disk $ARG1$

at the end of /etc/nagios/nrpe.cfg on the portal-production node and restarting all of the services, but this doesn't appear to have helped.

-Anthony
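
One way to double-check that step, sketched under assumptions: list the command names that the NRPE config file actually defines. The helper name list_nrpe_commands is made up for this example; the config path is the one mentioned in the post above.

```shell
# Hypothetical helper: print the name of every command[...] definition in an
# NRPE config file, skipping commented-out lines.
# $1 = path to the config file, e.g. /etc/nagios/nrpe.cfg
list_nrpe_commands() {
    sed -n 's/^[[:space:]]*command\[\([^]]*\)].*/\1/p' "$1"
}
```

Running list_nrpe_commands /etc/nagios/nrpe.cfg should include check_disk in its output. It is also worth confirming that the running daemon reads that same file (the config path normally appears after -c in the nrpe process arguments, or in the xinetd service definition), since a definition added to a file the daemon never loads would likely produce the same "not defined" error.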
eloyd
Cool Title Here
Posts: 2190
Joined: Thu Sep 27, 2012 9:14 am
Location: Rochester, NY

Re: Fixing damaged and/or partial installs of Nagios

Post by eloyd »

Is NRPE running on the remote host?

Code: Select all

netstat -na | grep 5666
or
lsof -i:5666
agenerette
Posts: 50
Joined: Wed Jul 25, 2012 5:09 pm

Re: Fixing damaged and/or partial installs of Nagios

Post by agenerette »

Yeah, from portal-production (the monitored node), I get:
[root@ip-10-160-23-32 ~]# netstat -na | grep 5666
tcp 0 0 0.0.0.0:5666 0.0.0.0:* LISTEN
tcp 0 0 :::5666


And from the Nagios server, I get:
root@ip-10-244-20-90:~# /usr/lib/nagios/plugins/check_nrpe -H 50.18.168.192
NRPE v2.15
eloyd
Cool Title Here
Posts: 2190
Joined: Thu Sep 27, 2012 9:14 am
Location: Rochester, NY

Re: Fixing damaged and/or partial installs of Nagios

Post by eloyd »

On the remote host, what happens when you log in as nagios (or su to nagios) and type:

Code: Select all

/usr/lib64/nagios/plugins/check_disk !-w 20% -c 10% -p /
agenerette
Posts: 50
Joined: Wed Jul 25, 2012 5:09 pm

Re: Fixing damaged and/or partial installs of Nagios

Post by agenerette »

I get:

[nagios@ip-10-160-23-32 ~]$ /usr/lib64/nagios/plugins/check_disk !-w 20% -c 10% -p /
/usr/lib64/nagios/plugins/check_disk which ohai 20% -c 10% -p /
DISK CRITICAL - which is not accessible: No such file or directory

But,

[root@ip-10-160-23-32 ~]# which ohai
/usr/bin/ohai
Last edited by agenerette on Thu Sep 04, 2014 11:27 am, edited 1 time in total.
eloyd
Cool Title Here
Posts: 2190
Joined: Thu Sep 27, 2012 9:14 am
Location: Rochester, NY

Re: Fixing damaged and/or partial installs of Nagios

Post by eloyd »

Whoops! I had an extra character in there. Try this:

Code: Select all

/usr/lib64/nagios/plugins/check_disk -w 20% -c 10% -p /
without the ! :-)
agenerette
Posts: 50
Joined: Wed Jul 25, 2012 5:09 pm

Re: Fixing damaged and/or partial installs of Nagios

Post by agenerette »

Oh, I saw your comment after I posted that edit to my last...

[root@ip-10-160-23-32 ~]# /usr/lib64/nagios/plugins/check_disk -w 20% -c 10% -p /
DISK OK - free space: / 5204 MB (65% inode=79%);| /=2777MB;6450;7256;0;8063


So, I just started looking around in /etc/nagios3/conf.d, on the Nagios server. The fact that there are /etc/nagios and /etc/nagios3 directories on that machine has caused me some confusion, up to now. .../nagios3/conf.d/commands.cfg, though, seems to be the place where check_disk needs to be defined, but I saw no code there for the command. So, I added the following code to the file and restarted the Nagios services on both server and node:

define command {
command_name check_disk
command_line $USER1$/check_nrpe -H $HOSTADDRESS$ -c check_disk -a '-w 10% -c 5% -p /'
}

Again, no luck.
sreinhardt
-fno-stack-protector
Posts: 4366
Joined: Mon Nov 19, 2012 12:10 pm

Re: Fixing damaged and/or partial installs of Nagios

Post by sreinhardt »

At this point, it looks like you just need to reconfigure the NRPE side of things. Nagios knows about your NRPE systems and is trying to communicate, but it is getting stopped by two main things. First, the commands are not defined, at least in the configs shown, so NRPE does not know how to execute check_disk and the others. Second, a possible issue is the reference to /usr/lib/ instead of /usr/lib64, which it appears you need to use.

To clarify the directories you are seeing: most of the time, nagios3 comes from an Ubuntu/Debian package (.deb) install of Nagios Core. The /etc/nagios dir could be NRPE from a package, a source install of any number of Nagios products, or another version of Core installed on that system.

To define the command for the NRPE system with plugins in lib64, your NRPE command would look like:

Code: Select all

command[check_disk]=/usr/lib64/nagios/plugins/check_disk $ARG1$
After adding that, restart xinetd or the NRPE daemon (I'm not sure which of the two you are using at this point), and you should be all set to run an immediate check from Nagios and get some results!
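
One more thing that may be worth checking, offered as a sketch rather than a diagnosis: a command line that uses $ARG1$ only receives arguments when NRPE was built with command-argument support, dont_blame_nrpe=1 is set in nrpe.cfg, and the server's check_nrpe call passes them with -a. The helper name nrpe_args_status and its three output strings are assumptions for illustration.

```shell
# Hypothetical helper: report whether an NRPE config that uses $ARGn$ macros
# in its command definitions also enables argument passing.
# $1 = path to the config file, e.g. /etc/nagios/nrpe.cfg
nrpe_args_status() {
    cfg=$1
    if ! grep -Eq '^[[:space:]]*command\[[^]]*]=.*\$ARG[0-9]+\$' "$cfg"; then
        # No active command definition references $ARG1$, $ARG2$, ...
        echo "no-arg-macros"
    elif grep -Eq '^[[:space:]]*dont_blame_nrpe[[:space:]]*=[[:space:]]*1' "$cfg"; then
        # Argument macros are used and the daemon is told to accept arguments.
        echo "args-enabled"
    else
        # Argument macros are used but dont_blame_nrpe=1 was not found.
        echo "args-disabled"
    fi
}
```

If nrpe_args_status /etc/nagios/nrpe.cfg prints args-disabled, one option is to enable arguments; another is to hard-code the thresholds from earlier in the thread in the definition, for example command[check_disk]=/usr/lib64/nagios/plugins/check_disk -w 20% -c 10% -p /, which should also work with services such as check_nrpe!check_disk that pass no arguments.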
Nagios-Plugins maintainer exclusively, unless you have other C language bugs with open-source nagios projects, then I am happy to help! Please pm or use other communication to alert me to issues as I no longer track the forum.