NRPE disk check reporting false positive
Posted: Thu Sep 24, 2020 1:28 pm
by rferebee
Good morning team,
I'm troubleshooting an issue that occurs intermittently. We have several NRPE-based disk checks running against various Linux hosts. At random, one of them will report that the disk is completely full for about a week, and then the alert will clear on its own.
Here's an example of the command being run:
Code:
/usr/local/nagios/libexec/check_nrpe -2 -H xx.xx.xx.xx -c check_disk1 -t 30 -a 5% 3% "/u04/dbs"
DISK CRITICAL - free space: /u04/dbs 0 MB (0% inode=99%);| /u04/dbs=4646735MB;12685312;12952371;0;13352960
Here's how the check_nrpe command is configured:
Code:
$USER1$/check_nrpe -2 -H $HOSTADDRESS$ -t 30 -c $ARG1$ $ARG2$
And finally, here's a df run on the host in question, showing the disk is not full:
Code:
@mondo:/home/admsa>df -gt
Filesystem GB blocks Used Free %Used Mounted on
/dev/hd4 1.25 0.24 1.01 20% /
/dev/hd2 5.31 4.18 1.13 79% /usr
/dev/hd9var 2.53 0.59 1.94 24% /var
/dev/hd3 3.00 0.05 2.95 2% /tmp
/dev/hd1 2.53 0.00 2.53 1% /home
/proc - - - - /proc
/dev/hd10opt 3.03 1.33 1.70 44% /opt
/dev/livedump 0.25 0.00 0.25 1% /var/adm/ras/livedump
/dev/lvu01dbs 50.00 17.72 32.28 36% /u01/dbs
/dev/lvu03dbs 15.00 9.57 5.43 64% /u03/dbs
/dev/lvu02dbs 6200.12 5616.07 584.05 91% /u02/dbs
/dev/lvexport 240.00 1.72 238.28 1% /dbexports
/dev/lvoracle 50.00 31.66 18.34 64% /u01/app/oracle
chance:/admin 150.00 123.38 26.62 83% /admin
***/dev/lvu04dbs 13040.00 4517.10 8522.90 35% /u04/dbs***
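The perfdata after the | in the plugin output above can be cross-checked against the df figures. A minimal sketch, assuming the usual perfdata layout label=value;warn;crit;min;max, with value as used space and max as the filesystem size, both in MB. The function name perfdata_free_mb is made up for illustration:

```shell
#!/bin/sh
# perfdata_free_mb PLUGIN_OUTPUT   (hypothetical helper, not part of Nagios)
# Prints "used_mb max_mb free_mb free_pct", computed from the text after '|'.
# Assumes one perfdata item shaped like label=<used>MB;warn;crit;min;max.
perfdata_free_mb() {
    printf '%s\n' "$1" | awk -F'|' '{
        split($2, f, ";")       # f[1] = " label=usedMB", f[5] = max
        sub(/^.*=/, "", f[1])   # drop "label="
        sub(/MB$/, "", f[1])    # drop the unit
        used = f[1] + 0
        max  = f[5] + 0
        free = max - used
        printf "%d %d %d %.0f\n", used, max, free, free * 100 / max
    }'
}

out='DISK CRITICAL - free space: /u04/dbs 0 MB (0% inode=99%);| /u04/dbs=4646735MB;12685312;12952371;0;13352960'
perfdata_free_mb "$out"
```

For the output above this prints 4646735 13352960 8706225 65, i.e. roughly 65% free. That lines up with the 35% used that df reports for /u04/dbs, and not with the 0 MB in the status text.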
Re: NRPE disk check reporting false positive
Posted: Fri Sep 25, 2020 1:07 pm
by benjaminsmith
Hi @rferebee,
Let's check the status of the inodes on this system. What is the output of the following command?
Code:
df -ih
Benjamin
Re: NRPE disk check reporting false positive
Posted: Fri Sep 25, 2020 1:29 pm
by rferebee
The system didn't like df -ih for some reason. Here are a couple of other df commands:
Code:
# df -ih
df: Not a recognized flag: h
Usage: df [-P] | [-IMitvc] [-gkm] [-s] [-T {local|remote|vfstype}] [-F {hdr1 hdr2 hdr3}] [filesystem ...] [file ...]
# df -i
Filesystem 512-blocks Free %Used Iused %Iused Mounted on
/dev/hd4 2621440 2110112 20% 17313 7% /
/dev/hd2 11141120 2375496 79% 65863 19% /usr
/dev/hd9var 5308416 4052496 24% 5707 2% /var
/dev/hd3 6291456 6185288 2% 2250 1% /tmp
/dev/hd1 5308416 5299432 1% 167 1% /home
/proc - - - - - /proc
/dev/hd10opt 6356992 3580800 44% 11310 3% /opt
/dev/livedump 524288 523552 1% 4 1% /var/adm/ras/livedump
/dev/lvu01dbs 104857600 67695696 36% 31 1% /u01/dbs
/dev/lvu03dbs 31457280 11389760 64% 17 1% /u03/dbs
/dev/lvu02dbs 13002604544 1224849408 91% 196 1% /u02/dbs
/dev/lvu04dbs 27346862080 17815081904 35% 282 1% /u04/dbs
/dev/lvexport 503316480 499699064 1% 34 1% /dbexports
/dev/lvoracle 104857600 38379640 64% 83237 2% /u01/app/oracle
chance:/admin 314572800 73694856 77% 59329 1% /admin
# df -gt
Filesystem GB blocks Used Free %Used Mounted on
/dev/hd4 1.25 0.24 1.01 20% /
/dev/hd2 5.31 4.18 1.13 79% /usr
/dev/hd9var 2.53 0.60 1.93 24% /var
/dev/hd3 3.00 0.05 2.95 2% /tmp
/dev/hd1 2.53 0.00 2.53 1% /home
/proc - - - - /proc
/dev/hd10opt 3.03 1.32 1.71 44% /opt
/dev/livedump 0.25 0.00 0.25 1% /var/adm/ras/livedump
/dev/lvu01dbs 50.00 17.72 32.28 36% /u01/dbs
/dev/lvu03dbs 15.00 9.57 5.43 64% /u03/dbs
/dev/lvu02dbs 6200.12 5616.07 584.05 91% /u02/dbs
/dev/lvu04dbs 13040.00 4545.11 8494.89 35% /u04/dbs
/dev/lvexport 240.00 1.72 238.28 1% /dbexports
/dev/lvoracle 50.00 31.70 18.30 64% /u01/app/oracle
chance:/admin 150.00 114.86 35.14 77% /admin
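The df -i and df -gt listings above use different units (512-byte blocks vs. GB). A small sketch for converting one into the other, assuming 1 GB = 1024^3 bytes. blocks512_to_gb is a hypothetical helper name:

```shell
#!/bin/sh
# blocks512_to_gb COUNT   (hypothetical helper)
# Converts a count of 512-byte blocks to GB, taking 1 GB as 1024^3 bytes.
blocks512_to_gb() {
    awk -v b="$1" 'BEGIN { printf "%.2f\n", b * 512 / (1024 * 1024 * 1024) }'
}

blocks512_to_gb 27346862080   # total size of /u04/dbs from df -i
blocks512_to_gb 17815081904   # free blocks of /u04/dbs from df -i
```

This prints 13040.00 and 8494.89, which match the GB blocks and Free columns for /u04/dbs in the df -gt listing, so both views agree that the filesystem is about 35% used.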
Re: NRPE disk check reporting false positive
Posted: Mon Sep 28, 2020 2:25 pm
by ssax
What plugin is check_disk1 calling? Please attach the plugin from the remote system so I can look at it.
Please send the command definition for check_disk1 from the nrpe.cfg on the remote system so we can see how it's set up.
Re: NRPE disk check reporting false positive
Posted: Mon Sep 28, 2020 2:35 pm
by rferebee
'check_disk1' is calling 'check_disk' on the remote system. Plugin attached.
Here's the output of cat on nrpe.cfg from the system in question:
Code:
command[check_ss]=/usr/local/nagios/libexec/check_ss.pl -c $ARG1$
command[check_rp]=/usr/local/nagios/libexec/check_rp.pl -t $ARG1$ -c $ARG2$
command[check_cosmos]=/usr/local/nagios/libexec/cosmos_check.pl
command[check_dns]=/usr/local/nagios/libexec/check_dns yahoo.com
command[check_active_procs]=/usr/local/nagios/libexec/check_procs -c $ARG1$ -a $ARG2$
command[check_mem]=/usr/local/nagios/libexec/check_mem -f -w $ARG1$ -c $ARG2$
command[check_mem2]=/usr/local/nagios/libexec/check_mem2 -w $ARG1$ -c $ARG2$
command[check_users]=/usr/local/nagios/libexec/check_users -w $ARG1$ -c $ARG2$
command[check_load]=/usr/local/nagios/libexec/check_load -w $ARG1$ -c $ARG2$
command[check_disk1]=/usr/local/nagios/libexec/check_disk -w $ARG1$ -c $ARG2$ -p $ARG3$
command[check_disk2]=/usr/local/nagios/libexec/check_disk -w $ARG1$ -c $ARG2$ -p $ARG3$
command[check_zombie_procs]=/usr/local/nagios/libexec/check_procs -w $ARG1$ -c $ARG2$ -s Z
command[check_total_procs]=/usr/local/nagios/libexec/check_procs -w $ARG1$ -c $ARG2$
command[check_procs]=/usr/local/nagios/libexec/check_procs -w $ARG1$ -c $ARG2$ -a $ARG3$
command[check_java_procs]=/usr/local/nagios/libexec/check_procs -w $ARG1$ -P $ARG2$ -a $ARG3$
command[check_oracle]=/usr/local/nagios/libexec/check_oracle --login $ARG1$
command[check_procs_multi]=/usr/local/nagios/libexec/check_procs -c $ARG1$ -C $ARG2$ -a '$ARG3$'
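With these definitions, the arguments passed via -a on the Nagios side fill $ARG1$, $ARG2$ and $ARG3$ in order. A rough sketch of that substitution, for illustration only: this is not NRPE's actual code, and expand_nrpe_args is a made-up name.

```shell
#!/bin/sh
# expand_nrpe_args TEMPLATE ARG...   (hypothetical helper, illustration only)
# Replaces $ARG1$, $ARG2$, ... in TEMPLATE with the given arguments, in order.
# Arguments containing '|', '&' or '\' would need escaping for sed.
expand_nrpe_args() {
    line=$1
    shift
    n=1
    for arg in "$@"; do
        # '|' is the sed delimiter because mount points contain '/'.
        line=$(printf '%s\n' "$line" | sed "s|\\\$ARG${n}\\\$|${arg}|g")
        n=$((n + 1))
    done
    printf '%s\n' "$line"
}

expand_nrpe_args \
    '/usr/local/nagios/libexec/check_disk -w $ARG1$ -c $ARG2$ -p $ARG3$' \
    '5%' '3%' '/u04/dbs'
```

For check_disk1 called with -a 5% 3% "/u04/dbs", this yields /usr/local/nagios/libexec/check_disk -w 5% -c 3% -p /u04/dbs, which is the command line to run by hand on the remote host when testing the plugin directly.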
Re: NRPE disk check reporting false positive
Posted: Tue Sep 29, 2020 4:36 pm
by ssax
Please SSH into the remote system, run these commands, and attach the full output:
Code:
su - nagios
/usr/local/nagios/libexec/check_disk -V
/usr/local/nagios/libexec/check_disk -w 5% -c 3% -p "/u04/dbs" -v
Re: NRPE disk check reporting false positive
Posted: Wed Sep 30, 2020 9:27 am
by rferebee
Here you go:
Code:
$ /usr/local/nagios/libexec/check_disk -V
check_disk v1.4.14 (nagios-plugins 1.4.14)
Code:
$ /usr/local/nagios/libexec/check_disk -w 5% -c 3% -p "/u04/dbs" -v
DISK CRITICAL - free space: /u04/dbs 0 MB (0% inode=99%);| /u04/dbs=4653354MB;12685312;12952371;0;13352960
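The warn and crit values in the perfdata above are consistent with -w 5% -c 3% applied to a 13352960 MB filesystem. A sketch of that arithmetic, assuming the thresholds are expressed as used MB and that a state is raised once used space reaches the threshold. That is a simplification, not the plugin's actual logic, and both function names are hypothetical:

```shell
#!/bin/sh
# threshold_mb MAX_MB FREE_PCT   (hypothetical helper)
# Used-space threshold in MB that corresponds to FREE_PCT percent free.
threshold_mb() {
    awk -v max="$1" -v pct="$2" 'BEGIN { printf "%d\n", max * (100 - pct) / 100 }'
}

# expected_state USED_MB WARN_MB CRIT_MB   (hypothetical helper)
# State implied by the perfdata numbers alone, under the assumption above.
expected_state() {
    awk -v u="$1" -v w="$2" -v c="$3" 'BEGIN {
        if (u + 0 >= c + 0)      print "CRITICAL"
        else if (u + 0 >= w + 0) print "WARNING"
        else                     print "OK"
    }'
}

threshold_mb 13352960 5                    # warn threshold in used MB
threshold_mb 13352960 3                    # crit threshold in used MB
expected_state 4653354 12685312 12952371   # values from the perfdata above
```

This prints 12685312, 12952371 and OK. The perfdata numbers on their own point to an OK state, so the CRITICAL result comes from the 0 MB free value in the status text rather than from the thresholds.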
Re: NRPE disk check reporting false positive
Posted: Wed Sep 30, 2020 5:47 pm
by ssax
You're running a really old version of that plugin. Try this (run these commands on the remote system):
- This will download the latest version, compile it, and test it. It will not update your installed plugin unless you run the \cp command at the end.
Code:
cd /tmp
wget https://nagios-plugins.org/download/nagios-plugins-2.3.3.tar.gz
tar zxf nagios-plugins-2.3.3.tar.gz
cd nagios-plugins-2.3.3
./configure
make all
cd plugins
./check_disk -w 5% -c 3% -p "/u04/dbs" -v
If that one shows properly, you can do this to upgrade the current plugin:
Code:
\cp -f /tmp/nagios-plugins-2.3.3/plugins/check_disk /usr/local/nagios/libexec/check_disk
Re: NRPE disk check reporting false positive
Posted: Thu Oct 01, 2020 10:21 am
by rferebee
That did the trick, thank you very much.
You can lock this thread.
Re: NRPE disk check reporting false positive
Posted: Thu Oct 01, 2020 10:55 am
by scottwilkerson
rferebee wrote: That did the trick, thank you very much.
You can lock this thread.
Great!
Locking thread