
NRPE disk check reporting false positive

Posted: Thu Sep 24, 2020 1:28 pm
by rferebee
Good morning team,

I'm trying to troubleshoot an issue that occurs intermittently. We have several NRPE-based disk checks running against various Linux hosts, and every so often one of them will report that a disk is completely full for about a week, after which the alert clears on its own.

Here's an example of the command being run:

Code: Select all

/usr/local/nagios/libexec/check_nrpe -2 -H xx.xx.xx.xx -c check_disk1 -t 30 -a 5% 3% "/u04/dbs"
DISK CRITICAL - free space: /u04/dbs 0 MB (0% inode=99%);| /u04/dbs=4646735MB;12685312;12952371;0;13352960

Here's how the check_nrpe command is configured:

Code: Select all

$USER1$/check_nrpe -2 -H $HOSTADDRESS$ -t 30 -c $ARG1$ $ARG2$

And finally, here's a df run on the host in question showing that the disk is not full:

Code: Select all

@mondo:/home/admsa>df -gt
Filesystem    GB blocks      Used      Free %Used Mounted on
/dev/hd4           1.25      0.24      1.01   20% /
/dev/hd2           5.31      4.18      1.13   79% /usr
/dev/hd9var        2.53      0.59      1.94   24% /var
/dev/hd3           3.00      0.05      2.95    2% /tmp
/dev/hd1           2.53      0.00      2.53    1% /home
/proc                 -         -         -    - /proc
/dev/hd10opt       3.03      1.33      1.70   44% /opt
/dev/livedump      0.25      0.00      0.25    1% /var/adm/ras/livedump
/dev/lvu01dbs     50.00     17.72     32.28   36% /u01/dbs
/dev/lvu03dbs     15.00      9.57      5.43   64% /u03/dbs
/dev/lvu02dbs   6200.12   5616.07    584.05   91% /u02/dbs
/dev/lvexport    240.00      1.72    238.28    1% /dbexports
/dev/lvoracle     50.00     31.66     18.34   64% /u01/app/oracle
chance:/admin     150.00    123.38     26.62   83% /admin
***/dev/lvu04dbs  13040.00   4517.10   8522.90   35% /u04/dbs***
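
The performance data at the end of the check output above can be cross-checked against the "0 MB" status text. The following is a minimal sketch, assuming the usual Nagios perfdata layout `label=value[UOM];warn;crit;min;max`, with `value` being used space and `max` the filesystem size, both in MB. The helper name `perfdata_free` is hypothetical.

```shell
# Sketch: derive free space from one Nagios perfdata item.
# Assumption: layout is label=value[UOM];warn;crit;min;max, where value is
# used MB and max is the filesystem size in MB.
# perfdata_free is a hypothetical helper, not part of nagios-plugins.
perfdata_free() {
    printf '%s\n' "$1" | awk -F'[=;]' '{
        used = $2
        sub(/MB$/, "", used)          # strip the unit suffix
        max = $6
        printf "%d MB free (%.1f%%)\n", max - used, (max - used) * 100 / max
    }'
}

perfdata_free '/u04/dbs=4646735MB;12685312;12952371;0;13352960'
# prints: 8706225 MB free (65.2%)
```

Under those assumptions the perfdata agrees with the df output (roughly 35% used), so only the status text of the check looks wrong.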

Re: NRPE disk check reporting false positive

Posted: Fri Sep 25, 2020 1:07 pm
by benjaminsmith
Hi @rferebee,

Let's check the status of the inodes on this system. What is the output of the following command?

Code: Select all

df -ih
Benjamin

Re: NRPE disk check reporting false positive

Posted: Fri Sep 25, 2020 1:29 pm
by rferebee
The system didn't accept df -ih (this df does not recognize the -h flag). Here is the output of a couple of other df commands:

Code: Select all

# df -ih
df: Not a recognized flag: h
Usage: df  [-P] | [-IMitvc] [-gkm] [-s] [-T {local|remote|vfstype}] [-F {hdr1 hdr2 hdr3}] [filesystem ...] [file ...]
# df -i
Filesystem    512-blocks      Free %Used    Iused %Iused Mounted on
/dev/hd4         2621440   2110112   20%    17313     7% /
/dev/hd2        11141120   2375496   79%    65863    19% /usr
/dev/hd9var      5308416   4052496   24%     5707     2% /var
/dev/hd3         6291456   6185288    2%     2250     1% /tmp
/dev/hd1         5308416   5299432    1%      167     1% /home
/proc                  -         -    -        -      - /proc
/dev/hd10opt     6356992   3580800   44%    11310     3% /opt
/dev/livedump     524288    523552    1%        4     1% /var/adm/ras/livedump
/dev/lvu01dbs  104857600  67695696   36%       31     1% /u01/dbs
/dev/lvu03dbs   31457280  11389760   64%       17     1% /u03/dbs
/dev/lvu02dbs 13002604544 1224849408   91%      196     1% /u02/dbs
/dev/lvu04dbs 27346862080 17815081904   35%      282     1% /u04/dbs
/dev/lvexport  503316480 499699064    1%       34     1% /dbexports
/dev/lvoracle  104857600  38379640   64%    83237     2% /u01/app/oracle
chance:/admin   314572800  73694856   77%    59329     1% /admin
# df -gt
Filesystem    GB blocks      Used      Free %Used Mounted on
/dev/hd4           1.25      0.24      1.01   20% /
/dev/hd2           5.31      4.18      1.13   79% /usr
/dev/hd9var        2.53      0.60      1.93   24% /var
/dev/hd3           3.00      0.05      2.95    2% /tmp
/dev/hd1           2.53      0.00      2.53    1% /home
/proc                 -         -         -    - /proc
/dev/hd10opt       3.03      1.32      1.71   44% /opt
/dev/livedump      0.25      0.00      0.25    1% /var/adm/ras/livedump
/dev/lvu01dbs     50.00     17.72     32.28   36% /u01/dbs
/dev/lvu03dbs     15.00      9.57      5.43   64% /u03/dbs
/dev/lvu02dbs   6200.12   5616.07    584.05   91% /u02/dbs
/dev/lvu04dbs  13040.00   4545.11   8494.89   35% /u04/dbs
/dev/lvexport    240.00      1.72    238.28    1% /dbexports
/dev/lvoracle     50.00     31.70     18.30   64% /u01/app/oracle
chance:/admin     150.00    114.86     35.14   77% /admin
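
The `df -i` listing above reports sizes in 512-byte blocks, while `df -gt` reports GB, so the two are easier to compare after a unit conversion. A minimal sketch of that arithmetic follows; `blocks_to_gb` is a hypothetical helper name, and GB is taken here to mean 1024^3 bytes.

```shell
# Sketch: convert a count of 512-byte blocks (as printed by `df -i` above)
# to GB (1024^3 bytes) for comparison with the `df -gt` columns.
# blocks_to_gb is a hypothetical helper.
blocks_to_gb() {
    awk -v blocks="$1" 'BEGIN { printf "%.2f\n", blocks * 512 / (1024 * 1024 * 1024) }'
}

blocks_to_gb 27346862080   # total blocks for /u04/dbs, prints 13040.00
blocks_to_gb 17815081904   # free blocks for /u04/dbs, prints 8494.89
```

Both results match the `GB blocks` and `Free` columns that `df -gt` shows for /u04/dbs, so the two df views of the filesystem are consistent with each other.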

Re: NRPE disk check reporting false positive

Posted: Mon Sep 28, 2020 2:25 pm
by ssax
What plugin is check_disk1 calling? Please attach the plugin from the remote system so I can look at it.

Please also send the command definition for check_disk1 from nrpe.cfg on the remote system so we can see how it's set up.

Re: NRPE disk check reporting false positive

Posted: Mon Sep 28, 2020 2:35 pm
by rferebee
'check_disk1' is calling 'check_disk' on the remote system. Plugin attached.

Here's the cat for nrpe.cfg on the system in question:

Code: Select all

command[check_ss]=/usr/local/nagios/libexec/check_ss.pl -c $ARG1$
command[check_rp]=/usr/local/nagios/libexec/check_rp.pl -t $ARG1$ -c $ARG2$
command[check_cosmos]=/usr/local/nagios/libexec/cosmos_check.pl
command[check_dns]=/usr/local/nagios/libexec/check_dns yahoo.com
command[check_active_procs]=/usr/local/nagios/libexec/check_procs -c $ARG1$ -a $ARG2$
command[check_mem]=/usr/local/nagios/libexec/check_mem -f -w $ARG1$ -c $ARG2$
command[check_mem2]=/usr/local/nagios/libexec/check_mem2 -w $ARG1$ -c $ARG2$
command[check_users]=/usr/local/nagios/libexec/check_users -w $ARG1$ -c $ARG2$
command[check_load]=/usr/local/nagios/libexec/check_load -w $ARG1$ -c $ARG2$
command[check_disk1]=/usr/local/nagios/libexec/check_disk -w $ARG1$ -c $ARG2$ -p $ARG3$
command[check_disk2]=/usr/local/nagios/libexec/check_disk -w $ARG1$ -c $ARG2$ -p $ARG3$
command[check_zombie_procs]=/usr/local/nagios/libexec/check_procs -w $ARG1$ -c $ARG2$ -s Z
command[check_total_procs]=/usr/local/nagios/libexec/check_procs -w $ARG1$ -c $ARG2$
command[check_procs]=/usr/local/nagios/libexec/check_procs -w $ARG1$ -c $ARG2$ -a $ARG3$
command[check_java_procs]=/usr/local/nagios/libexec/check_procs -w $ARG1$ -P $ARG2$ -a $ARG3$
command[check_oracle]=/usr/local/nagios/libexec/check_oracle --login $ARG1$
command[check_procs_multi]=/usr/local/nagios/libexec/check_procs -c $ARG1$ -C $ARG2$ -a '$ARG3$'
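
To make the mapping explicit: the values passed after `-a` on the check_nrpe command line fill `$ARG1$`, `$ARG2$`, and `$ARG3$` in the `check_disk1` definition, in order. The sketch below only imitates that substitution with sed for illustration; it is not how NRPE itself is implemented, and `expand_nrpe_command` is a hypothetical helper.

```shell
# Sketch: imitate NRPE's $ARGn$ substitution to see which command line
# the remote host ends up running. Illustration only, not NRPE's own code.
# Limitation: values containing |, & or \ would confuse the sed expression.
expand_nrpe_command() {
    expanded=$1     # command template from nrpe.cfg
    shift           # remaining arguments are the -a values
    n=1
    for value in "$@"; do
        expanded=$(printf '%s\n' "$expanded" | sed "s|\\\$ARG${n}\\\$|${value}|g")
        n=$((n + 1))
    done
    printf '%s\n' "$expanded"
}

expand_nrpe_command \
    '/usr/local/nagios/libexec/check_disk -w $ARG1$ -c $ARG2$ -p $ARG3$' \
    '5%' '3%' '/u04/dbs'
# prints: /usr/local/nagios/libexec/check_disk -w 5% -c 3% -p /u04/dbs
```

The expanded line is what can be run by hand on the remote host to reproduce the result without going through NRPE.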

Re: NRPE disk check reporting false positive

Posted: Tue Sep 29, 2020 4:36 pm
by ssax
Please SSH into the remote system, run these commands, and attach the full output:

Code: Select all

su - nagios
/usr/local/nagios/libexec/check_disk -V
/usr/local/nagios/libexec/check_disk -w 5% -c 3% -p "/u04/dbs" -v

Re: NRPE disk check reporting false positive

Posted: Wed Sep 30, 2020 9:27 am
by rferebee
Here you go:

Code: Select all

$ /usr/local/nagios/libexec/check_disk -V
check_disk v1.4.14 (nagios-plugins 1.4.14)

Code: Select all

$ /usr/local/nagios/libexec/check_disk -w 5% -c 3% -p "/u04/dbs" -v
DISK CRITICAL - free space: /u04/dbs 0 MB (0% inode=99%);| /u04/dbs=4653354MB;12685312;12952371;0;13352960
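
The warn and crit fields in this perfdata (12685312 and 12952371) correspond to 95% and 97% of the 13352960 MB maximum, that is, the `-w 5%` and `-c 3%` free-space thresholds expressed as used space. A rough sketch of the state those numbers imply is below. It assumes the thresholds are used-space limits in MB and uses a simple greater-or-equal comparison, which may differ from check_disk's exact rule; `perfdata_state` is a hypothetical helper.

```shell
# Sketch: work out the state implied by the perfdata numbers alone.
# Assumptions: fields are value;warn;crit;min;max in MB, warn and crit are
# used-space limits, and >= is close enough to the plugin's comparison.
# perfdata_state is a hypothetical helper.
perfdata_state() {
    printf '%s\n' "$1" | awk -F'[=;]' '{
        used = $2
        sub(/MB$/, "", used)              # strip the unit suffix
        used += 0; warn = $3 + 0; crit = $4 + 0
        if (used >= crit)      print "CRITICAL"
        else if (used >= warn) print "WARNING"
        else                   print "OK"
    }'
}

perfdata_state '/u04/dbs=4653354MB;12685312;12952371;0;13352960'
# prints: OK
```

Under these assumptions the perfdata points to an OK state, which is at odds with the CRITICAL status line the v1.4.14 plugin printed.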

Re: NRPE disk check reporting false positive

Posted: Wed Sep 30, 2020 5:47 pm
by ssax
You're running a really old version of that plugin. Try this (run these commands on the remote system):
- This will download the latest version, compile it, and test it. It will not update your installed plugin unless you run the \cp command at the end.

Code: Select all

cd /tmp
wget https://nagios-plugins.org/download/nagios-plugins-2.3.3.tar.gz
tar zxf nagios-plugins-2.3.3.tar.gz
cd nagios-plugins-2.3.3
./configure
make all
cd plugins
./check_disk -w 5% -c 3% -p "/u04/dbs" -v

If that one shows properly, you can do this to upgrade the current plugin:

Code: Select all

\cp -f /tmp/nagios-plugins-2.3.3/plugins/check_disk /usr/local/nagios/libexec/check_disk
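
The leading backslash in `\cp` bypasses any shell alias for cp (for example an alias to `cp -i`), so the copy does not stop to prompt. If a rollback path is wanted, the copy can be wrapped so that the old binary is kept first. This is a minimal sketch: `install_plugin` and the `.bak` suffix are hypothetical choices, not something nagios-plugins provides.

```shell
# Sketch: replace a plugin binary, keeping a copy of the old one first.
# install_plugin and the .bak suffix are hypothetical, for illustration.
install_plugin() {
    new=$1        # freshly built plugin
    current=$2    # installed plugin to replace
    [ -x "$new" ] || { echo "not executable: $new" >&2; return 1; }
    if [ -e "$current" ]; then
        # keep the old binary so the change can be reverted
        cp -p "$current" "$current.bak" || return 1
    fi
    cp -f "$new" "$current"
}

# Example call, using the paths from this thread:
# install_plugin /tmp/nagios-plugins-2.3.3/plugins/check_disk \
#     /usr/local/nagios/libexec/check_disk
```

To revert, copy the `.bak` file back over the installed plugin.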

Re: NRPE disk check reporting false positive

Posted: Thu Oct 01, 2020 10:21 am
by rferebee
That did the trick, thank you very much.

You can lock this thread.

Re: NRPE disk check reporting false positive

Posted: Thu Oct 01, 2020 10:55 am
by scottwilkerson
rferebee wrote: That did the trick, thank you very much.

You can lock this thread.

Great!

Locking thread