check_disk falsely reporting 0% free Nagios Users

salderman1
Posts: 12
Joined: Tue Jan 21, 2014 11:19 am


Post by salderman1 »

All,
I have a strange scenario on a Solaris 10 server using ZFS. The server has several ZFS pools configured, but only one of them is larger than 1TB. We've been monitoring this server for about a week without issue. This morning check_disk reports the 2TB filesystem as 0% available and is throwing false alarms. In df, the filesystem is only 39% used.

We are using the following command:

Code:

# /opt/csw/libexec/nagios-plugins/check_disk -w 20% -c 10% -R "^/$|oracle$|^/u[0-9]+" -X tmpfs
DISK CRITICAL - free space: / 196670 MB (96% inode=99%); /oracle 954185 MB (76% inode=99%); /u01 92383 MB (92% inode=99%); /u02 199323 MB (49% inode=99%); /u04 11459 MB (57% inode=99%); /u05 133218 MB (44% inode=99%); /u06 0 MB (0% inode=99%); /u10 69086 MB (68% inode=99%);| /=6501MB;162537;182854;0;203172 /oracle=295166MB;999480;1124415;0;1249351 /u01=7910MB;80235;90264;0;100294 /u02=201819MB;320914;361028;0;401143 /u04=8573MB;16025;18028;0;20032 /u05=167159MB;240301;270339;0;300377 /u06=787725MB;1638601;1843426;0;2048252 /u10=31193MB;80224;90252;0;100280
# df -h /u06
Filesystem             size   used  avail capacity  Mounted on
zpool06/u06            2.0T   769G   1.2T    39%    /u06
# /opt/csw/libexec/nagios-plugins/check_disk -vvv -u bytes /u06
calling stat on /u06
For /u06, used_pct=100 free_pct=0 used_units=8.2599e+11 free_units=0 total_units=2.14775e+12 used_inodes_pct=1 free_inodes_pct=99 fsp.fsu_blocksize=512 mult=1
Freespace_units result=0
Freespace% result=0
Usedspace_units result=0
Usedspace_percent result=0
Usedinodes_percent result=0
Freeinodes_percent result=0
DISK OK - free space: /u06 0 B (0% inode=99%);| /u06=2147483647B;;;0;2147483647
This looks like some kind of math problem: the plugin sees 2.14775e+12 bytes of capacity with 8.2599e+11 bytes used, yet reports 0% free. My desktop calculator puts that at 38.5% used.
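
For reference, the arithmetic can be rechecked with a short Python sketch. The figures are copied from the `-vvv` output above; the variable names are only illustrative. The last line notes that the value the plugin printed in its performance data, 2147483647, equals 2^31 - 1. That is an observation about the number itself, not a confirmed diagnosis; the thread does not establish the root cause.

```python
# Figures copied from the `check_disk -vvv -u bytes /u06` output.
total_bytes = 2.14775e12   # total_units
used_bytes = 8.2599e11     # used_units

# Percentage used, and the free space that should have been reported.
used_pct = 100 * used_bytes / total_bytes
free_bytes = total_bytes - used_bytes
free_tib = free_bytes / 2**40   # df -h style binary units

print(f"used: {used_pct:.1f}%")
print(f"free: {free_tib:.2f} TiB")

# Value (and maximum) printed in the plugin's perfdata for /u06.
reported = 2147483647
print("equals 2**31 - 1:", reported == 2**31 - 1)
```

The used percentage and the free space both agree with what `df -h /u06` shows (39% capacity, 1.2T available), so the raw totals the plugin reads appear to be sane and only the derived free figure is wrong.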

Any thoughts? Thanks a bunch!
sreinhardt
-fno-stack-protector
Posts: 4366
Joined: Mon Nov 19, 2012 12:10 pm

Re: check_disk falsely reporting 0% free Nagios Users

Post by sreinhardt »

Interesting. It looks like the performance data is output correctly, but the standard message is not. I will look into this tonight, when I have several large disks to test against. If it does turn out to be a bug, I may ask you to file a bug report on the github.com/nagios-plugins page, but I would like to do some testing first.

/u06 0 MB (0% inode=99%) | u06=787725MB;1638601;1843426;0;2048252
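
A minimal Python sketch of that comparison, assuming the usual `label=value[UOM];warn;crit;min;max` perfdata layout and that the value field is used space (which matches the other filesystems in the original output). The helper name `parse_perfdata` is hypothetical and not part of any Nagios tooling.

```python
import string

def parse_perfdata(item):
    """Split one 'label=value[UOM];warn;crit;min;max' perfdata item."""
    label, _, rest = item.partition("=")
    value, warn, crit, minimum, maximum = rest.split(";")
    # Drop the unit of measure (here "MB") from the value field.
    number = value.rstrip(string.ascii_letters + "%")
    return {
        "label": label,
        "used": float(number),
        "warn": float(warn),
        "crit": float(crit),
        "min": float(minimum),
        "max": float(maximum),
    }

p = parse_perfdata("/u06=787725MB;1638601;1843426;0;2048252")

used_pct = 100 * p["used"] / p["max"]
warn_pct = 100 * p["warn"] / p["max"]
crit_pct = 100 * p["crit"] / p["max"]

print(f"{p['label']}: {used_pct:.1f}% used, {100 - used_pct:.1f}% free")
print(f"warn at {warn_pct:.0f}% used, crit at {crit_pct:.0f}% used")
```

Read this way, the perfdata says /u06 is about 38.5% used, in line with `df`, while the human-readable part of the same output line claims 0% free.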
Nagios-Plugins maintainer exclusively, unless you have other C language bugs with open-source nagios projects, then I am happy to help! Please pm or use other communication to alert me to issues as I no longer track the forum.
salderman1
Posts: 12
Joined: Tue Jan 21, 2014 11:19 am

Re: check_disk falsely reporting 0% free Nagios Users

Post by salderman1 »

sreinhardt wrote: Interesting. It looks like the performance data is output correctly, but the standard message is not. I will look into this tonight, when I have several large disks to test against. If it does turn out to be a bug, I may ask you to file a bug report on the github.com/nagios-plugins page, but I would like to do some testing first.

/u06 0 MB (0% inode=99%) | u06=787725MB;1638601;1843426;0;2048252
Excellent, I'd be happy to. FWIW, I failed to mention the plugins version...

Code:

# pkginfo -l CSWnagios-plugins
   PKGINST:  CSWnagios-plugins
      NAME:  nagios_plugins - Plugins for Nagios
  CATEGORY:  application
      ARCH:  sparc
   VERSION:  1.4.16,REV=2012.07.18
   BASEDIR:  /
    VENDOR:  http://downloads.sourceforge.net/nagiosplug/ packaged for CSW by Juergen Arndt
    PSTAMP:  ja@unstable10s-20120718235350
  INSTDATE:  Jan 16 2014 14:21
   HOTLINE:  http://www.opencsw.org/bugtrack/
     EMAIL:  [email protected]
    STATUS:  completely installed
     FILES:       86 installed pathnames
                   1 shared pathnames
                   5 directories
                  55 executables
                   3 setuid/setgid executables
               12082 blocks used (approx)
sreinhardt
-fno-stack-protector
Posts: 4366
Joined: Mon Nov 19, 2012 12:10 pm

Re: check_disk falsely reporting 0% free Nagios Users

Post by sreinhardt »

Oh, 1.4.2. I think this is already patched. I created a virtual 4TB disk last night and ran tests against it, as well as against a physical 2TB disk, and did not have issues with the present 1.5 code. I would highly suggest updating. abrist and I have made many improvements in the maint branch in the last few days, and the master branch should already contain the fix you are looking for. http://github.com/nagios-plugins
salderman1
Posts: 12
Joined: Tue Jan 21, 2014 11:19 am

Re: check_disk falsely reporting 0% free Nagios Users

Post by salderman1 »

Unfortunately, it would appear that OpenCSW does not provide the version you are referring to. I am running 1.4.16, not 1.4.2 -

Code:

# /opt/csw/libexec/nagios-plugins/check_disk -V
check_disk v1.4.16 (nagios-plugins 1.4.16)
- which seems to be the latest version of nagios_plugins in all of the package trees available through OpenCSW. Can you tell me in which version this issue would have been resolved? It seems like the check_disk.c code for 1.4.16 on github hasn't been touched in ~3 years, and the OpenCSW package was built in July of 2012.

It looks like this fix for blocksize differences might be a possibility, but it seems to be tagged in both 1.5 and 1.4.16. We are running several zpools with a mix of blocksizes. The ZFS default is 128K, and the filesystem in this issue has that size, but there are other ZFS filesystems on the server which have 8K blocksizes for Oracle DB files.
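
To make the blocksize angle concrete, here is a rough Python sketch of how free space is typically derived from block counts (as returned by `statvfs`) and how the same capacity turns into very different block counts depending on the block size. The function names are hypothetical, and whether check_disk 1.4.16 on Solaris actually runs into a 32-bit limit this way is an assumption the thread does not confirm.

```python
import os

def space_from_counts(block_size, total_blocks, avail_blocks):
    """Return (total_bytes, avail_bytes, free_pct) from block counts."""
    total = block_size * total_blocks
    avail = block_size * avail_blocks
    free_pct = 100.0 * avail / total if total else 0.0
    return total, avail, free_pct

def space_for_path(path):
    """Same calculation using live statvfs data (Unix only)."""
    st = os.statvfs(path)
    return space_from_counts(st.f_frsize, st.f_blocks, st.f_bavail)

# Approximate capacity of /u06, taken from total_units in the -vvv output.
total_bytes = 2_147_750_000_000

# The -vvv output reports fsu_blocksize=512; 128K is the ZFS default
# record size mentioned in this post.
for block_size in (512, 128 * 1024):
    blocks = total_bytes // block_size
    fits = blocks <= 2**31 - 1
    print(f"block size {block_size}: {blocks} blocks, "
          f"fits in signed 32 bits: {fits}")
```

With 512-byte units a 2TB filesystem needs more than 2^31 - 1 blocks, while with 128K units it does not, so any code path that holds a block count in a signed 32-bit integer would only misbehave on the largest pool. Again, this is a plausible direction to check rather than an established cause.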

Thanks for looking into this for me.
sreinhardt
-fno-stack-protector
Posts: 4366
Joined: Mon Nov 19, 2012 12:10 pm

Re: check_disk falsely reporting 0% free Nagios Users

Post by sreinhardt »

My mistake on the version, now I'm not sure where I got 1.4.2 from... Anyway, the NEWS file states that it should have been resolved by the changes to plugins/check_disk.c and lib/disk_utils.c in 1.4.16, which should mean that you have the fix. Apparently that is not the case. I will have to set up an OpenSolaris or OpenCSW system with some virtual disks in zpools to test it tonight. Also, just a note: any changes relevant to this would likely be in disk_utils.c, not check_disk.c, as the former holds most of the functions for doing the calculations we need (at least from the brief look I gave it).
salderman1
Posts: 12
Joined: Tue Jan 21, 2014 11:19 am

Re: check_disk falsely reporting 0% free Nagios Users

Post by salderman1 »

sreinhardt wrote: My mistake on the version, now I'm not sure where I got 1.4.2 from... Anyway, the NEWS file states that it should have been resolved by the changes to plugins/check_disk.c and lib/disk_utils.c in 1.4.16, which should mean that you have the fix. Apparently that is not the case. I will have to set up an OpenSolaris or OpenCSW system with some virtual disks in zpools to test it tonight. Also, just a note: any changes relevant to this would likely be in disk_utils.c, not check_disk.c, as the former holds most of the functions for doing the calculations we need (at least from the brief look I gave it).
Thanks, and thanks for the correction. It's only been twenty years since I studied Kernighan and Ritchie in college :twisted: Please forgive me.
sreinhardt
-fno-stack-protector
Posts: 4366
Joined: Mon Nov 19, 2012 12:10 pm

Re: check_disk falsely reporting 0% free Nagios Users

Post by sreinhardt »

Haha, no worries. Good old K&R, still nothing quite like it.
salderman1
Posts: 12
Joined: Tue Jan 21, 2014 11:19 am

Re: check_disk falsely reporting 0% free Nagios Users

Post by salderman1 »

sreinhardt wrote: I will have to set up an OpenSolaris or OpenCSW system with some virtual disks in zpools to test it tonight.
Hi, I don't mean to be a pest, but I was curious whether you had any luck with your testing.

Thanks!
sreinhardt
-fno-stack-protector
Posts: 4366
Joined: Mon Nov 19, 2012 12:10 pm

Re: check_disk falsely reporting 0% free Nagios Users

Post by sreinhardt »

Yes, I did have a chance to test some aspects of it. I got a ZFS pool created with ~3TB of space, but was unable to replicate the issue. I was testing on Linux rather than OpenCSW, though, and am thinking that must be the difference. Once I finish getting that set up I will post back again. Feel free to bug me all you want, it's a good reminder. :D