
check_disk reports "CRITICAL" but there is space left!?

Posted: Tue Jul 17, 2012 4:10 am
by Ocsic
Hi!

I'm quite new to Nagios, but it's great what it makes possible. For some weeks everything worked fine, and I added our servers to the monitoring step by step. But since yesterday evening there has been a problem with our Mac servers: check_disk reports CRITICAL for several disks, claiming 0 GB free space left, and all of them changed within 5 minutes (they were OK until then). If I run "df" on the console, it shows 33% free space on exactly the same partitions that check_disk says are full.

I'm wondering why this happens now, because it worked until yesterday evening without any problems. The drives that appear full are AFP shares mounted on the Mac server.

Can anybody tell me why check_disk reports CRITICAL while df says there is enough space left? It is also still possible to access the space from the server and add, modify, or delete files. I'm extremely confused...

Thanks so far

Re: check_disk reports "CRITICAL" but there is space left!?

Posted: Wed Jul 18, 2012 1:47 am
by Ocsic
I will add some details here in case they help. Running the plugin as root, with or without the SUID bit set, does not solve the problem; the output is identical in both cases:

Command: check_disk -w 20% -c 10% -W 20% -K 10% -e -x /dev -u GB
Thresholds(pct) for / warn: 20,000000 crit 10,000000
calling stat on /
For /, total=38939382, available=35727252, available_to_root=35791252, used=3148130, fsp.fsu_files=38939380, fsp.fsu_ffree=35727252
For /, used_pct=9 free_pct=91 used_units=12 free_units=136 total_units=148 used_inodes_pct=9 free_inodes_pct=91 fsp.fsu_blocksize=4096 mult=1073741824
Freespace_units result=0
Freespace% result=0
Usedspace_units result=0
Usedspace_percent result=0
Usedinodes_percent result=0
Freeinodes_percent result=0
[...]
Thresholds(pct) for /Volumes/Daten warn: 20,000000 crit 10,000000
calling stat on /Volumes/Daten
For /Volumes/Daten, total=3173743510, available=0, available_to_root=2188914777, used=984828733, fsp.fsu_files=3173743508, fsp.fsu_ffree=2188914777
For /Volumes/Daten, used_pct=100 free_pct=0 used_units=3756 free_units=0 total_units=12106 used_inodes_pct=32 free_inodes_pct=68 fsp.fsu_blocksize=4096 mult=1073741824
Freespace_units result=0
Freespace% result=2
Usedspace_units result=0
Usedspace_percent result=0
Usedinodes_percent result=0
Freeinodes_percent result=0
[...]
DISK CRITICAL - free space: / 136 GB (91% inode=91%); /Volumes/Daten 0 GB (0% inode=68%);| /=12GB;118;133;0;148 /Volumes/Daten=3756GB;9684;10895;0;12106
It's confusing that the output says "available=0" in both cases, because one of the calls used the SUID bit with root as the owner of the file.
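To cross-check the "available" and "available_to_root" figures outside the plugin, the raw block counts can be read directly from the filesystem. This is only a minimal sketch: it assumes the two debug fields correspond to the statvfs values f_bavail and f_bfree, which the thread does not confirm, and block_counts is a hypothetical helper name.

```python
import os


def block_counts(path):
    """Return the raw statvfs block counts for the filesystem holding path."""
    st = os.statvfs(path)
    return {
        "block_size": st.f_frsize,
        "total": st.f_blocks,
        # Free blocks including those reserved for the superuser
        # (assumed to be what the plugin prints as available_to_root).
        "available_to_root": st.f_bfree,
        # Free blocks usable by unprivileged users
        # (assumed to be what the plugin prints as available).
        "available": st.f_bavail,
    }


if __name__ == "__main__":
    # Replace "/" with the mount point in question, e.g. /Volumes/Daten.
    for name, value in block_counts("/").items():
        print(f"{name}={value}")
```

If "available" comes back as a large positive number here while the plugin prints 0, the difference would lie in how the plugin reads or interprets the value rather than in the filesystem itself.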

/Volumes/Daten-Mount:
[...]
/dev/disk1s2 on /Volumes/Daten (hfs, NFS exported, local, journaled)
[...]
df -ki says:
[...]
Filesystem 1024-blocks Used Avail Capacity iused ifree %iused Mounted on
[...]
/dev/disk1s2 12694974040 3939316144 8755657896 32% 984828924 2188914584 31% /Volumes/Daten
[...]
If I calculated correctly, 31% used inodes is right and should not be affected by rounding (I get 31,0305...%). So is this a hint? check_disk reports 32% inodes used (that is, 68% free)!? That matches the "Capacity" column, but if I calculate that value manually I get 31,0305...% too!

I thought that inode usage needn't match the used disk space in this case!? So is it just a coincidence that the result is the same?!
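The percentages can be recomputed from the df -ki line for /Volumes/Daten. The sketch below uses only the numbers quoted above; that the "Capacity" column is rounded up while "%iused" is rounded down is an assumption that happens to fit these figures, and which rounding check_disk applies is not established in this thread.

```python
import math

# Figures copied from the df -ki line for /Volumes/Daten.
used_kb, avail_kb = 3939316144, 8755657896
iused, ifree = 984828924, 2188914584

# Exact usage percentages before any rounding.
space_pct = 100 * used_kb / (used_kb + avail_kb)
inode_pct = 100 * iused / (iused + ifree)

print(f"space used:  {space_pct:.4f}%")
print(f"inodes used: {inode_pct:.4f}%")

# Both values sit just above 31, so the displayed figure depends
# entirely on the rounding direction.
print("rounded down:", math.floor(space_pct), math.floor(inode_pct))
print("rounded up:  ", math.ceil(space_pct), math.ceil(inode_pct))
```

Both ratios come out at roughly 31.03%, so 31% and 32% are consistent with the same underlying value under different rounding. The 32% inode figure therefore does not by itself point to the cause of "available=0".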

I don't know what to do. For one week it worked fine but now... :(

Re: check_disk reports "CRITICAL" but there is space left!?

Posted: Wed Jul 18, 2012 8:05 pm
by jsmurphy
I've personally never seen this behaviour before, and it's possible you haven't had a response from anyone else because they haven't either... I would consider posting to the nagios-plugins help mailing list, as someone who works on the project might have come across it: http://nagiosplugins.org/support

Re: check_disk reports "CRITICAL" but there is space left!?

Posted: Thu Jul 19, 2012 12:06 am
by Ocsic
OK, thanks for the advice. I hoped not to be the only one with this problem.

So I'll try the mailing list.