
Cannot Get Correct Values from check_local_disk

Posted: Thu Mar 06, 2014 1:29 pm
by clombardo
We are getting warning messages from check_local_disk when we should NOT be. I have been checking the Linux server's disk usage with df -h:
[root@ourserver etc]# df -h
Filesystem Size Used Avail Use% Mounted on
/dev/mapper/vg_livm37-lv_root 11T 435G 9.8T 5% /
tmpfs 253G 0 253G 0% /dev/shm
/dev/sda1

check_local_disk is returning this warning message:
[1393909200] CURRENT SERVICE STATE: ourserver.com;Root Partition;OK;HARD;1;DISK OK - free space: / 124402 MB (26% inode=99%):

The above two do not match.

Here is the setting in minimal.cfg:
# Define a service to check the disk space of the root partition
# on the local machine. Warning if < 20% free, critical if
# < 10% free space on partition.
#
define service{
use generic-service ; Name of service template to use
host_name ourserver.com

service_description Root Partition
is_volatile 0
check_period 24x7
max_check_attempts 4
normal_check_interval 5
retry_check_interval 1
contact_groups admins
notification_options w,u,c,r
notification_interval 960
notification_period 24x7
check_command check_local_disk!20%!10%!/
}

Also see attached screen shot of our Nagios GUI warning message for this server.

So what am I doing wrong in getting these warning messages when I should not be getting them?

Is there another configuration file that needs to be set?

Please help.

Re: Cannot Get Correct Values from check_local_disk

Posted: Thu Mar 06, 2014 4:22 pm
by sreinhardt
Which plugin are you using, and how is the check_local_disk command defined? Is this actually a local disk check, or something done via SNMP or similar?

Re: Cannot Get Correct Values from check_local_disk

Posted: Thu Mar 06, 2014 4:39 pm
by clombardo
We are using Nagios 3.2.3

We have Nagios installed on its own server and we are monitoring many other servers remotely from the Nagios server.

Here is the check_local_disk definition:
# 'check_local_disk' command definition
define command{
command_name check_local_disk
command_line $USER1$/check_disk -w $ARG1$ -c $ARG2$ -p $ARG3$
}

Re: Cannot Get Correct Values from check_local_disk

Posted: Thu Mar 06, 2014 5:42 pm
by sreinhardt
That check_local_disk command only checks the local Nagios system. Was that your intention?
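
To illustrate, here is the same definition from your post with comments added. The comments are my reading of how the macros expand for your service, not output from your system:

```
# 'check_local_disk' command definition (as posted, annotated)
define command{
        command_name    check_local_disk
        # For check_command check_local_disk!20%!10%!/ the arguments expand to:
        #   $ARG1$ = 20%   $ARG2$ = 10%   $ARG3$ = /
        # $USER1$ is the plugin directory on the Nagios server.
        # There is no $HOSTADDRESS$ (or any other host macro) here, so
        # check_disk runs on the Nagios server itself for every host the
        # service is attached to, and reports the Nagios server's own disk.
        command_line    $USER1$/check_disk -w $ARG1$ -c $ARG2$ -p $ARG3$
        }
```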

Re: Cannot Get Correct Values from check_local_disk

Posted: Thu Mar 06, 2014 5:46 pm
by abrist
Are you sure the df output is correct? (253gb)!!?
Could you run the check with max verbosity and post the output:

./check_disk -w 20% -c 10% -vvv -p /

Re: Cannot Get Correct Values from check_local_disk

Posted: Fri Mar 07, 2014 9:16 am
by clombardo
You are correct, sreinhardt! I just checked, and the warning message (less than 20% disk space) matches what I see when I run df -h on the local Nagios machine. That was not our intention. We want to monitor all the remote servers listed in the host_name field of the minimal.cfg file. Here is the service definition we have for the remote servers we want to monitor (see the attachment for the complete minimal.cfg):

# Define a service to check the disk space of the root partition
# on the local machine. Warning if < 20% free, critical if < 10% free space on partition.
define service{
use generic-service ; Name of service template to use
host_name nshserver.livm.net, pegasus.livm.net, pegasus2.livm.net, medcomp.livm.net, nattis.livm.net, scan.livm.net, liesc.livm.net, liescDR.livm.net, leps.livm.net, liasc.livm.net, liascDR.livm.net


service_description Root Partition
is_volatile 0
check_period 24x7
max_check_attempts 4
normal_check_interval 5
retry_check_interval 1
contact_groups admins
notification_options w,u,c,r
notification_interval 960
notification_period 24x7
check_command check_local_disk!20%!10%!/
}

So what do I have to configure in order to monitor our remote machines as listed in the host_name field above?

Re: Cannot Get Correct Values from check_local_disk

Posted: Fri Mar 07, 2014 1:19 pm
by sreinhardt
You have a few options. You could configure those systems with a remote agent, such as NCPA, that executes the plugins on the remote systems and returns the values. Otherwise, you would need to set up SNMP and SNMP checks, or use check_by_ssh to execute the plugins remotely instead of running an agent as a daemon on the remote systems. Once you decide how you would like to do this, we can work on setting up the config options from there.
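
As a rough idea of what the check_by_ssh route could look like, here is a minimal sketch. It is not a tested configuration. The command name check_disk_by_ssh is one I made up for illustration. It also assumes that the nagios user on the Nagios server can log in to each remote host with key-based SSH as a user named nagios, and that the plugins are installed under /usr/local/nagios/libexec on the remote hosts. Adjust all of these to your environment.

```
# Hypothetical command: run check_disk ON the remote host over SSH.
# $HOSTADDRESS$ is the address of whichever host the service is attached to.
define command{
        command_name    check_disk_by_ssh
        # ASSUMPTIONS: remote login user "nagios", key-based auth already set up,
        # remote plugin path /usr/local/nagios/libexec
        command_line    $USER1$/check_by_ssh -H $HOSTADDRESS$ -l nagios -C "/usr/local/nagios/libexec/check_disk -w $ARG1$ -c $ARG2$ -p $ARG3$"
        }

# Same service as in your post, pointed at the new command.
# Only two of your hosts are listed here to keep the example short.
define service{
        use                     generic-service
        host_name               pegasus.livm.net, pegasus2.livm.net
        service_description     Root Partition
        is_volatile             0
        check_period            24x7
        max_check_attempts      4
        normal_check_interval   5
        retry_check_interval    1
        contact_groups          admins
        notification_options    w,u,c,r
        notification_interval   960
        notification_period     24x7
        check_command           check_disk_by_ssh!20%!10%!/
        }
```

The only real difference from your current setup is that the command line now references $HOSTADDRESS$, so each check runs against the host it belongs to instead of the Nagios server.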