Disk space check that alerts on % rate of growth
Posted: Mon Dec 03, 2012 9:19 am
Hi All,
First post in a while and I'm on the search for some information.
Nagios Environment:
Nagios 3.2.1
PnP4Nagios 0.6.19-R.36
OS - RHEL 5.8
Scenario:
I have a mixed operating system environment consisting of Linux, AIX, and Windows 2000/2003/2008.
From time to time one of our servers will suddenly eat through all its disk space in a short period of time, and by the time someone notices that the critical alert has gone off, it's too late to take remedial action. Yes, we could use warnings, but the people responding to alerts are still struggling with the concept of dealing with critical alerts in a timely manner.
We want to look at the rate of growth on a disk and see if it is trending at a level where the disk may fill up IF its projected trajectory continues.
Yes, this may raise a few false positives as we tune our timeperiods / growth rates but we feel it may be useful in the long term.
I wish to create a new Nagios check/plugin to measure the trajectory of disk space use over a period of time.
check_disk_trajectory
This plugin would look at the rate of disk space consumption over a user-defined time period and alert if the growth exceeds a threshold. I'm thinking of something like the following:
./check_disk_trajectory -H HOSTNAME -p <disk/partition> -t <trajectory time period specified in minutes> -w <warning level in % change> -c <critical level in % change>
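To make the proposed -w and -c semantics concrete, here is a minimal Python sketch of the threshold logic only. It assumes the plugin has already obtained the percent-used figure at the start and end of the -t window; how those two values are collected is left open. The function name evaluate_growth and the output wording are hypothetical. The exit codes follow the standard Nagios plugin convention (0 OK, 1 WARNING, 2 CRITICAL).

```python
# Standard Nagios plugin exit codes.
OK, WARNING, CRITICAL = 0, 1, 2


def evaluate_growth(start_pct, end_pct, warn, crit):
    """Map the change in percent-used over the window to a Nagios status.

    start_pct / end_pct: disk usage in percent at the start and end of the window.
    warn / crit: thresholds in percentage points of change (the -w and -c values).
    Returns (exit_code, status_line).
    """
    change = end_pct - start_pct
    if change >= crit:
        code, label = CRITICAL, "CRITICAL"
    elif change >= warn:
        code, label = WARNING, "WARNING"
    else:
        code, label = OK, "OK"
    # The text after the pipe is performance data: value;warn;crit
    line = "DISK TRAJECTORY %s - usage changed by %.1f%% | change=%.1f%%;%s;%s" % (
        label, change, change, warn, crit)
    return code, line


if __name__ == "__main__":
    import sys
    # Example values only: 40% used at the start of the window, 52% at the end.
    code, line = evaluate_growth(40.0, 52.0, warn=5, crit=10)
    print(line)
    sys.exit(code)
```

Treating the thresholds as percentage points of the disk, rather than relative growth of the used space, is one possible reading of "% change"; the other reading would divide the change by start_pct.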
Questions:
1. Has anyone implemented something like this before?
2. Is there a plugin that does this already?
3. As I already use PnP4Nagios, I am thinking there may be an option to do some RRD calculations on the back of an event handler after the scheduled disk space check has been carried out.
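As a rough illustration of the RRD-calculation idea in question 3, the sketch below fits a straight line through a series of (minutes, percent used) samples and projects how long until the disk reaches 100%. It assumes the samples have already been extracted from the RRD file by some other means, which is not shown here. The function name minutes_until_full is hypothetical, and a linear fit is only one possible model of growth.

```python
def minutes_until_full(samples):
    """Project minutes until 100% usage from (minute, used_pct) samples.

    Uses an ordinary least-squares slope across all samples, then extrapolates
    from the most recent observed value. Returns None when there are too few
    samples or when usage is flat or shrinking.
    """
    n = len(samples)
    if n < 2:
        return None
    mean_t = sum(t for t, _ in samples) / float(n)
    mean_u = sum(u for _, u in samples) / float(n)
    var_t = sum((t - mean_t) ** 2 for t, _ in samples)
    if var_t == 0:
        return None  # all samples share one timestamp, slope is undefined
    # Slope in percentage points per minute.
    slope = sum((t - mean_t) * (u - mean_u) for t, u in samples) / var_t
    if slope <= 0:
        return None  # not growing, so no projected fill time
    last_used = samples[-1][1]
    return (100.0 - last_used) / slope


if __name__ == "__main__":
    # Example data only: usage climbing 2 points every 10 minutes.
    history = [(0, 50.0), (10, 52.0), (20, 54.0), (30, 56.0)]
    print(minutes_until_full(history))
```

With the example data the growth is 0.2 points per minute and 44 points remain, so the projection comes out at about 220 minutes. A check built this way could alert on the projected time to full instead of on the raw percentage change.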