Linux Disk Check monitoring?
Posted: Fri Apr 20, 2012 11:55 am
by jbennett
I'm wondering if anyone has found a way to monitor disk checks on Linux drives?
We have about 300 active Linux boxes currently. Downtime costs us money (as it does for most applications). Our systems occasionally run a disk check upon reboot. On 1TB drives, this disk check can take hours.
We have figured out where this check can be disabled, and when it is scheduled.
Is it possible to monitor when this check will happen?
To check the settings: tune2fs -l /dev/hda3
Re: Linux Disk Check monitoring?
Posted: Fri Apr 20, 2012 1:33 pm
by scottwilkerson
I'm not really familiar with this and haven't seen it in the past, but I'm sure it is possible.
Mine didn't output a next check time, so I'm not really sure what we are looking for.
I did get a last check time and could slim the result down to:
Code:
tune2fs -l /dev/sda1 |grep 'Last checked'
Once you get a value, you can create a shell script around it to do the check.
Re: Linux Disk Check monitoring?
Posted: Tue Apr 24, 2012 5:03 pm
by jbennett
I'm now able to dedicate some time to this.
On our system, I ran the command provided above, and I get the following:
Code:
testlane2:~ # tune2fs -l /dev/hda3
tune2fs 1.36 (05-Feb-2005)
Filesystem volume name: /
Last mounted on: <not available>
Filesystem UUID: 8524fc83-09d8-40ca-bdd8-79628dcb5be5
Filesystem magic number: 0xEF53
Filesystem revision #: 1 (dynamic)
Filesystem features: has_journal filetype needs_recovery sparse_super
Default mount options: (none)
Filesystem state: clean
Errors behavior: Continue
Filesystem OS type: Linux
Inode count: 60981248
Block count: 121943390
Reserved block count: 6097169
Free blocks: 117529175
Free inodes: 60929370
First block: 0
Block size: 4096
Fragment size: 4096
Blocks per group: 32768
Fragments per group: 32768
Inodes per group: 16384
Inode blocks per group: 512
Filesystem created: Thu Oct 28 09:01:35 2010
Last mount time: Wed Feb 22 19:52:35 2012
Last write time: Wed Feb 22 19:52:35 2012
Mount count: 4
Maximum mount count: 31
Last checked: Tue Nov 1 17:30:44 2011
Check interval: 15552000 (6 months)
Next check after: Sun Apr 29 17:30:44 2012
Reserved blocks uid: 0 (user root)
Reserved blocks gid: 0 (group root)
First inode: 11
Inode size: 128
Journal inode: 8
Default directory hash: tea
Directory Hash Seed: c5c923ba-4ced-4352-8c18-e82ac6bba46a
Journal backup: inode blocks
Ideally, Nagios would monitor this portion:
Code:
Last checked: Tue Nov 1 17:30:44 2011
Check interval: 15552000 (6 months)
Next check after: Sun Apr 29 17:30:44 2012
and compare it against the clock. If the 'Next check after' time falls within the next week, have it show up in Nagios as needing attention.
How would one go about even starting this? Since I'm picking up the Nagios server from an individual who is no longer with the company, I actually have not set up monitoring in any fashion as of yet, so I'm sorry if this seems to be a very fundamental question.
Re: Linux Disk Check monitoring?
Posted: Wed Apr 25, 2012 11:58 am
by scottwilkerson
jbennett wrote:How would one go about even starting this? Since I'm picking up the Nagios server from an individual who is no longer with the company, I actually have not set up monitoring in any fashion as of yet, so I'm sorry if this seems to be a very fundamental question.
You would somehow have to parse that date value and then act upon it.
Here is a resource that outlines some basics of Nagios plugin development:
http://nagiosplug.sourceforge.net/devel ... lines.html
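As a first step toward "parsing that date value", the sketch below pulls the date out of the 'Next check after:' line of tune2fs -l output. The function name extract_next_check is hypothetical, and the sample input is copied from the output posted earlier in this thread. It reads standard input so it can be tried without root access or a real device.

```shell
#!/bin/sh
# Print the date portion of the "Next check after:" line from `tune2fs -l`
# output supplied on standard input. Prints nothing if the line is absent
# (some filesystems report no next check time, as noted earlier in the thread).
extract_next_check() {
    sed -n 's/^Next check after:[[:space:]]*//p'
}

# Example with canned input. On a real host this would be something like:
#   tune2fs -l /dev/hda3 | extract_next_check
printf '%s\n' \
    'Last checked:             Tue Nov  1 17:30:44 2011' \
    'Check interval:           15552000 (6 months)' \
    'Next check after:         Sun Apr 29 17:30:44 2012' \
    | extract_next_check
```

Keeping the whole date string, including the time of day, leaves the decision about how much precision to use to whatever does the date math afterwards.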
Re: Linux Disk Check monitoring?
Posted: Wed Apr 25, 2012 1:45 pm
by jbennett
I have found this plug-in and am wondering if it will allow what I'm trying to accomplish?
http://exchange.nagios.org/directory/Pl ... ut/details
I should qualify this by stating that I'm not a programmer by trade; I've only done a little here and there.
If I were to implement this plug-in, I would imagine my command would look something like this:
Code:
check_execgrep.pl --contains NO --warning ??? --critical ??? --command tune2fs /dev/hda3
The issue I'm having is how to fill in the warning and critical values with regular expressions that do what I want.
The tune2fs -l command gives me dates (among other info) in the format shown in my earlier post, for example: Sun Apr 29 17:30:44 2012
I would like to figure out how to have this plug-in compare the grep date from 'Next check after' to the system date and provide a warning if within the next 7 days (10080 minutes) and a critical if within the next 1 day (1440 minutes).
Is this even possible?
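The thresholds described above (warning within 7 days, critical within 1 day) can be sketched as a small shell function once the time remaining is known in seconds. This is only an illustration: classify_remaining and the two variable names are hypothetical. The return values follow the usual Nagios plugin convention of 0 for OK, 1 for WARNING, and 2 for CRITICAL.

```shell
#!/bin/sh
# Thresholds from the post above, converted to seconds.
WARN_SECONDS=$((7 * 24 * 60 * 60))   # 7 days = 10080 minutes
CRIT_SECONDS=$((24 * 60 * 60))       # 1 day  = 1440 minutes

# classify_remaining SECONDS
# Prints a one-line status and returns the matching Nagios exit code.
# A negative value (check already overdue) is treated as CRITICAL.
classify_remaining() {
    remaining=$1
    if [ "$remaining" -le "$CRIT_SECONDS" ]; then
        echo "CRITICAL - disk check due in ${remaining} seconds"
        return 2
    elif [ "$remaining" -le "$WARN_SECONDS" ]; then
        echo "WARNING - disk check due in ${remaining} seconds"
        return 1
    fi
    echo "OK - disk check due in ${remaining} seconds"
    return 0
}

# Usage inside a plugin script (illustrative only):
#   classify_remaining "$seconds_left"; exit $?
```

Because the comparison is plain integer arithmetic, no regular expressions are needed for the warning and critical levels.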
Re: Linux Disk Check monitoring?
Posted: Wed Apr 25, 2012 2:11 pm
by scottwilkerson
That won't get you there because you need to do math on the date.
I can get you close to what would go in the shell script. The command to convert that time to something you can do math on would look like this:
Code:
date --utc --date "$(tune2fs -l /dev/sda1 | grep 'Next check after' | tr -s ' ' | cut -d ' ' -f 5,6,8)" +%s
If you would like our team to develop this custom plugin please contact
[email protected]
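To illustrate the "math on the date" step, here is a minimal sketch that turns a date string into the number of seconds left before the check. It assumes GNU date (for --date and --utc), and seconds_until is a hypothetical helper name. Like the command above it reads the date as UTC, although tune2fs prints local time, so the result can be off by the host's UTC offset. That is small next to thresholds measured in days. Unlike the command above it passes the full string, time of day included.

```shell
#!/bin/sh
# seconds_until DATE_STRING [NOW_EPOCH]
# Prints the number of seconds from now until DATE_STRING (negative if past).
# NOW_EPOCH is optional and exists mainly so the function can be tried with
# a fixed clock. Returns 3 (the Nagios UNKNOWN code) if the date cannot be
# parsed.
seconds_until() {
    target=$(date --utc --date "$1" +%s) || return 3
    now=${2:-$(date +%s)}
    echo $((target - now))
}

# Usage on a real host (illustrative only, needs root for tune2fs):
#   next=$(tune2fs -l /dev/sda1 | sed -n 's/^Next check after:[[:space:]]*//p')
#   seconds_until "$next"
```

The result can then be compared against warning and critical thresholds expressed in seconds.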
Re: Linux Disk Check monitoring?
Posted: Wed Apr 25, 2012 2:20 pm
by jbennett
Thanks for the info. Looks like I will need to look into this a bit deeper before trying to go much further.