Linux Disk Check monitoring?

This support forum board is for support questions relating to Nagios XI, our flagship commercial network monitoring solution.
Locked
jbennett
Posts: 522
Joined: Mon Apr 16, 2012 3:00 pm

Linux Disk Check monitoring?

Post by jbennett »

I'm wondering if anyone has found a way to monitor disk checks on Linux drives?

We currently have about 300 active Linux boxes, and downtime costs us money (as it does for most applications). Our systems occasionally run a disk check on reboot. On 1TB drives, this check can take hours.

We have figured out where this check can be disabled, and when it is scheduled.

Is it possible to monitor when this check will happen?

To check the settings: tune2fs -l /dev/hda3
scottwilkerson
DevOps Engineer
Posts: 19396
Joined: Tue Nov 15, 2011 3:11 pm
Location: Nagios Enterprises
Contact:

Re: Linux Disk Check monitoring?

Post by scottwilkerson »

I'm not really familiar with this and haven't seen it in the past, but I'm sure it is possible.

Mine didn't output a next check time so I'm not really sure what we are looking for.

I did get a last check time, and could slim the result down to it with:

Code: Select all

tune2fs -l /dev/sda1 |grep 'Last checked'
Once you get a value, you can create a shell script around it to do the check.
Former Nagios employee
Creator:
Human Design Website
Get Your Human Design Chart

Re: Linux Disk Check monitoring?

Post by jbennett »

I'm now able to dedicate some time to this.

On our system, I ran the command provided above, and I get the following:

Code: Select all

testlane2:~ # tune2fs -l /dev/hda3
tune2fs 1.36 (05-Feb-2005)
Filesystem volume name:   /
Last mounted on:          <not available>
Filesystem UUID:          8524fc83-09d8-40ca-bdd8-79628dcb5be5
Filesystem magic number:  0xEF53
Filesystem revision #:    1 (dynamic)
Filesystem features:      has_journal filetype needs_recovery sparse_super
Default mount options:    (none)
Filesystem state:         clean
Errors behavior:          Continue
Filesystem OS type:       Linux
Inode count:              60981248
Block count:              121943390
Reserved block count:     6097169
Free blocks:              117529175
Free inodes:              60929370
First block:              0
Block size:               4096
Fragment size:            4096
Blocks per group:         32768
Fragments per group:      32768
Inodes per group:         16384
Inode blocks per group:   512
Filesystem created:       Thu Oct 28 09:01:35 2010
Last mount time:          Wed Feb 22 19:52:35 2012
Last write time:          Wed Feb 22 19:52:35 2012
Mount count:              4
Maximum mount count:      31
Last checked:             Tue Nov  1 17:30:44 2011
Check interval:           15552000 (6 months)
Next check after:         Sun Apr 29 17:30:44 2012
Reserved blocks uid:      0 (user root)
Reserved blocks gid:      0 (group root)
First inode:              11
Inode size:               128
Journal inode:            8
Default directory hash:   tea
Directory Hash Seed:      c5c923ba-4ced-4352-8c18-e82ac6bba46a
Journal backup:           inode blocks
Ideally, Nagios would monitor this portion:

Code: Select all

Last checked:             Tue Nov  1 17:30:44 2011
Check interval:           15552000 (6 months)
Next check after:         Sun Apr 29 17:30:44 2012
and compare it against the clock. If the 'Next check after' time falls within the next week, it should show up in Nagios as needing attention.

How would one go about even starting this? Since I'm picking up the Nagios server from an individual who is no longer with the company, I actually have not set up monitoring in any fashion as of yet, so I'm sorry if this seems to be a very fundamental question.

Re: Linux Disk Check monitoring?

Post by scottwilkerson »

jbennett wrote:How would one go about even starting this? Since I'm picking up the Nagios server from an individual who is no longer with the company, I actually have not set up monitoring in any fashion as of yet, so I'm sorry if this seems to be a very fundamental question.
You would somehow have to parse that date value and then act upon it.
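
As a rough sketch of the parsing step, assuming the `tune2fs -l` output keeps the `Label:   value` layout shown earlier in this thread, a small helper can pull out a single field. The function name `tune2fs_field` is made up for illustration and is not part of Nagios or e2fsprogs.

```shell
# Hypothetical helper: print the value of one labelled line from
# `tune2fs -l` style output read on stdin.
tune2fs_field() {
    # $1 is the label without its trailing colon, e.g. "Next check after"
    awk -v label="$1" '
        index($0, label ":") == 1 {          # line starts with "Label:"
            value = substr($0, length(label) + 2)
            sub(/^[ \t]+/, "", value)        # drop the padding after the colon
            print value
            exit
        }'
}
```

Something like `tune2fs -l /dev/hda3 | tune2fs_field "Next check after"` should then print just the date string (reading the device typically needs root).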

Here is a resource that outlines some basics of Nagios plugin development
http://nagiosplug.sourceforge.net/devel ... lines.html
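
For orientation, the core plugin convention is small: print one line of status text and exit with 0 (OK), 1 (WARNING), 2 (CRITICAL) or 3 (UNKNOWN). A minimal sketch of that convention follows; the helper name `nagios_result` and the `FSCK` prefix are assumptions for illustration only.

```shell
# Hypothetical helper: print a one-line status and return the matching
# Nagios plugin exit code (0 OK, 1 WARNING, 2 CRITICAL, 3 UNKNOWN).
nagios_result() {
    state=$1
    message=$2
    case "$state" in
        OK)       code=0 ;;
        WARNING)  code=1 ;;
        CRITICAL) code=2 ;;
        *)        state=UNKNOWN; code=3 ;;   # anything unrecognised
    esac
    echo "FSCK $state - $message"
    return "$code"
}
```

A plugin script would typically end with something like `nagios_result WARNING "next check is due soon"; exit $?` so that Nagios sees the exit code.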

Re: Linux Disk Check monitoring?

Post by jbennett »

I have found this plug-in and am wondering if it will allow what I'm trying to accomplish?

http://exchange.nagios.org/directory/Pl ... ut/details

I should qualify this by stating that I'm not a programmer by trade. Only a little here and there.

If I were to implement this plug-in, I would imagine my command would look something like this:

Code: Select all

check_execgrep.pl --contains NO --warning ??? --critical ??? --command tune2fs /dev/hda3
The issue I'm having is how to fill in the warning and critical with REGEX that do what I want.

The

Code: Select all

tune2fs -l /dev/hda3
command gives me dates (among other info) in the format

Code: Select all

Sun Apr 29 17:30:44 2012
I would like to figure out how to have this plug-in compare the date grepped from 'Next check after' to the system date, and return a warning if it is within the next 7 days (10080 minutes) and a critical if it is within the next 1 day (1440 minutes).

Is this even possible?
Last edited by jbennett on Wed Apr 25, 2012 2:13 pm, edited 1 time in total.

Re: Linux Disk Check monitoring?

Post by scottwilkerson »

That won't get you there because you need to do math on the date.
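
The math itself is small once the time remaining is a plain number. A sketch, assuming the thresholds asked about above (warning inside 10080 minutes, critical inside 1440 minutes); `classify_minutes` is a hypothetical name:

```shell
# Hypothetical helper: classify the minutes left before the next scheduled
# filesystem check. Thresholds default to the values from this thread.
classify_minutes() {
    minutes_left=$1
    warn=${2:-10080}    # 7 days
    crit=${3:-1440}     # 1 day
    if [ "$minutes_left" -le "$crit" ]; then
        echo CRITICAL   # also covers negative values (check already due)
    elif [ "$minutes_left" -le "$warn" ]; then
        echo WARNING
    else
        echo OK
    fi
}
```

For example, `classify_minutes 5000` falls between the two thresholds and prints `WARNING`.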

I can get you close to what would go in the shell script. The command to convert that time into something you can do math on would look like this:

Code: Select all

date --utc --date "`tune2fs -l /dev/sda1 |grep 'Next check after'|tr -s ' ' ' '| cut -d " " -f 5,6,8`" +%s
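
Note that fields 5, 6 and 8 keep only the month, day and year, so the time of day is dropped. A variant sketch that keeps the full timestamp is below. It assumes GNU `date` (whose `--date` option accepts the `Sun Apr 29 17:30:44 2012` form) and assumes tune2fs prints local time, so it leaves out `--utc`. The function names are made up for illustration.

```shell
# Hypothetical helpers, assuming GNU date.
# Convert a timestamp such as "Sun Apr 29 17:30:44 2012" to epoch seconds.
to_epoch() {
    date --date "$1" +%s
}

# Minutes from "now" until the given timestamp; negative if it has passed.
# An optional second argument overrides "now" (epoch seconds) for testing.
minutes_until() {
    target=$(to_epoch "$1") || return 3
    now=${2:-$(date +%s)}
    echo $(( (target - now) / 60 ))
}
```

The date string could be fed in with something like `minutes_until "$(tune2fs -l /dev/sda1 | grep 'Next check after' | cut -d: -f2-)"`, which cuts at the first colon and keeps everything after it.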
If you would like our team to develop this custom plugin please contact [email protected]

Re: Linux Disk Check monitoring?

Post by jbennett »

Thanks for the info. Looks like I will need to look into this a bit deeper before trying to go much further.