Disk Latency Monitoring using PerfMon

neworderfac33
Posts: 329
Joined: Fri Jul 24, 2015 11:04 am

Disk Latency Monitoring using PerfMon

Post by neworderfac33 »

Good afternoon,
I am attempting to use the following services against a number of Windows 2008/2012 servers, via NSClient++ 0.4.3.143:

#--------------------------------------------------------
# Latency - C
#--------------------------------------------------------
#check_nrpe disk LATENCY - READ - C
define service{
        use                     generic-service
        host_name        
        hostgroup_name
        service_description     C_Disk_Read_Latency
        check_command           check_nrpe!checkcounter! -a 'Counter:C: Avg. Disk sec/Read=\LogicalDisk(C:)\Avg. Disk sec/Read' ShowAll MaxWarn=10 MaxCrit=20
        #register               0
}

#check_nrpe disk LATENCY - WRITE - C
define service{
        use                     generic-service
        host_name               
        hostgroup_name          
        service_description     C_Disk_Write_Latency
        check_command           check_nrpe!checkcounter! -a 'Counter:C: Avg. Disk sec/Write=\LogicalDisk(C:)\Avg. Disk sec/Write' ShowAll MaxWarn=10 MaxCrit=20
        #register               0
}
#--------------------------------------------------------
# Latency - E
#--------------------------------------------------------
#check_nrpe disk LATENCY - READ - E
define service{
        use                     generic-service
        host_name               
        hostgroup_name          
        service_description     E_Disk_Read_Latency
        check_command           check_nrpe!checkcounter! -a 'Counter:E: Avg. Disk sec/Read=\LogicalDisk(E:)\Avg. Disk sec/Read' ShowAll MaxWarn=10 MaxCrit=20
        #register               0
}
#check_nrpe disk LATENCY - WRITE - E
define service{
        use                     generic-service
        host_name               
        hostgroup_name          
        service_description     E_Disk_Write_Latency
        check_command           check_nrpe!checkcounter! -a 'Counter:E: Avg. Disk sec/Write=\LogicalDisk(E:)\Avg. Disk sec/Write' ShowAll MaxWarn=10 MaxCrit=20
        #register               0
}

Essentially, I'm attempting to monitor read and write disk latency for the C: and E: drives.
The problem I'm encountering is that if I run the services for C: drive read/write OR E: drive read/write, all works fine.
If, however, I try to monitor read/write for both the C: AND E: drives, then when I restart Nagios (4.3.4), the two most recently uncommented services stay at "Pending" for all the relevant hosts. Nagios also takes over a minute to restart.
Can anyone suggest where I might be going wrong? All the hosts in question have a C: drive and an E: drive.
Thanks in advance
Pete
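
The four definitions above differ only in the drive letter and the Read/Write direction, so one way to keep them consistent is to generate them from a single template. The following is a minimal sketch, not part of the original post: the function name `latency_services` and the hostgroup value `windows-servers` are hypothetical (the post leaves `host_name` and `hostgroup_name` blank), and the `check_command` string is copied as posted rather than validated against NSClient++.

```python
# Sketch: generate the latency service definitions from one template.
# "windows-servers" is a hypothetical hostgroup; the original post leaves it blank.
from itertools import product

TEMPLATE = """\
#check_nrpe disk LATENCY - {op_upper} - {drive}
define service{{
        use                     generic-service
        hostgroup_name          {hostgroup}
        service_description     {drive}_Disk_{op}_Latency
        check_command           check_nrpe!checkcounter! -a 'Counter:{drive}: Avg. Disk sec/{op}=\\LogicalDisk({drive}:)\\Avg. Disk sec/{op}' ShowAll MaxWarn={warn} MaxCrit={crit}
}}
"""


def latency_services(drives, hostgroup, warn=10, crit=20):
    """Return one service definition per (drive, Read/Write) combination."""
    blocks = []
    for drive, op in product(drives, ("Read", "Write")):
        blocks.append(
            TEMPLATE.format(
                drive=drive,
                op=op,
                op_upper=op.upper(),
                hostgroup=hostgroup,
                warn=warn,
                crit=crit,
            )
        )
    # Each block already ends with a newline, so this leaves a blank line between them.
    return "\n".join(blocks)


if __name__ == "__main__":
    print(latency_services(["C", "E"], "windows-servers"))
```

Every definition gets its closing brace this way, and adding a third drive becomes a one-item change to the list.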
neworderfac33

Re: Disk Latency Monitoring using PerfMon

Post by neworderfac33 »

Weirdest thing - in generic-service in templates.cfg I had check_interval set to 0.25.
I set it to 1 and restarted Nagios and everything was fine. But I then set it back to 0.25 and restarted - and everything was STILL fine!
The reason it was 0.25 (15 seconds) was so it matched the cgi refresh rate, but I'm wondering if this was what caused the problem.
I've added more hosts and it's STILL fine! It's been a long day and I'm going home!
Thanks
Pete
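
For reference, the arithmetic behind "0.25 (15 seconds)" can be sketched as below. It assumes the default `interval_length` of 60 seconds per time unit; the host count of 50 is purely hypothetical, and the helper names are made up for illustration. It only quantifies how many checks the scheduler is asked to run, and does not show that this load caused the "Pending" states.

```python
# Sketch: convert check_interval "time units" to seconds and estimate check volume.
# Assumes interval_length = 60 (the Nagios default); host count is hypothetical.

def effective_seconds(check_interval, interval_length=60):
    """Seconds between checks for a given check_interval."""
    return check_interval * interval_length


def checks_per_minute(num_hosts, services_per_host, check_interval, interval_length=60):
    """Average number of service checks scheduled per minute."""
    period = effective_seconds(check_interval, interval_length)
    return num_hosts * services_per_host * 60 / period


if __name__ == "__main__":
    print(effective_seconds(0.25))          # 15.0 seconds
    print(checks_per_minute(50, 4, 0.25))   # 800.0 checks per minute
    print(checks_per_minute(50, 4, 1))      # 200.0 checks per minute
```

With four latency services per host, dropping from 1 to 0.25 quadruples the number of NRPE checks per minute.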
bolson

Re: Disk Latency Monitoring using PerfMon

Post by bolson »

Does your issue appear to be resolved and may we close this topic?
neworderfac33

Re: Disk Latency Monitoring using PerfMon

Post by neworderfac33 »

It does, other than I wouldn't mind someone confirming if it's OK for me to use decimal values for check_interval!
Obviously, I could change interval_length to 1 in /usr/local/nagios/etc/nagios.cfg then set check_interval to 15, but the comments around this don't look encouraging!
Pete
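
To illustrate what the `interval_length` alternative would involve: changing it from 60 to 1 changes the meaning of every time-unit directive, not just `check_interval`, so each one would need rescaling to keep its current behaviour. A rough sketch follows. The directive list is illustrative rather than exhaustive, and the `retry_interval` and `notification_interval` values are hypothetical examples, not taken from the thread.

```python
# Sketch: rescale time-unit directives when interval_length changes.
# TIME_UNIT_DIRECTIVES is an illustrative, non-exhaustive list.
TIME_UNIT_DIRECTIVES = ("check_interval", "retry_interval", "notification_interval")


def rescale(directives, old_length=60, new_length=1):
    """Return a copy of directives with time-unit values converted to the new unit."""
    factor = old_length / new_length
    return {
        name: value * factor if name in TIME_UNIT_DIRECTIVES else value
        for name, value in directives.items()
    }


if __name__ == "__main__":
    current = {
        "check_interval": 0.25,        # from the thread
        "retry_interval": 1,           # hypothetical
        "notification_interval": 60,   # hypothetical
        "max_check_attempts": 3,       # not a time value, left unchanged
    }
    print(rescale(current))
```

Under these assumptions `check_interval` becomes a whole number (15), at the cost of touching every other interval in the configuration.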
bolson

Re: Disk Latency Monitoring using PerfMon

Post by bolson »

Decimal values are not supported. This doesn't mean your config (using 0.25) won't work, only that it's not officially supported. My advice would be to leave it at 0.25 and see whether you have any issues going forward or get values that don't seem right.
neworderfac33

Re: Disk Latency Monitoring using PerfMon

Post by neworderfac33 »

What I HAVE noticed is that when I set it to 1, the Nagios restart is WAY quicker than when it's at 0.25 (10-15 seconds, compared with over a minute), which suggests that there's something it doesn't like, even though the config verifies with no errors.
Anyhow, thanks for your confirmation that decimal values aren't officially supported. This thread can now be closed.
Pete
bolson

Re: Disk Latency Monitoring using PerfMon

Post by bolson »

Thank you. Closing this topic as resolved.