Disk Performance Checks


Disk performance checks allow you to monitor the input/output (IO) performance of the physical disks in your system. A physical disk is the block-level device. Depending on the agent and operating system (OS), you can monitor the physical disk, the partitions on that disk, or both. The closer you measure to the physical device, the more accurate the results will be; however, monitoring separate partitions on the same disk helps identify which partition generates the most IO.

The sections below provide examples of how to perform these checks using different methods.

 

Nagios Plugins

Nagios Plugins does not include a disk performance plugin.

NCPA

NCPA includes a disk module that allows you to check the performance of the physical disks in your system. What counts as a "physical" disk varies by operating system, so this is explained first.

Windows

NCPA on Windows provides metrics for the physical disks in your system. For example, "Disk 0" in Disk Management is referenced in NCPA as "disk/physical/PhysicalDrive0". This is a direct one-to-one relationship; NCPA does not provide metrics for the partitions you create on the disk.

Linux

NCPA on Linux is a little more complicated: it provides metrics for the partitions on your physical disks. This can be hard to follow when the disk is partitioned using the Logical Volume Manager (LVM). It is best explained by running the following command on a Linux system:

lsblk --output NAME,KNAME,TYPE,SIZE,MOUNTPOINT

Output:

NAME            KNAME TYPE  SIZE MOUNTPOINT
sda             sda   disk   16G
├─sda1          sda1  part  500M /boot
└─sda2          sda2  part 15.5G
  ├─centos-root dm-0  lvm  13.9G /
  └─centos-swap dm-1  lvm   1.6G [SWAP]
sr0             sr0   rom  1024M

 

The value in the KNAME column is the name you use to reference the device in NCPA.

"sda1" is the boot partition. It is referenced as "disk/physical/sda1".

"sda2" is a partition used as an LVM physical volume. It is referenced as "disk/physical/sda2".

The LVM volume group contains two logical volumes, which can also be referenced.

Volume "centos-root" is referenced as "disk/physical/dm-0".

Volume "centos-swap" is referenced as "disk/physical/dm-1".

Note that NCPA does not provide metrics for the actual physical disk "sda". This is how it works on Linux.
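
The KNAME-to-node mapping described above can be sketched in Python. This is only an illustration and not part of NCPA: the function name ncpa_disk_nodes is hypothetical, and the rule that only "part" and "lvm" devices are exposed is an assumption drawn from the example above. The function expects the text produced by "lsblk --raw --noheadings --output KNAME,TYPE".

```python
def ncpa_disk_nodes(lsblk_raw):
    """Map `lsblk --raw --noheadings --output KNAME,TYPE` text to NCPA node paths."""
    # Assumption drawn from the example above: only partitions and LVM
    # volumes are exposed, not whole disks (sda) or optical drives (sr0).
    exposed_types = {"part", "lvm"}
    nodes = []
    for line in lsblk_raw.splitlines():
        fields = line.split()
        if len(fields) != 2:
            continue  # skip blank or malformed lines
        kname, dev_type = fields
        if dev_type in exposed_types:
            nodes.append("disk/physical/" + kname)
    return nodes


# Sample input matching the lsblk output shown above.
sample = """sda disk
sda1 part
sda2 part
dm-0 lvm
dm-1 lvm
sr0 rom"""

for node in ncpa_disk_nodes(sample):
    print(node)
```

Each printed path can be appended with a metric name such as "read_bytes" and passed to check_ncpa.py with the -M option.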

 

Now that these differences have been explained, the examples below show the different metrics that can be monitored.

 

Bytes Read / Bytes Write

Unit: M
Warning: 50MB/s
Critical: 100MB/s

Commands:

./check_ncpa.py -H 10.25.14.91 -t Str0ngT0k3n -M 'disk/physical/PhysicalDrive0/read_bytes' -d -u M -w 50 -c 100
./check_ncpa.py -H 10.25.14.91 -t Str0ngT0k3n -M 'disk/physical/PhysicalDrive0/write_bytes' -d -u M -w 50 -c 100

Output:

OK: Read_bytes was 5.15 MB/s | 'read_bytes'=5.15;50;100;
OK: Write_bytes was 0.05 MB/s | 'write_bytes'=0.05;50;100;

 

Read Time / Write Time

Unit: ms
Warning: 50ms/s
Critical: 100ms/s

Commands:

./check_ncpa.py -H 10.25.14.91 -t Str0ngT0k3n -M 'disk/physical/PhysicalDrive0/read_time' -d -w 50 -c 100
./check_ncpa.py -H 10.25.14.91 -t Str0ngT0k3n -M 'disk/physical/PhysicalDrive0/write_time' -d -w 50 -c 100

Output:

OK: Read_time was 18.69 ms/s | 'read_time'=18.69;50;100;
WARNING: Write_time was 73.87 ms/s | 'write_time'=73.87;50;100;
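
The way the -w and -c values produce the OK and WARNING results above can be sketched as follows. This is a simplified illustration, not NCPA's own code: the function name nagios_status is hypothetical, and it treats each threshold as a plain upper bound, ignoring the full Nagios range syntax.

```python
def nagios_status(value, warning, critical):
    """Return the status for a value checked against plain upper-bound thresholds."""
    # Simplification: range expressions such as "10:20" or "@5:" are not handled.
    if value > critical:
        return "CRITICAL"
    if value > warning:
        return "WARNING"
    return "OK"


# Values taken from the read_time and write_time output above.
print(nagios_status(18.69, 50, 100))
print(nagios_status(73.87, 50, 100))
```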

 

The read_count and write_count nodes are also available.

NSClient++ via check_nt

NSClient++ does not include a disk performance module.

An alternative method is to query a performance counter, for example:

\LogicalDisk(C:)\% Disk Read Time
\PhysicalDisk(0 C:)\Avg. Disk Bytes/Write

More information about performance counters can be found in the Performance Counter Checks KB article.

NSClient++ via check_nrpe

NSClient++ does not include a disk performance module.

An alternative method is to query a performance counter, for example:

\LogicalDisk(C:)\% Disk Read Time
\PhysicalDisk(0 C:)\Avg. Disk Bytes/Write

More information about performance counters can be found in the Performance Counter Checks KB article.

WMI

Check WMI Plus includes a checkio module. These disk checks use WMI raw counters to calculate values over a given time period.
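
The idea of turning raw counters into a per-second value can be sketched in Python. This is a minimal illustration under stated assumptions, not the actual Check WMI Plus calculation (which depends on the WMI counter type): the function name rate_per_second and the counter values are hypothetical, chosen so that the result matches a 337 B/sec rate over a 74 second sample period.

```python
def rate_per_second(previous_value, current_value, elapsed_seconds):
    """Average rate of change of a cumulative counter between two samples."""
    if elapsed_seconds <= 0:
        raise ValueError("elapsed_seconds must be positive")
    return (current_value - previous_value) / elapsed_seconds


# Hypothetical cumulative byte counters sampled 74 seconds apart.
first_sample = 1000000
second_sample = 1024938

print(rate_per_second(first_sample, second_sample, 74))
```

Because the value is an average over the sample period, a short burst of IO is smoothed out when checks run far apart.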

Bytes Read / Bytes Write

Unit: B (thresholds are specified in bytes per second)
Warning: 50MB/s (50000000)
Critical: 100MB/s (100000000)
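
The byte values in parentheses are the MB/s thresholds converted using decimal megabytes. The sketch below shows that arithmetic and builds the matching -w and -c arguments. The helper names are hypothetical and are not part of Check WMI Plus.

```python
BYTES_PER_MB = 1000 * 1000  # decimal megabyte, matching 50MB/s = 50000000 above


def mb_per_sec_to_bytes(mb_per_sec):
    """Convert a MB/s threshold to whole bytes per second."""
    return int(round(mb_per_sec * BYTES_PER_MB))


def checkio_threshold_args(counter, warning_mb, critical_mb):
    """Build -w/-c arguments for one checkio counter (hypothetical helper)."""
    return [
        "-w", "{}={}".format(counter, mb_per_sec_to_bytes(warning_mb)),
        "-c", "{}={}".format(counter, mb_per_sec_to_bytes(critical_mb)),
    ]


print(" ".join(checkio_threshold_args("_DiskReadBytesPersec", 50, 100)))
```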

Commands:

./check_wmi_plus.pl -H 10.25.14.3 -u wmiagent -p Str0ngP@ssw0rd -m checkio -s physical -a C: -w _DiskReadBytesPersec=50000000 -c _DiskReadBytesPersec=100000000 -w _DiskWriteBytesPersec=50000000 -c _DiskWriteBytesPersec=100000000

Output:

Overall Status - OK (Sample Period 74 sec) -  Physical Drive Name="0 C:" (OK) - _PercentIdleTime=100%, _PercentBusyTime=0%, _PercentDiskTime=0%, _PercentDiskReadTime=0%, _PercentDiskWriteTime=0%, _DiskReadBytesPersec=0B/sec,
_DiskReadsPersec=0/sec, _DiskWriteBytesPersec=337B/sec, _DiskWritesPersec=0/sec, CurrentDiskQueueLength=0, _AvgDiskQueueLength=0.0, _AvgDiskReadQueueLength=0.0, _AvgDiskWriteQueueLength=0.0|'_PercentIdleTime0 C:'=100;
'_PercentBusyTime0 C:'=0; '_PercentDiskTime0 C:'=0; '_PercentDiskReadTime0 C:'=0; '_PercentDiskWriteTime0 C:'=0; '_DiskReadBytesPersec0 C:'=0;50000000;100000000; '_DiskReadsPersec0 C:'=0;
'_DiskWriteBytesPersec0 C:'=337;50000000;100000000; '_DiskWritesPersec0 C:'=0; 'CurrentDiskQueueLength0 C:'=0; '_AvgDiskQueueLength0 C:'=0.0; '_AvgDiskReadQueueLength0 C:'=0.0; '_AvgDiskWriteQueueLength0 C:'=0.0;

 

As the output above shows, many metrics are available, and each of them can have warning or critical thresholds.
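
To see which metrics carry thresholds, the performance data after the "|" character can be split into individual items. The sketch below is a minimal parser written for illustration. The function name parse_perfdata is hypothetical, and it reads only the value, warning and critical fields of each item, leaving values as strings.

```python
import re

# Matches 'quoted label'=value;warn;crit;... or bare_label=value;...
PERFDATA_ITEM = re.compile(r"(?:'([^']+)'|([^\s'=]+))=([^;\s]*)((?:;[^;\s]*)*)")


def parse_perfdata(plugin_output):
    """Return {label: {"value", "warning", "critical"}} from plugin output."""
    _, _, perfdata = plugin_output.partition("|")
    metrics = {}
    for quoted, bare, value, rest in PERFDATA_ITEM.findall(perfdata):
        fields = rest.split(";")[1:]  # rest starts with ";" when present
        fields += [""] * (2 - len(fields))  # pad missing warn/crit fields
        metrics[quoted or bare] = {
            "value": value,
            "warning": fields[0] or None,
            "critical": fields[1] or None,
        }
    return metrics


# Shortened sample based on the output above.
sample = ("Overall Status - OK | '_DiskReadBytesPersec0 C:'=0;50000000;100000000; "
          "'_DiskReadsPersec0 C:'=0;")
for label, data in parse_perfdata(sample).items():
    print(label, data)
```

Items whose warning and critical fields are None have no thresholds set in the command that produced the output.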

SNMP

SNMP checks require a third-party plugin that provides this functionality. Plugins are available on the Nagios Exchange.

Final Thoughts

For any support related questions please visit the Nagios Support Forums at:

http://support.nagios.com/forum/



Article ID: 785
Created On: Sun, Nov 26, 2017 at 10:51 PM
Last Updated On: Sun, Nov 26, 2017 at 10:51 PM
Authored by: tlea

Online URL: https://support.nagios.com/kb/article/disk-performance-checks-785.html