
Issues measuring Disk Usage with check_nrpe

Posted: Tue Jan 09, 2018 4:38 am
by neworderfac33
Good morning, I have a service defined as follows to measure disk usage:

Code:

define service{
       use                      generic-service
       #host_name                MyServer
       hostgroup_name           MyHostGroup
       service_description      MyDescription
       check_command            check_nt!USEDDISKSPACE!-l c -w 90 -c 95
       }
This works fine, but I can only generate alerts based on percentage of used space and I have been asked to generate alerts when available space is reduced to an amount in GB, rather than a percentage.
So I revised the service definition to use a newly defined command, as follows:

Code:

define service{
       use                      generic-service
       #host_name                MyServer
       hostgroup_name           MyHostGroup
       service_description      MyDescription
       check_command            Win_Disk_Space_C
       }

define command{
        command_name Win_Disk_Space_C
        command_line $USER1$/check_nrpe -H $HOSTADDRESS$ -t 60 -p 5666 -c check_drivesize -a drive=C: 'warning=free<2G' 'critical=free<1G' show-all 'perf-config=*(unit:G)' detail-syntax='{${drive_or_name} ${free} free / ${size} total}' top-syntax='${status}: ${problem_list}'
}
This reports correct usage values in Nagios Core 4.3.4 using NSClient++ 0.4.3.143, but when I pass the data into Grafana, it returns completely incorrect values for disk usage (e.g. 45GB instead of 2GB on a 10GB disk!)

Can anyone see anything wrong with my command definition, or advise if the issue might be resolved by upgrading to NSClient++ 0.5.2?

Thanks in advance

Pete

Re: Issues measuring Disk Usage with check_nrpe

Posted: Tue Jan 09, 2018 9:48 am
by mcapra
Can you share the output of the command executed from the CLI of your Nagios Core machine? The above is useful information, but it doesn't show how the performance data is actually being reported. There may be some disconnect between how NSClient++ is formatting the data and what Grafana is expecting to receive.

Also, what are you using to transfer performance data from Nagios Core to Grafana's database?

Re: Issues measuring Disk Usage with check_nrpe

Posted: Tue Jan 09, 2018 9:56 am
by neworderfac33
Here's some typical output:

Code:

OK: {C: 21.552GB free / 39.656GB total}|'C: free'=21.5524G;5;2;0;39.65624 'C: free %'=54%;13;5;0;100
and I'm using Graphios and InfluxDB to get the data from Nagios to Grafana.
I've upgraded to NSClient++ 0.5.1.44 on one host, but Grafana is still showing the wrong values for it.
Thanks for taking a look!
Pete
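
For anyone who wants to check how the performance data in that output breaks down, here is a minimal sketch in Python that splits the sample line into its fields, following the standard Nagios plugin format ('label'=value[UOM];warn;crit;min;max). The function name parse_perfdata and the regular expression are illustrative only; they are not taken from Graphios, NSClient++, or Nagios. The sketch shows that the labels contain a space and a colon and that the first value carries a G unit suffix. It does not show how Graphios or InfluxDB handle those details.

```python
import re

# One perfdata item: 'label'=value[UOM];[warn];[crit];[min];[max]
# Labels containing spaces are wrapped in single quotes.
PERF_ITEM = re.compile(
    r"(?:'(?P<quoted>[^']+)'|(?P<bare>[^\s='][^\s=]*))"
    r"=(?P<value>-?\d+(?:\.\d+)?)(?P<uom>[^;\s]*)"
    r"(?:;(?P<warn>[^;\s]*))?"
    r"(?:;(?P<crit>[^;\s]*))?"
    r"(?:;(?P<min>[^;\s]*))?"
    r"(?:;(?P<max>[^;\s]*))?"
)


def _to_float(text):
    """Convert a min/max field to float, or None when absent or empty."""
    return float(text) if text else None


def parse_perfdata(plugin_output):
    """Return a list of dicts, one per perfdata item after the '|'."""
    _, _, perf = plugin_output.partition("|")
    metrics = []
    for m in PERF_ITEM.finditer(perf):
        metrics.append({
            "label": m.group("quoted") or m.group("bare"),
            "value": float(m.group("value")),
            "uom": m.group("uom"),
            # warn/crit may be ranges in the Nagios format, so keep them as text
            "warn": m.group("warn") or None,
            "crit": m.group("crit") or None,
            "min": _to_float(m.group("min")),
            "max": _to_float(m.group("max")),
        })
    return metrics


# The sample output posted above
SAMPLE = ("OK: {C: 21.552GB free / 39.656GB total}|"
          "'C: free'=21.5524G;5;2;0;39.65624 'C: free %'=54%;13;5;0;100")

if __name__ == "__main__":
    for metric in parse_perfdata(SAMPLE):
        print(metric)
```

Running it prints one dictionary per metric, which makes it easy to compare the label, value, and unit that Nagios received against whatever ends up stored downstream.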

Re: Issues measuring Disk Usage with check_nrpe

Posted: Tue Jan 09, 2018 4:03 pm
by cdienger
Perhaps someone with more Graphios/InfluxDB/Grafana experience can chime in, but it appears that Nagios is getting the information correctly and the problem is further downstream. You can try increasing the Graphios logging level in graphios.cfg with:

log_level = logging.DEBUG

but I suspect that will only show that data is being transferred, not necessarily what data. Maybe we'll get lucky though. InfluxDB appears to have some logging options as well (https://docs.influxdata.com/influxdb/v0 ... /#graphite), but I'm not sure what could be enabled to give us potentially useful data from that platform.

Re: Issues measuring Disk Usage with check_nrpe

Posted: Wed Jan 10, 2018 9:59 am
by neworderfac33
Thanks very much for taking the time to reply - I've posted in both NSClient++ and Grafana forums, so I'll wait to see what gets thrown up.
Pete

Re: Issues measuring Disk Usage with check_nrpe

Posted: Wed Jan 10, 2018 5:16 pm
by npolovenko
@neworderfac33, sounds good, keep us updated.