Nagios Support Forum

Posted: **Wed Aug 28, 2013 10:41 am**

Well, I'm not completely there yet. I already know how to pass performance data to Nagios, and I know how to exit with a status code, but at the moment I can't seem to return the status code and the performance data at the same time. Is there any help or documentation on this?

Posted: **Wed Aug 28, 2013 11:29 am**

The way that I did it in my check update script was to use Write-Output to send the return message to standard out and then use exit $exitcode to exit with my specific exit code. You might take a look at my script for some ideas as to how I implemented this.

http://exchange.nagios.org/directory/Pl ... ll/details
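The Write-Output plus exit pattern from this post can be sketched roughly as follows. This is a minimal illustration, not the linked script: the function name `Get-CheckResult`, its parameters, and the sample numbers are hypothetical, and `[pscustomobject]` assumes PowerShell 3.0 or later.

```powershell
# Hypothetical helper: builds the single output line and the exit code for a check.
# Nagios reads the text before '|' as status output and the text after it as perfdata.
function Get-CheckResult {
    param(
        [decimal]$UsedMB,
        [decimal]$WarningMB,
        [decimal]$CriticalMB
    )

    # Standard plugin exit codes: 0 = OK, 1 = WARNING, 2 = CRITICAL
    if     ($UsedMB -ge $CriticalMB) { $code = 2; $state = 'CRITICAL' }
    elseif ($UsedMB -ge $WarningMB)  { $code = 1; $state = 'WARNING' }
    else                             { $code = 0; $state = 'OK' }

    [pscustomobject]@{
        Message  = "$state - Used Storage = $UsedMB MB | Used_Storage=${UsedMB}MB"
        ExitCode = $code
    }
}

# Sample values for illustration only
$result = Get-CheckResult -UsedMB 120.5 -WarningMB 100 -CriticalMB 200
Write-Output $result.Message   # one line: status text, then '|' and perfdata
# exit $result.ExitCode        # uncomment in a real plugin script
```

Keeping the message and the exit code together in one object makes it harder to print one and forget the other; the final `exit` is commented out here only so the snippet can be pasted into a console without closing it.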

Posted: **Wed Aug 28, 2013 3:06 pm**

I am looking into running the script as a passive service (with NSCA). Just a quick question, as I have not seen this before: can you confirm that it is possible to submit performance data with a passive NSCA check?

Posted: **Wed Aug 28, 2013 3:14 pm**

It absolutely should be, exactly the same as with an active check. Simply separate the output text from the performance data with a |, label each counter as [countername]=[value], and separate each counter/value pair with a space. This link is pretty helpful too!
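As a rough sketch of that format, the snippet below assembles an output line with two counters. The helper name `Format-PerfData` and the `Quota` counter are hypothetical and only there for illustration; `[ordered]` assumes PowerShell 3.0 or later.

```powershell
# Hypothetical helper: turns label/value pairs into a perfdata string.
function Format-PerfData {
    param([System.Collections.IDictionary]$Counters)

    $pairs = foreach ($label in $Counters.Keys) {
        # Quote the label so labels that contain spaces stay intact
        "'$label'=$($Counters[$label])"
    }
    # Counters are separated from each other by a single space
    $pairs -join ' '
}

# [ordered] keeps the counters in the order they were added
$counters = [ordered]@{ Used_Storage = '93677.68MB'; Quota = '102400MB' }

# Status text first, then '|', then the perfdata
$line = "Used Storage = 93677.68 MB | " + (Format-PerfData $counters)
Write-Output $line
```

The same line can be used for an active check or as the plugin output field of a passive result.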

Posted: **Wed Sep 04, 2013 3:37 am**

As I don't have enough time at the moment to research passive NSCA checks, I'm still working with the active check. I got it partially working, but I seem to be missing something. I'll post the script again, as there have been some updates:

Code:

Param(
[Parameter(Mandatory=$true)][string]$dfspad,
[Parameter(Mandatory=$true)][int]$quota,
[Parameter(Mandatory=$true)][decimal]$warningperc,
[Parameter(Mandatory=$true)][decimal]$criticalperc
)
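# Initialise the totals and the default status (0 = OK)
$sumMB = 0
$warningMB = 0
$criticalMB = 0
$status = 0
# Sum the size of every file under the DFS path; items that cannot be read are skipped silently
$folderlist = Get-ChildItem $dfspad -Recurse -ErrorAction SilentlyContinue | Measure-Object -property length -sum
$sumMB = [System.Math]::Round($folderlist.sum / 1MB, 2)
# Thresholds in MB, derived from the quota (which must therefore be given in MB)
$warningMB = $quota * $warningperc / 100
$criticalMB = $quota * $criticalperc / 100
# Perfdata value with its unit of measure, e.g. 93677.68MB
$strsumMB=[string]$sumMB + "MB"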

#echo "warningperc: $warningperc"
#echo "criticalperc: $criticalperc"
#echo "quota: $quota"
#echo "sumMB: $sumMB"
#echo "warning: $warningMB"
#echo "critical: $criticalMB"

# Compare usage against the thresholds; exit codes are 2 = CRITICAL, 1 = WARNING, 0 = OK
if ($sumMB -ge $criticalMB) {
	$status = 2
	echo "$dfspad has exceeded its critical threshold ($criticalperc %)!"
#	echo "Size = $sumMB MB, Critical Threshold = $criticalMB MB"
	Write-Output "Used Storage = $sumMB MB, Critical Threshold = $criticalMB MB | Used_Storage=$strsumMB"
}
elseif ($sumMB -ge $warningMB) {
	$status = 1
	echo "$dfspad has exceeded its warning threshold ($warningperc %)!"
#	echo "Size = $sumMB MB, Warning Threshold = $warningMB MB"
	Write-Output "Used Storage = $sumMB MB, Warning Threshold = $warningMB MB | Used_Storage=$strsumMB"
}
else {
	echo "$dfspad has not exceeded any thresholds."
	Write-Output "Used Storage = $sumMB MB | Used_Storage=$strsumMB"
}

# echo "status: $status"
# $sumMBOut="$sumMB" + "MB"

# $strsumMB+="MB"

# Write-Output "Used Storage = $sumMB | Used_Storage=$strsumMB"

# $perfdata = "$sumMB MB | Used_Storage=$sumMB"
#$return.output = "$sumMBMB | Used_Storage=$sumMB"
#$outputstring=$return.output
#Write-Output "$outputstring"
Exit $status

Note that everything after a # was for testing purposes. So, all my tests go well: the arguments are passed nicely, and the PowerShell script returns the correct exit code and also passes performance data.

But it seems the performance data only does its job for the first few checks. When I copy the service and run a few checks, everything seems fine. When I look again a few days later, the performance graph is still more or less empty. I'm only running this check once every 24 hours, as the script can take some time to complete.

See the screenshots of the graph after a few days (I would expect a blue line at 93k) and of the advanced tab of the service. Any help is, as always, much appreciated!

Posted: **Wed Sep 04, 2013 9:37 am**

I will have to check to be 100% sure, but it seems that the graph is accurate, considering you should only have one point in a 24h period. I take it you were expecting an even line across? The part I will have to check on is whether Highcharts (graph explorer) will fail to display a horizontal line if there is only a single point of data. The other graphs (old-style PNP) could also take a week or more to generate, as they require a minimum number of data points. Also, could you send me the perfdata you are currently sending? Does the unit ever change between MB, GB, and TB?

Posted: **Wed Sep 04, 2013 10:35 am**

Well, indeed I am expecting a horizontal line in the graph. But I don't think this is the core problem. In fact, after the first test checks, from the second day onward, new checks don't seem to send any perfdata. For example, with the service you see in the screenshots, the graph covers the last 24 hours and you can see the check ran at 16:15. Still, I don't get any perfdata in the graph, while the advanced tab says Performance Data: 'Used_Storage'=93677.68MB
I have already tried several times to copy the service, give it another name, and try again, but I seem to get the same issue every time.

Posted: **Wed Sep 04, 2013 10:51 am**

I was going to ask if we could modify the plugin being run to also write the return data to a text file on the Windows server. However, could you instead send me the RRD for that service, since it seems to fit your issue so specifically?

Posted: **Wed Sep 04, 2013 12:59 pm**

OK, so I checked with our resident perfdata storage/RRD master, and he fully expected this to happen. pnp4nagios creates the RRDs and their RRA configurations on the assumption that data points arrive at 5-minute intervals. When they do not, the unfilled time is stored as not-a-number (NaN), which the graphs effectively treat as 0. So the single point you see is the graph going from NaN to ~93,000 MB and back to NaN. If you were to zoom in to a 5-15 minute window around that point, you might instead see it hold that value as a horizontal line for 5 minutes before going back to NaN. We worked out two solutions. The first is a quick and dirty fix that should not be applied if you plan on using this for more than a select few services. The second can be applied to many services without issue.

The first solution would be to create a new RRD that expects one data point every 24 hours. That way, when a fetch is performed, there should be only valid data and no, or very few, NaNs. The graph will then display correctly, as it no longer rises and falls to accommodate the NaNs. This is not something you would want to do for many items, but it is much easier than the second option.

The second option is to create a new PNP template that tells pnp4nagios to create the correct RRD, as described above, for this type of service. It would work the same as the first option, but could be used for many services without issue, as it would always create the same style of RRD definition.

We can go whichever route you would like, both will require some fiddling on both of our ends to get it correct. Also let me know if that description doesn't make sense, it is a bit hard to describe.

Posted: **Wed Sep 11, 2013 7:46 am**

Hi,

I just changed the uptime service checks from check_nt to NRPE (as it has more options). As I configured the service with a template and the checks run only once a day (I think there is no need to poll a server every 5 minutes for its uptime), I might be running into the same problem as with the PowerShell script, because the graphs generated by Nagios seem to be empty, except for the first few checks.
So I would really like to find out how I can create this new PNP template so that my graphs are consistent even when checks run less often than every 5 minutes.
Please let me know how I should proceed with this.

Something else I would like to know is whether it's possible to disable perfdata completely, either for a single service or for all services that depend on a service template, as having a graph for uptime does not seem worth the hassle.

This is the command I use:
$USER1$/check_nrpe -H $HOSTADDRESS$ -t 30 -c CheckUpTime -a MaxWarn=$ARG1$ MaxCrit=$ARG2$ ShowAll
I tried adding perfData=false, but apparently CheckUpTime doesn't know that argument.
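For reference on the perfdata question: Nagios Core object definitions have a `process_perf_data` directive for hosts and services, and a value of 0 tells Nagios not to process performance data for that object. The fragment below is a minimal sketch only. The template name `daily-uptime-service` is hypothetical, `generic-service` is assumed to exist as a parent template, and whether existing graphs disappear afterwards is not something this thread confirms.

```
define service {
    name                    daily-uptime-service   ; hypothetical template name
    use                     generic-service        ; assumed parent template
    check_interval          1440                   ; once per day, assuming interval_length=60
    process_perf_data       0                      ; do not process perfdata for services using this template
    register                0                      ; template only, not a real service
}
```

Services that `use` this template would inherit the setting; a single service can also set `process_perf_data 0` directly.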

Thanks again for the help, and sorry for the late answer, but I've been kind of swamped with work lately.

Grtz

Passing arguments to Powershell