Issue with MSSQL Alerting Thresholds

Posted: Mon Apr 29, 2013 2:38 am
by dbsaust
Hi,

I have been struggling for some time to get the alerting thresholds to be interpreted correctly. We have set up a stored procedure to be triggered by the Nagios client as follows:

-H XX.XX.XX.XX --port 1433 --username XXXXXX --password "XXXXXXX" --database XXXXXX --query "EXECUTE usp_Alert_JobEvent" --result 0 --warning 1 --critical 1

The stored procedure returns a result of 1 if there is an issue. However, I can't seem to get Nagios to interpret that as an alert. Have I perhaps set this up incorrectly?

Re: Issue with MSSQL Alerting Thresholds

Posted: Mon Apr 29, 2013 7:18 am
by scottwilkerson
Leave off the --warning 1 --critical 1 options. If present, they are used as thresholds on how long the query takes, whereas --result is matched against the query output.
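
As a rough illustration of that distinction, the sketch below treats the thresholds as limits on query duration in seconds. The function name `duration_state` and the strict `>` comparison are assumptions made for illustration; this is not the plugin's actual code.

```php
<?php
// Hypothetical sketch, NOT taken from check_mssql: it only illustrates
// thresholds being compared against the query duration (in seconds)
// rather than against the value the query returns.
function duration_state($duration, $warning = null, $critical = null) {
    if ($critical !== null && $duration > $critical) {
        return "CRITICAL";
    }
    if ($warning !== null && $duration > $warning) {
        return "WARNING";
    }
    return "OK";
}

// A query taking about 0.011 seconds is far below a 1 second threshold.
echo duration_state(0.011, 1, 1), "\n";
// With thresholds of 0, any measurable duration exceeds them.
echo duration_state(0.011, 0, 0), "\n";
```

Under these assumptions, thresholds of 1 leave a fast query OK regardless of what it returns, while thresholds of 0 flag every run.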

Re: Issue with MSSQL Alerting Thresholds

Posted: Tue Apr 30, 2013 2:59 am
by dbsaust
Hi Scott,

I removed the critical and warning parameters as advised, i.e.:
-H XX.XX.XX.XX --port 1433 --username XXXXXX --password "XXXXXXX" --database XXXXXX --query "EXECUTE usp_Alert_JobEvent" --result 0

Then I updated the stored procedure to return "1" in the OK case (just to test a critical alert response). However, after triggering the check command, the Nagios client still gave me an OK message, even though --result expects "0" and the procedure returned "1". Any idea why?

Re: Issue with MSSQL Alerting Thresholds

Posted: Tue Apr 30, 2013 1:05 pm
by scottwilkerson
Are you getting output "Query results matched"?

Re: Issue with MSSQL Alerting Thresholds

Posted: Wed May 01, 2013 12:16 am
by dbsaust
Hi Scott,

From the command line:
[root@aunagiosxi libexec]# ./check_mssql -H XX.XX.XX.XX --port 1433 --username XXXXX --password "XXXXXX" --database XXXXXXX --query "EXECUTE usp_Alert_JobEventDataNotPopulating" --result 1
OK: Query results matched, query duration=0.011417 seconds.

[root@aunagiosxi libexec]# ./check_mssql -H XX.XX.XX.XX --port 1433 --username XXXXX --password "XXXXXX" --database XXXXXXX --query "EXECUTE usp_Alert_JobEventDataNotPopulating" --result 0
OK: Query duration=0.011783 seconds.

So it is registering OK, just not interpreting the alert correctly. When I add the warning and critical parameters, I get different results intermittently. It's all very strange:

[root@aunagiosxi libexec]# ./check_mssql -H XX.XX.XX.XX --port 1433 --username XXXXX --password "XXXXXX" --database XXXXXXX --query "EXECUTE usp_Alert_JobEventDataNotPopulating" --result 1 --warning 0 --critical 0
CRITICAL: Query expected "1" but got "0".

[root@aunagiosxi libexec]# ./check_mssql -H XX.XX.XX.XX --port 1433 --username XXXXX --password "XXXXXX" --database XXXXXXX --query "EXECUTE usp_Alert_JobEventDataNotPopulating" --result 0 --warning 1 --critical 1
OK: Query duration=0.011621 seconds.

[root@aunagiosxi libexec]# ./check_mssql -H XX.XX.XX.XX --port 1433 --username XXXXX --password "XXXXXX" --database XXXXXXX --query "EXECUTE usp_Alert_JobEventDataNotPopulating" --result 1 --warning 0 --critical 0
CRITICAL: Query results matched, query duration=0.010963 seconds.

Re: Issue with MSSQL Alerting Thresholds

Posted: Wed May 01, 2013 11:13 am
by scottwilkerson
I believe you have a version of the plugin that has a bug in it.

Open /usr/local/nagios/libexec/check_mssql

around line 302 change

if ($querytype == "query" && !empty($expected_result)) {
to

if ($querytype == "query" && isset($expected_result)) {
This was a bug that was discovered, corrected, and then crept back into the code...
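
The one-word change matters because PHP's `empty()` treats the string "0" as empty, while `isset()` only checks that the variable is set and not null. Assuming the plugin keeps the `--result` value as the string it was given, `--result 0` would make the original condition false and skip the comparison entirely. A minimal sketch of the two checks follows; the helper names are hypothetical and not part of check_mssql.

```php
<?php
// Hypothetical helpers that isolate the two conditions from the fix above.
// They are for illustration only and do not appear in check_mssql.
function compared_with_empty($expected_result) {
    // Original condition: false for "0", "", null, 0, and other "empty" values.
    return !empty($expected_result);
}

function compared_with_isset($expected_result) {
    // Corrected condition: false only when the value is null (or unset).
    return isset($expected_result);
}

// Show which expected values would trigger the result comparison.
foreach (array("0", "1", null) as $value) {
    printf(
        "%-5s !empty=%-6s isset=%s\n",
        var_export($value, true),
        var_export(compared_with_empty($value), true),
        var_export(compared_with_isset($value), true)
    );
}
```

With an expected result of "0", only the `isset()` form lets the comparison run. With "1", both forms agree.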

Re: Issue with MSSQL Alerting Thresholds

Posted: Thu May 09, 2013 1:45 am
by dbsaust
Hi Scott, I updated the code and that appears to have fixed the issue. All of a sudden, all of my queries are firing alerts, which they did not do in the past!

So thank you very much for that insight into the issue. It's had me scratching my head for quite some time!