MSSQL Query Issue
-
FrontlineIT
- Posts: 94
- Joined: Tue Jul 26, 2016 8:46 am
MSSQL Query Issue
Hello. We're having issues with certain MSSQL query checks that work on an older version of Nagios XI (5.4.13) but do not work on the newer Nagios XI versions (5.6.5). Below is an example query we're having trouble with. You can see the query works fine and returns an okay status as should be on v5.4.13. However, on v5.6.5 it returns a "Query result FE was higher than Query critical threshold 180" when that is clearly not the case. Can I get some assistance?
--username ***** --password "******" --database ***** --query "select Max (DateDiff (ss, LastUpdate, GetDate())),AppID from appstatus (nolock) where appprefix = 'ComSrv' and appid = 'FE' and appinst = 51 and partitionnum >= 0 group by AppID" --result "" --decode --warning 50 --critical 200 --querycritical 180
Old Nagios XI Result (5.4.13)
/usr/local/nagios/libexec/check_mssql -H ********** --username ***** --password "*****" --database ***** --query "select Max (DateDiff (ss, LastUpdate, GetDate())),AppID from appstatus (nolock) where appprefix = 'ComSrv' and appid = 'FE' and appinst = 51 and partitionnum >= 0 group by AppID" --result "" --decode --warning 50 --critical 200 --querycritical 180
OK: Query duration=0.032597 seconds.|query_duration=0.032597s;50;200 'computed'=61;;180
New Nagios XI Result (5.6.5)
/usr/local/nagios/libexec/check_mssql -H ***** --username **** --password "****" --database **** --query "select Max (DateDiff (ss, LastUpdate, GetDate())),AppID from appstatus (nolock) where appprefix = 'ComSrv' and appid = 'FE' and appinst = 51 and partitionnum >= 0 group by AppID" --result "" --decode --warning 50 --critical 200 --querycritical 180
--username ***** --password "******" --database ***** --query "select Max (DateDiff (ss, LastUpdate, GetDate())),AppID from appstatus (nolock) where appprefix = 'ComSrv' and appid = 'FE' and appinst = 51 and partitionnum >= 0 group by AppID" --result "" --decode --warning 50 --critical 200 --querycritical 180
Old Nagios XI Result (5.4.13)
/usr/local/nagios/libexec/check_mssql -H ********** --username ***** --password "*****" --database ***** --query "select Max (DateDiff (ss, LastUpdate, GetDate())),AppID from appstatus (nolock) where appprefix = 'ComSrv' and appid = 'FE' and appinst = 51 and partitionnum >= 0 group by AppID" --result "" --decode --warning 50 --critical 200 --querycritical 180
OK: Query duration=0.032597 seconds.|query_duration=0.032597s;50;200 'computed'=61;;180
New Nagios XI Result (5.6.5)
/usr/local/nagios/libexec/check_mssql -H ***** --username **** --password "****" --database **** --query "select Max (DateDiff (ss, LastUpdate, GetDate())),AppID from appstatus (nolock) where appprefix = 'ComSrv' and appid = 'FE' and appinst = 51 and partitionnum >= 0 group by AppID" --result "" --decode --warning 50 --critical 200 --querycritical 180
You do not have the required permissions to view the files attached to this post.
Re: MSSQL Query Issue
You have --querycritical set to 180. querycritical measures the number of records returned, rather than the amount of time taken to execute the query. That still leaves us with the question of why one is alerting, but the other is not. It may have been a bug in the earlier version of check_mssql. If you run your query in SQL Server Management Studio, do you get more than 180 records returned?
As of May 25th, 2018, all communications with Nagios Enterprises and its employees are covered under our new Privacy Policy.
Be sure to check out our Knowledgebase for helpful articles and solutions!
Be sure to check out our Knowledgebase for helpful articles and solutions!
-
FrontlineIT
- Posts: 94
- Joined: Tue Jul 26, 2016 8:46 am
Re: MSSQL Query Issue
Hmmm...The check_mssql command is working fine in the earlier version of Nagios XI and the --querycritical is setup correctly to catch if an FE record hasn't been updated in > 180 seconds. The problem lies with the newer version of Nagios XI. Right now everything should be green but the newer version of Nagios XI returns the error as seen below in the screenshot. The older version of Nagios XI is returning an okay status and has the correct query critical result in seconds as shown in the Core Config Manager output below.
Re: MSSQL Query Issue
The old version is running the way you want it to run, but it's not necessarily running correctly.
--critical 200
The time to execute the query takes longer than 200 seconds.
Or
--querycritical 180
There are more than 180 records returned in the query.
I feel like there is a mystery here. But instead of diving right into that mystery, let's take the simpler route. The big question is, what do you want it to alert on? Do you want it to alert just on query execution time, or do you also want it alerting on query result count?
So as your command is shown right now, your check should go critical if-w, --warning Warning threshold in seconds on duration of check
-c, --critical Critical threshold in seconds on duration of check
-W, --querywarning Query warning threshold
-C, --querycritical Query critical threshold
--critical 200
The time to execute the query takes longer than 200 seconds.
Or
--querycritical 180
There are more than 180 records returned in the query.
I feel like there is a mystery here. But instead of diving right into that mystery, let's take the simpler route. The big question is, what do you want it to alert on? Do you want it to alert just on query execution time, or do you also want it alerting on query result count?
As of May 25th, 2018, all communications with Nagios Enterprises and its employees are covered under our new Privacy Policy.
Be sure to check out our Knowledgebase for helpful articles and solutions!
Be sure to check out our Knowledgebase for helpful articles and solutions!
-
FrontlineIT
- Posts: 94
- Joined: Tue Jul 26, 2016 8:46 am
Re: MSSQL Query Issue
I want it to alert based on the query result count. As seen in the attached screen shot directly from SQL Server Management Studio, the result count was currently sitting at 104 in the highlighted cell. I then ran the check manually through the Core Config Manager and got a similar query result count of 113 (though slightly different due to running a few seconds later) as seen below:
/usr/local/nagios/libexec/check_mssql -H ******** --username ***** --password "*****" --database ***** --query "select Max (DateDiff (ss, LastUpdate, GetDate())),AppID from appstatus (nolock) where appprefix = 'ComSrv' and appid = 'FE' and appinst = 51 and partitionnum >= 0 group by AppID" --result "" --decode --warning 50 --critical 200 --querycritical 180
OK: Query duration=0.032689 seconds.|query_duration=0.032689s;50;200 'computed'=113;;180
/usr/local/nagios/libexec/check_mssql -H ******** --username ***** --password "*****" --database ***** --query "select Max (DateDiff (ss, LastUpdate, GetDate())),AppID from appstatus (nolock) where appprefix = 'ComSrv' and appid = 'FE' and appinst = 51 and partitionnum >= 0 group by AppID" --result "" --decode --warning 50 --critical 200 --querycritical 180
OK: Query duration=0.032689 seconds.|query_duration=0.032689s;50;200 'computed'=113;;180
You do not have the required permissions to view the files attached to this post.
-
FrontlineIT
- Posts: 94
- Joined: Tue Jul 26, 2016 8:46 am
Re: MSSQL Query Issue
I want the monitor to alert if that highlighted cell contains an integer > 180 which it does in the older Nagios XI version, just not the current version.
Re: MSSQL Query Issue
Ah, excellent, thank you for clarifying. I was caught up on the query execution time.
Can you try changing --result "" to --result "(No column name)" and see if that gets Nagios looking at the correct field? Otherwise the next step would be to select that aggregate column as a name and have --result look for that name.
I'm going to go back and see if I can find when/why this changed.
Can you try changing --result "" to --result "(No column name)" and see if that gets Nagios looking at the correct field? Otherwise the next step would be to select that aggregate column as a name and have --result look for that name.
I'm going to go back and see if I can find when/why this changed.
As of May 25th, 2018, all communications with Nagios Enterprises and its employees are covered under our new Privacy Policy.
Be sure to check out our Knowledgebase for helpful articles and solutions!
Be sure to check out our Knowledgebase for helpful articles and solutions!
-
FrontlineIT
- Posts: 94
- Joined: Tue Jul 26, 2016 8:46 am
Re: MSSQL Query Issue
I changed the --result '''' to --result "(No column name)" but that did not resolve the issue. I have reached out to our DBA team for some assistance on selecting that aggregate column as a name and have --result look for that name. Will let you know how it goes.
Re: MSSQL Query Issue
Alright, we will wait to hear back.
As of May 25th, 2018, all communications with Nagios Enterprises and its employees are covered under our new Privacy Policy.
Be sure to check out our Knowledgebase for helpful articles and solutions!
Be sure to check out our Knowledgebase for helpful articles and solutions!
-
FrontlineIT
- Posts: 94
- Joined: Tue Jul 26, 2016 8:46 am
Re: MSSQL Query Issue
I made the change to select that aggregate column as a name (I named the column "test") and have --result look for that name ("test") but am still having the same issue. I highlighted in bold the additions I made to the query and check below. I confirmed that the column name in SSMS is indeed named test. Is their a syntax issue with the --result section?
--username ***** --password "******" --database ***** --query "SELECT MAX(DateDiff (ss, LastUpdate, GetDate())) AS [test], AppID FROM appstatus (nolock) WHERE appprefix = 'ComSrv' AND appid = 'FE' AND appinst = 51 and partitionnum >= 0 GROUP BY AppID" --result "test" --decode --warning 50 --critical 200 --querycritical 180
--username ***** --password "******" --database ***** --query "SELECT MAX(DateDiff (ss, LastUpdate, GetDate())) AS [test], AppID FROM appstatus (nolock) WHERE appprefix = 'ComSrv' AND appid = 'FE' AND appinst = 51 and partitionnum >= 0 GROUP BY AppID" --result "test" --decode --warning 50 --critical 200 --querycritical 180