check_mssql_server.py type 'exceptions.IndexError'
Posted: Mon Nov 27, 2017 4:44 pm
Afternoon everyone,
I am trying to troubleshoot an issue on one of our MSSQL clusters: when I query the SQL instance using check_mssql_server.py, I get the following error:
Code:
<type 'exceptions.IndexError'>
Caught unexpected error. This could be caused by your sysperfinfo not containing the proper entries for this query, and you may delete this service check.
What makes this especially strange is that it works on one cluster but not on another, so I've been trying to work out what's different between them. Both servers are SQL Server 2012, and both use a local "nagios" SQL user with the same database-wide permissions. Both instances are named MSSQLSERVER, and the "MSSQL Connection Time" check does work. All of the other checks produce the error above.
Based on the message above, one of the first things I tried was to use nmap to see which ports are accessible on each server. Here are the results:
NOT WORKING SERVER
Code:
Starting Nmap 6.25 ( http://nmap.org ) at 2017-11-27 14:24 EST
Nmap scan report for (not working server) (10.100.96.14)
Host is up (0.00043s latency).
Not shown: 986 closed ports
PORT STATE SERVICE
80/tcp open http
135/tcp open msrpc
139/tcp open netbios-ssn
445/tcp open microsoft-ds
1433/tcp open ms-sql-s
1556/tcp open veritas_pbx
2383/tcp open ms-olap4
2701/tcp open sms-rcinfo
3389/tcp open ms-wbt-server
5666/tcp open nrpe
13782/tcp open netbackup
49152/tcp open unknown
49153/tcp open unknown
49154/tcp open unknown
Nmap done: 1 IP address (1 host up) scanned in 2.48 seconds
WORKING SERVER
Code:
Starting Nmap 6.25 ( http://nmap.org ) at 2017-11-27 16:04 EST
Nmap scan report for (working server) (10.100.8.34)
Host is up (0.00018s latency).
rDNS record for 10.100.8.34: (hidden)
Not shown: 978 closed ports
PORT STATE SERVICE
80/tcp open http
111/tcp open rpcbind
135/tcp open msrpc
139/tcp open netbios-ssn
445/tcp open microsoft-ds
1023/tcp open netvenuechat
1433/tcp open ms-sql-s
1556/tcp open veritas_pbx
2301/tcp open compaqdiag
2381/tcp open compaq-https
2383/tcp open ms-olap4
2701/tcp open sms-rcinfo
3389/tcp open ms-wbt-server
5666/tcp open nrpe
6129/tcp open unknown
8009/tcp open ajp13
8080/tcp open http-proxy
8443/tcp open https-alt
13782/tcp open netbackup
49152/tcp open unknown
49153/tcp open unknown
49154/tcp open unknown
Nmap done: 1 IP address (1 host up) scanned in 1.57 seconds
Things I notice based on this:
* The working server has an rDNS entry ... the non-working server doesn't.
* The working server has more open ports, but the port I'd expect this check to need, 1433, is open on both servers, so I don't think the difference in open ports matters much here.
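To make the port comparison concrete, here is a small Python sketch that diffs the two lists. The port sets are copied by hand from the two scans above, and the names NOT_WORKING, WORKING and compare_ports are just placeholders I made up for illustration:

```python
# Open TCP ports copied by hand from the two nmap scans above.
NOT_WORKING = {
    80, 135, 139, 445, 1433, 1556, 2383, 2701, 3389, 5666,
    13782, 49152, 49153, 49154,
}
WORKING = {
    80, 111, 135, 139, 445, 1023, 1433, 1556, 2301, 2381, 2383,
    2701, 3389, 5666, 6129, 8009, 8080, 8443, 13782,
    49152, 49153, 49154,
}


def compare_ports(a, b):
    """Return (only_in_a, only_in_b, common) as sorted lists."""
    return sorted(a - b), sorted(b - a), sorted(a & b)


only_not_working, only_working, common = compare_ports(NOT_WORKING, WORKING)

print("Only on non-working server:", only_not_working)
print("Only on working server:", only_working)
print("1433 open on both:", 1433 in common)
```

Every port open on the non-working server is also open on the working one, which is why I doubt the port differences are the cause.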
Would love everyone's thoughts on what I might be missing here. Thanks!