"UNKNOWN" services checks status while monitoring our window
Posted: Thu Mar 28, 2019 9:09 am
Hi,
We are seeing many service checks return an "UNKNOWN" status while monitoring our Windows servers.
The Nagios log is full of error messages like the following:
"CURRENT SERVICE STATE: DUUH-XXXX-01.XXXX.corp;Performance - TCPv4/Connection Failures;UNKNOWN;HARD;3;UNKNOWN: Error occurred while running the plugin. Use the verbose flag for more details."
"SERVICE ALERT: XXXX-XX-04.XXX.local;Telnet Socket - pcp+ptr snap_players_count;UNKNOWN;SOFT;1;UNKNOWN: Error occurred while running the plugin. Use the verbose flag for more details.."
It seems that many service checks fail on their first attempt and then succeed on the following checks.
We were NOT able to reproduce the failure manually while using verbose mode.
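To get a rough idea of how many of these UNKNOWN results are first-attempt (SOFT) failures, the log entries can be split on their semicolon-separated fields (host; service; state; state type; attempt; output). This is only a minimal sketch: the function name is made up for illustration, and it assumes the service description itself contains no semicolons.

```python
from collections import Counter


def parse_service_entry(line):
    """Split one service log entry into its semicolon-separated fields."""
    # Everything before the first ": " is the entry type
    # (e.g. "SERVICE ALERT"), possibly preceded by a timestamp.
    entry_type, _, rest = line.partition(": ")
    # The plugin output may contain ";" itself, so split at most 5 times.
    host, service, state, state_type, attempt, output = rest.split(";", 5)
    return {
        "entry_type": entry_type,
        "host": host,
        "service": service,
        "state": state,
        "state_type": state_type,
        "attempt": int(attempt),
        "output": output,
    }


# The two entries quoted in this post.
sample = [
    "CURRENT SERVICE STATE: DUUH-XXXX-01.XXXX.corp;Performance - TCPv4/Connection Failures;"
    "UNKNOWN;HARD;3;UNKNOWN: Error occurred while running the plugin. "
    "Use the verbose flag for more details.",
    "SERVICE ALERT: XXXX-XX-04.XXX.local;Telnet Socket - pcp+ptr snap_players_count;"
    "UNKNOWN;SOFT;1;UNKNOWN: Error occurred while running the plugin. "
    "Use the verbose flag for more details..",
]

entries = [parse_service_entry(line) for line in sample]

# Count UNKNOWN results per (state type, attempt number).
unknown_by_attempt = Counter(
    (e["state_type"], e["attempt"]) for e in entries if e["state"] == "UNKNOWN"
)
print(unknown_by_attempt)
```

Running the same function over the full nagios.log would show whether most UNKNOWN results are SOFT attempt 1, as we suspect.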
On the remote monitored Windows server, the local NCPA log contains the following error message:
"2019-03-28 12:35:04,694:ERROR:windowscounters:(-1073738822, 'GetFormattedCounterValue', 'The returned data is not valid.')
Traceback (most recent call last):
  File "C:\ncpa\agent\listener\windowscounters.py", line 43, in counter_method
    return WindowsCountersNode.get_counter_val(self.name, *args, **kwargs)
  File "C:\ncpa\agent\listener\windowscounters.py", line 79, in get_counter_val
    _, value = win32pdh.GetFormattedCounterValue(counter, win32pdh.PDH_FMT_DOUBLE)
error: (-1073738822, 'GetFormattedCounterValue', 'The returned data is not valid.')
2019-03-28 12:35:08,891:ERROR:windowscounters:(-1073738822, 'GetFormattedCounterValue', 'The returned data is not valid.')
Traceback (most recent call last):
  File "C:\ncpa\agent\listener\windowscounters.py", line 43, in counter_method
    return WindowsCountersNode.get_counter_val(self.name, *args, **kwargs)
  File "C:\ncpa\agent\listener\windowscounters.py", line 79, in get_counter_val
    _, value = win32pdh.GetFormattedCounterValue(counter, win32pdh.PDH_FMT_DOUBLE)
error: (-1073738822, 'GetFormattedCounterValue', 'The returned data is not valid.') "
The strange thing about this traceback is that our NCPA installation is not in C:\ncpa but on drive D:\
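For reference, the negative number in the traceback is a signed 32-bit status code. Converting it to unsigned hex makes it easier to look up; the helper name below is made up for illustration. As far as I can tell the result, 0xC0000BBA, corresponds to PDH_CSTATUS_INVALID_DATA in the Windows PDH headers, but that mapping is my own reading and should be double-checked.

```python
def to_unsigned_hex(code):
    """Render a signed 32-bit status code as unsigned hexadecimal."""
    # Masking with 0xFFFFFFFF reinterprets the negative value as unsigned.
    return "0x{:08X}".format(code & 0xFFFFFFFF)


# Error code taken from the NCPA log in this post.
pdh_code = -1073738822
print(to_unsigned_hex(pdh_code))
```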
Our environment is VMware-based.
Our current configuration is:
Nagios XI version: 5.5.5
Number of services: 6560
Number of hosts: 456
NCPA version: 2.1.5
We would appreciate your assistance with this problem.