Windows Servers incorrectly reported as DOWN
Posted: Fri Jun 01, 2018 12:10 pm
by macarranza
We are monitoring Windows servers with WMI.
Occasionally, we get reports that a server is DOWN when it is actually UP.
The following are recurring issues:
1. The "HOST DETAILS" page for the server shows Down with this message:
"UNKNOWN - Plugin Timed out (15 sec). There are multiple possible reasons for this, some of them include - The host 172.24.64.30 might just be really busy, it might not even be running Windows".
2. No data is displayed in the graph for server availability (the information for CPU, memory, etc. is available).
3. When we force the check or do a ping, we do get a response.
4. The server is not "really busy" at the time of the checks.
5. Sometimes restarting the WMI service fixes the issue.
Please help.
Re: Windows Servers incorrectly reported as DOWN
Posted: Fri Jun 01, 2018 1:46 pm
by npolovenko
@macarranza, Have you tried increasing the timeout from 15 seconds to 40 seconds? Just add -t 40 at the end of your command.
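As a rough illustration of what the "Plugin Timed out" message means, here is a minimal Python sketch (not the actual check_wmi_plus code) of a wrapper that gives a check a fixed time budget and reports UNKNOWN, exit code 3 in the Nagios plugin convention, when the budget runs out. `run_check` and `SLOW_PLUGIN` are made-up names; the "plugin" is just a Python one-liner standing in for a WMI query that takes a while to answer.

```python
import subprocess
import sys

UNKNOWN = 3  # Nagios plugin exit codes: 0 OK, 1 WARNING, 2 CRITICAL, 3 UNKNOWN


def run_check(cmd, timeout_sec):
    """Run a plugin command; return (exit_code, first line of its output)."""
    try:
        proc = subprocess.run(
            cmd, capture_output=True, text=True, timeout=timeout_sec
        )
    except subprocess.TimeoutExpired:
        # The child is killed and the check is reported as UNKNOWN.
        return UNKNOWN, f"UNKNOWN - Plugin Timed out ({timeout_sec} sec)"
    lines = proc.stdout.strip().splitlines()
    return proc.returncode, (lines[0] if lines else "")


# Hypothetical stand-in for a slow WMI query: sleeps 1 second, then reports OK.
SLOW_PLUGIN = [
    sys.executable,
    "-c",
    "import time; time.sleep(1); print('OK - fake check')",
]

if __name__ == "__main__":
    print(run_check(SLOW_PLUGIN, 0.2))  # budget too short: UNKNOWN
    print(run_check(SLOW_PLUGIN, 10))   # larger budget: the check completes
```

With the short budget the same check comes back UNKNOWN even though nothing is wrong with the target, which is the situation a larger `-t` value is meant to rule out.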
Re: Windows Servers incorrectly reported as DOWN
Posted: Fri Jun 01, 2018 2:59 pm
by macarranza
No, we have not. I will do it. But it's not a timeout issue. There is information for CPU, memory, ping and the C drive (see attachment). It's just reporting the server as down for some reason. If I force the check, it reports success... but the server still shows as down.
Re: Windows Servers incorrectly reported as DOWN
Posted: Fri Jun 01, 2018 3:19 pm
by macarranza
BTW, to which command should I add the -t 40?
Re: Windows Servers incorrectly reported as DOWN
Posted: Fri Jun 01, 2018 3:36 pm
by npolovenko
@macarranza, There are different protocols involved when you run the WMI checks and when you run a ping check. Ping discovery can be turned off but the WMI checks will still work. Or you may have some other discovery issues on the network.
Please go to the Core Config Manager and click on Hosts, then open the WMI host settings and take a screenshot of that page for me. The config menu usually looks like this:
Untitled.png
Re: Windows Servers incorrectly reported as DOWN
Posted: Fri Jun 01, 2018 5:58 pm
by macarranza
Thanks. Here is the screenshot.
I see WMI is using drive C to check if the host is alive. Is this correct?
Is it better to do it with ping?
All our Windows servers were set up the same way. Can they all be changed at the same time?
Thanks
Re: Windows Servers incorrectly reported as DOWN
Posted: Mon Jun 04, 2018 9:43 am
by lmiltchev
Usually, you would use check_icmp (check_ping) for the host check. Here's an example of a "default" template that is used for a Windows WMI host, added by the "Windows WMI" wizard:
example01.PNG
You could either add the "xiwizard_windowswmi_host" template to the host, or define the check command directly at the host level in the CCM.
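To make the second option concrete, here is a small Python sketch of what "define the check command at the host level" amounts to for many hosts at once. It assumes Nagios Core-style plain-text object definitions; in Nagios XI the CCM manages these definitions, so treat this only as an illustration of the change, not as a way to edit an XI install. `set_host_check`, the sample host, and the `check_wmi_drive_c` command are hypothetical; `check-host-alive` is assumed to be the ping-based command from the stock Nagios sample configuration.

```python
import re

# Matches "define host { ... }" but not "define hostgroup" or "define service".
HOST_BLOCK = re.compile(r"(define\s+host\s*\{)(.*?)(\})", re.DOTALL)
CHECK_LINE = re.compile(r"^([ \t]*)check_command[ \t]+.*$", re.MULTILINE)


def set_host_check(config_text, new_command="check-host-alive"):
    """Return config_text with every host's check_command set to new_command."""

    def rewrite(match):
        header, body, closer = match.groups()
        if CHECK_LINE.search(body):
            # Replace the existing directive, keeping its indentation.
            body = CHECK_LINE.sub(
                lambda m: f"{m.group(1)}check_command   {new_command}", body
            )
        else:
            # No directive yet: append one before the closing brace.
            body = body.rstrip() + f"\n    check_command   {new_command}\n"
        return header + body + closer

    return HOST_BLOCK.sub(rewrite, config_text)


# Hypothetical sample: the host check wrongly reuses a WMI drive check.
SAMPLE = """\
define host {
    host_name       winsrv01
    address         192.0.2.10
    check_command   check_wmi_drive_c
}

define service {
    host_name            winsrv01
    service_description  Drive C
    check_command        check_wmi_drive_c
}
"""

if __name__ == "__main__":
    print(set_host_check(SAMPLE))
```

Only the host definition is touched; the Drive C service keeps its WMI check, so disk monitoring continues while host availability is decided by ping.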
Re: Windows Servers incorrectly reported as DOWN
Posted: Thu Jun 07, 2018 11:06 am
by macarranza
Thanks.
That helped.
We changed all Windows servers to "no command (blank)" for the "check command" option. Apparently this makes the host-is-alive check use ping and not CPU on all servers.
Thanks
MC
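A likely reason the blank setting works is Nagios object inheritance: when a host does not set a directive itself, the value comes from the template named in its `use` line. The Python sketch below models that lookup in a simplified form (a single parent per object, whereas Nagios also allows several). The `resolve` helper, the host names, and the `check_wmi_drive_c` command are hypothetical, and the template's `check-host-alive` value is an assumption rather than the wizard template's verified contents.

```python
def resolve(name, objects, directive):
    """Walk the 'use' chain until some object defines the directive."""
    seen = set()
    while name is not None and name not in seen:
        seen.add(name)  # guard against circular 'use' chains
        obj = objects.get(name)
        if obj is None:
            return None
        if directive in obj:
            return obj[directive]
        name = obj.get("use")
    return None


OBJECTS = {
    # Assumed template contents: a ping-based host check.
    "xiwizard_windowswmi_host": {"check_command": "check-host-alive"},
    # Blank check command: the value is inherited from the template.
    "winsrv01": {"use": "xiwizard_windowswmi_host"},
    # Explicit check command: it overrides the template.
    "winsrv02": {
        "use": "xiwizard_windowswmi_host",
        "check_command": "check_wmi_drive_c",
    },
}

if __name__ == "__main__":
    for host in ("winsrv01", "winsrv02"):
        print(host, resolve(host, OBJECTS, "check_command"))
```

Under these assumptions, `winsrv01` ends up with the template's ping check while `winsrv02` keeps the WMI command set on the host itself.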
Re: Windows Servers incorrectly reported as DOWN
Posted: Thu Jun 07, 2018 11:40 am
by lmiltchev
I am glad I was able to help! I will be closing this topic now. If you have any more questions/issues, please start a new thread.