Windows Servers incorrectly reported as DOWN

This support forum board is for support questions relating to Nagios XI, our flagship commercial network monitoring solution.
macarranza
Posts: 14
Joined: Fri May 18, 2018 5:26 pm

Post by macarranza »

We are monitoring Windows servers with WMI.
Occasionally, a server is reported as DOWN even though it is actually UP.
The following are the recurring symptoms:
1. The server's "HOST DETAILS" page shows the host as Down with this message:
"UNKNOWN - Plugin Timed out (15 sec). There are multiple possible reasons for this, some of them include - The host 172.24.64.30 might just be really busy, it might not even be running Windows".
2. No data is displayed in the graph for server availability (the data for CPU, memory, etc. is available).
3. When we force the check or do a ping, we do get a response.
4. The server is not "really busy" at the time of the checks.
5. Sometimes restarting the WMI service fixes the issue.

Please help
npolovenko
Support Tech
Posts: 3457
Joined: Mon May 15, 2017 5:00 pm

Post by npolovenko »

@macarranza, Have you tried increasing the timeout from 15 seconds to 40 seconds? Just add -t 40 at the end of your command.
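To show why the timeout value matters, here is a minimal Python sketch (not the actual plugin code) of a runner that gives up on a check after a fixed number of seconds and reports UNKNOWN, similar in spirit to the "Plugin Timed out" message above. The `run_check` function and the sleeping child process are hypothetical stand-ins for a slow WMI query.

```python
import subprocess
import sys

UNKNOWN = 3  # standard Nagios plugin exit code for UNKNOWN


def run_check(argv, timeout_sec):
    """Run a check command; report UNKNOWN if it exceeds timeout_sec."""
    try:
        done = subprocess.run(
            argv, capture_output=True, text=True, timeout=timeout_sec
        )
    except subprocess.TimeoutExpired:
        return UNKNOWN, f"UNKNOWN - Plugin Timed out ({timeout_sec} sec)"
    return done.returncode, done.stdout.strip()


# Stand-in for a slow check: a child process that sleeps, then reports OK.
slow_check = [sys.executable, "-c", "import time; time.sleep(1); print('OK')"]

print(run_check(slow_check, timeout_sec=0.2))  # timeout too short for this check
print(run_check(slow_check, timeout_sec=10))   # timeout long enough
```

A longer timeout only helps if the check would eventually finish; if the query hangs indefinitely, the result is still UNKNOWN, just later.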
As of May 25th, 2018, all communications with Nagios Enterprises and its employees are covered under our new Privacy Policy.
macarranza
Posts: 14
Joined: Fri May 18, 2018 5:26 pm

Post by macarranza »

No, we have not. I will do it. But it's not a timeout issue. There is information for CPU, memory, ping and the C drive (see attachment). It's just reporting the server as down for some reason. If I force the check, it reports success... but the server still shows as down.
macarranza
Posts: 14
Joined: Fri May 18, 2018 5:26 pm

Post by macarranza »

BTW, to which command should I add the -t 40?
npolovenko
Support Tech
Posts: 3457
Joined: Mon May 15, 2017 5:00 pm

Post by npolovenko »

@macarranza, There are different protocols involved when you run the WMI checks and when you run a ping check. Ping discovery can be turned off but the WMI checks will still work. Or you may have some other discovery issues on the network.
Please go to the Core Config Manager, click on Hosts, open the settings for the WMI host, and take a screenshot of that page for me. The config menu usually looks like this:
Untitled.png
macarranza
Posts: 14
Joined: Fri May 18, 2018 5:26 pm

Post by macarranza »

Thanks. Here is the screenshot.
I see WMI is using drive C to check if the host is alive. Is this correct?
Is it better to do it with ping?
All our Windows servers were set up the same way. Can they all be changed at the same time?

Thanks
lmiltchev
Bugs find me
Posts: 13589
Joined: Mon May 23, 2011 12:15 pm

Post by lmiltchev »

Usually, you would use check_icmp (check_ping) for the host check. Here's an example of a "default" template that is used for a Windows WMI host, added by the "Windows WMI" wizard:
example01.PNG
You could either add the "xiwizard_windowswmi_host" template to the host, or define the check command directly on a host level under the CCM.
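To illustrate why the choice of host check command matters, here is a simplified Python sketch (not Nagios code). It assumes the host state is derived only from the result of the host check command, independently of the service checks, and that OK and WARNING count as UP while CRITICAL and UNKNOWN count as DOWN. Retries and the UNREACHABLE state are left out, and all names are hypothetical.

```python
from enum import IntEnum


class PluginResult(IntEnum):
    """Standard Nagios plugin exit codes."""
    OK = 0
    WARNING = 1
    CRITICAL = 2
    UNKNOWN = 3


def host_state(host_check: PluginResult) -> str:
    """Simplified mapping from a host check result to a host state.

    Assumption: OK and WARNING count as UP, everything else as DOWN.
    Retries and the UNREACHABLE state are ignored.
    """
    if host_check in (PluginResult.OK, PluginResult.WARNING):
        return "UP"
    return "DOWN"


# Hypothetical results for one polling cycle on the same server.
wmi_check_timed_out = PluginResult.UNKNOWN  # WMI-based check hit its timeout
ping_check = PluginResult.OK                # ICMP check got a reply

print(host_state(wmi_check_timed_out))  # host check is the WMI command
print(host_state(ping_check))           # host check is ping
```

Under these assumptions, a host whose check command is a WMI query shows as DOWN whenever that one query times out, even if ping and the other service checks succeed. With a ping-based host check, the same server shows as UP.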
Be sure to check out our Knowledgebase for helpful articles and solutions!
macarranza
Posts: 14
Joined: Fri May 18, 2018 5:26 pm

Post by macarranza »

Thanks.
That helped.
We changed all Windows servers to "no command (blank)" for the "check command" option. Apparently this makes the host-is-alive check use ping and not CPU on all servers.
Thanks
MC
lmiltchev
Bugs find me
Posts: 13589
Joined: Mon May 23, 2011 12:15 pm

Post by lmiltchev »

I am glad I was able to help! I will be closing this topic now. If you have any more questions/issues, please start a new thread.