Looking for help with Nagios Core 3.5.0 and Windows servers
Posted: Tue Oct 11, 2016 12:04 pm
We are running an old version of Nagios (intentionally) and are seeing an issue with the ping check.
Most of the time the ping check works fine. Occasionally we see a failure followed by a recovery 30 seconds later:
[10-10-2016 21:45:21] SERVICE ALERT: mylocalserver;PING;CRITICAL;HARD;1;CRITICAL - 999.999.99.99: rta nan, lost 100%
[10-10-2016 21:45:51] SERVICE ALERT: mylocalserver;PING;OK;HARD;1;OK - 999.999.99.99: rta 2.694ms, lost 0%
This server is a Windows server. Other Windows servers occasionally (maybe once or twice a month) show the same behavior.
When we look at the Nagios details, it reports 100% uptime, apparently ignoring these brief moments of lost pings.
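To put a number on how short these blips are, here is a minimal sketch that pairs each CRITICAL alert with the next OK for the same host and service and reports the gap in seconds. It assumes the log lines look exactly like the two quoted above and that the timestamp is month-day-year; the names `parse_alert` and `outage_durations` are made up for illustration and are not part of Nagios.

```python
import re
from datetime import datetime

# Matches lines shaped like the two alerts quoted above:
# [timestamp] SERVICE ALERT: host;service;state;state_type;attempt;output
ALERT_RE = re.compile(
    r"\[(?P<ts>[^\]]+)\] SERVICE ALERT: "
    r"(?P<host>[^;]+);(?P<service>[^;]+);(?P<state>[^;]+);"
    r"(?P<state_type>[^;]+);(?P<attempt>\d+);(?P<output>.*)"
)


def parse_alert(line):
    """Return a dict for a SERVICE ALERT line, or None if it does not match."""
    m = ALERT_RE.match(line.strip())
    if m is None:
        return None
    rec = m.groupdict()
    # Assumption: month-day-year order, as the sample lines suggest.
    rec["ts"] = datetime.strptime(rec["ts"], "%m-%d-%Y %H:%M:%S")
    rec["attempt"] = int(rec["attempt"])
    return rec


def outage_durations(lines):
    """Pair each CRITICAL with the next OK for the same host and service."""
    open_since = {}
    results = []
    for line in lines:
        rec = parse_alert(line)
        if rec is None:
            continue
        key = (rec["host"], rec["service"])
        if rec["state"] == "CRITICAL":
            # Keep the first CRITICAL timestamp if several arrive in a row.
            open_since.setdefault(key, rec["ts"])
        elif rec["state"] == "OK" and key in open_since:
            start = open_since.pop(key)
            seconds = (rec["ts"] - start).total_seconds()
            results.append((key[0], key[1], seconds))
    return results


SAMPLE = [
    "[10-10-2016 21:45:21] SERVICE ALERT: mylocalserver;PING;CRITICAL;HARD;1;"
    "CRITICAL - 999.999.99.99: rta nan, lost 100%",
    "[10-10-2016 21:45:51] SERVICE ALERT: mylocalserver;PING;OK;HARD;1;"
    "OK - 999.999.99.99: rta 2.694ms, lost 0%",
]

if __name__ == "__main__":
    for host, service, seconds in outage_durations(SAMPLE):
        print(f"{host} {service}: {seconds:.0f}s between CRITICAL and OK")
```

For the two lines above this gives a 30-second gap, which may be why the uptime figure still rounds to 100%.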
Someone locally has suggested that perhaps too many pings are occurring around the same time.
Does any of this sound familiar to anyone? We don't want to start making changes willy-nilly without some confirmation that the direction makes sense. Is there a good place to begin, or a document we could read to help us troubleshoot this?
People are frustrated by getting calls about server issues, but when they log into their server, there's no sign of an issue.
Help us, Obi-Wan...