Hostreport uptime 100% despite servers were restarted
Posted: Fri Apr 22, 2016 10:01 am
by mkhan12282
Hello team
We have a few Windows servers.
Two of the servers were restarted on 9 March.
Server name: Laanice1 restarted around 1733 GMT
Laanice2 restarted around 1933 GMT
The monthly report that Nagios sends out shows 100% uptime. We are trying to understand why it shows 100% uptime even though the servers were restarted.
Let me know if you need any more details from Nagios. Please find the attached Word document.
Thanks.
MK
Re: Hostreport uptime 100% despite servers were restarted
Posted: Fri Apr 22, 2016 10:22 am
by bwallace
Those results are expected if you had scheduled downtime for these servers, for maintenance or any other reason. Did you schedule downtime for these servers during the times you mentioned?
Re: Hostreport uptime 100% despite servers were restarted
Posted: Mon Apr 25, 2016 4:09 am
by mkhan12282
Hi
Sorry for the delay in replying; I was checking whether downtime had been scheduled.
I can confirm that downtime was not scheduled in Nagios.
Would it matter for Nagios reporting if the servers came back online within the 10 check attempts, i.e. within roughly 10 minutes? Take the ping check, for example; we have it set as follows.
Max check attempts: 10
Normal check interval: 3 min
Retry check Interval: 1 min
Please let me know.
Thanks.
MK
Re: Hostreport uptime 100% despite servers were restarted
Posted: Mon Apr 25, 2016 8:27 am
by nozlaf
Your host is rebooting between checks, so Nagios does not notice it.
If you check once every 3 minutes and your server takes less than 3 minutes to reboot, Nagios doesn't know. And since a problem has to pass through soft states before it becomes a hard state, the host has even more time to come back.
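To make the "between checks" point concrete, here is a rough Python sketch. It is a simplified model, not how Nagios actually schedules checks: it assumes checks fire at exactly 0, 3, 6, ... minutes, and the function name `outage_is_seen` is made up for illustration.

```python
import math


def outage_is_seen(outage_start: float, outage_minutes: float,
                   check_interval: float) -> bool:
    """Return True if a check at 0, interval, 2*interval, ... lands inside the outage.

    Simplified model (an assumption): checks run on a fixed grid and take no time.
    """
    # First scheduled check at or after the moment the host goes down.
    next_check = math.ceil(outage_start / check_interval) * check_interval
    # The outage is only noticed if that check happens before the host is back.
    return next_check < outage_start + outage_minutes


# A 2-minute reboot starting 30 seconds after a check: the next check
# (minute 3) finds the host already back up, so nothing is recorded.
print(outage_is_seen(0.5, 2.0, 3.0))  # False

# The same reboot starting at minute 2: the check at minute 3 catches it.
print(outage_is_seen(2.0, 2.0, 3.0))  # True
```

Even when a check does catch the reboot, that is only the first soft-state attempt; the host still has the retry window to recover before a hard DOWN is recorded.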
Re: Hostreport uptime 100% despite servers were restarted
Posted: Mon Apr 25, 2016 8:34 am
by eloyd
nozlaf is right. A scheduled reboot is also not considered downtime for most people. You have a 3-minute check interval and max check attempts set to 10 with 1-minute retries. That means your machine has to be unresponsive for up to 3 minutes plus (10-1) x 1 minute = 12 minutes before Nagios will alert. My recommendation would be to increase check_interval to five minutes and decrease max check attempts to 3. This means that you would only wait up to 5 minutes plus (3-1) x 1 minute = 7 minutes for a notification if the machine is unresponsive.
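The arithmetic above can be written as a small Python helper. This is only a sketch of the calculation in this post; the function name `minutes_to_hard_state` is made up here, and it assumes all intervals are given in minutes.

```python
def minutes_to_hard_state(check_interval: int, retry_interval: int,
                          max_check_attempts: int) -> int:
    """Worst-case minutes a host can be unresponsive before Nagios alerts.

    Up to one full check interval can pass before the first failed check,
    then (max_check_attempts - 1) retries follow at the retry interval.
    """
    return check_interval + (max_check_attempts - 1) * retry_interval


# Current settings: 3-minute checks, 1-minute retries, 10 attempts.
print(minutes_to_hard_state(3, 1, 10))  # 12

# Suggested settings: 5-minute checks, 1-minute retries, 3 attempts.
print(minutes_to_hard_state(5, 1, 3))   # 7
```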
Re: Hostreport uptime 100% despite servers were restarted
Posted: Mon Apr 25, 2016 10:23 am
by rkennedy
Thanks @nozlaf & @eloyd!
@mkhan12282 - they are both correct. Let us know if you have any further questions.
Re: Hostreport uptime 100% despite servers were restarted
Posted: Mon Apr 25, 2016 11:49 am
by mkhan12282
A big Thank You to all of you.
We can close this ticket.
Regards
MK