I received a ton of Host Down alert notifications. I logged in and checked: all the servers were up and pingable. I ran the uptime command, and the apps and servers were all up; none of them had restarted. The alerts recovered about 10 minutes later, after the next check. Why were we seeing these alerts?
Notification Type: PROBLEM
Host: xx.xxx.xxx.xxx
State: DOWN
Info: CRITICAL - xx.xx.xx.xx: rta nan, lost 100%
Thanks in advance.
Host Down
scottwilkerson
- DevOps Engineer
Re: Host Down
It looks like the hosts were not reachable from the XI server at the time of the alert.
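For anyone reading the Info line in the notification: "rta nan, lost 100%" means no ping replies came back at all, so there was no round-trip time to average. Below is a minimal Python sketch of how that line could be interpreted. The helper names (`parse_ping_info`, `no_replies`) are hypothetical and not part of Nagios, and the pattern assumes the "rta ..., lost ...%" layout shown in the alert above.

```python
import math
import re

# Hypothetical helper, not part of Nagios. Assumes the Info text contains
# "rta <value>[ms], lost <percent>%" as in the notification above.
_PING_PATTERN = re.compile(r"rta\s+(nan|[\d.]+)(?:ms)?,\s*lost\s+(\d+)%")


def parse_ping_info(info):
    """Return (rta_ms, loss_percent) extracted from a ping-style Info line."""
    match = _PING_PATTERN.search(info)
    if match is None:
        raise ValueError("no 'rta ..., lost ...%' section found in: " + info)
    rta_ms = float(match.group(1))   # float("nan") yields NaN
    loss_percent = int(match.group(2))
    return rta_ms, loss_percent


def no_replies(info):
    """True when every packet was lost, so no round-trip time exists."""
    rta_ms, loss_percent = parse_ping_info(info)
    return loss_percent == 100 and math.isnan(rta_ms)


if __name__ == "__main__":
    alert = "CRITICAL - xx.xx.xx.xx: rta nan, lost 100%"
    print(no_replies(alert))
```

A result of True here only says that the monitoring server received no replies during that check. It does not say the host itself was down, which matches the servers showing no restart in uptime.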