Hello Team,
I recently received a host DOWN alert for a UNIX server. When I checked the state history, the checks appear to have run very close together: the host is shown as DOWN at 06:48:10 and back to OK at 06:48:28. I don't understand why this happened.
Note: my check interval is 5, retry interval is 1, and check attempts is 5.
At the same time, I had a scheduled job pinging the host continuously. That morning the host went down (a soft alert at about 6:40). The Nagios state history shows the DOWN alert, but the text file written by the ping job in my PuTTY session shows no packet loss.
1) Could you please help me understand why the checks happened so frequently (within seconds of each other)?
2) Why does the state history show the host as down when the ping command does not?
Host down alert issue
-
scottwilkerson
- DevOps Engineer
- Posts: 19396
- Joined: Tue Nov 15, 2011 3:11 pm
- Location: Nagios Enterprises
Re: Host down alert issue
Mani.Murugesan wrote: 1) Could you please help me understand why the checks happened so frequently (within seconds of each other)?
When services of the host are checked and the host is currently down, the service check triggers a re-check of the host to make sure it is still down, provided the host was last checked longer ago than the cached_host_check_horizon value in nagios.cfg (15 seconds by default). If the host is now found to be in an UP state, the state change is recorded at that time.
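To illustrate that rule, here is a minimal Python sketch of the horizon comparison. It is a simplified model and not Nagios's actual code: the function name and the calendar date are made up for illustration, while the two clock times are the ones from the state history quoted in the first post.

```python
from datetime import datetime, timedelta

# Default mentioned above; the real value comes from nagios.cfg.
CACHED_HOST_CHECK_HORIZON = timedelta(seconds=15)


def needs_on_demand_host_check(last_host_check, now,
                               horizon=CACHED_HOST_CHECK_HORIZON):
    """Hypothetical helper: True when the cached host result is older
    than the horizon, so the host would be re-checked right away."""
    return (now - last_host_check) > horizon


# The date is arbitrary; only the clock times come from the thread.
last_check = datetime(2017, 1, 1, 6, 48, 10)     # host recorded as DOWN
service_check = datetime(2017, 1, 1, 6, 48, 28)  # a service check runs

# 18 seconds have passed, which is more than the 15 second horizon.
print(needs_on_demand_host_check(last_check, service_check))
```

Under these assumptions the cached result is too old to reuse, which is consistent with a second host check being recorded only 18 seconds after the first, well inside the normal retry interval.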
Mani.Murugesan wrote: 2) Why does the state history show the host as down when the ping command does not?
There are numerous possibilities, depending on the thresholds set for the host check. The host could have been reported down because the latency was higher than allowed, or there could have been a network problem in the split second when the check was attempted.
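One way to compare the two sources is to make sure the ping log carries a timestamp, the packet loss, and the round-trip time, so each line can be matched against the Nagios state history. The sketch below is only an illustration: the helper names and the host name are hypothetical, and the regular expressions assume the summary format printed by Linux iputils ping, which may differ on other UNIX systems.

```python
import re
from datetime import datetime

# Patterns for the Linux iputils ping summary (an assumption; adjust
# them if your ping prints a different format).
LOSS_RE = re.compile(r"(\d+(?:\.\d+)?)% packet loss")
RTT_RE = re.compile(r"=\s*[\d.]+/([\d.]+)/")  # min/avg/max: capture avg


def packet_loss_percent(ping_output):
    """Return the packet-loss percentage, or None if it is not found."""
    match = LOSS_RE.search(ping_output)
    return float(match.group(1)) if match else None


def average_rtt_ms(ping_output):
    """Return the average round-trip time in ms, or None if not found."""
    match = RTT_RE.search(ping_output)
    return float(match.group(1)) if match else None


def log_line(host, ping_output, when):
    """Format one timestamped log line for a single ping run."""
    loss = packet_loss_percent(ping_output)
    rtt = average_rtt_ms(ping_output)
    loss_text = "loss=unknown" if loss is None else f"loss={loss:g}%"
    rtt_text = "rtt_avg=unknown" if rtt is None else f"rtt_avg={rtt:g}ms"
    return f"{when:%Y-%m-%d %H:%M:%S} {host} {loss_text} {rtt_text}"


# Made-up sample output, used only to show the parsing.
sample = (
    "5 packets transmitted, 5 received, 0% packet loss, time 4005ms\n"
    "rtt min/avg/max/mdev = 0.045/0.052/0.061/0.006 ms\n"
)
print(log_line("unix-server", sample, datetime(2017, 1, 1, 6, 48, 10)))
```

Logging the round-trip time as well as the loss matters here because a host check can fail on latency alone, in which case a ping log that records only packet loss would show nothing unusual.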
-
Mani.Murugesan
- Posts: 67
- Joined: Fri May 12, 2017 1:37 am
Re: Host down alert issue
Thanks, scottwilkerson.
Nice explanation.
I have the same host down issue again. Yesterday I received a host DOWN alert in Nagios for a UNIX server. The UNIX team told me the server has been up for the last 10 days, but I found 100% packet loss, so I decided to check with the network team.
They asked: many servers are configured in this VLAN, so why did only this server go down?
Please correct me if I am wrong. Suppose three servers, A, B, and C, are configured in one VLAN, and yesterday server A went down. Why would server B go down at the same time if the problem occurred only on server A? Failed checks or packet drops can affect a single server, right? That would explain why A went down while B and C kept working fine.
Now my question is: which side do I need to check, the network or Nagios?
-
scottwilkerson
- DevOps Engineer
- Posts: 19396
- Joined: Tue Nov 15, 2011 3:11 pm
- Location: Nagios Enterprises
Re: Host down alert issue
Mani.Murugesan wrote: Now my question is: which side do I need to check, the network or Nagios?
I would say network. It's possible that several of the servers stayed up while the other went down; it would depend on the configuration of the network.
-
Mani.Murugesan
- Posts: 67
- Joined: Fri May 12, 2017 1:37 am
Re: Host down alert issue
Thanks for your response.
-
scottwilkerson
- DevOps Engineer
- Posts: 19396
- Joined: Tue Nov 15, 2011 3:11 pm
- Location: Nagios Enterprises
Re: Host down alert issue
Mani.Murugesan wrote: Thanks for your response.
No problem