Host down alert issue

This support forum board is for support questions relating to Nagios XI, our flagship commercial network monitoring solution.
Mani.Murugesan
Posts: 67
Joined: Fri May 12, 2017 1:37 am

Host down alert issue

Post by Mani.Murugesan »

Hello Team,

Recently I received a host down alert for a UNIX server. When I checked the state history, I saw that the check attempts happened very close together: the host is shown as DOWN at 06:48:10 and changed back to OK at 06:48:28. I can't understand why this happened.

Note: my check interval is 5, retry interval is 1, and check attempts is 5.
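
For reference, the settings in the note above can be turned into a rough timeline. This is a minimal sketch, assuming the intervals are in minutes (the default `interval_length` of 60 seconds) and that the host fails every attempt; `attempt_schedule` is a hypothetical helper written for this illustration, not part of Nagios.

```python
def attempt_schedule(max_check_attempts: int, retry_interval_min: float):
    """Return (attempt, minutes_after_first_failure, state_type) tuples
    for a host that fails every check (simplified model)."""
    schedule = []
    for attempt in range(1, max_check_attempts + 1):
        # The first failure is attempt 1; each later attempt waits one retry interval.
        offset = (attempt - 1) * retry_interval_min
        # Only the final attempt turns the problem into a HARD state.
        state_type = "HARD" if attempt == max_check_attempts else "SOFT"
        schedule.append((attempt, offset, state_type))
    return schedule


# Settings from the note: check attempts 5, retry interval 1.
for attempt, offset, state_type in attempt_schedule(5, 1):
    print(f"attempt {attempt}: +{offset} min, {state_type} DOWN")
```

Under these assumptions the host stays in a SOFT DOWN state for the first four attempts and only becomes HARD DOWN about 4 minutes after the first failed check, so a DOWN entry that clears within seconds is a soft state change.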

At the same time, I had scheduled a job that pings the host continuously. That morning the host went down (a soft alert, at about 6:40). The down alert is in the Nagios state history, but the text file that my ping job writes from my PuTTY session does not show any packet loss.

1) Could you please help me understand why the checks happen so frequently (within seconds of each other)?
2) Why does the state history show the host as down when the ping command does not show any loss?
scottwilkerson
DevOps Engineer
Posts: 19396
Joined: Tue Nov 15, 2011 3:11 pm
Location: Nagios Enterprises

Re: Host down alert issue

Post by scottwilkerson »

Mani.Murugesan wrote:1) Could you please help me understand why the checks happen so frequently (within seconds of each other)?
When a service on the host is checked while the host is currently down, the service check triggers a re-check of the host to confirm it is still down, provided the host was last checked longer ago than the cached_host_check_horizon value in nagios.cfg (which is 15 seconds by default). If the host is now found to be in an UP state, the state change is recorded at that time.
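
The horizon rule can be sketched against the timestamps from the question. This is only a simplified model of the decision described above, not Nagios source code; `needs_fresh_host_check` is a hypothetical name, and the 15-second value is the default mentioned in the reply (your nagios.cfg may differ).

```python
from datetime import datetime, timedelta

# Default mentioned above; the effective value comes from nagios.cfg.
CACHED_HOST_CHECK_HORIZON = timedelta(seconds=15)


def needs_fresh_host_check(last_check: datetime, now: datetime,
                           horizon: timedelta = CACHED_HOST_CHECK_HORIZON) -> bool:
    """Return True when the last host result is older than the horizon,
    i.e. the cached state would not be reused (simplified model)."""
    return (now - last_check) > horizon


# Timestamps from the state history in the question (the date part is irrelevant).
down_seen = datetime.strptime("06:48:10", "%H:%M:%S")
up_seen = datetime.strptime("06:48:28", "%H:%M:%S")

# 18 seconds have passed, which is more than the 15-second horizon.
print(needs_fresh_host_check(down_seen, up_seen))  # prints True
```

In this model, a service check arriving 18 seconds after the DOWN result would not reuse the cached host state, which is consistent with the UP entry being recorded at 06:48:28.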
Mani.Murugesan wrote:2) Why does the state history show the host as down when the ping command does not show any loss?
There are numerous possibilities depending on the thresholds set for the host. It could have been marked down because the latency was higher than allowed, or there could have been a network problem in the split second that the check was attempted, which caused the check to report the host as down.
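
To illustrate the latency case, here is a rough sketch of how a ping-style check could classify a result against both a round-trip and a packet-loss threshold. The threshold numbers and the names `Threshold` and `classify_ping` are made up for this example; check the command definition your host actually uses for the real values.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Threshold:
    rta_ms: float    # round-trip average limit in milliseconds
    loss_pct: float  # packet loss limit in percent


def classify_ping(rta_ms: float, loss_pct: float,
                  warn: Threshold, crit: Threshold) -> str:
    """Classify one ping result; either limit alone is enough to trip a level."""
    if rta_ms >= crit.rta_ms or loss_pct >= crit.loss_pct:
        return "CRITICAL"
    if rta_ms >= warn.rta_ms or loss_pct >= warn.loss_pct:
        return "WARNING"
    return "OK"


# Hypothetical thresholds, chosen only for illustration.
WARN = Threshold(rta_ms=100.0, loss_pct=20.0)
CRIT = Threshold(rta_ms=500.0, loss_pct=60.0)

# No packets lost, but the replies were slow: still CRITICAL in this model.
print(classify_ping(800.0, 0.0, WARN, CRIT))  # prints CRITICAL
print(classify_ping(20.0, 0.0, WARN, CRIT))   # prints OK
```

This is one way a continuous ping log can show 0% packet loss while the monitoring check still reports a problem: the replies arrived, just slower than the configured limit.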
Former Nagios employee
Mani.Murugesan
Posts: 67
Joined: Fri May 12, 2017 1:37 am

Re: Host down alert issue

Post by Mani.Murugesan »

THANKS scottwilkerson :)

nice explanation.
I have the same host down issue again. Yesterday I received a host down alert in Nagios for a UNIX server. The UNIX team told me the server has been up for the last 10 days, and I found 100% packet loss, so I decided to check with the network team.
They said: "We have many servers configured in this VLAN, so why did only this server go down?"
Please correct me if I am wrong. Say three servers A, B, and C are configured in one VLAN, and yesterday server A went down. Why would server B have to go down at the same time? If the problem occurred only on server A (failed checks, packet drops, anything can happen, right?), then A would go down while B and C keep working fine.

Now my question is: which side do I need to check, the network or Nagios?
scottwilkerson
DevOps Engineer
Posts: 19396
Joined: Tue Nov 15, 2011 3:11 pm
Location: Nagios Enterprises

Re: Host down alert issue

Post by scottwilkerson »

Mani.Murugesan wrote: Now my question is: which side do I need to check, the network or Nagios?
I would say the network. It's possible that several of the servers stayed up while the other went down; it would depend on the configuration of the network.
Mani.Murugesan
Posts: 67
Joined: Fri May 12, 2017 1:37 am

Re: Host down alert issue

Post by Mani.Murugesan »

Thanks for your Response.
scottwilkerson
DevOps Engineer
Posts: 19396
Joined: Tue Nov 15, 2011 3:11 pm
Location: Nagios Enterprises

Re: Host down alert issue

Post by scottwilkerson »

Mani.Murugesan wrote:Thanks for your Response.
No problem