Re: [Nagios-devel] Re: Percieved problem with host checks

Post by Guest »

It is my personal opinion that for a host to be considered to be
in an "UP" state, at least one of the services you're monitoring on it
should be in an OK state (or at least change between non-OK states
when the host recovers). That model should work for almost everyone
using Nagios.

However, if you really want the status of the host re-verified after
service checks even if all services associated with the host stay in
a non-OK state when it recovers, enable the aggressive host check
option in the main config file. Performance will suffer if you
enable this option, but that's a tradeoff you'll have to be willing
to accept.

A snippet from base/checks.c starting at line 888:
---

/*******************************************/
/******* SERVICE CHECK PROBLEM LOGIC *******/
/*******************************************/

/* hey, something's not working quite like it should... */
else{

    /* reset the recovery notification flag (it may get set again though) */
    temp_service->no_recovery_notification=FALSE;

    /* check the route to the host if its supposed to be up right now... */
    if(temp_host->status==HOST_UP)
        route_result=verify_route_to_host(temp_host);

    /* else the host is either down or unreachable, so recheck it if necessary */
    else{

        /* we're using agressive host checking, so really do recheck the host... */
        if(use_aggressive_host_checking==TRUE)
            route_result=verify_route_to_host(temp_host);

        /* the service wobbled between non-OK states, so check the host... */
        else if(state_change==TRUE && temp_service->last_hard_state!=STATE_OK)
            route_result=verify_route_to_host(temp_host);

        /* else fake the host check, but (possibly) resend host notifications to contacts... */
        else{

---
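
To make the branching in that snippet easier to follow, here is a minimal sketch of the same decision as a standalone function. It is not code from Nagios: the enums and the function name should_really_check_host are hypothetical stand-ins I made up for illustration, and the sketch only models whether verify_route_to_host() would be called or the host check would be faked.

```c
#include <stdbool.h>

/* Hypothetical stand-ins for the Nagios constants; not the real definitions. */
enum host_status { HOST_UP, HOST_DOWN, HOST_UNREACHABLE };
enum service_state { STATE_OK, STATE_WARNING, STATE_CRITICAL, STATE_UNKNOWN };

/*
 * Models the "service check problem" branch quoted above.
 * Returns true when the host would really be rechecked,
 * false when the host check would be faked.
 */
bool should_really_check_host(enum host_status host_status,
                              bool use_aggressive_host_checking,
                              bool state_change,
                              enum service_state last_hard_state)
{
    /* host is supposed to be up right now: verify the route to it */
    if (host_status == HOST_UP)
        return true;

    /* host is down or unreachable, but aggressive checking forces a recheck */
    if (use_aggressive_host_checking)
        return true;

    /* the service moved between non-OK states: recheck the host */
    if (state_change && last_hard_state != STATE_OK)
        return true;

    /* otherwise the host check is faked */
    return false;
}
```

Under these assumptions, a host that is already marked down, whose services all stay in the same non-OK state, is only rechecked when the aggressive option is on. That is the case the original complaint is about.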



On 18 Sep 2002 at 21:44, SyBase wrote:

> You are completely missing my point. Ping (icmp) is controlled by the
> OS. If your OS is down (i.e. never booted) icmp will not come back, and
> neither will your service. However, if the host (operating system)
> booted but for some reason services were never engaged, then the problem
> is completely different. If you simply state that all Nagios cares about is
> services, and that if you have no available services (for which you
> monitor) then the box is down, then what is the point of the host check
> to begin with? Let's remove that completely and make people add their own
> icmp service if they want to check that. I hope you see my point. I
> would think accuracy in showing what is really happening (i.e. not
> saying host down when the host is really up) would be very important. If
> you do not agree, that is fine. Simply a suggestion.
>
> Kenneth.ray wrote:
>
> > Dear Sir,
> > Thank you for your email. However, I believe
> > your test is actually flawed. You are working from the premise that
> > the host is the important piece of your network, but
> > the service running on the host is the most important issue.
> > What good is a host that has no services available?
> > Ping is a really basic service and only helps in determining
> > that the network interface is accessible. In some cases
> > it is quite possible for the ping to work and the box to be down.
> > If this is a real problem for you, add a separate service called
> > "alive", "pingable" or something related, and run the ping as a
> > service. This will change your host to be "up" even if no services
> > are available from it. But again, in my own humble opinion, this
> > proves nothing other than that you can ping the interface. You're not
> > even pinging the server; you're pinging the network card which
> > is hooked to the box.
> >
> > Think of the logic in this sense: a host is not an actual entity
> > but really a container/conduit for your services. So logic would
> > dictate that, regardless of whether the entity serving your services is
> > available, the real issue is not the entity but the service provided
> > by the container. IMHO you can actually replace the word "host" with
> > "container" and the logic of Netsaint still holds up. Netsaint is
> > service based,
> > not server based, and uses the logic of, "what good is a host without
> > someth

...[email truncated]...


This post was automatically imported from historical nagios-devel mailing list archives
Original poster: [email protected]