In my opinion, for a host to be considered "UP", at least one of the
services you're monitoring on it should be in an OK state (or should at
least change between non-OK states when the host recovers). That model
should work for nearly everyone using Nagios.

However, if you really want the status of the host re-verified after
every service check, even when all services associated with the host
stay in the same non-OK state after it recovers, enable the aggressive
host check option (use_aggressive_host_checking) in the main config
file. Performance will suffer if you enable this option, but that's a
tradeoff you'll have to be willing to accept.
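For reference, the entry in the main config file would look something like the fragment below. This is only a sketch: it assumes the directive has the same name as the use_aggressive_host_checking variable in the code, and that a value of 1 turns it on.

```cfg
# main config file (e.g. nagios.cfg)
# 1 = always re-verify the host on a non-OK service result (assumed value)
use_aggressive_host_checking=1
```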
A snippet from base/checks.c starting at line 888:
---
/*******************************************/
/******* SERVICE CHECK PROBLEM LOGIC *******/
/*******************************************/
/* hey, something's not working quite like it should... */
else{

    /* reset the recovery notification flag (it may get set again though) */
    temp_service->no_recovery_notification=FALSE;

    /* check the route to the host if its supposed to be up right now... */
    if(temp_host->status==HOST_UP)
        route_result=verify_route_to_host(temp_host);

    /* else the host is either down or unreachable, so recheck it if necessary */
    else{

        /* we're using agressive host checking, so really do recheck the host... */
        if(use_aggressive_host_checking==TRUE)
            route_result=verify_route_to_host(temp_host);

        /* the service wobbled between non-OK states, so check the host... */
        else if(state_change==TRUE && temp_service->last_hard_state!=STATE_OK)
            route_result=verify_route_to_host(temp_host);

        /* else fake the host check, but (possibly) resend host notifications to contacts... */
        else{
---
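To make the three branches above easier to see at a glance, here is a rough standalone sketch of the same decision. It is not the Nagios code: the function name should_verify_host, the enum definitions, and the self-check helper are all made up for illustration, and only the conditions themselves are taken from the snippet.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-ins for the Nagios constants; the values are illustrative. */
enum host_status { HOST_UP, HOST_DOWN, HOST_UNREACHABLE };
enum service_state { STATE_OK, STATE_WARNING, STATE_CRITICAL, STATE_UNKNOWN };

/*
 * Returns true when a non-OK service result should trigger a real host
 * check, and false when the host check would be "faked" instead.
 */
bool should_verify_host(enum host_status host_status,
                        bool use_aggressive_host_checking,
                        bool state_change,
                        enum service_state last_hard_state)
{
    /* the host is supposed to be up, so check the route to it */
    if (host_status == HOST_UP)
        return true;

    /* aggressive host checking: always recheck a down/unreachable host */
    if (use_aggressive_host_checking)
        return true;

    /* the service moved between non-OK states, so recheck the host */
    if (state_change && last_hard_state != STATE_OK)
        return true;

    /* otherwise the host check is skipped */
    return false;
}

/* Small usage example covering each branch. */
void should_verify_host_self_check(void)
{
    assert(should_verify_host(HOST_UP, false, false, STATE_OK));
    assert(should_verify_host(HOST_DOWN, true, false, STATE_CRITICAL));
    assert(should_verify_host(HOST_DOWN, false, true, STATE_WARNING));
    assert(!should_verify_host(HOST_DOWN, false, false, STATE_CRITICAL));
}
```

The last case is the one under discussion: the host is marked down, the service stays in the same non-OK state, and without aggressive checking the host is not re-verified.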
On 18 Sep 2002 at 21:44, SyBase wrote:
> you are completely missing my point.. Ping (icmp) is controlled by the
> OS. If your OS is down (i.e. never booted) icmp will not come back, and
> neither will your service. But however if the host (operating system)
> booted but for some reason services were never engaged then the problem
> is completely different. Simply stating that all nagios cares about is
> services and that if you have no available services (for which you
> monitor) then the box is down, then what is the point in the host check
> to begin with? Let's remove that completely and make people add their own
> icmp service if they want to check that. I hope you see my point. I
> would think accuracy in showing what is really happening (i.e. not
> saying host down when the host is really up) would be very important. If
> you do not agree, that is fine.. Simply a suggestion.
>
> Kenneth.ray wrote:
>
> > Dear Sir,
> > thank you for your email however, I do believe
> > actually your test is flawed. you are under the premise that
> > the host is the important piece to your network. but actually
> > the service running on the host is the most important issue.
> > what good is a host that has no services available,
> > ping is a really basic service and only helps in determining
> > that the network interface is accessible. In some cases
> > it is quite possible for the ping to work and the box be down.
> > If this is a real problem for you, add a separate service called
> > "alive", "pingable" or something related, and run the ping as a
> > service; this will change your host to be "up" even if no services
> > are available from it. But again, in my own humble opinion, this
> > proves nothing other than that you can ping the interface; you're not
> > even pinging the server, you're pinging the network card which
> > is hooked to the box.
> >
> > think of the logic in this sense, if a host is not an actual entity
> > but really a container/conduit to your services. so logic would
> > dictate that regardless if the entity for serving your services is
> > available, the real issue is not the entity but the service provided
> > by the container. IMHO you can actually replace the word "host" with
> > container and the logic of netsaint still holds up. Netsaint is
> > service based, not server based, and uses the logic of, "what good
> > is a host without
> > someth
...[email truncated]...
This post was automatically imported from historical nagios-devel mailing list archives
Original poster: [email protected]