Stop Service Checks When Host Down
Posted: Mon Dec 30, 2013 12:45 am
by zaji_nms
Dear Expert,
How can Nagios automatically disable all service checks when the associated host goes down (that is, when ping is critical with 100% packet loss)? I have a host with 300 services associated with it, and I don't want Nagios to check all 300 services if the host is down (100% packet loss).
If possible, please explain step by step. It's very urgent, and important as a reference for other users as well.
Regards
Re: Stop Service Checks When Host Down
Posted: Mon Dec 30, 2013 11:23 am
by abrist
Make sure that your host check intervals are less than or equal to your service check intervals. Nagios will usually not check services in a scheduled interval until the host is checked, unless the intervals for the services are less than the intervals for the hosts.
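To illustrate the interval rule, here is a minimal Python sketch that flags services scheduled more often than their host. The host names, service names, and the helper function are all made up for illustration; the intervals are assumed to be in minutes and would in practice come from your own configuration.

```python
# Hypothetical inventory: intervals in minutes, all names invented for illustration.
host_intervals = {"core-switch-01": 5, "edge-router-02": 10}
service_intervals = {
    ("core-switch-01", "Port 1 Status"): 5,
    ("core-switch-01", "Port 2 Bandwidth"): 10,
    ("edge-router-02", "Uplink Status"): 5,
}


def services_checked_more_often_than_host(host_intervals, service_intervals):
    """Return (host, service) pairs whose interval is shorter than the host's."""
    offenders = []
    for (host, service), interval in sorted(service_intervals.items()):
        # The rule: host interval should be <= service interval.
        if interval < host_intervals[host]:
            offenders.append((host, service))
    return offenders


offenders = services_checked_more_often_than_host(host_intervals, service_intervals)
for host, service in offenders:
    print(f"{service!r} on {host} is checked more often than the host itself")
```

Any pair it reports is a candidate for either a longer service interval or a shorter host interval.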
Re: Stop Service Checks When Host Down
Posted: Tue Dec 31, 2013 3:12 am
by zaji_nms
So, in order to avoid my scenario, I have to keep a shorter interval for all my hosts, and if a host's state is down, the related services will not be checked. Please correct me if I stated this wrong.
Re: Stop Service Checks When Host Down
Posted: Thu Jan 02, 2014 10:10 am
by slansing
Yes, generally you want your hosts to be checked more frequently than your services, or in between your service checks. This way, if your host goes down, Nagios will know immediately and stop your services from spamming you.
Re: Stop Service Checks When Host Down
Posted: Sun Oct 05, 2014 4:24 am
by zaji_nms
Dear Expert
When our HOST is down (not reachable), its SERVICES should not be checked at all.
Please let us know how/where to find the HOST check interval and, likewise, how to set the SERVICES interval, because when we add a HOST from CONFIGURE, MONITOR, ROUTER/SWITCH there is no option to specify the HOST and SERVICE check intervals separately.
By default, NAGIOS should check the HOST earlier (let's say 30 seconds earlier than the SERVICES).
FYI, whenever our HOST is not reachable, the status of all its SERVICES changes from DOWN [CRITICAL] to SNMP POLLING ISSUE [WARNING], and when the HOST becomes stable/reachable again, the timestamp (Duration) of every SERVICE changes and the count restarts. This is wrong; it should not happen. While the HOST is down, the SERVICES status should not change; it must remain the same, whatever it was: DOWN, UP, or ERROR.
Regards
Re: Stop Service Checks When Host Down
Posted: Mon Oct 06, 2014 9:58 am
by lmiltchev
zaji_nms wrote: "Please let us know how/where to find what is HOST check time (interval), the same way, how to set SERVICES interval because when we adding HOST from CONFIGURE , MONTIORE, ROUTER/SWITCH there no option to specify HOST and SERVICE check separately."
When you run the Network Switch / Router wizard, your host/service check intervals will be set to the default of 5 minutes. You can view/modify the check interval under Host/Service Management in the CCM ("Check Settings" tab). If you have Nagios XI Enterprise Edition, you can also use the "Bulk Modifications Tool" to set the value you want in bulk.
zaji_nms wrote: "FYI, whenever our HOST not reachable, all the SERVICES status got changed from DOWN [CRITICAL] to SNMP POLLING ISSUE [WARNING] and when HOST became stable/reachable all the SERVICE TIME STAMP (Duration) got changed and counting restart (its wrong, it should not be), when HOST was down, SERVICES status should not change , must remain same , whatever DOWN, UP, ERROR"
Can you show us a screenshot of the state changes you describe?
Reports->State History
Re: Stop Service Checks When Host Down
Posted: Mon Oct 06, 2014 1:36 pm
by zaji_nms
Dear lmiltchev
Please find attached. When the HOST becomes available again, all SERVICES on that HOST (or those HOSTs) that had previously been DOWN for hours, weeks, or months restart their DURATION count and show as down for only (let's say) 2 minutes. So our NHM controller tries to follow up on these down SERVICES, later finds out they have actually been down for the last week or month, and wonders why he is following up at all.
You suggested changing some settings under CCM, Service Management, Check Settings (it's too scary, there are too many options, and we don't want to play with them). Please just look into whether there is any option for: IF HOST DOWN, SERVICE CHECKS SHOULD HALT.
Regards
Re: Stop Service Checks When Host Down
Posted: Mon Oct 06, 2014 2:08 pm
by abrist
Nagios behaves differently depending on whether a host is DOWN or UNREACHABLE. When a host is DOWN, services will still be checked and scheduled. When a host is UNREACHABLE (its parent is DOWN), neither services nor host will be checked until the parent has recovered.
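As a rough illustration of the DOWN vs UNREACHABLE distinction, here is a simplified Python model of how a failed host could be classified from its parents' states. This is only a sketch of the documented reachability idea, not Nagios's actual code; the function name, host names, and data layout are hypothetical.

```python
def classify_failed_host(host, parents, states):
    """Classify a host whose own check has just failed.

    parents maps a host name to the list of its parent host names;
    states maps host names to "UP", "DOWN", or "UNREACHABLE".
    """
    host_parents = parents.get(host, [])
    # No parents configured: the failure can only be reported as DOWN.
    if not host_parents:
        return "DOWN"
    # At least one parent is UP, so the path works and the host itself is DOWN.
    if any(states[p] == "UP" for p in host_parents):
        return "DOWN"
    # Every parent is DOWN or UNREACHABLE: the host is cut off from monitoring.
    return "UNREACHABLE"


# Hypothetical topology: one switch behind a router, one host with no parent.
parents = {"access-switch": ["core-router"], "standalone-host": []}
states = {"core-router": "DOWN"}

behind_router = classify_failed_host("access-switch", parents, states)
no_parent = classify_failed_host("standalone-host", parents, states)
print(behind_router, no_parent)
```

In this model a host with no parent defined can never come out as UNREACHABLE, so the UNREACHABLE behavior only applies once parent relationships are configured.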
Re: Stop Service Checks When Host Down
Posted: Mon Oct 06, 2014 2:32 pm
by zaji_nms
Dear Abrist
In our scenario, we did not configure a parent host for any HOST, so in our case the SERVICES are the children and the HOST is the parent. Every HOST acts as a standalone, without any child/parent relation.
We want SERVICE checks to be put on hold if the HOST is DOWN/UNREACHABLE.
Please note that we have a big network, and now and then there is a network outage: a HOST may be down (UPLINK INTERFACE DOWN for that HOST), or, due to some attack, a HOST may flap between reachable and unreachable because of very high delay and very high packet loss. At those times we face a big issue: all SERVICES that had been DOWN for a long period show as having gone down only recently. The DURATION gets reset because all the down SERVICES move into the WARNING (SNMP ERROR) state, and then, when the HOST is available again, all the SERVICES that have been down for a long time reset the counter and show as down just now.
Please check the attachment: this SERVICE has been DOWN for a long time but shows 3d 12h 24m 53s. The DURATION counter should change only between the UP and DOWN states, not when there is an SNMP ERROR.
Regards
Re: Stop Service Checks When Host Down
Posted: Mon Oct 06, 2014 2:40 pm
by abrist
SNMP error is considered critical, at least it is in your screenshot. This will start a new duration timer for the service because its state has changed. What else would duration imply other than the time elapsed since the last state change? Just an FYI: services do not go UP or DOWN. They change between OK, WARNING, CRITICAL and UNKNOWN states. Why do you want the service checks to stop completely if the host is DOWN? There is a chance that a host check fails due to ICMP blocking, etc., but the services could still be reached.
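To make the duration point concrete, here is a small Python sketch of a timer that restarts on every state change. It is a simplified model and ignores details such as soft vs hard states; the class name and the dates are invented, and WARNING is used only as an example of a brief detour through another state.

```python
from datetime import datetime, timedelta


class ServiceState:
    """Tracks a service's current state and when it last changed."""

    def __init__(self, state, since):
        self.state = state
        self.last_state_change = since

    def record_check(self, new_state, when):
        # The timer restarts only when the state actually changes.
        if new_state != self.state:
            self.state = new_state
            self.last_state_change = when

    def duration(self, now):
        # Duration is simply the time elapsed since the last state change.
        return now - self.last_state_change


start = datetime(2014, 9, 1, 0, 0)  # hypothetical start of the outage
svc = ServiceState("CRITICAL", since=start)

# Repeated CRITICAL results do not touch the timer.
svc.record_check("CRITICAL", start + timedelta(days=30))
long_outage = svc.duration(start + timedelta(days=30))

# A short detour through another state restarts it twice.
svc.record_check("WARNING", start + timedelta(days=30, minutes=5))
svc.record_check("CRITICAL", start + timedelta(days=30, minutes=10))
after_flap = svc.duration(start + timedelta(days=30, minutes=12))

print(long_outage, after_flap)
```

Under this model the service is CRITICAL again after the detour, but its duration only counts from the most recent change, which matches the short durations described above.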