Reducing notifications
Posted: Wed Jun 25, 2014 3:31 pm
by amprantino
Hello all,
I am trying to reduce the number of notifications admins are receiving.
My problem is this:
Assume that a host has about 10 service checks.
When a host is down, I get one critical notification for host down & one notification for each service in critical state.
How can I configure Nagios so that:
When a service goes down, Nagios automatically checks the host state.
If the host is down (ping), only one notification is sent for the host and none for its services.
When a service comes back online, Nagios again checks the host state.
If the host is up, it sends a notification only for the host recovery/up state.
Later, once all services have been re-checked, a notification is sent for any service that remains in a critical state.
Any idea how I can achieve the above?
Thank you
Re: Reducing notifications
Posted: Wed Jun 25, 2014 3:41 pm
by eloyd
If configured correctly, Nagios will only check the host if a service on the host comes back as non-OK. If the host responds non-OK, then all services on that host will be ignored because Nagios assumes that the host is down. So it sounds like your Nagios is not configured properly.
Re: Reducing notifications
Posted: Thu Jun 26, 2014 1:52 am
by chris.fixter
Hi eloyd,
Could you point out how this is done, or where I could read more about this behavior? In my experience this isn't what my Nagios is doing.
Re: Reducing notifications
Posted: Thu Jun 26, 2014 6:58 am
by eloyd
Re: Reducing notifications
Posted: Thu Jun 26, 2014 9:19 pm
by chris.fixter
Sorry for my ignorance. I can't find where it says that a host down status would suppress service notifications or service checking. I have been looking for such a solution for a while, because we monitor a lot of sites over the Internet and we frequently get false service alerts due to Internet interruptions.
As a workaround for now, I have to use a distributed approach, where each site has its own Nagios running active checks and reports to a central Nagios that accepts passive checks.
Re: Reducing notifications
Posted: Fri Jun 27, 2014 12:00 am
by Box293
It all depends on your host and service directives for check_interval, max_check_attempts and retry_interval. For example:
Host
check_interval = 5
max_check_attempts = 3
retry_interval = 2
Service(s)
check_interval = 2
max_check_attempts = 3
retry_interval = 1
1.10pm - Host is checked and detected as UP, next check is 1.15pm
1.11pm - Host goes down, Nagios does not know about it yet
1.12pm - Service check fails, retry interval is 1 so next attempt is 1.13pm (soft state)
1.13pm - Service check retry fails, retry interval is 1 so next attempt is 1.14pm (soft state)
1.14pm - Service check fails, max_check_attempts reached so alert is sent (hard state)
1.15pm - Host check fails, retry interval is 2 so next attempt is 1.17pm (soft state)
more service checks happening / retrying / alerting
1.17pm - Host check fails, retry interval is 2 so next attempt is 1.19pm (soft state)
more service checks happening / retrying / alerting
1.19pm - Host check fails, max_check_attempts reached so alert is sent (hard state)
No more service alerts will be sent until the host recovers
Basically, notifications for a service will continue to be sent until its host goes into a hard state.
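The timeline above can be reproduced with a short Python sketch. This is only a simplified model, not how the Nagios scheduler is implemented: it assumes checks run strictly on their schedule, counts time in minutes after 1.00pm, and assumes the service's last OK check was also at 1.10pm (the timeline does not say). The function name failed_checks is made up for this illustration.

```python
def failed_checks(last_ok_check, check_interval, retry_interval,
                  max_check_attempts, down_at):
    """Return the minutes at which checks fail, ending with the check that
    reaches a hard state (simplified model: scheduled checks only)."""
    first_failure = last_ok_check
    # Step forward on the normal schedule until a check lands on or after the outage.
    while first_failure < down_at:
        first_failure += check_interval
    # The first failure is attempt 1; retries follow at retry_interval.
    return [first_failure + n * retry_interval
            for n in range(max_check_attempts)]


DOWN_AT = 11  # host goes down at 1.11pm

host = failed_checks(10, check_interval=5, retry_interval=2,
                     max_check_attempts=3, down_at=DOWN_AT)
service = failed_checks(10, check_interval=2, retry_interval=1,
                        max_check_attempts=3, down_at=DOWN_AT)

# A service alert goes out if the service reaches its hard state
# before the host reaches its own hard state.
service_alert_sent = service[-1] < host[-1]

print("host checks fail at:", host)        # [15, 17, 19]
print("service checks fail at:", service)  # [12, 13, 14]
print("service alert sent before host alert:", service_alert_sent)  # True
```

With these values the service reaches its hard state at 1.14pm, five minutes before the host does at 1.19pm, which is the window in which service alerts are sent.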
Does this help?
This link has a lot of explanations on how notifications work:
http://nagios.sourceforge.net/docs/3_0/ ... tions.html
And some information about hard and soft states:
http://nagios.sourceforge.net/docs/3_0/statetypes.html
Re: Reducing notifications
Posted: Fri Jun 27, 2014 1:28 am
by chris.fixter
Interesting. Thanks for the explanation.
Re: Reducing notifications
Posted: Sat Jun 28, 2014 6:15 am
by amprantino
This is exactly the problem I have!
What I want to avoid is the service notification at 1.14pm.
When a service is down (soft state), I would like to force a host check.
If the host is down, then send a notification only for the host and not for the services.
One solution is for the host to reach a hard non-OK state sooner than a service does, but this isn't always the case.
For example, a service might be critical and checked every minute, while the host is checked every 5 minutes...
Re: Reducing notifications
Posted: Mon Jun 30, 2014 11:55 am
by eloyd
You could change max_check_attempts to 1, which would effectively eliminate soft states and go straight to a hard state. No intermediate checks are done at that point.
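As a rough illustration, under the same simplified model as Box293's timeline, the delay between the first failed check and the hard state works out to (max_check_attempts - 1) * retry_interval. The helper below is hypothetical and not part of Nagios; it only does that arithmetic.

```python
def minutes_to_hard_state(max_check_attempts, retry_interval):
    """Minutes between the first failed check and the hard state,
    assuming every retry also fails (simplified model)."""
    if max_check_attempts < 1:
        raise ValueError("max_check_attempts must be at least 1")
    # The first failure counts as attempt 1; each further attempt
    # waits one retry_interval.
    return (max_check_attempts - 1) * retry_interval


# Host values from Box293's example: 3 attempts, retry every 2 minutes.
print(minutes_to_hard_state(3, 2))  # 4
# With a single attempt there are no soft states at all.
print(minutes_to_hard_state(1, 2))  # 0
```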
Re: Reducing notifications
Posted: Mon Jul 07, 2014 9:13 am
by tmcdonald
@amprantino and @chris.fixter, did @eloyd and @Box293's answers work for you?