Reducing notifications

Post by **amprantino** » Wed Jun 25, 2014 3:31 pm

Hello all,

I am trying to reduce the number of notifications admins are receiving.
My problem is this:

Assume that a host has about 10 service checks.
When the host is down, I get one critical notification for the host being down plus one notification for each service in a critical state.

How can I configure Nagios so that:

when a service goes down, it automatically checks the host state.
If the host is down (ping), only one notification is sent for the host, not one for every service.

When a service comes back online, Nagios should again check the host state.
If the host is up, it should send a notification only for the host recovery/up state.

Later, once all services have been re-checked, a notification should be sent for any service that remains in a critical state.

Any idea how I can achieve the above?

Thank you

Post by **eloyd** » Wed Jun 25, 2014 3:41 pm

If configured correctly, Nagios will only check the host if a service on the host comes back as non-OK. If the host responds non-OK, then all services on that host will be ignored because Nagios assumes that the host is down. So it sounds like your Nagios is not configured properly.

Post by **chris.fixter** » Thu Jun 26, 2014 1:52 am

Hi eloyd,

Could you point out how this is done, or where I could read more about this behavior? In my experience this isn't what my Nagios is doing.

Post by **eloyd** » Thu Jun 26, 2014 6:58 am

http://nagios.sourceforge.net/docs/3_0/hostchecks.html

Post by **chris.fixter** » Thu Jun 26, 2014 9:19 pm

Sorry for my ignorance. I can't find where it says that a host down status suppresses service notifications or service checking. I have been looking for such a solution for a while, because we monitor a lot of sites over the Internet and we get frequent false service alerts due to Internet interruptions.
As a workaround for now, I have to use a distributed approach, where each site has its own Nagios running active checks and reporting to a central Nagios that accepts passive checks.

Post by **Box293** » Fri Jun 27, 2014 12:00 am

It all depends on your host and service directives for check_interval, max_check_attempts and retry_interval. For example:
Host
check_interval = 5
max_check_attempts = 3
retry_interval = 2

Service(s)
check_interval = 2
max_check_attempts = 3
retry_interval = 1

1:10pm - Host is checked and detected as UP, next check is 1:15pm
1:11pm - Host goes down, Nagios does not know about it yet
1:12pm - Service check fails, retry interval is 1 so next attempt is 1:13pm (soft state)
1:13pm - Service check retry fails, retry interval is 1 so next attempt is 1:14pm (soft state)
1:14pm - Service check fails, max_check_attempts reached so alert is sent (hard state)
1:15pm - Host check fails, retry interval is 2 so next attempt is 1:17pm (soft state)
(more service checks happening / retrying / alerting)
1:17pm - Host check fails, retry interval is 2 so next attempt is 1:19pm (soft state)
(more service checks happening / retrying / alerting)
1:19pm - Host check fails, max_check_attempts reached so alert is sent (hard state)
No more service alerts will be sent until the host recovers

Basically, service notifications will continue to be sent until their host goes into a hard state.
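
The arithmetic behind that timeline can be checked with a small Python sketch. This is not Nagios code: the function name `hard_state_time` and the date are made up for illustration, and it assumes, as the example does, that only the scheduled checks and retries take place.

```python
from datetime import datetime, timedelta


def hard_state_time(first_failure, retry_interval, max_check_attempts):
    """Time of the failed check that reaches max_check_attempts (hard state).

    The first failed check counts as attempt 1; each later attempt
    happens retry_interval minutes after the previous one.
    """
    retries = max_check_attempts - 1
    return first_failure + timedelta(minutes=retry_interval * retries)


# First failed checks taken from the timeline above; the date is arbitrary.
service_first_failure = datetime(2014, 6, 27, 13, 12)
host_first_failure = datetime(2014, 6, 27, 13, 15)

service_hard = hard_state_time(service_first_failure, retry_interval=1, max_check_attempts=3)
host_hard = hard_state_time(host_first_failure, retry_interval=2, max_check_attempts=3)

print(service_hard.strftime("%H:%M"))  # 13:14
print(host_hard.strftime("%H:%M"))     # 13:19
```

With these values the service reaches its hard state five minutes before the host does, which is the window in which service alerts still go out.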

Does this help?

This link has a lot of explanations on how notifications work:
http://nagios.sourceforge.net/docs/3_0/ ... tions.html
And some information about hard and soft states:
http://nagios.sourceforge.net/docs/3_0/statetypes.html

Post by **chris.fixter** » Fri Jun 27, 2014 1:28 am

Interesting. Thanks for the explanation.

Post by **amprantino** » Sat Jun 28, 2014 6:15 am

This is exactly the problem I have!
What I want to avoid is the service notification at 1:14pm.

When a service goes down (soft state), I would like to force a host check.
If the host is down, then send a notification only for the host and not for the services.

One solution is for the host to reach a hard non-OK state sooner than the service does, but that isn't always possible.
For example, a service might be critical and be checked every minute, while the host is checked every 5 minutes...

Post by **eloyd** » Mon Jun 30, 2014 11:55 am

You could change max_check_attempts to 1, which would effectively eliminate soft states and go straight to a hard state. No intermediate checks would be done at that point.
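
To put numbers on that, here is a rough Python sketch (the helper `minutes_until_hard` is hypothetical, not part of Nagios) of how long an object stays in soft states after its first failed check, assuming retries are spaced exactly retry_interval minutes apart:

```python
def minutes_until_hard(retry_interval, max_check_attempts):
    """Minutes from the first failed check to the hard state.

    The first failed check is attempt 1, so max_check_attempts - 1
    retries (soft states) remain, each retry_interval minutes apart.
    """
    soft_states = max_check_attempts - 1
    return soft_states * retry_interval


# Host values from Box293's example: retry_interval = 2
print(minutes_until_hard(retry_interval=2, max_check_attempts=3))  # 4
print(minutes_until_hard(retry_interval=2, max_check_attempts=1))  # 0
```

This only removes the retries; it does not change when the first failed check itself happens.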

Post by **tmcdonald** » Mon Jul 07, 2014 9:13 am

@amprantino and @chris.fixter, did @eloyd and @Box293's answers work for you?

Nagios Support Forum
