Reducing notifications

amprantino
Posts: 140
Joined: Thu Apr 18, 2013 8:25 am
Location: libexec

Post by amprantino »

Hello all,

I am trying to reduce the number of notifications admins are receiving.
My problem is this:

Assume that a host has about 10 service checks.
When the host is down, I get one critical notification for the host being down and one notification for each service in a critical state.

How can I configure Nagios so that:

When a service is down, Nagios automatically checks the host state.
If the host state is down (ping), then only one notification is sent for the host and none for the services.

When a service comes back online, Nagios should again check the host state.
If the host state is up, then it should send a notification only for the host recovery/up state.

Later, when all services are re-checked, if any service remains in a critical state, a notification should be sent.

Any idea how I can achieve the above?

Thank you
eloyd
Cool Title Here
Posts: 2190
Joined: Thu Sep 27, 2012 9:14 am
Location: Rochester, NY

Re: Reducing notifications

Post by eloyd »

If configured correctly, Nagios will only check the host if a service on the host comes back as non-OK. If the host responds non-OK, then all services on that host will be ignored because Nagios assumes that the host is down. So it sounds like your Nagios is not configured properly.
Eric Loyd • http://everwatch.global • 844.240.EVER • @EricLoyd
I'm a Nagios Fanatic! • Join our public Nagios Discord Server!
chris.fixter
Posts: 22
Joined: Wed Jun 18, 2014 4:15 am

Re: Reducing notifications

Post by chris.fixter »

Hi eloyd,

Could you point out how this is done, or where I could read further information about this behavior? In my experience this isn't what my Nagios is doing.

Re: Reducing notifications

Post by eloyd »


Re: Reducing notifications

Post by chris.fixter »

Sorry for my ignorance. I can't find where it says that a host down status would suppress service notifications or service checking. I have been looking for such a solution for a while, because we monitor a lot of sites over the Internet and we get frequent false service alerts due to Internet interruptions.
As a workaround for now, I have to use a distributed approach, where each site has its own Nagios running active checks and reporting to a central Nagios that accepts passive checks.
Box293
Too Basu
Posts: 5126
Joined: Sun Feb 07, 2010 10:55 pm
Location: Deniliquin, Australia

Re: Reducing notifications

Post by Box293 »

It all depends on your host and service directives for check_interval, max_check_attempts and retry_interval. For example:
Host
check_interval = 5
max_check_attempts = 3
retry_interval = 2

Service(s)
check_interval = 2
max_check_attempts = 3
retry_interval = 1

1:10pm - Host is checked and detected as UP, next check is 1:15pm
1:11pm - Host goes down, Nagios does not know about it yet
1:12pm - Service check fails, retry interval is 1 so next attempt is 1:13pm (soft state)
1:13pm - Service check retry fails, retry interval is 1 so next attempt is 1:14pm (soft state)
1:14pm - Service check fails, max_check_attempts reached so alert is sent (hard state)
1:15pm - Host check fails, retry interval is 2 so next attempt is 1:17pm (soft state)
more service checks happening / retrying / alerting
1:17pm - Host check fails, retry interval is 2 so next attempt is 1:19pm (soft state)
more service checks happening / retrying / alerting
1:19pm - Host check fails, max_check_attempts reached so alert is sent (hard state)
No more service alerts will be sent until the host recovers

Basically, service notifications will continue to be sent until their host goes into a hard (down) state.
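
The arithmetic behind the timeline above can be sketched in a few lines of Python. This is only an illustration, not how Nagios is implemented: the function names are made up, times are minutes past 1:00pm, and it assumes the services were last checked OK at 1:10pm (the timeline implies this but does not state it). It follows the schedule exactly as listed above and does not model any on-demand host checks.

```python
def first_failed_check(last_ok_check, check_interval, outage_start):
    """Minute of the first regularly scheduled check at or after the outage begins."""
    t = last_ok_check
    while t < outage_start:
        t += check_interval
    return t


def hard_state_time(first_failure, retry_interval, max_check_attempts):
    """Minute at which the failing object reaches a hard state.

    The first failure counts as attempt 1; each further attempt
    happens retry_interval minutes after the previous one.
    """
    return first_failure + (max_check_attempts - 1) * retry_interval


# All times are minutes past 1:00pm (hypothetical encoding for this sketch).
OUTAGE_START = 11   # host goes down at 1:11pm
LAST_OK_CHECK = 10  # host (and, by assumption, services) last checked OK at 1:10pm

# Host: check_interval=5, retry_interval=2, max_check_attempts=3
host_hard = hard_state_time(
    first_failed_check(LAST_OK_CHECK, 5, OUTAGE_START), 2, 3)

# Service: check_interval=2, retry_interval=1, max_check_attempts=3
service_hard = hard_state_time(
    first_failed_check(LAST_OK_CHECK, 2, OUTAGE_START), 1, 3)

print(f"service hard state at 1:{service_hard:02d}pm")  # 1:14pm
print(f"host hard state at 1:{host_hard:02d}pm")        # 1:19pm
print(f"service alerts can go out for {host_hard - service_hard} minutes")
```

With these values the services reach a hard state five minutes before the host does, which is the window in which the extra service alerts are sent.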

Does this help?

This link has a lot of explanations on how notifications work:
http://nagios.sourceforge.net/docs/3_0/ ... tions.html
And some information about hard and soft states:
http://nagios.sourceforge.net/docs/3_0/statetypes.html

Re: Reducing notifications

Post by chris.fixter »

Interesting. Thanks for the explanation.

Re: Reducing notifications

Post by amprantino »

This is exactly the problem I have!
What I want to avoid is the service notification at 1:14pm.

When a service is down (soft state), I would like to force a host check.
If the host is down, then send a notification only for the host and not for the services.

One solution is for the host to reach a hard non-OK state sooner than the service does, but this isn't always the case.
For example, a service might be critical and be checked every minute while the host is checked every 5 minutes.

Re: Reducing notifications

Post by eloyd »

You could set max_check_attempts to 1, which would effectively eliminate soft states: the first failed check goes straight to a hard state, and no intermediate retries are done at that point.
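
To illustrate the soft/hard distinction, here is a small Python sketch. It is not Nagios code and the function names are hypothetical; it just encodes the rule that a failing check stays in a soft state until it has been attempted max_check_attempts times.

```python
def state_type(attempt, max_check_attempts):
    """Return 'SOFT' or 'HARD' for the given consecutive failed attempt (1-based)."""
    return "HARD" if attempt >= max_check_attempts else "SOFT"


def states_until_hard(max_check_attempts):
    """State type of each consecutive failed check, up to the first hard state."""
    return [state_type(attempt, max_check_attempts)
            for attempt in range(1, max_check_attempts + 1)]


print(states_until_hard(3))  # ['SOFT', 'SOFT', 'HARD']
print(states_until_hard(1))  # ['HARD']
```

With max_check_attempts set to 1 there is no retry window at all, so a single failed check is enough to trigger an alert. That is the trade-off of this approach.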
tmcdonald
Posts: 9117
Joined: Mon Sep 23, 2013 8:40 am

Re: Reducing notifications

Post by tmcdonald »

@amprantino and @chris.fixter, did @eloyd and @Box293's answers work for you?
Former Nagios employee