Limit total number of emails sent per check

Post by bencundiff »

Hi guys,
I am very new to Nagios, so my apologies if I am asking an obvious question.
We have an existing Nagios3 installation (3.2.3-3ubuntu1.1) running on a 12.04 server.
We're migrating it to a Nagios3 installation (3.5.1.dfsg-2.1ubuntu1.1) running on a 16.04 server once we have this figured out.

We would like to be alerted once a check for a host's service fails, and then continue to be alerted, about 5 times per host, and then have no more alerts.
Currently, we get bombarded with emails if a service fails until the checks stop failing.
For example, if someone mistypes 5 DNS records at 4:30 PM on Friday and leaves, we get 5 emails for the DNS check failure every 30 minutes from 4:30 PM on Friday until someone fixes it at 8 AM on Monday.
We would like to change the above scenario so that the first failed check sends an alert, the next 5 failed checks each send an alert for every host that has failed, and then the alerts stop.

My first thought was to include all hosts in a hostgroup called "all-hosts" and then define an escalation rule for that group that, after the 5th alert, uses a contact with a dummy email address.
The problem is that after 5 alerts, emails are now sent to both the original contact group and to the dummy contact.

Please let me know the best way to get such a configuration, or let me know if I have a fundamental misunderstanding of how Nagios works.

Thanks,
Ben

Post by dwhitfield »

This is mostly written for XI, but I think it should give you some clues: https://support.nagios.com/kb/article.php?id=36

Most important is the idea of acknowledging issues.

You may also want to check out timeperiods: https://support.nagios.com/kb/article.php?id=369

There's not an easy way to do exactly what you want, but there are certainly plenty of ways to minimize your number of alerts.

Post by bencundiff »

Thanks for the reply.

Is there a script I could write for Nagios Core that would automate acknowledging the alerts? Some of the hosts failing checks are Windows 2003, others 2008R2. The 2003 boxes use NC_Net and the 2008R2 boxes use NSClient++.

I think what I want to do is define my event handler and a new command, as specified in the doc page I found, but I am unclear what commands to insert to tell the nagios server to acknowledge that specific service on that specific host.

Based on posts like this, I think I need to have the remote (monitored) host execute something via NC_Net or NSClient++. That is, I need to define a command that runs a batch script doing something similar to the example below, echoing the appropriate variables and the appropriate value for the service name.

/usr/bin/printf "[%lu] ACKNOWLEDGE_HOST_PROBLEM;$hosttoack;1;1;1;nagiosadmin;Down I know\n" 

Post by mcapra »

bencundiff wrote: I am unclear what commands to insert to tell the nagios server to acknowledge that specific service on that specific host.
You might find this helpful, though I'm not sure if the commands are consistent between Core 3 and Core 4:
https://old.nagios.org/developerinfo/ex ... ndlist.php

https://old.nagios.org/developerinfo/ex ... mand_id=39
https://old.nagios.org/developerinfo/ex ... mand_id=40

In a nutshell, there's a file you can write to which allows you to do "nagios things" from the CLI. Having the Windows machine be responsible for that is a whole other deal, though. Having the Windows machines submit passive check results might be a better route:
https://assets.nagios.com/downloads/nag ... hecks.html
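
To make the "file you can write to" idea concrete, here is a rough sketch of acknowledging a single service problem from the shell on the Nagios box. Treat it as an illustration under assumptions, not a tested recipe: the command file path is the one Ubuntu's nagios3 package commonly uses (check `command_file` in your nagios.cfg), and the host `dns01` and service `DNS` in the usage note are made-up names. The three `1` fields (sticky, notify, persistent) are copied from the host example earlier in the thread.

```shell
#!/bin/sh
# Sketch only: build an ACKNOWLEDGE_SVC_PROBLEM line for the external command file.
# ASSUMPTION: this path is a common Ubuntu nagios3 default; verify command_file
# in nagios.cfg on your own server before relying on it.
CMD_FILE="${CMD_FILE:-/var/lib/nagios3/rw/nagios.cmd}"

# ack_service HOST SERVICE AUTHOR COMMENT
# Line format:
#   [time] ACKNOWLEDGE_SVC_PROBLEM;host;service;sticky;notify;persistent;author;comment
ack_service() {
    printf '[%s] ACKNOWLEDGE_SVC_PROBLEM;%s;%s;1;1;1;%s;%s\n' \
        "$(date +%s)" "$1" "$2" "$3" "$4"
}

# Usage, run as a user allowed to write to the command file
# (dns01 and DNS are hypothetical names):
#   ack_service dns01 DNS nagiosadmin "Known issue, fixing Monday" > "$CMD_FILE"
```

The function only prints the line, so you can eyeball the output first and redirect it into the command file once it looks right.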

You could use something like NRDP/NSCA (which would live on the Nagios box) to act as the "bucket" that NSClient++/NC_Net sends the results to:
https://exchange.nagios.org/directory/A ... or/details
https://exchange.nagios.org/directory/A ... or/details
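
For reference, a passive service result is handed to Nagios as an external command of roughly the shape below, however it gets delivered. This is only a sketch of that line format, not of NRDP or NSCA themselves; the host `web01` and service `Disk C` in the usage note are hypothetical, and the service would need passive checks enabled for Nagios to accept the result.

```shell
#!/bin/sh
# Sketch only: format one passive service check result as an external command.
# Line format:
#   [time] PROCESS_SERVICE_CHECK_RESULT;host;service;return_code;plugin_output
# return_code follows the plugin convention: 0=OK, 1=WARNING, 2=CRITICAL, 3=UNKNOWN.

# passive_result HOST SERVICE RETURN_CODE OUTPUT
passive_result() {
    printf '[%s] PROCESS_SERVICE_CHECK_RESULT;%s;%s;%s;%s\n' \
        "$(date +%s)" "$1" "$2" "$3" "$4"
}

# Usage (hypothetical names; redirect to your command file on the Nagios box):
#   passive_result web01 "Disk C" 2 "CRITICAL - 95% used"
```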

Post by dwhitfield »

Thanks for the assist @mcapra!

Post by bencundiff »

mcapra wrote:
bencundiff wrote: I am unclear what commands to insert to tell the nagios server to acknowledge that specific service on that specific host.
You might find this helpful, though I'm not sure if the commands are consistent between Core 3 and Core 4:
https://old.nagios.org/developerinfo/ex ... ndlist.php

https://old.nagios.org/developerinfo/ex ... mand_id=39
https://old.nagios.org/developerinfo/ex ... mand_id=40

In a nutshell, there's a file you can write to which allows you to do "nagios things" from the CLI. Having the Windows machine be responsible for that is a whole other deal, though. Having the Windows machines submit passive check results might be a better route:
https://assets.nagios.com/downloads/nag ... hecks.html

You could use something like NRDP/NSCA (which would live on the Nagios box) to act as the "bucket" that NSClient++/NC_Net sends the results to:
https://exchange.nagios.org/directory/A ... or/details
https://exchange.nagios.org/directory/A ... or/details

Ok, that helps a lot!
If NC_Net isn't an active project and I can't figure out how to get passive checks to work on Server 2003 hosts, perhaps that's just another indicator that it's time for us to stop using Server 2003 in a production environment.


Would a third option be to have notifications sent to a dummy contact, and then escalate to a real contact group for alerts 2 to 6, and then escalate back to the dummy contact group for alerts 7+?

Post by scottwilkerson »

bencundiff wrote: Would a third option be to have notifications sent to a dummy contact, and then escalate to a real contact group for alerts 2 to 6, and then escalate back to the dummy contact group for alerts 7+?
Yes, this is what I would do to limit the number of notifications.
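
A rough sketch of what that dummy/real/dummy arrangement might look like in object config follows. All object names here (`nobody`, `noalerts`, `admins`, the `DNS` service) are placeholders rather than anything from the original setup, the `all-hosts` hostgroup is the one mentioned earlier in the thread, and the notification commands are assumed to be the stock sample ones. The service itself would list `noalerts` as its `contact_groups`, so notification 1 goes to the dummy contact. Shift the notification numbers if you want the first alert to reach real people.

```
# Sketch only: placeholder names throughout, adjust to your own objects.

define contact {
    contact_name                    nobody
    alias                           Dummy contact
    host_notification_period        24x7
    service_notification_period     24x7
    host_notification_options       d,u,r
    service_notification_options    w,u,c,r
    host_notification_commands      notify-host-by-email
    service_notification_commands   notify-service-by-email
    email                           nobody@localhost
}

define contactgroup {
    contactgroup_name               noalerts
    alias                           No alerts
    members                         nobody
}

# Notifications 2 through 6 go to the real contact group.
define serviceescalation {
    hostgroup_name                  all-hosts
    service_description             DNS
    first_notification              2
    last_notification               6
    notification_interval           30
    contact_groups                  admins
}

# From notification 7 on, hand back to the dummy group.
define serviceescalation {
    hostgroup_name                  all-hosts
    service_description             DNS
    first_notification              7
    last_notification               0    ; 0 = this escalation stays in effect
    notification_interval           0    ; 0 = no further re-notification after this one
    contact_groups                  noalerts
}
```

Running the config check (`nagios3 -v` against your nagios.cfg on the Ubuntu package) before reloading is a cheap way to catch a mistyped group or service name.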