Nagios sending out too many alerts
Posted: Mon Apr 02, 2012 2:43 am
by johndoe
Hi all,
About our setup: We have Nagios deployed across our servers using passive checks over NRDP, with freshness checking enabled.
Problem: Nagios sends far too many alerts when a single host is unavailable. For example, if a host goes down I get about 10 notifications (one per service) for the service problems, and then 10 more when the server comes back up. As you can imagine, this is not ideal.
Question: Is there a way to combine these into a single alert, or at least fewer?
Re: Nagios sending out too many alerts
Posted: Mon Apr 02, 2012 9:02 am
by scottwilkerson
You can set up host/service dependencies to reduce this:
Configure -> Core Config Manager -> Advanced -> Host/Service Dependencies
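With many hosts, dependency definitions like these could in principle be generated by a script rather than entered one by one. The following is only a rough sketch: it assumes each host has one "master" service that the others depend on, and that the generated text can be imported into your configuration. The host names, service names, and the `Heartbeat` master service are hypothetical; the directives are standard Nagios Core `servicedependency` directives.

```python
# Sketch: emit one servicedependency per dependent service, so that every
# service on a host depends on a single "master" service on the same host.
# All host and service names below are made up for illustration.

def service_dependency(host, master, dependent):
    """Return one Nagios servicedependency object definition as text."""
    return (
        "define servicedependency {\n"
        f"    host_name                      {host}\n"
        f"    service_description            {master}\n"
        f"    dependent_host_name            {host}\n"
        f"    dependent_service_description  {dependent}\n"
        # Suppress dependent notifications while the master service is
        # CRITICAL (c) or UNKNOWN (u).
        "    notification_failure_criteria  c,u\n"
        "}\n"
    )


def build_dependencies(inventory, master):
    """Build definitions for every non-master service on every host."""
    blocks = []
    for host, services in sorted(inventory.items()):
        for svc in services:
            if svc != master:
                blocks.append(service_dependency(host, master, svc))
    return "\n".join(blocks)


# Hypothetical inventory: host name -> list of service descriptions.
inventory = {
    "web01": ["Heartbeat", "Disk", "Load"],
    "db01": ["Heartbeat", "Disk"],
}

config_text = build_dependencies(inventory, "Heartbeat")
print(config_text)
```

Re-running a script like this whenever hosts are added would keep the dependencies in step with the inventory, though whether that fits an XI-managed configuration is something to verify first.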
Re: Nagios sending out too many alerts
Posted: Wed Apr 04, 2012 2:55 am
by johndoe
Thanks for the reply. I had a look at host and service dependencies; however, from what I could understand, I'd have to set them up manually for each existing host and manually again for every new host we add. I guess that's not viable in a large, growing environment where machines are added constantly.
I came across this post whilst searching for a solution for this:
http://forums.meulie.net/viewtopic.php? ... 281#p17288
I have "sort of" resolved this by ensuring that hosts go "down" before services go "critical". Basically, hosts and services are both set to be checked every 15 seconds, but hosts have max_check_attempts set to 2 and services have this value set to 4. This means that when a host goes down, it generates a host alert 30 seconds before the first service alert, and this suppresses service notifications for hosts that are down.
It's a kludge, but it's better than getting 14 e-mail alerts (one for the server and one for each service on that server) every time a server reboots.
If anyone has any advice about how to get Nagios to check the host automatically when a service goes down before it sends a service alert, that would still be better.
Cheers!
Miles
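The timing in the quoted post can be checked with a little arithmetic. This is a simplified sketch that assumes every check attempt is spaced by the same interval, as the post describes; the function name is made up for illustration, and real scheduling (retry intervals, passive result arrival times) may differ.

```python
# Simplified model: each failed attempt is one check interval apart, so the
# host reaches its final attempt earlier than the services by the
# difference in max_check_attempts times the interval.

def host_head_start(interval_s, host_attempts, service_attempts):
    """Seconds by which the host alert precedes the first service alert."""
    return (service_attempts - host_attempts) * interval_s


# Values from the quoted post: 15 second checks, 2 host attempts,
# 4 service attempts.
head_start = host_head_start(15, 2, 4)
print(head_start)  # 30
```

If host and service results arrive at the same interval with the same number of attempts, the head start is zero, which is the situation described in the paragraph that follows.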
Although I think the poster's solution could suit us, I have a slight problem with it. As previously mentioned, all our checks come in via NRDP, and at the moment both HOST checks and SERVICE checks arrive at the same interval, meaning the HOST checks don't necessarily arrive before the SERVICE checks.
I could change our check times to adapt to this, as mentioned in the quoted post (making HOST checks more frequent than SERVICE checks); however, I wonder if this is the right path to follow before I start changing everything, which would be quite a time-consuming task.
It is worth noting that Nagios XI DOES NOT have access to ping the hosts or run any active checks; our monitoring infrastructure is based mainly on passive checks and freshness.
Re: Nagios sending out too many alerts
Posted: Wed Apr 04, 2012 12:27 pm
by scottwilkerson
This can get into a sort of chicken-and-egg thing. One solution would be to use the BPI Addon and set up a business process for each group, and then only process notifications for the whole process.
Re: Nagios sending out too many alerts
Posted: Tue Apr 10, 2012 10:07 am
by johndoe
Thanks Scott,
So I've set up a BPI entry per host with all the services inside (needless to say, this makes Nagios even messier).
Now here's my current issue: on the services I have "Notification enabled" set to "Skip", and each service uses the template "xiwizard_passive_service", which in turn has "Notification enabled" set to "Off".
I thought this would mean that notifications for that service would be disabled, as it would inherit the "Off" status from the template (and then I would set up the notifications on the newly created BPI group). However, this is not what is happening. The services still send out notifications and don't seem to inherit the "Off" status.
Is my assumption that it should inherit the "Off" status wrong?
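To make the expected behaviour concrete, here is a minimal sketch of how a directive left unset on an object (the "Skip" case) would be looked up through its template. It is a simplification and not Nagios code: it assumes a single template per `use` line, the service name `my_service` and the `resolve` helper are hypothetical, and the fallback default of enabled is an assumption.

```python
# Sketch of template inheritance: a directive that is not set on the
# object itself is taken from the first template in the "use" chain
# that sets it; otherwise a default applies.

def resolve(name, directive, objects, default=None):
    """Walk the 'use' chain from `name` until `directive` is found."""
    seen = set()
    while name is not None and name not in seen:
        seen.add(name)  # guard against circular template references
        obj = objects[name]
        if directive in obj:
            return obj[directive]
        name = obj.get("use")
    return default


objects = {
    # Template with notifications switched off, as described above.
    "xiwizard_passive_service": {"notifications_enabled": "0"},
    # Hypothetical service that leaves the directive unset ("Skip").
    "my_service": {"use": "xiwizard_passive_service"},
}

effective = resolve("my_service", "notifications_enabled", objects, default="1")
print(effective)  # 0
```

Under this model the service ends up with notifications disabled, which matches the assumption in the post; it only describes the configuration files, not any state changed at runtime.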
Re: Nagios sending out too many alerts
Posted: Tue Apr 10, 2012 10:15 am
by scottwilkerson
This is what should happen, and I just tested it on my test box running 2011R2.2 and it appears to be working as expected.
Are all your services that are sending notifications using the xiwizard_passive_service template?
And the obvious, did you Apply Configuration after making the change?
Re: Nagios sending out too many alerts
Posted: Tue Apr 10, 2012 10:42 am
by johndoe
The configs are indeed synced; have a look at the attachments below.
Re: Nagios sending out too many alerts
Posted: Tue Apr 10, 2012 11:30 am
by scottwilkerson
scottwilkerson wrote:
Are all your services that are sending notifications using the xiwizard_passive_service template?
Re: Nagios sending out too many alerts
Posted: Tue Apr 10, 2012 11:33 am
by johndoe
scottwilkerson wrote:
Are all your services that are sending notifications using the xiwizard_passive_service template?
Yes, I'm testing this with just one for the time being, and forcing it to Critical/OK to force a notification.
Re: Nagios sending out too many alerts
Posted: Tue Apr 10, 2012 11:42 am
by scottwilkerson
When you view this in XI, does it have the "no notifications" icon?
notif.PNG
If not, it could be because you have submitted an external command to Nagios disabling and re-enabling notifications.