Email Suppression
Posted: Tue May 07, 2013 9:52 am
by WillemDH
Hello,
Is there any way to prevent Nagios from sending out the same email several times within a short time period? To be specific, I have set up NSCP to alert on certain event log IDs via NSCA from the moment a cluster fails over. The problem is that when the cluster fails over, there are several almost identical event IDs, so we seem to get an email for each event. As we don't want 8 emails when a cluster fails over, I wondered if there is any way to automatically suppress subsequent emails for the same (passive) service. OK, I hear you thinking: why don't we differentiate in the event log setup so only one event is sent to Nagios? This is because there is one event ID for each cluster resource. As we always want a critical service from the moment there is a failover for any one of these cluster resources, we need to send them all. The problem is that most of the time, if one resource fails, they all do.
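To make the request concrete, the behaviour being asked for amounts to something like the following Python sketch. It only illustrates time-window suppression per host/service pair; it is not a Nagios feature or API, and the class name, the 300-second window, and the host and service names are all assumptions made up for the example.

```python
# Hypothetical sketch: allow at most one notification per (host, service)
# within a fixed time window. This is NOT a Nagios API.

class NotificationSuppressor:
    def __init__(self, window_seconds):
        self.window_seconds = window_seconds
        self._last_sent = {}  # (host, service) -> timestamp of the last email sent

    def should_send(self, host, service, timestamp):
        key = (host, service)
        last = self._last_sent.get(key)
        if last is not None and timestamp - last < self.window_seconds:
            return False  # duplicate inside the window: suppress it
        self._last_sent[key] = timestamp
        return True


# Four near-identical failover events: three within seconds, one much later.
suppressor = NotificationSuppressor(window_seconds=300)
events = [("CLUSTER", "Failover events", t) for t in (0, 2, 5, 400)]
sent = [suppressor.should_send(h, s, t) for h, s, t in events]
print(sent)  # [True, False, False, True]
```

The window is measured from the last email actually sent, so a burst of events produces one email, and a later, separate failover still notifies.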
Thanks for any input.
Re: Email Suppression
Posted: Tue May 07, 2013 11:02 am
by sreinhardt
Hmm, so it sounds like you have a cluster of servers, with each server being monitored via NSCA. When you have a failover, each server sends an alert because they are all now "failing". Is this correct? If they will always all fail at the same time, it would probably work to disable notifications for all but one. Otherwise, if there is a primary server that you monitor per cluster, you could look at a parent/child relationship, so that when the parent goes down you only see notifications for it and not for the children as well. This would also allow an individual server to fail and still send notifications.
Re: Email Suppression
Posted: Wed May 08, 2013 3:13 am
by WillemDH
In fact it's a MS failover cluster with two physical nodes and 8 cluster instances. Each cluster instance gives an alert when the cluster fails over from one node to another. I will look into this parent child relationship. Any directions where to configure this? Is it in CCM Advanced - Host Dependencies?
Re: Email Suppression
Posted: Wed May 08, 2013 10:46 am
by abrist
You can create relationships by adding a host as a child of another by clicking "manage parents" in the hosts page in the CCM.
Re: Email Suppression
Posted: Wed May 22, 2013 5:05 am
by WillemDH
When a host is a child of another host, what are the implications?
- When the parent is down, the children will not generate alerts anymore?
- When one of the children goes down, does the parent also go down?
Re: Email Suppression
Posted: Wed May 22, 2013 11:34 am
by slansing
This should help explain a little faster than I can by hand, I believe you want to see the bottom of the page:
http://nagios.sourceforge.net/docs/3_0/ ... ility.html
Re: Email Suppression
Posted: Fri Mar 21, 2014 1:09 pm
by WillemDH
Well, I'm cleaning up my unclosed threads, and as I now know a little more about Nagios, in my opinion parent/child relationships might be good for networks, but not for failover clusters. The cluster resources can't really be parents, because when they are down, not necessarily all cluster nodes are down. Or should both of the nodes be parents of the cluster instances? When a host has multiple parents, will it only get the status of unreachable when both parents are down?
Re: Email Suppression
Posted: Fri Mar 21, 2014 2:12 pm
by tmcdonald
WillemDH wrote:When a host has multiple parents, will it only get the status of unreachable when both parents are down?
Correct. All the parents must be down for the child to go down as well, but if a single dependency goes down then this dependent does as well.
Re: Email Suppression
Posted: Fri Mar 21, 2014 3:02 pm
by WillemDH
Can you clarify
but if a single dependency goes down then this dependent does as well.
please. I'm not 100% sure what you mean. I read
http://support.nagios.com/knowledgebase ... wdesc=true in the meantime, where they clarify that parent/child relationships should be used to reflect the network setup. So this would imply I had better use dependencies for failover clusters. Is this correct?
But in
http://nagios.sourceforge.net/docs/3_0/ ... ncies.html they say:
Tip: Do not confuse host dependencies with parent/child host relationships. You should be using parent/child host relationships (defined with the parents directive in host definitions) for most cases, rather than host dependencies. A description of how parent/child host relationships work can be found in the documentation on network reachability.
So what would be the best setup for a Windows failover cluster in active-active mode, with different cluster instances that each have their own IP/hostname?
- SERVERNODE01
- SERVERNODE02
- CLUSTERNAME
- CLUSTERRESOURCE01
- CLUSTERRESOURCE02
- CLUSTERRESOURCE03
- CLUSTERRESOURCE04
Willem
Re: Email Suppression
Posted: Mon Mar 24, 2014 10:23 am
by tmcdonald
With a parent/child relationship, all of the parents must go down before the child will show down as well. So if a server is connected to three switches then all three of those switches must go down before the server cannot connect out.
With dependencies, if any one of them goes down then the dependent host will go down as well. This might be the case with a web app that needs access to 4 different databases, so if one goes down the app is unusable.
To use an analogy, parent/child is like a room with windows for sunlight: you can still see if at least one of the windows is letting light through. Dependencies are like wheels on a car: as soon as one is missing, you have problems.
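As a rough illustration of the difference, here is a toy Python model of the two rules as described above. It is not Nagios code: the function names, host names, and the `is_up` mapping are hypothetical, and real Nagios distinguishes states such as DOWN and UNREACHABLE that this sketch ignores.

```python
# Toy model of the two rules described above; NOT Nagios code.
# All host names and the is_up mapping are hypothetical.

def blocked_by_parents(parents, is_up):
    """Parent/child rule: the child is affected only when ALL parents are down."""
    return bool(parents) and all(not is_up[p] for p in parents)


def blocked_by_dependencies(dependencies, is_up):
    """Dependency rule: the dependent is affected when ANY dependency is down."""
    return any(not is_up[d] for d in dependencies)


is_up = {
    "switch1": False, "switch2": True, "switch3": True,  # one of three switches down
    "db1": True, "db2": False,                           # one of two databases down
}

# Windows: two switches still let "light" through, so the server is fine.
print(blocked_by_parents(["switch1", "switch2", "switch3"], is_up))  # False

# Wheels: one missing database is enough to cause a problem.
print(blocked_by_dependencies(["db1", "db2"], is_up))  # True
```

In short, parents combine with "all", dependencies combine with "any".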
As for your specific instance, I would need to see a diagram. The documentation says you "should" use them a certain way, but it is not a requirement and there are definitely cases where one would make more sense than the other.