
Re: [Nagios-devel] Dependencies in redundant networks and services:

Posted: Mon May 23, 2011 6:35 am
by Guest
On 05/22/2011 10:12 PM, Matthew Pounsett wrote:
>
> Searching back through the archives it seems that the issue of
> handling service and host dependencies on redundant services or hosts
> comes up from time to time (actually, far less often than I would
> have expected) and nobody seems to have a really good solution to the
> problem.
>
> Imagine a web service (call it W) which depends on two separate
> databases (call them database A and database B), where both databases
> have redundant backups and the web service can contact either the
> primary or backup for each database and still do its job (A1, A2, B1,
> B2). Doing this without a more flexible dependency system requires
> either some very complicated combinatorial setup where we have W1
> dependent on A1,B1, W2 dependent on A1,B2, W3 on A2,B1, etc. or one
> very complicated custom check script which implements the
> dependencies itself.
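To make the size of the combinatorial setup concrete, here is a minimal Python sketch that enumerates the pairings described above. The replica names come from the example; the `redundant_groups` mapping and the `combinatorial_checks` helper are hypothetical names used only for illustration, not anything Nagios provides.

```python
from itertools import product

# Replica names from the example above; the dict layout is an assumption.
redundant_groups = {
    "A": ["A1", "A2"],
    "B": ["B1", "B2"],
}

def combinatorial_checks(groups):
    """Return every pairing of one replica per group; each would need its own check."""
    return list(product(*groups.values()))

pairs = combinatorial_checks(redundant_groups)
for i, combo in enumerate(pairs, start=1):
    print(f"W{i} depends on {','.join(combo)}")
```

Two groups of two replicas already need four duplicate checks, and the count is the product of the group sizes, so it grows quickly as more redundant masters are added.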
>
> I've been thinking about this a fair bit over the last couple of
> weeks since I manage a network and suite of services where nearly
> everything is redundant, and almost no single outage of any component
> results in an 'unreachable' state for any other component. I'd very
> much like to avoid having to run all kinds of duplicate checks and
> train the rest of my staff to ignore alerts unless they arrive in
> pairs.
>
> I think I've hit upon an idea, but it's a fairly significant change
> to the way service and host dependencies work today, and so I don't
> think it's reasonable to pursue it any earlier than Nagios 4.0, but
> I'd like to get some feedback to see if others think this might be
> the right way to go (and I'm hoping I don't get too many TL;DRs).
>
> In a nutshell, my idea is to separate the definition of the master
> service/host from the association to it by the dependent
> service/host, and make the association by reference from the
> dependent service or host definition... much the same way as a
> service is associated to a host by reference.
>
> There are two big wins from doing this: 1) If the dependency is
> created by reference from the service or host definition, that opens
> the door to using a boolean syntax in that reference, allowing both
> simple *and* complex dependencies. 2) Moving the dependency
> association into the service or host definition also allows the
> association to be applied to services or hosts by
> servicegroup/hostgroup which simplifies configuration file
> authoring.
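The proposal does not spell out what the boolean syntax would look like, so the following Python sketch is only one possible reading. It models a dependency reference as a nested tuple of `"and"` / `"or"` operators over master names; the tuple encoding, the `is_up` function, and the `W_DEPENDS_ON` expression are all assumptions made for illustration.

```python
def is_up(expr, up):
    """Evaluate a nested ("and" | "or", operand, ...) expression.

    `expr` is either a master name (str) or a tuple whose first item is the
    operator. `up` is the set of masters currently considered healthy.
    """
    if isinstance(expr, str):
        return expr in up
    op, *operands = expr
    results = [is_up(operand, up) for operand in operands]
    if op == "and":
        return all(results)
    if op == "or":
        return any(results)
    raise ValueError(f"unknown operator: {op!r}")

# Hypothetical encoding of "W needs (A1 or A2) and (B1 or B2)".
W_DEPENDS_ON = ("and", ("or", "A1", "A2"), ("or", "B1", "B2"))

print(is_up(W_DEPENDS_ON, {"A2", "B1"}))  # one replica of each database is up
print(is_up(W_DEPENDS_ON, {"A1", "A2"}))  # both replicas of database B are down
```

A single expression like this would stand in for the four separate W1..W4 definitions that the combinatorial approach requires.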
>
> Here's one example where using a hostgroup for the master service (or
> a list of hosts) contains the implicit assumption that all of the
> services referenced in a single servicedependency definition are
> redundancies of each other. I don't like doing anything by
> implication, but this provides a match to the current implication
> that all master services referenced by a dependent are not
> redundancies of each other, and keeps the configuration very simple.
>
>
> define service {
>     host_name              web-host
>     service_description    Web Service W
>     dependencies           db-a-dependency,db-b-dependency
> }
>
> define hostgroup {
>     hostgroup_name    database-hosts
>     members           db-host-1,db-host-2
> }
>
> define service {
>     hostgroup_name         database-hosts
>     service_description    Database A
> }
>
> define service {
>     hostgroup_name         database-hosts
>     service_description    Database B
> }
>
> define servicedependency {
>     servicedependency_name           db-a-dependency
>     hostgroup_name                   database-hosts
>     service_description              Database A
>     notification_failure_criteria    w,u,c,p
>     dependency_period                24x7
> }
>
> define servicedependency {
>     servicedependency_name           db-b-dependency
>     hostgroup_name                   database-hosts
>     service_description              Database B
>     notification_failure_criteria    w,u,c,p
>     dependency_period                24x7
> }
>
> Since the implication by using a hostgroup_name or a list of hosts in
> the servicedependency definition is that the referenced services are
> redundant, the servicedependency doesn't 'fail' until all of the
> referenced services meet *any* of the notification_failure_criteria
> (e.g. on

...[email truncated]...
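The rule stated just before the truncation can be sketched in Python as follows. This is a reading of the proposal, not existing Nagios behaviour: a servicedependency that references several services "fails" only when every one of them is in a state listed in the failure criteria. The function names and the dictionary layout are hypothetical, and combining separate dependencies with "any" is an assumption based on the remark that separately referenced masters are not redundancies of each other.

```python
# Failure-criteria letters and the service states they stand for.
CRITERIA_STATES = {
    "o": "OK",
    "w": "WARNING",
    "u": "UNKNOWN",
    "c": "CRITICAL",
    "p": "PENDING",
}

def group_failed(states, criteria):
    """True only if ALL redundant services are in ANY of the failing states."""
    failing = {CRITERIA_STATES[letter] for letter in criteria.split(",")}
    # An empty group is treated as "not failed" rather than vacuously failed.
    return bool(states) and all(state in failing for state in states)

def notifications_suppressed(groups, criteria="w,u,c,p"):
    """Assumed combination rule: any one failed dependency suppresses notifications."""
    return any(group_failed(states, criteria) for states in groups.values())

# Database A has lost one replica; Database B is fully healthy.
one_replica_down = {
    "db-a-dependency": ["CRITICAL", "OK"],
    "db-b-dependency": ["OK", "OK"],
}
# Both replicas of Database A are down.
all_a_down = {
    "db-a-dependency": ["CRITICAL", "WARNING"],
    "db-b-dependency": ["OK", "OK"],
}

print(notifications_suppressed(one_replica_down))
print(notifications_suppressed(all_a_down))
```

Under this reading, losing a single replica leaves the dependency intact, and only the loss of every replica of one database marks it as failed.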


This post was automatically imported from historical nagios-devel mailing list archives
Original poster: [email protected]