Re: [Nagios-devel] Core 4 Remote Workers


On 02/03/2013 02:37 AM, Jochen Bern wrote:
> On 02.02.2013 15:12, Eric Stanley wrote:
>> The host key should be allowed to specify one or more IP addresses, IP
>> subnets, contiguous IP address ranges, host names and host name
>> patterns/wildcards (i.e. *.example.com). If multiple workers register
>> for the same host, some sort of distribution mechanism should be used to
>> load balance the workers.
>
> First off, I'm even more firmly opposed to the assumption that
> $HOSTADDRESS$ == IP address than Andreas. I've set up Nagios instances
> for customers where $HOSTADDRESS$ actually happened to be
> -- router management address plus SNMP index of customer-facing
> interface (for a carrier who considered it verboten to snoop *into*
> the network of CPE-less business customers to determine whether the
> links are "up" in an SLA-relevant way)
> -- IP address enriched with VLAN tags (which code for the path through
> the WDM multiplexers to the CPE's management interface)
> -- IP plus SSH port, plus optionally ssh options (server admin insists
> on login banner, Nagios admin throws a "-q" into the gearbox ...)
> -- *CHAINS* of IP / SSH port pairs (software supplier also supplies the
> monitoring, but some of his customers insist on burying the server
> *several* SSH hops deep within his own network)
> and I suppose I've been lucky not to have had to deal with a mixed
> IPv4/IPv6 shop yet.
>
> Having that said: From your description, I'm under the impression that
> you're picturing a scenario of a complex network where the central
> Nagios actually cannot reach the "leaf" hosts itself, whereas the worker
> concept seems to be oriented towards load distribution IIUC.
>
> These two scenarios aren't 100% compatible in their technical needs. For
> example, when the central Nagios winds up with no suitable worker for
> certain target hosts, the disjoint-nets scenario likely would leave it
> no choice but to mark the upcoming checks UNREACHABLE/UNKNOWN, while the
> load-distrib scenario would call for it to run the checks itself (and
> try harder to push *other* checks to workers to rid itself of the
> increased load). Also, responsibility and, thus, configuration tends to
> follow the segregation of the networks.
>

Scenario 1: Load balancing, using remote workers to extend the CPU and
memory resources available to us.
When workers go offline (for whatever reason), their load is distributed
among the remaining workers.


Scenario 2: "Passing" firewalls, using remote workers to run checks the
master can't run itself due to access restrictions.
When workers go offline, checks are either not executed (requires more
configuration), or marked as UNKNOWN. An internal check for the worker
itself will be a parent service of the services it's supposed to handle.
This requires being able to add services on-the-fly from inside Nagios,
which is halfway planned anyway, but will require additional bookkeeping
variables inside the objects and has to be left for 4.1.


Scenario 3: Remote view of inside services, using remote workers to see
the network from a particular point of view, such as a field office using
services inside the main office.
When workers go offline, checks can be executed locally, but a check
to see that the worker is up and running should trigger an alert. The
local check can be set up to return whatever the user wants, and parenting
can be handled either as in scenario 1 or as in scenario 2.
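
As a rough illustration of how the three scenarios differ once a worker
goes offline, here is a minimal sketch. It is not Nagios code: the names
Policy, MASTER and route_check are made up for this example, and picking
a remaining worker by CRC32 of the check name is just one assumed way to
spread the load. Running checks on the master when no worker is left in
scenario 1 is also an assumption, taken from the load-distribution case
described in the quoted mail.

```python
import zlib
from enum import Enum


class Policy(Enum):
    """Hypothetical per-check fallback policy, one per scenario."""
    LOAD_BALANCE = 1   # scenario 1
    FIREWALL = 2       # scenario 2
    REMOTE_VIEW = 3    # scenario 3


MASTER = "master"  # placeholder name for the central Nagios instance


def route_check(check_name, policy, assigned, online):
    """Decide what to do with a check.

    Returns (action, target, alert): action is "run" or "unknown",
    target is the worker (or MASTER) that should run the check, and
    alert says whether the missing worker should raise an alert here.
    """
    workers = sorted(online)
    if assigned in workers:
        # The normal case: the assigned worker is up.
        return ("run", assigned, False)

    if policy is Policy.LOAD_BALANCE:
        if workers:
            # Spread orphaned checks over the remaining workers.
            idx = zlib.crc32(check_name.encode()) % len(workers)
            return ("run", workers[idx], False)
        # Assumption: with no workers left, the master runs it itself.
        return ("run", MASTER, False)

    if policy is Policy.FIREWALL:
        # Nobody else can reach the target. The worker's own check is
        # the parent service, so no separate alert is raised here.
        return ("unknown", None, False)

    # REMOTE_VIEW: run locally, but flag that the worker is gone.
    return ("run", MASTER, True)
```

For example, route_check("check_http", Policy.FIREWALL, "w1", {"w2"})
gives ("unknown", None, False), while the same call with
Policy.REMOTE_VIEW gives ("run", "master", True).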

> To sum it up, what I would imagine as Nagios' long-term development for
> *your* scenario wouldn't be a Nagios/worker "tasks go downstream"
> interface but one that allows a local Nagios to push "local" status data
> (from config to current check results) to an upstream "integration and
> oversight" Nagios.
>
> (And yes, pinpointing how exactly you can and want to do access control,
> formation of host/service *groups*, notifications for local/global
> users, yadda yadda, with such a configuration brain split *will* be a bear.)
>

Yup. It *is* useful though. mod_gearman has the same issue, really, and
people use tha

...[email truncated]...


This post was automatically imported from historical nagios-devel mailing list archives
Original poster: ae@op5.se