Re: [Nagios-devel] Core 4 Remote Workers

On 02/05/2013 12:43 PM, Eric Stanley wrote:
> On 2/2/13 5:52 PM, Andreas Ericsson wrote:
>> On 02/02/2013 03:12 PM, Eric Stanley wrote:
>>> 3. Add a host key to the worker registration to allow workers to specify
>>> the host(s) for which they will handle checks.
>> Not really difficult, although I suspect one will want to use groups
>> instead of specific hosts, and also use the address which the other
>> node is connecting from as the host to monitor (so one can have self-
>> monitoring servers that phone in to Nagios with their results).
> I like the idea of the remote worker checking itself by default, but I
> think we should allow the remote worker to exclude itself from checking
> its host or maybe from checking certain aspects of itself using the
> check type concept you proposed below.
>>> The reason I have steps 1 and 2, instead of combining them is first,
>>> because a generalized solution is more extensible and second, I think
>>> having multiple TCP listeners is a reasonable use case where you have a
>>> multi-homed system, but you may not want to listen on all interfaces.
>>>
>> That can be firewalled away quite trivially, so no need for us to handle
>> that with code that might break (as I suspect it will see little testing).
>
> I've got to believe that most of the effort and testing would be going
> from 1 listener to 2 and that going from 1 to n is a simple
> generalization of the 1 to 2 case.

The 1 to n case is already handled. That's the whole point behind the I/O
broker.
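
To illustrate why going from 1 to n listeners costs nothing extra once an event loop multiplexes the sockets, here is a minimal Python sketch. It uses Python's standard `selectors` module, not the Nagios I/O broker API, and the function names (`open_listeners`, `register_all`, `accept_ready`) are invented for this example.

```python
import selectors
import socket


def open_listeners(addresses):
    """Bind one non-blocking TCP listener per (host, port) pair."""
    listeners = []
    for host, port in addresses:
        sock = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
        sock.setsockopt(socket.SOL_SOCKET, socket.SO_REUSEADDR, 1)
        sock.bind((host, port))
        sock.listen()
        sock.setblocking(False)
        listeners.append(sock)
    return listeners


def register_all(listeners):
    """Register every listener with a single selector (the 'broker')."""
    selector = selectors.DefaultSelector()
    for sock in listeners:
        # The bound address is stored as user data so the caller can tell
        # which interface a connection arrived on.
        selector.register(sock, selectors.EVENT_READ, data=sock.getsockname())
    return selector


def accept_ready(selector, timeout=1.0):
    """Accept one pending connection on each listener that is readable."""
    accepted = []
    for key, _events in selector.select(timeout):
        conn, _peer = key.fileobj.accept()
        accepted.append((key.data, conn))
    return accepted
```

The loop in `accept_ready` is the same whether one socket or twenty are registered, so listening on selected interfaces of a multi-homed system only changes the list passed to `open_listeners`.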

> Telling the main daemon not to listen
> on certain interfaces provides security in depth.

Except that people won't trust it, and it's ludicrously simple to set
firewall rules for it. But by all means, adding multiple network sockets
really isn't any harder than adding one, so what the hey.

>>> The host key should be allowed to specify one or more IP addresses, IP
>>> subnets, contiguous IP address ranges, host names and host name
>>> patterns/wildcards (i.e. *.example.com). If multiple workers register
>>> for the same host, some sort of distribution mechanism should be used to
>>> load balance the workers.
>>>
>> Umm... Is this what the remote worker should request? If so, we're doing
>> a pretty major change in Nagios where a host's address is always just a
>> string that we pass to the plugins, and it won't be long until people
>> start requesting regex matching, subdomain matching and whatnot for it,
>> and we'll have to start resolving hostnames.
>>
>> I'd say just go with hostgroups instead. It's easier, and people will
>> have to do some minor configuring of remote workers anyway, so saying
>> "hostgroups=core-routers" in that config in addition to ip and port
>> to Nagios isn't such a big chore.
> Maybe I wasn't clear. I don't see a change in the way Nagios itself
> performs checks. It is just the worker specifying the systems for which
> it is willing and able to perform checks. If you configure Nagios to
> check host x and no worker registers specifically to check host x,
> Nagios will use workers that have not specified the hosts for which
> they'll perform checks, which at least for now, defaults to the local
> workers.
>
> I find the idea of the remote worker using hostgroups to volunteer for
> checks appealing because of its simplicity, but might it not be fragile?
> Assume the members of the hostgroup must be checked using a remote
> worker because of network configuration. If someone removes a host from
> the hostgroup, it will cease to be checked. If someone deletes (or
> renames) the hostgroup, none of the hosts will be checked. If someone
> adds a host to the hostgroup that the remote worker cannot check, it
> will never be checked. You might spend a lot of time trying to figure
> out why your hosts/services aren't being checked in one of these cases.

Same problem with misconfiguration of anything at all. Except for the
"add a host the worker can't check". In that case it will run headlong
into whatever error the inability to check the node would cause and
raise an alert from that.
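
The hostgroup approach discussed above can be sketched in Python as follows. This is only an illustration under assumptions: the `WorkerPool` class, its method names, and the round-robin distribution are hypothetical and not taken from the Nagios source. It shows remote workers volunteering for hostgroups, checks being spread across all workers registered for the same group, and the fallback to local workers when nobody has registered for a host.

```python
class WorkerPool:
    """Hypothetical registry mapping hostgroups to volunteering workers."""

    def __init__(self, local_workers):
        self.local_workers = list(local_workers)
        self.by_hostgroup = {}   # hostgroup name -> list of worker names
        self._cursors = {}       # candidate tuple -> next round-robin index

    def register(self, worker, hostgroups):
        """Record that a remote worker volunteers for these hostgroups."""
        for group in hostgroups:
            self.by_hostgroup.setdefault(group, []).append(worker)

    def pick(self, host_hostgroups):
        """Choose a worker for a host that belongs to the given hostgroups."""
        candidates = []
        for group in host_hostgroups:
            for worker in self.by_hostgroup.get(group, []):
                if worker not in candidates:
                    candidates.append(worker)
        if not candidates:
            # No worker registered for this host: fall back to local workers.
            candidates = self.local_workers
        if not candidates:
            raise LookupError("no worker available")
        # Simple round-robin among the candidates (an assumed policy).
        key = tuple(candidates)
        index = self._cursors.get(key, 0)
        self._cursors[key] = index + 1
        return candidates[index % len(candidates)]
```

In this sketch, a host added to a hostgroup that the remote worker cannot reach is still handed to that worker. The check then fails and raises an alert, rather than silently going unchecked.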

Using a resolved address wil

...[email truncated]...


This post was automatically imported from historical nagios-devel mailing list archives
Original poster: ae@op5.se