Re: [Nagios-devel] Core 4 Remote Workers


On 2/2/13 5:52 PM, Andreas Ericsson wrote:
> On 02/02/2013 03:12 PM, Eric Stanley wrote:
>> 3. Add a host key to the worker registration to allow workers to specify
>> the host(s) for which it will handle checks.
> Not really difficult, although I suspect one will want to use groups
> instead of specific hosts, and also use the address which the other
> node is connecting from as the host to monitor (so one can have self-
> monitoring servers that phone in to Nagios with their results).
I like the idea of the remote worker checking itself by default, but I
think we should allow the remote worker to exclude itself from checking
its own host, or maybe from checking certain aspects of itself, using the
check type concept you proposed below.
>> The reason I have steps 1 and 2, instead of combining them is first,
>> because a generalized solution is more extensible and second, I think
>> having multiple TCP listeners is a reasonable use case where you have a
>> multi-homed system, but you may not want to listen on all interfaces.
>>
> That can be firewalled away quite trivially, so no need for us to handle
> that with code that might break (as I suspect it will see little testing).
I've got to believe that most of the effort and testing would be going
from 1 listener to 2 and that going from 1 to n is a simple
generalization of the 1 to 2 case. Telling the main daemon not to listen
on certain interfaces provides security in depth.
>> The host key should be allowed to specify one or more IP addresses, IP
>> subnets, contiguous IP address ranges, host names and host name
>> patterns/wildcards (e.g. *.example.com). If multiple workers register
>> for the same host, some sort of distribution mechanism should be used to
>> load balance the workers.
>>
> Umm... Is this what the remote worker should request? If so, we're doing
> a pretty major change in Nagios where a host's address is always just a
> string that we pass to the plugins, and it won't be long until people
> start requesting regex matching, subdomain matching and whatnot for it,
> and we'll have to start resolving hostnames.
>
> I'd say just go with hostgroups instead. It's easier, and people will
> have to do some minor configuring of remote workers anyway, so saying
> "hostgroups=core-routers" in that config in addition to ip and port
> to Nagios isn't such a big chore.
Maybe I wasn't clear. I don't see a change in the way Nagios itself
performs checks. It is just the worker specifying the systems for which
it is willing and able to perform checks. If you configure Nagios to
check host x and no worker registers specifically to check host x,
Nagios will use workers that have not specified the hosts for which
they'll perform checks, which at least for now, defaults to the local
workers.
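
The fallback described above, combined with the host-key forms from the
original proposal (addresses, subnets, contiguous ranges, host names and
wildcards), could look roughly like the Python sketch below. It is only an
illustration, not Nagios code: the names key_matches and Dispatcher, the
worker names, and the round-robin choice among workers registered for the
same host are all assumptions.

```python
import fnmatch
import ipaddress
from itertools import count


def key_matches(key, address, hostname):
    """Return True if one host-key entry covers the given host (hypothetical helper)."""
    try:
        addr = ipaddress.ip_address(address)
    except ValueError:
        addr = None  # the configured address is a name, not an IP literal
    if addr is not None:
        lo, dash, hi = key.partition("-")
        try:
            if dash:
                # contiguous range such as "10.0.0.5-10.0.0.9"
                return ipaddress.ip_address(lo) <= addr <= ipaddress.ip_address(hi)
            # single address or subnet such as "10.0.0.0/24"
            return addr in ipaddress.ip_network(key, strict=False)
        except (ValueError, TypeError):
            pass  # key is not IP-style; fall through to name matching
    # host name or wildcard pattern such as "*.example.com"
    return fnmatch.fnmatchcase(hostname.lower(), key.lower())


class Dispatcher:
    """Pick a worker for a host; workers with no host keys act as the fallback."""

    def __init__(self, workers):
        # workers: list of (name, [host keys]); an empty key list means the
        # worker did not restrict itself (for now, the local workers)
        self.workers = workers
        self._turn = count()

    def pick(self, address, hostname):
        specific = [name for name, keys in self.workers
                    if any(key_matches(k, address, hostname) for k in keys)]
        pool = specific or [name for name, keys in self.workers if not keys]
        if not pool:
            return None
        # assumed distribution mechanism: simple round robin over the pool
        return pool[next(self._turn) % len(pool)]
```

With a local worker that registered no keys and two remote workers that both
cover 192.168.5.12, successive pick() calls for that host alternate between
the two remote workers, while a host nobody registered for goes to the local
worker.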

I find the idea of the remote worker using hostgroups to volunteer for
checks appealing because of its simplicity, but might it not be fragile?
Assume the members of the hostgroup must be checked using a remote
worker because of network configuration. If someone removes a host from
the hostgroup, it will cease to be checked. If someone deletes (or
renames) the hostgroup, none of the hosts will be checked. If someone
adds a host to the hostgroup that the remote worker cannot check, it
will never be checked. You might spend a lot of time trying to figure
out why your hosts/services aren't being checked in one of these cases.
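
One way to make those failure cases visible is a configuration audit that
runs whenever the object configuration or worker registrations change. The
Python sketch below is a hypothetical illustration only: the audit function,
its arguments, and the idea of a list of hosts that must be checked remotely
are assumptions, not part of any existing Nagios interface.

```python
def audit(hostgroups, worker_groups, remote_only_hosts):
    """Report hostgroup problems that would silently stop remote checks.

    hostgroups:        dict of hostgroup name -> set of member host names
    worker_groups:     dict of worker name -> list of hostgroups it volunteers for
    remote_only_hosts: set of hosts that only a remote worker can reach
    """
    problems = []
    covered = set()
    for worker, groups in sorted(worker_groups.items()):
        for group in groups:
            if group not in hostgroups:
                # the hostgroup was deleted or renamed after the worker was configured
                problems.append(f"worker {worker}: hostgroup '{group}' does not exist")
            else:
                covered |= hostgroups[group]
    for host in sorted(remote_only_hosts - covered):
        # the host was removed from (or never added to) a volunteered hostgroup
        problems.append(f"host {host}: no remote worker volunteers for it")
    return problems
```

This catches the removed-host and renamed-hostgroup cases. It cannot detect a
host added to a hostgroup that the worker is unable to reach, since that
depends on network topology the configuration does not describe.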
>> Using the second criterion of host to determine which worker gets the
>> check raises the question of the order of precedence for the criteria.
>> Initially, I think the host should have precedence over plugin, but I
>> can see implementing an order of precedence option in the core
>> configuration file. This would be more important if additional worker
>> selection criteria were added.
>>
> Object over check type, any day. We may have to add a "check_type" thing
> to command objects though, so workers can register for only local checks
> and still have their http checks and whatnot done from remote, where
> they make more sense. This requires some thinking.
Good though
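
The precedence question discussed above (host or object first, check type
second, with the order possibly configurable) can be sketched in Python as
follows. This is a rough illustration under assumptions: select_workers, the
dictionary layout of a worker registration, and the "precedence" parameter
are hypothetical, and "check_type" is simply the name floated in the quoted
text.

```python
def select_workers(workers, check, precedence=("host", "check_type")):
    """Return names of workers matching the highest-precedence criterion.

    workers:    list of dicts, e.g. {"name": "w1", "host": ["router1"]}
    check:      dict with one value per criterion, e.g.
                {"host": "router1", "check_type": "http"}
    precedence: criteria in the order they should be tried
    """
    for criterion in precedence:
        wanted = check[criterion]
        matched = [w["name"] for w in workers if wanted in w.get(criterion, ())]
        if matched:
            return matched
    # nobody registered for this check: fall back to workers with no criteria
    return [w["name"] for w in workers if not any(w.get(c) for c in precedence)]
```

Passing precedence=("check_type", "host") reverses the order, which is the
kind of switch an order-of-precedence option in the core configuration file
would control.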

...[email truncated]...


This post was automatically imported from historical nagios-devel mailing list archives
Original poster: estanley@nagios.com