In terms of reloading/restarting Nagios to pick up new configuration,
this is a common problem which can be solved with an event broker that
writes directly into Nagios memory space. The timing of this has to be
done carefully so as to avoid any problems, but it is eminently
doable. I know this can be successfully accomplished as I was hired as
a consultant for a large university client in Upstate New York, where
I gave them the ability to reconfigure their Nagios instance on the
fly without having to restart Nagios. It updates the CGI screens, the
notifications, everything.
Zero downtime on a reconfigure.
Daniel.
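
To illustrate the point about timing, here is a minimal sketch of applying configuration changes only at a safe point between check cycles. It is not the event broker module described above, whose internals are not given in this message, and it does not use the Nagios API; the `Scheduler` class and its method names are hypothetical and exist only for illustration.

```python
import queue


class Scheduler:
    """Toy check scheduler (hypothetical, not Nagios code)."""

    def __init__(self, hosts):
        self.hosts = set(hosts)
        self.pending = queue.Queue()  # thread-safe queue of requested changes
        self.checked = []             # record of which hosts were checked

    def request_add(self, host):
        # Only queues the change; the live host set is untouched for now.
        self.pending.put(("add", host))

    def request_remove(self, host):
        self.pending.put(("remove", host))

    def _apply_pending(self):
        # Drain every queued change into the live host set.
        while True:
            try:
                op, host = self.pending.get_nowait()
            except queue.Empty:
                return
            if op == "add":
                self.hosts.add(host)
            else:
                self.hosts.discard(host)

    def run_cycle(self):
        # Safe point: no check is running, so the host set may change here.
        self._apply_pending()
        for host in sorted(self.hosts):
            self.checked.append(host)  # stand-in for running a real check
```

Changes requested while a cycle is in progress are only queued, so the objects being checked never change underneath the running checks, and the scheduler itself is never restarted.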
On May 6, 2009, at 1:47 PM, Andreas Ericsson wrote:
> Mathieu Gagné wrote:
>> Hi Ethan,
>>
>> First, thank you very much for Nagios.
>>
>> Our enterprise relies heavily on it and Nagios has been a great
>> monitoring tool for us for many years. Up to now, nothing has
>> surpassed its simplicity of use and we will continue to use it for the
>> foreseeable future.
>>
>> On 5/6/09 11:56 AM, Ethan Galstad wrote:
>>> 4. Big things are coming around the bend for Nagios. Big things take
>>> time. Be patient for a bit longer and you'll see the results.
>>
>> As an enterprise looking to scale Nagios to tens of thousands of
>> monitored hosts and services, what can we expect in the future
>> regarding scalability?
>>
>
> I should think some sort of event-transport module integrating tightly
> with the user interface will handle this. Fortunately, we're working on
> exactly such a solution. The event-transport module is reasonably stable
> and the GUI is well under way. Check out www.op5.org, and particularly
> Merlin and Ninja (Merlin will be merged with reports-module and
> reports-gui will be merged into Ninja in the near future).
>
>> We are using NDOutils to centralize host/service status.
>>
>> One of our main challenges will be to optimize the configuration and
>> patch Nagios/NDOutils to make reloads as fast as possible, since
>> monitored hosts are added and removed at a high turnover rate. (I don't
>> know if it's the correct way to say it in English)
>>
>
> Merlin doesn't have this problem, as it works differently with its
> database.
>
>> Reloading Nagios so it can pick up the new configuration is viewed as a
>> "flaw" by our development team because no monitoring is done during
>> that time.
>>
>
> Well, restarting or just reloading the configuration doesn't really make
> a difference to what kind of monitoring is happening during the reload.
> Even if Nagios were to reload the configuration without requiring a
> restart, no network monitoring would happen during the reloading.
>
>> If we reload Nagios too often, it would simply spend the majority of its
>> time exporting configuration/status to NDOutils and scheduling checks
>> without doing any real work at all. Too seldom, and newly added checks
>> would wait too long before being scheduled.
>>
>> Any future plan regarding this aspect?
>>
>
> Well, I've experimented a little bit. It seems to be several orders of
> magnitude faster to do the configuration parsing in two passes: one to
> find out how many objects there are of each type and sort them into a
> two-dimensional table, and then one doing binary searches on that table,
> as opposed to creating fixed-size hash tables and pre-inserting objects
> into them. This is especially true for huge configurations, and appears
> to be caused by far more beneficial memory access patterns and the
> ability to parse most objects only a single time, since we know that
> all hosts have been parsed by the time services are parsed, for example.
>
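
The two-pass idea above can be sketched roughly as follows. This is only an illustration under stated assumptions, not the actual Nagios parser: the one-line `type name [host]` config format, the `CONFIG` sample, and the `parse_two_pass` function are all hypothetical. Pass one counts objects per type and builds a sorted host table; pass two resolves each service's host with a binary search, relying on every host already being known.

```python
from bisect import bisect_left

# Hypothetical, simplified config: "host <name>" or "service <name> <host>".
CONFIG = """\
host web01
service http web01
host db01
service mysql db01
service ssh web01
"""


def parse_two_pass(text):
    records = [line.split() for line in text.splitlines() if line.strip()]

    # Pass 1: count objects of each type and build a sorted table of hosts.
    counts = {}
    for rec in records:
        counts[rec[0]] = counts.get(rec[0], 0) + 1
    hosts = sorted(rec[1] for rec in records if rec[0] == "host")

    def find_host(name):
        # Binary search on the sorted host table; -1 if the host is unknown.
        i = bisect_left(hosts, name)
        return i if i < len(hosts) and hosts[i] == name else -1

    # Pass 2: all hosts are known, so each service is resolved in one visit.
    services = []
    for rec in records:
        if rec[0] == "service":
            idx = find_host(rec[2])
            if idx < 0:
                raise ValueError(f"service {rec[1]!r} references unknown host {rec[2]!r}")
            services.append((rec[1], idx))
    return counts, hosts, services
```

Because the tables are sized and sorted before any cross-reference is resolved, a service defined before its host in the file (such as `ssh` on `web01` here, or any other ordering) never needs to be revisited.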
>> Also, have you ever heard of DNX? http://dnx.sourceforge.net/
>> Any future plan about a similar feature within Nagios?
>>
>
> DNX is an event-broker module. The Nagios core has been modified to
> accommodate modules of that kind, but the actual functionality is of
> the kind that the event-broker API was designed for, so it's not likely
>
...[email truncated]...
This post was automatically imported from historical nagios-devel mailing list archives
Original poster: [email protected]