Re: [Nagios-devel] Nagios 3 Performance Monitoring

Hendrik Bäcker wrote:
> Hi List,
>
> ### Now the complete Mail ###
>
> For a few days I have been testing some performance issues with Nagios 3
> (current CVS version).
>
> For nicer graphing I've written a quick and dirty Perl script to parse
> some relevant data from the output of the nagiostats binary.
>
> Output of the plugin is:
>
> 1. STDOUT: OK - output | perfdata
> 2. (optional) Output + performance data printed directly to the external
> command pipe of Nagios.
>
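
For readers following along, the two output paths described above could be sketched roughly as below. This is only an illustration in C, not the Perl script from the mail: the function names, host name and service description are made up, and it assumes the optional path submits a passive service check via the PROCESS_SERVICE_CHECK_RESULT external command.

```c
#include <stdio.h>
#include <time.h>

/* Build the plugin's STDOUT line: "<STATE> - <text> | <perfdata>".
 * Hypothetical helper, not taken from the original script. */
static int format_plugin_output(char *buf, size_t len, const char *state,
                                const char *text, const char *perfdata)
{
    return snprintf(buf, len, "%s - %s | %s", state, text, perfdata);
}

/* Build one line for the external command pipe, assuming the result is
 * submitted as a passive service check. host and service are placeholders
 * chosen by the caller. */
static int format_passive_result(char *buf, size_t len, time_t now,
                                 const char *host, const char *service,
                                 int return_code, const char *plugin_output)
{
    return snprintf(buf, len,
                    "[%ld] PROCESS_SERVICE_CHECK_RESULT;%s;%s;%d;%s\n",
                    (long)now, host, service, return_code, plugin_output);
}
```

The first string would go to STDOUT; the second would be written to the command pipe of the instance being measured, whose path depends on the local configuration.
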
> I am running a relatively large installation with up to 5 instances (for
> load balancing) on one hardware server (yes, that works).
>
> Some background data:
>
> Instance 1: 371 / 2156 (Hosts/Services)
> Instance 2: 206 / 1405 (Hosts/Services)
> Instance 3: 381 / 3147 (Hosts/Services)
> Instance 4: 3 / 54 (Hosts/Services)
> Instance 5: 299 / 3233 (Hosts/Services)
>
> I have enabled the "use_large_installation_tweaks" feature for all
> instances and was really happy to see that I have _no_ latency at all.
>
> But after 7-9 hours of running time I see that the host/service check
> throughput goes down, the host/service check execution time goes up (x2.5)
> and latency comes up too.
>

Are you using embedded Perl? If so, turn that off.


> Once the latency starts, the graph seems to climb without end. It went
> up to 700 seconds for my fifth instance, and I guess it would have kept
> increasing if I hadn't restarted the Nagios process.
>

Run it for just one instance. If you're debugging something, it doesn't
make sense to run it on a resource-starved system.


>
> I guess the 'performance trouble' is a 'during runtime' problem, so I am
> looking for tasks in the code that blow up over time. My current guess is
> update_check_stats() in base/utils.c, which is executed on every service
> check and more than once for every host check, I think.
>
> My idea is that after a while the data structure for the stats reaches a
> size that takes too much time to update, and therefore the execution
> time increases.

In C, a struct has a fixed size at compile time, so it's a bit unclear
what you mean by this.
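
To make that concrete, here is a minimal sketch of a fixed-size statistics structure. It is not the actual update_check_stats() code; the struct, the BUCKETS value and the function names are all invented for illustration. It only shows that per-minute counters kept in a fixed array cost a bounded amount of work per update, however long the process has been running.

```c
#include <string.h>
#include <time.h>

#define BUCKETS 15  /* hypothetical window: one counter per minute */

struct check_stats {
    int bucket[BUCKETS];  /* fixed-size ring of per-minute counters */
    time_t last_minute;   /* minute index of the newest update seen */
};

/* Record one check. Work is bounded by BUCKETS, never by uptime. */
static void stats_update(struct check_stats *s, time_t check_time)
{
    time_t minute = check_time / 60;
    time_t m;

    if (minute + BUCKETS <= s->last_minute)
        return;  /* older than the whole window: ignore */

    if (minute - s->last_minute >= BUCKETS) {
        /* Long gap: every bucket is stale, clear them all. */
        memset(s->bucket, 0, sizeof s->bucket);
    } else {
        /* Clear only the buckets for the minutes we skipped over. */
        for (m = s->last_minute + 1; m <= minute; m++)
            s->bucket[m % BUCKETS] = 0;
    }
    if (minute > s->last_minute)
        s->last_minute = minute;
    s->bucket[minute % BUCKETS]++;
}

/* Number of checks recorded in the current window. */
static int stats_total(const struct check_stats *s)
{
    int i, total = 0;
    for (i = 0; i < BUCKETS; i++)
        total += s->bucket[i];
    return total;
}
```

If the stats were instead kept in something that grows, such as a list that is appended to and never pruned, the update cost could rise over time, and that would be the thing to look for.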

--
Andreas Ericsson [email protected]
OP5 AB www.op5.se
Tel: +46 8-230225 Fax: +46 8-230231





This post was automatically imported from historical nagios-devel mailing list archives
Original poster: [email protected]