[Nagios-devel] [Fwd: Nagios Performance Monitoring]




Hi Ethan,

Please discard the graph PDF I sent you yesterday!

I made a mistake in handling the data in my Perl script.

I executed nagiostats with the MRTG_DATA_VARS in the order 1,2,3,4 and
stored them internally in an unsorted hash... so the order of the requested
data no longer matched the resulting, unsorted list of graph names.
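
The ordering problem described above can be avoided by pairing names and values by position instead of iterating over an unordered container. The following is only a minimal sketch in Python (not the Perl script from this mail); it assumes that nagiostats in MRTG mode prints one value per requested variable, in the requested order, and the function name `pair_values` is made up for illustration.

```python
def pair_values(requested, output):
    """Pair each requested variable name with its value, by position.

    `requested` is the list of variable names in the order they were
    passed to nagiostats; `output` is the raw text it printed
    (assumed: one value per variable, in the same order).
    """
    values = output.split()
    if len(values) != len(requested):
        raise ValueError(
            "expected %d values, got %d" % (len(requested), len(values))
        )
    # zip keeps the request order, so a graph name can never end up
    # attached to the value of a different variable.
    return list(zip(requested, values))
```

Called as `pair_values(["AVGACTHSTLAT", "AVGACTSVCLAT"], raw_output)`, it returns the pairs in exactly the order the variables were requested.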

A few minutes ago I fixed my mistake and deleted all graphs.

Tomorrow I can send you the exact data.

-
Hendrik

-------- Original Message --------
Subject: Nagios Performance Monitoring
Date: Mon, 22 Oct 2007 20:14:28 +0200
From: Hendrik Bäcker
To: Ethan Galstad

Hi Ethan,

*** BEWARE ***
*** TONS OF INFORMATION IN HERE ***


*** OK - you have been warned :) ***

As mentioned in my last e-mail and as we discussed at the conference, I
have made some graphs of the performance.

Pre-Scriptum: If you think the nagios-devel list might be able to help
here too, I will try to write up a condensed version for it.

First, some words about my (scary) setup.

Because of the magic limit of ~2000 service checks in Nagios 2.x, I had
to compile four different Nagios instances, all running on the same
hardware server.

So I have one

/usr/local/nagios/etc/

for common files such as ressource.cfg, checkcommands, misccommands, ...

and four main directory trees like

/usr/local/nagios/_1_/bin/
/usr/local/nagios/_1_/etc/
/usr/local/nagios/_1_/var/
etc.

/usr/local/nagios/_2_/bin/
/usr/local/nagios/_2_/etc/
/usr/local/nagios/_2_/var/
etc.

To be able to tell my instances apart, I renamed the nagios binaries to
nagios-1, nagios-2, nagios-3 and so on.

(Yes - you're right. I have four different Web Interfaces ;) )

I am running a fifth instance to monitor the other four instances, so
my 5th instance has just 3 hosts (Nagios_Master, Nagios_Slave and my
dedicated Internet server) and <100 service checks.

I am not using NSCA or anything similar to feed my failover
server "Nagios_Slave".

Currently I am _not_ using NDOUtils.

However, all of my instances are processing performance data, and only
the services that actually deliver perfdata have the service option
process_perfdata set to "1".

All five instances do nearly the same thing with perfdata:
1. Nagios writes the perfdata to a file
( /usr/local/nagios/_instance_/var/perfdata )
2. Nagios moves that file every 30 seconds to a shared place
( /usr/local/nagios/var/perfspool/ )
3. That is the end of perfdata handling for the Nagios daemon.
4. A standalone C daemon picks up the files from the spool directory and
feeds them to a Perl script that creates and updates the RRD files (this
shouldn't affect the Nagios processes, I think).

So each Nagios instance just writes to a file handle, moves the inode
every 30 seconds and reopens a new file handle.
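
The write, move and reopen cycle described above can be sketched as follows. This is a Python illustration of the pattern only, not how Nagios implements it; the class name `SpooledWriter`, the naming scheme for spooled files and the injectable `clock` are assumptions made for the example.

```python
import os
import time


class SpooledWriter:
    """Append lines to a file and periodically move it into a spool dir."""

    def __init__(self, path, spool_dir, interval=30.0, clock=time.monotonic):
        self.path = path
        self.spool_dir = spool_dir
        self.interval = interval      # seconds between moves
        self.clock = clock            # injectable so it can be tested
        self.seq = 0
        self.handle = open(path, "a")
        self.last_rotation = clock()

    def write(self, line):
        self.handle.write(line + "\n")
        if self.clock() - self.last_rotation >= self.interval:
            self.rotate()

    def rotate(self):
        # Close first so buffered data is flushed, then rename. On the
        # same filesystem a rename only relinks the inode; no data is
        # copied, which keeps the cost for the writer small.
        self.handle.close()
        self.seq += 1
        name = "%s.%d.%d" % (os.path.basename(self.path), os.getpid(), self.seq)
        target = os.path.join(self.spool_dir, name)
        os.rename(self.path, target)
        self.handle = open(self.path, "a")
        self.last_rotation = self.clock()
        return target
```

The cheap rename only holds if the perfdata file and the spool directory are on the same filesystem, as they appear to be in the layout above; across filesystems a move turns into a copy.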

Today I wrote a small Perl script that runs every 60 seconds,
calls each of the nagiostats binaries and extracts some data for charting.
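
For illustration, polling several instances could look roughly like the sketch below. It is written in Python rather than Perl and is not the script from this mail. The location of nagiostats and nagios.cfg inside each instance directory and the choice of variables are assumptions; `--config`, `--mrtg` and `--data` are the nagiostats options for MRTG-style output.

```python
import subprocess

# Assumed layout, modelled on the instance directories described above;
# where nagiostats and nagios.cfg really live per instance is a guess.
BASE = "/usr/local/nagios"
VARS = ["AVGACTHSTLAT", "AVGACTSVCLAT"]  # illustrative choice of variables


def nagiostats_command(instance, variables=VARS, base=BASE):
    """Build the argv for one instance's nagiostats call in MRTG mode."""
    root = "%s/_%s_" % (base, instance)
    return [
        root + "/bin/nagiostats",
        "--config=" + root + "/etc/nagios.cfg",
        "--mrtg",
        "--data=" + ",".join(variables),
    ]


def poll(instances, run=subprocess.check_output):
    """Run nagiostats once per instance; return {instance: [values]}."""
    results = {}
    for instance in instances:
        output = run(nagiostats_command(instance), text=True)
        # One value per requested variable, kept in request order.
        results[instance] = output.split()
    return results
```

Something like cron would then call `poll([1, 2, 3, 4, 5])` once a minute and hand the values to the charting step.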

See the attached PDF.

I think only the values since 16:40 are interesting (the last time I
restarted all 5 processes).

What you see there is:

Graph Title:

Nagios_X_Performance_AHCL is the "Hostcheck Last" value from the nagiostats
output.

Nagios_X_Performance_SHCL is the "Servicecheck Last" value from the
nagiostats output.

At the end of pages 4 and 5 you can see the average service check latency
for all the instances.

The data are close to the truth, because the 4th instance is my "Nagios"
instance with only the 3 hosts and almost no latency.

Interesting:
Please have a look at the left side of the graph "check_latency_5".
There you can see the large latency (max. 992 seconds) from Friday until
today.
The latency just keeps climbing without end, as you can also see at the
current time.

Some information about the host/service count:

Instance 1:
Total Hosts: 371
Total Services: 2156

Instance 2:
Total Hosts: 206
Total Services: 1405

Instance 3:
Total Hosts: 381
Total Services: 3144

Instance 4:
Total Hosts: 3
Total Servic

...[email truncated]...


This post was automatically imported from historical nagios-devel mailing list archives
Original poster: [email protected]