Federated Monitoring with Graphing

Posted: Sun Aug 31, 2014 7:17 am
by mikew
I am setting up a distributed monitoring system with several hundred Nagios Core servers reporting back to a central Nagios XI server. This is a POC for an organization with many remote sites.

One of the requirements is that graphing data be transferred from the remote sites to the Nagios XI server. At this point the remote sites run Nagios Core 4.0.8 with nagiosgraph. Each of these sites will obsess over services, so I have modified the service_check command to include perfdata, which is being transferred to the Nagios XI server. I can see the perfdata stats in the service check results.

Problem:
Graphs are not building on Nagios XI even though the perfdata files are transferred and the XML and RRD files are being built. The problem is probably a mismatch between the nagiosgraph perfdata format and the Highcharts graphing used on Nagios XI.

Question:
Is there a way to convert that data so it will build perfdata charts on Nagios XI or is there a graphing solution I can install on the Nagios Core 4 that will work with Nagios XI?

Re: Federated Monitoring with Graphing

Posted: Tue Sep 02, 2014 1:53 pm
by abrist
If you push the full check results, they should include the perfdata which will be processed by the central server as any old check with perfdata. Do you have a sample of the outbound check perfdata that is not graphing?

Re: Federated Monitoring with Graphing

Posted: Tue Sep 02, 2014 2:30 pm
by mikew
The checks are coming in (they are listed in /var/log/messages) and seem to have perfdata:

Here are two examples from the XI server:
HTTP OK: HTTP/1.0 200 OK - 19564 bytes in 0.092 second response time time=0.091918s:::0.000000 size=19564B:::0

PING OK - Packet loss = 0%, RTA = 21.08 ms rta=21.083000ms:200.000000:600.000000:0.000000 pl=0%:20:60:0


Sep 2 15:27:29 wxi xinetd[11298]: START: nsca pid=32715 from=::ffff:107.170.214.33
Sep 2 15:27:30 wxi xinetd[11298]: EXIT: nsca status=0 pid=32715 duration=1(sec)

Re: Federated Monitoring with Graphing

Posted: Tue Sep 02, 2014 4:23 pm
by sreinhardt
There is no pipe separating the standard response from the perfdata. I believe they should look like this:

Yours:
HTTP OK: HTTP/1.0 200 OK - 19564 bytes in 0.092 second response time time=0.091918s:::0.000000 size=19564B:::0

PING OK - Packet loss = 0%, RTA = 21.08 ms rta=21.083000ms:200.000000:600.000000:0.000000 pl=0%:20:60:0

What nagios expects:
HTTP OK: HTTP/1.0 200 OK - 19564 bytes in 0.092 second response time | time=0.091918s:::0.000000 size=19564B:::0

PING OK - Packet loss = 0%, RTA = 21.08 ms | rta=21.083000ms:200.000000:600.000000:0.000000 pl=0%:20:60:0

Without the pipe, the perfdata would definitely show up in the short and long output, but it would not be processed properly for PNP/Nagios to handle.
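
To illustrate the difference, here is a minimal Python sketch of how a single line of plugin output divides at the pipe. The function name split_plugin_output is made up for this example and is not part of Nagios; it only mimics the rule that text after the first "|" on the line is treated as perfdata.

```python
def split_plugin_output(line):
    """Split one line of plugin output into (status_text, perfdata)."""
    # Everything after the first "|" is taken as performance data;
    # if there is no pipe, the whole line is plain status text.
    text, _sep, perfdata = line.partition("|")
    return text.strip(), perfdata.strip()


# The two PING lines from this thread, without and with the pipe.
without_pipe = ("PING OK - Packet loss = 0%, RTA = 21.08 ms "
                "rta=21.083000ms:200.000000:600.000000:0.000000 pl=0%:20:60:0")
with_pipe = ("PING OK - Packet loss = 0%, RTA = 21.08 ms | "
             "rta=21.083000ms:200.000000:600.000000:0.000000 pl=0%:20:60:0")

print(split_plugin_output(without_pipe))
print(split_plugin_output(with_pipe))
```

For the line without a pipe the second element comes back empty, which matches what is being seen: the numbers are part of the status text and nothing is left to graph.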

Re: Federated Monitoring with Graphing

Posted: Tue Sep 02, 2014 6:36 pm
by mikew
Yep, I saw that too and was wondering. Thanks for the input, as that makes sense. I added a macro, $PERFDATA$, to the script that is doing the NSCA sending, and I bet that is messing it up. I will check and let you know.
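
For reference, a rough sketch of the line a sending script could hand to send_nsca. It assumes the script receives the status text and the perfdata as two separate values (for example from the standard $SERVICEOUTPUT$ and $SERVICEPERFDATA$ macros). The function name build_nsca_line and the host and service names are hypothetical. The tab-separated host, service, return code, output layout is the usual send_nsca input format for service checks.

```python
def build_nsca_line(host, service, return_code, output, perfdata=""):
    """Build one tab-separated service check line for send_nsca."""
    message = output.strip()
    perfdata = perfdata.strip()
    if perfdata:
        # Join status text and perfdata with a pipe so the central
        # server can tell them apart.
        message = f"{message} | {perfdata}"
    return f"{host}\t{service}\t{return_code}\t{message}\n"


# Hypothetical host and service names; output values taken from this thread.
line = build_nsca_line(
    "remote-site-01",
    "PING",
    0,
    "PING OK - Packet loss = 0%, RTA = 21.08 ms",
    "rta=21.083000ms:200.000000:600.000000:0.000000 pl=0%:20:60:0",
)
print(repr(line))
```

If the perfdata value is empty, the line is sent with the status text only, so the pipe is never emitted on its own.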

Re: Federated Monitoring with Graphing

Posted: Tue Sep 02, 2014 8:00 pm
by mikew
So I removed the perfdata macro from the service_check and now I get no perfdata at all:

HTTP OK: HTTP/1.0 200 OK - 19516 bytes in 0.094 second response time

PING OK - Packet loss = 0%, RTA = 21.14 ms

The graphs are building fine on the Nagios 4.x server with nagiosgraph.

Re: Federated Monitoring with Graphing

Posted: Wed Sep 03, 2014 10:52 am
by lmiltchev
Mike,
Can you show us the actual command that you are running from the command line on the remote box, along with its output? Also, show us the command that you are using to send the check results to the central server.