Capture Performance Data

rajasegar
Posts: 1018
Joined: Sun Mar 30, 2014 10:49 pm

Capture Performance Data

Post by rajasegar »

NagiosXI 2014R1.2
RHEL 6.5
Mod gearman

If I wanted to capture all the performance data of hosts and services as it is generated and pipe it to a text file, what would be the best way to go about this?
I just don't want to mess around with reading and exporting RRD files.

Thanks
5 x Nagios 5.6.9 Enterprise Edition
RHEL 6 & 7
rrdcached & ramdisk optimisation
sreinhardt
-fno-stack-protector
Posts: 4366
Joined: Mon Nov 19, 2012 12:10 pm

Re: Capture Performance Data

Post by sreinhardt »

We do not do this or recommend it in any way. Unless you have an absolute need that RRDs cannot meet, we really don't want you to make this change. There are one or two places where you could alter a command before the data is written to the RRD, and that might let you capture it. However, and this is a huge caveat, doing so will cause massive I/O load and disk usage on your system, and the results are completely untested both in normal operation and during upgrades. We are talking at least 2-4 times the size of your RRDs in storage, since none of this data is compressed or averaged, and it will likely be stored as text, which is much less space-efficient. What is the issue with rrdtool? It provides some great options for extracting data and works extremely well in scripts.
Nagios-Plugins maintainer exclusively, unless you have other C language bugs with open-source nagios projects, then I am happy to help! Please pm or use other communication to alert me to issues as I no longer track the forum.
rajasegar
Posts: 1018
Joined: Sun Mar 30, 2014 10:49 pm

Re: Capture Performance Data

Post by rajasegar »

I have almost 10,000 RRD files. Won't processing them have the same I/O impact?
Not to mention the hassle of processing them in the first place.

Please advise where to modify the commands. I/O and storage can be handled easily.
We will try this in our dev environment to see whether it is worthwhile.

Does anyone have any routines to output RRD to CSV text data?
abrist
Red Shirt
Posts: 8334
Joined: Thu Nov 15, 2012 1:20 pm

Re: Capture Performance Data

Post by abrist »

At 10k RRDs, with the standard retention and steps, you will use a very large amount of disk space. About 1/5 of the RRD size is used for "yesterday". In plain text, that single day of data takes about 580 KB per check on average. So at 10k checks, we can assume a minimum of 5 GB of disk usage per day.
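As a rough check of that estimate, the arithmetic can be written out as a small Python sketch. It assumes "580k" means 580 kilobytes per check per day and uses decimal units (1 GB = 1,000,000 KB); the function name is only illustrative.

```python
def daily_plaintext_gb(kb_per_check_per_day, checks):
    """Estimated plain-text volume per day, in decimal gigabytes."""
    return kb_per_check_per_day * checks / 1e6  # 1 GB = 1,000,000 KB


# Figures from the post: about 580 KB per check per day, 10,000 checks.
print(daily_plaintext_gb(580, 10000))
```

Under those assumptions the result is about 5.8 GB per day, which is in line with the 5 GB minimum quoted above.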
Former Nagios employee
"It is turtles. All. The. Way. Down. . . .and maybe an elephant or two."
VI VI VI - The editor of the Beast!
Come to the Dark Side.
abrist
Red Shirt
Posts: 8334
Joined: Thu Nov 15, 2012 1:20 pm

Re: Capture Performance Data

Post by abrist »

rajasegar wrote: Does anyone have any routines to output RRD to CSV text data?
Take a look at rrd2csv:
https://code.google.com/p/rrd2csv/
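If a separate tool is not an option, rrdtool itself can print an RRD's samples as text with `rrdtool fetch`, and a few lines of Python can turn that into CSV. The sketch below assumes Python 3 and that `rrdtool` is on the PATH; the function names, the one-day window, and the choice of the AVERAGE consolidation function are illustrative assumptions, not anything that ships with XI.

```python
import csv
import subprocess
import sys


def fetch_text(rrd_path, cf="AVERAGE", start="-1d"):
    """Run `rrdtool fetch` and return its raw text output."""
    return subprocess.check_output(
        ["rrdtool", "fetch", rrd_path, cf, "--start", start],
        universal_newlines=True,
    )


def fetch_to_rows(text):
    """Convert `rrdtool fetch` output into a header row plus data rows."""
    lines = [ln for ln in text.splitlines() if ln.strip()]
    if not lines:
        return []
    # The first non-blank line lists the data source names.
    rows = [["timestamp"] + lines[0].split()]
    for ln in lines[1:]:
        ts, _, values = ln.partition(":")
        # Unknown samples are printed as nan or -nan; float() accepts both.
        rows.append([int(ts)] + [float(v) for v in values.split()])
    return rows


def rrd_to_csv(rrd_path, out=sys.stdout):
    """Write one RRD's fetched samples to `out` as CSV."""
    csv.writer(out).writerows(fetch_to_rows(fetch_text(rrd_path)))
```

Calling `rrd_to_csv()` once per file in a loop would cover all of the RRDs, at the cost of one `rrdtool` process per file.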
rajasegar
Posts: 1018
Joined: Sun Mar 30, 2014 10:49 pm

Re: Capture Performance Data

Post by rajasegar »

abrist wrote: At 10k RRDs, with the standard retention and steps, you will use a very large amount of disk space. About 1/5 of the RRD size is used for "yesterday". In plain text, that single day of data takes about 580 KB per check on average. So at 10k checks, we can assume a minimum of 5 GB of disk usage per day.
Storage is not an issue. I/O is also not an issue. We have a separate RAID 10 group to use for this.
Please advise how to go about this. Thanks.

At the same time we will explore the RRD-to-CSV route as a backup option.
abrist
Red Shirt
Posts: 8334
Joined: Thu Nov 15, 2012 1:20 pm

Re: Capture Performance Data

Post by abrist »

You will need to either add some commands (for example, duplicating the move as a copy to an additional folder not watched by the reaper) or create a script that performs the original move as well as whatever commands you need to handle the data. The commands you will need to alter are process-host-perfdata-file-bulk and process-service-perfdata-file-bulk. The default commands are shown below for reference; this is the point in the check-result processing where you will need to implement your changes.

define command {
       command_name                             process-host-perfdata-file-bulk
       command_line                             /bin/mv /usr/local/nagios/var/host-perfdata /usr/local/nagios/var/spool/xidpe/$TIMET$.perfdata.host
}
define command {
       command_name                             process-service-perfdata-file-bulk
       command_line                             /bin/mv /usr/local/nagios/var/service-perfdata /usr/local/nagios/var/spool/xidpe/$TIMET$.perfdata.service
}
The format of those files resembles:

DATATYPE::SERVICEPERFDATA       TIMET::1412345626       HOSTNAME::localhost     SERVICEDESC::Current Users      SERVICEPERFDATA::users=2;20;50;0
        SERVICECHECKCOMMAND::check_local_users!20!50    HOSTSTATE::UP   HOSTSTATETYPE::HARD     SERVICESTATE::OK        SERVICESTATETYPE::HARD
SERVICEOUTPUT::USERS OK - 2 users currently logged in
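
A record in that format can be split into fields with a few lines of code. This is a minimal Python 3 sketch, assuming each record is a single line of tab-separated KEY::value pairs (the sample above is wrapped for display) and that host records carry a HOSTPERFDATA field in place of SERVICEPERFDATA; the function names and the CSV column choice are illustrative, not part of XI.

```python
import csv
import shlex
import sys


def parse_record(line):
    """Split one tab-separated perfdata record into a {KEY: value} dict."""
    record = {}
    for field in line.strip().split("\t"):
        key, sep, value = field.strip().partition("::")
        if sep:  # ignore anything that is not KEY::value
            record[key] = value
    return record


def parse_metrics(perfdata):
    """Turn 'label=value;warn;crit;min;max ...' into (label, value) pairs."""
    try:
        tokens = shlex.split(perfdata)  # honours 'quoted labels'
    except ValueError:
        tokens = perfdata.split()  # unbalanced quote: fall back to whitespace
    pairs = []
    for token in tokens:
        label, sep, rest = token.rpartition("=")
        if sep:
            pairs.append((label, rest.split(";")[0]))
    return pairs


def record_to_rows(record):
    """One CSV row per metric: time, host, service, label, value."""
    # HOSTPERFDATA is an assumed key name for host records.
    perf = record.get("SERVICEPERFDATA") or record.get("HOSTPERFDATA") or ""
    return [
        [record.get("TIMET", ""), record.get("HOSTNAME", ""),
         record.get("SERVICEDESC", ""), label, value]
        for label, value in parse_metrics(perf)
    ]


def perfdata_to_csv(lines, out=sys.stdout):
    """Write CSV rows for every record in an iterable of lines."""
    writer = csv.writer(out)
    for line in lines:
        writer.writerows(record_to_rows(parse_record(line)))
```

Values are kept as strings so that any unit suffix stays attached; convert them later if you need numbers.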
Do understand that if your drive fills up because of this, or if you break performance data graphing with a bad command, you will technically be running an unsupported change, and support will most likely just revert the commands. It is your responsibility to make sure your changes do not interfere with the perfdata spooler in XI. This will increase I/O on the XI server, so be prepared. Also, if you just add a copy command, understand that within about 3 hours you will have so many files in the target directory that the folder will be effectively un-stat()able. You will therefore need another script running from cron to process those files as they are copied.
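
One way to follow that advice is a small wrapper that keeps the original move intact and copies the file aside first, plus a cron job that drains the copies into a single file. The following is a sketch only, assuming Python 3 is available on the XI server; the function names, the archive directory, the command-line arguments, and the idea of appending everything to one output file are hypothetical choices, and none of this has been tested against XI.

```python
import os
import shutil
import sys
import time


def copy_then_move(src, spool_dir, archive_dir, suffix, timet=None):
    """Copy src into archive_dir, then do the stock move into spool_dir."""
    timet = int(time.time()) if timet is None else timet
    name = "%d.%s" % (timet, suffix)  # mirrors $TIMET$.perfdata.host/service
    try:
        os.makedirs(archive_dir, exist_ok=True)
        tmp = os.path.join(archive_dir, "." + name + ".tmp")
        shutil.copy2(src, tmp)
        # Rename last so the cron job never sees a half-written copy.
        os.replace(tmp, os.path.join(archive_dir, name))
    except OSError as exc:
        # A failed copy must not block the move that feeds XI's graphs.
        sys.stderr.write("perfdata copy failed: %s\n" % exc)
    dest = os.path.join(spool_dir, name)
    shutil.move(src, dest)
    return dest


def drain_archive(archive_dir, out_path):
    """Append every archived file to out_path, then delete it (run from cron).

    out_path must live outside archive_dir.
    """
    names = sorted(n for n in os.listdir(archive_dir) if not n.startswith("."))
    with open(out_path, "a") as out:
        for name in names:
            path = os.path.join(archive_dir, name)
            with open(path) as fh:
                shutil.copyfileobj(fh, out)
            os.remove(path)
    return len(names)


# Hypothetical CLI: wrapper.py <src> <spool_dir> <archive_dir> <suffix>
if __name__ == "__main__" and len(sys.argv) == 5:
    copy_then_move(*sys.argv[1:5])
```

The command_line of each bulk command would then call the wrapper with the same source file and spool directory as the stock /bin/mv, and a cron entry would call `drain_archive` every few minutes so the archive directory never accumulates many files. Try it in a dev environment first.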
rajasegar
Posts: 1018
Joined: Sun Mar 30, 2014 10:49 pm

Re: Capture Performance Data

Post by rajasegar »

Your comments are noted.
Thanks
tmcdonald
Posts: 9117
Joined: Mon Sep 23, 2013 8:40 am

Re: Capture Performance Data

Post by tmcdonald »

Closing thread.
Former Nagios employee