Nagios XI 2014R1.2
RHEL 6.5
Mod-Gearman
If I wanted to capture all the performance data of hosts and services as it is generated and pipe it to a text file, what would be the best way to go about this?
I just don't want to mess around with reading and exporting RRD files.
Thanks
Capture Performance Data
5 x Nagios 5.6.9 Enterprise Edition
RHEL 6 & 7
rrdcached & ramdisk optimisation
sreinhardt
Re: Capture Performance Data
We do not do this or suggest it in any way. This is one of the few cases where, unless you have an absolute need and RRDs cannot do what you are looking for, we really don't want you to make this change. There are one or two places where you could alter a command before the data is put into the RRD, and that might let you capture it. However, and this is a huge one, it will cause massive I/O load and data usage on your system, and the results are completely untested both in standard running and during upgrades. We are talking at least 2-4 times the size of your RRDs in storage, as none of this data is compressed or averaged, and it will likely be stored in a text format, making it much less optimized.
What is the issue with rrdtool? It provides some great options for extracting data and works extremely well in scripts.
Nagios-Plugins maintainer exclusively, unless you have other C language bugs with open-source nagios projects, then I am happy to help! Please pm or use other communication to alert me to issues as I no longer track the forum.
Re: Capture Performance Data
I have almost 10,000 RRD files. Won't processing them have the same I/O impact? Not to mention the hassle of processing them in the first place.
Please advise where to modify the commands. I/O issues and storage can be easily handled.
We will explore this to see if this is worthwhile in Dev env.
Does anyone have any routines to output RRD to CSV text data?
Re: Capture Performance Data
At 10k RRDs, with the standard retention and steps, you will use a very large amount of data. One fifth of the RRD size is used for "yesterday". In plain text, this single day of data uses 580k per check on average. So at 10k checks, we could assume a minimum of 5 GB of disk usage per day.
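As a rough check of that estimate, here is a minimal Python sketch of the arithmetic. It assumes "580k" means 580 kB in decimal units and that every check produces the average amount; the function name is made up for illustration.

```python
def estimate_daily_gb(checks, kb_per_check_per_day):
    """Estimate plain-text perfdata volume per day in decimal gigabytes."""
    bytes_per_day = checks * kb_per_check_per_day * 1000  # kB -> bytes
    return bytes_per_day / 1000**3                        # bytes -> GB


# 10k checks at the quoted average of 580 kB per check per day
daily_gb = estimate_daily_gb(10_000, 580)
print(f"{daily_gb:.1f} GB per day")
```

Under those assumptions the figure comes out slightly above the quoted minimum of 5 GB per day.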
Former Nagios employee
"It is turtles. All. The. Way. Down. . . .and maybe an elephant or two."
VI VI VI - The editor of the Beast!
Come to the Dark Side.
Re: Capture Performance Data
rajasegar wrote: Does anyone have any routines to output RRD to CSV text data?
Take a look at rrd2csv:
https://code.google.com/p/rrd2csv/
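If rrd2csv is not an option, the rrdtool route mentioned earlier in the thread can also be scripted. The following is only a sketch, not a tested export routine: it assumes Python 3 and the rrdtool CLI are available, the helper names are hypothetical, and the values returned by `rrdtool fetch` are consolidated (for example AVERAGE) rather than raw check results.

```python
import csv
import io
import subprocess


def fetch_rrd_text(rrd_path, start="-1d"):
    """Run `rrdtool fetch <file> AVERAGE --start <start>` and return its text output."""
    result = subprocess.run(
        ["rrdtool", "fetch", rrd_path, "AVERAGE", "--start", start],
        capture_output=True, text=True, check=True,
    )
    return result.stdout


def fetch_output_to_csv(fetch_text):
    """Convert the text printed by `rrdtool fetch` into CSV text.

    The first non-empty line holds the data source names; every following
    line looks like '<timestamp>: <value> <value> ...'.
    """
    lines = [ln for ln in fetch_text.splitlines() if ln.strip()]
    if not lines:
        return ""
    ds_names = lines[0].split()
    out = io.StringIO()
    writer = csv.writer(out, lineterminator="\n")
    writer.writerow(["timestamp"] + ds_names)
    for ln in lines[1:]:
        stamp, sep, values = ln.partition(":")
        if not sep:
            continue  # not a data row
        # Unknown samples are printed as 'nan' or '-nan'; emit an empty cell.
        row = ["" if "nan" in v else v for v in values.split()]
        writer.writerow([stamp.strip()] + row)
    return out.getvalue()
```

A call such as `fetch_output_to_csv(fetch_rrd_text(path))` would then produce one CSV per RRD file, where `path` is wherever your RRDs live on the XI server.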
Re: Capture Performance Data
Storage is not an issue. I/O is also not an issue; we have a different RAID 10 group to use.
Please advise how to go about this. Thanks.
At the same time we will explore the RRD to CSV route as a backup option.
Re: Capture Performance Data
You will need to either add some commands (such as duplicating the move as a copy to an additional folder not watched by the reaper) or create a script that performs the original move as well as whatever you want to do with the data. The commands you will need to alter are process-host-perfdata-file-bulk and process-service-perfdata-file-bulk. The default commands are shown below for reference; you will need to implement your changes at this point in the check-result parsing.
Code:
define command {
    command_name    process-host-perfdata-file-bulk
    command_line    /bin/mv /usr/local/nagios/var/host-perfdata /usr/local/nagios/var/spool/xidpe/$TIMET$.perfdata.host
}
define command {
    command_name    process-service-perfdata-file-bulk
    command_line    /bin/mv /usr/local/nagios/var/service-perfdata /usr/local/nagios/var/spool/xidpe/$TIMET$.perfdata.service
}
The format of those files resembles:
Code:
DATATYPE::SERVICEPERFDATA TIMET::1412345626 HOSTNAME::localhost SERVICEDESC::Current Users SERVICEPERFDATA::users=2;20;50;0
SERVICECHECKCOMMAND::check_local_users!20!50 HOSTSTATE::UP HOSTSTATETYPE::HARD SERVICESTATE::OK SERVICESTATETYPE::HARD
SERVICEOUTPUT::USERS OK - 2 users currently logged in
Do understand that if your drive fills up because of this, or if you break performance data graphing with a bad command, you will technically have an unsupported change and support will most likely just revert the commands. It will be your responsibility to make sure your changes do not interfere with the perfdata spooler in XI. This will increase I/O on the XI server, so be prepared. Also, if you just add a copy command, within 3 hours you will have so many files in the target directory that the folder will be un-stat()able. So you will need another script running from cron to process those files as they get copied.
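To illustrate the "script that performs the original move as well as your commands" option, here is a minimal Python sketch. It is not an XI-supported tool. It assumes Python 3 is available on the server, that each record sits on one line in the `KEY::value` layout shown above, and that host records carry a `HOSTPERFDATA` field; all function names and the CSV layout are hypothetical. Appending to a single CSV avoids the many-small-files problem a plain copy would create.

```python
import csv
import re
import shlex
import shutil

# Split on whitespace that is directly followed by the next 'KEY::' marker,
# so values containing spaces (e.g. 'Current Users') stay intact.
FIELD_SPLIT = re.compile(r"\s+(?=[A-Z]+::)")


def parse_perfdata_line(line):
    """Turn one 'KEY::value KEY::value ...' record into a dict."""
    fields = {}
    for token in FIELD_SPLIT.split(line.strip()):
        key, sep, value = token.partition("::")
        if sep:
            fields[key] = value
    return fields


def metric_rows(fields):
    """Yield [time, host, service, label, value] for each metric in a record."""
    perf = fields.get("SERVICEPERFDATA") or fields.get("HOSTPERFDATA") or ""
    for item in shlex.split(perf):  # shlex keeps quoted labels together
        label, sep, rest = item.rpartition("=")
        if not sep:
            continue  # not a 'label=value' pair
        value = rest.split(";")[0]  # drop warn/crit/min/max thresholds
        yield [fields.get("TIMET", ""), fields.get("HOSTNAME", ""),
               fields.get("SERVICEDESC", ""), label, value]


def archive_then_move(perfdata_file, spool_target, archive_csv):
    """Append all metrics to archive_csv, then do the original move."""
    with open(perfdata_file) as infile, open(archive_csv, "a", newline="") as out:
        writer = csv.writer(out)
        for line in infile:
            if line.strip():
                writer.writerows(metric_rows(parse_perfdata_line(line)))
    # Same effect as the default /bin/mv, so the XI spooler still gets the file.
    shutil.move(perfdata_file, spool_target)
```

A small wrapper script could call `archive_then_move` with the same source and target paths the default `/bin/mv` commands use and be referenced from `command_line`. The warnings above about unsupported changes, extra I/O, and disk usage still apply.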
Re: Capture Performance Data
Noted on your comments.
Thanks