
Historical Performance Data

Posted: Tue Jul 05, 2011 2:32 pm
by dsdonut
I've noticed that some of the built-in (out of the box) checks will keep historical data, and display that data in the form of a graph. CPU stats, mem_info, etc, have this data.

Is this a function of the script itself, or is this a function of Nagios? For certain things (CPU, memory, disk space, and so on) I need to have this data, and I need these graphs. I was unable to find a script in the Nagios library to monitor the usage of individual CPU cores. All of the ones I could find just take an average across all available cores. So, I wrote my own script to check each core individually. This check does not provide a graph showing historical data. Is there a way I can get that?

Also, by default how far back does historical data go?

Re: Historical Performance Data

Posted: Tue Jul 05, 2011 4:58 pm
by nscott
dsdonut,

Nagios XI handles a lot of that logic. You'll simply need to make a command in Nagios XI that does what you want, then make a template for it (use the other templates in /usr/local/nagios/share/pnp/templates as a "template" :) ) and name it the same as the plugin you wrote. Nagios XI should create the graph for you from there.

Also, the default RRD size is 1 year.

Re: Historical Performance Data

Posted: Wed Jul 06, 2011 8:41 am
by dsdonut
I'm a little unsure of what to do.

There are templates in both /usr/local/nagios/share/pnp/templates and /usr/local/nagios/share/pnp/templates.dist

None of those templates are named the same as any of the checks I'm running. (For example, I'm running a script called check_cpu_stats.sh, and it is giving me performance graphs, yet I can't find a template with that name.) The only template listed in any of my Nagios services is xiwizard_nrpe_service. I don't find a template by that name either, so I really don't know what template to copy.

Re: Historical Performance Data

Posted: Wed Jul 06, 2011 10:04 am
by agriffin
Your plugin can provide graphing data in its output after a "|" character. Everything after that character is stripped out by Nagios for status checks and used for performance data. You can find more information in the documentation. Sourceforge.net seems to be down at the moment, which is where much of our online documentation is, but you should have a copy in /usr/local/nagios/share/docs/perfdata.html
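As a rough illustration of the "|" convention described above, here is a minimal sketch of a plugin output line carrying one performance data item per CPU core. The helper names perf_item and plugin_line, the core labels, and the sample values and thresholds are all made up for the example; the item layout 'label'=value[unit];warn;crit;min;max follows the usual Nagios plugin performance data format.

```shell
#!/bin/sh
# Hypothetical helpers (not part of Nagios): build a plugin output line
# with performance data after the "|" character.

# perf_item LABEL VALUE UNIT WARN CRIT MIN MAX
# Prints one item as 'label'=value[unit];warn;crit;min;max
perf_item() {
    printf "'%s'=%s%s;%s;%s;%s;%s" "$1" "$2" "$3" "$4" "$5" "$6" "$7"
}

# plugin_line STATUS_TEXT ITEM [ITEM...]
# Status text goes before the "|", space-separated items go after it.
plugin_line() {
    status=$1
    shift
    printf '%s|%s\n' "$status" "$*"
}

# Sample values only; a real check would measure each core first.
plugin_line "OK: per-core CPU usage" \
    "$(perf_item core0 12 % 80 90 0 100)" \
    "$(perf_item core1 47 % 80 90 0 100)"
```

Each label after the "|" should end up as its own data series, so a per-core check written this way would be expected to graph every core separately.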

Re: Historical Performance Data

Posted: Tue Jul 19, 2011 1:04 pm
by hhlodge
I've been outputting performance data for some custom plugins and haven't done a thing with templates, and I have nice graphs with history. But I don't see any place where this data is kept, including the directory for the RRD files. Where would this be?

Re: Historical Performance Data

Posted: Tue Jul 19, 2011 1:52 pm
by mguthrie
PNP templates are a little bit tricky to work with. They are stored in the following directories:

/usr/local/nagios/share/pnp/templates
/usr/local/nagios/share/pnp/templates.dist #use this one for custom templates

Template names need to correspond to the name of a command definition, not the plugin name. Documentation on custom templates is pretty sketchy, both from PNP and even rrdtool. Getting them to work correctly is mostly a matter of trial and error.

Here's the best breakdown I've seen on performance data syntax:
http://docs.pnp4nagios.org/pnp-0.4/abou ... quirements



Hope that helps.

Re: Historical Performance Data

Posted: Wed Jul 20, 2011 12:31 pm
by hhlodge
Sorry if I wasn't clear. I literally created a plugin and command named check_badness that just checks for a file and outputs the hour in perf data format and I get a graph that walks up the X axis each hour for every day with history going back since I made the service. I did nothing with a template by that name. I did find the rrd data. I was looking in /var. I see now the graphs say "Default_Template". Is that a catch-all template, meaning we don't have to make a template copy by name? Sorry, I'm confused on what's being conveyed above.

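#!/bin/sh
# check_badness: CRITICAL if /tmp/nagbad exists, OK otherwise.
# The current hour is reported as performance data after the "|".

hour=$(date +%H)

if [ -f /tmp/nagbad ]
then
    echo "CRITICAL: Things are grim!|'Hour is'=${hour};;;0;24"
    exit 2
else
    echo "OK: All hunky dory.|'Hour is'=${hour};;;0;24"
    exit 0
fi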

Re: Historical Performance Data

Posted: Thu Jul 21, 2011 10:29 am
by mguthrie
"Is that a catch-all template, meaning we don't have to make a template copy by name?"
Correct, if there's no custom template defined, the default template that you're seeing there will be used.
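To make the fallback behaviour concrete, here is a small sketch of the lookup as described in this thread: try a template named after the command definition, and if none exists, use the catch-all. This is not PNP's actual code. The function name pick_template is made up, and the ".php" file suffix and the "default.php" file name are assumptions that should be checked against the files in your own template directories.

```shell
#!/bin/sh
# Sketch only, not PNP's implementation.
# Assumptions: templates are files named <command>.php, and the
# catch-all template is called default.php.

# pick_template COMMAND_NAME DIR [DIR...]
# Prints the path of the template that would be used, or returns 1.
pick_template() {
    cmd=$1
    shift
    # First pass: a template named after the command definition.
    for dir in "$@"; do
        if [ -f "$dir/$cmd.php" ]; then
            printf '%s\n' "$dir/$cmd.php"
            return 0
        fi
    done
    # Second pass: fall back to the catch-all template.
    for dir in "$@"; do
        if [ -f "$dir/default.php" ]; then
            printf '%s\n' "$dir/default.php"
            return 0
        fi
    done
    return 1
}

# Example: which template would a command named check_badness get?
pick_template check_badness \
    /usr/local/nagios/share/pnp/templates \
    /usr/local/nagios/share/pnp/templates.dist ||
    echo "no template found"
```

Running it for a command such as check_badness shows whether a graph would come from a template with that name or from the default one.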