Need help with a new plugin

MaxHeadroom
Posts: 6
Joined: Tue Jun 30, 2015 6:27 am

Post by MaxHeadroom »

I have a system that implements a messaging middleware. I want to build a plugin for Nagios to capture status messages being sent across this middleware. The status messages can be very detailed and include many parameters (e.g., state, mode, status, uptime, software revs, hardware info, installation details, temperature).

I was thinking the plugin would parse the status messages and pass the detailed information to Nagios. From there I could use the facilities within Nagios to determine thresholds, create alerts, and perform trend analysis and reporting. But it doesn't seem to work that way. It appears I only have OK, Warning, or Critical for passing to Nagios.

Am I reading this wrong? Is there a way to pass more information to Nagios?

Thanks
Randy
tmcdonald
Posts: 9117
Joined: Mon Sep 23, 2013 8:40 am

Re: Need help with a new plugin

Post by tmcdonald »

The exit code, 0 through 3, determines the OK, WARNING, CRITICAL, or UNKNOWN status, respectively. You can also pass back textual information, as well as performance data that can be used for graphing:

https://nagios-plugins.org/doc/guidelines.html
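
To make the exit-code idea concrete, here is a minimal Python sketch of the decision a plugin makes. It is only an illustration under stated assumptions: the function names, the "TEMP" prefix, and the rule that a higher reading is worse are all made up for this example and are not taken from any real plugin.

```python
# Exit codes defined by the plugin guidelines.
OK, WARNING, CRITICAL, UNKNOWN = 0, 1, 2, 3
STATUS_NAMES = {OK: "OK", WARNING: "WARNING", CRITICAL: "CRITICAL", UNKNOWN: "UNKNOWN"}


def evaluate(value, warn, crit):
    """Map a reading to an exit code; higher readings are treated as worse."""
    if value is None:
        return UNKNOWN  # no reading could be obtained
    if value > crit:
        return CRITICAL
    if value > warn:
        return WARNING
    return OK


def status_line(status, value):
    """Build the single line of text output that Nagios displays."""
    if value is None:
        return "TEMP UNKNOWN: no reading available"
    return "TEMP {}: {} degrees".format(STATUS_NAMES[status], value)


def run_check(value, warn, crit):
    """Return (exit_code, text); a real plugin prints the text and exits with the code."""
    status = evaluate(value, warn, crit)
    return status, status_line(status, value)
```

A real plugin would end by printing the text and calling `sys.exit()` with the code, so that Nagios can read the status from the process exit status.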
Former Nagios employee
ssax
Dreams In Code
Posts: 7682
Joined: Wed Feb 11, 2015 12:54 pm

Post by ssax »

In addition to what tmcdonald posted:

From http://nagios.sourceforge.net/docs/3_0/pluginapi.html:
Plugin Output Spec

At a minimum, plugins should return at least one line of text output. Beginning with Nagios 3, plugins can optionally return multiple lines of output. Plugins may also return optional performance data that can be processed by external applications. The basic format for plugin output is shown below:

TEXT OUTPUT | OPTIONAL PERFDATA
LONG TEXT LINE 1
LONG TEXT LINE 2
...
LONG TEXT LINE N | PERFDATA LINE 2
PERFDATA LINE 3
...
PERFDATA LINE N

The performance data (everything after a pipe symbol in the format above) is optional. If a plugin returns performance data in its output, it must separate the performance data from the other text output using a pipe (|) symbol. Additional lines of long text output (the LONG TEXT lines) are also optional.
You can read more here as well:

https://nagios-plugins.org/doc/guidelines.html
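
As a rough illustration of the layout quoted above, the following Python sketch assembles multi-line output from a summary, optional long text lines, and optional performance data items. The function name `format_plugin_output` and its parameters are assumptions made for this example only. When there is no long text, the sketch keeps all performance data on the first line, separated by spaces.

```python
def format_plugin_output(summary, long_text=(), perfdata=()):
    """Arrange text and perfdata in the layout from the Plugin Output Spec."""
    long_text = list(long_text)
    perfdata = list(perfdata)

    first_line = summary
    if perfdata:
        # The first perfdata item shares the first line, after a pipe.
        first_line += " | " + perfdata[0]
    lines = [first_line] + long_text

    remaining = perfdata[1:]
    if remaining:
        if long_text:
            # Further perfdata starts after a pipe on the last long text line.
            lines[-1] += " | " + remaining[0]
            lines.extend(remaining[1:])
        else:
            # No long text: keep everything on the first line.
            lines[0] += " " + " ".join(remaining)
    return "\n".join(lines)
```

For example, `format_plugin_output("TEMP OK", ["sensor A fine", "sensor B fine"], ["a=20", "b=21", "c=22"])` puts `a=20` on the first line, `b=21` after a pipe on the last long text line, and `c=22` on a line of its own.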

Post by MaxHeadroom »

tmcdonald wrote:The exit code, 0 through 3, determines the OK, WARNING, CRITICAL, or UNKNOWN status, respectively. You can also pass back textual information, as well as performance data that can be used for graphing:

https://nagios-plugins.org/doc/guidelines.html
Thanks. Graphing is something that I am interested in. Can I have Nagios compare textual information (e.g., temperature) against a threshold and use the alert facilities of Nagios should I go above some value?

Post by MaxHeadroom »

ssax wrote:In addition to what tmcdonald posted:

From http://nagios.sourceforge.net/docs/3_0/pluginapi.html:
Plugin Output Spec

At a minimum, plugins should return at least one line of text output. Beginning with Nagios 3, plugins can optionally return multiple lines of output. Plugins may also return optional performance data that can be processed by external applications. The basic format for plugin output is shown below:

TEXT OUTPUT | OPTIONAL PERFDATA
LONG TEXT LINE 1
LONG TEXT LINE 2
...
LONG TEXT LINE N | PERFDATA LINE 2
PERFDATA LINE 3
...
PERFDATA LINE N

The performance data (everything after a pipe symbol in the format above) is optional. If a plugin returns performance data in its output, it must separate the performance data from the other text output using a pipe (|) symbol. Additional lines of long text output (the LONG TEXT lines) are also optional.

Thanks. Could temperature performance data look like this, for example?

Temperature | 78
SendingSystemId=123456789
Units=degrees


Then, again, I would want Nagios to compare the temperature (i.e., 78) to a threshold value. If the threshold is breached, I want Nagios to send out the alerts. Coding the threshold inside the plugin doesn't seem like a good architecture.
jdalrymple
Skynet Drone
Posts: 2620
Joined: Wed Feb 11, 2015 1:56 pm

Post by jdalrymple »

MaxHeadroom wrote:and use the alert facilities of Nagios should I go above some value
This is what Nagios does with the return values 0-3, as described in the plugin guidelines already posted.
MaxHeadroom wrote:Graphing is something that I am interested in
This is what perfdata is for.
MaxHeadroom wrote:Can I have Nagios compare textual information (e.g., temperature) against a threshold
You can write plugins to compare anything you want. Generally I wouldn't consider temperature to be textual, but as an alternative example, there are plenty of logfile readers that check for the existence of a text string and alert on it.
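
For the graphing side, the plugin guidelines describe each performance data item as `'label'=value[UOM];[warn];[crit];[min];[max]`, so the temperature would go after the pipe as a label=value pair rather than as a bare number. Below is a small Python sketch of a formatter for one such item. The helper name `perfdata_item` is hypothetical, and it leaves the unit empty by default because, as far as I know, the guidelines define no unit of measurement for degrees.

```python
def perfdata_item(label, value, uom="", warn=None, crit=None, minimum=None, maximum=None):
    """Format one item as 'label'=value[UOM];[warn];[crit];[min];[max]."""
    if " " in label:
        # Labels containing spaces are wrapped in single quotes.
        label = "'" + label + "'"
    fields = ["" if f is None else str(f) for f in (warn, crit, minimum, maximum)]
    # Trailing empty fields can be dropped along with their semicolons.
    while fields and fields[-1] == "":
        fields.pop()
    return ";".join(["{}={}{}".format(label, value, uom)] + fields)
```

With this, `perfdata_item("temperature", 78, warn=80, crit=90)` gives `temperature=78;80;90`, which would follow the pipe on the first output line. The thresholds of 80 and 90 are invented for the example.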

Post by jdalrymple »

MaxHeadroom wrote:coding the threshold inside the plugin doesn't seem like a good architecture
We agree:
Nagios Plugin Guidelines wrote:There are a few reserved options that should not be used for other purposes:

-V version (--version)
-h help (--help)
-t timeout (--timeout)
-w warning threshold (--warning)
-c critical threshold (--critical)

-H hostname (--hostname)
-v verbose (--verbose)
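
In practice this means the thresholds arrive as arguments each time Nagios runs the plugin. A minimal Python sketch of parsing the reserved options with `argparse` might look like the following. The plugin name `check_temp`, the version string, and the default timeout of 10 seconds are assumptions for illustration. The `-h`/`--help` option is supplied by `argparse` itself.

```python
import argparse


def build_parser():
    """Create a parser for the reserved options of a hypothetical check_temp plugin."""
    parser = argparse.ArgumentParser(
        prog="check_temp",
        description="Compare a temperature reading against thresholds.",
    )
    parser.add_argument("-V", "--version", action="version", version="%(prog)s 0.1")
    parser.add_argument("-H", "--hostname", help="host to query")
    parser.add_argument("-w", "--warning", type=float, required=True,
                        help="warning threshold")
    parser.add_argument("-c", "--critical", type=float, required=True,
                        help="critical threshold")
    parser.add_argument("-t", "--timeout", type=int, default=10,
                        help="seconds to wait before giving up")
    parser.add_argument("-v", "--verbose", action="count", default=0,
                        help="repeat for more detail")
    return parser
```

Calling `build_parser().parse_args(["-w", "30", "-c", "40"])` yields an object whose `warning` and `critical` attributes hold the thresholds, so nothing is hard-coded in the plugin.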

Post by MaxHeadroom »

jdalrymple wrote:
MaxHeadroom wrote:coding the threshold inside the plugin doesn't seem like a good architecture
We agree:
Nagios Plugin Guidelines wrote:There are a few reserved options that should not be used for other purposes:

-V version (--version)
-h help (--help)
-t timeout (--timeout)
-w warning threshold (--warning)
-c critical threshold (--critical)

-H hostname (--hostname)
-v verbose (--verbose)

These look like command line options. What if I have dozens of thresholds? Thanks for the great dialog.
Randy

Post by MaxHeadroom »

tmcdonald wrote:The exit code, 0 through 3, determines the OK, WARNING, CRITICAL, or UNKNOWN status, respectively
When you say "Exit code", you don't actually mean the plugin exits, do you?

Post by jdalrymple »

MaxHeadroom wrote:What if I have dozens of thresholds?
If you have 1000 refrigerators that use the same temperature thresholds, they can be defined as one service that looks like this:

Code:

define service {
	service_description     1000 refrigerators
	check_command           check_temp!-w 30 -c 40    ; warning at 30 degrees, critical at 40 degrees
	hostgroup_name          all_my_fridges
	...
	}
If you have 2 sets of 500 refrigerators, each set with its own thresholds, you'd define them like this:

Code:

define service {
	service_description     500 refrigerators
	check_command           check_temp!-w 30 -c 40
	hostgroup_name          half_my_fridges
	...
	}

define service {
	service_description     500 different refrigerators
	check_command           check_temp!-w 30 -c 40    ; set this group's own thresholds here
	hostgroup_name          the_other_half
	...
	}
If every fridge is different, no monitoring software I know of can work without having some thresholds defined somewhere, and the mind-link function isn't online yet. If you have to define thousands of thresholds and can't aggregate anything, you have my sympathy:

Code:

define service {
	service_description     a refrigerator
	check_command           check_temp!-w 30 -c 40
	host_name               1_fridge
	...
	}

define service {
	service_description     a different refrigerator
	check_command           check_temp!-w 31 -c 41
	host_name               2_fridge
	...
	}

define service {
	service_description     yet another
	check_command           check_temp!-w 32 -c 42
	host_name               red_fridge
	...
	}

define service {
	service_description     and another
	check_command           check_temp!-w 35 -c 36
	host_name               blue_fridge
	...
	}
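
If it really does come to one definition per fridge, the repetitive part can be generated rather than typed. The Python sketch below renders service blocks from a table of per-host thresholds. The function names and the dictionary layout are assumptions for this example, and the output contains only the three directives shown above, so whatever the "..." stands for in the definitions still has to be added.

```python
# Only the directives shown in the post are generated; the rest must be added.
SERVICE_TEMPLATE = (
    "define service {{\n"
    "\tservice_description\t{description}\n"
    "\tcheck_command\t\tcheck_temp!-w {warn} -c {crit}\n"
    "\thost_name\t\t{host}\n"
    "\t}}\n"
)


def render_service(host, description, warn, crit):
    """Render one service block for a single host."""
    return SERVICE_TEMPLATE.format(
        host=host, description=description, warn=warn, crit=crit
    )


def render_all(thresholds):
    """Render blocks for a dict mapping host name -> (description, warn, crit)."""
    blocks = [
        render_service(host, description, warn, crit)
        for host, (description, warn, crit) in sorted(thresholds.items())
    ]
    return "\n".join(blocks)
```

Feeding it something like `{"1_fridge": ("a refrigerator", 30, 40), "2_fridge": ("a different refrigerator", 31, 41)}` produces two blocks in host-name order, which could be written to a file under the object configuration directory.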
MaxHeadroom wrote:When you say "Exit code", you don't actually mean the plugin exits, do you?
That's how it works. The plugin runs, prints its output, and exits; the exit status is what tells Nagios the state:

Code:

[jdalrymple@localhost libexec]$ ./check_uptime -w 10 -u days
Uptime OK: 0 day(s) 19 hour(s) 15 minute(s) | uptime=0.000000;10.000000;;
[jdalrymple@localhost libexec]$ echo $?
0
[jdalrymple@localhost libexec]$ ./check_uptime -w 10 -u minutes
Uptime WARNING: 0 day(s) 19 hour(s) 15 minute(s) | uptime=1155.000000;10.000000;;
[jdalrymple@localhost libexec]$ echo $?
1
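
The same thing can be seen from the calling side. The Python sketch below runs a plugin command line, captures its output, and translates the exit status into a state name, which is roughly what the shell session above does by hand. The function name `run_plugin` is made up for this example, and mapping any exit status outside 0-3 to UNKNOWN is a choice of the sketch, not a statement about how Nagios itself reacts.

```python
import subprocess
import sys

STATUS_NAMES = {0: "OK", 1: "WARNING", 2: "CRITICAL", 3: "UNKNOWN"}


def run_plugin(command, timeout=10):
    """Run a plugin command (a list of arguments); return (state, first output line)."""
    result = subprocess.run(
        command,
        stdout=subprocess.PIPE,
        universal_newlines=True,
        timeout=timeout,
    )
    # The exit status, not the text, carries the state.
    state = STATUS_NAMES.get(result.returncode, "UNKNOWN")
    lines = result.stdout.splitlines()
    first_line = lines[0] if lines else ""
    return state, first_line
```

Calling `run_plugin(["./check_uptime", "-w", "10", "-u", "minutes"])` from the plugin directory would be the equivalent of the second command above. Because the plugin is a short-lived process, each scheduled check is a fresh run that ends with an exit status.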