Need help with a new plugin
-
MaxHeadroom
- Posts: 6
- Joined: Tue Jun 30, 2015 6:27 am
Need help with a new plugin
I have a system that implements a messaging middleware. I want to build a plugin for Nagios to capture status messages being sent across this middleware. The status messages can be very robust and include many parameters (e.g., state, mode, status, uptime, software revs, hardware info, installation details, temperature, etc.).
I was thinking the plugin would parse the status messages and pass the detailed information to Nagios. From there I could use the facilities within Nagios to determine thresholds, create alerts, and perform trend analysis and reporting. But it doesn't seem to work that way. It appears I only have OK, Warning, or Critical for passing to Nagios.
Am I reading this wrong? Is there a way to pass more information to Nagios?
Thanks
Randy
Re: Need help with a new plugin
The exit code of 0 through 3 determines the OK, WARNING, CRITICAL, or UNKNOWN status. You can also pass back textual information, and information that will be used to graph (performance data):
https://nagios-plugins.org/doc/guidelines.html
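To make the exit-code idea concrete, here is a minimal sketch of how a plugin might turn one parsed status message into a Nagios state. The `STATE_MAP` values and the message layout are assumptions made up for illustration; they are not part of Nagios or of any particular middleware.

```python
# Exit codes from the Nagios plugin guidelines.
OK, WARNING, CRITICAL, UNKNOWN = 0, 1, 2, 3
LABELS = {OK: "OK", WARNING: "WARNING", CRITICAL: "CRITICAL", UNKNOWN: "UNKNOWN"}

# Hypothetical mapping from a middleware "state" field to a Nagios state.
STATE_MAP = {"running": OK, "degraded": WARNING, "failed": CRITICAL}


def check(message):
    """Return (exit_code, output_line) for one parsed status message."""
    code = STATE_MAP.get(message.get("state"), UNKNOWN)
    line = f"MIDDLEWARE {LABELS[code]} - state={message.get('state')}"
    return code, line


code, line = check({"state": "degraded"})  # made-up sample message
print(line)
# A real plugin would finish with: sys.exit(code)
```

Anything the plugin cannot interpret falls through to UNKNOWN (3) rather than being reported as OK.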
Former Nagios employee
Re: Need help with a new plugin
In addition to what tmcdonald posted:
From http://nagios.sourceforge.net/docs/3_0/pluginapi.html:
Plugin Output Spec
At a minimum, plugins should return at least one line of text output. Beginning with Nagios 3, plugins can optionally return multiple lines of output. Plugins may also return optional performance data that can be processed by external applications. The basic format for plugin output is shown below:
TEXT OUTPUT | OPTIONAL PERFDATA
LONG TEXT LINE 1
LONG TEXT LINE 2
...
LONG TEXT LINE N | PERFDATA LINE 2
PERFDATA LINE 3
...
PERFDATA LINE N
The performance data (shown in orange) is optional. If a plugin returns performance data in its output, it must separate the performance data from the other text output using a pipe (|) symbol. Additional lines of long text output (shown in blue) are also optional.
You can read more here as well:
https://nagios-plugins.org/doc/guidelines.html
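As a rough illustration of that layout, the sketch below assembles a summary line, performance data, and long text. It assumes the perfdata item form from the plugin guidelines, 'label'=value[UOM];warn;crit;min;max, and for simplicity puts all perfdata on the first line (the extra perfdata lines are optional). The helper names `perf_item` and `format_output` and the sample values are invented for this example.

```python
def perf_item(label, value, warn="", crit="", minimum="", maximum="", uom=""):
    """Build one perfdata item: 'label'=value[UOM];warn;crit;min;max."""
    return f"'{label}'={value}{uom};{warn};{crit};{minimum};{maximum}"


def format_output(summary, perf_items=(), long_text=()):
    """Summary and perfdata go on line one; long text lines follow."""
    first = summary
    if perf_items:
        first += " | " + " ".join(perf_items)
    return "\n".join([first, *long_text])


output = format_output(
    "TEMPERATURE OK - 78 degrees",
    perf_items=[perf_item("temperature", 78, warn=80, crit=90)],
    long_text=["system_id=123456789", "software_rev=1.2.3"],  # made-up details
)
print(output)
```

The text before the pipe is what shows up as the status information; the part after it is what graphing add-ons consume.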
-
MaxHeadroom
- Posts: 6
- Joined: Tue Jun 30, 2015 6:27 am
Re: Need help with a new plugin
tmcdonald wrote:The exit code of 0 through 3 determines the OK, WARNING, CRITICAL, or UNKNOWN status. You can also pass back textual information, and information that will be used to graph (performance data):
https://nagios-plugins.org/doc/guidelines.html
Thanks. Graphing is something that I am interested in. Can I have Nagios compare textual information (e.g., temperature) against a threshold and use the alert facilities of Nagios should I go above some value?
-
MaxHeadroom
- Posts: 6
- Joined: Tue Jun 30, 2015 6:27 am
Re: Need help with a new plugin
ssax wrote:In addition to what tmcdonald posted:
From http://nagios.sourceforge.net/docs/3_0/pluginapi.html:
Plugin Output Spec
At a minimum, plugins should return at least one line of text output. Beginning with Nagios 3, plugins can optionally return multiple lines of output. Plugins may also return optional performance data that can be processed by external applications. The basic format for plugin output is shown below:
TEXT OUTPUT | OPTIONAL PERFDATA
LONG TEXT LINE 1
LONG TEXT LINE 2
...
LONG TEXT LINE N | PERFDATA LINE 2
PERFDATA LINE 3
...
PERFDATA LINE N
The performance data (shown in orange) is optional. If a plugin returns performance data in its output, it must separate the performance data from the other text output using a pipe (|) symbol. Additional lines of long text output (shown in blue) are also optional.
Thanks. Could an example temperature performance data look like this:
Temperature | 78
SendingSystemId=123456789
Units=degrees
Then, again, I would want Nagios to compare the temperature (i.e., 78) to a threshold value. If the threshold is breached, then I want Nagios to send out the alerts. Coding the threshold inside the plugin doesn't seem like a good architecture.
-
jdalrymple
- Skynet Drone
- Posts: 2620
- Joined: Wed Feb 11, 2015 1:56 pm
Re: Need help with a new plugin
MaxHeadroom wrote:and use the alert facilities of Nagios should I go above some value
This is what Nagios does with the return values 0-3 as indicated by the plugin guidelines already posted.
MaxHeadroom wrote:Graphing is something that I am interested in
This is what perfdata is for.
MaxHeadroom wrote:Can I have Nagios compare textual information (E.g., temperature) against a threshold
You can write plugins to compare anything you want. Generally I wouldn't consider temperature to be textual, but as an alternative example, there are plenty of different logfile readers that can monitor for the existence of a text string and alert on it.
-
jdalrymple
- Skynet Drone
- Posts: 2620
- Joined: Wed Feb 11, 2015 1:56 pm
Re: Need help with a new plugin
MaxHeadroom wrote:coding the threshold inside the plugin doesn't seem like a good architecture
We agree:
Nagios Plugin Guidelines wrote:There are a few reserved options that should not be used for other purposes:
-V version (--version)
-h help (--help)
-t timeout (--timeout)
-w warning threshold (--warning)
-c critical threshold (--critical)
-H hostname (--hostname)
-v verbose (--verbose)
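To show how the reserved -w and -c options keep thresholds out of the plugin code, here is a minimal sketch using Python's argparse. The plain "at or above" comparison is a simplification; the guidelines also define a richer threshold range syntax that this sketch does not handle. The reading of 35.0 and the `evaluate` helper are made up for illustration.

```python
import argparse

OK, WARNING, CRITICAL = 0, 1, 2


def parse_args(argv):
    """Parse the reserved threshold options; argparse supplies -h/--help itself."""
    parser = argparse.ArgumentParser(description="check_temp sketch (hypothetical)")
    parser.add_argument("-w", "--warning", type=float, required=True,
                        help="warning threshold")
    parser.add_argument("-c", "--critical", type=float, required=True,
                        help="critical threshold")
    return parser.parse_args(argv)


def evaluate(value, warning, critical):
    """Simple 'at or above' comparison; range syntax is not handled here."""
    if value >= critical:
        return CRITICAL
    if value >= warning:
        return WARNING
    return OK


args = parse_args(["-w", "30", "-c", "40"])  # example thresholds
reading = 35.0  # made-up temperature taken from a status message
state = evaluate(reading, args.warning, args.critical)
print(state)
```

Because the thresholds arrive as arguments, they live in the Nagios configuration that invokes the plugin, not in the plugin itself.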
-
MaxHeadroom
- Posts: 6
- Joined: Tue Jun 30, 2015 6:27 am
Re: Need help with a new plugin
jdalrymple wrote:
MaxHeadroom wrote:coding the threshold inside the plugin doesn't seem like a good architecture
We agree:
Nagios Plugin Guidelines wrote:There are a few reserved options that should not be used for other purposes:
-V version (--version)
-h help (--help)
-t timeout (--timeout)
-w warning threshold (--warning)
-c critical threshold (--critical)
-H hostname (--hostname)
-v verbose (--verbose)
These look like command line options. What if I have dozens of thresholds? Thanks for the great dialog.
Randy
-
MaxHeadroom
- Posts: 6
- Joined: Tue Jun 30, 2015 6:27 am
Re: Need help with a new plugin
tmcdonald wrote:The exit code of 0 through 3 determines the OK, WARNING, CRITICAL, or UNKNOWN status
When you say "Exit code", you don't actually mean the plugin exits, do you?
-
jdalrymple
- Skynet Drone
- Posts: 2620
- Joined: Wed Feb 11, 2015 1:56 pm
Re: Need help with a new plugin
MaxHeadroom wrote:What if I have dozens of thresholds?
If you have 1000 refrigerators that use the same temperature thresholds they can be defined as one service that looks like this:
Code:
define service {
service_description 1000 refrigerators
check_command check_temp!-w 30 -c 40 ; implies warning at 30 degrees, critical at 40 degrees
hostgroup_name all_my_fridges
...
}
Code:
define service {
service_description 500 refrigerators
check_command check_temp!-w 30 -c 40
hostgroup_name half_my_fridges
...
}
define service {
service_description 500 different refrigerators
check_command check_temp!-w 30 -c 40
hostgroup_name the_other_half
...
}
Code:
define service {
service_description a refrigerator
check_command check_temp!-w 30 -c 40
host_name 1_fridge
...
}
define service {
service_description a different refrigerator
check_command check_temp!-w 31 -c 41
host_name 2_fridge
...
}
define service {
service_description yet another
check_command check_temp!-w 32 -c 42
host_name red_fridge
...
}
define service {
service_description and another
check_command check_temp!-w 35 -c 36
host_name blue_fridge
...
}
MaxHeadroom wrote:When you say "Exit code", you don't actually mean the plugin exits, do you?
That's how it works:
Code:
[jdalrymple@localhost libexec]$ ./check_uptime -w 10 -u days
Uptime OK: 0 day(s) 19 hour(s) 15 minute(s) | uptime=0.000000;10.000000;;
[jdalrymple@localhost libexec]$ echo $?
0
[jdalrymple@localhost libexec]$ ./check_uptime -w 10 -u minutes
Uptime WARNING: 0 day(s) 19 hour(s) 15 minute(s) | uptime=1155.000000;10.000000;;
[jdalrymple@localhost libexec]$ echo $?
1
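The same thing can be seen from the calling side: the plugin is run as a separate process for each check, prints its output, and exits, and the caller reads the exit status. The sketch below imitates that with Python's subprocess module. The `fake_plugin` command is a stand-in invented for this example so that it runs without check_uptime installed.

```python
import subprocess
import sys

# Stand-in for a plugin: prints one line and exits with status 1 (WARNING).
fake_plugin = [
    sys.executable, "-c",
    "import sys; print('Uptime WARNING: demo | uptime=1155;10;;'); sys.exit(1)",
]

# Run the plugin once, the way a scheduler would for a single check.
result = subprocess.run(fake_plugin, capture_output=True, text=True)

labels = {0: "OK", 1: "WARNING", 2: "CRITICAL", 3: "UNKNOWN"}
print(result.stdout.strip())
print(labels.get(result.returncode, "UNKNOWN"))
```

So yes, the plugin really does exit after every check; a long-running listener would need some other way to hand results over.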