Parsing SSH output

Posted: Wed May 22, 2013 5:08 am
by test541
Hello,
I use the check_by_ssh plugin because I need to run checks on remote machines over an encrypted tunnel.
I'd like to execute remote scripts and capture their output. Unfortunately, the plugin seems to return only the script's return code (0, 1 or 2, as defined in the Nagios plug-in development guidelines).
For example, I have a script that monitors a set of filesystems on a remote UNIX machine. The Perl script has a config file on the remote host.
Multi-line output (one line per filesystem) is generated only for overloaded filesystems, in this format:
Test_ID Bussiness_system_name Service_name Severity Text_message_with_percent_value_and_filesystem_name
No output is generated for filesystems that are below their thresholds, so if everything is fine there is no output at all.

When I run the script from a shell on the Nagios server, everything is OK. I execute it this way:

Code:

/usr/local/nagios/libexec/check_by_ssh -H 10.10.65.30 -C "export CHKHOME=/opt/my_agt ; /opt/my_agt/var/bin/NAG_disk_space.pl" -l chkusername -E
But when I define the command in Nagios and assign it to a service and host, I can only see the return value. No output is shown, only an error about return codes.

I prefer scripts with local configs over scripts run via SSH with parameters defined on the Nagios server.
It is a better approach for me, because a single Nagios command can display only the filesystems that have problems. To use the plugin in the usual way, I would have to modify the script to accept parameters: filesystem name, critical value and warning value. That scenario is more complicated and needs many service entries for a single host. I prefer summarizing them into one entity: the filesystems on that host.

Do you know of any workarounds or solutions?
Is it possible at all to parse multi-line output from Nagios checks?
Filesystems are only one example of this approach. I need to run other tests as well, such as process checks and file-existence checks.

THX.

Re: Parsing SSH output

Posted: Wed May 22, 2013 3:52 pm
by abrist
Can you show us an example of the script's output from the CLI?

Re: Parsing SSH output

Posted: Thu May 23, 2013 12:33 am
by test541
Here's the output

Code:

[nagios@localhost ~]$ /usr/local/nagios/libexec/check_by_ssh -H 10.10.65.30 -C "export AGTHOME=/opt/my_agt ; /opt/my_agt/var/bin/NAG_disk_space.pl" -l itm -E
GL:0504:/usr GLOBUS GL.OS.fs critical Disk use of /usr is 46 % ( > 10 %)
GL:0504:/ GLOBUS GL.OS.fs critical Disk use of / is 15 % ( > 10 %)
GL:0504:/home GLOBUS GL.OS.fs critical Disk use of /home is 59 % ( > 10 %)
GL:0504:/opt GLOBUS GL.OS.fs critical Disk use of /opt is 52 % ( > 10 %)
I'd like to use the above information to generate 4 events from one script run. I don't want to execute the remote script four times with parameters (filesystem, threshold), because they are already defined in the config file for NAG_disk_space.pl.
I couldn't find any way to interpret such script output, only the return codes 0, 1, 2.

Re: Parsing SSH output

Posted: Thu May 23, 2013 10:26 am
by abrist
What error, verbatim, were you getting about exit codes?
test541 wrote: I didn't find any way to interpret such script output - only return codes 0,1,2.
For clarity, are you just interested in getting the long output to display on the XI interface, or are you also attempting to use an event handler with the check?

Re: Parsing SSH output

Posted: Mon May 27, 2013 6:36 am
by test541
When I run the script via the Nagios/Service/Test command, I get:

OUTPUT: UNKNOWN - check_by_ssh: Remote command /usr/local/nagios/libexec/check_by_ssh ..._command_executed_here_ returned status 255

I think this is because of the multi-line output. When I create a simple script that only returns exit code 0, its service is displayed in green in XI.
I'd like to use the multi-line output to display the status of every overfilled filesystem in the Nagios XI interface with a single script execution (as in my output above).

Re: Parsing SSH output

Posted: Tue May 28, 2013 1:52 pm
by abrist
Your best bet would be to still exit with a single exit code, the one for the most serious state found. You could then change the output string to list the checks that failed (this is how the check works by default). If you need to kick off separate events for each check, you should make them individual checks. You could extend the logic in the script to parse the output for each mount point and then report it in the format you wish. Alternatively, you could have each mount point check report passively to kick off its own events.
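As a rough sketch of the single-exit-code approach (not a tested drop-in), a small wrapper could read the multi-line report, pick the most serious severity, print one summary line followed by the original lines as long output, and exit with the matching plugin code (0 OK, 1 WARNING, 2 CRITICAL, 3 UNKNOWN). The function name summarize_fs_report is made up for illustration, and the severity word "warning" is an assumption, since only "critical" appears in the sample output.

```shell
#!/bin/sh
# Hypothetical helper: collapse the multi-line report into one Nagios result.
# Assumes field 4 of every line is the severity, as in the sample output.
# The word "warning" is an assumption; only "critical" is shown in the thread.
summarize_fs_report() {
    awk '
        $4 == "critical" { crit++ }
        $4 == "warning"  { warn++ }
        NF               { lines[++n] = $0 }   # keep every non-empty line
        END {
            if (crit)      { state = "CRITICAL"; code = 2 }
            else if (warn) { state = "WARNING";  code = 1 }
            else if (n)    { state = "UNKNOWN";  code = 3 }   # unrecognized severity
            else           { print "OK - no filesystem over threshold"; exit 0 }
            # First line: short summary. Following lines: long output.
            printf "%s - %d filesystem(s) reported (%d critical, %d warning)\n", state, n, crit, warn
            for (i = 1; i <= n; i++) print lines[i]
            exit code
        }
    '
}

# Possible use on the remote host, in a wrapper script that check_by_ssh calls:
#   /opt/my_agt/var/bin/NAG_disk_space.pl | summarize_fs_report
#   exit $?
```

With a wrapper like this the service shows one state, and the per-filesystem lines stay visible as the long output of that single check.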

Nagios will only allow one exit code to be returned per check.
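For the passive alternative, a sketch along these lines could turn each report line into a PROCESS_SERVICE_CHECK_RESULT external command, so every mount point gets its own result. The function name format_passive_results, the "Disk <mount>" service naming and the severity word "warning" are all assumptions for illustration. Each service description would have to match a service already defined on the host with passive checks enabled.

```shell
#!/bin/sh
# Hypothetical helper: print one external command per report line.
# Usage: format_passive_results <nagios_host_name> [unix_timestamp]
# Assumes field 1 ends in the mount point (e.g. GL:0504:/usr) and
# field 4 is the severity; "warning" is an assumed severity word.
format_passive_results() {
    host=$1
    now=${2:-$(date +%s)}
    awk -v host="$host" -v now="$now" '
        NF >= 4 {
            n = split($1, id, ":")
            mount = id[n]                      # last part of the test ID
            if ($4 == "critical")     code = 2
            else if ($4 == "warning") code = 1
            else                      code = 3
            text = $5
            for (i = 6; i <= NF; i++) text = text " " $i
            # [time] PROCESS_SERVICE_CHECK_RESULT;host;service;return_code;output
            printf "[%s] PROCESS_SERVICE_CHECK_RESULT;%s;Disk %s;%d;%s\n", now, host, mount, code, text
        }
    '
}

# Possible use on the Nagios server; the command file path below is the
# usual default for a source install and may differ on your system:
#   check_by_ssh ... | format_passive_results remotehost >> /usr/local/nagios/var/rw/nagios.cmd
```

Because the script prints nothing for healthy filesystems, a recovery (OK) result would have to be submitted separately for this to clear an alert.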