
Getting error "Return code of 126" when running check

Posted: Thu Mar 12, 2015 6:19 pm
by Metroshica
I'm currently using a variant of the well-known check_snmp_environment.pl script that has support for Brocade FCX switches (which can be found at http://exchange.nagios.org/directory/Pl ... 29/details). The script works completely fine when I run it by hand (both as root and as the nagios user), but when Nagios runs it I get the following error: "Warning: Return code of 126 for check of service 'Overall Health' on host 'foobar.com' was out of bounds.Make sure the plugin you're trying to run is executable."

Obviously this led me to believe that it's a permissions issue, but the script is owned by nagios and the permissions are set to 755. I was originally getting the "Check of service 'Overall Health' on host 'foobar.com' did not exit properly!" error, but I changed the enable_embedded_perl value to 0, which fixed that issue.

Here is the command definition from the commands.cfg file for this check:

Code:

define command{
        command_name    check_snmp_environment_fcx
        command_line    $USER1$/check_snmp_environment.pl -H $HOSTADDRESS$ -C public -T $ARG1$
        }
I changed the arguments around multiple times, passing more or fewer arguments from the service definition, but it didn't make a difference. Speaking of which, here's the service definition:

Code:

define service{
        use                     generic-service
        host_name               foobar.com
        service_description     Overall Health
        check_command           check_snmp_environment_fcx!brocadeFCX
        normal_check_interval   5
        retry_check_interval    1
        }
I've enabled debug mode and looked at the log. According to the log, the command is correct. Here's what I'm seeing in the debug log:

Code:

[1426200429.367186] [2048.1] [pid=14642] **** BEGIN MACRO PROCESSING ***********
[1426200429.367196] [2048.1] [pid=14642] Processing: '$USER1$/check_snmp_environment.pl -H $HOSTADDRESS$ -C public -T $ARG1$'
[1426200429.367222] [2048.1] [pid=14642]   Done.  Final output: '/usr/lib64/nagios/plugins/check_snmp_environment.pl -H 192.168.1.10 -C public -T brocadeFCX'
[1426200429.367233] [2048.1] [pid=14642] **** END MACRO PROCESSING *************
If I copy the command "/usr/lib64/nagios/plugins/check_snmp_environment.pl -H 192.168.1.10 -C public -T brocadeFCX" and run it as either the root or nagios user, the check runs successfully and I get the following output:

Code:

[root@nagios ~]# /usr/lib64/nagios/plugins/check_snmp_environment.pl -H 192.168.1.10 -C public -T brocadeFCX
PS 1 (Power supply 1 (NA - AC - Regular) present, status ok): OK; PS 2 (Power supply 2 (NA - AC - Regular) present, status ok): OK; 
PS 1-1 (Power supply 1): OK; PS 2-1 (Power supply 2): OK; PS 1-2 (Power supply 1): OK; PS 2-2 (Power supply 2): OK; PS 1-3 (Power supply 1): OK; PS 2-3 (Power supply 2): OK; 
Fan 1 (1): OK; Fan 2 (2): OK; 
Fan 1 (1): OK; Fan 2 (2): OK; 
Fan 1 (1): OK; Fan 2 (2): OK; 
Chassis temperature of 43.5°C: OK; 
Chassis unit temperature of 43.5°C: OK; 
Management module: CPU temperature of 60.5°C: OK; 
Management module: MAC 1 temperature of 43.5°C: OK; 
Management module: CPU temperature of 63°C: OK; 
Management module: MAC 1 temperature of 44°C: OK; 
Management module: CPU temperature of 66°C: OK; 
Management module: MAC 1 temperature of 46°C: OK; 
Module 1-1 (ICX6610-48P POE 48-port Management Module): Module status: OK; Redundant status: OK (active); 
Module 1-2 (ICX6610-QSFP 10-port 160G Module): Module status: OK; Redundant status: OK (other); 
Module 1-3 (ICX6610-8-port Dual Mode(SFP/SFP+) Module): Module status: OK; Redundant status: OK (other); 
Module 2-1 (ICX6610-48P POE 48-port Management Module): Module status: OK; Redundant status: OK (standby); 
Module 2-2 (ICX6610-QSFP 10-port 160G Module): Module status: OK; Redundant status: OK (other); 
Module 2-3 (ICX6610-8-port Dual Mode(SFP/SFP+) Module): Module status: OK; Redundant status: OK (other); 
Module 3-1 (ICX6610-48P POE 48-port Management Module): Module status: OK; Redundant status: OK (other); 
Module 3-2 (ICX6610-QSFP 10-port 160G Module): Module status: OK; Redundant status: OK (other); 
Module 3-3 (ICX6610-8-port Dual Mode(SFP/SFP+) Module): Module status: OK; Redundant status: OK (other); 
all OK

[root@nagios ~]# echo $?
0
As you can see at the end, the exit code is 0 when I run it manually, so it should be good. However, whenever the nagios process runs the check, it somehow gets an exit status of 126. I'm completely at a loss as to what's going on here. Does anyone have any ideas?
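
For context on the number itself: by shell convention, an exit status of 126 means the command was found but could not be executed, which is why the Nagios warning suggests checking that the plugin is executable. A minimal sketch that reproduces the status with a throwaway script (the temporary file and its contents are hypothetical and have nothing to do with the plugin above):

```shell
#!/bin/sh
# Create a throwaway script and make sure it has no execute bit.
tmp_script=$(mktemp)
printf '#!/bin/sh\necho hello\n' > "$tmp_script"
chmod 644 "$tmp_script"

# Try to run it directly. The shell finds the file but is refused
# permission to execute it, so it reports status 126.
status=0
"$tmp_script" 2>/dev/null || status=$?
echo "exit status: $status"

rm -f "$tmp_script"
```

The same status results whenever the kernel refuses to execute the file with "permission denied", whatever the reason, so correct mode bits alone do not rule it out.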

Re: Getting error "Return code of 126" when running check

Posted: Fri Mar 13, 2015 10:36 am
by jdalrymple

Code:

sestatus
is the first thing that comes to mind... how are you looking there?

Killer job of debugging prior to posting BTW - thanks for that!
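
A rough sketch of how the SELinux check could be taken one step further, assuming the standard SELinux and audit userland tools (getenforce, ausearch) are installed and the script is run as root. The plugin path is the one from the debug log above; everything else is an assumption about the box. A command started from a login shell typically runs in a different SELinux domain than the nagios daemon, which would explain a check that works by hand but fails under Nagios.

```shell
#!/bin/sh
# Plugin path taken from the debug log earlier in the thread.
plugin=/usr/lib64/nagios/plugins/check_snmp_environment.pl

# getenforce prints Enforcing, Permissive, or Disabled.
# Fall back to "unavailable" if the tool is missing or fails.
selinux_mode=$(getenforce 2>/dev/null) || selinux_mode="unavailable"
echo "SELinux mode: $selinux_mode"

if [ "$selinux_mode" = "Enforcing" ]; then
    # Show the security context (label) on the plugin file.
    ls -Z "$plugin" || true

    # Look for AVC denials from the last ten minutes that mention nagios.
    # Requires auditd to be running and root privileges.
    ausearch -m avc -ts recent 2>/dev/null | grep -i nagios \
        || echo "no recent nagios-related denials found"
fi
```

If denials show up here right after a failed check, that points at SELinux policy rather than file permissions.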

Re: Getting error "Return code of 126" when running check

Posted: Fri Mar 13, 2015 10:47 am
by Metroshica
You, sir, have just made my day. That was exactly the problem: this is a new box and I totally forgot SELinux was still enabled. That fixed the problem immediately.

You're welcome for the debugging, it saved both of us so much time.
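
For anyone who would rather keep SELinux enabled, one thing that may be worth trying first is checking whether the plugin file simply carries the wrong label, for example because it was moved into place from a home directory. A minimal sketch, assuming restorecon is installed and the script is run as root. It is a dry run only and is not confirmed as the cause in this thread:

```shell
#!/bin/sh
# Plugin path taken from the debug log earlier in the thread.
plugin=/usr/lib64/nagios/plugins/check_snmp_environment.pl

if [ "$(getenforce 2>/dev/null)" = "Enforcing" ] && [ -e "$plugin" ]; then
    # -n: do not change anything, only report
    # -v: print the label that would be applied, if it differs
    restorecon -n -v "$plugin" || true
    checked=yes
else
    checked=no
fi
echo "label check performed: $checked"
```

If the dry run reports a change, running the same command without -n would apply the policy's default label to the file.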

Re: Getting error "Return code of 126" when running check

Posted: Fri Mar 13, 2015 10:51 am
by jdalrymple
Couldn't be happier that was the issue, because honestly, if it weren't that, you would have had me scratching my head.

Going to go ahead and lock this one now. Thanks again.