Re: check works until it is scheduled and then it fails

Posted: Wed Feb 17, 2016 2:32 pm
by Bionic___
I finally found it.

Code: Select all

define service {
       name                          		xiwizard_website_dnsip_service
       use                           		xiwizard_generic_service
       check_command                 		check_xi_service_dns
       register                    		0

}	
Permissions are:

Code: Select all

-rwxr-xr-x 1 nagios nagios   3821 Feb 11 06:58 check_website

Re: check works until it is scheduled and then it fails

Posted: Thu Feb 18, 2016 11:06 am
by Bionic___
I found it and posted the template in my Wed Feb 17, 2016 1:47 pm posting.

Code: Select all

define service {
       name                                xiwizard_website_dnsip_service
       use                                 xiwizard_generic_service
       check_command                       check_xi_service_dns
       register                          0

}   
Permissions are

Code: Select all

-rwxr-xr-x 1 nagios nagios   3821 Feb 11 06:58 check_website
The problem persists.

Re: check works until it is scheduled and then it fails

Posted: Thu Feb 18, 2016 11:32 am
by tgriep
Try running the following to change the permissions and ownership on the file to see if that resolves the issue.

Code: Select all

chmod 775 /usr/local/nagios/libexec/check_website
chown apache.nagios /usr/local/nagios/libexec/check_website
If it still fails, please run the following and post the output.

Code: Select all

grep nag /etc/group
grep -R check_website /usr/local/nagios/etc/*
su nagios
/usr/local/nagios/libexec/check_website -w 1500 -c 4000 -t 6000 -s -u /_layouts/15/Sp.Login.Custom/Login.aspx\?ReturnUrl=%2f_layouts%2f15%2fAuthenticate.aspx%3fSource%3d%252F portal.dir.texas.gov
Thanks.

Re: check works until it is scheduled and then it fails

Posted: Thu Feb 18, 2016 11:35 am
by lmiltchev
Can you also show the "check_xi_service_dns" command definition?

Re: check works until it is scheduled and then it fails

Posted: Thu Feb 18, 2016 2:24 pm
by Bionic___
Remember, it always works from the command line but not from a schedule.
I executed all the commands and here is the output:

Code: Select all

root@dira4avbaza /usr/local/nagios/libexec> grep nag /etc/group
nagios:x:52001:nagios,apache
nagcmd:x:52002:nagios,apache
root@dira4avbaza /usr/local/nagios/libexec> grep -R check_website /usr/local/nagios/etc/*
/usr/local/nagios/etc/commands.cfg:       command_name                                  check_website
/usr/local/nagios/etc/commands.cfg:       command_line                                  $USER1$/check_website -w $ARG1$ -c $ARG2$ -t $ARG3$ $ARG4$ -u $ARG5$ $ARG6$
/usr/local/nagios/etc/services/NetPlus Web.cfg: check_command                   check_website!1500!4000!6000!!/netplus6/!168.44.248.26!!
/usr/local/nagios/etc/services/http_dir.texas.gov.cfg:  check_command                   check_website!1500!4000!6000!!/!dir.texas.gov!!
/usr/local/nagios/etc/services/ROD Web DEV Analytics simple.cfg:        check_command                   check_website!2000!4000!6000!!/BOE/BI!texasdir-dev-ana.onbmc.com!!
/usr/local/nagios/etc/services/http_dir.texas.gov_resources.cfg:        check_command                   check_website!1500!4000!6000!!/View-Resources/Landing.aspx!dir.texas.gov!!
/usr/local/nagios/etc/services/ROD Web DEV simple.cfg:  check_command                   check_website!2000!4000!6000!!/!texasdir-dev.onbmc.com!!
/usr/local/nagios/etc/services/ROD Web PROD Analytics simple.cfg:       check_command                   check_website!2000!4000!6000!!/BOE/BI!texasdir-ana.onbmc.com!!
/usr/local/nagios/etc/services/ROD Web QA simple.cfg:   check_command                   check_website!2000!4000!6000!!/!texasdir-qa.onbmc.com!!
/usr/local/nagios/etc/services/http_dir.texas.gov_dirpub.cfg:   check_command                   check_website!1500!4000!6000!!/!dirpub.dir.texas.gov!!
/usr/local/nagios/etc/services/http_dir.texas.gov_portal.cfg:   check_command                   check_website!1500!4000!6000!-s!/_layouts/15/Sp.Login.Custom/Login.aspx?ReturnUrl=%2f_layouts%2f15%2fAuthenticate.aspx%3fSource%3d%252F&Source=%2F!portal.dir.texas.gov!!
/usr/local/nagios/etc/services/http_dir.texas.gov_vendor.cfg:   check_command                   check_website!1500!4000!6000!!/View-Information-For-Vendors/Landing.aspx!dir.texas.gov!!
/usr/local/nagios/etc/services/ROD Web PROD simple.cfg: check_command                   check_website!2000!4000!6000!!/!texasdir.onbmc.com!!
root@dira4avbaza /usr/local/nagios/libexec> su nagios
nagios@dira4avbaza /usr/local/nagios/libexec> /usr/local/nagios/libexec/check_website -w 1500 -c 4000 -t 6000 -s -u /_layouts/15/Sp.Login.Custom/Login.aspx\?ReturnUrl=%2f_layouts%2f15%2fAuthenticate.aspx%3fSource%3d%252F portal.dir.texas.gov
HTTPS CRITICAL: 8090ms - https://portal.dir.texas.gov/_layouts/15/Sp.Login.Custom/Login.aspx?ReturnUrl=%2f_layouts%2f15%2fAuthenticate.aspx%3fSource%3d%252F|time=8090ms;1500;4000;0;
nagios@dira4avbaza /usr/local/nagios/libexec>
It still does not work from schedule. I still get:

Code: Select all

(Return code of 127 is out of bounds - plugin may be missing)
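
For reference, here is a rough sketch of how the check_command line for the portal service appears to expand into the command_line from commands.cfg. It assumes $USER1$ is /usr/local/nagios/libexec (the path used in the manual test) and only imitates the $ARGn$ substitution in plain shell; it is not how Nagios itself is implemented.

```shell
#!/bin/sh
# check_command value copied from http_dir.texas.gov_portal.cfg in the grep output.
check_command='check_website!1500!4000!6000!-s!/_layouts/15/Sp.Login.Custom/Login.aspx?ReturnUrl=%2f_layouts%2f15%2fAuthenticate.aspx%3fSource%3d%252F&Source=%2F!portal.dir.texas.gov!!'

# Assumption: $USER1$ points at the plugin directory used in the manual test.
user1='/usr/local/nagios/libexec'

# Split on '!' into positional parameters; set -f keeps '?' from being globbed.
IFS='!'
set -f
set -- $check_command
set +f
unset IFS

shift   # drop the command name; $1..$6 now play the role of $ARG1$..$ARG6$

# Same layout as the command_line: -w $ARG1$ -c $ARG2$ -t $ARG3$ $ARG4$ -u $ARG5$ $ARG6$
expanded="$user1/check_website -w $1 -c $2 -t $3 $4 -u $5 $6"
echo "$expanded"
```

The expanded line carries the URL unquoted, including the `&Source=%2F` part, which the command typed by hand at the prompt did not include.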

Re: check works until it is scheduled and then it fails

Posted: Thu Feb 18, 2016 3:01 pm
by Bionic___
check_xi_service_dns command is configured as:

Code: Select all

define command {
       command_name                  		check_xi_service_dns
       command_line                  		$USER1$/check_dns -H $HOSTADDRESS$ $ARG1$
}	

Re: check works until it is scheduled and then it fails

Posted: Thu Feb 18, 2016 3:29 pm
by lmiltchev
Go to the CCM->Services->click on the "problem" service, and show us a screenshot of the page (under the "Common Settings" tab).

Re: check works until it is scheduled and then it fails

Posted: Thu Feb 18, 2016 4:01 pm
by Bionic___
It is attached.

And when I use Test Check Command, this is the result:

Code: Select all

COMMAND: /usr/local/nagios/libexec/check_website -w 1500 -c 4000 -t 6000 -s -u /_layouts/15/Sp.Login.Custom/Login.aspx\?ReturnUrl=%2f_layouts%2f15%2fAuthenticate.aspx%3fSource%3d%252F portal.dir.texas.gov 
OUTPUT: HTTPS CRITICAL: 5755ms - https://portal.dir.texas.gov/_layouts/15/Sp.Login.Custom/Login.aspx?ReturnUrl=%2f_layouts%2f15%2fAuthenticate.aspx%3fSource%3d%252F|time=5755ms;1500;4000;0;
And this is the result of the scheduled check:

Code: Select all

portal.dir.texas.gov    Web - dir.texas.gov Portal     Critical 	1h 42m 18s 	5/5 	2016-02-18 14:54:09 	(Return code of 127 is out of bounds - plugin may be missing)
It is a puzzlement.

Re: check works until it is scheduled and then it fails

Posted: Thu Feb 18, 2016 6:02 pm
by Box293
I can see a problem in your setup that can cause the failure you are seeing.

The problem is the use of some special characters in the service check AND using the "Test Check Command" button. Due to some issues with how PHP escapes characters, the "Test Check Command" does not work in these situations and should be ignored.

Even when you get it working with the "Test Check Command", the way that button escapes characters means it is not running the same command that Nagios Core runs for the actual check.
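
To illustrate why special characters matter, here is a minimal sketch. It assumes the scheduled check is handed to /bin/sh as one unquoted string, and it uses a hypothetical stub called fake_check and a shortened URL in place of the real plugin and path. The `&Source=%2F` fragment and the host name are taken from the service definition posted earlier. An unescaped `&` ends the command, so the shell then tries to run the host name as a program and returns 127, which is one way to end up with the "out of bounds" message.

```shell
#!/bin/sh
# Hypothetical stand-in for a plugin: prints its arguments and exits 0 (OK).
workdir=$(mktemp -d)
cat > "$workdir/fake_check" <<'EOF'
#!/bin/sh
echo "args: $*"
exit 0
EOF
chmod +x "$workdir/fake_check"

# Unescaped '&': the shell backgrounds fake_check, then treats "Source=%2F"
# as a variable assignment and "portal.dir.texas.gov" as the command to run.
sh -c "$workdir/fake_check -u /Login.aspx?ReturnUrl=%2f&Source=%2F portal.dir.texas.gov" >/dev/null 2>&1
rc_unescaped=$?

# Quoted URL: the whole string reaches the plugin as a single argument.
sh -c "$workdir/fake_check -u '/Login.aspx?ReturnUrl=%2f&Source=%2F' portal.dir.texas.gov" >/dev/null 2>&1
rc_quoted=$?

echo "unescaped: $rc_unescaped, quoted: $rc_quoted"
rm -rf "$workdir"
```

Whether this is what happens on your system depends on how the arguments are quoted in the service definition, so treat it as something to rule out rather than a diagnosis.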

So for all further testing of this service you need to:

1. Make the changes to the service
2. Save the Service
3. Apply Configuration
4. Go back to the home screen and find the Service
5. When viewing the Service Status Details page click the Schedule a forced immediate check link

Just to re-iterate, for all further testing for this service DO NOT use the "Test Check Command" button, follow the steps above.
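
If you prefer the shell over the web interface for the last step, a forced check can also be queued through the Nagios external command file. This is only a sketch: the command file path is the usual default and is an assumption here, and the host name and service description are taken from the status line posted above and must match your configuration exactly.

```shell
#!/bin/sh
# Assumptions: host/service names as shown in the status output, default command file path.
host="portal.dir.texas.gov"
service="Web - dir.texas.gov Portal"
cmdfile="/usr/local/nagios/var/rw/nagios.cmd"

now=$(date +%s)
# External command format: [time] SCHEDULE_FORCED_SVC_CHECK;<host>;<service>;<check_time>
line="[$now] SCHEDULE_FORCED_SVC_CHECK;$host;$service;$now"

if [ -p "$cmdfile" ] && [ -w "$cmdfile" ]; then
    # The command file is a named pipe that Nagios reads.
    printf '%s\n' "$line" > "$cmdfile"
else
    echo "command file not available; would have sent: $line"
fi
```

Either way, the check is then run by Nagios Core itself rather than by the "Test Check Command" button.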

My first suggestion is to test from the command line first as the nagios user:

Code: Select all

su nagios
Once you have it working at the command line, we can then build that into a service that works. Post your working command here.

Re: check works until it is scheduled and then it fails

Posted: Fri Feb 19, 2016 8:16 am
by Bionic___
Thank you for the help. I am sorry if I was not clear.
When I run the command on the server command line it works with no problem.
When I run the command from the Test Check Command it still works with no problem.
When I schedule the command it fails.
I followed your instructions.
Make the changes to the service
Save the Service
Apply Configuration
Go back to the home screen and find the Service
When viewing the Service Status Details page click the Schedule a forced immediate check link
It still fails
Results:

Code: Select all

(Return code of 127 is out of bounds - plugin may be missing)
I am stumped.