Re: Schedule a forced immediate check not working

Posted: Wed Mar 25, 2015 6:21 pm
by rajasegar
tgriep wrote:For a test, can you disable the gearman worker that is running on the XI server so the XI server will do the local checks instead of the gearman worker?
This may increase the performance of the XI server.
We added Gearman because the local server had issues handling the scheduling.
We are not taking any chances on the production box sending out invalid alerts because of this.

Re: Schedule a forced immediate check not working

Posted: Thu Mar 26, 2015 10:11 am
by jolson
Try navigating to Service Details -> (select a service) -> Advanced, and then select the link "See this service in Nagios Core." See if you can access that url and submit the command correctly/quickly. Core may output more verbose errors.

Re: Schedule a forced immediate check not working

Posted: Sun Apr 05, 2015 11:55 pm
by rajasegar
jolson wrote:Try navigating to Service Details -> (select a service) -> Advanced, and then select the link "See this service in Nagios Core." See if you can access that url and submit the command correctly/quickly. Core may output more verbose errors.
No error messages.

Does anyone have a clue? This happens often, and we have to restart the Gearman services to fix it.
Just restarting Nagios won't solve it.

Re: Schedule a forced immediate check not working

Posted: Mon Apr 06, 2015 11:10 am
by abrist
This is very odd. Gearman should not affect command.cgi at all, because when you submit a command it is done through an AJAX call to the command CGI. You can even see this behavior by watching the requests in your browser console/network log. Do you see any errors in the Apache logs at /var/log/httpd/error_log?

Re: Schedule a forced immediate check not working

Posted: Mon Apr 06, 2015 6:09 pm
by rajasegar
abrist wrote:This is very odd. Gearman should not affect command.cgi at all, because when you submit a command it is done through an AJAX call to the command CGI. You can even see this behavior by watching the requests in your browser console/network log. Do you see any errors in the Apache logs at /var/log/httpd/error_log?
See the attached log. There are a bunch of errors in it.
This happened yesterday, 06/04/2015, between about 12:30 and 12:55 pm.

Code: Select all

[Mon Apr 06 12:22:42 2015] [error] [client 10.17.44.34] PHP Notice:  Undefined variable: ac_needed_js_inject in /usr/local/nagiosxi/html/includes/components/ccm/page_templates/ccm_table.php on line 176, referer: http://10.17.19.235/nagiosxi/includes/components/ccm/xi-index.php
[Mon Apr 06 12:22:42 2015] [error] [client 10.17.44.34] PHP Notice:  Undefined variable: sync_table_status in /usr/local/nagiosxi/html/includes/components/ccm/page_templates/ccm_table.php on line 196, referer: http://10.17.19.235/nagiosxi/includes/components/ccm/xi-index.php
[Mon Apr 06 12:23:39 2015] [error] [client 10.17.44.34] PHP Notice:  Undefined variable: ac_needed_js_inject in /usr/local/nagiosxi/html/includes/components/ccm/page_templates/ccm_table.php on line 176, referer: http://10.17.19.235/nagiosxi/includes/components/ccm/?cmd=view&type=service
[Mon Apr 06 12:23:39 2015] [error] [client 10.17.44.34] PHP Notice:  Undefined variable: sync_table_status in /usr/local/nagiosxi/html/includes/components/ccm/page_templates/ccm_table.php on line 196, referer: http://10.17.19.235/nagiosxi/includes/components/ccm/?cmd=view&type=service
[Mon Apr 06 12:23:43 2015] [error] [client 10.17.44.34] PHP Notice:  Undefined index: template_name in /usr/local/nagiosxi/html/includes/components/ccm/page_templates/common_settings.php on line 53, referer: http://10.17.19.235/nagiosxi/includes/components/ccm/index.php
[Mon Apr 06 12:24:09 2015] [error] [client 10.17.44.34] PHP Notice:  Undefined index: template_name in /usr/local/nagiosxi/html/includes/components/ccm/page_templates/common_settings.php on line 53, referer: http://10.17.19.235/nagiosxi/includes/components/ccm/index.php?type=service&page=1
[Mon Apr 06 12:25:01 2015] [error] [client 10.17.44.34] PHP Notice:  Undefined variable: ac_needed_js_inject in /usr/local/nagiosxi/html/includes/components/ccm/page_templates/ccm_table.php on line 176, referer: http://10.17.19.235/nagiosxi/includes/components/ccm/index.php?type=service&page=1
[Mon Apr 06 12:25:01 2015] [error] [client 10.17.44.34] PHP Notice:  Undefined variable: sync_table_status in /usr/local/nagiosxi/html/includes/components/ccm/page_templates/ccm_table.php on line 196, referer: http://10.17.19.235/nagiosxi/includes/components/ccm/index.php?type=service&page=1
[Mon Apr 06 12:25:32 2015] [error] [client 10.17.44.34] PHP Notice:  Undefined index: template_name in /usr/local/nagiosxi/html/includes/components/ccm/page_templates/common_settings.php on line 53, referer: http://10.17.19.235/nagiosxi/includes/components/ccm/index.php
[Mon Apr 06 12:25:47 2015] [error] [client 10.17.44.34] PHP Notice:  Undefined index: template_name in /usr/local/nagiosxi/html/includes/components/ccm/page_templates/common_settings.php on line 53, referer: http://10.17.19.235/nagiosxi/includes/components/ccm/index.php?type=service&page=1
[Mon Apr 06 12:26:25 2015] [error] [client 10.17.44.34] PHP Notice:  Undefined variable: ac_needed_js_inject in /usr/local/nagiosxi/html/includes/components/ccm/page_templates/ccm_table.php on line 176, referer: http://10.17.19.235/nagiosxi/includes/components/ccm/index.php?type=service&page=1
[Mon Apr 06 12:26:25 2015] [error] [client 10.17.44.34] PHP Notice:  Undefined variable: sync_table_status in /usr/local/nagiosxi/html/includes/components/ccm/page_templates/ccm_table.php on line 196, referer: http://10.17.19.235/nagiosxi/includes/components/ccm/index.php?type=service&page=1
[Mon Apr 06 12:39:59 2015] [error] [client ::1] PHP Notice:  Undefined index: language in /usr/local/nagiosxi/html/includes/components/ccm/includes/common_functions.inc.php on line 710
[Mon Apr 06 12:39:59 2015] [error] [client ::1] PHP Notice:  Undefined index: language in /usr/local/nagiosxi/html/includes/components/ccm/includes/common_functions.inc.php on line 711
[Mon Apr 06 12:40:15 2015] [error] [client 10.17.38.5] PHP Warning:  Division by zero in /usr/local/nagiosxi/html/includes/components/opscreen/merlin.php on line 142, referer: http://10.17.19.235/nagiosxi/includes/components/opscreen/opscreen.php
[Mon Apr 06 12:40:15 2015] [error] [client 10.17.38.5] PHP Warning:  Division by zero in /usr/local/nagiosxi/html/includes/components/opscreen/merlin.php on line 144, referer: http://10.17.19.235/nagiosxi/includes/components/opscreen/opscreen.php
[Mon Apr 06 12:40:15 2015] [error] [client 10.17.38.5] PHP Warning:  Division by zero in /usr/local/nagiosxi/html/includes/components/opscreen/merlin.php on line 185, referer: http://10.17.19.235/nagiosxi/includes/components/opscreen/opscreen.php
[Mon Apr 06 12:40:15 2015] [error] [client 10.17.38.5] PHP Warning:  Division by zero in /usr/local/nagiosxi/html/includes/components/opscreen/merlin.php on line 187, referer: http://10.17.19.235/nagiosxi/includes/components/opscreen/opscreen.php
[Mon Apr 06 12:40:23 2015] [error] [client 10.17.44.60] PHP Warning:  Division by zero in /usr/local/nagiosxi/html/includes/components/opscreen/merlin.php on line 142, referer: http://10.17.19.235/nagiosxi/includes/components/opscreen/opscreen.php
[Mon Apr 06 12:40:23 2015] [error] [client 10.17.44.60] PHP Warning:  Division by zero in /usr/local/nagiosxi/html/includes/components/opscreen/merlin.php on line 144, referer: http://10.17.19.235/nagiosxi/includes/components/opscreen/opscreen.php
[Mon Apr 06 12:40:23 2015] [error] [client 10.17.44.60] PHP Warning:  Division by zero in /usr/local/nagiosxi/html/includes/components/opscreen/merlin.php on line 185, referer: http://10.17.19.235/nagiosxi/includes/components/opscreen/opscreen.php
[Mon Apr 06 12:40:23 2015] [error] [client 10.17.44.60] PHP Warning:  Division by zero in /usr/local/nagiosxi/html/includes/components/opscreen/merlin.php on line 187, referer: http://10.17.19.235/nagiosxi/includes/components/opscreen/opscreen.php
[Mon Apr 06 12:40:24 2015] [error] [client 10.17.44.34] PHP Warning:  arsort() expects parameter 1 to be array, null given in /usr/local/nagiosxi/html/includes/components/latestalerts/latestalerts.inc.php on line 517, referer: http://10.17.19.235/nagiosxi//includes/page-home-main.php?&=
[Mon Apr 06 12:40:24 2015] [error] [client 10.17.44.34] PHP Warning:  Invalid argument supplied for foreach() in /usr/local/nagiosxi/html/includes/components/latestalerts/latestalerts.inc.php on line 518, referer: http://10.17.19.235/nagiosxi//includes/page-home-main.php?&=
[Mon Apr 06 12:40:24 2015] [error] [client 10.17.44.34] PHP Notice:  Undefined variable: c in /usr/local/nagiosxi/html/includes/components/latestalerts/latestalerts.inc.php on line 521, referer: http://10.17.19.235/nagiosxi//includes/page-home-main.php?&=
[Mon Apr 06 12:40:26 2015] [error] [client 10.17.38.5] PHP Warning:  Division by zero in /usr/local/nagiosxi/html/includes/components/opscreen/merlin.php on line 142, referer: http://10.17.19.235/nagiosxi/includes/components/opscreen/opscreen.php
[Mon Apr 06 12:40:26 2015] [error] [client 10.17.38.5] PHP Warning:  Division by zero in /usr/local/nagiosxi/html/includes/components/opscreen/merlin.php on line 144, referer: http://10.17.19.235/nagiosxi/includes/components/opscreen/opscreen.php
[Mon Apr 06 12:40:26 2015] [error] [client 10.17.38.5] PHP Warning:  Division by zero in /usr/local/nagiosxi/html/includes/components/opscreen/merlin.php on line 185, referer: http://10.17.19.235/nagiosxi/includes/components/opscreen/opscreen.php

error.zip
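
A long excerpt like this is easier to triage when the entries are counted by severity and source file. The following is a minimal sketch, assuming the Apache error log line format shown in the excerpt above; the function name summarize_php_errors is made up for illustration and is not part of Nagios XI.

```shell
# Hypothetical helper: count PHP notices/warnings per source file in an
# Apache error log whose lines look like the excerpt above.
# Usage: summarize_php_errors /var/log/httpd/error_log
summarize_php_errors() {
    awk '
        match($0, /PHP (Notice|Warning):/) {
            # Strip the leading "PHP " and the trailing ":" from the match.
            level = substr($0, RSTART + 4, RLENGTH - 5)
            file = "unknown"
            # The source file sits between " in " and " on line ".
            if (match($0, / in \/[^ ]+ on line /))
                file = substr($0, RSTART + 4, RLENGTH - 13)
            count[level " " file]++
        }
        END { for (k in count) print count[k], k }
    ' "$1" | sort -k1,1nr -k2
}
```

Each output line is a count, the severity, and the PHP file that raised it, with the most frequent first. Every entry in the excerpt is a PHP notice or warning from the CCM, opscreen, or latestalerts components, and none of them names command.cgi.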

Re: Schedule a forced immediate check not working

Posted: Tue Apr 07, 2015 1:23 pm
by abrist
If you can submit commands from core, but not XI, you may have an issue with the cmdsubsys cron. What is the output of:

Code: Select all

ps -aef | grep cron
service crond status
cat /etc/cron.d/nagiosxi
tail -25 /var/log/cron
ls -ld /home/nagios
grep nag /etc/group

Re: Schedule a forced immediate check not working

Posted: Tue Apr 07, 2015 7:31 pm
by rajasegar
abrist wrote:If you can submit commands from core, but not XI, you may have an issue with the cmdsubsys cron. What is the output of:

Code: Select all

ps -aef | grep cron
service crond status
cat /etc/cron.d/nagiosxi
tail -25 /var/log/cron
ls -ld /home/nagios
grep nag /etc/group
It was the same problem from core.
Here is the information requested.

Code: Select all

[root@nagiosprodxi1 ~]# ps -aef | grep cron
root      2535     1  0 Mar31 ?        00:00:47 crond
nagios   11532 11513  0 08:29 ?        00:00:00 /bin/sh -c /usr/bin/php -q /usr/local/nagiosxi/cron/perfdataproc.php > /usr/local/nagiosxi/var/perfdataproc.log 2>&1
nagios   11533 11519  0 08:29 ?        00:00:00 /bin/sh -c /usr/bin/php -q /usr/local/nagiosxi/cron/sysstat.php > /usr/local/nagiosxi/var/sysstat.log 2>&1
nagios   11535 11532  1 08:29 ?        00:00:00 /usr/bin/php -q /usr/local/nagiosxi/cron/perfdataproc.php
nagios   11536 11533  1 08:29 ?        00:00:00 /usr/bin/php -q /usr/local/nagiosxi/cron/sysstat.php
nagios   11538 11514  0 08:29 ?        00:00:00 /bin/sh -c /usr/bin/php -q /usr/local/nagiosxi/cron/feedproc.php > /usr/local/nagiosxi/var/feedproc.log 2>&1
nagios   11539 11516  0 08:29 ?        00:00:00 /bin/sh -c /usr/bin/php -q /usr/local/nagiosxi/cron/cmdsubsys.php > /usr/local/nagiosxi/var/cmdsubsys.log 2>&1
nagios   11541 11515  0 08:29 ?        00:00:00 /bin/sh -c /usr/bin/php -q /usr/local/nagiosxi/cron/eventman.php > /usr/local/nagiosxi/var/eventman.log 2>&1
nagios   11545 11541  1 08:29 ?        00:00:00 /usr/bin/php -q /usr/local/nagiosxi/cron/eventman.php
nagios   11546 11538  1 08:29 ?        00:00:00 /usr/bin/php -q /usr/local/nagiosxi/cron/feedproc.php
nagios   11562 11539  1 08:29 ?        00:00:00 /usr/bin/php -q /usr/local/nagiosxi/cron/cmdsubsys.php
nagios   23085  7042  2 08:29 ?        00:00:00 /usr/local/nagios/libexec/check_nrpe -u -H 172.29.37.89 -t 60 -c rs_check_service -a cron
root     23092 22252  0 08:29 pts/0    00:00:00 grep cron
[root@nagiosprodxi1 ~]# service crond status
crond (pid  2535) is running...
[root@nagiosprodxi1 ~]# cat /etc/cron.d/nagiosxi
# /etc/cron.d/nagiosxi: crontab fragment for nagiosxi

# Backup MySQL & PostgreSQL Databases
0   7 * * * root   /root/scripts/automysqlbackup
0   8 * * * root   /root/scripts/autopostgresqlbackup

*   * * * * nagios /usr/bin/php -q /usr/local/nagiosxi/cron/sysstat.php > /usr/local/nagiosxi/var/sysstat.log 2>&1
*   * * * * nagios /usr/bin/php -q /usr/local/nagiosxi/cron/cmdsubsys.php > /usr/local/nagiosxi/var/cmdsubsys.log 2>&1
*   * * * * nagios /usr/bin/php -q /usr/local/nagiosxi/cron/eventman.php > /usr/local/nagiosxi/var/eventman.log 2>&1
*   * * * * nagios /usr/bin/php -q /usr/local/nagiosxi/cron/feedproc.php > /usr/local/nagiosxi/var/feedproc.log 2>&1
*   * * * * nagios /usr/bin/php -q /usr/local/nagiosxi/cron/perfdataproc.php > /usr/local/nagiosxi/var/perfdataproc.log 2>&1
*   * * * * nagios /usr/bin/php -q /usr/local/nagiosxi/cron/nom.php > /usr/local/nagiosxi/var/nom.log 2>&1
*   * * * * nagios /usr/bin/php -q /usr/local/nagiosxi/cron/reportengine.php > /usr/local/nagiosxi/var/reportengine.log 2>&1
*/5 * * * * nagios /usr/bin/php -q /usr/local/nagiosxi/cron/dbmaint.php > /usr/local/nagiosxi/var/dbmaint.log 2>&1
*   * * * * nagios /usr/bin/php -q /usr/local/nagiosxi/cron/cleaner.php > /usr/local/nagiosxi/var/cleaner.log 2>&1
01  * * * * nagios /usr/local/nagiosxi/cron/recurringdowntime.pl > /usr/local/nagiosxi/var/recurringdowntime.log 2>&1
*/5 * * * * nagios /usr/bin/php -q /usr/local/nagiosxi/cron/deadpool.php > /usr/local/nagiosxi/var/deadpool.log 2>&1

[root@nagiosprodxi1 ~]# tail -25 /var/log/cron
Apr  8 08:26:01 nagiosprodxi1 CROND[10511]: (nagios) CMD (/usr/bin/php -q /usr/local/nagiosxi/cron/perfdataproc.php > /usr/local/nagiosxi/var/perfdataproc.log 2>&1)
Apr  8 08:27:01 nagiosprodxi1 CROND[26685]: (nagios) CMD (/usr/bin/php -q /usr/local/nagiosxi/cron/eventman.php > /usr/local/nagiosxi/var/eventman.log 2>&1)
Apr  8 08:27:01 nagiosprodxi1 CROND[26686]: (nagios) CMD (/usr/bin/php -q /usr/local/nagiosxi/cron/perfdataproc.php > /usr/local/nagiosxi/var/perfdataproc.log 2>&1)
Apr  8 08:27:01 nagiosprodxi1 CROND[26688]: (nagios) CMD (/usr/bin/php -q /usr/local/nagiosxi/cron/reportengine.php > /usr/local/nagiosxi/var/reportengine.log 2>&1)
Apr  8 08:27:01 nagiosprodxi1 CROND[26694]: (nagios) CMD (/usr/bin/php -q /usr/local/nagiosxi/cron/sysstat.php > /usr/local/nagiosxi/var/sysstat.log 2>&1)
Apr  8 08:27:01 nagiosprodxi1 CROND[26695]: (nagios) CMD (/usr/bin/php -q /usr/local/nagiosxi/cron/nom.php > /usr/local/nagiosxi/var/nom.log 2>&1)
Apr  8 08:27:01 nagiosprodxi1 CROND[26693]: (nagios) CMD (/usr/bin/php -q /usr/local/nagiosxi/cron/cmdsubsys.php > /usr/local/nagiosxi/var/cmdsubsys.log 2>&1)
Apr  8 08:27:01 nagiosprodxi1 CROND[26699]: (nagios) CMD (/usr/bin/php -q /usr/local/nagiosxi/cron/feedproc.php > /usr/local/nagiosxi/var/feedproc.log 2>&1)
Apr  8 08:27:01 nagiosprodxi1 CROND[26698]: (nagios) CMD (/usr/bin/php -q /usr/local/nagiosxi/cron/cleaner.php > /usr/local/nagiosxi/var/cleaner.log 2>&1)
Apr  8 08:28:01 nagiosprodxi1 CROND[28272]: (nagios) CMD (/usr/bin/php -q /usr/local/nagiosxi/cron/cleaner.php > /usr/local/nagiosxi/var/cleaner.log 2>&1)
Apr  8 08:28:01 nagiosprodxi1 CROND[28273]: (nagios) CMD (/usr/bin/php -q /usr/local/nagiosxi/cron/reportengine.php > /usr/local/nagiosxi/var/reportengine.log 2>&1)
Apr  8 08:28:01 nagiosprodxi1 CROND[28274]: (nagios) CMD (/usr/bin/php -q /usr/local/nagiosxi/cron/eventman.php > /usr/local/nagiosxi/var/eventman.log 2>&1)
Apr  8 08:28:01 nagiosprodxi1 CROND[28271]: (nagios) CMD (/usr/bin/php -q /usr/local/nagiosxi/cron/cmdsubsys.php > /usr/local/nagiosxi/var/cmdsubsys.log 2>&1)
Apr  8 08:28:01 nagiosprodxi1 CROND[28276]: (nagios) CMD (/usr/bin/php -q /usr/local/nagiosxi/cron/feedproc.php > /usr/local/nagiosxi/var/feedproc.log 2>&1)
Apr  8 08:28:01 nagiosprodxi1 CROND[28277]: (nagios) CMD (/usr/bin/php -q /usr/local/nagiosxi/cron/nom.php > /usr/local/nagiosxi/var/nom.log 2>&1)
Apr  8 08:28:01 nagiosprodxi1 CROND[28278]: (nagios) CMD (/usr/bin/php -q /usr/local/nagiosxi/cron/sysstat.php > /usr/local/nagiosxi/var/sysstat.log 2>&1)
Apr  8 08:28:01 nagiosprodxi1 CROND[28275]: (nagios) CMD (/usr/bin/php -q /usr/local/nagiosxi/cron/perfdataproc.php > /usr/local/nagiosxi/var/perfdataproc.log 2>&1)
Apr  8 08:29:01 nagiosprodxi1 CROND[11531]: (nagios) CMD (/usr/bin/php -q /usr/local/nagiosxi/cron/reportengine.php > /usr/local/nagiosxi/var/reportengine.log 2>&1)
Apr  8 08:29:01 nagiosprodxi1 CROND[11533]: (nagios) CMD (/usr/bin/php -q /usr/local/nagiosxi/cron/sysstat.php > /usr/local/nagiosxi/var/sysstat.log 2>&1)
Apr  8 08:29:01 nagiosprodxi1 CROND[11532]: (nagios) CMD (/usr/bin/php -q /usr/local/nagiosxi/cron/perfdataproc.php > /usr/local/nagiosxi/var/perfdataproc.log 2>&1)
Apr  8 08:29:01 nagiosprodxi1 CROND[11538]: (nagios) CMD (/usr/bin/php -q /usr/local/nagiosxi/cron/feedproc.php > /usr/local/nagiosxi/var/feedproc.log 2>&1)
Apr  8 08:29:01 nagiosprodxi1 CROND[11537]: (nagios) CMD (/usr/bin/php -q /usr/local/nagiosxi/cron/nom.php > /usr/local/nagiosxi/var/nom.log 2>&1)
Apr  8 08:29:01 nagiosprodxi1 CROND[11540]: (nagios) CMD (/usr/bin/php -q /usr/local/nagiosxi/cron/cleaner.php > /usr/local/nagiosxi/var/cleaner.log 2>&1)
Apr  8 08:29:01 nagiosprodxi1 CROND[11541]: (nagios) CMD (/usr/bin/php -q /usr/local/nagiosxi/cron/eventman.php > /usr/local/nagiosxi/var/eventman.log 2>&1)
Apr  8 08:29:01 nagiosprodxi1 CROND[11539]: (nagios) CMD (/usr/bin/php -q /usr/local/nagiosxi/cron/cmdsubsys.php > /usr/local/nagiosxi/var/cmdsubsys.log 2>&1)
[root@nagiosprodxi1 ~]# ls -ld /home/nagios
drwx------. 30 nagios users 4096 Apr  8 08:14 /home/nagios
[root@nagiosprodxi1 ~]# grep nag /etc/group
nagios:x:501:nagios,apache,my02390,mycp1z2k
nagcmd:x:502:nagios,apache,my02390,mycp1z2k
[root@nagiosprodxi1 ~]#
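
To confirm from the cron log that cmdsubsys.php is being started every minute, the launches can be counted per minute. This is only a sketch, assuming the syslog-style /var/log/cron format shown above; the function name cmdsubsys_runs is hypothetical.

```shell
# Hypothetical helper: list each minute in which cron started cmdsubsys.php,
# followed by the number of launches logged in that minute.
# Usage: cmdsubsys_runs /var/log/cron
cmdsubsys_runs() {
    awk '
        /CMD \(.*cron\/cmdsubsys\.php/ {
            # $1 = month, $2 = day, $3 = HH:MM:SS; keep only HH:MM.
            runs[$1 " " $2 " " substr($3, 1, 5)]++
        }
        END { for (k in runs) print k, runs[k] }
    ' "$1" | sort
}
```

A minute that is missing from the output, or a count other than 1, would point at cron rather than at the command subsystem itself. In the 25 lines posted above, cmdsubsys.php appears once each at 08:27, 08:28 and 08:29.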


Re: Schedule a forced immediate check not working

Posted: Wed Apr 08, 2015 4:09 pm
by abrist
Can you post the config for the user's associated contact? Go to the CCM --> Contacts --> click the disk icon next to the user's contact and post the relevant config block for the contact.

Re: Schedule a forced immediate check not working

Posted: Wed Apr 08, 2015 5:54 pm
by rajasegar
abrist wrote:Can you post the config for the user's associated contact? Go to the CCM --> Contacts --> click the disk icon next to the user's contact and post the relevant config block for the contact.
The people scheduling the immediate checks are Admin users in XI.
There is no contact with the same user name.
It had been working fine since day 1. About a month ago it became very slow, and it sometimes reports that the command failed to execute.

Code: Select all

define contact {
	contact_name                  		INFRA_MON_Raja_Segar
	alias                         		Raja Segar
	email                         		[email protected]
	pager                         		+60122345678
	use                           		generic-contact
	}	

Re: Schedule a forced immediate check not working

Posted: Thu Apr 09, 2015 12:12 pm
by tgriep
Can you run the following in a shell while you schedule an immediate check with the user account you are having problems with, and post the output back here?

Code: Select all

tail -f /usr/local/nagiosxi/var/cmdsubsys.log