Re: Schedule a forced immediate check not working
Posted: Wed Mar 25, 2015 6:21 pm
by rajasegar
tgriep wrote:For a test, can you disable the gearman worker that is running on the XI server so the XI server will do the local checks instead of the gearman worker?
This may increase the performance of the XI server.
We added Gearman because the local server had issues handling the scheduling.
We are not taking any chances on the production box sending out invalid alerts because of this.
Re: Schedule a forced immediate check not working
Posted: Thu Mar 26, 2015 10:11 am
by jolson
Try navigating to Service Details -> (select a service) -> Advanced, and then select the link "See this service in Nagios Core." See if you can access that url and submit the command correctly/quickly. Core may output more verbose errors.
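If the Core web page is inconclusive, the same request can also be written straight to Core's external command file, which takes both the XI interface and the browser out of the picture. A minimal sketch, assuming the command pipe lives at the default source-install path /usr/local/nagios/var/rw/nagios.cmd (check command_file in nagios.cfg); the helper name build_forced_check and the host/service names are placeholders, not anything from this system.

```shell
#!/bin/sh
# Build a Nagios Core external command line that forces an immediate
# service check. Arguments: host name, service description, optional epoch
# timestamp (defaults to the current time).
build_forced_check() {
    host=$1
    service=$2
    now=${3:-$(date +%s)}
    # Format: [time] SCHEDULE_FORCED_SVC_CHECK;<host>;<service>;<check_time>
    printf '[%s] SCHEDULE_FORCED_SVC_CHECK;%s;%s;%s\n' "$now" "$host" "$service" "$now"
}

# ASSUMPTION: default location of the external command pipe.
CMD_FILE=${CMD_FILE:-/usr/local/nagios/var/rw/nagios.cmd}

# Usage (as a user in the nagcmd group; "web01" and "HTTP" are placeholders):
#   build_forced_check "web01" "HTTP" > "$CMD_FILE"
```

If a check submitted this way is also slow or never runs, the web interface is unlikely to be the cause.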
Re: Schedule a forced immediate check not working
Posted: Sun Apr 05, 2015 11:55 pm
by rajasegar
jolson wrote:Try navigating to Service Details -> (select a service) -> Advanced, and then select the link "See this service in Nagios Core." See if you can access that url and submit the command correctly/quickly. Core may output more verbose errors.
No error messages.
Does anyone have any clue? This happens very often, and we have to restart the Gearman services to resolve it.
Just restarting Nagios won't solve it.
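Since restarting the Gearman services is what clears the problem, it may be worth looking at the job server's queues the next time checks stall, before restarting anything. A minimal sketch, assuming gearadmin (shipped with gearmand) is installed and the job server listens on the local default port; the helper name queues_without_workers is made up for illustration. It lists queues that hold jobs while no worker is registered for them.

```shell
#!/bin/sh
# Read `gearadmin --status` output on stdin. Each data line has four columns:
# function name, total jobs, running jobs, available workers.
# Print the queues that hold jobs but have no workers attached.
# The terminating "." line has only one field and is skipped.
queues_without_workers() {
    awk 'NF == 4 && $2 > 0 && $4 == 0 { print $1 }'
}

# Usage on the XI server (assumes gearadmin is in PATH):
#   gearadmin --status | queues_without_workers
```

An empty result while checks are stuck would point away from missing workers and toward something else in the chain.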
Re: Schedule a forced immediate check not working
Posted: Mon Apr 06, 2015 11:10 am
by abrist
This is very odd. Gearman should not affect command.cgi at all: when you submit a command, it is sent through an AJAX call to the command CGI. You can see this behavior by watching the requests in your browser console/network log. Do you see any errors in the Apache logs at /var/log/httpd/error_log?
Re: Schedule a forced immediate check not working
Posted: Mon Apr 06, 2015 6:09 pm
by rajasegar
abrist wrote:This is very odd. Gearman should not affect command.cgi at all: when you submit a command, it is sent through an AJAX call to the command CGI. You can see this behavior by watching the requests in your browser console/network log. Do you see any errors in the Apache logs at /var/log/httpd/error_log?
See the attached log; there are a bunch of errors in it.
This happened yesterday, 06/04/2015, between about 12:30 and 12:55 pm.
Code:
[Mon Apr 06 12:22:42 2015] [error] [client 10.17.44.34] PHP Notice: Undefined variable: ac_needed_js_inject in /usr/local/nagiosxi/html/includes/components/ccm/page_templates/ccm_table.php on line 176, referer: http://10.17.19.235/nagiosxi/includes/components/ccm/xi-index.php
[Mon Apr 06 12:22:42 2015] [error] [client 10.17.44.34] PHP Notice: Undefined variable: sync_table_status in /usr/local/nagiosxi/html/includes/components/ccm/page_templates/ccm_table.php on line 196, referer: http://10.17.19.235/nagiosxi/includes/components/ccm/xi-index.php
[Mon Apr 06 12:23:39 2015] [error] [client 10.17.44.34] PHP Notice: Undefined variable: ac_needed_js_inject in /usr/local/nagiosxi/html/includes/components/ccm/page_templates/ccm_table.php on line 176, referer: http://10.17.19.235/nagiosxi/includes/components/ccm/?cmd=view&type=service
[Mon Apr 06 12:23:39 2015] [error] [client 10.17.44.34] PHP Notice: Undefined variable: sync_table_status in /usr/local/nagiosxi/html/includes/components/ccm/page_templates/ccm_table.php on line 196, referer: http://10.17.19.235/nagiosxi/includes/components/ccm/?cmd=view&type=service
[Mon Apr 06 12:23:43 2015] [error] [client 10.17.44.34] PHP Notice: Undefined index: template_name in /usr/local/nagiosxi/html/includes/components/ccm/page_templates/common_settings.php on line 53, referer: http://10.17.19.235/nagiosxi/includes/components/ccm/index.php
[Mon Apr 06 12:24:09 2015] [error] [client 10.17.44.34] PHP Notice: Undefined index: template_name in /usr/local/nagiosxi/html/includes/components/ccm/page_templates/common_settings.php on line 53, referer: http://10.17.19.235/nagiosxi/includes/components/ccm/index.php?type=service&page=1
[Mon Apr 06 12:25:01 2015] [error] [client 10.17.44.34] PHP Notice: Undefined variable: ac_needed_js_inject in /usr/local/nagiosxi/html/includes/components/ccm/page_templates/ccm_table.php on line 176, referer: http://10.17.19.235/nagiosxi/includes/components/ccm/index.php?type=service&page=1
[Mon Apr 06 12:25:01 2015] [error] [client 10.17.44.34] PHP Notice: Undefined variable: sync_table_status in /usr/local/nagiosxi/html/includes/components/ccm/page_templates/ccm_table.php on line 196, referer: http://10.17.19.235/nagiosxi/includes/components/ccm/index.php?type=service&page=1
[Mon Apr 06 12:25:32 2015] [error] [client 10.17.44.34] PHP Notice: Undefined index: template_name in /usr/local/nagiosxi/html/includes/components/ccm/page_templates/common_settings.php on line 53, referer: http://10.17.19.235/nagiosxi/includes/components/ccm/index.php
[Mon Apr 06 12:25:47 2015] [error] [client 10.17.44.34] PHP Notice: Undefined index: template_name in /usr/local/nagiosxi/html/includes/components/ccm/page_templates/common_settings.php on line 53, referer: http://10.17.19.235/nagiosxi/includes/components/ccm/index.php?type=service&page=1
[Mon Apr 06 12:26:25 2015] [error] [client 10.17.44.34] PHP Notice: Undefined variable: ac_needed_js_inject in /usr/local/nagiosxi/html/includes/components/ccm/page_templates/ccm_table.php on line 176, referer: http://10.17.19.235/nagiosxi/includes/components/ccm/index.php?type=service&page=1
[Mon Apr 06 12:26:25 2015] [error] [client 10.17.44.34] PHP Notice: Undefined variable: sync_table_status in /usr/local/nagiosxi/html/includes/components/ccm/page_templates/ccm_table.php on line 196, referer: http://10.17.19.235/nagiosxi/includes/components/ccm/index.php?type=service&page=1
[Mon Apr 06 12:39:59 2015] [error] [client ::1] PHP Notice: Undefined index: language in /usr/local/nagiosxi/html/includes/components/ccm/includes/common_functions.inc.php on line 710
[Mon Apr 06 12:39:59 2015] [error] [client ::1] PHP Notice: Undefined index: language in /usr/local/nagiosxi/html/includes/components/ccm/includes/common_functions.inc.php on line 711
[Mon Apr 06 12:40:15 2015] [error] [client 10.17.38.5] PHP Warning: Division by zero in /usr/local/nagiosxi/html/includes/components/opscreen/merlin.php on line 142, referer: http://10.17.19.235/nagiosxi/includes/components/opscreen/opscreen.php
[Mon Apr 06 12:40:15 2015] [error] [client 10.17.38.5] PHP Warning: Division by zero in /usr/local/nagiosxi/html/includes/components/opscreen/merlin.php on line 144, referer: http://10.17.19.235/nagiosxi/includes/components/opscreen/opscreen.php
[Mon Apr 06 12:40:15 2015] [error] [client 10.17.38.5] PHP Warning: Division by zero in /usr/local/nagiosxi/html/includes/components/opscreen/merlin.php on line 185, referer: http://10.17.19.235/nagiosxi/includes/components/opscreen/opscreen.php
[Mon Apr 06 12:40:15 2015] [error] [client 10.17.38.5] PHP Warning: Division by zero in /usr/local/nagiosxi/html/includes/components/opscreen/merlin.php on line 187, referer: http://10.17.19.235/nagiosxi/includes/components/opscreen/opscreen.php
[Mon Apr 06 12:40:23 2015] [error] [client 10.17.44.60] PHP Warning: Division by zero in /usr/local/nagiosxi/html/includes/components/opscreen/merlin.php on line 142, referer: http://10.17.19.235/nagiosxi/includes/components/opscreen/opscreen.php
[Mon Apr 06 12:40:23 2015] [error] [client 10.17.44.60] PHP Warning: Division by zero in /usr/local/nagiosxi/html/includes/components/opscreen/merlin.php on line 144, referer: http://10.17.19.235/nagiosxi/includes/components/opscreen/opscreen.php
[Mon Apr 06 12:40:23 2015] [error] [client 10.17.44.60] PHP Warning: Division by zero in /usr/local/nagiosxi/html/includes/components/opscreen/merlin.php on line 185, referer: http://10.17.19.235/nagiosxi/includes/components/opscreen/opscreen.php
[Mon Apr 06 12:40:23 2015] [error] [client 10.17.44.60] PHP Warning: Division by zero in /usr/local/nagiosxi/html/includes/components/opscreen/merlin.php on line 187, referer: http://10.17.19.235/nagiosxi/includes/components/opscreen/opscreen.php
[Mon Apr 06 12:40:24 2015] [error] [client 10.17.44.34] PHP Warning: arsort() expects parameter 1 to be array, null given in /usr/local/nagiosxi/html/includes/components/latestalerts/latestalerts.inc.php on line 517, referer: http://10.17.19.235/nagiosxi//includes/page-home-main.php?&=
[Mon Apr 06 12:40:24 2015] [error] [client 10.17.44.34] PHP Warning: Invalid argument supplied for foreach() in /usr/local/nagiosxi/html/includes/components/latestalerts/latestalerts.inc.php on line 518, referer: http://10.17.19.235/nagiosxi//includes/page-home-main.php?&=
[Mon Apr 06 12:40:24 2015] [error] [client 10.17.44.34] PHP Notice: Undefined variable: c in /usr/local/nagiosxi/html/includes/components/latestalerts/latestalerts.inc.php on line 521, referer: http://10.17.19.235/nagiosxi//includes/page-home-main.php?&=
[Mon Apr 06 12:40:26 2015] [error] [client 10.17.38.5] PHP Warning: Division by zero in /usr/local/nagiosxi/html/includes/components/opscreen/merlin.php on line 142, referer: http://10.17.19.235/nagiosxi/includes/components/opscreen/opscreen.php
[Mon Apr 06 12:40:26 2015] [error] [client 10.17.38.5] PHP Warning: Division by zero in /usr/local/nagiosxi/html/includes/components/opscreen/merlin.php on line 144, referer: http://10.17.19.235/nagiosxi/includes/components/opscreen/opscreen.php
[Mon Apr 06 12:40:26 2015] [error] [client 10.17.38.5] PHP Warning: Division by zero in /usr/local/nagiosxi/html/includes/components/opscreen/merlin.php on line 185, referer: http://10.17.19.235/nagiosxi/includes/components/opscreen/opscreen.php
error.zip
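The lines in the excerpt are all PHP notices and warnings from the ccm, opscreen and latestalerts components, so a per-file count makes it easier to see whether anything else is hiding in the full log. A minimal sketch, assuming the Apache error_log line format shown above; the helper name summarize_php_errors is hypothetical.

```shell
#!/bin/sh
# Read an Apache error_log on stdin and count PHP messages per level and
# source file, most frequent first. Lines that are not PHP messages are
# dropped, so they have to be inspected separately.
summarize_php_errors() {
    sed -nE 's#.*PHP (Notice|Warning|Fatal error): .* in (/[^ ]+) on line.*#\1 \2#p' \
        | sort | uniq -c | sort -rn
}

# Usage:
#   summarize_php_errors < /var/log/httpd/error_log
# To see everything that is NOT a PHP message:
#   grep -v 'PHP ' /var/log/httpd/error_log
```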
Re: Schedule a forced immediate check not working
Posted: Tue Apr 07, 2015 1:23 pm
by abrist
If you can submit commands from core, but not XI, you may have an issue with the cmdsubsys cron. What is the output of:
Code:
ps -aef | grep cron
service crond status
cat /etc/cron.d/nagiosxi
tail -25 /var/log/cron
ls -ld /home/nagios
grep nag /etc/group
Re: Schedule a forced immediate check not working
Posted: Tue Apr 07, 2015 7:31 pm
by rajasegar
abrist wrote:If you can submit commands from core, but not XI, you may have an issue with the cmdsubsys cron. What is the output of:
Code:
ps -aef | grep cron
service crond status
cat /etc/cron.d/nagiosxi
tail -25 /var/log/cron
ls -ld /home/nagios
grep nag /etc/group
It was the same problem from core.
Here is the information requested.
Code:
[root@nagiosprodxi1 ~]# ps -aef | grep cron
root 2535 1 0 Mar31 ? 00:00:47 crond
nagios 11532 11513 0 08:29 ? 00:00:00 /bin/sh -c /usr/bin/php -q /usr/local/nagiosxi/cron/perfdataproc.php > /usr/local/nagiosxi/var/perfdataproc.log 2>&1
nagios 11533 11519 0 08:29 ? 00:00:00 /bin/sh -c /usr/bin/php -q /usr/local/nagiosxi/cron/sysstat.php > /usr/local/nagiosxi/var/sysstat.log 2>&1
nagios 11535 11532 1 08:29 ? 00:00:00 /usr/bin/php -q /usr/local/nagiosxi/cron/perfdataproc.php
nagios 11536 11533 1 08:29 ? 00:00:00 /usr/bin/php -q /usr/local/nagiosxi/cron/sysstat.php
nagios 11538 11514 0 08:29 ? 00:00:00 /bin/sh -c /usr/bin/php -q /usr/local/nagiosxi/cron/feedproc.php > /usr/local/nagiosxi/var/feedproc.log 2>&1
nagios 11539 11516 0 08:29 ? 00:00:00 /bin/sh -c /usr/bin/php -q /usr/local/nagiosxi/cron/cmdsubsys.php > /usr/local/nagiosxi/var/cmdsubsys.log 2>&1
nagios 11541 11515 0 08:29 ? 00:00:00 /bin/sh -c /usr/bin/php -q /usr/local/nagiosxi/cron/eventman.php > /usr/local/nagiosxi/var/eventman.log 2>&1
nagios 11545 11541 1 08:29 ? 00:00:00 /usr/bin/php -q /usr/local/nagiosxi/cron/eventman.php
nagios 11546 11538 1 08:29 ? 00:00:00 /usr/bin/php -q /usr/local/nagiosxi/cron/feedproc.php
nagios 11562 11539 1 08:29 ? 00:00:00 /usr/bin/php -q /usr/local/nagiosxi/cron/cmdsubsys.php
nagios 23085 7042 2 08:29 ? 00:00:00 /usr/local/nagios/libexec/check_nrpe -u -H 172.29.37.89 -t 60 -c rs_check_service -a cron
root 23092 22252 0 08:29 pts/0 00:00:00 grep cron
[root@nagiosprodxi1 ~]# service crond status
crond (pid 2535) is running...
[root@nagiosprodxi1 ~]# cat /etc/cron.d/nagiosxi
# /etc/cron.d/nagiosxi: crontab fragment for nagiosxi
# Backup MySQL & PostgreSQL Databases
0 7 * * * root /root/scripts/automysqlbackup
0 8 * * * root /root/scripts/autopostgresqlbackup
* * * * * nagios /usr/bin/php -q /usr/local/nagiosxi/cron/sysstat.php > /usr/local/nagiosxi/var/sysstat.log 2>&1
* * * * * nagios /usr/bin/php -q /usr/local/nagiosxi/cron/cmdsubsys.php > /usr/local/nagiosxi/var/cmdsubsys.log 2>&1
* * * * * nagios /usr/bin/php -q /usr/local/nagiosxi/cron/eventman.php > /usr/local/nagiosxi/var/eventman.log 2>&1
* * * * * nagios /usr/bin/php -q /usr/local/nagiosxi/cron/feedproc.php > /usr/local/nagiosxi/var/feedproc.log 2>&1
* * * * * nagios /usr/bin/php -q /usr/local/nagiosxi/cron/perfdataproc.php > /usr/local/nagiosxi/var/perfdataproc.log 2>&1
* * * * * nagios /usr/bin/php -q /usr/local/nagiosxi/cron/nom.php > /usr/local/nagiosxi/var/nom.log 2>&1
* * * * * nagios /usr/bin/php -q /usr/local/nagiosxi/cron/reportengine.php > /usr/local/nagiosxi/var/reportengine.log 2>&1
*/5 * * * * nagios /usr/bin/php -q /usr/local/nagiosxi/cron/dbmaint.php > /usr/local/nagiosxi/var/dbmaint.log 2>&1
* * * * * nagios /usr/bin/php -q /usr/local/nagiosxi/cron/cleaner.php > /usr/local/nagiosxi/var/cleaner.log 2>&1
01 * * * * nagios /usr/local/nagiosxi/cron/recurringdowntime.pl > /usr/local/nagiosxi/var/recurringdowntime.log 2>&1
*/5 * * * * nagios /usr/bin/php -q /usr/local/nagiosxi/cron/deadpool.php > /usr/local/nagiosxi/var/deadpool.log 2>&1
[root@nagiosprodxi1 ~]# tail -25 /var/log/cron
Apr 8 08:26:01 nagiosprodxi1 CROND[10511]: (nagios) CMD (/usr/bin/php -q /usr/local/nagiosxi/cron/perfdataproc.php > /usr/local/nagiosxi/var/perfdataproc.log 2>&1)
Apr 8 08:27:01 nagiosprodxi1 CROND[26685]: (nagios) CMD (/usr/bin/php -q /usr/local/nagiosxi/cron/eventman.php > /usr/local/nagiosxi/var/eventman.log 2>&1)
Apr 8 08:27:01 nagiosprodxi1 CROND[26686]: (nagios) CMD (/usr/bin/php -q /usr/local/nagiosxi/cron/perfdataproc.php > /usr/local/nagiosxi/var/perfdataproc.log 2>&1)
Apr 8 08:27:01 nagiosprodxi1 CROND[26688]: (nagios) CMD (/usr/bin/php -q /usr/local/nagiosxi/cron/reportengine.php > /usr/local/nagiosxi/var/reportengine.log 2>&1)
Apr 8 08:27:01 nagiosprodxi1 CROND[26694]: (nagios) CMD (/usr/bin/php -q /usr/local/nagiosxi/cron/sysstat.php > /usr/local/nagiosxi/var/sysstat.log 2>&1)
Apr 8 08:27:01 nagiosprodxi1 CROND[26695]: (nagios) CMD (/usr/bin/php -q /usr/local/nagiosxi/cron/nom.php > /usr/local/nagiosxi/var/nom.log 2>&1)
Apr 8 08:27:01 nagiosprodxi1 CROND[26693]: (nagios) CMD (/usr/bin/php -q /usr/local/nagiosxi/cron/cmdsubsys.php > /usr/local/nagiosxi/var/cmdsubsys.log 2>&1)
Apr 8 08:27:01 nagiosprodxi1 CROND[26699]: (nagios) CMD (/usr/bin/php -q /usr/local/nagiosxi/cron/feedproc.php > /usr/local/nagiosxi/var/feedproc.log 2>&1)
Apr 8 08:27:01 nagiosprodxi1 CROND[26698]: (nagios) CMD (/usr/bin/php -q /usr/local/nagiosxi/cron/cleaner.php > /usr/local/nagiosxi/var/cleaner.log 2>&1)
Apr 8 08:28:01 nagiosprodxi1 CROND[28272]: (nagios) CMD (/usr/bin/php -q /usr/local/nagiosxi/cron/cleaner.php > /usr/local/nagiosxi/var/cleaner.log 2>&1)
Apr 8 08:28:01 nagiosprodxi1 CROND[28273]: (nagios) CMD (/usr/bin/php -q /usr/local/nagiosxi/cron/reportengine.php > /usr/local/nagiosxi/var/reportengine.log 2>&1)
Apr 8 08:28:01 nagiosprodxi1 CROND[28274]: (nagios) CMD (/usr/bin/php -q /usr/local/nagiosxi/cron/eventman.php > /usr/local/nagiosxi/var/eventman.log 2>&1)
Apr 8 08:28:01 nagiosprodxi1 CROND[28271]: (nagios) CMD (/usr/bin/php -q /usr/local/nagiosxi/cron/cmdsubsys.php > /usr/local/nagiosxi/var/cmdsubsys.log 2>&1)
Apr 8 08:28:01 nagiosprodxi1 CROND[28276]: (nagios) CMD (/usr/bin/php -q /usr/local/nagiosxi/cron/feedproc.php > /usr/local/nagiosxi/var/feedproc.log 2>&1)
Apr 8 08:28:01 nagiosprodxi1 CROND[28277]: (nagios) CMD (/usr/bin/php -q /usr/local/nagiosxi/cron/nom.php > /usr/local/nagiosxi/var/nom.log 2>&1)
Apr 8 08:28:01 nagiosprodxi1 CROND[28278]: (nagios) CMD (/usr/bin/php -q /usr/local/nagiosxi/cron/sysstat.php > /usr/local/nagiosxi/var/sysstat.log 2>&1)
Apr 8 08:28:01 nagiosprodxi1 CROND[28275]: (nagios) CMD (/usr/bin/php -q /usr/local/nagiosxi/cron/perfdataproc.php > /usr/local/nagiosxi/var/perfdataproc.log 2>&1)
Apr 8 08:29:01 nagiosprodxi1 CROND[11531]: (nagios) CMD (/usr/bin/php -q /usr/local/nagiosxi/cron/reportengine.php > /usr/local/nagiosxi/var/reportengine.log 2>&1)
Apr 8 08:29:01 nagiosprodxi1 CROND[11533]: (nagios) CMD (/usr/bin/php -q /usr/local/nagiosxi/cron/sysstat.php > /usr/local/nagiosxi/var/sysstat.log 2>&1)
Apr 8 08:29:01 nagiosprodxi1 CROND[11532]: (nagios) CMD (/usr/bin/php -q /usr/local/nagiosxi/cron/perfdataproc.php > /usr/local/nagiosxi/var/perfdataproc.log 2>&1)
Apr 8 08:29:01 nagiosprodxi1 CROND[11538]: (nagios) CMD (/usr/bin/php -q /usr/local/nagiosxi/cron/feedproc.php > /usr/local/nagiosxi/var/feedproc.log 2>&1)
Apr 8 08:29:01 nagiosprodxi1 CROND[11537]: (nagios) CMD (/usr/bin/php -q /usr/local/nagiosxi/cron/nom.php > /usr/local/nagiosxi/var/nom.log 2>&1)
Apr 8 08:29:01 nagiosprodxi1 CROND[11540]: (nagios) CMD (/usr/bin/php -q /usr/local/nagiosxi/cron/cleaner.php > /usr/local/nagiosxi/var/cleaner.log 2>&1)
Apr 8 08:29:01 nagiosprodxi1 CROND[11541]: (nagios) CMD (/usr/bin/php -q /usr/local/nagiosxi/cron/eventman.php > /usr/local/nagiosxi/var/eventman.log 2>&1)
Apr 8 08:29:01 nagiosprodxi1 CROND[11539]: (nagios) CMD (/usr/bin/php -q /usr/local/nagiosxi/cron/cmdsubsys.php > /usr/local/nagiosxi/var/cmdsubsys.log 2>&1)
[root@nagiosprodxi1 ~]# ls -ld /home/nagios
drwx------. 30 nagios users 4096 Apr 8 08:14 /home/nagios
[root@nagiosprodxi1 ~]# grep nag /etc/group
nagios:x:501:nagios,apache,my02390,mycp1z2k
nagcmd:x:502:nagios,apache,my02390,mycp1z2k
[root@nagiosprodxi1 ~]#
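In the cron log above, cmdsubsys.php was started at 08:27, 08:28 and 08:29, so cron itself appears to be firing every minute. To check a longer window for gaps, the starts can be counted per minute. A minimal sketch, assuming the /var/log/cron line format shown above; the helper name cron_runs_per_minute is hypothetical.

```shell
#!/bin/sh
# Read /var/log/cron-style lines on stdin and count, per minute, how often
# a command containing the given script name was started.
# Field 3 of each line is the HH:MM:SS timestamp.
cron_runs_per_minute() {
    script=$1
    awk -v s="$script" 'index($0, s) { print substr($3, 1, 5) }' | sort | uniq -c
}

# Usage:
#   tail -200 /var/log/cron | cron_runs_per_minute cmdsubsys.php
```

A missing minute in the output, or a count above 1, would be worth a closer look.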
Re: Schedule a forced immediate check not working
Posted: Wed Apr 08, 2015 4:09 pm
by abrist
Can you post the config for the User's associated contact? Go to the ccm --> contacts --> click the disk icon next to the user's contact and post the relevant config block for the contact.
Re: Schedule a forced immediate check not working
Posted: Wed Apr 08, 2015 5:54 pm
by rajasegar
abrist wrote:Can you post the config for the User's associated contact? Go to the ccm --> contacts --> click the disk icon next to the user's contact and post the relevant config block for the contact.
The people scheduling the immediate checks are admin users in XI.
There is no contact with the same user name.
It had been working fine since day 1. About a month ago it became very slow, and it sometimes reports that the command failed to execute.
Code:
define contact {
contact_name INFRA_MON_Raja_Segar
alias Raja Segar
email [email protected]
pager +60122345678
use generic-contact
}
Re: Schedule a forced immediate check not working
Posted: Thu Apr 09, 2015 12:12 pm
by tgriep
Can you run the following in a shell while you schedule an immediate check with the user account you are having problems with, and post the output back here?
Code:
tail -f /usr/local/nagiosxi/var/cmdsubsys.log
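Because the cron entry redirects with a single `>`, cmdsubsys.log is overwritten each time the job starts, so output can scroll away or vanish before it is copied. A minimal sketch for saving a fixed window of the log to a file that can be posted, assuming GNU coreutils (timeout, tail -F); the helper name capture_log and the output path are placeholders.

```shell
#!/bin/sh
# Follow a log file for a fixed number of seconds and keep a copy of
# everything seen. tail -F keeps following across truncation, which
# happens here whenever cron restarts the job and reopens the log.
capture_log() {
    log_file=$1
    seconds=$2
    out_file=$3
    timeout "$seconds" tail -F "$log_file" 2>/dev/null | tee "$out_file"
    return 0
}

# Usage: start this, then schedule the immediate check within two minutes.
#   capture_log /usr/local/nagiosxi/var/cmdsubsys.log 120 /tmp/cmdsubsys-capture.txt
```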