Unable to ACK (again)

Posted: Mon Jan 25, 2016 3:46 pm
by highness
We have a problem that rears its head every once in a while, where we are unable to acknowledge alerts or schedule downtime for them. It is now happening 1-2 times a week.

We opened a ticket about this a while back: https://support.nagios.com/forum/viewto ... 16&t=32366

We've done everything in that ticket, but still no joy.

We're running Nagios XI 2014R2.6.

Has anyone else run into this issue? Please help.

Re: Unable to ACK (again)

Posted: Tue Jan 26, 2016 12:30 pm
by highness
It's starting to appear that this happens after we do a configuration apply.

Re: Unable to ACK (again)

Posted: Tue Jan 26, 2016 12:40 pm
by hsmith
What's the cron log look like?

Code:

tail -n50 /var/log/cron

Re: Unable to ACK (again)

Posted: Tue Jan 26, 2016 12:56 pm
by highness
hsmith wrote: What's the cron log look like?

Code:

Jan 26 09:50:01 fe1 CROND[57136]: (nagios) CMD (/usr/bin/php -q /usr/local/nagiosxi/cron/cmdsubsys.php > /usr/local/nagiosxi/var/cmdsubsys.log 2>&1)
Jan 26 09:50:01 fe1 CROND[57137]: (nagios) CMD (/usr/bin/php -q /usr/local/nagiosxi/cron/eventman.php > /usr/local/nagiosxi/var/eventman.log 2>&1)
Jan 26 09:50:01 fe1 CROND[57138]: (nagios) CMD (/usr/bin/php -q /usr/local/nagiosxi/cron/reportengine.php > /usr/local/nagiosxi/var/reportengine.log 2>&1)
Jan 26 09:50:01 fe1 CROND[57139]: (nagios) CMD (/usr/bin/php -q /usr/local/nagiosxi/cron/nom.php > /usr/local/nagiosxi/var/nom.log 2>&1)
Jan 26 09:50:01 fe1 CROND[57140]: (nagios) CMD (/usr/bin/php -q /usr/local/nagiosxi/cron/feedproc.php > /usr/local/nagiosxi/var/feedproc.log 2>&1)
Jan 26 09:50:01 fe1 CROND[57141]: (nagios) CMD (/usr/bin/php -q /usr/local/nagiosxi/cron/deadpool.php > /usr/local/nagiosxi/var/deadpool.log 2>&1)
Jan 26 09:50:01 fe1 CROND[57142]: (nagios) CMD (/usr/bin/php -q /usr/local/nagiosxi/cron/dbmaint.php > /usr/local/nagiosxi/var/dbmaint.log 2>&1)
Jan 26 09:51:01 fe1 CROND[62052]: (nagios) CMD (/usr/bin/php -q /usr/local/nagiosxi/cron/nom.php > /usr/local/nagiosxi/var/nom.log 2>&1)
Jan 26 09:51:01 fe1 CROND[62053]: (nagios) CMD (/usr/bin/php -q /usr/local/nagiosxi/cron/eventman.php > /usr/local/nagiosxi/var/eventman.log 2>&1)
Jan 26 09:51:01 fe1 CROND[62054]: (nagios) CMD (/usr/bin/php -q /usr/local/nagiosxi/cron/feedproc.php > /usr/local/nagiosxi/var/feedproc.log 2>&1)
Jan 26 09:51:01 fe1 CROND[62056]: (nagios) CMD (/usr/bin/php -q /usr/local/nagiosxi/cron/reportengine.php > /usr/local/nagiosxi/var/reportengine.log 2>&1)
Jan 26 09:51:01 fe1 CROND[62055]: (nagios) CMD (/usr/bin/php -q /usr/local/nagiosxi/cron/cmdsubsys.php > /usr/local/nagiosxi/var/cmdsubsys.log 2>&1)
Jan 26 09:51:01 fe1 CROND[62058]: (nagios) CMD (/usr/bin/php -q /usr/local/nagiosxi/cron/cleaner.php > /usr/local/nagiosxi/var/cleaner.log 2>&1)
Jan 26 09:51:01 fe1 CROND[62059]: (nagios) CMD (/usr/bin/php -q /usr/local/nagiosxi/cron/sysstat.php > /usr/local/nagiosxi/var/sysstat.log 2>&1)
Jan 26 09:51:01 fe1 CROND[62060]: (nagios) CMD (/usr/bin/php -q /usr/local/nagiosxi/cron/perfdataproc.php > /usr/local/nagiosxi/var/perfdataproc.log 2>&1)
Jan 26 09:52:01 fe1 CROND[1762]: (nagios) CMD (/usr/bin/php -q /usr/local/nagiosxi/cron/nom.php > /usr/local/nagiosxi/var/nom.log 2>&1)
Jan 26 09:52:01 fe1 CROND[1764]: (nagios) CMD (/usr/bin/php -q /usr/local/nagiosxi/cron/cleaner.php > /usr/local/nagiosxi/var/cleaner.log 2>&1)
Jan 26 09:52:01 fe1 CROND[1763]: (nagios) CMD (/usr/bin/php -q /usr/local/nagiosxi/cron/reportengine.php > /usr/local/nagiosxi/var/reportengine.log 2>&1)
Jan 26 09:52:01 fe1 CROND[1765]: (nagios) CMD (/usr/bin/php -q /usr/local/nagiosxi/cron/feedproc.php > /usr/local/nagiosxi/var/feedproc.log 2>&1)
Jan 26 09:52:01 fe1 CROND[1766]: (nagios) CMD (/usr/bin/php -q /usr/local/nagiosxi/cron/eventman.php > /usr/local/nagiosxi/var/eventman.log 2>&1)
Jan 26 09:52:01 fe1 CROND[1767]: (nagios) CMD (/usr/bin/php -q /usr/local/nagiosxi/cron/cmdsubsys.php > /usr/local/nagiosxi/var/cmdsubsys.log 2>&1)
Jan 26 09:52:01 fe1 CROND[1768]: (nagios) CMD (/usr/bin/php -q /usr/local/nagiosxi/cron/perfdataproc.php > /usr/local/nagiosxi/var/perfdataproc.log 2>&1)
Jan 26 09:52:01 fe1 CROND[1769]: (nagios) CMD (/usr/bin/php -q /usr/local/nagiosxi/cron/sysstat.php > /usr/local/nagiosxi/var/sysstat.log 2>&1)
Jan 26 09:53:01 fe1 CROND[6401]: (nagios) CMD (/usr/bin/php -q /usr/local/nagiosxi/cron/perfdataproc.php > /usr/local/nagiosxi/var/perfdataproc.log 2>&1)
Jan 26 09:53:01 fe1 CROND[6400]: (nagios) CMD (/usr/bin/php -q /usr/local/nagiosxi/cron/reportengine.php > /usr/local/nagiosxi/var/reportengine.log 2>&1)
Jan 26 09:53:01 fe1 CROND[6402]: (nagios) CMD (/usr/bin/php -q /usr/local/nagiosxi/cron/sysstat.php > /usr/local/nagiosxi/var/sysstat.log 2>&1)
Jan 26 09:53:01 fe1 CROND[6403]: (nagios) CMD (/usr/bin/php -q /usr/local/nagiosxi/cron/nom.php > /usr/local/nagiosxi/var/nom.log 2>&1)
Jan 26 09:53:01 fe1 CROND[6404]: (nagios) CMD (/usr/bin/php -q /usr/local/nagiosxi/cron/cleaner.php > /usr/local/nagiosxi/var/cleaner.log 2>&1)
Jan 26 09:53:01 fe1 CROND[6405]: (nagios) CMD (/usr/bin/php -q /usr/local/nagiosxi/cron/feedproc.php > /usr/local/nagiosxi/var/feedproc.log 2>&1)
Jan 26 09:53:01 fe1 CROND[6406]: (nagios) CMD (/usr/bin/php -q /usr/local/nagiosxi/cron/eventman.php > /usr/local/nagiosxi/var/eventman.log 2>&1)
Jan 26 09:53:01 fe1 CROND[6407]: (nagios) CMD (/usr/bin/php -q /usr/local/nagiosxi/cron/cmdsubsys.php > /usr/local/nagiosxi/var/cmdsubsys.log 2>&1)
Jan 26 09:54:01 fe1 CROND[10826]: (nagios) CMD (/usr/bin/php -q /usr/local/nagiosxi/cron/feedproc.php > /usr/local/nagiosxi/var/feedproc.log 2>&1)
Jan 26 09:54:01 fe1 CROND[10827]: (nagios) CMD (/usr/bin/php -q /usr/local/nagiosxi/cron/cleaner.php > /usr/local/nagiosxi/var/cleaner.log 2>&1)
Jan 26 09:54:01 fe1 CROND[10831]: (nagios) CMD (/usr/bin/php -q /usr/local/nagiosxi/cron/eventman.php > /usr/local/nagiosxi/var/eventman.log 2>&1)
Jan 26 09:54:01 fe1 CROND[10832]: (nagios) CMD (/usr/bin/php -q /usr/local/nagiosxi/cron/cmdsubsys.php > /usr/local/nagiosxi/var/cmdsubsys.log 2>&1)
Jan 26 09:54:01 fe1 CROND[10830]: (nagios) CMD (/usr/bin/php -q /usr/local/nagiosxi/cron/sysstat.php > /usr/local/nagiosxi/var/sysstat.log 2>&1)
Jan 26 09:54:01 fe1 CROND[10833]: (nagios) CMD (/usr/bin/php -q /usr/local/nagiosxi/cron/reportengine.php > /usr/local/nagiosxi/var/reportengine.log 2>&1)
Jan 26 09:54:01 fe1 CROND[10835]: (nagios) CMD (/usr/bin/php -q /usr/local/nagiosxi/cron/perfdataproc.php > /usr/local/nagiosxi/var/perfdataproc.log 2>&1)
Jan 26 09:54:01 fe1 CROND[10834]: (nagios) CMD (/usr/bin/php -q /usr/local/nagiosxi/cron/nom.php > /usr/local/nagiosxi/var/nom.log 2>&1)
Jan 26 09:55:01 fe1 CROND[15742]: (root) CMD (LANG=C LC_ALL=C /usr/bin/mrtg /etc/mrtg/mrtg.cfg --lock-file /var/lock/mrtg/mrtg_l --confcache-file /var/lib/mrtg/mrtg.ok)
Jan 26 09:55:01 fe1 CROND[15745]: (nagios) CMD (/usr/bin/php -q /usr/local/nagiosxi/cron/cleaner.php > /usr/local/nagiosxi/var/cleaner.log 2>&1)
Jan 26 09:55:01 fe1 CROND[15746]: (nagios) CMD (/usr/bin/php -q /usr/local/nagiosxi/cron/dbmaint.php > /usr/local/nagiosxi/var/dbmaint.log 2>&1)
Jan 26 09:55:01 fe1 CROND[15747]: (nagios) CMD (/usr/bin/php -q /usr/local/nagiosxi/cron/feedproc.php > /usr/local/nagiosxi/var/feedproc.log 2>&1)
Jan 26 09:55:01 fe1 CROND[15748]: (nagios) CMD (/usr/bin/php -q /usr/local/nagiosxi/cron/reportengine.php > /usr/local/nagiosxi/var/reportengine.log 2>&1)
Jan 26 09:55:01 fe1 CROND[15749]: (nagios) CMD (/usr/bin/php -q /usr/local/nagiosxi/cron/perfdataproc.php > /usr/local/nagiosxi/var/perfdataproc.log 2>&1)
Jan 26 09:55:01 fe1 CROND[15751]: (nagios) CMD (/usr/bin/php -q /usr/local/nagiosxi/cron/deadpool.php > /usr/local/nagiosxi/var/deadpool.log 2>&1)
Jan 26 09:55:01 fe1 CROND[15752]: (nagios) CMD (/usr/bin/php -q /usr/local/nagiosxi/cron/cmdsubsys.php > /usr/local/nagiosxi/var/cmdsubsys.log 2>&1)
Jan 26 09:55:01 fe1 CROND[15753]: (nagios) CMD (/usr/bin/php -q /usr/local/nagiosxi/cron/eventman.php > /usr/local/nagiosxi/var/eventman.log 2>&1)
Jan 26 09:55:01 fe1 CROND[15755]: (nagios) CMD (/usr/bin/php -q /usr/local/nagiosxi/cron/sysstat.php > /usr/local/nagiosxi/var/sysstat.log 2>&1)
Jan 26 09:55:01 fe1 CROND[15754]: (nagios) CMD (/usr/bin/php -q /usr/local/nagiosxi/cron/nom.php > /usr/local/nagiosxi/var/nom.log 2>&1)

Re: Unable to ACK (again)

Posted: Tue Jan 26, 2016 5:51 pm
by jolson
First, try an acknowledgement or a similar command and confirm that it failed. After the failure, hopefully one of the following logs will have recorded some usable data to go on:

Code:

tail -n200 /usr/local/nagios/var/nagios.log
tail -n20 /var/log/httpd/*_log

Ensure that SELinux is off:

Code:

sestatus

Let us know what you find out - thank you!
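
A minimal sketch for narrowing down nagios.log after a failed acknowledgement, assuming log_external_commands=1 is set in nagios.cfg so that Nagios writes an EXTERNAL COMMAND line for each command it reads from the pipe. The function name recent_external_commands is made up for illustration. If the acknowledgement you just submitted does not show up in its output, the command probably never reached Nagios, which would point at the command pipe or its permissions rather than at Nagios itself:

```shell
#!/bin/sh
# Print the last N "EXTERNAL COMMAND" entries from a Nagios Core log.
# Assumption: log_external_commands=1 in nagios.cfg; with it off,
# Nagios does not write these lines at all.
recent_external_commands() {
    logfile=$1        # path to nagios.log
    count=${2:-20}    # how many matching lines to show (default 20)
    grep 'EXTERNAL COMMAND:' "$logfile" | tail -n "$count"
}
```

For example: recent_external_commands /usr/local/nagios/var/nagios.log 20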

Re: Unable to ACK (again)

Posted: Thu Jan 28, 2016 10:06 am
by highness
SELinux is disabled, and I PM'd you the logs.

Re: Unable to ACK (again)

Posted: Thu Jan 28, 2016 3:34 pm
by scottwilkerson
Can you show the current

Code:

ls -la /usr/local/nagios/var/rw/

Also

Code:

cat /etc/group|grep nag

Re: Unable to ACK (again)

Posted: Thu Jan 28, 2016 4:51 pm
by highness
scottwilkerson wrote: Can you show the current

Code:

ls -la /usr/local/nagios/var/rw/

Code:

[email protected] (Linux) $ ls -la /usr/local/nagios/var/rw/
total 12
drwxrwsr-x 2 nagios nagios 4096 Jan 28 13:39 .
drwxrwxr-x 6 nagios nagios 4096 Jan 28 13:50 ..
prw-rw---- 1 nagios nagios    0 Jan 28 13:39 nagios.cmd
srw-rw---- 1 nagios nagios    0 Jan 28 13:39 nagios.qh
-rw-rw-r-- 1 nagios nagios 1067 Dec 19  2014 nsca.dump

scottwilkerson wrote: Also

Code:

cat /etc/group|grep nag

Code:

[email protected] (Linux) $ cat /etc/group|grep nag
nagios:x:500:nagios,apache,snmptt
nagcmd:x:501:nagios,apache,snmptt
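
A rough way to repeat this check in one step, assuming a POSIX shell. check_cmd_pipe is a hypothetical helper, not part of Nagios. It confirms that the command file is a named pipe, prints its ls -ld line, and tests whether the user running it can write to it. Run it as the web server user (apache in the group listing above) to see what the UI sees:

```shell
#!/bin/sh
# Report whether a Nagios command file looks usable by the current user.
# Returns 0 if it is a writable named pipe, 1 if it is not a pipe,
# 2 if it is a pipe but not writable.
check_cmd_pipe() {
    pipe=$1
    if [ ! -p "$pipe" ]; then
        echo "not a named pipe: $pipe"
        return 1
    fi
    # Show owner, group and mode, same information as the ls -la above
    ls -ld "$pipe"
    if [ ! -w "$pipe" ]; then
        echo "not writable by $(id -un)"
        return 2
    fi
    echo "writable by $(id -un)"
}
```

For example: check_cmd_pipe /usr/local/nagios/var/rw/nagios.cmd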

Re: Unable to ACK (again)

Posted: Fri Jan 29, 2016 12:52 pm
by ssax
Try this:

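Code:

# Stop Nagios so the command pipe is not in use
service nagios stop
# Clear out everything in rw/ (pipe, socket, nsca.dump); Nagios recreates the pipe and socket on start
rm -rf /usr/local/nagios/var/rw/*
# Give the directory to the nagcmd group and set setgid so new files inherit that group
chown nagios:nagcmd /usr/local/nagios/var/rw
chmod g+s /usr/local/nagios/var/rw
service nagios start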

Re: Unable to ACK (again)

Posted: Tue Feb 02, 2016 12:10 pm
by highness
ssax wrote: Try this:

Code:

service nagios stop
rm -rf /usr/local/nagios/var/rw/*
chown nagios:nagcmd /usr/local/nagios/var/rw
chmod g+s /usr/local/nagios/var/rw
service nagios start

That seemed to fix the problem. We'll keep an eye on it this week to make sure the fix holds.
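
One way to confirm from the command line that the pipe is accepting commands again, sketched under assumptions: ack_svc_line is a hypothetical helper that builds an ACKNOWLEDGE_SVC_PROBLEM line in the Nagios external command format, and the host, service, and author values in the usage comment are placeholders. The sticky/notify/persistent values (2;1;0) are just one reasonable choice:

```shell
#!/bin/sh
# Build one external-command line that acknowledges a service problem.
# Field order after the command name:
#   host;service;sticky;notify;persistent;author;comment
# Fixed here: sticky=2 (stays until recovery), notify=1, persistent=0.
ack_svc_line() {
    ts=$1        # Unix timestamp, e.g. from date +%s
    host=$2
    svc=$3
    author=$4
    comment=$5
    printf '[%s] ACKNOWLEDGE_SVC_PROBLEM;%s;%s;2;1;0;%s;%s\n' \
        "$ts" "$host" "$svc" "$author" "$comment"
}

# Usage (placeholder host/service; opening the pipe blocks if Nagios is not running):
# ack_svc_line "$(date +%s)" web1 HTTP admin "pipe test" > /usr/local/nagios/var/rw/nagios.cmd
```

If the write succeeds when run as the web server user and the acknowledgement appears in the UI, the permissions on the rw directory are probably fine.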