update config timeout/stalls
Posted: Fri Oct 24, 2014 8:26 pm
by pkarr
We are having a problem with our soon-to-be production server.
When we run an update config, it never completes. I've followed your suggestions in the Nagios XI FAQ,
updating the resource limits in /etc/php.ini and setting kernel.msgmni = 256000 in /etc/sysctl.conf, but it didn't work.
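For anyone verifying the same change, here is a minimal sketch for confirming what /etc/sysctl.conf actually sets. The helper name sysctl_conf_value is hypothetical (not part of Nagios XI), and it assumes the usual "key = value" layout with # or ; comment lines.

```shell
# Print the last value assigned to a key in a sysctl.conf-style file.
# $1 = key (e.g. kernel.msgmni), $2 = file (e.g. /etc/sysctl.conf)
# Hypothetical helper for illustration only.
sysctl_conf_value() {
    awk -F= -v key="$1" '
        /^[ \t]*[#;]/ { next }              # skip comment lines
        {
            k = $1; gsub(/[ \t]/, "", k)    # key with whitespace stripped
            if (k == key) {
                v = $2; gsub(/^[ \t]+|[ \t]+$/, "", v)
                val = v                     # later assignments override earlier ones
            }
        }
        END { if (val != "") print val }
    ' "$2"
}

# Usage (as root on the XI server):
#   sysctl_conf_value kernel.msgmni /etc/sysctl.conf   # value on disk
#   sysctl -n kernel.msgmni                            # value the kernel is using
#   sysctl -p                                          # reload /etc/sysctl.conf if they differ
```

If the value on disk and the value reported by the kernel differ, the edit probably has not been loaded yet.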
We upgraded from 2014R1.3 to 2014R1.5 yesterday. However, I'm told we were having this issue before the upgrade. The MySQL db is offloaded to a remote server, which I set up using your documentation.
Additionally, I just noticed that under XI System Component Status, the only processes that show as running (and they are, I've confirmed it) are the
Monitoring Engine,
Performance Grapher and
Database Backend. The rest, the nagiosxi scripts that are run out of cron, aren't working. Crond is running and the crontab entries for these scripts look correct, I was able to run them manually.
Penny Karr | IT Infrastructure Monitoring
Harvard Vanguard Medical Associates, an Affiliate of Atrius Health
254 Second Avenue | Needham, MA 02494
P (781) 292-1853 | F (781) 292-1980
http://www.harvardvanguard.org
Re: update config timeout/stalls
Posted: Sat Oct 25, 2014 7:25 pm
by krobertson71
I am no expert, but from the looks of the Apache error log, Nagios XI cannot connect to the PostgreSQL backend database:
[Fri Oct 24 20:10:06 2014] [error] [client 172.30.240.176] PHP Warning: simplexml_load_string(): Message: A database connection error has been detected, we are attempting to rep in /usr/local/nagiosxi/html/includes/utils-backend.inc.php on line 27, referer: http://lkensherlockp01/nagiosxi/admin/
[Fri Oct 24 20:10:06 2014] [error] [client 172.30.240.176] PHP Warning: simplexml_load_string(): ^ in /usr/local/nagiosxi/html/includes/utils-backend.inc.php on line 27, referer: http://lkensherlockp01/nagiosxi/admin/
[Fri Oct 24 20:10:06 2014] [error] [client ::1] PHP Warning: pg_pconnect(): Unable to connect to PostgreSQL server: could not connect to server: No such file or directory\n\tIs the server running locally and accepting\n\tconnections on Unix domain socket "/tmp/.s.PGSQL.5432"? in /usr/local/nagiosxi/html/db/adodb/drivers/adodb-postgres64.inc.php on line 682
[Fri Oct 24 20:10:06 2014] [error] [client ::1] PHP Notice: Undefined variable: result in /usr/local/nagiosxi/html/includes/db.inc.php on line 241
[Fri Oct 24 20:10:06 2014] [error] [client 172.30.240.176] PHP Warning: simplexml_load_string(): Entity: line 1: parser error : Start tag expected, '<' not found in /usr/local/nagiosxi/html/includes/utils-backend.inc.php on line 27, referer: http://lkensherlockp01/nagiosxi/admin/
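The pg_pconnect() warning above names the Unix domain socket it could not find, so one quick check is whether that socket file exists at all. A minimal sketch follows; the function name pg_socket_present is hypothetical, and the init-script name in the usage comment is an assumption that may differ per distribution.

```shell
# Return 0 if a PostgreSQL Unix-domain socket exists for the given port.
# $1 = socket directory (the log shows /tmp), $2 = port (the log shows 5432)
# Hypothetical helper for illustration only.
pg_socket_present() {
    [ -S "$1/.s.PGSQL.$2" ]
}

# Usage:
#   pg_socket_present /tmp 5432 || echo "no socket: is postgresql running?"
#   service postgresql status      # init-script name is an assumption
```

A missing socket usually means the PostgreSQL server is stopped or is listening in a different socket directory.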
Re: update config timeout/stalls
Posted: Mon Oct 27, 2014 10:13 am
by abrist
First of all, it does not look like the mysql db is properly offloaded. From your ndo2db.cfg:
db_servertype=mysql
db_host=localhost
db_port=3306
Editing this file should have been in the documentation.
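To check this without opening the file by hand, something like the sketch below could be used. The function names are hypothetical, and the path in the usage comment (/usr/local/nagios/etc/ndo2db.cfg) is an assumption about a default install, not something confirmed in this thread.

```shell
# Print the db_host value from an ndo2db.cfg-style file (last assignment wins).
# $1 = path to ndo2db.cfg. Hypothetical helper for illustration only.
ndo2db_db_host() {
    sed -n 's/^[[:space:]]*db_host[[:space:]]*=[[:space:]]*//p' "$1" | tail -n 1
}

# Return 0 if db_host points somewhere other than the local machine.
db_is_offloaded() {
    case "$(ndo2db_db_host "$1")" in
        ''|localhost|127.0.0.1) return 1 ;;   # unset or local: not offloaded
        *) return 0 ;;
    esac
}

# Usage (path is an assumption):
#   db_is_offloaded /usr/local/nagios/etc/ndo2db.cfg && echo offloaded || echo local
```

With the three lines quoted above, this would report "local", which is why the offload looked incomplete.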
pkarr wrote: Crond is running and the crontab entries for these scripts look correct, I was able to run them manually.
Let's check just to be sure:
What is your system umask?
Do you force SSL?
Re: update config timeout/stalls
Posted: Mon Oct 27, 2014 11:19 am
by pkarr
Hi Andy and krobertson71,
I think I've got it now. That reply from krobertson71 helped a lot. Thanks!
I restarted the postgres db and also the mysql db, which had stopped.
Then, looking at the apache logs I found the following error:
blah...blah.. (nagios) FAILED to authorize user with PAM (Authentication token is no longer valid, a new one is required)
From there I found that the nagios account had expired on Oct 9th, so I set it to non-expiring and that seems to have fixed it.
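The post does not say which commands were used here. One common way to inspect and clear expiry on Linux is chage, sketched below under that assumption; the helper name expiry_fields is hypothetical and simply filters the output of chage -l.

```shell
# Print the password and account expiry lines from `chage -l` output on stdin.
# Hypothetical helper for illustration only.
expiry_fields() {
    grep -E '^(Password|Account) expires'
}

# Usage (as root):
#   chage -l nagios | expiry_fields
#   chage -E -1 nagios     # remove the account expiration date
#   chage -M -1 nagios     # disable the maximum password age
```

Which of the two chage options applies depends on whether the account itself or only its password had expired, so it is worth reading both lines before changing anything.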
The nagiosxi cron jobs are running again and an update config completes promptly.
I'm curious as to why the ndo2db.cfg on LKENSHERLOCKP01 (nagios xi server) would show that. I just double-checked and I had it right. Could it be that, since the postgresql db wasn't connecting, the info wasn't getting updated properly?
Since we are keen to make sure that our mysqldb was offloaded properly, I'm including a copy of the ndo2db.cfg from LKENSHERLOCKP01, a fresh copy of the server profile and the other info you requested.
Please let me know if there is anything else we should double check.
Thanks so much!
Penny
[root@lkensherlockp01 ~]# ps -aef | grep cron
nagios 786 776 0 12:01 ? 00:00:00 /bin/sh -c /usr/bin/php -q /usr/local/nagiosxi/cron/perfdataproc.php > /usr/local/nagiosxi/var/perfdataproc.log 2>&1
nagios 787 780 0 12:01 ? 00:00:00 /bin/sh -c /usr/bin/php -q /usr/local/nagiosxi/cron/sysstat.php > /usr/local/nagiosxi/var/sysstat.log 2>&1
nagios 788 777 0 12:01 ? 00:00:00 /bin/sh -c /usr/bin/php -q /usr/local/nagiosxi/cron/feedproc.php > /usr/local/nagiosxi/var/feedproc.log 2>&1
nagios 790 779 0 12:01 ? 00:00:00 /bin/sh -c /usr/bin/php -q /usr/local/nagiosxi/cron/cmdsubsys.php > /usr/local/nagiosxi/var/cmdsubsys.log 2>&1
nagios 791 778 0 12:01 ? 00:00:00 /bin/sh -c /usr/bin/php -q /usr/local/nagiosxi/cron/eventman.php > /usr/local/nagiosxi/var/eventman.log 2>&1
nagios 796 786 0 12:01 ? 00:00:00 /usr/bin/php -q /usr/local/nagiosxi/cron/perfdataproc.php
nagios 797 788 0 12:01 ? 00:00:00 /usr/bin/php -q /usr/local/nagiosxi/cron/feedproc.php
nagios 798 790 0 12:01 ? 00:00:00 /usr/bin/php -q /usr/local/nagiosxi/cron/cmdsubsys.php
nagios 799 791 0 12:01 ? 00:00:00 /usr/bin/php -q /usr/local/nagiosxi/cron/eventman.php
nagios 800 787 0 12:01 ? 00:00:00 /usr/bin/php -q /usr/local/nagiosxi/cron/sysstat.php
root 942 25441 0 12:01 pts/3 00:00:00 grep cron
root 27180 1 0 Oct24 ? 00:00:05 crond
[root@lkensherlockp01 ~]# umask
0022
[root@lkensherlockp01 ~]#
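The ps output above shows each cron job redirecting into a log file under /usr/local/nagiosxi/var, so the age of those files is a rough indicator of whether cron is still launching the jobs. A minimal sketch follows; the function name stale_logs is hypothetical, and it assumes a find that supports -maxdepth and -mmin (GNU and BSD find do).

```shell
# List *.log files in a directory that were last modified more than N minutes ago.
# $1 = directory, $2 = minutes. Hypothetical helper for illustration only.
stale_logs() {
    find "$1" -maxdepth 1 -type f -name '*.log' -mmin +"$2"
}

# Usage:
#   stale_logs /usr/local/nagiosxi/var 10
```

Any job log listed here has not been rewritten recently, which would be consistent with the cron jobs not starting.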
Re: update config timeout/stalls
Posted: Mon Oct 27, 2014 12:23 pm
by abrist
pkarr wrote: I'm curious as to why the ndo2db.cfg on LKENSHERLOCKP01 (nagios xi server) would show that. I just double-checked and I had it right. Could it be that, since the postgresql db wasn't connecting, the info wasn't getting updated properly?
Could you have posted an old copy of your profile.zip (created before the offload)?
Your newly posted configs look correct. Congrats on solving the issue. Is this thread lock-ready?
Re: update config timeout/stalls
Posted: Mon Oct 27, 2014 12:38 pm
by pkarr
It must have been that.
Sure, you can go ahead and lock this one.
Thanks,
Penny
Re: update config timeout/stalls
Posted: Mon Oct 27, 2014 12:42 pm
by cmerchant
OK.