Update to 5.5.3 Crashes
-
osk.dthompson
- Posts: 16
- Joined: Tue Oct 18, 2016 2:58 pm
Update to 5.5.3 Crashes
Hello,
We've updated Nagios XI to 5.5.3 and now we're seeing Nagios crash periodically. During a crash, it looks like processes are blocking on something: the load increases, the run queue gets backed up, and there are a lot of Nagios processes. I've also noticed some errors in the database log that appear related to the Nagios update (see below).
Any thoughts?
Thank you,
WARNING: nonstandard use of \\ in a string literal at character 33
HINT: Use the escape string syntax for backslashes, e.g., E'\\'.
WARNING: nonstandard use of \\ in a string literal at character 33
HINT: Use the escape string syntax for backslashes, e.g., E'\\'.
WARNING: nonstandard use of \\ in a string literal at character 33
HINT: Use the escape string syntax for backslashes, e.g., E'\\'.
WARNING: nonstandard use of \\ in a string literal at character 33
HINT: Use the escape string syntax for backslashes, e.g., E'\\'.
ERROR: relation "xi_notifications" does not exist
STATEMENT: VACUUM ANALYZE xi_notifications;
-
npolovenko
- Support Tech
- Posts: 3457
- Joined: Mon May 15, 2017 5:00 pm
Re: Update to 5.5.3 Crashes
@osk.dthompson, which version of XI did you upgrade from? Please follow this article and increase the values in the php.ini file:
https://support.nagios.com/kb/article/n ... e-611.html
(I suggest using double the values recommended in this tutorial.)
Restart Apache with:
service httpd restart
And let me know if XI is still crashing.
If this doesn't fix your issue, please send in your system profile.
To send us your system profile, log in to the Nagios XI GUI using a web browser.
Click the "Admin" > "System Profile" menu.
Click the "Download Profile" button.
Save the profile.zip file and upload it to a cloud storage service of your choice. Then share a link with me in a personal message.
After you send the profile, please post something in this thread to bring it back up in the support queue.
As of May 25th, 2018, all communications with Nagios Enterprises and its employees are covered under our new Privacy Policy.
-
osk.dthompson
- Posts: 16
- Joined: Tue Oct 18, 2016 2:58 pm
Re: Update to 5.5.3 Crashes
We upgraded from 5.5.2. I'm checking out the link you sent and will see whether that helps.
We are seeing this in the apache error logs:
/var/log/httpd/ssl_error_log:[Mon Sep 24 16:09:16 2018] [error] [client 1.1.1.1] PHP Warning: pg_pconnect(): Unable to connect to PostgreSQL server: could not connect to server: Connection refused\n\tIs
the server running on host "localhost" and accepting\n\tTCP/IP connections on port 5432?\ncould not connect to server: Connection refused\n\tIs the server running on host "localhost" and accepting\n\tTCP/IP
connections on port 5432? in /usr/local/nagiosxi/html/db/adodb/drivers/adodb-postgres64.inc.php on line 699, referer: https://server.local.domain/nagiosxi/in ... &dest=auto
Thank you,
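The "Connection refused" message above suggests nothing was listening on localhost:5432 when PHP tried to connect. A minimal sketch for probing that port from the XI server follows; the helper name `port_is_open` and the 2-second timeout are illustrative assumptions, not part of Nagios XI, and a successful TCP connect only shows that something is listening, not that PostgreSQL is healthy.

```python
import socket


def port_is_open(host: str, port: int, timeout: float = 2.0) -> bool:
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        # create_connection resolves the host and tries each address in turn.
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers connection refused, timeouts, and unreachable hosts.
        return False


if __name__ == "__main__":
    # 5432 is the PostgreSQL port named in the error message above.
    if port_is_open("localhost", 5432):
        print("localhost:5432 is accepting TCP connections")
    else:
        print("localhost:5432 is refusing connections or unreachable")
```

Running this repeatedly while the load is high could show whether the database listener disappears during a crash or only intermittently refuses connections.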
Last edited by osk.dthompson on Mon Feb 25, 2019 1:06 pm, edited 1 time in total.
-
npolovenko
- Support Tech
- Posts: 3457
- Joined: Mon May 15, 2017 5:00 pm
Re: Update to 5.5.3 Crashes
@osk.dthompson, in addition to tweaking the PHP settings, please run the following script to check the databases for possible corruption:
/usr/local/nagiosxi/scripts/repair_databases.sh
-
osk.dthompson
- Posts: 16
- Joined: Tue Oct 18, 2016 2:58 pm
Re: Update to 5.5.3 Crashes
I did find memory errors in the *rror_logs, so I increased 'memory_limit' to 512g, but we're still seeing the issues. Below is an example of the Nagios processes running during the problem:
I'll try the database repair now.
Thank you,
nagios 2375 1 0 Sep23 ? 00:00:00 /usr/local/nagios/bin/nrpe -c /usr/local/nagios/etc/nrpe.cfg -d
nagios 2555 1 0 Sep23 ? 00:00:10 /usr/local/nagios/bin/npcd -d -f /usr/local/nagios/etc/pnp/npcd.cfg
nagios 2585 1 0 Sep23 ? 00:00:00 /usr/local/nagios/bin/ndo2db -c /usr/local/nagios/etc/ndo2db.cfg
postgres 11427 15001 0 Sep24 ? 00:00:02 postgres: nagiosxi nagiosxi ::1(34402) idle
postgres 11832 15001 0 07:54 ? 00:00:00 postgres: nagiosxi nagiosxi ::1(42372) idle
nagios 13195 13192 0 07:58 ? 00:00:00 /bin/sh -c /usr/bin/php -q /usr/local/nagiosxi/cron/sysstat.php >> /usr/local/nagiosxi/var/sysstat.log 2>&1
nagios 13201 13195 0 07:58 ? 00:00:00 /usr/bin/php -q /usr/local/nagiosxi/cron/sysstat.php
postgres 13219 15001 0 07:58 ? 00:00:00 postgres: nagiosxi nagiosxi ::1(42506) idle
nagios 13609 13606 0 07:59 ? 00:00:00 /bin/sh -c /usr/bin/php -q /usr/local/nagiosxi/cron/eventman.php >> /usr/local/nagiosxi/var/eventman.log 2>&1
nagios 13610 13608 0 07:59 ? 00:00:00 /bin/sh -c /usr/bin/php -q /usr/local/nagiosxi/cron/sysstat.php >> /usr/local/nagiosxi/var/sysstat.log 2>&1
nagios 13612 13607 0 07:59 ? 00:00:00 /bin/sh -c /usr/bin/php -q /usr/local/nagiosxi/cron/cmdsubsys.php >> /usr/local/nagiosxi/var/cmdsubsys.log 2>&1
nagios 13613 13603 0 07:59 ? 00:00:00 /bin/sh -c /usr/bin/php -q /usr/local/nagiosxi/cron/perfdataproc.php >> /usr/local/nagiosxi/var/perfdataproc.log 2>&1
nagios 13615 13609 5 07:59 ? 00:00:00 /usr/bin/php -q /usr/local/nagiosxi/cron/eventman.php
nagios 13616 13604 0 07:59 ? 00:00:00 /bin/sh -c /usr/bin/php -q /usr/local/nagiosxi/cron/feedproc.php >> /usr/local/nagiosxi/var/feedproc.log 2>&1
nagios 13619 13605 0 07:59 ? 00:00:00 /bin/sh -c /usr/bin/php -q /usr/local/nagiosxi/cron/event_handler.php >> /usr/local/nagiosxi/var/event_handler.log 2>&1
nagios 13622 13610 3 07:59 ? 00:00:00 /usr/bin/php -q /usr/local/nagiosxi/cron/sysstat.php
nagios 13623 13613 3 07:59 ? 00:00:00 /usr/bin/php -q /usr/local/nagiosxi/cron/perfdataproc.php
nagios 13624 13619 3 07:59 ? 00:00:00 /usr/bin/php -q /usr/local/nagiosxi/cron/event_handler.php
nagios 13625 13612 5 07:59 ? 00:00:00 /usr/bin/php -q /usr/local/nagiosxi/cron/cmdsubsys.php
nagios 13627 13616 3 07:59 ? 00:00:00 /usr/bin/php -q /usr/local/nagiosxi/cron/feedproc.php
postgres 13630 15001 0 07:59 ? 00:00:00 postgres: nagiosxi nagiosxi ::1(42538) idle
postgres 13631 15001 0 07:59 ? 00:00:00 postgres: nagiosxi nagiosxi ::1(42540) idle
postgres 13633 15001 0 07:59 ? 00:00:00 postgres: nagiosxi nagiosxi ::1(42544) idle
postgres 13637 15001 0 07:59 ? 00:00:00 postgres: nagiosxi nagiosxi ::1(42546) idle
postgres 13663 15001 0 07:59 ? 00:00:00 postgres: nagiosxi nagiosxi ::1(42548) idle
postgres 13667 15001 0 07:59 ? 00:00:00 postgres: nagiosxi nagiosxi ::1(42550) idle
nagios 13767 13622 0 07:59 ? 00:00:00 sh -c /usr/bin/iostat -c 5 2 | tail --lines=2 | head --lines=1 | awk '{ print $1,$2,$3,$4,$5,$6 }'
nagios 13768 13767 0 07:59 ? 00:00:00 /usr/bin/iostat -c 5 2
nagios 13769 13767 0 07:59 ? 00:00:00 tail --lines=2
nagios 13770 13767 0 07:59 ? 00:00:00 head --lines=1
nagios 13771 13767 0 07:59 ? 00:00:00 awk { print $1,$2,$3,$4,$5,$6 }
root 13774 5097 0 07:59 pts/3 00:00:00 grep nagios
nagios 15048 1 27 Sep24 ? 04:20:55 /usr/local/nagios/bin/nagios -d /usr/local/nagios/etc/nagios.cfg
nagios 15049 15048 37 Sep24 ? 06:00:37 /usr/local/nagios/bin/nagios --worker /usr/local/nagios/var/rw/nagios.qh
nagios 15050 15048 35 Sep24 ? 05:39:17 /usr/local/nagios/bin/nagios --worker /usr/local/nagios/var/rw/nagios.qh
nagios 15051 15048 11 Sep24 ? 01:50:10 /usr/local/nagios/bin/nagios --worker /usr/local/nagios/var/rw/nagios.qh
nagios 15052 15048 41 Sep24 ? 06:34:31 /usr/local/nagios/bin/nagios --worker /usr/local/nagios/var/rw/nagios.qh
nagios 15053 15048 39 Sep24 ? 06:18:06 /usr/local/nagios/bin/nagios --worker /usr/local/nagios/var/rw/nagios.qh
nagios 15054 15048 36 Sep24 ? 05:44:42 /usr/local/nagios/bin/nagios --worker /usr/local/nagios/var/rw/nagios.qh
nagios 15055 2585 0 Sep24 ? 00:00:02 /usr/local/nagios/bin/ndo2db -c /usr/local/nagios/etc/ndo2db.cfg
nagios 15056 15055 0 Sep24 ? 00:01:45 /usr/local/nagios/bin/ndo2db -c /usr/local/nagios/etc/ndo2db.cfg
nagios 15059 15048 0 Sep24 ? 00:00:02 /usr/local/nagios/bin/nagios -d /usr/local/nagios/etc/nagios.cfg
nagios 15588 15049 0 Sep24 ? 00:00:00 [check_nrpe] <defunct>
nagios 15590 15050 0 Sep24 ? 00:00:00 [check_nrpe] <defunct>
nagios 15591 15054 0 Sep24 ? 00:00:00 [check_nrpe] <defunct>
nagios 15592 15053 0 Sep24 ? 00:00:00 [check_nrpe] <defunct>
nagios 15593 15052 0 Sep24 ? 00:00:00 [check_nrpe] <defunct>
nagios 15594 15049 0 Sep24 ? 00:00:00 [check_nrpe] <defunct>
nagios 15598 15053 0 Sep24 ? 00:00:00 [check_nrpe] <defunct>
nagios 15602 15050 0 Sep24 ? 00:00:00 [check_nrpe] <defunct>
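The listing above ends with several `<defunct>` check_nrpe entries whose parents are Nagios worker processes. A minimal sketch for tallying defunct processes per parent PID from `ps -ef` style output is shown here; the function name `count_defunct_by_parent` is hypothetical, and the sample lines are copied from the listing above.

```python
from collections import Counter


def count_defunct_by_parent(ps_output: str) -> Counter:
    """Count <defunct> (zombie) processes per parent PID in `ps -ef` style output."""
    counts = Counter()
    for line in ps_output.splitlines():
        fields = line.split()
        # ps -ef columns: UID PID PPID C STIME TTY TIME CMD...
        if len(fields) >= 8 and fields[-1] == "<defunct>" and fields[2].isdigit():
            counts[int(fields[2])] += 1
    return counts


SAMPLE = """\
nagios 15588 15049 0 Sep24 ? 00:00:00 [check_nrpe] <defunct>
nagios 15590 15050 0 Sep24 ? 00:00:00 [check_nrpe] <defunct>
nagios 15594 15049 0 Sep24 ? 00:00:00 [check_nrpe] <defunct>
nagios 15051 15048 11 Sep24 ? 01:50:10 /usr/local/nagios/bin/nagios --worker /usr/local/nagios/var/rw/nagios.qh
"""

if __name__ == "__main__":
    for ppid, n in count_defunct_by_parent(SAMPLE).most_common():
        print(f"parent {ppid}: {n} defunct child process(es)")
```

Feeding it the full output of `ps -ef | grep nagios` at intervals would show whether the zombie count under each worker keeps growing before a crash.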
-
osk.dthompson
- Posts: 16
- Joined: Tue Oct 18, 2016 2:58 pm
Re: Update to 5.5.3 Crashes
We just updated to 5.5.4 to see if that would help. You had mentioned earlier you wanted to see our config--how do I provide that?
Thanks,
-
npolovenko
- Support Tech
- Posts: 3457
- Joined: Mon May 15, 2017 5:00 pm
Re: Update to 5.5.3 Crashes
@osk.dthompson, do you mean the system profile? To send us your system profile, log in to the Nagios XI GUI using a web browser.
Click the "Admin" > "System Profile" menu.
Click the "Download Profile" button.
Save the profile.zip file and upload it to a cloud storage service of your choice. Then share a link with me in a personal message.
After you send the profile, please post something in this thread to bring it back up in the support queue.
-
osk.dthompson
- Posts: 16
- Joined: Tue Oct 18, 2016 2:58 pm
Re: Update to 5.5.3 Crashes
We have a company policy of not publishing detailed information publicly. Is there another way I can get this information to you (email, sftp, something)?
Thank you,
-
npolovenko
- Support Tech
- Posts: 3457
- Joined: Mon May 15, 2017 5:00 pm
Re: Update to 5.5.3 Crashes
@osk.dthompson, You can send it as an attachment via private message. Other users will not be able to see it.
Otherwise, you can open a support ticket and upload the file in there.
https://support.nagios.com/tickets
-
osk.dthompson
- Posts: 16
- Joined: Tue Oct 18, 2016 2:58 pm
Re: Update to 5.5.3 Crashes
I've created a support ticket. I think the number is #355825. I've also uploaded our Nagios profile.
Thank you,