
Re: httpd stops every hour

Posted: Fri Nov 19, 2021 10:06 am
by gsmith
Hi

Please use [email protected]

Thanks

Re: httpd stops every hour

Posted: Mon Nov 22, 2021 3:48 am
by btsmnagios
Password sent.

Many thanks for your patience.

Re: httpd stops every hour

Posted: Tue Nov 23, 2021 9:57 am
by gsmith
Hi

Thanks - I am able to open the messages file.

Will let you know when I find something.

- G

Re: httpd stops every hour

Posted: Tue Nov 23, 2021 11:47 am
by gsmith
Hi

We suspect something is going on with the database. The DBMaint script runs hourly.

To test this, please go to Admin > Performance Settings and open the Databases tab. Scroll to the bottom and,
in the "CCM Database" section, change the "Optimize Interval" from 60 to 30.

Now wait 90 to 120 minutes to see if httpd restarts every 30 minutes. If it does, then we need to look
for corrupted/damaged db tables.

Let me know what the results are. You can change the "Optimize Interval" back to 60 after this test.
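For anyone repeating this test, a minimal sketch of how the spikes could be logged so they can be lined up against the Optimize Interval. The function name `sample` and the log path in the note below are hypothetical, and the process count is taken from `/proc`, which assumes a Linux host.

```shell
#!/bin/sh
# Print one sample line: timestamp, total process count, httpd state.
# "sample" is a hypothetical helper name, not part of Nagios XI.
sample() {
    ts=$(date '+%Y-%m-%d %H:%M:%S')
    # Count numeric entries under /proc (one per running process on Linux).
    procs=$(ls -d /proc/[0-9]* 2>/dev/null | wc -l | tr -d ' ')
    # is-active prints e.g. "active" or "inactive"; fall back to "unknown"
    # if systemctl is missing or prints nothing.
    state=$(systemctl is-active httpd 2>/dev/null) || state=${state:-unknown}
    printf '%s procs=%s httpd=%s\n' "$ts" "$procs" "$state"
}

sample
```

Left running as `while sample >> /tmp/httpd_watch.log; do sleep 60; done`, this should give one line per minute, so the timestamps of any process spike or httpd outage can be compared with the configured interval.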

Thanks

Re: httpd stops every hour

Posted: Tue Nov 23, 2021 2:26 pm
by btsmnagios
Hi,

There was a suspicion this may be db related, as we had to perform a db repair twice after our last XI upgrade.

No difference, except that the first time after the Optimize Interval was changed to 30 we saw the number of processes spike to 950 instead of the usual 800ish. The interface came back up after about 5 minutes, then the same behaviour repeated about an hour later with processes spiking to 800.
To check overnight, I have increased the Optimize Interval from 60 to 90 minutes for the XI database and to 75 minutes for the NDO database to see if anything changes, and will report back in the morning.

Re: httpd stops every hour

Posted: Tue Nov 23, 2021 3:06 pm
by gsmith
Ok - let me know what happens.

Thank you

Re: httpd stops every hour

Posted: Wed Nov 24, 2021 2:22 am
by btsmnagios
Hi,

It appears to be the XI database causing the spike in running processes. After increasing the interval to 90 minutes, we see the spikes happening after 90 minutes, with the web interface becoming unreachable at the same times.

Re: httpd stops every hour

Posted: Wed Nov 24, 2021 10:11 am
by gsmith
Hi

Let's try to stop everything and do a database repair and restart. Please log in as root and run the following:

Code:

systemctl stop npcd
systemctl stop nagios
systemctl stop ndo2db
systemctl stop crond
pkill -9 -u nagios
echo "truncate table xi_events; truncate table xi_meta; truncate table xi_eventqueue;" | mysql -u root -pnagiosxi nagiosxi
mysqlcheck -f -r -u root -pnagiosxi --all-databases --use-frm
if grep --quiet pgsql /usr/local/nagiosxi/html/config.inc.php; then systemctl stop postgresql; fi;
systemctl restart mariadb
rm -f /usr/local/nagios/var/rw/nagios.cmd
rm -f /usr/local/nagios/var/nagios.lock
rm -f /var/run/nagios.lock
rm -f /usr/local/nagios/var/ndo.sock
rm -f /usr/local/nagios/var/ndo2db.lock
rm -f /var/lib/mrtg/mrtg_l
rm -f /usr/local/nagiosxi/var/*.lock
rm -f /usr/local/nagiosxi/tmp/*.lock
for i in `ipcs -q | grep nagios |awk '{print $2}'`; do ipcrm -q $i; done
pkill python
if grep --quiet pgsql /usr/local/nagiosxi/html/config.inc.php; then service postgresql start; fi;
systemctl restart httpd
systemctl start ndo2db
systemctl start nagios
systemctl start npcd
systemctl start crond

I realize that you will probably have to do this tonight or over the weekend. Let me know what happens.
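Once the sequence above has finished, a quick way to confirm everything came back is to ask systemd for the state of each service it touched. This is only a sketch: `check_services` is a hypothetical helper name, and the service list is copied from the commands above.

```shell
#!/bin/sh
# Report the state of each service that the repair sequence stops and starts.
# "check_services" is a hypothetical helper name, not part of Nagios XI.
check_services() {
    for svc in mariadb httpd ndo2db nagios npcd crond; do
        # is-active prints e.g. "active" or "failed"; fall back to "unknown"
        # if systemctl is missing or prints nothing.
        state=$(systemctl is-active "$svc" 2>/dev/null) || state=${state:-unknown}
        printf '%-8s %s\n' "$svc" "$state"
    done
}

check_services
```

Any service not reported as `active` is worth checking with `systemctl status` before concluding the repair did or did not help.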

Thanks

Re: httpd stops every hour

Posted: Wed Dec 01, 2021 8:32 am
by btsmnagios
Hi,

I ran the commands this afternoon and initially all seemed OK; however, we've now reverted to the spike in processes and the UI drops out again.

Re: httpd stops every hour

Posted: Wed Dec 01, 2021 8:41 am
by btsmnagios
Further to this, it actually appears to be worse now: we're losing the UI every few minutes.