65 minute hang up

Posted: Thu Sep 03, 2020 9:35 am
by Elcom
Hello,

Every 65 minutes our XI server stops servicing requests for approximately 10 minutes. The hang used to be shorter than 10 minutes, but we are still doing initial setup, and as we grow the hang has grown as well. Looking at top, we can see that mysqld usage is pretty high.

We believe this is caused by this cron job:
*/5 * * * * nagios /usr/bin/php -q /usr/local/nagiosxi/cron/dbmaint.php >> /usr/local/nagiosxi/var/dbmaint.log 2>&1

I am told one of the things this does is check and run a DB optimization if it has not run in the last hour.
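
A rough sketch of why that would give a 65 minute cycle rather than 60. This is only a simulation under assumptions, not the actual dbmaint.php logic: it assumes the check is "more than 3600 seconds since the last optimization finished", and the function name simulate_runs and its arguments are made up for illustration.

```shell
#!/bin/sh
# Hypothetical simulation of a cron job that fires every 5 minutes and
# only optimizes when more than an hour has passed since the last run.
# The threshold and the "strictly greater than" check are assumptions.

# simulate_runs DURATION END
#   DURATION: assumed seconds the optimization takes (timestamp saved at finish)
#   END:      last second to simulate
simulate_runs() {
    tick=300          # cron interval from the */5 schedule, in seconds
    threshold=3600    # assumed "has not run in the last hour" check
    duration=$1
    end=$2
    last_done=$(( 0 - threshold - 1 ))   # force a run on the first tick
    t=0
    while [ "$t" -le "$end" ]; do
        if [ $(( t - last_done )) -gt "$threshold" ]; then
            printf '%s\n' "$t"                # an optimization starts at this tick
            last_done=$(( t + duration ))     # timestamp recorded when it finishes
        fi
        t=$(( t + tick ))
    done
}

# Prints 0, 3900 and 7800: one run every 3900 seconds (65 minutes).
simulate_runs 60 8000
```

At the 60 minute tick the hour has not quite elapsed, so the optimization is pushed to the next 5 minute tick, which would match the 65 minute spacing we see.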

We are running the downloaded XI VM with 4 vCPUs and 16 GB RAM.
Currently ~150 nodes with ~1860 services.

The virtual disk is very busy during the hang.

Any help would be appreciated.
Thanks

Re: 65 minute hang up

Posted: Fri Sep 04, 2020 5:10 am
by Elcom
Bump... Could use some help on this.

Re: 65 minute hang up

Posted: Fri Sep 04, 2020 10:07 am
by benjaminsmith
Hi @Elcom,

There might be some database issues if the maintenance job is resulting in services not responding. Please send over the system profile and I can check the logs for you.

To send us your system profile:
Log in to the Nagios XI GUI using a web browser.
Click the "Admin" > "System Profile" menu.
Click the "Download Profile" button.
Save the profile.zip file and share it in a private message or upload it to the post/ticket, and then reply to this post to bring it up in the queue.

Also, let's get the query output showing the size of the DB tables. Thanks, Benjamin

echo "SELECT table_name AS 'Table', round(((data_length + index_length) / 1024 / 1024), 2) 'Size in MB' FROM information_schema.TABLES WHERE table_schema IN ('nagios', 'nagiosql', 'nagiosxi');" | mysql -h 127.0.0.1 -uroot -pnagiosxi --table