Fusion database lock issue

This support forum board is for questions relating to Nagios Fusion.
hbouma
Posts: 483
Joined: Tue Feb 27, 2018 9:31 am

Fusion database lock issue

Post by hbouma »

Overnight, our Fusion database got into a locked state and required a restart to recover. For reference, we have had multiple issues with Fusion database stability in the past (https://support.nagios.com/forum/viewto ... 17&t=49945 and https://support.nagios.com/forum/viewto ... 17&t=50159). I am attaching the mariadb.log file in the hope that someone can help us diagnose the issue and prevent it from happening again.

We are running Fusion 4.1.5 on Red Hat 7 64-bit VMs with 8 cores and 16GB of RAM.

my.cnf file is as follows:

Code: Select all

[mysqld]
max_connections=818
datadir=/var/lib/mysql
socket=/var/lib/mysql/mysql.sock
# Disabling symbolic-links is recommended to prevent assorted security risks
symbolic-links=0
# Settings user and group are ignored when systemd is used.
# If you need to run mysqld under a different user or group,
# customize your systemd unit file for mariadb according to the
# instructions in http://fedoraproject.org/wiki/Systemd

thread_cache_size = 16

# Added by Nagios 2018-08-27 09:20:12 -0400 EDT
max_allowed_packet = 32M

# Added by Nagios 2018-08-27 09:20:12 -0400 EDT
query_cache_size = 6M

# Added by Nagios 2018-08-27 09:20:12 -0400 EDT
query_cache_limit = 4M

# Added by Nagios 2018-08-27 09:20:12 -0400 EDT
tmp_table_size = 64M

# Added by Nagios 2018-08-27 09:20:12 -0400 EDT
max_heap_table_size = 64M

# Added by Nagios 2018-08-27 09:20:12 -0400 EDT
key_buffer_size = 32M

# Added by Nagios 2018-08-27 09:20:12 -0400 EDT
table_open_cache = 32

# Added by Nagios 2018-08-27 09:20:12 -0400 EDT
innodb_file_per_table = 1

# Added by Nagios 2018-08-27 09:20:12 -0400 EDT
innodb_log_buffer_size = 32M

# Added by Nagios 2018-08-27 09:20:12 -0400 EDT
innodb_buffer_pool_size = 6G

# Added by Nagios 2018-08-27 09:20:12 -0400 EDT
innodb_log_file_size = 256M

[mysqld_safe]
log-error=/var/log/mariadb/mariadb.log
pid-file=/var/run/mariadb/mariadb.pid

#
# include all files from the config directory
#
!includedir /etc/my.cnf.d
ssax
Dreams In Code
Posts: 7682
Joined: Wed Feb 11, 2015 12:54 pm

Re: Fusion database lock issue

Post by ssax »

After reviewing several write-ups of similar lockups, it seems this may be related to innodb_adaptive_hash_index being enabled.

Please try setting innodb_adaptive_hash_index=0 in the [mysqld] section of your /etc/my.cnf, then restart the mariadb service:

Code: Select all

service mariadb restart
https://stackoverflow.com/a/24910831
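
Before restarting, it can help to confirm that the edit actually landed in the file. The following is a minimal sketch, not an official Nagios tool: cnf_value is a hypothetical helper name, and it assumes simple "key = value" lines. It ignores comment lines, does not track sections, and does not follow the !includedir directive.

```shell
# cnf_value FILE KEY: print the last value assigned to KEY in FILE.
# Prints an empty line if KEY is not set. Hypothetical helper for illustration.
cnf_value() {
    awk -F= -v key="$2" '
        /^[ \t]*#/ { next }                      # skip comment lines
        {
            k = $1; gsub(/[ \t]/, "", k)         # strip whitespace around the key
            if (k == key) { v = $2; gsub(/[ \t]/, "", v); val = v }
        }
        END { print val }
    ' "$1"
}
```

For example, `cnf_value /etc/my.cnf innodb_adaptive_hash_index` should print 0 after the change. Once the service is back up, the running value can be checked with `mysql -e "SHOW GLOBAL VARIABLES LIKE 'innodb_adaptive_hash_index';"`.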
ssax
Dreams In Code
Posts: 7682
Joined: Wed Feb 11, 2015 12:54 pm

Re: Fusion database lock issue

Post by ssax »

In addition, are you seeing any semaphore errors in /var/log/messages?
hbouma
Posts: 483
Joined: Tue Feb 27, 2018 9:31 am

Re: Fusion database lock issue

Post by hbouma »

I don't see anything about semaphores, but I do see that we apparently ran out of swap space, which is odd because we have 16GB of RAM and 8GB of swap, and we usually run with about 8GB of the RAM free.

Code: Select all

KiB Mem : 16266712 total,  7960080 free,  1762268 used,  6544364 buff/cache
KiB Swap:  8388600 total,  7880432 free,   508168 used. 14066716 avail Mem
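
To narrow down when memory ran out, one option is to filter the system log for kernel out-of-memory messages. This is a rough sketch: oom_events is a hypothetical helper name, and the patterns are the phrases the Linux kernel typically logs around OOM-killer activity, so they are an assumption about what this server's log contains.

```shell
# oom_events FILE...: print log lines that look like kernel OOM-killer activity.
# The patterns are typical kernel wording, not taken from this server's logs.
oom_events() {
    grep -Ei 'invoked oom-killer|out of memory|killed process' "$@"
}
```

Running `oom_events /var/log/messages` as root prints any matching lines; no output simply means none of these phrases appear in that file.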
ssax
Dreams In Code
Posts: 7682
Joined: Wed Feb 11, 2015 12:54 pm

Re: Fusion database lock issue

Post by ssax »

With the DB lock preventing queries from finishing, I could see poll_subsys.php spinning up multiple copies because the earlier ones never completed. Let us know if setting innodb_adaptive_hash_index=0 in your /etc/my.cnf resolves your issue.

Thank you
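
One way to watch for the pile-up described above is to count how many copies of the script are running at a given moment. This is only a sketch: count_copies is a hypothetical helper name, and it expects one process command line per input line, such as the output of `ps -eo args=`.

```shell
# count_copies PATTERN: count stdin lines (one command line per process)
# that contain the fixed string PATTERN. Hypothetical helper for illustration.
count_copies() {
    # grep -c exits non-zero when the count is 0, so mask that exit status.
    grep -cF -- "$1" || true
}
```

For example, `ps -eo args= | count_copies poll_subsys.php` prints the number of running copies. A count that keeps growing between checks would be consistent with runs that never finish.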
hbouma
Posts: 483
Joined: Tue Feb 27, 2018 9:31 am

Re: Fusion database lock issue

Post by hbouma »

Unfortunately, the issue occurred again this weekend, with the same symptom of running out of memory.
tgriep
Madmin
Posts: 9177
Joined: Thu Oct 30, 2014 9:02 am

Re: Fusion database lock issue

Post by tgriep »

Do you know which process was using up the memory?
Can you get the /var/log/messages file from when it was failing and post that here so we can check it for any errors?
Also, get this file from the Fusion server and post it so we can check its settings.

Code: Select all

/etc/sysctl.conf
Thanks
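
If the problem recurs, the question of which process is using the memory can be answered roughly by totalling resident memory per command name. This is a sketch under assumptions: rss_by_command is a hypothetical helper name, and it expects "PID RSS COMMAND" lines with RSS in KiB and no spaces in the command name.

```shell
# rss_by_command: read "PID RSS_KiB COMMAND" lines on stdin and print
# "COMMAND TOTAL_KiB", largest total first. Hypothetical helper for illustration.
rss_by_command() {
    awk '{ total[$3] += $2 }
         END { for (c in total) printf "%s %d\n", c, total[c] }' |
        sort -k2,2nr
}
```

A typical invocation would be `ps -eo pid=,rss=,comm= | rss_by_command | head`. Shared pages are counted once per process, so the totals overstate real usage somewhat, but the ranking is usually enough to spot a runaway process.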
Be sure to check out our Knowledgebase for helpful articles and solutions!
hbouma
Posts: 483
Joined: Tue Feb 27, 2018 9:31 am

Re: Fusion database lock issue

Post by hbouma »

PM sent with logs.
tgriep
Madmin
Posts: 9177
Joined: Thu Oct 30, 2014 9:02 am

Re: Fusion database lock issue

Post by tgriep »

It looks like the poll_subsys.php script had some sort of problem and kept launching additional copies until they used up all of the memory, so I will need to see the following files from yesterday.

Code: Select all

/var/log/cron
/usr/local/nagiosfusion/var/log/poll_subsys.log
Can you run the following as root and post the /tmp/info.txt file?

Code: Select all

echo 'select * from servers;' |mysql -t -u fusion -pfusion fusion >/tmp/info.txt
echo 'select * from sysstat;' |mysql -t -u fusion -pfusion fusion >>/tmp/info.txt
echo 'select * from polled_averages;' |mysql -t -u fusion -pfusion fusion >>/tmp/info.txt
echo 'select * from polled_deltas;' |mysql -t -u fusion -pfusion fusion >>/tmp/info.txt
echo 'select * from polling_lock;' |mysql -t -u fusion -pfusion fusion >>/tmp/info.txt
echo 'select * from options;' |mysql -t -u fusion -pfusion fusion >>/tmp/info.txt
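
The six commands above can also be written as a loop, which makes it harder to mix up > and >> when copying them. This is just a sketch: fusion_queries is a hypothetical helper name, and the table names are the ones from the commands above.

```shell
# fusion_queries: print one SELECT statement per table requested above.
# Hypothetical helper for illustration; it only generates SQL text.
fusion_queries() {
    for t in servers sysstat polled_averages polled_deltas polling_lock options; do
        printf 'select * from %s;\n' "$t"
    done
}
```

Piping the output through the same client invocation, `fusion_queries | mysql -t -u fusion -pfusion fusion > /tmp/info.txt`, should produce the same file with a single redirect.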
tgriep
Madmin
Posts: 9177
Joined: Thu Oct 30, 2014 9:02 am

Re: Fusion database lock issue

Post by tgriep »

The cron file you sent only covers today, so you will have to get one of the older archived .gz files and post it here.

Also, get this file. If any of these files do not show any data for today, get the archived copy instead.

Code: Select all

/usr/local/nagiosfusion/var/log/dberrors.log
Any other log file from this folder that covers yesterday would help as well.

Code: Select all

/usr/local/nagiosfusion/var/log