Nagios XI 5.5.11 - Issues

msmulpuri
Posts: 27
Joined: Thu Sep 22, 2016 7:40 am

Post by msmulpuri »

Hi,

We have a VM instance of Nagios XI 5.5.11 installed and running at a customer site. We are seeing all kinds of issues, as this instance monitors over 2300 hosts and over 1100 services. The monitoring also includes SNMP traps and polling. The SNMP traps keep accumulating and get stuck, clearing only slowly; a very high number of traps is being generated. We also see the output below in dbmaint.log, even after running the suggested repair. The ibdata1 file (MariaDB) is huge and keeps growing. The VM's performance degrades continuously due to CPU spikes and heavy memory and swap usage. I have attached the System Profile to this post in case it is needed. Please help.

Configuration of the VM:
CentOS 7
Nagios XI 5.5.11
Memory allocated: 24 GB
Swap space: 16 GB

dbmaint.log output:
LOCKFILE '/usr/local/nagiosxi/var/dbmaint.lock' EXISTS - EXITING!
LOCKFILE '/usr/local/nagiosxi/var/dbmaint.lock' EXISTS - EXITING!
LOCKFILE '/usr/local/nagiosxi/var/dbmaint.lock' EXISTS - EXITING!
LOCKFILE '/usr/local/nagiosxi/var/dbmaint.lock' IS OLD - REMOVING
CREATING: /usr/local/nagiosxi/var/dbmaint.lock
<h3>Database Error</h3>A database connection error has been detected, please follow the repair prompt below. If the issue persists, please contact Nagios support.<p>Run the following from the CLI as root to attempt to repair the DB:<br><pre>/usr/local/nagiosxi/scripts/repair_databases.sh</pre></p>LOCKFILE '/usr/local/nagiosxi/var/dbmaint.lock' EXISTS - EXITING!
LOCKFILE '/usr/local/nagiosxi/var/dbmaint.lock' EXISTS - EXITING!
LOCKFILE '/usr/local/nagiosxi/var/dbmaint.lock' EXISTS - EXITING!
LOCKFILE '/usr/local/nagiosxi/var/dbmaint.lock' EXISTS - EXITING!
LOCKFILE '/usr/local/nagiosxi/var/dbmaint.lock' EXISTS - EXITING!
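
A note on the lock-file messages above: the log shows dbmaint exiting while the lock file exists and removing it only once it is considered old. Below is a minimal sketch for checking the lock's age by hand, assuming GNU stat (as shipped with CentOS 7). The function name and the 3600-second threshold are hypothetical and are not taken from Nagios XI itself.

```shell
# Return 0 if the lock file is older than the given number of seconds,
# 1 if it is newer, and 2 if it does not exist.
# NOTE: the name lock_is_stale and any threshold passed in are illustrative.
lock_is_stale() {
    lock_file=$1
    max_age_seconds=$2
    [ -e "$lock_file" ] || return 2          # no lock file present
    now=$(date +%s)
    modified=$(stat -c %Y "$lock_file")      # GNU stat: mtime as epoch seconds
    age=$((now - modified))
    [ "$age" -gt "$max_age_seconds" ]
}

# Example (path from the log above; 3600 is an assumed threshold):
# lock_is_stale /usr/local/nagiosxi/var/dbmaint.lock 3600 && echo "lock looks stale"
```

A lock that stays in place across many runs, as in the log, may point to a maintenance job that never finishes rather than to a leftover file.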

Support Edit: profile.zip has been downloaded and shared with the team.
benjaminsmith
Posts: 5324
Joined: Wed Aug 22, 2018 4:39 pm
Location: saint paul

Re: Nagios XI 5.5.11 - Issues

Post by benjaminsmith »

Hello @msmulpuri,
<h3>Database Error</h3>A database connection error has been detected, please follow the repair prompt below. If the issue persists, please contact Nagios support.<p>Run the following from the CLI as root to attempt to repair the DB:<br>
Besides the error above, there is also a fatal PHP database call in the Apache log, and this is most likely causing the CPU/performance issues.

Run through the following commands to stop the processes, clear the message queue, repair the database and restart.

Code: Select all

systemctl stop crond
systemctl stop npcd
systemctl stop nagios
systemctl stop ndo2db
pkill -9 -u nagios
for i in $(ipcs -q | grep nagios |awk '{print $2}'); do ipcrm -q $i; done
rm -rf /usr/local/nagiosxi/var/dbmaint.lock
rm -rf /usr/local/nagiosxi/var/event_handler.lock
rm -rf /usr/local/nagiosxi/scripts/reconfigure_nagios.lock
systemctl restart mysqld || systemctl restart mariadb
cd /usr/local/nagiosxi/scripts
./repair_databases.sh
systemctl start npcd
systemctl start crond
systemctl start nagios
systemctl start ndo2db

After running the above commands, please send over a fresh system profile along with the database configuration file (/etc/my.cnf), and post the full output of the following commands:

Check for Corrupted Tables

Code: Select all

echo "SELECT table_name AS 'Table', round(((data_length + index_length) / 1024 / 1024), 2) 'Size in MB' FROM information_schema.TABLES WHERE table_schema IN ('nagios', 'nagiosql', 'nagiosxi');" | mysql -h 127.0.0.1 -uroot -pnagiosxi --table | grep NULL

Check Table Sizes

Code: Select all

echo "SELECT table_name AS 'Table', round(((data_length + index_length) / 1024 / 1024), 2) 'Size in MB' FROM information_schema.TABLES WHERE table_schema IN ('nagios', 'nagiosql', 'nagiosxi');" | mysql -uroot -pnagiosxi --table
Thanks.
As of May 25th, 2018, all communications with Nagios Enterprises and its employees are covered under our new Privacy Policy.

Be sure to check out our Knowledgebase for helpful articles and solutions!

Re: Nagios XI 5.5.11 - Issues

Post by msmulpuri »

Hello,

First of all, thank you for your quick response. Please find below the output of the steps you asked me to follow.

1. Database repair script yielded the below result.
===============
REPAIR COMPLETE
===============
DATABASE: nagiosql
TABLE:
/var/lib/mysql/nagiosql /usr/local/nagiosxi/var
No *.MYI files found, skipping nagiosql...
DATABASE: nagiosxi
TABLE:
/var/lib/mysql/nagiosxi /usr/local/nagiosxi/var
No *.MYI files found, skipping nagiosxi...

=======================
nagios database repair succeeded
nagiosql database repair skipped, no *.MYI files found
nagiosxi database repair skipped, no *.MYI files found

2. echo "SELECT table_name AS 'Table', round(((data_length + index_length) / 1024 / 1024), 2) 'Size in MB' FROM information_schema.TABLES WHERE table_schema IN ('nagios', 'nagiosql', 'nagiosxi');" | mysql -h 127.0.0.1 -uroot -pnagiosxi --table | grep NULL
-No output returned

3. echo "SELECT table_name AS 'Table', round(((data_length + index_length) / 1024 / 1024), 2) 'Size in MB' FROM information_schema.TABLES WHERE table_schema IN ('nagios', 'nagiosql', 'nagiosxi');" | mysql -uroot -pnagiosxi --table
+--------------------------------------------+------------+
| Table | Size in MB |
+--------------------------------------------+------------+
| nagios_acknowledgements | 0.00 |
| nagios_commands | 0.02 |
| nagios_commenthistory | 0.12 |
| nagios_comments | 0.00 |
| nagios_configfiles | 0.00 |
| nagios_configfilevariables | 0.01 |
| nagios_conninfo | 0.01 |
| nagios_contact_addresses | 0.00 |
| nagios_contact_notificationcommands | 0.01 |
| nagios_contactgroup_members | 0.00 |
| nagios_contactgroups | 0.00 |
| nagios_contactnotificationmethods | 282.09 |
| nagios_contactnotifications | 298.80 |
| nagios_contacts | 0.00 |
| nagios_contactstatus | 0.00 |
| nagios_customvariables | 0.23 |
| nagios_customvariablestatus | 0.23 |
| nagios_dbversion | 0.00 |
| nagios_downtimehistory | 0.00 |
| nagios_eventhandlers | 324.51 |
| nagios_externalcommands | 706.02 |
| nagios_flappinghistory | 0.05 |
| nagios_host_contactgroups | 0.00 |
| nagios_host_contacts | 0.19 |
| nagios_host_parenthosts | 0.00 |
| nagios_hostchecks | 0.00 |
| nagios_hostdependencies | 0.00 |
| nagios_hostescalation_contactgroups | 0.00 |
| nagios_hostescalation_contacts | 0.00 |
| nagios_hostescalations | 0.00 |
| nagios_hostgroup_members | 0.10 |
| nagios_hostgroups | 0.00 |
| nagios_hosts | 0.55 |
| nagios_hoststatus | 1.28 |
| nagios_instances | 0.00 |
| nagios_logentries | 2440.71 |
| nagios_notifications | 412.47 |
| nagios_objects | 0.32 |
| nagios_processevents | 0.03 |
| nagios_programstatus | 0.00 |
| nagios_runtimevariables | 0.00 |
| nagios_scheduleddowntime | 0.00 |
| nagios_service_contactgroups | 0.01 |
| nagios_service_contacts | 0.19 |
| nagios_service_parentservices | 0.00 |
| nagios_servicechecks | 0.00 |
| nagios_servicedependencies | 0.00 |
| nagios_serviceescalation_contactgroups | 0.00 |
| nagios_serviceescalation_contacts | 0.00 |
| nagios_serviceescalations | 0.00 |
| nagios_servicegroup_members | 0.04 |
| nagios_servicegroups | 0.00 |
| nagios_services | 0.32 |
| nagios_servicestatus | 0.73 |
| nagios_statehistory | 91.80 |
| nagios_systemcommands | 1.46 |
| nagios_timedeventqueue | 0.00 |
| nagios_timedevents | 0.00 |
| nagios_timeperiod_timeranges | 0.01 |
| nagios_timeperiods | 0.00 |
| tbl_command | 0.06 |
| tbl_contact | 0.03 |
| tbl_contactgroup | 0.03 |
| tbl_contacttemplate | 0.03 |
| tbl_domain | 0.03 |
| tbl_host | 0.48 |
| tbl_hostdependency | 0.03 |
| tbl_hostescalation | 0.03 |
| tbl_hostextinfo | 0.03 |
| tbl_hostgroup | 0.03 |
| tbl_hosttemplate | 0.03 |
| tbl_info | 0.17 |
| tbl_lnkContactToCommandHost | 0.02 |
| tbl_lnkContactToCommandService | 0.02 |
| tbl_lnkContactToContactgroup | 0.02 |
| tbl_lnkContactToContacttemplate | 0.02 |
| tbl_lnkContactToVariabledefinition | 0.02 |
| tbl_lnkContactgroupToContact | 0.02 |
| tbl_lnkContactgroupToContactgroup | 0.02 |
| tbl_lnkContacttemplateToCommandHost | 0.02 |
| tbl_lnkContacttemplateToCommandService | 0.02 |
| tbl_lnkContacttemplateToContactgroup | 0.02 |
| tbl_lnkContacttemplateToContacttemplate | 0.02 |
| tbl_lnkContacttemplateToVariabledefinition | 0.02 |
| tbl_lnkHostToContact | 0.16 |
| tbl_lnkHostToContactgroup | 0.02 |
| tbl_lnkHostToHost | 0.02 |
| tbl_lnkHostToHostgroup | 0.02 |
| tbl_lnkHostToHosttemplate | 0.11 |
| tbl_lnkHostToVariabledefinition | 0.09 |
| tbl_lnkHostdependencyToHost_DH | 0.02 |
| tbl_lnkHostdependencyToHost_H | 0.02 |
| tbl_lnkHostdependencyToHostgroup_DH | 0.02 |
| tbl_lnkHostdependencyToHostgroup_H | 0.02 |
| tbl_lnkHostescalationToContact | 0.02 |
| tbl_lnkHostescalationToContactgroup | 0.02 |
| tbl_lnkHostescalationToHost | 0.02 |
| tbl_lnkHostescalationToHostgroup | 0.02 |
| tbl_lnkHostgroupToHost | 0.09 |
| tbl_lnkHostgroupToHostgroup | 0.02 |
| tbl_lnkHosttemplateToContact | 0.02 |
| tbl_lnkHosttemplateToContactgroup | 0.02 |
| tbl_lnkHosttemplateToHost | 0.02 |
| tbl_lnkHosttemplateToHostgroup | 0.02 |
| tbl_lnkHosttemplateToHosttemplate | 0.02 |
| tbl_lnkHosttemplateToVariabledefinition | 0.02 |
| tbl_lnkServiceToContact | 0.23 |
| tbl_lnkServiceToContactgroup | 0.02 |
| tbl_lnkServiceToHost | 0.06 |
| tbl_lnkServiceToHostgroup | 0.02 |
| tbl_lnkServiceToServicegroup | 0.02 |
| tbl_lnkServiceToServicetemplate | 0.06 |
| tbl_lnkServiceToVariabledefinition | 0.05 |
| tbl_lnkServicedependencyToHost_DH | 0.02 |
| tbl_lnkServicedependencyToHost_H | 0.02 |
| tbl_lnkServicedependencyToHostgroup_DH | 0.02 |
| tbl_lnkServicedependencyToHostgroup_H | 0.02 |
| tbl_lnkServicedependencyToService_DS | 0.02 |
| tbl_lnkServicedependencyToService_S | 0.02 |
| tbl_lnkServiceescalationToContact | 0.02 |
| tbl_lnkServiceescalationToContactgroup | 0.02 |
| tbl_lnkServiceescalationToHost | 0.02 |
| tbl_lnkServiceescalationToHostgroup | 0.02 |
| tbl_lnkServiceescalationToService | 0.02 |
| tbl_lnkServicegroupToService | 0.05 |
| tbl_lnkServicegroupToServicegroup | 0.02 |
| tbl_lnkServicetemplateToContact | 0.02 |
| tbl_lnkServicetemplateToContactgroup | 0.02 |
| tbl_lnkServicetemplateToHost | 0.02 |
| tbl_lnkServicetemplateToHostgroup | 0.02 |
| tbl_lnkServicetemplateToServicegroup | 0.02 |
| tbl_lnkServicetemplateToServicetemplate | 0.02 |
| tbl_lnkServicetemplateToVariabledefinition | 0.02 |
| tbl_lnkTimeperiodToTimeperiod | 0.02 |
| tbl_logbook | 0.02 |
| tbl_mainmenu | 0.02 |
| tbl_permission | 0.02 |
| tbl_permission_inactive | 0.02 |
| tbl_service | 0.30 |
| tbl_servicedependency | 0.03 |
| tbl_serviceescalation | 0.03 |
| tbl_serviceextinfo | 0.03 |
| tbl_servicegroup | 0.03 |
| tbl_servicetemplate | 0.03 |
| tbl_session | 0.02 |
| tbl_session_locks | 0.02 |
| tbl_settings | 0.03 |
| tbl_submenu | 0.02 |
| tbl_timedefinition | 0.02 |
| tbl_timeperiod | 0.03 |
| tbl_user | 0.03 |
| tbl_variabledefinition | 0.19 |
| xi_auditlog | 2.06 |
| xi_auth_tokens | 1.03 |
| xi_cmp_trapdata | 0.03 |
| xi_cmp_trapdata_log | 0.03 |
| xi_commands | 0.02 |
| xi_eventqueue | 2348.66 |
| xi_events | 3966.00 |
| xi_incidents | 0.02 |
| xi_meta | 149175.50 |
| xi_options | 0.06 |
| xi_sessions | 0.03 |
| xi_sysstat | 0.03 |
| xi_usermeta | 0.17 |
| xi_users | 0.03 |
+--------------------------------------------+------------+
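
The largest tables are easy to miss in a listing this long. Here is a small sketch that ranks the --table output by size, assuming it is piped in on stdin. The function name largest_tables is made up for this example.

```shell
# Print the N largest tables from `mysql --table` output read on stdin.
# Output format: "<size in MB> <table name>", largest first.
# NOTE: largest_tables is an illustrative helper, not part of Nagios XI.
largest_tables() {
    top_n=${1:-5}
    awk -F'|' '
        NF >= 4 {
            name = $2; size = $3
            gsub(/^[ \t]+|[ \t]+$/, "", name)   # trim padding around the name
            gsub(/^[ \t]+|[ \t]+$/, "", size)   # trim padding around the size
            # Skip the header row and borders: keep rows with a numeric size.
            if (size ~ /^[0-9]+(\.[0-9]+)?$/) print size, name
        }' | LC_ALL=C sort -rn | head -n "$top_n"
}

# Usage: append "| largest_tables 5" to the table-size command from step 3.
```

Applied to the listing above, this puts xi_meta (149175.50 MB) first, followed by xi_events (3966.00 MB) and nagios_logentries (2440.71 MB).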

4. Below is the content of /etc/my.cnf file
[mysqld]
datadir=/var/lib/mysql
socket=/var/lib/mysql/mysql.sock
# Disabling symbolic-links is recommended to prevent assorted security risks
symbolic-links=0
# Settings user and group are ignored when systemd is used.
# If you need to run mysqld under a different user or group,
# customize your systemd unit file for mariadb according to the
# instructions in http://fedoraproject.org/wiki/Systemd

[mysqld_safe]
log-error=/var/log/mariadb/mariadb.log
pid-file=/var/run/mariadb/mariadb.pid

#
# include all files from the config directory
#
!includedir /etc/my.cnf.d

5. Please find attached a fresh system profile taken after the above steps were followed.

Once again, I really appreciate your help in resolving the issue. I have also listed the ibdata1 file with its size below. I need your help reducing the size of this file as well; it has grown extremely large.
157G Aug 15 20:03 ibdata1
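
For scale, the table listing can be compared with the file size. This is a rough sketch of the arithmetic, assuming the 157G reported for ibdata1 is in GiB and that the query's "Size in MB" figure (an InnoDB estimate) is bytes / 1024 / 1024. The function names are illustrative only.

```shell
# Convert a size in MB, as reported by the table-size query, to GiB.
mb_to_gib() {
    awk -v mb="$1" 'BEGIN { printf "%.2f\n", mb / 1024 }'
}

# Percentage of a total (given in GiB) taken up by a table sized in MB.
share_of_total() {
    awk -v mb="$1" -v total_gib="$2" \
        'BEGIN { printf "%.1f\n", (mb / 1024) / total_gib * 100 }'
}

# Example with the figures from this post:
# mb_to_gib 149175.50            # xi_meta in GiB
# share_of_total 149175.50 157   # xi_meta as a share of ibdata1
```

Under those assumptions, xi_meta alone would account for roughly 145.68 GiB, or about 92.8% of ibdata1, provided the table is stored in the shared tablespace.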

I have also attached screen capture of Admin
Please let me know if you need anything else.
Moderator's Note: The profile has been shared with the support team but has been removed from the public forum.

Re: Nagios XI 5.5.11 - Issues

Post by benjaminsmith »

Hello @msmulpuri,

Some of the tables in the nagiosxi database have grown so large that they are preventing the server from operating correctly. The server may have been shut down incorrectly, corrupting tables and causing the database tables to grow excessively large.

Run the following commands to truncate the tables, and let us know if this resolves the issue for you.

Code: Select all

systemctl stop crond
systemctl stop npcd
systemctl stop nagios
systemctl stop ndo2db
pkill -9 -u nagios
for i in $(ipcs -q | grep nagios |awk '{print $2}'); do ipcrm -q $i; done
rm -rf /usr/local/nagiosxi/var/dbmaint.lock
rm -rf /usr/local/nagiosxi/var/event_handler.lock
rm -rf /usr/local/nagiosxi/scripts/reconfigure_nagios.lock
rm -rf /usr/local/nagios/var/ndo2db.lock
rm -rf /usr/local/nagios/var/ndo2db.pid
rm -rf /usr/local/nagios/var/ndo2db.sock
rm -rf /usr/local/nagios/var/ndo.sock
rm -rf /usr/local/nagiosxi/var/subsys/ndo2db
rm -rf /var/run/nagios.lock
rm -rf /usr/local/nagios/var/nagios.lock
systemctl restart mysqld || systemctl restart mariadb
echo "truncate table xi_events; truncate table xi_meta; truncate table xi_eventqueue;" | mysql -uroot -pnagiosxi -h 127.0.0.1 nagiosxi
systemctl start ndo2db
systemctl start nagios
systemctl start npcd
systemctl start crond
systemctl restart httpd
systemctl restart snmptt
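
The ipcs/ipcrm loop in the commands above removes message queues that match "nagios". As an optional check before starting the services again, the sketch below counts the queues still owned by a given user. It assumes the Linux ipcs -q column layout (key, msqid, owner, ...), and the helper name is hypothetical.

```shell
# Count message queues owned by a user in `ipcs -q` output read on stdin.
# Assumes the owner is the third whitespace-separated column.
# NOTE: count_user_queues is an illustrative helper name.
count_user_queues() {
    awk -v owner="$1" '$3 == owner { n++ } END { print n + 0 }'
}

# Usage: a result of 0 means no queues owned by nagios remain.
# ipcs -q | count_user_queues nagios
```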

Also, I would recommend increasing the maximum number of connections for the database. We have step-by-step instructions for this process in the following knowledge base article:
Nagios XI - MySQL/MariaDB - Max Connections
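
As a rough illustration only (the knowledge base article remains the authoritative procedure), the /etc/my.cnf posted earlier in this thread includes every file under /etc/my.cnf.d, so the setting could be placed in a drop-in file. The file name max_connections.cnf, the helper name, and the value 500 are assumptions made for this sketch, not values from the article.

```shell
# Write a drop-in config file that sets max_connections under [mysqld].
# NOTE: the helper name, file name, and value used are placeholders.
write_max_connections_dropin() {
    conf_dir=$1
    value=$2
    printf '[mysqld]\nmax_connections=%s\n' "$value" \
        > "$conf_dir/max_connections.cnf"
}

# Usage (as root), followed by a database restart and a check:
# write_max_connections_dropin /etc/my.cnf.d 500
# systemctl restart mariadb
# echo "SHOW VARIABLES LIKE 'max_connections';" | mysql -uroot -pnagiosxi --table
```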

Re: Nagios XI 5.5.11 - Issues

Post by msmulpuri »

Hi,

I followed the steps to resize the ibdata1 file and then your steps to further troubleshoot the issue. The steps worked; please close the topic. Thank you very much for all your help!
scottwilkerson
DevOps Engineer
Posts: 19396
Joined: Tue Nov 15, 2011 3:11 pm
Location: Nagios Enterprises

Re: Nagios XI 5.5.11 - Issues

Post by scottwilkerson »

Great!

Locking