Increasingly frequent DB connection errors

FLCUISIT
Posts: 93
Joined: Mon Feb 01, 2010 12:09 pm

Post by FLCUISIT »

We have been having issues over the past 10 days where we have received the following error:

Message: A database connection error has been detected, we are attempting to repair the server, if the repair does not resolve the issue, please contact Nagios support.

Run the following from the CLI as root to attempt to repair the DB

/usr/local/nagiosxi/scripts/repair_databases.sh

We have run the DB repair script as root and get the following error at the end of the script:

===============
REPAIR COMPLETE
===============
Could not open input file: nagiosxi_dbtype.php
Stopping ndo2db: done.
Starting ndo2db: done.
Running configuration check...
Stopping nagios:. done.
Starting nagios: done.

Rebooting the server will eventually resolve the issue, but we have had this 4 times over the past 10 days.

--

Kirk
tmcdonald
Posts: 9117
Joined: Mon Sep 23, 2013 8:40 am

Re: Increasingly frequent DB connection errors

Post by tmcdonald »

Let's get some baseline information:
  • What XI version is this?
  • How many hosts and services are you monitoring?
  • Is the MySQL database local or offloaded?
Former Nagios employee
rkennedy
Posts: 6579
Joined: Mon Oct 05, 2015 11:45 am

Re: Increasingly frequent DB connection errors

Post by rkennedy »

To add to what @tmcdonald mentioned, what amount of resources do you have allocated to the machine?
Former Nagios Employee
FLCUISIT
Posts: 93
Joined: Mon Feb 01, 2010 12:09 pm

Re: Increasingly frequent DB connection errors

Post by FLCUISIT »

XI version 5.2.9
We are monitoring around 190 hosts and just over 3300 services in DB.
MySQL is local to the nagios host.

The host currently has 7 CPUs and 6GB of RAM assigned to it.
lmiltchev
Bugs find me
Posts: 13589
Joined: Mon May 23, 2011 12:15 pm

Re: Increasingly frequent DB connection errors

Post by lmiltchev »

For such a number of services, we recommend more than 8GB of RAM. You can review our "Nagios XI - Hardware Requirements" document here:

https://assets.nagios.com/downloads/nag ... ements.pdf

What is the output of the following command?

Code:

du -a /var/lib/mysql | sort -n -r | head -n 10

If some of the tables are getting too large, you may need to truncate them. To learn how to truncate MySQL tables, review the document below:

https://assets.nagios.com/downloads/nag ... tabase.pdf
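
Since the du listing reports each MyISAM table's data file (.MYD) and index file (.MYI) separately, it can help to total them per table before deciding what to truncate. A minimal shell sketch, assuming the default /var/lib/mysql/nagios data directory and du reporting sizes in KiB; the table_sizes function name is made up for illustration:

```shell
# Hypothetical helper: read `du -a` output (size in KiB, then path) on stdin
# and print the combined .MYD + .MYI size of each MyISAM table, largest first.
table_sizes() {
  awk '/\.MY[DI]$/ {
         n = split($2, parts, "/")      # keep only the file name
         name = parts[n]
         sub(/\.MY[DI]$/, "", name)     # strip the extension to get the table name
         kib[name] += $1
       }
       END { for (t in kib) printf "%s %.1f MiB\n", t, kib[t] / 1024 }' |
    sort -k2,2nr
}

# Assumed default location of the nagios database; reading it usually requires root.
if [ -d /var/lib/mysql/nagios ]; then
  du -a /var/lib/mysql/nagios 2>/dev/null | table_sizes | head -n 5
fi
```

This only reads file sizes on disk and does not touch the database itself.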
FLCUISIT
Posts: 93
Joined: Mon Feb 01, 2010 12:09 pm

Re: Increasingly frequent DB connection errors

Post by FLCUISIT »

[root@nagios ~]# du -a /var/lib/mysql | sort -n -r | head -n 10
429668 /var/lib/mysql
404988 /var/lib/mysql/nagios
146064 /var/lib/mysql/nagios/nagios_statehistory.MYD
96736 /var/lib/mysql/nagios/nagios_logentries.MYD
62340 /var/lib/mysql/nagios/nagios_logentries.MYI
36032 /var/lib/mysql/nagios/nagios_statehistory.MYI
14288 /var/lib/mysql/nagios/nagios_notifications.MYD
10260 /var/lib/mysql/ibdata1
8604 /var/lib/mysql/nagios/nagios_systemcommands.MYD
6136 /var/lib/mysql/nagios/nagios_notifications.MYI


Will make modifications to the memory and review the repair document.
FLCUISIT
Posts: 93
Joined: Mon Feb 01, 2010 12:09 pm

Re: Increasingly frequent DB connection errors

Post by FLCUISIT »

Still getting the same error after running the repair and truncating the 2 tables listed in the documents. The server now has 10GB of RAM to go with the 7 CPUs. Any other ideas?

--Kirk
rkennedy
Posts: 6579
Joined: Mon Oct 05, 2015 11:45 am

Re: Increasingly frequent DB connection errors

Post by rkennedy »

Do you have a lot of failing checks by any chance? I've seen these raise the load on the machine, which in turn starts taking far more resources than needed.

What is the output of the following commands?

Code:

top|head -5
ps -eo pcpu,args --sort=-%cpu|head
Former Nagios Employee
FLCUISIT
Posts: 93
Joined: Mon Feb 01, 2010 12:09 pm

Re: Increasingly frequent DB connection errors

Post by FLCUISIT »

top|head -5
top - 16:30:26 up 48 min, 1 user, load average: 2.94, 2.29, 2.56
Tasks: 185 total, 2 running, 183 sleeping, 0 stopped, 0 zombie
Cpu(s): 19.1%us, 8.6%sy, 0.0%ni, 70.9%id, 1.2%wa, 0.0%hi, 0.1%si, 0.0%st
Mem: 3107480k total, 608316k used, 2499164k free, 85236k buffers
Swap: 1048568k total, 0k used, 1048568k free, 337512k cached
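
For a rough reading of those numbers, the load average can be divided by the CPU count and the Mem total converted from KiB to GiB. A small sketch using the figures posted in this thread (load 2.94, 7 CPUs, 3107480k total); the helper names are hypothetical, and it assumes the k suffix in top means KiB:

```shell
# Hypothetical helper: load average divided by number of CPUs.
load_per_cpu() {
  awk -v load="$1" -v cpus="$2" 'BEGIN { printf "%.2f\n", load / cpus }'
}

# Hypothetical helper: convert a KiB figure (as shown by top) to GiB.
kib_to_gib() {
  awk -v kib="$1" 'BEGIN { printf "%.2f\n", kib / (1024 * 1024) }'
}

load_per_cpu 2.94 7    # prints 0.42
kib_to_gib 3107480     # prints 2.96
```

As a general rule of thumb, a per-CPU load near or above 1.00 suggests work is queuing for CPU time; lower values suggest CPU is not the bottleneck at that moment.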
FLCUISIT
Posts: 93
Joined: Mon Feb 01, 2010 12:09 pm

Re: Increasingly frequent DB connection errors

Post by FLCUISIT »

I am not seeing any failing checks, but I cannot get to the web console at this point.