
A database connection error

Posted: Fri Oct 13, 2017 8:44 am
by mholtaftac
I keep getting the error message below. Running the repair script doesn't resolve anything. Rebooting the server fixes the problem temporarily, for about 12 hours.
I think I need help now...

Thanks in advance

Mike


Message: A database connection error has been detected, we are attempting to repair the server, if the repair does not resolve the issue, please contact Nagios support.

Run the following from the CLI as root to attempt to repair the DB

/usr/local/nagiosxi/scripts/repair_databases.sh

Re: A database connection error

Posted: Fri Oct 13, 2017 9:59 am
by eloyd
When you run the script, what's the output? I'm guessing that there's a database that's corrupted beyond what the repair script can do and you're not seeing it in all the output that gets generated.

Re: A database connection error

Posted: Fri Oct 13, 2017 10:27 am
by dwhitfield
eloyd wrote:I'm guessing that there's a database that's corrupted beyond what the repair script can do
This is certainly possible, but considering it works for 12 hours, I suspect you are running out of space. It would be good to see the output of the script though!

What's the output of the following? Please note that you will need to remove the "#" before each command line for the commands to work.
# df -i
# df -h
# du -hsx * | sort -rh | head -10

If you are out of space, please follow https://assets.nagios.com/downloads/nag ... M-Disk.pdf

If you are not out of space, there are some additional db repair steps at https://assets.nagios.com/downloads/nag ... tabase.pdf

***If the instructions in the document do not resolve the issue, please continue.***

Regarding the instructions below, if you do not have killall, you can install it via the following command:
# yum install psmisc

If psmisc is not in your repos, then instead you can check to make sure nagios is not running with
# ps -aef | grep nagios

If that document does not resolve your issue, please run the following commands in order and report any errors. You ***must*** use mariadb instead of mysqld in the commands below, ***if*** you have mariadb.
# service nagios stop
# service ndo2db stop
# service mysqld stop
# service crond stop
# service httpd stop
# killall -9 nagios
# killall -9 ndo2db
# rm -f /usr/local/nagios/var/rw/nagios.cmd
# rm -f /usr/local/nagios/var/nagios.lock
# rm -f /usr/local/nagios/var/ndo.sock
# rm -f /usr/local/nagios/var/ndo2db.lock
# rm -f /usr/local/nagiosxi/var/reconfigure_nagios.lock
# for i in `ipcs -q | grep nagios |awk '{print $2}'`; do ipcrm -q $i; done
# service mysqld start
# service ndo2db start
# service nagios start
# service httpd start
# service crond start
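
The sequence above can also be generated from a small helper so that the database service name (mysqld or mariadb) is chosen once. This is only a sketch: the function name `print_restart_plan` is made up for illustration, and it only prints the commands from this post rather than running them, so the list can be reviewed before anything is stopped.

```shell
#!/bin/sh
# Hypothetical helper: prints the stop/cleanup/start sequence from the post
# above with the database service name filled in. Nothing is executed here.
print_restart_plan() {
    db="${1:-mysqld}"   # pass "mariadb" on systems that use MariaDB

    # Stop everything, database included, in the order given in the post.
    for svc in nagios ndo2db "$db" crond httpd; do
        echo "service $svc stop"
    done

    # Kill leftover processes and remove stale lock/socket files.
    echo "killall -9 nagios"
    echo "killall -9 ndo2db"
    for f in /usr/local/nagios/var/rw/nagios.cmd \
             /usr/local/nagios/var/nagios.lock \
             /usr/local/nagios/var/ndo.sock \
             /usr/local/nagios/var/ndo2db.lock \
             /usr/local/nagiosxi/var/reconfigure_nagios.lock; do
        echo "rm -f $f"
    done

    # Clear any message queues still owned by nagios.
    echo "for i in \$(ipcs -q | grep nagios | awk '{print \$2}'); do ipcrm -q \"\$i\"; done"

    # Start the services again, database first.
    for svc in "$db" ndo2db nagios httpd crond; do
        echo "service $svc start"
    done
}
```

For example, `print_restart_plan mariadb` prints the list with mariadb substituted; once it looks right, the lines can be run one at a time as root so that any error is easy to attribute to a specific step.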

Re: A database connection error

Posted: Fri Oct 13, 2017 10:38 am
by mholtaftac
Here's the output.

===============
REPAIR COMPLETE
===============
Could not open input file: nagiosxi_dbtype.php
Stopping ndo2db: done.
Starting ndo2db: done.
Running configuration check...
Stopping nagios:. done.
Starting nagios: done.
You have mail in /var/spool/mail/root

Re: A database connection error

Posted: Fri Oct 13, 2017 10:40 am
by mholtaftac
[root@tmnagiosxi ~]# df -i
Filesystem Inodes IUsed IFree IUse% Mounted on
/dev/mapper/VolGroup-lv_root
2441712 92123 2349589 4% /
tmpfs 1957166 1 1957165 1% /dev/shm
/dev/sda1 128016 50 127966 1% /boot
You have new mail in /var/spool/mail/root
[root@tmnagiosxi ~]# df -h
Filesystem Size Used Avail Use% Mounted on
/dev/mapper/VolGroup-lv_root
37G 12G 24G 33% /
tmpfs 7.5G 0 7.5G 0% /dev/shm
/dev/sda1 477M 66M 386M 15% /boot
[root@tmnagiosxi ~]# df -hsx* |sort -rh |head -10
df: invalid option -- 's'
Try `df --help' for more information.
[root@tmnagiosxi ~]# df -hx* |sort -rh |head -10
tmpfs 7.5G 0 7.5G 0% /dev/shm
Filesystem Size Used Avail Use% Mounted on
/dev/sda1 477M 66M 386M 15% /boot
/dev/mapper/VolGroup-lv_root
37G 12G 24G 33% /
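
For reference, the third command requested earlier in the thread was `du` (disk usage), not `df`, which is why `-s` was rejected above. A minimal sketch of that check follows, wrapped in a function whose name (`largest_entries`) is invented here; it assumes GNU `du` and `sort`, since `-x` and `-h` are used.

```shell
# Hypothetical helper: list the N largest entries directly under a directory.
largest_entries() {
    dir="${1:-.}"
    count="${2:-10}"
    # -s: one total per entry, -x: stay on one filesystem, -h: human-readable.
    # sort -rh orders the human-readable sizes from largest to smallest.
    du -hsx "$dir"/* 2>/dev/null | sort -rh | head -n "$count"
}
```

Running `largest_entries / 10` as root would correspond to running the original `du -hsx * | sort -rh | head -10` from the root directory.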

Re: A database connection error

Posted: Fri Oct 13, 2017 10:50 am
by eloyd
I think if it were a space concern, a reboot wouldn't fix it. Is that ALL of the repair script's output or just the end? I know it's asking a lot, but the next time it's broken, can you post the entire output of the repair script?

And you can use [code]...[/code] tags around the output so it looks like this:

[code]
This is the start of the code tag.
.
.
And this is the code tag ending.
[/code]
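
One way to capture the entire output of the repair script, as asked above, is to send both normal output and error messages through `tee`. This is a sketch only: the wrapper name `run_and_log` and the log path in the example are arbitrary choices, not something the repair script provides.

```shell
# Hypothetical wrapper: run a command, show its output, and keep a full copy.
run_and_log() {
    log="$1"; shift
    # 2>&1 folds error messages into the same stream so nothing is lost.
    "$@" 2>&1 | tee "$log"
}

# Example (as root):
# run_and_log /tmp/repair_databases.log /usr/local/nagiosxi/scripts/repair_databases.sh
```

The resulting log file can then be pasted between code tags in a reply.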

Re: A database connection error

Posted: Fri Oct 13, 2017 11:16 am
by dwasswa
Thanks @eloyd.

@mholtaftac did you actually try what @eloyd posted?