Database error - Nagios will not start successfully ..even with db repair

This support forum board is for support questions relating to Nagios XI, our flagship commercial network monitoring solution.
pg91
Posts: 2
Joined: Fri Mar 29, 2024 8:00 am

Database error - Nagios will not start successfully ..even with db repair

Post by pg91 »

Hello,
I'm running Nagios Core 4.4.13 on a Red Hat 8 VM.
I recently ran a yum update to bring the system up to the latest patches (8.6 to 8.9).
Nagios is now broken: the page URL won't come up.

I went into GRUB and booted the old kernel (8.6). Now I get the Nagios XI page, but when I click the "Access Nagios XI" button/link, I get a database error (see attachment):
Database Error
A database connection error has been detected, please follow the repair prompt below. If the issue persists, please contact Nagios support.
Run the following from the CLI as root to attempt to repair the DB:
/usr/local/nagiosxi/scripts/repair_databases.sh

I ran the repair about 3 times, but nothing changes; I still get the database error.
I found out that Nagios stores its SQL database backups in /store/backups/mysql.
I ran a restore from a backup taken 3 weeks ago via:
mysql -u root -pnagiosxi nagiosql < /var/tmp/mysql_week.11.2024-03-16_07h00m.sql
mysql -u root -pnagiosxi nagios < /var/tmp/mysql_week.11.2024-03-16_07h00m.sql

Unfortunately, that didn't help either. I still get the Database Error screen with the prompt to run /usr/local/nagiosxi/scripts/repair_databases.sh.
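
One detail in the restore above may be worth a second look: the same dump file was loaded into both nagiosql and nagios. The sketch below reads the header comment that mysqldump normally writes near the top of a dump (`-- Host: ...    Database: ...`), so the intended target can be checked before restoring. The function names are hypothetical, and it assumes the backup is a plain, uncompressed mysqldump file with that header. A dump of all databases typically leaves the name blank, in which case nothing is printed.

```shell
#!/usr/bin/env bash
# Print the database name recorded in the header of a mysqldump file.
# Assumes the usual header comment written by mysqldump, for example:
#   -- Host: localhost    Database: nagios
# Prints nothing if the header is missing or names no database.
dump_database_name() {
    local dump_file="$1"
    # Only the first 20 lines are scanned; the header sits at the top.
    head -n 20 "$dump_file" \
        | sed -n 's/^-- Host: .*Database: \([^[:space:]]*\)[[:space:]]*$/\1/p' \
        | head -n 1
}

# Return 0 only when the dump header names the expected database.
dump_matches_database() {
    local dump_file="$1" expected_db="$2"
    [ "$(dump_database_name "$dump_file")" = "$expected_db" ]
}
```

For example, `dump_matches_database /var/tmp/mysql_week.11.2024-03-16_07h00m.sql nagios || echo "header does not name nagios"` would flag a mismatch before anything is overwritten.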

Does anyone have any ideas on how to restore my Nagios XI server? I can try an SQL restore using a file from more than a month ago and see if that helps, but I don't think it will.
I'm not sure what happened in the yum update, but I still have the kernel for RH 8.6 and just need some assistance.
Thank you
jsimon
Posts: 104
Joined: Wed Aug 23, 2023 11:27 am

Re: Database error - Nagios will not start successfully ..even with db repair

Post by jsimon »

Hi @pg91,

I think in order to diagnose this issue, we would probably need to see your system profile. I would recommend trying the following steps for a more thorough database repair:

# Stop the services that write to the databases
systemctl stop npcd
systemctl stop nagios
systemctl stop crond
# Empty the status and event tables (their current contents are discarded)
echo 'truncate nagios_hoststatus; truncate nagios_hosts; truncate nagios_services; truncate nagios_servicestatus; truncate nagios_servicechecks; ' | mysql -u root -pnagiosxi nagios
echo "truncate table xi_events; truncate table xi_meta; truncate table xi_eventqueue;" | mysql -u root -pnagiosxi nagiosxi
# Check and repair all databases, then restart the database service
mysqlcheck -f -r -u root -pnagiosxi --all-databases --use-frm
systemctl restart mysqld || systemctl restart mariadb
# Remove the leftover command file and stale lock files
rm -f /usr/local/nagios/var/rw/nagios.cmd
rm -f /usr/local/nagios/var/nagios.lock
rm -f /var/run/nagios.lock
rm -f /var/lib/mrtg/mrtg_l
rm -f /usr/local/nagiosxi/var/*.lock
rm -f /usr/local/nagiosxi/tmp/*.lock
# Remove System V message queues whose ipcs entry mentions nagios
for i in $(ipcs -q | grep nagios | awk '{print $2}'); do ipcrm -q "$i"; done
# Kill leftover python processes and everything running as the apache user
pkill python
pkill -9 -u apache
# Bring the web stack and the monitoring services back up
systemctl restart httpd
systemctl restart php-fpm
systemctl start npcd
systemctl start crond
systemctl start nagios
If this does not resolve your issue, I would recommend opening a case with Nagios Support for more in-depth troubleshooting of your specific system.

https://answerhub.nagios.com/support/s/
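
Before or after the repair steps above, it may help to confirm that the database service is actually running and that the credentials used in those commands can still run a query. The following is a minimal sketch, not an official Nagios tool: the function names are hypothetical, and the unit names (mysqld, mariadb), the root/nagiosxi login, and the database names are simply the ones that appear in this thread. The XI web interface may connect with different accounts than root, so a pass here does not rule out a credential problem on that side.

```shell
#!/usr/bin/env bash
# Print the first systemd unit from the arguments that is currently active.
# Returns 1 if none of them is active.
first_active_unit() {
    local unit
    for unit in "$@"; do
        if systemctl is-active --quiet "$unit"; then
            printf '%s\n' "$unit"
            return 0
        fi
    done
    return 1
}

# Run a trivial query against one database with the given credentials.
# "mysqladmin ping" is not used here because it exits 0 whenever the
# server is up, even when the login itself is refused.
can_query_database() {
    local user="$1" password="$2" database="$3"
    mysql -u "$user" -p"$password" -e 'SELECT 1' "$database" >/dev/null 2>&1
}

# Report the state of the database service and of each database.
# Unit names, credentials, and database names are taken from this thread.
report_database_state() {
    local unit db
    if unit="$(first_active_unit mysqld mariadb)"; then
        echo "database service: $unit is active"
    else
        echo "database service: neither mysqld nor mariadb is active"
        return 1
    fi
    for db in nagios nagiosql nagiosxi; do
        if can_query_database root nagiosxi "$db"; then
            echo "$db: login and query OK"
        else
            echo "$db: login or query FAILED"
        fi
    done
}
```

Running `report_database_state` as root prints one line for the service and one per database. A FAILED line for a single database points at that database or its grants rather than at the server as a whole.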
pg91
Posts: 2
Joined: Fri Mar 29, 2024 8:00 am

Re: Database error - Nagios will not start successfully ..even with db repair

Post by pg91 »

Thank you. I tried the entire list of steps and the database error still comes back.

What further steps can I take to resolve this?

[s-ansible@sentry ~]$ cat /etc/redhat-release
Red Hat Enterprise Linux release 8.9 (Ootpa)

Nagios XI version:
Running Nagios Core 4.4.13 on Redhat 8 VM

I can send the mysqld.log file if needed (see attached mysqld.log)

[root@sentry s-ansible]# systemctl status nagios
● nagios.service - Nagios Core 4.4.13
Loaded: loaded (/usr/lib/systemd/system/nagios.service; enabled; vendor preset: disabled)
Active: active (running) since Fri 2024-03-29 12:56:00 EDT; 1min 45s ago
Docs: https://www.nagios.org/documentation
Process: 9280 ExecStopPost=/bin/rm -f /usr/local/nagios/var/rw/nagios.cmd (code=exited, status=0/SUCCESS)
Process: 9683 ExecStart=/usr/local/nagios/bin/nagios -d /usr/local/nagios/etc/nagios.cfg (code=exited, status=0/SUCCESS)
Process: 9681 ExecStartPre=/usr/local/nagios/bin/nagios -v /usr/local/nagios/etc/nagios.cfg (code=exited, status=0/SUCCESS)
Main PID: 9684 (nagios)
Tasks: 11 (limit: 102108)
Memory: 35.7M
CGroup: /system.slice/nagios.service
├─ 9684 /usr/local/nagios/bin/nagios -d /usr/local/nagios/etc/nagios.cfg
├─ 9686 /usr/local/nagios/bin/nagios --worker /usr/local/nagios/var/rw/nagios.qh
├─ 9687 /usr/local/nagios/bin/nagios --worker /usr/local/nagios/var/rw/nagios.qh
├─ 9688 /usr/local/nagios/bin/nagios --worker /usr/local/nagios/var/rw/nagios.qh
├─ 9689 /usr/local/nagios/bin/nagios --worker /usr/local/nagios/var/rw/nagios.qh
├─ 9690 /usr/local/nagios/bin/nagios --worker /usr/local/nagios/var/rw/nagios.qh
├─ 9691 /usr/local/nagios/bin/nagios --worker /usr/local/nagios/var/rw/nagios.qh
├─ 9722 /usr/local/nagios/bin/nagios -d /usr/local/nagios/etc/nagios.cfg
├─10560 /usr/local/nagios/libexec/check_tcp -H cert-vm-01.poly.edu -p 139
├─10562 /usr/local/nagios/libexec/check_icmp -H jira.engineering.nyu.edu -w 3000.0,80% -c 5000.0,100% -p 5
└─10563 /usr/local/nagios/libexec/check_icmp -H baahl1.poly.edu -w 3000.0,80% -c 5000.0,100% -p 5

Mar 29 12:57:35 sentry.engineering.nyu.edu nagios[9684]: Warning: Return code of 127 for service 'ssh Service Status' on host 'baahl1.poly.edu' may indicate this plugin doesn't exist.
Mar 29 12:57:37 sentry.engineering.nyu.edu nagios[9684]: Warning: Return code of 127 for service 'Disk Usage on /data' on host 'baahl3.poly.edu' may indicate this plugin doesn't exist.
Mar 29 12:57:38 sentry.engineering.nyu.edu nagios[9684]: Warning: Return code of 127 for service 'ssh Service Status' on host 'baahl4.poly.edu' may indicate this plugin doesn't exist.
Mar 29 12:57:39 sentry.engineering.nyu.edu nagios[9684]: Warning: Return code of 127 for service 'abaqus Service Status' on host 'biome.poly.edu' may indicate this plugin doesn't exist.
Mar 29 12:57:41 sentry.engineering.nyu.edu nagios[9684]: Warning: Return code of 127 for service 'CPU Usage' on host 'rke02' may indicate this plugin doesn't exist.
Mar 29 12:57:43 sentry.engineering.nyu.edu nagios[9684]: Warning: Return code of 127 for service 'Disk Usage on /var' on host 'cronus.poly.edu' may indicate this plugin doesn't exist.
Mar 29 12:57:43 sentry.engineering.nyu.edu nagios[9684]: Warning: Return code of 127 for service 'Disk Usage on /boot' on host 'cse-cappos2.poly.edu' may indicate this plugin doesn't exist.
Mar 29 12:57:44 sentry.engineering.nyu.edu nagios[9684]: Warning: Return code of 127 for service 'docker Service Status' on host 'cse-cappos2.poly.edu' may indicate this plugin doesn't exist.
Mar 29 12:57:45 sentry.engineering.nyu.edu nagios[9684]: Warning: Return code of 127 for service 'httpd' on host 'cse.engineering.nyu.edu' may indicate this plugin doesn't exist.
Mar 29 12:57:45 sentry.engineering.nyu.edu nagios[9684]: Warning: Return code of 127 for service 'Disk Usage on /home' on host 'dark-knight.poly.edu' may indicate this plugin doesn't exist.
[root@sentry s-ansible]#
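
The journal lines above all report return code 127, which is the exit status a shell gives for "command not found". That is a problem with the check plugins themselves and may be separate from the database error. As a rough sketch for seeing how widespread it is, the hypothetical helper below pulls the host and service out of each such warning. It assumes log lines in exactly the format shown above and host and service names that contain no single quotes.

```shell
#!/usr/bin/env bash
# Read Nagios log lines on stdin and print "host<TAB>service" for every
# "Return code of 127" warning, sorted and de-duplicated.
# Splitting on single quotes puts the service name in field 2 and the
# host name in field 4 for lines in the format shown in this thread.
list_missing_plugin_checks() {
    awk -F"'" '/Return code of 127 for service/ { print $4 "\t" $2 }' \
        | sort -u
}
```

One way to feed it is `journalctl -u nagios --no-pager | list_missing_plugin_checks`, which lists each affected host and service once.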
jsimon
Posts: 104
Joined: Wed Aug 23, 2023 11:27 am

Re: Database error - Nagios will not start successfully ..even with db repair

Post by jsimon »

At this point I would strongly recommend contacting Nagios Support for personalized troubleshooting with regard to your unique configuration.

https://answerhub.nagios.com/support/s/