Weird behavior after performing repair database script

Berto
Posts: 162
Joined: Tue Jul 01, 2014 6:12 pm

Post by Berto »

When I tried to log into Nagios XI yesterday, I received an error stating that the database was corrupted and that I should run the following:

/usr/local/nagiosxi/scripts/repair_databases.sh

I ran the script, and once it finished I was able to log in. A previous issue (data not being graphed) also appeared to be fixed. Today, however, /var was completely full. Looking into what filled it, I found that /var/lib/mysql/nagiosxi/ had generated enough data in less than 24 hours to fill it up. In addition, navigating to the Hosts tab under Configure > CCM now returns an HTTP 500 error, and applying changes to a service fails with the error "Backend login to the Core Config Manager failed.".

All of this started after running that script, but I'm not sure whether what we're seeing is just a symptom of a much bigger issue.
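
To see which files under /var/lib/mysql/nagiosxi/ are taking the space, something like the following could help. It is only a sketch: the function name largest_entries is made up for this example, it assumes find supports -mindepth/-maxdepth (GNU and BSD find do), and it has to run as root to read the MySQL data directory.

```shell
#!/usr/bin/env bash
# List the N largest entries directly under a directory, biggest first.
# Sizes are in kilobytes (du -k) so the output sorts numerically.
largest_entries() {
    local dir="$1" count="${2:-10}"
    # -s: one total per entry, -k: report in kilobytes
    find "$dir" -mindepth 1 -maxdepth 1 -exec du -sk {} + 2>/dev/null \
        | sort -rn \
        | head -n "$count"
}

# Path taken from the post above; skipped if it does not exist on this host.
if [ -d /var/lib/mysql/nagiosxi ]; then
    largest_entries /var/lib/mysql/nagiosxi 10
fi
```

File names in that directory generally correspond to table names, so the top few entries should hint at which table is growing.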
npolovenko
Support Tech
Posts: 3457
Joined: Mon May 15, 2017 5:00 pm

Post by npolovenko »

Hello, @Berto. There's another script in the /usr/local/nagiosxi/scripts/ folder that you can run:

./reset_config_perms.sh

It resets all the config permissions. In case this is actually a symptom of a much bigger issue, though, I'd like to check your system profile. To send it to us:
Log in to the Nagios XI GUI using a web browser.
Click the "Admin" > "System Profile" menu.
Click the "Download Profile" button.
Save the profile.zip file, upload it to a cloud storage service of your choice, and share the link with me via PM.
After you do that, please reply in this thread to bring it back up in the support queue.
Thank you.
As of May 25th, 2018, all communications with Nagios Enterprises and its employees are covered under our new Privacy Policy.
Berto

Post by Berto »

I have sent you a PM, npolovenko, with the link to the profile.zip.
npolovenko
Support Tech

Post by npolovenko »

@Berto, I did receive the file, but unfortunately it appears to be corrupted. The FTP server could be at fault. You could upload the profile to Google Drive instead and create a public download link.
Berto

Post by Berto »

I believe the profile is being corrupted during download: I've tried different methods to get it to you, but each time I test whether you'll be able to review the files, the archive is reported as corrupted. I also ran the reset_config_perms.sh script, and afterwards the admin page shows all the red items in the screenshot.
scottwilkerson
DevOps Engineer
Posts: 19396
Joined: Tue Nov 15, 2011 3:11 pm
Location: Nagios Enterprises

Post by scottwilkerson »

This usually comes down to one of a few things, but ultimately the crons are not running.

The nagios user account could be deactivated or expired:

chage -l nagios

Or there could be a permissions problem on the directory where the crons need to write their logs:

ls -la /usr/local/nagios

Or the cron.d file could be missing:

cat /etc/cron.d/nagiosxi
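
The three checks can be rolled into one pass. This is a rough sketch, not an official Nagios script: cron_log_dirs is a hypothetical helper, and it assumes each job in the cron.d file redirects its output with a standalone ">" followed by a log path. It prints the account aging info and then the owner and mode of every directory the jobs log into.

```shell
#!/usr/bin/env bash
# Print the unique directories that jobs in a cron.d file redirect stdout into.
# Assumes the "command > /path/to/file.log 2>&1" form; /dev/null is ignored.
cron_log_dirs() {
    local cronfile="$1"
    awk '!/^[ \t]*#/ {
        for (i = 1; i < NF; i++)
            if ($i == ">" && $(i + 1) != "/dev/null") {
                dir = $(i + 1)
                sub(/\/[^\/]*$/, "", dir)   # drop the file name, keep the directory
                print dir
            }
    }' "$cronfile" | sort -u
}

# Run as root. Skipped entirely if the cron.d file is missing.
if [ -r /etc/cron.d/nagiosxi ]; then
    chage -l nagios                     # is the account expired or inactive?
    for dir in $(cron_log_dirs /etc/cron.d/nagiosxi); do
        ls -ld "$dir"                   # can the nagios user write here?
    done
else
    echo "/etc/cron.d/nagiosxi is missing or unreadable"
fi
```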
Former Nagios employee
Berto

Post by Berto »

Here is the output for those commands. I also checked the cron logs and didn't see anything out of the ordinary. I've attached the log.


# chage -l nagios
Last password change : Jul 13, 2016
Password expires : never
Password inactive : never
Account expires : never
Minimum number of days between password change : 0
Maximum number of days between password change : 99999
Number of days of warning before password expires : 7


# ls -la /usr/local/nagios
total 36
drwxr-xr-x 9 root root 4096 Jan 6 2016 .
drwxr-xr-x. 16 root root 4096 Jan 6 2016 ..
drwxrwxr-x 2 nagios nagios 4096 Apr 19 2017 bin
drwsrwsr-x 7 apache nagios 4096 Mar 1 17:49 etc
drwxr-xr-x 2 root root 4096 Jan 6 2016 include
drwxrwsr-x 2 apache nagios 4096 Nov 29 18:09 libexec
drwxrwxr-x 2 nagios nagios 4096 Feb 12 2017 sbin
drwxrwxr-x 18 nagios nagios 4096 Feb 12 2017 share
drwxrwxr-x 6 nagios nagios 4096 Mar 12 14:10 var


# cat /etc/cron.d/nagiosxi
0 7 * * * root /root/scripts/autopostgresqlbackup > /dev/null 2>&1

* * * * * nagios /usr/bin/php -q /usr/local/nagiosxi/cron/sysstat.php > /usr/local/nagiosxi/var/sysstat.log 2>&1
* * * * * nagios /usr/bin/php -q /usr/local/nagiosxi/cron/cmdsubsys.php > /usr/local/nagiosxi/var/cmdsubsys.log 2>&1
* * * * * nagios /usr/bin/php -q /usr/local/nagiosxi/cron/eventman.php > /usr/local/nagiosxi/var/eventman.log 2>&1
* * * * * nagios /usr/bin/php -q /usr/local/nagiosxi/cron/event_handler.php > /usr/local/nagiosxi/var/event_handler.log 2>&1
* * * * * nagios /usr/bin/php -q /usr/local/nagiosxi/cron/feedproc.php > /usr/local/nagiosxi/var/feedproc.log 2>&1
* * * * * nagios /usr/bin/php -q /usr/local/nagiosxi/cron/perfdataproc.php > /usr/local/nagiosxi/var/perfdataproc.log 2>&1
* * * * * nagios /usr/bin/php -q /usr/local/nagiosxi/cron/nom.php > /usr/local/nagiosxi/var/nom.log 2>&1
* * * * * nagios /usr/bin/php -q /usr/local/nagiosxi/cron/reportengine.php > /usr/local/nagiosxi/var/reportengine.log 2>&1
*/5 * * * * nagios /usr/bin/php -q /usr/local/nagiosxi/cron/dbmaint.php > /usr/local/nagiosxi/var/dbmaint.log 2>&1
* * * * * nagios /usr/bin/php -q /usr/local/nagiosxi/cron/cleaner.php > /usr/local/nagiosxi/var/cleaner.log 2>&1
01 * * * * nagios /usr/local/nagiosxi/cron/recurringdowntime.pl > /usr/local/nagiosxi/var/recurringdowntime.log 2>&1
*/5 * * * * nagios /usr/bin/php -q /usr/local/nagiosxi/cron/deadpool.php > /usr/local/nagiosxi/var/deadpool.log 2>&1
scottwilkerson
DevOps Engineer

Post by scottwilkerson »

Sorry, three more commands, please:

df -h
ls -la /usr/local/nagiosxi/
ls -la /usr/local/nagiosxi/var
Berto

Post by Berto »

Here is the additional info.

# df -h
Filesystem Size Used Avail Use% Mounted on
/dev/mapper/vg00-root 37G 17G 19G 49% /
tmpfs 4.9G 72K 4.9G 1% /dev/shm
/dev/sda1 239M 72M 154M 32% /boot
/dev/mapper/vg00-cv 9.6G 22M 9.0G 1% /cv
/dev/mapper/vg00-tmp 9.6G 109M 9.0G 2% /tmp
/dev/mapper/vg00-var 45G 43G 61M 100% /var
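
The last row is the one that matters: /var is at 100% with only 61M free, and nothing on that filesystem, MySQL included, can write once it runs out. A small filter like the one below can pick such rows out of df automatically. It is a sketch only: full_mounts is a made-up name, and it assumes mount points without spaces.

```shell
#!/usr/bin/env bash
# Print mount points whose usage is at or above a percentage threshold.
# Reads `df -P` style output on stdin, so it works on live or saved output.
full_mounts() {
    local threshold="${1:-90}"
    awk -v limit="$threshold" 'NR > 1 {
        use = $5
        sub(/%$/, "", use)              # "100%" -> "100"
        if (use + 0 >= limit + 0) print $6, $5
    }'
}

# -P forces one line per filesystem so the columns stay aligned.
df -P | full_mounts 90
```

Fed the df output from this post with a threshold of 90, it prints only "/var 100%".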


# ls -la /usr/local/nagiosxi/
total 76
drwxr-xr-x 10 nagios nagios 4096 Jan 6 2016 .
drwxr-xr-x. 16 root root 4096 Jan 6 2016 ..
drwxr-xr-x 2 nagios nagios 4096 Feb 12 2017 cron
drwxr-xr-x 3 nagios nagios 4096 Jan 6 2016 etc
drwxr-xr-x 19 nagios nagios 4096 Dec 13 10:43 html
drwxr-xr-x 3 nagios nagios 4096 Jan 6 2016 nom
drwxr-xr-x 2 nagios nagios 4096 Mar 12 10:43 scripts
drwsrwsr-x 2 nagios nagios 4096 Mar 12 12:00 tmp
drwxr-xr-x 2 nagios nagios 4096 Jan 15 11:15 tools
drwxr-xr-x 5 nagios nagios 36864 Mar 13 10:51 var


[root@lnsvr0370 ~]# ls -la /usr/local/nagiosxi/var
total 165120
drwxr-xr-x 5 nagios nagios 36864 Mar 13 10:51 .
drwxr-xr-x 10 nagios nagios 4096 Jan 6 2016 ..
-rw-r--r-- 1 nagios nagios 1797 Mar 13 10:52 cleaner.log
-rw-r--r-- 1 nagios nagios 1797 Mar 13 10:52 cmdsubsys.log
drwsrwsr-x 2 apache nagios 4096 Mar 8 18:20 components
-rw-r--r-- 1 nagios nagios 8 Mar 13 09:03 corelog.data
-rw-r--r-- 1 nagios nagios 24805 Mar 13 09:03 corelog.diff
-rw-r--r-- 1 nagios nagios 0 Mar 13 10:25 dbmaint.lock
-rw-r--r-- 1 nagios nagios 66 Mar 13 10:50 dbmaint.log
-rw-r--r-- 1 nagios nagios 1797 Mar 13 10:50 deadpool.log
-rw-r--r-- 1 nagios nagios 0 Mar 13 10:51 event_handler.lock
-rw-r--r-- 1 nagios nagios 72 Mar 13 10:52 event_handler.log
-rw-r--r-- 1 nagios nagios 1797 Mar 13 10:52 eventman.log
-rw-r--r-- 1 nagios nagios 1501 Mar 13 10:52 feedproc.log
-rw-r--r-- 1 nagios nagios 0 Mar 11 03:16 load_url.log
-rw-r--r-- 1 nagios nagios 1797 Mar 13 10:52 nom.log
-rw-r--r-- 1 nagios nagios 1797 Mar 13 10:52 perfdataproc.log
-rw-r--r-- 1 nagios nagios 401516 Mar 13 10:01 recurringdowntime.log
-rw-r--r-- 1 nagios nagios 1797 Mar 13 10:52 reportengine.log
drwxr-xr-x 2 nagios nagios 4096 Mar 8 17:06 subsys
-rw-r--r-- 1 nagios nagios 1797 Mar 13 10:52 sysstat.log
drwxr-xr-x 2 nagios nagios 4096 Jan 6 2016 upgrades
-rw-r--r-- 1 nagios nagios 12187 Feb 12 2017 xi-sys.cfg
-rw-r--r-- 1 nagios nagios 37 Feb 12 2017 xi-uuid
-rw-r--r-- 1 nagios nagios 196 Feb 12 2017 xiversion
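
One thing that stands out in this listing: eight of the cron logs (cleaner, cmdsubsys, deadpool, eventman, nom, perfdataproc, reportengine, sysstat) are exactly 1797 bytes, which may mean they all contain the same message. The sketch below groups logs by checksum to confirm that; identical_logs is a hypothetical helper, and it assumes file names without spaces.

```shell
#!/usr/bin/env bash
# Group *.log files in a directory by checksum and size, and print every
# group that has more than one member (i.e. files with identical content).
identical_logs() {
    local dir="$1"
    # cksum prints: <checksum> <size> <file>
    cksum "$dir"/*.log 2>/dev/null \
        | awk '{ key = $1 " " $2; names[key] = names[key] " " $3; n[key]++ }
               END { for (k in n) if (n[k] > 1) print n[k] " files:" names[k] }'
}

# Path taken from the post above; skipped if it does not exist on this host.
if [ -d /usr/local/nagiosxi/var ]; then
    identical_logs /usr/local/nagiosxi/var
fi
```

If the logs do turn out to be identical, reading any one of them (for example with head) should show the error that every cron job is hitting.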
scottwilkerson
DevOps Engineer
Posts: 19396
Joined: Tue Nov 15, 2011 3:11 pm
Location: Nagios Enterprises
Contact:

Re: Weird behavior after performing repair database script

Post by scottwilkerson »

Can you go to Admin -> System Profile and PM your profile.zip to me or another staff member?

Thanks