Weird behavior after performing repair database script
Posted: Mon Mar 05, 2018 2:24 pm
by Berto
When I attempted to log in to Nagios XI yesterday, I received an error stating that the database was corrupted and that I should run the following:
/usr/local/nagiosxi/scripts/repair_databases.sh
I ran that script, and once it finished I was able to log in. I also noticed that a previous issue (data not being graphed) seemed to have been fixed. Today, however, /var was completely full, and when I looked into what had filled it, I found that /var/lib/mysql/nagiosxi/ had generated enough data in less than 24 hours to fill it up. In addition, navigating to the hosts tab from Configure > CCM now returns an HTTP 500 error, and trying to apply changes to a service gives the error "Backend login to the Core Config Manager failed.".
This started after running that script, but I'm not sure whether what we're seeing is just a symptom of a much bigger issue.
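For anyone following along, one way to see which files under /var/lib/mysql/nagiosxi/ are growing is to list the largest entries in that directory. This is only a minimal sketch: the function name `largest_entries` is hypothetical and not part of Nagios XI; it relies only on the standard `du`, `sort`, and `head` utilities.

```shell
#!/bin/sh
# Hypothetical helper (not part of Nagios XI): list the N largest entries
# directly under a directory, largest first.
# Usage: largest_entries DIR [N]
largest_entries() {
    dir=$1
    count=${2:-10}
    # du -sk prints the size of each entry in KiB; sort numerically, descending.
    du -sk "$dir"/* 2>/dev/null | sort -rn | head -n "$count"
}

# Example (path taken from the post above; run as a user that can read it):
# largest_entries /var/lib/mysql/nagiosxi 10
```

The table files at the top of that list would show which tables are taking the space.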
Re: Weird behavior after performing repair database script
Posted: Mon Mar 05, 2018 3:27 pm
by npolovenko
Hello,
@Berto, there's another script in the /usr/local/nagiosxi/scripts/ folder that you can run:
/usr/local/nagiosxi/scripts/reset_config_perms.sh
It resets all the config permissions. However, in case this is actually a symptom of a much bigger issue, I'd like to check your system profile.
To send us your system profile:
Log in to the Nagios XI GUI using a web browser.
Click the "Admin" > "System Profile" menu.
Click the "Download Profile" button.
Save the profile.zip file, upload it to a cloud storage service of your choice, and share the link with me via PM. After you do that, please post a reply in this thread to bring it back up in the support queue.
Thank you.
Re: Weird behavior after performing repair database script
Posted: Tue Mar 06, 2018 10:27 am
by Berto
npolovenko, I have sent you a PM with the link to the profile.zip.
Re: Weird behavior after performing repair database script
Posted: Tue Mar 06, 2018 3:43 pm
by npolovenko
@Berto, I did receive the file, but unfortunately it appears to be corrupted. The FTP server could be at fault. You could upload the profile to Google Drive instead and create a public download link.
Re: Weird behavior after performing repair database script
Posted: Fri Mar 09, 2018 9:40 am
by Berto
I believe the profile is being corrupted during download. I've tried different methods to get you the profile, but each time I test whether you'll be able to review the files, it is reported as corrupted. I also tried running the reset_config_perms.sh script, and afterwards the admin page now shows all the red items in the screenshot.
Re: Weird behavior after performing repair database script
Posted: Fri Mar 09, 2018 11:08 am
by scottwilkerson
This usually comes down to one of a few things, but ultimately the cron jobs are not running.
It could be that the nagios user is deactivated or expired,
or the permissions on the directory where the cron jobs write their logs,
or a missing cron.d file.
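The three possibilities above can be checked roughly as follows. This is a sketch, not an official Nagios XI tool: the function name `check_cron_prereqs` and its output strings are made up for illustration, and the default user and paths are simply the ones that appear in this thread.

```shell
#!/bin/sh
# Sketch: check the three usual reasons the Nagios XI cron jobs do not run.
# Arguments (defaults are the user and paths seen in this thread):
#   $1 user that runs the jobs, $2 directory the jobs log to, $3 cron.d file
check_cron_prereqs() {
    user=${1:-nagios}
    logdir=${2:-/usr/local/nagiosxi/var}
    cronfile=${3:-/etc/cron.d/nagiosxi}

    # 1. Does the account exist at all?
    if id "$user" >/dev/null 2>&1; then
        echo "user: ok"
    else
        echo "user: missing"
    fi

    # 2. Does the log directory exist? Print owner and mode for manual review.
    if [ -d "$logdir" ]; then
        echo "logdir: ok"
        ls -ld "$logdir"
    else
        echo "logdir: missing"
    fi

    # 3. Is the cron.d file present and non-empty?
    if [ -s "$cronfile" ]; then
        echo "cronfile: ok"
    else
        echo "cronfile: missing or empty"
    fi
}
```

The function only confirms that the account exists. To see whether it is expired or inactive, `chage -l nagios` (run as root) prints the expiry details.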
Re: Weird behavior after performing repair database script
Posted: Mon Mar 12, 2018 1:20 pm
by Berto
Here is the output for those commands. I also checked the cron logs and didn't see anything out of the ordinary. I attached the log.
# chage -l nagios
Last password change : Jul 13, 2016
Password expires : never
Password inactive : never
Account expires : never
Minimum number of days between password change : 0
Maximum number of days between password change : 99999
Number of days of warning before password expires : 7
# ls -la /usr/local/nagios
total 36
drwxr-xr-x 9 root root 4096 Jan 6 2016 .
drwxr-xr-x. 16 root root 4096 Jan 6 2016 ..
drwxrwxr-x 2 nagios nagios 4096 Apr 19 2017 bin
drwsrwsr-x 7 apache nagios 4096 Mar 1 17:49 etc
drwxr-xr-x 2 root root 4096 Jan 6 2016 include
drwxrwsr-x 2 apache nagios 4096 Nov 29 18:09 libexec
drwxrwxr-x 2 nagios nagios 4096 Feb 12 2017 sbin
drwxrwxr-x 18 nagios nagios 4096 Feb 12 2017 share
drwxrwxr-x 6 nagios nagios 4096 Mar 12 14:10 var
# cat /etc/cron.d/nagiosxi
0 7 * * * root /root/scripts/autopostgresqlbackup > /dev/null 2>&1
* * * * * nagios /usr/bin/php -q /usr/local/nagiosxi/cron/sysstat.php > /usr/local/nagiosxi/var/sysstat.log 2>&1
* * * * * nagios /usr/bin/php -q /usr/local/nagiosxi/cron/cmdsubsys.php > /usr/local/nagiosxi/var/cmdsubsys.log 2>&1
* * * * * nagios /usr/bin/php -q /usr/local/nagiosxi/cron/eventman.php > /usr/local/nagiosxi/var/eventman.log 2>&1
* * * * * nagios /usr/bin/php -q /usr/local/nagiosxi/cron/event_handler.php > /usr/local/nagiosxi/var/event_handler.log 2>&1
* * * * * nagios /usr/bin/php -q /usr/local/nagiosxi/cron/feedproc.php > /usr/local/nagiosxi/var/feedproc.log 2>&1
* * * * * nagios /usr/bin/php -q /usr/local/nagiosxi/cron/perfdataproc.php > /usr/local/nagiosxi/var/perfdataproc.log 2>&1
* * * * * nagios /usr/bin/php -q /usr/local/nagiosxi/cron/nom.php > /usr/local/nagiosxi/var/nom.log 2>&1
* * * * * nagios /usr/bin/php -q /usr/local/nagiosxi/cron/reportengine.php > /usr/local/nagiosxi/var/reportengine.log 2>&1
*/5 * * * * nagios /usr/bin/php -q /usr/local/nagiosxi/cron/dbmaint.php > /usr/local/nagiosxi/var/dbmaint.log 2>&1
* * * * * nagios /usr/bin/php -q /usr/local/nagiosxi/cron/cleaner.php > /usr/local/nagiosxi/var/cleaner.log 2>&1
01 * * * * nagios /usr/local/nagiosxi/cron/recurringdowntime.pl > /usr/local/nagiosxi/var/recurringdowntime.log 2>&1
*/5 * * * * nagios /usr/bin/php -q /usr/local/nagiosxi/cron/deadpool.php > /usr/local/nagiosxi/var/deadpool.log 2>&1
Re: Weird behavior after performing repair database script
Posted: Mon Mar 12, 2018 1:50 pm
by scottwilkerson
Sorry, three more commands please:
df -h
ls -la /usr/local/nagiosxi/
ls -la /usr/local/nagiosxi/var
Re: Weird behavior after performing repair database script
Posted: Tue Mar 13, 2018 9:57 am
by Berto
Here is the additional info.
# df -h
Filesystem Size Used Avail Use% Mounted on
/dev/mapper/vg00-root 37G 17G 19G 49% /
tmpfs 4.9G 72K 4.9G 1% /dev/shm
/dev/sda1 239M 72M 154M 32% /boot
/dev/mapper/vg00-cv 9.6G 22M 9.0G 1% /cv
/dev/mapper/vg00-tmp 9.6G 109M 9.0G 2% /tmp
/dev/mapper/vg00-var 45G 43G 61M 100% /var
# ls -la /usr/local/nagiosxi/
total 76
drwxr-xr-x 10 nagios nagios 4096 Jan 6 2016 .
drwxr-xr-x. 16 root root 4096 Jan 6 2016 ..
drwxr-xr-x 2 nagios nagios 4096 Feb 12 2017 cron
drwxr-xr-x 3 nagios nagios 4096 Jan 6 2016 etc
drwxr-xr-x 19 nagios nagios 4096 Dec 13 10:43 html
drwxr-xr-x 3 nagios nagios 4096 Jan 6 2016 nom
drwxr-xr-x 2 nagios nagios 4096 Mar 12 10:43 scripts
drwsrwsr-x 2 nagios nagios 4096 Mar 12 12:00 tmp
drwxr-xr-x 2 nagios nagios 4096 Jan 15 11:15 tools
drwxr-xr-x 5 nagios nagios 36864 Mar 13 10:51 var
[root@lnsvr0370 ~]# ls -la /usr/local/nagiosxi/var
total 165120
drwxr-xr-x 5 nagios nagios 36864 Mar 13 10:51 .
drwxr-xr-x 10 nagios nagios 4096 Jan 6 2016 ..
-rw-r--r-- 1 nagios nagios 1797 Mar 13 10:52 cleaner.log
-rw-r--r-- 1 nagios nagios 1797 Mar 13 10:52 cmdsubsys.log
drwsrwsr-x 2 apache nagios 4096 Mar 8 18:20 components
-rw-r--r-- 1 nagios nagios 8 Mar 13 09:03 corelog.data
-rw-r--r-- 1 nagios nagios 24805 Mar 13 09:03 corelog.diff
-rw-r--r-- 1 nagios nagios 0 Mar 13 10:25 dbmaint.lock
-rw-r--r-- 1 nagios nagios 66 Mar 13 10:50 dbmaint.log
-rw-r--r-- 1 nagios nagios 1797 Mar 13 10:50 deadpool.log
-rw-r--r-- 1 nagios nagios 0 Mar 13 10:51 event_handler.lock
-rw-r--r-- 1 nagios nagios 72 Mar 13 10:52 event_handler.log
-rw-r--r-- 1 nagios nagios 1797 Mar 13 10:52 eventman.log
-rw-r--r-- 1 nagios nagios 1501 Mar 13 10:52 feedproc.log
-rw-r--r-- 1 nagios nagios 0 Mar 11 03:16 load_url.log
-rw-r--r-- 1 nagios nagios 1797 Mar 13 10:52 nom.log
-rw-r--r-- 1 nagios nagios 1797 Mar 13 10:52 perfdataproc.log
-rw-r--r-- 1 nagios nagios 401516 Mar 13 10:01 recurringdowntime.log
-rw-r--r-- 1 nagios nagios 1797 Mar 13 10:52 reportengine.log
drwxr-xr-x 2 nagios nagios 4096 Mar 8 17:06 subsys
-rw-r--r-- 1 nagios nagios 1797 Mar 13 10:52 sysstat.log
drwxr-xr-x 2 nagios nagios 4096 Jan 6 2016 upgrades
-rw-r--r-- 1 nagios nagios 12187 Feb 12 2017 xi-sys.cfg
-rw-r--r-- 1 nagios nagios 37 Feb 12 2017 xi-uuid
-rw-r--r-- 1 nagios nagios 196 Feb 12 2017 xiversion
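The df output above shows /var at 100%. A small filter like the one below can flag full filesystems from live or saved df output. It is a sketch under stated assumptions: `full_filesystems` is a hypothetical name, and it assumes the usual six-column df layout with Use% in column 5 and the mount point in column 6.

```shell
#!/bin/sh
# Sketch: print mount points whose usage is at or above a threshold.
# Reads df-style output on stdin, so it works on live or saved output.
# Usage: df -P | full_filesystems [THRESHOLD_PERCENT]
full_filesystems() {
    threshold=${1:-90}
    awk -v t="$threshold" '
        NR > 1 {                      # skip the header line
            use = $5
            sub(/%$/, "", use)        # strip the trailing percent sign
            if (use + 0 >= t + 0) print $6, $5
        }'
}
```

Fed the df output from this post with a threshold of 90, it prints only `/var 100%`.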
Re: Weird behavior after performing repair database script
Posted: Tue Mar 13, 2018 10:19 am
by scottwilkerson
Can you go to Admin -> System Profile and PM your profile.zip to me or another staff member?
Thanks