
Issue installing 5.5.4 from 5.4.13

Posted: Thu Oct 04, 2018 1:47 pm
by snapon_admin
I installed this on our test box a few weeks ago with no problems, but I'm running into issues on the production server and I'm not sure what the cause is. I don't know if any of this info will help, but here is the end of the installation script output (the upgrade was failing in the GUI, so I did it the old-fashioned way). I also don't appear to be able to download a profile, so that's a bit of an issue.

Code: Select all

*** Main program, CGIs and HTML files installed ***

You can continue with installing Nagios as follows (type 'make'
without any arguments for a list of all possible options):

  make install-init
     - This installs the init script in /etc/init.d

  make install-commandmode
     - This installs and configures permissions on the
       directory for holding the external command file

  make install-config
     - This installs sample config files in /usr/local/nagios/etc

make[1]: Leaving directory `/tmp/nagiosxi/subcomponents/nagioscore/nagios-4.4.2'
Stopping nagios: ..........................................................................................
Warning - nagios did not exit in a timely manner - Killing it!
Stopping nagios: No lock file found in /var/run/nagios.lock
Starting nagios: touch: cannot touch `/usr/local/nagios/var/nagios.configtest': Permission denied
ERROR: Could not create or update '/usr/local/nagios/var/nagios.configtest'

Re: Issue installing 5.5.4 from 5.4.13

Posted: Thu Oct 04, 2018 3:23 pm
by lmiltchev
Can you run the following commands and show the output?

Code: Select all

sestatus
umask
ls -lad /usr/local/nagios /usr/local/nagios/var/
grep nagios.lock /etc/init.d/nagios /usr/local/nagios/etc/nagios.cfg
grep gearman /usr/local/nagios/etc/nagios.cfg
service nagios stop
killall nagios
service nagios start
service nagios status
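
When a list of commands like this is pasted into a terminal in one go, the shell can echo later commands in the middle of earlier output, which makes the result hard to read. One possible workaround, as a sketch only, is a small wrapper that runs each command separately and labels its output. `run_labeled` is a hypothetical helper name, not something that ships with Nagios XI.

```shell
#!/bin/sh
# run_labeled is a hypothetical helper (not part of Nagios XI):
# it prints the command line, runs it, and reports the exit status.
run_labeled() {
    echo "### $*"
    "$@" 2>&1
    status=$?
    echo "### exit status: $status"
    return "$status"
}
```

It would be called once per command, for example `run_labeled sestatus` or `run_labeled ls -lad /usr/local/nagios /usr/local/nagios/var/`, so each block of output stays attached to the command that produced it.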

Re: Issue installing 5.5.4 from 5.4.13

Posted: Thu Oct 04, 2018 3:26 pm
by snapon_admin

Code: Select all

[root@lisl-ngos-01-pv nagiosxi]# sestatus
umask
ls -lad /usr/local/nagios /usr/local/nagios/var/
SELinux status:                 enabled
SELinuxfs mount:                /selinux
Current mode:                   permissive
Mode from config file:          permissive
Policy version:                 24
Policy from config file:        targeted
You have new mail in /var/spool/mail/root
[root@lisl-ngos-01-pv nagiosxi]# umask
0022
[root@lisl-ngos-01-pv nagiosxi]# ls -lad /usr/local/nagios /usr/local/nagios/var/
grep nagios.lock /etc/init.d/nagios /usr/local/nagios/etc/nagios.cfg
drwxr-xr-x. 10 root   root   4096 May 13  2014 /usr/local/nagios
drwxrwxr-x.  6 nagios nagios 4096 Oct  4 15:24 /usr/local/nagios/var/
[root@lisl-ngos-01-pv nagiosxi]# grep nagios.lock /etc/init.d/nagios /usr/local/nagios/etc/nagios.cfg
/etc/init.d/nagios:NagiosRunFile=/var/run/nagios.lock
/usr/local/nagios/etc/nagios.cfg:lock_file=/usr/local/nagios/var/nagios.lock
[root@lisl-ngos-01-pv nagiosxi]# grep gearman /usr/local/nagios/etc/nagios.cfg
[root@lisl-ngos-01-pv nagiosxi]# service nagios stop
killall nagios
Stopping nagios: kill: usage: kill [-s sigspec | -n signum | -sigspec] pid | jobspec ... or kill -l [sigspec]
service nagios start
done.
[root@lisl-ngos-01-pv nagiosxi]# killall nagios
service nagios status[root@lisl-ngos-01-pv nagiosxi]# service nagios start
Starting nagios: done.
[root@lisl-ngos-01-pv nagiosxi]# service nagios status
No lock file found in /var/run/nagios.lock
I'm also now getting the error shown in the attached screenshot, which appeared after I attempted to restore from backup. Since this is a production system, this is now a critical issue.
borked database.png

Re: Issue installing 5.5.4 from 5.4.13

Posted: Thu Oct 04, 2018 3:34 pm
by snapon_admin
Also, with it being near the end of business today, I'm content to skip the update for now and just get this back up and working. At minimum, it's critical that I get this back to normal as soon as possible.

Re: Issue installing 5.5.4 from 5.4.13

Posted: Thu Oct 04, 2018 3:49 pm
by lmiltchev
Can you list your backups in the /store/backups/nagiosxi directory?

Code: Select all

cd /store/backups/nagiosxi
ls -lat
Which backup did you try to restore from? Are you sure that it was a "good" backup and not corrupted? It looks like the restore failed...
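
A quick way to rule out a damaged archive, sketched here under the assumption that the backups are ordinary gzip-compressed tarballs (as the `.tar.gz` names suggest), is to run gzip's integrity test and then have tar list the contents. `check_backup` is a hypothetical helper name.

```shell
#!/bin/sh
# check_backup is a hypothetical helper: it returns 0 only if the archive
# passes gzip's integrity test and tar can list its full contents.
check_backup() {
    archive="$1"
    gzip -t "$archive" 2>/dev/null || { echo "gzip test failed: $archive" >&2; return 1; }
    tar -tzf "$archive" >/dev/null 2>&1 || { echo "tar listing failed: $archive" >&2; return 1; }
    echo "archive looks readable: $archive"
}
```

For example, `check_backup /store/backups/nagiosxi/nagiosxi.1538614802.tar.gz`. A pass only shows that the archive is readable end to end. It does not prove that the files inside make up a usable backup.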

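Try fixing the mismatch in the path to the nagios.lock file: the init script looks for /var/run/nagios.lock, while nagios.cfg sets lock_file to /usr/local/nagios/var/nagios.lock. Open "/etc/init.d/nagios" in a text editor and change this: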

Code: Select all

NagiosRunFile=/var/run/nagios.lock
to this:

Code: Select all

NagiosRunFile=/usr/local/nagios/var/nagios.lock
Run the following commands to make sure that any leftover nagios.lock files are deleted:

Code: Select all

service nagios stop
killall -9 nagios
rm -f /usr/local/nagios/var/nagios.lock /var/run/nagios.lock
Then try starting nagios:

Code: Select all

service nagios start
service nagios status
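
If editing the init script by hand is inconvenient, the same path change could be sketched with sed. This is only an illustration: `fix_lock_path` is a hypothetical helper, the backup location is an assumption, and it assumes the init script contains a line starting with `NagiosRunFile=` and that nagios.cfg contains a `lock_file=` line, as the grep output earlier in the thread shows.

```shell
#!/bin/sh
# fix_lock_path is a hypothetical helper: it points NagiosRunFile in the
# init script at the same path that lock_file has in nagios.cfg.
fix_lock_path() {
    init_script="$1"                    # e.g. /etc/init.d/nagios
    nagios_cfg="$2"                     # e.g. /usr/local/nagios/etc/nagios.cfg
    backup="${3:-$init_script.bak}"     # assumed backup location

    # Read the lock_file value that Nagios Core itself uses.
    lock_path=$(sed -n 's/^lock_file=//p' "$nagios_cfg" | head -n 1)
    [ -n "$lock_path" ] || { echo "lock_file not found in $nagios_cfg" >&2; return 1; }

    # Keep a copy of the original init script before changing it.
    cp -p "$init_script" "$backup" || return 1

    # Rewrite the assignment; '|' is the sed delimiter because the path contains '/'.
    sed "s|^NagiosRunFile=.*|NagiosRunFile=$lock_path|" "$backup" > "$init_script"
}
```

It would be invoked as `fix_lock_path /etc/init.d/nagios /usr/local/nagios/etc/nagios.cfg`, followed by the stop, cleanup, and start commands above. Writing the result back through a redirect keeps the init script's existing permissions.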

Re: Issue installing 5.5.4 from 5.4.13

Posted: Thu Oct 04, 2018 3:53 pm
by snapon_admin

Code: Select all

[root@lisl-ngos-01-pv nagiosxi]# ls -lat
total 32608032
drwxr-xr-x.  3 root   root         4096 Oct  4 15:33 1538685195-restore
drwxr-xr-x. 13 nagios nagios       4096 Oct  4 15:33 .
-rw-r--r--.  1 root   root        10840 Oct  4 15:21 tmp_xi_vars.cfg
drwxr-xr-x.  3 root   root         4096 Oct  4 15:08 1538683711-restore
drwxr-xr-x.  6 root   root         4096 Oct  4 12:17 ..
-rw-r--r--.  1 nagios nagios 5731403744 Oct  4 10:59 autoupgrade_backup.1538665501.tar.gz
-rw-r--r--.  1 nagios nagios 5730395021 Oct  3 21:01 nagiosxi.1538614802.tar.gz
-rw-r--r--.  1 nagios nagios 5727386210 Oct  2 20:59 nagiosxi.1538528416.tar.gz
-rw-r--r--.  1 nagios nagios 5722750962 Oct  1 21:05 nagiosxi.1538442029.tar.gz
drwxr-xr-x.  2 nagios nagios       4096 Sep  1 10:25 test
-rw-r--r--.  1 nagios nagios 5373989250 May 17 13:47 autoupgrade_backup.1526579367.tar.gz
-rw-r--r--.  1 nagios nagios 5104598633 Jan 25  2018 autoupgrade_backup.1516899962.tar.gz
drwxr-xr-x.  8 nagios nagios       4096 Apr 23  2017 nagiosxi.1492995601
drwxr-xr-x.  5 nagios nagios       4096 Oct  4  2016 nagiosxi.1475629202
drwxr-xr-x.  5 nagios nagios       4096 Mar 24  2016 nagiosxi.1458867603
drwxr-xr-x.  5 nagios nagios       4096 Oct 19  2014 nagiosxi.1413694802
drwxr-xr-x.  2 nagios nagios       4096 Sep 21  2014 nagiosxi.1411275601
drwxr-xr-x.  4 nagios nagios       4096 Aug 25  2014 1408997977
drwxr-xr-x.  5 nagios nagios       4096 Jul 15  2014 nagiosxi.1405400402
drwxr-xr-x.  5 nagios nagios       4096 Jul  1  2014 nagiosxi.1404190801
I tried two backups, autoupgrade_backup.1538665501.tar.gz and nagiosxi.1538614802.tar.gz. The restore is failing with:

Code: Select all

Restoring Nagvis backups...
Restoring MySQL databases...
ERROR 1010 (HY000) at line 22: Error dropping database (can't rmdir './nagios/', errno: 17)
Error restoring MySQL database 'nagios' - check the password in this script!
We haven't changed any passwords, though, so I have no idea how to fix this.

I edited the files and followed your instructions. This is what I get:

Code: Select all

[root@lisl-ngos-01-pv nagiosxi]# service nagios start
Starting nagios: done.
[root@lisl-ngos-01-pv nagiosxi]# service nagios status
nagios is not running

Re: Issue installing 5.5.4 from 5.4.13

Posted: Thu Oct 04, 2018 4:02 pm
by tgriep
I updated the ticket you opened with some troubleshooting steps. Do you want me to add them here?

Re: Issue installing 5.5.4 from 5.4.13

Posted: Thu Oct 04, 2018 4:06 pm
by snapon_admin
Nah, I'm working on what you put there. I'm OK with updating either the ticket or this thread, whichever is easier. I just need to make sure this is fixed quickly; that's the only reason I opened both. Sort of an "open both and see which gets a response quickest" kind of thing. For the sake of efficiency, though, I think it'd be best to stick with one or the other. I'll just reply to the ticket when the current backup restore attempt is finished.

Re: Issue installing 5.5.4 from 5.4.13

Posted: Thu Oct 04, 2018 4:13 pm
by lmiltchev
Check the root password in the /root/scripts/automysqlbackup file. Is this the one that you are using? Can you log in to the database with this password?
Example:

Code: Select all

mysql -uroot -pnagiosxi
Can you also check the db passwords in the /usr/local/nagiosxi/html/config.inc.php file?

Run the db repair script just to rule out issues with crashed tables:

Code: Select all

/usr/local/nagiosxi/scripts/repair_databases.sh
Run the following command to fix permissions (if this is what is causing the issue) on the mysql directory/files:

Code: Select all

chown -R mysql:mysql /var/lib/mysql/*
Next, try the restore procedure again. If it fails again, you can try removing the /var/lib/mysql/nagios directory:

Code: Select all

rm -rf /var/lib/mysql/nagios 
and try restoring again.
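
On Linux, errno 17 is EEXIST ("File exists"). With a failed database drop, this usually indicates that the database directory still holds something MySQL did not remove itself, so it may be worth looking at what is in there before deleting it. The sketch below lists entries that do not look like ordinary MySQL table files. `list_stray_files` is a hypothetical helper, and the list of extensions is an assumption that may not cover every storage engine.

```shell
#!/bin/sh
# list_stray_files is a hypothetical helper: it prints top-level entries in a
# MySQL database directory that do not match common table-file names.
# The extension list is an assumption and is not exhaustive.
list_stray_files() {
    dbdir="$1"   # e.g. /var/lib/mysql/nagios
    find "$dbdir" -mindepth 1 -maxdepth 1 \
        ! -name '*.frm' ! -name '*.MYD' ! -name '*.MYI' \
        ! -name '*.ibd' ! -name 'db.opt' \
        ! -name '*.TRG' ! -name '*.TRN' ! -name '*.par'
}
```

For example, `list_stray_files /var/lib/mysql/nagios`. If anything unexpected shows up, moving the directory aside with `mv` instead of `rm -rf` would keep a copy in case the contents are needed later.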

Re: Issue installing 5.5.4 from 5.4.13

Posted: Thu Oct 04, 2018 4:13 pm
by lmiltchev
Did you fix the path to the nagios.lock file?