since 5.5.7 : lock file is misplaced after config error

Posted: Fri Dec 07, 2018 6:01 am
by sigmainformatique
Hi nagios team,

I have a bug with 5.5.7: each time I make an error in my configuration (CCM), the lock_file parameter is changed from:
lock_file=/usr/local/nagios/var/nagios.lock
to
lock_file=/var/run/nagios.lock

Result: Nagios XI misinterprets this and thinks the configuration is still broken, even after corrections.
It is very annoying, as more than one colleague can work on the configuration...

This error is reproducible in both our dev and our prod environments.
Could you please give us a workaround?

Regards
Guillaume

Re: since 5.5.7 : lock file is misplaced after config error

Posted: Fri Dec 07, 2018 1:10 pm
by lmiltchev
When does the lock file get "misplaced": after fixing the config errors, or after reverting to a "known good snapshot"? What is the OS/architecture of the machine that you have Nagios XI installed on? Are you using ModGearman? What is the Nagios Core version that you are currently using?

We have seen some issues with the lock file location on upgraded systems where, in addition to the systemd unit file, there is a nagios init file in the /etc/init.d directory. There could be a mismatch between the path to the nagios.lock in these files and the one defined in the nagios.cfg file. Nagios XI 5.5.8 should be released sometime next week, and should address these issues.
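
To compare the lock path across these files, something like the sketch below may help. It only prints the lines that mention the lock file. The unit file path /usr/lib/systemd/system/nagios.service is an assumption, not something stated in this thread, so adjust the list to your install.

```shell
#!/bin/sh
# Print every line mentioning the lock file in each given file.
# Commented-out lines are printed too, so they can be compared as well.
show_lock_paths() {
    for f in "$@"; do
        if [ -r "$f" ]; then
            # -H prefixes the file name, -n the line number
            grep -Hn -E 'lock_file|nagios\.lock' "$f" || echo "$f: no lock file reference"
        else
            echo "$f: not present or not readable"
        fi
    done
}

# Assumed locations; the unit file path in particular may differ.
show_lock_paths \
    /usr/local/nagios/etc/nagios.cfg \
    /etc/init.d/nagios \
    /usr/lib/systemd/system/nagios.service
```

If the paths printed for the different files disagree, that is the kind of mismatch described above.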

Meanwhile, you can see a solution to a similar problem here:
https://support.nagios.com/forum/viewto ... le#p269279

Re: since 5.5.7 : lock file is misplaced after config error

Posted: Thu Dec 13, 2018 11:00 am
by sigmainformatique
Thank you.

The issue is not due to a rollback. If I add a comment with
# lock_file=/usr/local/nagios/var/nagios.lock
All occurrences of lock_file (commented or not) are modified, so the modification did not come from an old nagios.cfg.

I think the rollback script makes this modification with sed or something like that. When there is no syntax issue in the Nagios config, the path is not changed.
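
To illustrate why an unanchored sed substitution would also rewrite commented lines, here is a minimal sketch. The sed expression is hypothetical and is not taken from the actual Nagios XI scripts; it only shows the behaviour such a pattern would have.

```shell
#!/bin/sh
# Hypothetical substitution, NOT copied from any Nagios XI script.
# The pattern is not anchored to the start of the line, so it also
# matches "lock_file=" inside a comment.
rewrite_lock_path() {
    sed 's|lock_file=.*|lock_file=/var/run/nagios.lock|'
}

# Feed one commented and one active line through the substitution.
printf '%s\n' \
    '# lock_file=/usr/local/nagios/var/nagios.lock' \
    'lock_file=/usr/local/nagios/var/nagios.lock' \
    | rewrite_lock_path
```

Both lines come out pointing at /var/run/nagios.lock, which would be consistent with the commented line being changed as well.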


I tried the workaround (nagios managed by systemd), but the bug remains.
I have changed the permissions on my /var/run as a dirty workaround.

Regards

Re: since 5.5.7 : lock file is misplaced after config error

Posted: Thu Dec 13, 2018 11:30 am
by lmiltchev
If you were running CentOS 7 but still had an /etc/init.d/nagios file, and you had this line in it (somewhere near the top):

Code:

# lock_file=/usr/local/nagios/var/nagios.lock
systemd would create a unit file in /run/systemd/generator.late and use the path to the nagios.lock listed above. It doesn't matter that this line is "commented out".

See more on systemd/init compatibility here:
https://www.turnkeylinux.org/blog/debug ... nit-compat

There are a couple of ways to fix it.

1. Change the path to match whatever you have in the nagios.cfg, leave the nagios init file, and remove the unit file. We have a guide for people who use ModGearman, and need to downgrade Nagios Core, which explains how to set the proper lock locations.

https://support.nagios.com/kb/article/n ... e-823.html

This method will have some disadvantages, especially if you used the "old" style commands, e.g.

Code:

service nagios stop
service nagios start
The service may be running but show as "not running" in the GUI (or vice versa). The GUI uses:

Code:

systemctl status nagios.service
So, this is not ideal if you are using an init file on CentOS 7. One way around this is to use the "new" style commands:

Code:

systemctl stop nagios
systemctl start nagios
2. The second method would be to create a unit file, and get rid of the nagios init file completely.

Hope this makes sense.

Re: since 5.5.7 : lock file is misplaced after config error

Posted: Wed Dec 26, 2018 10:33 am
by sigmainformatique
Hi team,

Did not work :
- we have totally removed any init.d reference for nagios
- Nagios is successfully managed by systemd
- We have migrated to 5.5.8

The sequence is as follows:
- Start with a good config
- Modify lock location in nagios.cfg to /usr/local/nagios/var/nagios.lock
- Create something in config to create a snapshot. For example : modify a service desc
- Generate Config
- All is OK: create a new contact without any notification command to produce an invalid configuration
- Generation fails: revert to last snapshot. nagios.lock unchanged: good
- Delete this new contact + generate configuration
- Generation is good but...
nagios.lock is reverted to /var/run!!!!!

Any idea?

Re: since 5.5.7 : lock file is misplaced after config error

Posted: Wed Dec 26, 2018 11:34 am
by lmiltchev
I discussed the issue with our developers, and I was told that from Nagios XI 5.5.8, going forward, the path to the nagios.lock will be /var/run/nagios.lock. This decision has been made for the sake of consistency, compatibility across different versions of XI, and security.

If you still wanted to change the path to the nagios.lock file to /usr/local/nagios/var, you could use the following "workaround":

1. Stop nagios service.

Code:

systemctl stop nagios.service
2. Make sure nagios is not running:

Code:

ps -ef | grep nagios.cfg | grep -v grep
3. Remove the lock from /var/run

Code:

cd /var/run
rm -f nagios.lock
4. Set the path to the nagios.lock in ALL of the files, listed below to /usr/local/nagios/var/nagios.lock
- /usr/local/nagios/etc/nagios.cfg
- /usr/local/nagiosxi/scripts/nom_restore_nagioscore_checkpoint.sh
- /usr/local/nagiosxi/scripts/nom_restore_nagioscore_checkpoint_specific.sh

5. Start nagios

Code:

systemctl start nagios.service
Important: Keep in mind that these changes will get reverted if you upgrade your Nagios XI instance!
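
As a rough check after step 4, or after an upgrade, the sketch below lists which of the three files still contain the old /var/run/nagios.lock path. It only reads the files and changes nothing. How the path is written inside the two restore scripts is not shown in this thread, so the literal string match is an assumption.

```shell
#!/bin/sh
# List the given files that still reference the old lock path.
# Prints nothing if none of the readable files contain it.
files_with_old_lock() {
    old='/var/run/nagios.lock'
    for f in "$@"; do
        # -F matches the path literally, -q suppresses grep's own output
        if [ -r "$f" ] && grep -qF "$old" "$f"; then
            echo "$f"
        fi
    done
}

# The three files named in step 4.
files_with_old_lock \
    /usr/local/nagios/etc/nagios.cfg \
    /usr/local/nagiosxi/scripts/nom_restore_nagioscore_checkpoint.sh \
    /usr/local/nagiosxi/scripts/nom_restore_nagioscore_checkpoint_specific.sh
```

Any file name it prints still needs editing by hand.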

Re: since 5.5.7 : lock file is misplaced after config error

Posted: Wed Dec 26, 2018 4:33 pm
by sigmainformatique
Thank you,

No problem for me, on the sole condition that Nagios can write to /var/run (permissions).

The root cause for changing this path was that Nagios does not have permission to write directly to /var/run on my CentOS.

I have to chmod 777 to allow this, which is not elegant, and the /var/run permissions are reset after each reboot.

Regards

Re: since 5.5.7 : lock file is misplaced after config error

Posted: Wed Dec 26, 2018 5:19 pm
by npolovenko
@sigmainformatique, Please let us know if we're good to lock the thread as resolved?

Re: since 5.5.7 : lock file is misplaced after config error

Posted: Thu Dec 27, 2018 3:39 am
by sigmainformatique
Hello,

Yes, you can close this thread, but it is not completely resolved: by default, the nagios user does not have permission to write to the /var/run directory under CentOS 7 (double-checked this evening).

The correct path should be: /var/run/nagios/nagios.lock

I will use your workaround to write the lock under the /usr branch, but I think you should correct this. I think other clients of yours will have the same issue.

Thank you
Regards
Guillaume

Re: since 5.5.7 : lock file is misplaced after config error

Posted: Thu Dec 27, 2018 12:14 pm
by ssax
Here are the default permissions for my CentOS 7 machine:
[root@xid nagiosxi]# ls -ld /var/run
lrwxrwxrwx. 1 root root 6 Aug 14 12:19 /var/run -> ../run
[root@xid nagiosxi]# ls -ld /run
drwxr-xr-x 30 root root 900 Dec 27 11:08 /run
What is the output of these commands?

Code:

umask
ls -ld /var/run
ls -ld /run
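
Alongside the commands above, a quick write-access check could look like the sketch below. It reports whether the user running it can create files in a directory, so it needs to be run as the nagios user to be meaningful. The user name nagios and the use of sudo are assumptions about the system.

```shell
#!/bin/sh
# Report whether the current user can create files in a directory.
check_writable() {
    if [ -d "$1" ] && [ -w "$1" ]; then
        echo "writable: $1"
    else
        echo "not writable: $1"
    fi
}

# /var/run is a symlink to /run on CentOS 7, as the listing above shows.
check_writable /run
```

Saved under a hypothetical name such as check_run.sh, it could be run with sudo -u nagios sh check_run.sh. Running it as root will report the directory as writable, which says nothing about the nagios user.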