Nagios.cmd file continually goes missing
Posted: Fri Aug 18, 2017 10:38 am
Hi everyone,
I've been working on this issue for the past week. We had been running Nagios Core 3.5.1 on RHEL for some time when, it seems, Core 4.3.2 became available in the RHEL package repository, and it was installed unexpectedly during our standard Linux patching process. After the shock of finding that Nagios wouldn't start after the patching, we worked to resolve that (lock file permissions). Now that we're past that, we continue to have problems with the external command file: it is suddenly deleted, and we receive this error in the web UI when trying to execute any command:
Error: Could not stat() command file '/var/spool/nagios/cmd/nagios.cmd'
Naturally, if I look in /var/spool/nagios/cmd, the file is missing.
After plenty of research, I found that running these commands, posted by someone else on this support forum, resolves it:
$ sudo service nagios stop
$ sudo killall nagios
$ sudo service nagios start
Upon restart, the nagios.cmd file is recreated in /var/spool/nagios/cmd/ and everything works as usual again. This goes on for maybe a day or two until I get the missing file error again and perform the same three commands to restore it. We don't have any issues writing commands to the file from the web UI (httpd/apache user, which is in our nagios group).
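To narrow down when the file disappears, a small check like the one below could be run periodically (from cron, for example) and its output logged. This is only a sketch: the function name check_cmd_pipe is made up for illustration and is not part of Nagios. It relies only on the standard shell tests -p (is a named pipe) and -e (exists).

```shell
#!/bin/sh
# check_cmd_pipe (hypothetical helper, not part of Nagios):
# report whether the given path exists and is a named pipe (FIFO).
# Return codes: 0 = named pipe, 1 = exists but not a pipe, 2 = missing.
check_cmd_pipe() {
    cmd_file="$1"
    if [ -p "$cmd_file" ]; then
        echo "OK: $cmd_file is a named pipe"
        return 0
    elif [ -e "$cmd_file" ]; then
        echo "WARN: $cmd_file exists but is not a named pipe"
        return 1
    else
        echo "MISSING: $cmd_file"
        return 2
    fi
}
```

It would be called as check_cmd_pipe /var/spool/nagios/cmd/nagios.cmd. Logging the result with a timestamp should show roughly when the pipe vanishes, which can then be compared against cron jobs, patching runs, or service restarts around that time.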
Permissions in the cmd directory are as follows:
$ ls -l /var/spool/nagios/cmd
total 0
prw-rw----. 1 nagios nagios 0 Aug 18 10:01 nagios.cmd
srw-rw----. 1 nagios nagios 0 Aug 18 10:01 nagios.qh
Permissions in the nagios directory:
$ ls -l /var/spool/nagios/
total 8
drwxrwsr-x. 2 nagios nagios 4096 Jun 28 11:43 checkresults
drwxrwsr-x. 2 nagios nagios 4096 Aug 18 10:01 cmd
and permissions on the nagios directory in /var/spool:
drwxrwsr-x. 4 nagios nagios 4096 Aug 12 09:08 nagios
The nagios group contains the user that nagios runs under as well as the apache user.
We never had this issue when running Core 3.5.1, so this is very perplexing. Any help would be greatly appreciated!
Thanks!