systemctl does not do anything other than start the service

DMKatIBM
Posts: 22
Joined: Thu Jan 11, 2018 3:41 pm

Post by DMKatIBM »

New install of Nagios Core (4.3.4) on Red Hat EL 7.4.
The "systemctl start nagios.service" command works fine, and Nagios runs fine when it starts.
The problem is that "systemctl stop nagios.service" and "systemctl restart nagios.service" don't do anything.
It appears to correctly execute the command, but the service doesn't stop or restart.

I've checked through a lot of posts that I could find on issues with systemctl (because this is not a native service in CentOS/RedHat 7.x), but none of the suggestions have solved my problem. I was hoping someone here may be able to provide me with a solution.

Based on what I've found so far, here is the configuration for the service:

Code: Select all

[root@dal10-build-Nagios system]# find / -name nagios\*service -print
/etc/systemd/system/multi-user.target.wants/nagios.service
/etc/systemd/system/nagios.service
/sys/fs/cgroup/systemd/system.slice/nagios.service

Code: Select all

[root@dal10-build-Nagios system]# ls -l /etc/systemd/system/multi-user.target.wants/nagios.service /etc/systemd/system/nagios.service /sys/fs/cgroup/systemd/system.slice/nagios.service
lrwxrwxrwx 1 root root  34 Jan  9 09:01 /etc/systemd/system/multi-user.target.wants/nagios.service -> /etc/systemd/system/nagios.service
-rw-r--r-- 1 root root 729 Jan  9 09:01 /etc/systemd/system/nagios.service

/sys/fs/cgroup/systemd/system.slice/nagios.service:
total 0
-rw-r--r-- 1 root root 0 Jan  9 10:34 cgroup.clone_children
--w--w--w- 1 root root 0 Jan  9 10:34 cgroup.event_control
-rw-r--r-- 1 root root 0 Jan  9 10:34 cgroup.procs
-rw-r--r-- 1 root root 0 Jan  9 10:34 notify_on_release
-rw-r--r-- 1 root root 0 Jan  9 10:34 tasks

Code: Select all

[root@dal10-build-Nagios system]# cat /etc/systemd/system/nagios.service
# Automatically generated by systemd-sysv-generator

[Unit]
Documentation=man:systemd-sysv-generator(8)
SourcePath=/etc/rc.d/init.d/nagios
Description=LSB: Starts and stops the Nagios monitoring server
Before=runlevel2.target
Before=runlevel3.target
Before=runlevel4.target
Before=runlevel5.target
Before=shutdown.target
After=network-online.target
After=network-online.target
After=nrpe.service
Wants=network-online.target
Conflicts=shutdown.target

[Service]
Type=forking
Restart=no
TimeoutSec=5min
IgnoreSIGPIPE=no
KillMode=process
GuessMainPID=no
RemainAfterExit=yes
ExecStart=/etc/rc.d/init.d/nagios start
ExecStop=/etc/rc.d/init.d/nagios stop
ExecReload=/etc/rc.d/init.d/nagios reload

[Install]
WantedBy=multi-user.target
[root@dal10-build-Nagios system]# systemctl list-unit-files | grep nagios
nagios.service                                enabled
[root@dal10-build-Nagios system]# ls -l /etc/rc.d/init.d/nagios
-rwxr-xr-x 1 root root 8243 Jan 11 13:41 /etc/rc.d/init.d/nagios
[root@dal10-build-Nagios system]#
Last edited by dwhitfield on Sat Jan 13, 2018 8:13 pm, edited 1 time in total.
Reason: code blocks for the win, at the buzzer!
dwhitfield
Former Nagios Staff
Posts: 4583
Joined: Wed Sep 21, 2016 10:29 am
Location: NoLo, Minneapolis, MN

Re: systemctl does not do anything other than start the serv

Post by dwhitfield »

What's the output of ps -aef | grep nagios.cfg? It could be there are multiple processes running and thus the stop and restart appear to do nothing. Please put the output in a code block. The "Code" button is the fifth from the left on the post input screen (between Quote and List).
DMKatIBM
Posts: 22
Joined: Thu Jan 11, 2018 3:41 pm

Re: systemctl does not do anything other than start the serv

Post by DMKatIBM »

Code: Select all

[root@dal10-build-Nagios etc]# ps -aef | grep nagios.cfg
nagios   18189     1  0 Jan11 ?        00:04:31 /usr/local/nagios/bin/nagios -d /usr/local/nagios/etc/nagios.cfg
nagios   18197 18189  0 Jan11 ?        00:00:04 /usr/local/nagios/bin/nagios -d /usr/local/nagios/etc/nagios.cfg
root     26015 13217  0 13:15 pts/0    00:00:00 grep --color=auto nagios.cfg
[root@dal10-build-Nagios etc]#
It's only the single instance (18197 spawned from 18189).
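One way to make the "single instance" check repeatable is to filter the `ps -ef` output for daemons whose parent is PID 1. This is only a sketch: the function names `nagios_master_pids` and `count_nagios_masters` are made up for illustration, and it assumes the `ps -ef` column layout shown above (PID in column 2, PPID in column 3).

```shell
#!/bin/sh
# Hypothetical helper (not from the thread): read `ps -ef`-style lines on
# stdin and print the PID of every Nagios daemon whose parent is PID 1.
# In `ps -ef` output, field 2 is the PID and field 3 is the PPID.
nagios_master_pids() {
    awk '$3 == 1 && /nagios -d / { print $2 }'
}

# Count the top-level daemons; anything other than 1 suggests stray instances.
count_nagios_masters() {
    nagios_master_pids | wc -l | tr -d ' '
}
```

It could be used as `ps -ef | nagios_master_pids`. The child that also runs with `-d` is skipped because its parent is the daemon rather than PID 1.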
dwhitfield
Former Nagios Staff
Posts: 4583
Joined: Wed Sep 21, 2016 10:29 am
Location: NoLo, Minneapolis, MN

Re: systemctl does not do anything other than start the serv

Post by dwhitfield »

Did you compile this? If so, what instructions did you use? If not, from what repo did you install Nagios Core?

systemctl stop nagios works just fine on my CentOS 7 box. 4.3.4 was released in August, so I'm pretty sure we'd know if our compile instructions caused this.

What's the output of sestatus?
DMKatIBM
Posts: 22
Joined: Thu Jan 11, 2018 3:41 pm

Re: systemctl does not do anything other than start the serv

Post by DMKatIBM »

SELinux is disabled.

So here's the thing... I compiled this when the system was CentOS 7, and everything ran fine there. I put the entire /usr/local/nagios directory into a tarball, then flattened the VM and reinstalled it as Red Hat 7.4. I recompiled Nagios, ran all the make commands (including make install-init), and installed it the same way I had on CentOS. Then I untarred everything into /usr/local/nagios. Nothing in /usr/local/nagios should have affected the systemctl files, since those all live in /usr/lib/systemd and/or /etc/systemd.

The symlinks are all correct (as far as I can see), so I don't understand why the service won't stop or restart.
dwhitfield
Former Nagios Staff
Posts: 4583
Joined: Wed Sep 21, 2016 10:29 am
Location: NoLo, Minneapolis, MN

Re: systemctl does not do anything other than start the serv

Post by dwhitfield »

What do these two commands do... anything?

Code: Select all

/etc/rc.d/init.d/nagios stop
/etc/rc.d/init.d/nagios reload
DMKatIBM
Posts: 22
Joined: Thu Jan 11, 2018 3:41 pm

Re: systemctl does not do anything other than start the serv

Post by DMKatIBM »

The stop command says it will stop it, but it actually doesn't do anything.

Code: Select all

[root@dal10-build-Nagios etc]# /etc/rc.d/init.d/nagios stop
Stopping nagios (via systemctl):                           [  OK  ]
In /var/log/messages I get the following:

Code: Select all

Jan 15 06:35:07 dal10-build-Nagios systemd: Stopping LSB: Starts and stops the Nagios monitoring server...
Jan 15 06:35:07 dal10-build-Nagios nagios: Stopping nagios:kill: usage: kill [-s sigspec | -n signum | -sigspec] pid | jobspec ... or kill -l [sigspec]
Jan 15 06:35:07 dal10-build-Nagios nagios: done.
For the reload, it just flat-out fails (nothing gets logged in /var/log/messages).

Code: Select all

[root@dal10-build-Nagios etc]# /etc/rc.d/init.d/nagios reload
Reloading nagios configuration (via systemctl):  Job for nagios.service canceled. 
                                                           [FAILED]
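The `kill: usage:` text in the log above is what bash prints when `kill` is called without a PID argument, which would be consistent with the init script ending up with an empty PID. The following is a minimal sketch of a guarded stop, not the shipped init script: `read_pid` and `safe_stop` are hypothetical names, and the lock file path is passed in rather than assumed.

```shell
#!/bin/sh
# Hypothetical sketch, not the shipped init script: read a PID from a lock
# file and refuse to call `kill` when the value is empty or not numeric.
read_pid() {
    # $1: path to the lock file; prints its first line with whitespace
    # stripped, or prints nothing if the file is not readable.
    [ -r "$1" ] && head -n 1 "$1" | tr -d '[:space:]'
}

safe_stop() {
    # $1: path to the lock file
    _pid=$(read_pid "$1")
    case "$_pid" in
        ''|*[!0-9]*)
            # An empty or non-numeric value would make a bare `kill` fail
            # with a usage message, so report it and stop here instead.
            echo "no usable PID in $1" >&2
            return 1
            ;;
    esac
    kill "$_pid"
}
```

Running `safe_stop /usr/local/nagios/var/nagios.lock` by hand would show whether the PID can be read at all, independently of systemctl.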
tgriep
Madmin
Posts: 9181
Joined: Thu Oct 30, 2014 9:02 am

Re: systemctl does not do anything other than start the serv

Post by tgriep »

When nagios is running, did it create a nagios.lock file in this location?

Code: Select all

/usr/local/nagios/var/nagios.lock
If that file doesn't exist, the /etc/rc.d/init.d/nagios script cannot get the PID of the running nagios process, so it cannot stop it.

The easiest way to fix it is to remove the nagios.service files so the system will go back to just using the init script.

So, delete these files

Code: Select all

/etc/systemd/system/nagios.service
/etc/systemd/system/multi-user.target.wants/nagios.service
Run this to reload the daemon configuration files

Code: Select all

systemctl daemon-reload
Then kill the nagios process by running

Code: Select all

killall -9 nagios
Then start / stop / restart nagios by running the following to see if they work.

Code: Select all

service nagios start
service nagios stop
service nagios restart
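The steps above can be collected into one function so they run in a fixed order. This is only a sketch: `run`, `reset_to_init_script`, and the `DRY_RUN` variable are hypothetical names, and defaulting to dry-run mode is an assumption made here because the steps delete files and kill processes.

```shell
#!/bin/sh
# Sketch of the steps above. With DRY_RUN=1 (the default, chosen here for
# safety) each command is only printed, not executed.
DRY_RUN=${DRY_RUN:-1}

run() {
    if [ "$DRY_RUN" = "1" ]; then
        echo "+ $*"
    else
        "$@"
    fi
}

reset_to_init_script() {
    # Remove the unit file and its enablement symlink.
    run rm -f /etc/systemd/system/nagios.service \
              /etc/systemd/system/multi-user.target.wants/nagios.service
    # Make systemd re-read its unit files.
    run systemctl daemon-reload
    # Kill any running Nagios processes.
    run killall -9 nagios
    # Start Nagios again through the service wrapper.
    run service nagios start
}
```

Calling `reset_to_init_script` prints the four commands for review; setting `DRY_RUN=0` as root would actually run them.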
Be sure to check out our Knowledgebase for helpful articles and solutions!
DMKatIBM
Posts: 22
Joined: Thu Jan 11, 2018 3:41 pm

Re: systemctl does not do anything other than start the serv

Post by DMKatIBM »

The lock file does exist:

Code: Select all

[root@sjc04-build-Nagios multi-user.target.wants]# ls -l /usr/local/nagios/var/nagios.*
-rw-r--r--. 1 nagios nagios     34 Jan 18 06:55 /usr/local/nagios/var/nagios.configtest
-rw-r--r--. 1 nagios nagios  50076 Jan  2 11:42 /usr/local/nagios/var/nagios.debug
-rw-r--r--. 1 nagios nagios      5 Jan 18 06:55 /usr/local/nagios/var/nagios.lock
-rw-r--r--. 1 nagios nagios 968023 Jan 18 06:55 /usr/local/nagios/var/nagios.log
-rw-rw-r--. 1 nagios nagios  49278 Aug  6  2016 /usr/local/nagios/var/nagios.tmpRdcNCL
[root@sjc04-build-Nagios multi-user.target.wants]#
But it was worth trying what you suggested. So I deleted the files, did a systemctl daemon-reload, and tried again.

Code: Select all

[root@sjc04-build-Nagios multi-user.target.wants]# service nagios stop
Stopping nagios (via systemctl):                           [  OK  ]
[root@sjc04-build-Nagios multi-user.target.wants]#
It says it is stopping it, but it doesn't actually stop. Nothing is logged in /var/log/messages (at all) about the process change, and it remains running:

Code: Select all

[root@sjc04-build-Nagios multi-user.target.wants]# ls -l /usr/local/nagios/var/nagios.*
-rw-r--r--. 1 nagios nagios     34 Jan 18 06:55 /usr/local/nagios/var/nagios.configtest
-rw-r--r--. 1 nagios nagios  50076 Jan  2 11:42 /usr/local/nagios/var/nagios.debug
-rw-r--r--. 1 nagios nagios      5 Jan 18 06:55 /usr/local/nagios/var/nagios.lock
-rw-r--r--. 1 nagios nagios 968023 Jan 18 06:55 /usr/local/nagios/var/nagios.log
-rw-rw-r--. 1 nagios nagios  49278 Aug  6  2016 /usr/local/nagios/var/nagios.tmpRdcNCL
[root@sjc04-build-Nagios multi-user.target.wants]# ps -ef | grep nagios
nagios    4473     1  1 06:55 ?        00:00:00 /usr/local/nagios/bin/nagios -d /usr/local/nagios/etc/nagios.cfg
nagios    4475  4473  0 06:55 ?        00:00:00 /usr/local/nagios/bin/nagios --worker /usr/local/nagios/var/rw/nagios.qh
nagios    4476  4473  0 06:55 ?        00:00:00 /usr/local/nagios/bin/nagios --worker /usr/local/nagios/var/rw/nagios.qh
nagios    4477  4473  0 06:55 ?        00:00:00 /usr/local/nagios/bin/nagios --worker /usr/local/nagios/var/rw/nagios.qh
nagios    4478  4473  0 06:55 ?        00:00:00 /usr/local/nagios/bin/nagios --worker /usr/local/nagios/var/rw/nagios.qh
nagios    4479  4473  0 06:55 ?        00:00:00 /usr/local/nagios/bin/nagios --worker /usr/local/nagios/var/rw/nagios.qh
nagios    4480  4473  0 06:55 ?        00:00:00 /usr/local/nagios/bin/nagios --worker /usr/local/nagios/var/rw/nagios.qh
nagios    4481  4473  0 06:55 ?        00:00:00 /usr/local/nagios/bin/nagios --worker /usr/local/nagios/var/rw/nagios.qh
nagios    4482  4473  0 06:55 ?        00:00:00 /usr/local/nagios/bin/nagios --worker /usr/local/nagios/var/rw/nagios.qh
nagios    4483  4473  0 06:55 ?        00:00:00 /usr/local/nagios/bin/nagios --worker /usr/local/nagios/var/rw/nagios.qh
nagios    4484  4473  0 06:55 ?        00:00:00 /usr/local/nagios/bin/nagios --worker /usr/local/nagios/var/rw/nagios.qh
nagios    4485  4473  0 06:55 ?        00:00:00 /usr/local/nagios/bin/nagios --worker /usr/local/nagios/var/rw/nagios.qh
nagios    4486  4473  0 06:55 ?        00:00:00 /usr/local/nagios/bin/nagios --worker /usr/local/nagios/var/rw/nagios.qh
nagios    4488  4473  0 06:55 ?        00:00:00 /usr/local/nagios/bin/nagios -d /usr/local/nagios/etc/nagios.cfg
nagios    4789  4477  0 06:56 ?        00:00:00 /usr/local/nagios/libexec/check_http -S -H 10.164.14.204 -w 10 -c 20 -p 443
root      4791  3379  0 06:56 pts/0    00:00:00 grep --color=auto nagios
[root@sjc04-build-Nagios multi-user.target.wants]#
If I kill it manually, the lock file does get cleaned up:

Code: Select all

[root@sjc04-build-Nagios multi-user.target.wants]# kill 4473
[root@sjc04-build-Nagios multi-user.target.wants]# ls -l /usr/local/nagios/var/nagios.*
-rw-r--r--. 1 nagios nagios     34 Jan 18 06:55 /usr/local/nagios/var/nagios.configtest
-rw-r--r--. 1 nagios nagios  50076 Jan  2 11:42 /usr/local/nagios/var/nagios.debug
-rw-r--r--. 1 nagios nagios 968341 Jan 18 06:56 /usr/local/nagios/var/nagios.log
-rw-rw-r--. 1 nagios nagios  49278 Aug  6  2016 /usr/local/nagios/var/nagios.tmpRdcNCL
[root@sjc04-build-Nagios multi-user.target.wants]#
I greatly appreciate your help so far.

Any further thoughts as to what could be going on here?
tgriep
Madmin
Posts: 9181
Joined: Thu Oct 30, 2014 9:02 am

Re: systemctl does not do anything other than start the serv

Post by tgriep »

If you look in the /usr/local/nagios/var/nagios.lock file when Nagios is running, does the PID number match the PID number of the Nagios Daemon?
When you run the following commands as root

Code: Select all

/etc/rc.d/init.d/nagios stop
/etc/rc.d/init.d/nagios start
Does the Nagios daemon stop and start?

If you look in the init script

Code: Select all

/etc/rc.d/init.d/nagios
does the path to the nagios.lock file match where the lock file is created?
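Both checks can be scripted. The helpers below are a sketch with hypothetical names (`lock_paths_in_script`, `lock_pid_matches`). The first simply greps for anything that looks like an absolute path ending in `.lock`, so if the init script builds the path from variables its output will need reading by hand.

```shell
#!/bin/sh
# Hypothetical helpers for the two checks above; all paths are passed in.

# Print every absolute *.lock path that appears literally in the script $1.
lock_paths_in_script() {
    grep -o '/[^ "=]*\.lock' "$1" | sort -u
}

# Succeed only if the PID stored in lock file $1 equals the expected PID $2.
lock_pid_matches() {
    [ -r "$1" ] || return 1
    [ "$(tr -d '[:space:]' < "$1")" = "$2" ]
}
```

For example, `lock_paths_in_script /etc/rc.d/init.d/nagios` lists the candidate paths, and `lock_pid_matches /usr/local/nagios/var/nagios.lock "$(pgrep -o -x nagios)"` compares the stored PID with the oldest running `nagios` process, on the assumption that the oldest one is the main daemon.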