systemctl does not do anything other than start the service

Postby DMKatIBM » Fri Jan 12, 2018 8:45 am

New install of Nagios Core (4.3.4) on Red Hat EL 7.4.
The "systemctl start nagios.service" command works fine, and Nagios runs fine when it starts.
The problem is that "systemctl stop nagios.service" and "systemctl restart nagios.service" don't do anything.
It appears to correctly execute the command, but the service doesn't stop or restart.

I've checked through a lot of posts that I could find on issues with systemctl (because this is not a native service in CentOS/RedHat 7.x), but none of the suggestions have solved my problem. I was hoping someone here may be able to provide me with a solution.

Based on what I've found so far, here is the configuration for the service:

Code: Select all
[root@dal10-build-Nagios system]# find / -name nagios\*service -print
/etc/systemd/system/multi-user.target.wants/nagios.service
/etc/systemd/system/nagios.service
/sys/fs/cgroup/systemd/system.slice/nagios.service

Code: Select all
[root@dal10-build-Nagios system]# ls -l /etc/systemd/system/multi-user.target.wants/nagios.service /etc/systemd/system/nagios.service /sys/fs/cgroup/systemd/system.slice/nagios.service
lrwxrwxrwx 1 root root  34 Jan  9 09:01 /etc/systemd/system/multi-user.target.wants/nagios.service -> /etc/systemd/system/nagios.service
-rw-r--r-- 1 root root 729 Jan  9 09:01 /etc/systemd/system/nagios.service

/sys/fs/cgroup/systemd/system.slice/nagios.service:
total 0
-rw-r--r-- 1 root root 0 Jan  9 10:34 cgroup.clone_children
--w--w--w- 1 root root 0 Jan  9 10:34 cgroup.event_control
-rw-r--r-- 1 root root 0 Jan  9 10:34 cgroup.procs
-rw-r--r-- 1 root root 0 Jan  9 10:34 notify_on_release
-rw-r--r-- 1 root root 0 Jan  9 10:34 tasks

Code: Select all
[root@dal10-build-Nagios system]# cat /etc/systemd/system/nagios.service
# Automatically generated by systemd-sysv-generator

[Unit]
Documentation=man:systemd-sysv-generator(8)
SourcePath=/etc/rc.d/init.d/nagios
Description=LSB: Starts and stops the Nagios monitoring server
Before=runlevel2.target
Before=runlevel3.target
Before=runlevel4.target
Before=runlevel5.target
Before=shutdown.target
After=network-online.target
After=network-online.target
After=nrpe.service
Wants=network-online.target
Conflicts=shutdown.target

[Service]
Type=forking
Restart=no
TimeoutSec=5min
IgnoreSIGPIPE=no
KillMode=process
GuessMainPID=no
RemainAfterExit=yes
ExecStart=/etc/rc.d/init.d/nagios start
ExecStop=/etc/rc.d/init.d/nagios stop
ExecReload=/etc/rc.d/init.d/nagios reload

[Install]
WantedBy=multi-user.target
[root@dal10-build-Nagios system]# systemctl list-unit-files | grep nagios
nagios.service                                enabled
[root@dal10-build-Nagios system]# ls -l /etc/rc.d/init.d/nagios
-rwxr-xr-x 1 root root 8243 Jan 11 13:41 /etc/rc.d/init.d/nagios
[root@dal10-build-Nagios system]#
Last edited by dwhitfield on Sat Jan 13, 2018 8:13 pm, edited 1 time in total.
Reason: code blocks for the win, at the buzzer!
DMKatIBM
 
Posts: 7
Joined: Thu Jan 11, 2018 3:41 pm

Re: systemctl does not do anything other than start the serv

Postby dwhitfield » Fri Jan 12, 2018 2:02 pm

What's the output of ps -aef | grep nagios.cfg? It could be there are multiple processes running and thus the stop and restart appear to do nothing. Please put the output in a code block. The "Code" button is the fifth from the left on the post input screen (between Quote and List).
Be sure to check out our Knowledgebase for helpful articles and solutions!
User avatar
dwhitfield
The Doctor
 
Posts: 4307
Joined: Wed Sep 21, 2016 10:29 am
Location: Nagios Enterprises, LLC

Re: systemctl does not do anything other than start the serv

Postby DMKatIBM » Fri Jan 12, 2018 2:16 pm

Code: Select all
[root@dal10-build-Nagios etc]# ps -aef | grep nagios.cfg
nagios   18189     1  0 Jan11 ?        00:04:31 /usr/local/nagios/bin/nagios -d /usr/local/nagios/etc/nagios.cfg
nagios   18197 18189  0 Jan11 ?        00:00:04 /usr/local/nagios/bin/nagios -d /usr/local/nagios/etc/nagios.cfg
root     26015 13217  0 13:15 pts/0    00:00:00 grep --color=auto nagios.cfg
[root@dal10-build-Nagios etc]#


It's only the single instance (18197 spawned from 18189).
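
One way to double-check that reading of the ps output is to filter for Nagios daemons whose parent is PID 1. This is only a minimal sketch: it assumes input in the column layout produced by `ps -eo pid,ppid,args`, and the function name `top_level_nagios_daemons` is made up for illustration.

```shell
#!/bin/sh
# Hypothetical helper: reads "PID PPID ARGS..." lines on stdin and prints
# the PID of every "nagios -d" process whose parent is PID 1.
# More than one line of output would suggest several independent daemons.
top_level_nagios_daemons() {
    awk '$2 == 1 && /\/nagios -d / { print $1 }'
}

# Typical use on a live system:
#   ps -eo pid,ppid,args | top_level_nagios_daemons
```

With the output posted above, only 18189 has PID 1 as its parent, so a single line would be printed.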

Re: systemctl does not do anything other than start the serv

Postby dwhitfield » Fri Jan 12, 2018 4:33 pm

Did you compile this? If so, what instructions did you use? If not, from what repo did you install Nagios Core?

systemctl stop nagios works just fine on my CentOS 7 box. 4.3.4 was released in August, so I'm pretty sure we'd know if our compile instructions caused this.

What's the output of sestatus?

Re: systemctl does not do anything other than start the serv

Postby DMKatIBM » Sat Jan 13, 2018 10:26 am

SELinux is disabled.

So here's the thing... I compiled this when the system was CentOS 7, and everything ran fine there. I put the entire /usr/local/nagios directory into a tarball, then flattened the VM and reinstalled it as Red Hat 7.4. I recompiled Nagios, ran all the make commands (including make install-init), and installed it the same way I did on CentOS. Then I untarred everything into /usr/local/nagios. Nothing in /usr/local/nagios should have impacted the systemctl files, since those all live in /usr/lib/systemd and/or /etc/systemd.

The symlinks are all correct (that I can see), so I don't get why the service won't stop or restart with it.

Re: systemctl does not do anything other than start the serv

Postby dwhitfield » Sat Jan 13, 2018 8:19 pm

What do these two commands do... anything?
Code: Select all
/etc/rc.d/init.d/nagios stop
/etc/rc.d/init.d/nagios reload

Re: systemctl does not do anything other than start the serv

Postby DMKatIBM » Mon Jan 15, 2018 7:38 am

The stop command says it will stop it, but it actually doesn't do anything.

Code: Select all
[root@dal10-build-Nagios etc]# /etc/rc.d/init.d/nagios stop
Stopping nagios (via systemctl):                           [  OK  ]


In /var/log/messages I get the following:

Code: Select all
Jan 15 06:35:07 dal10-build-Nagios systemd: Stopping LSB: Starts and stops the Nagios monitoring server...
Jan 15 06:35:07 dal10-build-Nagios nagios: Stopping nagios:kill: usage: kill [-s sigspec | -n signum | -sigspec] pid | jobspec ... or kill -l [sigspec]
Jan 15 06:35:07 dal10-build-Nagios nagios: done.


For the reload, it just flat-out fails (nothing gets logged in /var/log/messages).

Code: Select all
[root@dal10-build-Nagios etc]# /etc/rc.d/init.d/nagios reload
Reloading nagios configuration (via systemctl):  Job for nagios.service canceled.
                                                           [FAILED]

Re: systemctl does not do anything other than start the serv

Postby tgriep » Mon Jan 15, 2018 4:29 pm

When nagios is running, did it create a nagios.lock file in this location?
Code: Select all
/usr/local/nagios/var/nagios.lock


If that file doesn't exist, the /etc/rc.d/init.d/nagios script cannot get the PID of the running nagios process, so it cannot stop it.
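
The `kill: usage: ...` line logged during the earlier stop attempt is the message bash's `kill` prints when it is called without a usable PID argument, which would be consistent with a stop routine that ended up with an empty PID. As a rough illustration only (this is not the actual Nagios init script, and `stop_by_lockfile` is a hypothetical helper), a stop routine can guard against that case instead of calling `kill` blindly:

```shell
#!/bin/sh
# Hypothetical sketch, NOT the real /etc/rc.d/init.d/nagios logic.
# Reads a PID from the given lock file and sends SIGTERM to it,
# refusing to call kill when no numeric PID could be read.
stop_by_lockfile() {
    lockfile=$1
    # A missing or unreadable file leaves pid empty.
    pid=$(cat "$lockfile" 2>/dev/null | tr -d '[:space:]')
    case $pid in
        ''|*[!0-9]*)
            echo "no usable PID in $lockfile" >&2
            return 1
            ;;
    esac
    kill "$pid"
}

# Typical use (path taken from this thread):
#   stop_by_lockfile /usr/local/nagios/var/nagios.lock
```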

The easiest way to fix it is to remove the nagios.service files so the system will go back to just using the init script.

So, delete these files
Code: Select all
/etc/systemd/system/nagios.service
/etc/systemd/system/multi-user.target.wants/nagios.service


Run this to reload the daemon configuration files
Code: Select all
systemctl daemon-reload


Then kill the nagios process by running
Code: Select all
killall -9 nagios


Then start / stop / restart nagios by running the following to see if they work.
Code: Select all
service nagios start
service nagios stop
service nagios restart
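
The steps above can be collected into a small script that can be reviewed before anything is changed. This is only a sketch: the `run` wrapper, the `DRY_RUN` variable, and the `reset_to_init_script` name are assumptions added for illustration, and by default the script only prints the commands rather than executing them.

```shell
#!/bin/sh
# Hypothetical wrapper: prints each command unless DRY_RUN=0 is set.
run() {
    if [ "${DRY_RUN:-1}" = "1" ]; then
        echo "would run: $*"
    else
        "$@"
    fi
}

# Same sequence as described in this post, in the same order.
reset_to_init_script() {
    run rm -f /etc/systemd/system/nagios.service \
              /etc/systemd/system/multi-user.target.wants/nagios.service
    run systemctl daemon-reload
    run killall -9 nagios
    run service nagios start
}

# Typical use:
#   reset_to_init_script            # print the commands only
#   DRY_RUN=0; reset_to_init_script # actually run them, as root
```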
User avatar
tgriep
Madmin
 
Posts: 6337
Joined: Thu Oct 30, 2014 9:02 am

Re: systemctl does not do anything other than start the serv

Postby DMKatIBM » Thu Jan 18, 2018 8:01 am

The lock file does exist:

Code: Select all
[root@sjc04-build-Nagios multi-user.target.wants]# ls -l /usr/local/nagios/var/nagios.*
-rw-r--r--. 1 nagios nagios     34 Jan 18 06:55 /usr/local/nagios/var/nagios.configtest
-rw-r--r--. 1 nagios nagios  50076 Jan  2 11:42 /usr/local/nagios/var/nagios.debug
-rw-r--r--. 1 nagios nagios      5 Jan 18 06:55 /usr/local/nagios/var/nagios.lock
-rw-r--r--. 1 nagios nagios 968023 Jan 18 06:55 /usr/local/nagios/var/nagios.log
-rw-rw-r--. 1 nagios nagios  49278 Aug  6  2016 /usr/local/nagios/var/nagios.tmpRdcNCL
[root@sjc04-build-Nagios multi-user.target.wants]#


But it was worth trying what you suggested. So I deleted the files, did a systemctl daemon-reload, and tried again.

Code: Select all
[root@sjc04-build-Nagios multi-user.target.wants]# service nagios stop
Stopping nagios (via systemctl):                           [  OK  ]
[root@sjc04-build-Nagios multi-user.target.wants]#


It says it is stopping it, but it doesn't actually stop. Nothing is logged in /var/log/messages (at all) about the process change, and it remains running:

Code: Select all
[root@sjc04-build-Nagios multi-user.target.wants]# ls -l /usr/local/nagios/var/nagios.*
-rw-r--r--. 1 nagios nagios     34 Jan 18 06:55 /usr/local/nagios/var/nagios.configtest
-rw-r--r--. 1 nagios nagios  50076 Jan  2 11:42 /usr/local/nagios/var/nagios.debug
-rw-r--r--. 1 nagios nagios      5 Jan 18 06:55 /usr/local/nagios/var/nagios.lock
-rw-r--r--. 1 nagios nagios 968023 Jan 18 06:55 /usr/local/nagios/var/nagios.log
-rw-rw-r--. 1 nagios nagios  49278 Aug  6  2016 /usr/local/nagios/var/nagios.tmpRdcNCL
[root@sjc04-build-Nagios multi-user.target.wants]# ps -ef | grep nagios
nagios    4473     1  1 06:55 ?        00:00:00 /usr/local/nagios/bin/nagios -d /usr/local/nagios/etc/nagios.cfg
nagios    4475  4473  0 06:55 ?        00:00:00 /usr/local/nagios/bin/nagios --worker /usr/local/nagios/var/rw/nagios.qh
nagios    4476  4473  0 06:55 ?        00:00:00 /usr/local/nagios/bin/nagios --worker /usr/local/nagios/var/rw/nagios.qh
nagios    4477  4473  0 06:55 ?        00:00:00 /usr/local/nagios/bin/nagios --worker /usr/local/nagios/var/rw/nagios.qh
nagios    4478  4473  0 06:55 ?        00:00:00 /usr/local/nagios/bin/nagios --worker /usr/local/nagios/var/rw/nagios.qh
nagios    4479  4473  0 06:55 ?        00:00:00 /usr/local/nagios/bin/nagios --worker /usr/local/nagios/var/rw/nagios.qh
nagios    4480  4473  0 06:55 ?        00:00:00 /usr/local/nagios/bin/nagios --worker /usr/local/nagios/var/rw/nagios.qh
nagios    4481  4473  0 06:55 ?        00:00:00 /usr/local/nagios/bin/nagios --worker /usr/local/nagios/var/rw/nagios.qh
nagios    4482  4473  0 06:55 ?        00:00:00 /usr/local/nagios/bin/nagios --worker /usr/local/nagios/var/rw/nagios.qh
nagios    4483  4473  0 06:55 ?        00:00:00 /usr/local/nagios/bin/nagios --worker /usr/local/nagios/var/rw/nagios.qh
nagios    4484  4473  0 06:55 ?        00:00:00 /usr/local/nagios/bin/nagios --worker /usr/local/nagios/var/rw/nagios.qh
nagios    4485  4473  0 06:55 ?        00:00:00 /usr/local/nagios/bin/nagios --worker /usr/local/nagios/var/rw/nagios.qh
nagios    4486  4473  0 06:55 ?        00:00:00 /usr/local/nagios/bin/nagios --worker /usr/local/nagios/var/rw/nagios.qh
nagios    4488  4473  0 06:55 ?        00:00:00 /usr/local/nagios/bin/nagios -d /usr/local/nagios/etc/nagios.cfg
nagios    4789  4477  0 06:56 ?        00:00:00 /usr/local/nagios/libexec/check_http -S -H 10.164.14.204 -w 10 -c 20 -p 443
root      4791  3379  0 06:56 pts/0    00:00:00 grep --color=auto nagios
[root@sjc04-build-Nagios multi-user.target.wants]#


If I kill it manually, the lock file does get cleaned up:

Code: Select all
[root@sjc04-build-Nagios multi-user.target.wants]# kill 4473
[root@sjc04-build-Nagios multi-user.target.wants]# ls -l /usr/local/nagios/var/nagios.*
-rw-r--r--. 1 nagios nagios     34 Jan 18 06:55 /usr/local/nagios/var/nagios.configtest
-rw-r--r--. 1 nagios nagios  50076 Jan  2 11:42 /usr/local/nagios/var/nagios.debug
-rw-r--r--. 1 nagios nagios 968341 Jan 18 06:56 /usr/local/nagios/var/nagios.log
-rw-rw-r--. 1 nagios nagios  49278 Aug  6  2016 /usr/local/nagios/var/nagios.tmpRdcNCL
[root@sjc04-build-Nagios multi-user.target.wants]#


I highly appreciate your help so far.

Any further thoughts as to what could be going on here?

Re: systemctl does not do anything other than start the serv

Postby tgriep » Thu Jan 18, 2018 10:36 am

If you look in the /usr/local/nagios/var/nagios.lock file when Nagios is running, does the PID number match the PID number of the Nagios Daemon?
When you run the following commands as root
Code: Select all
/etc/rc.d/init.d/nagios stop
/etc/rc.d/init.d/nagios start

Does the Nagios daemon stop and start?

If you look in the init script
Code: Select all
/etc/rc.d/init.d/nagios

does the path to the nagios.lock file match where the lock file is created?
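
The first of these checks, whether the PID in the lock file belongs to a live process, can be done mechanically. A small sketch, assuming a Linux system with /proc mounted; `lock_pid_status` is a hypothetical helper name and the lock file path is passed as an argument:

```shell
#!/bin/sh
# Hypothetical helper: reports whether the PID stored in a lock file
# refers to a process that currently exists (Linux /proc assumed).
lock_pid_status() {
    lockfile=$1
    if [ ! -r "$lockfile" ]; then
        echo "missing"
        return 1
    fi
    pid=$(tr -d '[:space:]' < "$lockfile")
    case $pid in
        ''|*[!0-9]*)
            echo "invalid"
            return 1
            ;;
    esac
    if [ -d "/proc/$pid" ]; then
        echo "running $pid"
    else
        echo "stale $pid"
        return 1
    fi
}

# Typical use (path taken from this thread):
#   lock_pid_status /usr/local/nagios/var/nagios.lock
```

The PID it reports can then be compared with the parent daemon PID shown by ps. Note that this only confirms a process with that PID exists, not that it is Nagios.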