Nagios Support Forum

Posted: **Thu Aug 25, 2016 3:32 am**

Hi all,
I have been searching and reading a few posts, not only on this forum but also around the web, about Nagios restart/reload issues that have affected at least a couple of versions.
I am running Nagios Core 4.2.0 (just updated from 4.1.1) on a new Debian Jessie machine; I have installed everything from source.
I am also successfully running Nagiosgraph and MRTG Stats in the same installation.
My only issue, which I can't resolve and would like some help with, is that the reload does not do anything at all.
I have tried to trigger the reload with both:

/etc/init.d/nagios reload

service nagios reload

and while running tail -f on nagios.log I can see that nothing gets logged, so I think nothing gets triggered.
At the moment my only option is to restart Nagios every time I make a configuration change or add a new host. For now this is a tolerable workaround, since I am running only 3 hosts, but I am not going to invest a lot of time in this version if I can't get the reload working.
Thank you in advance for your help, and let me know if you need further details to help me out or point me toward a possible solution.
Regards

Posted: **Thu Aug 25, 2016 3:52 pm**

Please run the reload like this and post the last 20 lines of output:

bash -x /etc/init.d/nagios reload

Posted: **Fri Aug 26, 2016 3:19 am**

Thank you for your reply tmcdonald.
I have just added another host, going from 4 to 5, and ran the reload command that you provided; tail -f /usr/local/nagios/var/nagios.log produces nothing.
The web interface is still not updated and shows 4 hosts.

Here you go:

+ TMPFILE=/tmp/.configtest.vglw3ydh
+ /usr/local/nagios/bin/nagios -vp /usr/local/nagios/etc/nagios.cfg
++ grep '^Total Warnings:' /tmp/.configtest.vglw3ydh
++ sed 's/ //g'
++ awk -F: '{print $2}'
+ WARN=0
++ grep '^Total Errors:' /tmp/.configtest.vglw3ydh
++ awk -F: '{print $2}'
++ sed 's/ //g'
+ ERR=0
+ test 0 = 0
+ test 0 = 0
+ echo 'OK - Configuration check verified'
+ chmod 0644 /usr/local/nagios/var/nagios.configtest
+ chown nagios:nagios /usr/local/nagios/var/nagios.configtest
+ /bin/rm /tmp/.configtest.vglw3ydh
+ return 0
+ test '!' -f /usr/local/nagios/var/nagios.lock
+ /etc/init.d/nagios start
[ ok ] Starting nagios (via systemctl): nagios.service.
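
Reading the trace above: the configuration check passes, then the test for a missing /usr/local/nagios/var/nagios.lock succeeds and the script appears to fall through to "start", so no SIGHUP is sent to the running daemon. Below is a minimal sketch of that decision. It assumes the lock file holds the PID of the main Nagios process; the function name reload_action is hypothetical and is not part of the shipped init script.

```shell
#!/bin/sh
# Hypothetical helper: report what a reload would do for a given lock file.
# Assumption: the lock file contains the PID of the running daemon.
reload_action() {
    lockfile="$1"
    if [ ! -f "$lockfile" ]; then
        # Mirrors the trace: no lock file, so the script runs "start" instead.
        echo "start"
        return 0
    fi
    pid=$(cat "$lockfile")
    if kill -0 "$pid" 2>/dev/null; then
        # The process is alive: a real reload would send it SIGHUP.
        echo "hup $pid"
    else
        # The lock file exists but no such process is running.
        echo "stale"
    fi
}
```

Running reload_action /usr/local/nagios/var/nagios.lock on the affected machine would show which branch applies, without signalling anything.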

Posted: **Fri Aug 26, 2016 1:57 pm**

Since this is a Core thing and our C dev will need to look at it, we might be better off filing this on GitHub so it's on his radar sooner:

https://github.com/NagiosEnterprises/nagioscore/issues

If you have an account please make a post there referencing this thread, otherwise let me know and I can file on your behalf.

Posted: **Tue Aug 30, 2016 3:48 am**

As advised, I have opened the following on GitHub:
4.2.0 not reloading #158 https://github.com/NagiosEnterprises/na ... issues/158
Will keep this thread updated.

Posted: **Tue Aug 30, 2016 9:40 am**

<Yoda voice>On my radar, it is.</Yoda voice>

Run a tail -f /usr/local/nagios/var/nagios.log and do a kill -HUP <pid-of-nagios> and let me know if the reload happens.
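
One way to find the PID to signal, sketched here as an assumption rather than an official method: Nagios logs a "starting... (PID=N)" line at startup, so the most recent such line in nagios.log should name the main process. The helper name last_nagios_pid is hypothetical.

```shell
#!/bin/sh
# Hypothetical helper: print the PID from the most recent
# "Nagios ... starting... (PID=N)" line read on stdin.
last_nagios_pid() {
    # Keep only the digits inside "(PID=...)", then take the last match.
    sed -n 's/.*starting\.\.\. (PID=\([0-9][0-9]*\)).*/\1/p' | tail -n 1
}
```

For example, kill -HUP "$(last_nagios_pid < /usr/local/nagios/var/nagios.log)". This only works if the log has not been rotated since the last start, so it is worth confirming the PID with ps first.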

Posted: **Tue Aug 30, 2016 10:40 am**

Hi Yoda,
Thank you for your reply. I have added a new host to test the reload:

Nagios processes before the test:

ps -aux | grep -i nagios
nagios    8705  0.0  0.1  14972  2708 ?        Ss   13:06   0:02 /usr/local/nagios/bin/nagios /usr/local/nagios/etc/nagios.cfg
nagios    8710  0.0  0.1  10696  2312 ?        S    13:06   0:00 /usr/local/nagios/bin/nagios --worker /usr/local/nagios/var/rw/nagios.qh
nagios    8711  0.0  0.1  10696  2200 ?        S    13:06   0:00 /usr/local/nagios/bin/nagios --worker /usr/local/nagios/var/rw/nagios.qh
nagios    8712  0.0  0.1  10696  2312 ?        S    13:06   0:00 /usr/local/nagios/bin/nagios --worker /usr/local/nagios/var/rw/nagios.qh
nagios    8713  0.0  0.1  10696  2204 ?        S    13:06   0:00 /usr/local/nagios/bin/nagios --worker /usr/local/nagios/var/rw/nagios.qh
nagios    8721  0.0  0.0  14456  1444 ?        S    13:06   0:01 /usr/local/nagios/bin/nagios /usr/local/nagios/etc/nagios.cfg

Kill and reload

root@dev-nagios-01: kill -HUP 8705
root@dev-nagios-01: /etc/init.d/nagios reload
Running configuration check...
[ ok ] Starting nagios (via systemctl): nagios.service.

Nagios processes after the reload (note the <defunct> entries):

root@dev-nagios-01: ps -aux | grep -i nagios
nagios    8705  0.0  0.1  15032  3344 ?        Ss   13:06   0:02 /usr/local/nagios/bin/nagios /usr/local/nagios/etc/nagios.cfg
nagios    8712  0.0  0.0      0     0 ?        Z    13:06   0:00 [nagios] <defunct>
nagios    8713  0.0  0.0      0     0 ?        Z    13:06   0:00 [nagios] <defunct>
nagios    8721  0.0  0.0  14456  1444 ?        S    13:06   0:01 /usr/local/nagios/bin/nagios /usr/local/nagios/etc/nagios.cfg
root     13083  0.0  0.0   5844   676 pts/1    S+   16:16   0:00 tail -f /usr/local/nagios/var/nagios.log
nagios   13303  0.0  0.1  10696  2232 ?        S    16:25   0:00 /usr/local/nagios/bin/nagios --worker /usr/local/nagios/var/rw/nagios.qh
nagios   13304  0.0  0.1  10696  2276 ?        S    16:25   0:00 /usr/local/nagios/bin/nagios --worker /usr/local/nagios/var/rw/nagios.qh
nagios   13305  0.0  0.1  10696  2216 ?        S    16:25   0:00 /usr/local/nagios/bin/nagios --worker /usr/local/nagios/var/rw/nagios.qh
nagios   13306  0.0  0.1  10696  2220 ?        S    16:25   0:00 /usr/local/nagios/bin/nagios --worker /usr/local/nagios/var/rw/nagios.qh
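
The two <defunct> entries (8712 and 8713) are old workers in state Z: they have exited but their parent has not yet collected their exit status. A small sketch for picking such entries out of ps output, assuming the "ps aux" column layout shown above (PID in column 2, STAT in column 8); the helper name defunct_pids is hypothetical.

```shell
#!/bin/sh
# Hypothetical helper: print the PIDs of zombie (state Z) processes
# from "ps aux"-style lines read on stdin.
defunct_pids() {
    # Column 8 is STAT; the header line reads "STAT" and does not match.
    awk '$8 ~ /^Z/ { print $2 }'
}
```

Usage would be ps aux | defunct_pids, run again a little later to see whether the zombies have been reaped.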

Meanwhile, in nagios.log:

root@dev-nagios-01: tail -f /usr/local/nagios/var/nagios.log
[1472558813] nerd: Fully initialized and ready to rock!
[1472558813] wproc: Successfully registered manager as @wproc with query handler
[1472558813] wproc: Registry request: name=Core Worker 8710;pid=8710
[1472558813] wproc: Registry request: name=Core Worker 8712;pid=8712
[1472558813] wproc: Registry request: name=Core Worker 8713;pid=8713
[1472558813] wproc: Registry request: name=Core Worker 8711;pid=8711
[1472558813] Successfully launched command file worker with pid 8721
[1472562413] Auto-save of retention data completed successfully.
[1472566013] Auto-save of retention data completed successfully.
[1472569612] Auto-save of retention data completed successfully.
[1472570703] Caught SIGHUP, restarting...
[1472570703] Event broker module 'NERD' deinitialized successfully.
[1472570703] Nagios 4.2.0 starting... (PID=8705)
[1472570703] Local time is Tue Aug 30 16:25:03 BST 2016
[1472570703] LOG VERSION: 2.0
[1472570703] qh: Socket '/usr/local/nagios/var/rw/nagios.qh' successfully initialized
[1472570703] qh: core query handler registered
[1472570703] nerd: Channel hostchecks registered successfully
[1472570703] nerd: Channel servicechecks registered successfully
[1472570703] nerd: Channel opathchecks registered successfully
[1472570703] nerd: Fully initialized and ready to rock!
[1472570703] wproc: Successfully registered manager as @wproc with query handler
[1472570703] wproc: Registry request: name=Core Worker 13303;pid=13303
[1472570703] wproc: Registry request: name=Core Worker 13305;pid=13305
[1472570703] wproc: Registry request: name=Core Worker 13306;pid=13306
[1472570703] wproc: Registry request: name=Core Worker 13304;pid=13304
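
The bracketed numbers in the log are Unix epoch seconds, which makes it awkward to line them up with the START column in the ps output. A rough sketch for converting them, assuming GNU date (the -d @SECONDS form) as found on Debian; the helper name readable_log is hypothetical, and output is in UTC, one hour behind the BST times shown in the log.

```shell
#!/bin/sh
# Hypothetical helper: rewrite "[epoch] message" lines from stdin with a
# human-readable UTC timestamp. Assumes GNU date and well-formed lines.
readable_log() {
    while IFS= read -r line; do
        epoch=${line#\[}        # drop the leading "["
        epoch=${epoch%%\]*}     # keep only the digits before "]"
        msg=${line#*\] }        # everything after "] "
        printf '%s %s\n' "$(date -u -d "@$epoch" '+%Y-%m-%d %H:%M:%S')" "$msg"
    done
}
```

For example, tail -n 20 /usr/local/nagios/var/nagios.log | readable_log.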

I had to edit my last statement: I don't get a reload but a restart, since on the web interface all the hosts are grayed out and PENDING.

Posted: **Tue Aug 30, 2016 11:07 am**

Pitone_Maledetto wrote: Kill and reload

root@dev-nagios-01: kill -HUP 8705
root@dev-nagios-01: /etc/init.d/nagios reload
Running configuration check...
[ ok ] Starting nagios (via systemctl): nagios.service.

Meanwhile, in nagios.log:

root@dev-nagios-01: tail -f /usr/local/nagios/var/nagios.log
[1472570703] Caught SIGHUP, restarting...
[1472570703] Event broker module 'NERD' deinitialized successfully.
[1472570703] Nagios 4.2.0 starting... (PID=8705)
[1472570703] Local time is Tue Aug 30 16:25:03 BST 2016
[1472570703] LOG VERSION: 2.0

Reload successful!

I see you did both the kill -HUP 8705 and /etc/init.d/nagios reload. Did the reload happen from the kill or the /etc/init.d/nagios reload?

I also noticed it said Starting nagios (via systemctl): nagios.service. I didn't think Jessie was running systemd. Please attach /etc/init.d/nagios and /usr/lib/systemd/system/nagios.service files.

Since you're running systemd, maybe also try systemctl reload nagios and see if that works.

Does kill -HUP 8705 work?
Does /etc/init.d/nagios reload work?
Does systemctl reload nagios work?

Posted: **Wed Aug 31, 2016 12:54 am**

Thanks jfrickson,

Attached are the requested files. Note that nagios.service is not in the directory you mentioned but in two other locations:
/etc/systemd/system/nagios.service (attached)
/etc/systemd/system/multi-user.target.wants/nagios.service

I have stopped Nagios and started afresh. I have added a check to test the reload:

NAGIOS STARTED:

[1472620492] Nagios 4.2.0 starting... (PID=1916)
[1472620492] Local time is Wed Aug 31 06:14:52 BST 2016
[1472620492] LOG VERSION: 2.0
[1472620492] qh: Socket '/usr/local/nagios/var/rw/nagios.qh' successfully initialized
[1472620492] qh: core query handler registered
[1472620492] nerd: Channel hostchecks registered successfully
[1472620492] nerd: Channel servicechecks registered successfully
[1472620492] nerd: Channel opathchecks registered successfully
[1472620492] nerd: Fully initialized and ready to rock!
[1472620492] wproc: Successfully registered manager as @wproc with query handler
[1472620492] wproc: Registry request: name=Core Worker 1921;pid=1921
[1472620492] wproc: Registry request: name=Core Worker 1925;pid=1925
[1472620492] wproc: Registry request: name=Core Worker 1924;pid=1924
[1472620492] wproc: Registry request: name=Core Worker 1922;pid=1922
[1472620492] Successfully launched command file worker with pid 1934

kill -HUP (did not reload but re-started the service; all hosts are in PENDING status)

[1472620680] Caught SIGHUP, restarting...
[1472620680] Event broker module 'NERD' deinitialized successfully.
[1472620680] Nagios 4.2.0 starting... (PID=1916)
[1472620680] Local time is Wed Aug 31 06:18:00 BST 2016
[1472620680] LOG VERSION: 2.0
[1472620680] qh: Socket '/usr/local/nagios/var/rw/nagios.qh' successfully initialized
[1472620680] qh: core query handler registered
[1472620680] nerd: Channel hostchecks registered successfully
[1472620680] nerd: Channel servicechecks registered successfully
[1472620680] nerd: Channel opathchecks registered successfully
[1472620680] nerd: Fully initialized and ready to rock!
[1472620680] wproc: Successfully registered manager as @wproc with query handler
[1472620680] wproc: Registry request: name=Core Worker 2043;pid=2043
[1472620680] wproc: Registry request: name=Core Worker 2046;pid=2046
[1472620680] wproc: Registry request: name=Core Worker 2044;pid=2044
[1472620680] wproc: Registry request: name=Core Worker 2045;pid=2045

/etc/init.d/nagios reload (did not reload; no changes in log output)

systemctl reload nagios (Failed)
root@dev-nagios-01: systemctl reload nagios
Failed to reload nagios.service: Job type reload is not applicable for unit nagios.service.
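
That systemd error generally means the unit file declares no ExecReload= line, so there is no reload job for systemctl to run. A quick check, sketched on the assumption that the unit in use is the /etc/systemd/system/nagios.service mentioned above; the helper name has_exec_reload is hypothetical.

```shell
#!/bin/sh
# Hypothetical helper: succeed if the given unit file declares ExecReload=.
has_exec_reload() {
    grep -q '^[[:space:]]*ExecReload=' "$1"
}
```

For example: has_exec_reload /etc/systemd/system/nagios.service && echo "reload supported" || echo "no ExecReload". A common pattern in other units is ExecReload=/bin/kill -HUP $MAINPID, but whether that is the right fix for this unit is an assumption, not something confirmed in this thread.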

Thank you for your time

Posted: **Wed Aug 31, 2016 9:45 am**

Now that I have a better idea of what's going on, all further conversation will take place on the github issue - https://github.com/NagiosEnterprises/na ... issues/158

4.2.0 not reloading
