Issue: NDO3/Nagios
Posted: Tue Mar 30, 2021 12:09 pm
We are running Nagios XI 5.8.2. While watching /usr/local/nagios/var/nagios.log, we constantly see Nagios restarting, as if someone keeps pressing the 'Apply Configuration' button. We see this:
[1617112105] NDO-3: Started acknowledgement thread
[1617112105] NDO-3: Started notification thread
[1617112105] NDO-3: Started service_status thread
[1617112105] NDO-3: Started statechange thread
[1617112105] NDO-3: Started downtime thread
[1617112105] Successfully launched command file worker with pid 652150
[1617112105] Successfully shutdown... (PID=651953)
[1617112105] NDO-3: Callbacks deregistered
[1617112105] NDO-3: NDO - Shutdown complete
[1617112105] Event broker module '/usr/local/nagios/bin/ndo.so' deinitialized successfully.
We also see this sometimes:
[1617112988] Caught SIGTERM, shutting down...
[1617112988] Caught SIGTERM, shutting down...
[1617112988] Caught SIGTERM, shutting down...
[1617112988] Successfully shutdown... (PID=658264)
[1617112988] NDO-3: Callbacks deregistered
[1617112988] NDO-3: NDO - Shutdown complete
[1617112988] Event broker module '/usr/local/nagios/bin/ndo.so' deinitialized successfully.
[1617112988] Nagios 4.4.6 starting... (PID=659313)
[1617112988] Local time is Tue Mar 30 09:03:08 CDT 2021
[1617112988] LOG VERSION: 2.0
[1617112988] qh: Socket '/usr/local/nagios/var/rw/nagios.qh' successfully initialized
[1617112988] qh: core query handler registered
[1617112988] qh: echo service query handler registered
[1617112988] qh: help for the query handler registered
[1617112988] wproc: Successfully registered manager as @wproc with query handler
[1617112988] wproc: Registry request: name=Core Worker 659317;pid=659317
[1617112988] wproc: Registry request: name=Core Worker 659318;pid=659318
[1617112988] wproc: Registry request: name=Core Worker 659319;pid=659319
[1617112988] wproc: Registry request: name=Core Worker 659321;pid=659321
[1617112988] wproc: Registry request: name=Core Worker 659324;pid=659324
[1617112988] wproc: Registry request: name=Core Worker 659323;pid=659323
[1617112988] wproc: Registry request: name=Core Worker 659325;pid=659325
[1617112988] wproc: Registry request: name=Core Worker 659331;pid=659331
[1617112988] wproc: Registry request: name=Core Worker 659332;pid=659332
[1617112988] wproc: Registry request: name=Core Worker 659327;pid=659327
[1617112988] wproc: Registry request: name=Core Worker 659330;pid=659330
[1617112988] wproc: Registry request: name=Core Worker 659328;pid=659328
[1617112988] NDO-3: NDO 3.0.6RC1 (c) Copyright 2009-2020 Nagios - Nagios Core Development Team
[1617112988] NDO-3: Callbacks registered
[1617112988] NDO-3: Callbacks registered
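To put a number on how often these restarts happen, the log can be scanned for the "starting..." and "Caught SIGTERM" lines shown above. The following is only a rough sketch: the function names summarize_restarts and to_utc are made up for illustration, and it assumes every log line has the "[epoch] message" shape seen in the excerpts.

```python
import re
from datetime import datetime, timezone

# Nagios Core log lines look like "[<epoch seconds>] <message>".
LINE_RE = re.compile(r"^\[(\d+)\]\s+(.*)$")
# Matches e.g. "Nagios 4.4.6 starting... (PID=659313)".
START_RE = re.compile(r"^Nagios \S+ starting\.\.\.")


def summarize_restarts(lines):
    """Return (start_times, sigterm_times) as lists of epoch seconds.

    Hypothetical helper: it only recognises the two message types
    quoted in this post and ignores everything else.
    """
    starts, sigterms = [], []
    for line in lines:
        match = LINE_RE.match(line.strip())
        if not match:
            continue  # skip lines without a leading "[epoch]" stamp
        ts, msg = int(match.group(1)), match.group(2)
        if START_RE.match(msg):
            starts.append(ts)
        elif msg.startswith("Caught SIGTERM"):
            sigterms.append(ts)
    return starts, sigterms


def to_utc(ts):
    """Render an epoch timestamp as an ISO 8601 string in UTC."""
    return datetime.fromtimestamp(ts, tz=timezone.utc).isoformat()
```

Passing an open file handle for /usr/local/nagios/var/nagios.log to summarize_restarts gives the start and SIGTERM timestamps, and to_utc makes them readable. Note that one shutdown can log "Caught SIGTERM" several times, as in the excerpt above, so the SIGTERM count is an upper bound on the number of shutdowns.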
After enough of these restarts, the nagios service fails:
[root@nagios-new log]# systemctl status nagios
● nagios.service - Nagios Core 4.4.6
Loaded: loaded (/usr/lib/systemd/system/nagios.service; enabled; vendor preset: disabled)
Active: failed (Result: start-limit-hit) since Tue 2021-03-30 09:07:47 CDT; 2min 58s ago
Docs: https://www.nagios.org/documentation
Process: 671339 ExecStopPost=/usr/bin/rm -f /usr/local/nagios/var/rw/nagios.cmd (code=exited, status=0/SUCCESS)
Process: 671315 ExecStop=/usr/bin/kill -s TERM ${MAINPID} (code=exited, status=0/SUCCESS)
Process: 671133 ExecStart=/usr/local/nagios/bin/nagios -d /usr/local/nagios/etc/nagios.cfg (code=exited, status=0/SUCCESS)
Process: 671130 ExecStartPre=/usr/local/nagios/bin/nagios -v /usr/local/nagios/etc/nagios.cfg (code=exited, status=0/SUCCESS)
Main PID: 671134 (code=exited, status=0/SUCCESS)
Mar 30 09:07:46 nagios-new.acentek.net nagios[671134]: Warning: Contact 'testing' has no service notification time period defined!
Mar 30 09:07:46 nagios-new.acentek.net nagios[671134]: Warning: Contact 'testing' has no host notification time period defined!
Mar 30 09:07:47 nagios-new.acentek.net nagios[671134]: Successfully launched command file worker with pid 671337
Mar 30 09:07:47 nagios-new.acentek.net nagios[671134]: Successfully shutdown... (PID=671134)
Mar 30 09:07:47 nagios-new.acentek.net nagios[671134]: Event broker module '/usr/local/nagios/bin/ndo.so' deinitialized successfully.
Mar 30 09:07:47 nagios-new.acentek.net systemd[1]: nagios.service: Succeeded.
Mar 30 09:07:47 nagios-new.acentek.net systemd[1]: Stopped Nagios Core 4.4.6.
Mar 30 09:07:47 nagios-new.acentek.net systemd[1]: nagios.service: Start request repeated too quickly.
Mar 30 09:07:47 nagios-new.acentek.net systemd[1]: nagios.service: Failed with result 'start-limit-hit'.
Mar 30 09:07:47 nagios-new.acentek.net systemd[1]: Failed to start Nagios Core 4.4.6.
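The 'start-limit-hit' result means systemd refused another start because too many start requests arrived within its rate-limit window. The sketch below is a simplified model of that check, not systemd's actual implementation. It assumes the stock systemd defaults of 5 starts per 10 seconds; the values that really apply to nagios.service on this host may differ, and hits_start_limit is a hypothetical name.

```python
from collections import deque


def hits_start_limit(start_times, burst=5, interval=10):
    """Return the first start time that exceeds the limit, or None.

    Simplified model: a start is refused once more than `burst`
    starts fall inside a sliding window of `interval` seconds.
    The defaults are assumptions based on systemd's stock settings.
    """
    window = deque()
    for ts in sorted(start_times):
        window.append(ts)
        # Drop starts that are too old to count against this one.
        while ts - window[0] >= interval:
            window.popleft()
        if len(window) > burst:
            return ts
    return None
```

Feeding it the start timestamps pulled from nagios.log shows whether the restarts are dense enough to trip the limit under these assumed values.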
Server OS:
Linux nagios-new.acentek.net 4.18.0-240.1.1.el8_3.x86_64 #1 SMP Thu Nov 19 17:20:08 UTC 2020 x86_64 x86_64 x86_64 GNU/Linux