This support forum board is for support questions relating to
Nagios XI, our flagship commercial network monitoring solution.
Johnsmit
Posts: 95 Joined: Thu Apr 19, 2018 2:03 pm
Post
by Johnsmit » Tue May 29, 2018 3:49 pm
I'm working on replication of the Nagios databases, and in the meantime I ran into this error.
Nagios XI version 5.4.13, RHEL 7.3 environment.
From the Nagios log:
Code:
Successfully launched command file worker with pid 29149
[1527615723] Caught SIGTERM, shutting down...
[1527615723] Successfully shutdown... (PID=29107)
[1527615723] Event broker module 'NERD' deinitialized successfully.
[1527615723] ndomod: Shutdown complete.
[1527615723] Event broker module '/usr/local/nagios/bin/ndomod.o' deinitialized successfully.
[1527615724] Nagios 4.2.4 starting... (PID=29611)
[1527615724] Local time is Tue May 29 13:42:04 EDT 2018
[1527615724] LOG VERSION: 2.0
[1527615724] qh: Socket '/usr/local/nagios/var/rw/nagios.qh' successfully initialized
[1527615724] qh: core query handler registered
[1527615724] nerd: Channel hostchecks registered successfully
[1527615724] nerd: Channel servicechecks registered successfully
[1527615724] nerd: Channel opathchecks registered successfully
[1527615724] nerd: Fully initialized and ready to rock!
[1527615724] wproc: Successfully registered manager as @wproc with query handler
[1527615724] wproc: Registry request: name=Core Worker 29612;pid=29612
[1527615724] wproc: Registry request: name=Core Worker 29613;pid=29613
[1527615724] wproc: Registry request: name=Core Worker 29615;pid=29615
[1527615724] wproc: Registry request: name=Core Worker 29614;pid=29614
[1527615724] ndomod: NDOMOD 2.1.2 (11-14-2016) Copyright (c) 2009 Nagios Core Development Team and Community Contributors
[1527615724] ndomod: Successfully connected to data sink. 0 queued items to flush.
[1527615724] ndomod registered for process data
[1527615724] ndomod registered for log data'
[1527615724] ndomod registered for system command data'
[1527615724] ndomod registered for event handler data'
[1527615724] ndomod registered for notification data'
[1527615724] ndomod registered for comment data'
[1527615724] ndomod registered for downtime data'
[1527615724] ndomod registered for flapping data'
[1527615724] ndomod registered for program status data'
[1527615724] ndomod registered for host status data'
[1527615724] ndomod registered for service status data'
[1527615724] ndomod registered for adaptive program data'
[1527615724] ndomod registered for adaptive host data'
[1527615724] ndomod registered for adaptive service data'
[1527615724] ndomod registered for external command data'
[1527615724] ndomod registered for aggregated status data'
[1527615724] ndomod registered for retention data'
[1527615724] ndomod registered for contact data'
[1527615724] ndomod registered for contact notification data'
[1527615724] ndomod registered for acknowledgement data'
[1527615724] ndomod registered for state change data'
[1527615724] ndomod registered for contact status data'
[1527615724] ndomod registered for adaptive contact data'
[1527615724] Event broker module '/usr/local/nagios/bin/ndomod.o' initialized successfully.
[1527615730] Successfully launched command file worker with pid 29651
[1527617301] Caught SIGTERM, shutting down...
Any thoughts?
Last edited by
tmcdonald on Tue May 29, 2018 3:53 pm, edited 1 time in total.
Reason: Please use [code][/code] tags around long output
scottwilkerson
DevOps Engineer
Posts: 19396 Joined: Tue Nov 15, 2011 3:11 pm
Location: Nagios Enterprises
Contact:
Post
by scottwilkerson » Tue May 29, 2018 4:39 pm
This happens when the nagios service is stopped or restarted (this includes someone applying configuration or running a wizard).
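To see how often the service is being stopped and started, the shutdown and startup lines can be pulled out of the log and their epoch timestamps converted to readable times. The following is a minimal sketch, not an official Nagios tool: the function name `find_restart_events` is hypothetical, and the sample lines are copied from the log posted above.

```python
import re
from datetime import datetime, timezone

# Matches Nagios log lines of the form "[<epoch>] <message>".
LINE_RE = re.compile(r"^\[(\d+)\]\s+(.*)$")


def find_restart_events(lines):
    """Return (utc_datetime, message) pairs for shutdown and startup lines.

    Hypothetical helper for illustration only.
    """
    events = []
    for line in lines:
        match = LINE_RE.match(line.strip())
        if not match:
            continue  # skip lines that have no leading [epoch] timestamp
        epoch, message = int(match.group(1)), match.group(2)
        if "Caught SIGTERM" in message or re.search(r"Nagios .* starting", message):
            when = datetime.fromtimestamp(epoch, tz=timezone.utc)
            events.append((when, message))
    return events


# Sample lines taken from the log in the first post.
sample = [
    "[1527615723] Caught SIGTERM, shutting down...",
    "[1527615724] Nagios 4.2.4 starting... (PID=29611)",
    "[1527615730] Successfully launched command file worker with pid 29651",
    "[1527617301] Caught SIGTERM, shutting down...",
]

for when, message in find_restart_events(sample):
    print(when.isoformat(), message)
```

Comparing these times against when configuration was applied or a wizard was run should show what triggered each restart.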
by Johnsmit » Wed May 30, 2018 8:42 am
Hello,
None of us is applying configuration or running a wizard. While replicating master to slave, I saw the error on both master and slave. It's breaking replication in my case. Do we need to block it?
Thanks,
by scottwilkerson » Wed May 30, 2018 8:52 am
Johnsmit wrote: Hello,
None of us is applying configuration or running a wizard. While replicating master to slave, I saw the error on both master and slave. It's breaking replication in my case. Do we need to block it?
Thanks,
There is nothing to block. This is the program restarting; you CANNOT block that.
by Johnsmit » Wed May 30, 2018 10:01 am
I was reading https://support.nagios.com/forum/viewto ... =6&t=37769 and checked the nagios.cfg file, where I saw the following:
grep broker nagios.cfg
broker_module=/usr/local/nagios/bin/ndomod.o config_file=/usr/local/nagios/etc/ndomod.cfg
event_broker_options=-1
This broker module seems to be causing the SIGTERM database issue, which means it is impossible to use database replication for Nagios. The article is from 2014; I can't believe that there isn't a fix for this problem. SIGTERM is not a normal Linux system process. Please help, we have been spinning our wheels for days. The replication works one time, then we get errors caused by the SIGTERM.
Thanks
by scottwilkerson » Wed May 30, 2018 10:58 am
SIGTERM isn't a process; it is a signal sent TO a process to tell it to terminate. Seeing this in the nagios log has nothing to do with database replication.
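To illustrate what "a signal sent to a process" means, here is a minimal sketch in Python, assuming a POSIX system such as the RHEL host in this thread. It is not Nagios code: the handler name `handle_sigterm` and the `received` list are hypothetical, and the printed message only mimics the log line above.

```python
import os
import signal

# Signals seen by the handler are recorded here (for illustration only).
received = []


def handle_sigterm(signum, frame):
    # A real daemon would begin its clean shutdown here; this sketch
    # only records the signal and prints a message.
    received.append(signum)
    print("Caught SIGTERM, shutting down...")


# Register the handler, then send SIGTERM to this very process,
# which is what a service stop or restart does to a running daemon.
signal.signal(signal.SIGTERM, handle_sigterm)
os.kill(os.getpid(), signal.SIGTERM)
```

The log line is the receiving process reporting that something asked it to stop. To find the cause, look at whatever sent the signal.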
by Johnsmit » Wed May 30, 2018 11:16 am
What is the function of "broker_module"? What does it do? During replication it stops and shuts down ndo2db, so the databases go out of sync and replication breaks.
by scottwilkerson » Wed May 30, 2018 11:57 am
The broker_module you are referencing is what extends Nagios Core so that it can put data into the database.
Without it, you would have no data to replicate.
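To check which broker modules a given nagios.cfg loads, the `broker_module=<path> [args]` lines can be read programmatically. This is a rough sketch only: `parse_broker_modules` is a hypothetical helper, and the sample text is the grep output quoted earlier in this thread.

```python
def parse_broker_modules(cfg_text):
    """Return (module_path, module_args) pairs from nagios.cfg-style text.

    Hypothetical helper for illustration; it only splits the line and
    does not validate the module path or its arguments.
    """
    modules = []
    for raw in cfg_text.splitlines():
        line = raw.strip()
        if line.startswith("#") or "=" not in line:
            continue  # skip comments and lines that are not key=value
        key, _, value = line.partition("=")
        if key.strip() != "broker_module":
            continue
        # The module path comes first; anything after the first space
        # is passed to the module as its arguments.
        path, _, args = value.strip().partition(" ")
        modules.append((path, args.strip()))
    return modules


# Sample text copied from the grep output posted above.
sample_cfg = """\
broker_module=/usr/local/nagios/bin/ndomod.o config_file=/usr/local/nagios/etc/ndomod.cfg
event_broker_options=-1
"""

print(parse_broker_modules(sample_cfg))
```

Here the only module loaded is ndomod.o, which receives its own config file as an argument.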