Re: Problem with "host escalation" config and monitoring eng

Posted: Mon Jan 07, 2019 9:38 am
by scottwilkerson
Are you seeing anything in the nagios.log that could be a clue to why it is dying?

Code: Select all

tail -50 /usr/local/nagios/var/nagios.log
Also, can we verify that there isn't a process currently running?

Code: Select all

ps -ef|grep nagios.cfg
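
One thing to keep in mind with that check: the grep can match its own command line, so a lone "grep ... nagios.cfg" line in the output does not mean Nagios is running. A minimal sketch that filters it out (nagios_procs is a hypothetical helper name, not something shipped with Nagios):

```shell
#!/bin/sh
# Hypothetical helper: reads `ps -ef` output on stdin and prints only the
# lines that mention nagios.cfg and are not a grep command themselves.
nagios_procs() {
    grep 'nagios\.cfg' | grep -v 'grep'
}
```

It would be used as "ps -ef | nagios_procs". Empty output then means no process was started with nagios.cfg on its command line.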

Re: Problem with "host escalation" config and monitoring eng

Posted: Mon Jan 07, 2019 10:27 am
by morabanc
Hi Scott,

Thanks for your time.

We always see these lines in nagios.log.

Everything seems to be correct.

tail -50 /usr/local/nagios/var/nagios.log

Code: Select all

[1546871745] wproc: Registry request: name=Core Worker 31872;pid=31872
[1546871745] wproc: Registry request: name=Core Worker 31873;pid=31873
[1546871745] wproc: Registry request: name=Core Worker 31874;pid=31874
[1546871745] wproc: Registry request: name=Core Worker 31875;pid=31875
[1546871745] wproc: Registry request: name=Core Worker 31876;pid=31876
[1546871745] wproc: Registry request: name=Core Worker 31878;pid=31878
[1546871745] wproc: Registry request: name=Core Worker 31879;pid=31879
[1546871745] wproc: Registry request: name=Core Worker 31881;pid=31881
[1546871745] wproc: Registry request: name=Core Worker 31882;pid=31882
[1546871745] wproc: Registry request: name=Core Worker 31883;pid=31883
[1546871745] wproc: Registry request: name=Core Worker 31884;pid=31884
[1546871745] wproc: Registry request: name=Core Worker 31885;pid=31885
[1546871745] wproc: Registry request: name=Core Worker 31886;pid=31886
[1546871745] wproc: Registry request: name=Core Worker 31887;pid=31887
[1546871745] wproc: Registry request: name=Core Worker 31888;pid=31888
[1546871745] wproc: Registry request: name=Core Worker 31889;pid=31889
[1546871745] wproc: Registry request: name=Core Worker 31890;pid=31890
[1546871745] wproc: Registry request: name=Core Worker 31891;pid=31891
[1546871745] wproc: Registry request: name=Core Worker 31892;pid=31892
[1546871745] wproc: Registry request: name=Core Worker 31893;pid=31893
[1546871745] wproc: Registry request: name=Core Worker 31894;pid=31894
[1546871745] wproc: Registry request: name=Core Worker 31895;pid=31895
[1546871745] wproc: Registry request: name=Core Worker 31896;pid=31896
[1546871745] wproc: Registry request: name=Core Worker 31898;pid=31898
[1546871745] ndomod: NDOMOD 2.1.2 (11-14-2016) Copyright (c) 2009 Nagios Core Development Team and Community Contributors
[1546871745] ndomod: Successfully connected to data sink.  0 queued items to flush.
[1546871745] ndomod registered for process data
[1546871745] ndomod registered for log data'
[1546871745] ndomod registered for system command data'
[1546871745] ndomod registered for event handler data'
[1546871745] ndomod registered for notification data'
[1546871745] ndomod registered for comment data'
[1546871745] ndomod registered for downtime data'
[1546871745] ndomod registered for flapping data'
[1546871745] ndomod registered for program status data'
[1546871745] ndomod registered for host status data'
[1546871745] ndomod registered for service status data'
[1546871745] ndomod registered for adaptive program data'
[1546871745] ndomod registered for adaptive host data'
[1546871745] ndomod registered for adaptive service data'
[1546871745] ndomod registered for external command data'
[1546871745] ndomod registered for aggregated status data'
[1546871745] ndomod registered for retention data'
[1546871745] ndomod registered for contact data'
[1546871745] ndomod registered for contact notification data'
[1546871745] ndomod registered for acknowledgement data'
[1546871745] ndomod registered for state change data'
[1546871745] ndomod registered for contact status data'
[1546871745] ndomod registered for adaptive contact data'
[1546871745] Event broker module '/usr/local/nagios/bin/ndomod.o' initialized successfully.
There is no active process for nagios.cfg:

ps -ef|grep nagios.cfg

Code: Select all

root      9788  7411  0 16:02 pts/1    00:00:00 grep --color=auto nagios.cfg

After a few days, I'm now able to apply the config, but the Monitoring Engine is still down and doesn't come up.

I've attached 2 pictures to show our current state. After trying to "start" the process, we get these logs in nagios.log:

Code: Select all

[1546873693] Nagios 4.4.2 starting... (PID=13135)
[1546873693] Local time is Mon Jan 07 16:08:13 CET 2019
[1546873693] LOG VERSION: 2.0
[1546873693] qh: Socket '/usr/local/nagios/var/rw/nagios.qh' successfully initialized
[1546873693] qh: core query handler registered
[1546873693] qh: echo service query handler registered
[1546873693] qh: help for the query handler registered
[1546873693] wproc: Successfully registered manager as @wproc with query handler
[1546873693] wproc: Registry request: name=Core Worker 13136;pid=13136
[1546873693] wproc: Registry request: name=Core Worker 13138;pid=13138
[1546873693] wproc: Registry request: name=Core Worker 13139;pid=13139
[1546873693] wproc: Registry request: name=Core Worker 13140;pid=13140
[1546873693] wproc: Registry request: name=Core Worker 13141;pid=13141
[1546873693] wproc: Registry request: name=Core Worker 13143;pid=13143
[1546873693] wproc: Registry request: name=Core Worker 13142;pid=13142
[1546873693] wproc: Registry request: name=Core Worker 13144;pid=13144
[1546873693] wproc: Registry request: name=Core Worker 13145;pid=13145
[1546873693] wproc: Registry request: name=Core Worker 13146;pid=13146
[1546873693] wproc: Registry request: name=Core Worker 13147;pid=13147
[1546873693] wproc: Registry request: name=Core Worker 13148;pid=13148
[1546873693] wproc: Registry request: name=Core Worker 13149;pid=13149
[1546873693] wproc: Registry request: name=Core Worker 13150;pid=13150
[1546873693] wproc: Registry request: name=Core Worker 13151;pid=13151
[1546873693] wproc: Registry request: name=Core Worker 13152;pid=13152
[1546873693] wproc: Registry request: name=Core Worker 13153;pid=13153
[1546873693] wproc: Registry request: name=Core Worker 13155;pid=13155
[1546873693] wproc: Registry request: name=Core Worker 13154;pid=13154
[1546873693] wproc: Registry request: name=Core Worker 13156;pid=13156
[1546873693] wproc: Registry request: name=Core Worker 13157;pid=13157
[1546873693] wproc: Registry request: name=Core Worker 13158;pid=13158
[1546873693] wproc: Registry request: name=Core Worker 13160;pid=13160
[1546873693] wproc: Registry request: name=Core Worker 13161;pid=13161
[1546873693] wproc: Registry request: name=Core Worker 13162;pid=13162
[1546873693] wproc: Registry request: name=Core Worker 13163;pid=13163
[1546873693] wproc: Registry request: name=Core Worker 13164;pid=13164
[1546873693] wproc: Registry request: name=Core Worker 13165;pid=13165
[1546873693] wproc: Registry request: name=Core Worker 13168;pid=13168
[1546873693] wproc: Registry request: name=Core Worker 13169;pid=13169
[1546873693] wproc: Registry request: name=Core Worker 13170;pid=13170
[1546873693] wproc: Registry request: name=Core Worker 13171;pid=13171
[1546873693] wproc: Registry request: name=Core Worker 13172;pid=13172
[1546873693] wproc: Registry request: name=Core Worker 13173;pid=13173
[1546873693] wproc: Registry request: name=Core Worker 13175;pid=13175
[1546873693] wproc: Registry request: name=Core Worker 13176;pid=13176
[1546873693] ndomod: NDOMOD 2.1.2 (11-14-2016) Copyright (c) 2009 Nagios Core Development Team and Community Contributors
[1546873693] ndomod: Successfully connected to data sink.  0 queued items to flush.
[1546873693] ndomod registered for process data
[1546873693] ndomod registered for log data'
[1546873693] ndomod registered for system command data'
[1546873693] ndomod registered for event handler data'
[1546873693] ndomod registered for notification data'
[1546873693] ndomod registered for comment data'
[1546873693] ndomod registered for downtime data'
[1546873693] ndomod registered for flapping data'
[1546873693] ndomod registered for program status data'
[1546873693] ndomod registered for host status data'
[1546873693] ndomod registered for service status data'
[1546873693] ndomod registered for adaptive program data'
[1546873693] ndomod registered for adaptive host data'
[1546873693] ndomod registered for adaptive service data'
[1546873693] ndomod registered for external command data'
[1546873693] ndomod registered for aggregated status data'
[1546873693] ndomod registered for retention data'
[1546873693] ndomod registered for contact data'
[1546873693] ndomod registered for contact notification data'
[1546873693] ndomod registered for acknowledgement data'
[1546873693] ndomod registered for state change data'
[1546873693] ndomod registered for contact status data'
[1546873693] ndomod registered for adaptive contact data'
[1546873693] Event broker module '/usr/local/nagios/bin/ndomod.o' initialized successfully.
After that, I upgraded from version 5.5.7 to 5.5.8 successfully, and everything seems to be OK.

I compared nagios.log with the new output and it's similar (the only difference is 1 new line about the nagios.cfg process).
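
A comparison like this is easier when the parts that change on every start (the epoch timestamp and the PIDs) are masked first. A rough sketch of that idea; normalize_log and the file names in the usage note are hypothetical, and the patterns only cover the line shapes shown in this thread:

```shell
#!/bin/sh
# Hypothetical helper: reads nagios.log lines on stdin, drops the leading
# [epoch] stamp and masks PIDs so that two startup runs can be diffed.
normalize_log() {
    sed -e 's/^\[[0-9]*\] //' \
        -e 's/Core Worker [0-9]*;pid=[0-9]*/Core Worker N;pid=N/' \
        -e 's/(PID=[0-9]*)/(PID=N)/'
}
```

For example: "normalize_log < before.log > before.norm", the same for after.log, then "diff before.norm after.norm". The number of Core Worker lines can still differ between runs, so those lines may show up in the diff.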

Before the ticket is closed, we would like to know why this failure happened so that we can prevent it in the future. Can you share that information?


Thanks for everything.

Re: Problem with "host escalation" config and monitoring eng

Posted: Mon Jan 07, 2019 10:44 am
by scottwilkerson
Just so I am clear: after upgrading to 5.5.8, everything was working successfully?

If so, you could have been affected by a bug fixed in 5.5.8 in the ndoutils integration, which could cause nagios to segfault on startup.

Code: Select all

- Fixed issue with specific configurations in ndoutils causing Core to crash by updating ndoutils to 2.1.3 -JO
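
To see which ndoutils version a system is actually loading, the ndomod banner that gets written to nagios.log at startup can be parsed. A rough sketch, assuming the banner format seen in the logs in this thread (ndomod_version is a hypothetical helper name):

```shell
#!/bin/sh
# Hypothetical helper: reads nagios.log on stdin and prints the version from
# the most recent "ndomod: NDOMOD x.y.z" banner line, if there is one.
ndomod_version() {
    sed -n 's/.*ndomod: NDOMOD \([0-9][0-9.]*\).*/\1/p' | tail -n 1
}
```

Usage would be "ndomod_version < /usr/local/nagios/var/nagios.log". A result lower than 2.1.3 means the fixed ndoutils has not been loaded yet.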

Re: Problem with "host escalation" config and monitoring eng

Posted: Mon Jan 07, 2019 11:46 am
by morabanc
Thanks for everything, Scott!

Re: Problem with "host escalation" config and monitoring eng

Posted: Mon Jan 07, 2019 3:10 pm
by scottwilkerson
morabanc wrote: Thanks for everything, Scott!
No problem. Locking thread.