Nagios XI host check orphaned and duplicate nagios process

This support forum board is for support questions relating to Nagios XI, our flagship commercial network monitoring solution.
lmiltchev
Bugs find me
Posts: 13589
Joined: Mon May 23, 2011 12:15 pm

Re: Nagios XI host check orphaned and duplicate nagios proce

Post by lmiltchev »

I'm not sure if it helped. I won't know until it happens again.
Let us know if it happens again. I will keep the post open.
Be sure to check out our Knowledgebase for helpful articles and solutions!
emartine
Posts: 660
Joined: Thu Dec 29, 2011 10:47 am

Re: Nagios XI host check orphaned and duplicate nagios proce

Post by emartine »

So this just happened again. Here is what I caught from the audit log.

2016-02-29 11:02:02	54876	Nagios XI	INFO		localhost	User submitted a command to the subsystem (ID=1117)
2016-02-29 11:02:01	54875	Nagios XI	INFO		localhost	User submitted a command to the subsystem (ID=1119)
2016-02-29 10:54:22	54874	Nagios CCM	SECURITY	nagiosxi	localhost	nagiosxi successfully logged into Nagios CCM
2016-02-29 10:54:21	54873	Nagios CCM	SECURITY	nagiosxi	localhost	nagiosxi successfully logged into Nagios CCM
2016-02-29 10:54:21	54872	Nagios XI	INFO		localhost	cmdsubsys: User applied a new configuration to Nagios Core
2016-02-29 10:54:20	54871	Nagios XI	INFO	user1	userworkstationip	User submitted a command to the subsystem (ID=17)
2016-02-29 10:54:20	54870	Nagios XI	INFO	user1	userworkstationip	User applied a new monitoring configuration
2016-02-29 10:54:16	54869	Nagios CCM	DELETE	user1	localhost	2 items deleted from database
2016-02-29 10:54:16	54868	Nagios CCM	MODIFY	user1	localhost	Host file deleted: server3.cfg
2016-02-29 10:54:16	54867	Nagios CCM	MODIFY	user1	localhost	Host file deleted: server1.cfg
2016-02-29 10:53:59	54866	Nagios CCM	DELETE	user1	localhost	28 items deleted from database
2016-02-29 10:53:58	54865	Nagios CCM	MODIFY	user1	localhost	Service file deleted: server3.cfg
2016-02-29 10:53:58	54864	Nagios CCM	MODIFY	user1	localhost	Service file deleted: server3.cfg
2016-02-29 10:53:58	54863	Nagios CCM	MODIFY	user1	localhost	Service file deleted: server3.cfg
2016-02-29 10:53:58	54862	Nagios CCM	MODIFY	user1	localhost	Service file deleted: server3.cfg
2016-02-29 10:53:57	54861	Nagios CCM	MODIFY	user1	localhost	Service file deleted: server3.cfg
2016-02-29 10:53:57	54860	Nagios CCM	MODIFY	user1	localhost	Service file deleted: server3.cfg
2016-02-29 10:53:57	54859	Nagios CCM	MODIFY	user1	localhost	Service file deleted: server3.cfg
2016-02-29 10:53:57	54858	Nagios CCM	MODIFY	user1	localhost	Service file deleted: server3.cfg
2016-02-29 10:53:57	54857	Nagios CCM	MODIFY	user1	localhost	Service file deleted: server3.cfg
2016-02-29 10:53:56	54856	Nagios CCM	MODIFY	user1	localhost	Service file deleted: server3.cfg
2016-02-29 10:53:56	54855	Nagios CCM	MODIFY	user1	localhost	Service file deleted: server3.cfg
2016-02-29 10:53:56	54854	Nagios CCM	MODIFY	user1	localhost	Service file deleted: server3.cfg
2016-02-29 10:53:56	54853	Nagios CCM	MODIFY	user1	localhost	Service file deleted: server3.cfg
2016-02-29 10:53:56	54852	Nagios CCM	MODIFY	user1	localhost	Service file deleted: server3.cfg
2016-02-29 10:53:55	54851	Nagios CCM	MODIFY	user1	localhost	Service file deleted: server3.cfg
2016-02-29 10:53:55	54850	Nagios CCM	MODIFY	user1	localhost	Service file deleted: server3.cfg
2016-02-29 10:53:55	54849	Nagios CCM	MODIFY	user1	localhost	Service file deleted: server3.cfg
2016-02-29 10:53:55	54848	Nagios CCM	MODIFY	user1	localhost	Service file deleted: server1.cfg
2016-02-29 10:53:55	54847	Nagios CCM	MODIFY	user1	localhost	Service file deleted: server1.cfg
2016-02-29 10:53:54	54846	Nagios CCM	MODIFY	user1	localhost	Service file deleted: server1.cfg
2016-02-29 10:53:54	54845	Nagios CCM	MODIFY	user1	localhost	Service file deleted: server1.cfg
2016-02-29 10:53:54	54844	Nagios CCM	MODIFY	user1	localhost	Service file deleted: server1.cfg
2016-02-29 10:53:54	54843	Nagios CCM	MODIFY	user1	localhost	Service file deleted: server1.cfg
2016-02-29 10:53:54	54842	Nagios CCM	MODIFY	user1	localhost	Service file deleted: server1.cfg
2016-02-29 10:53:53	54841	Nagios CCM	MODIFY	user1	localhost	Service file deleted: server1.cfg
2016-02-29 10:53:53	54840	Nagios CCM	MODIFY	user1	localhost	Service file deleted: server1.cfg
2016-02-29 10:53:53	54839	Nagios CCM	MODIFY	user1	localhost	Service file deleted: server1.cfg
2016-02-29 10:53:53	54838	Nagios CCM	MODIFY	user1	localhost	Service file deleted: server1.cfg
2016-02-29 10:52:52	54837	Nagios CCM	MODIFY	user1	localhost	Auto-login via Nagios XI successful

/var/log/messages

Feb 29 10:54:24 nagiosxi nagios: SERVICE ALERT: someserver2;TCP 443;CRITICAL;SOFT;1;connect to address <IP ADDRESS> and port 443: Connection refused
Feb 29 10:54:24 nagiosxi nagios: GLOBAL SERVICE EVENT HANDLER: someserver99;TCP 443;CRITICAL;SOFT;1;xi_service_event_handler
Feb 29 10:54:50 nagiosxi xinetd[3634]: START: nrpe pid=63690 from=<xitestworkerfromtestenvironment242>
Feb 29 10:54:50 nagiosxi xinetd[63690]: FAIL: nrpe address from=<xitestworkerfromtestenvironment242>
Feb 29 10:54:50 nagiosxi xinetd[3634]: EXIT: nrpe status=0 pid=63690 duration=0(sec)
Feb 29 10:54:59 nagiosxi nagios: Nagios 4.0.8 starting... (PID=63735)
Feb 29 10:54:59 nagiosxi nagios: Local time is Mon Feb 29 10:54:59 CST 2016
Feb 29 10:54:59 nagiosxi nagios: LOG VERSION: 2.0
Feb 29 10:54:59 nagiosxi nagios: qh: Socket '/usr/local/nagios/var/rw/nagios.qh' successfully initialized
Feb 29 10:54:59 nagiosxi nagios: qh: core query handler registered
Feb 29 10:54:59 nagiosxi nagios: nerd: Channel hostchecks registered successfully
Feb 29 10:54:59 nagiosxi nagios: nerd: Channel servicechecks registered successfully
Feb 29 10:54:59 nagiosxi nagios: nerd: Channel opathchecks registered successfully
Feb 29 10:54:59 nagiosxi nagios: nerd: Fully initialized and ready to rock!
Feb 29 10:54:59 nagiosxi nagios: wproc: Successfully registered manager as @wproc with query handler
Feb 29 10:54:59 nagiosxi nagios: wproc: Registry request: name=Core Worker 63738;pid=63738
Feb 29 10:54:59 nagiosxi nagios: wproc: Registry request: name=Core Worker 63739;pid=63739
Feb 29 10:54:59 nagiosxi nagios: wproc: Registry request: name=Core Worker 63737;pid=63737
Feb 29 10:54:59 nagiosxi nagios: wproc: Registry request: name=Core Worker 63742;pid=63742
Feb 29 10:54:59 nagiosxi nagios: wproc: Registry request: name=Core Worker 63743;pid=63743
Feb 29 10:54:59 nagiosxi nagios: wproc: Registry request: name=Core Worker 63783;pid=63783
Feb 29 10:54:59 nagiosxi nagios: wproc: Registry request: name=Core Worker 63744;pid=63744
Feb 29 10:54:59 nagiosxi nagios: wproc: Registry request: name=Core Worker 63745;pid=63745
Feb 29 10:54:59 nagiosxi nagios: wproc: Registry request: name=Core Worker 63746;pid=63746
Feb 29 10:54:59 nagiosxi nagios: wproc: Registry request: name=Core Worker 63748;pid=63748
Feb 29 10:54:59 nagiosxi nagios: wproc: Registry request: name=Core Worker 63747;pid=63747
Feb 29 10:54:59 nagiosxi nagios: wproc: Registry request: name=Core Worker 63751;pid=63751
Feb 29 10:54:59 nagiosxi nagios: wproc: Registry request: name=Core Worker 63749;pid=63749
Feb 29 10:54:59 nagiosxi nagios: wproc: Registry request: name=Core Worker 63752;pid=63752
Feb 29 10:54:59 nagiosxi nagios: wproc: Registry request: name=Core Worker 63753;pid=63753
Feb 29 10:54:59 nagiosxi nagios: wproc: Registry request: name=Core Worker 63786;pid=63786
Feb 29 10:54:59 nagiosxi nagios: wproc: Registry request: name=Core Worker 63756;pid=63756
Feb 29 10:54:59 nagiosxi nagios: wproc: Registry request: name=Core Worker 63754;pid=63754
Feb 29 10:54:59 nagiosxi nagios: wproc: Registry request: name=Core Worker 63755;pid=63755
Feb 29 10:54:59 nagiosxi nagios: wproc: Registry request: name=Core Worker 63757;pid=63757
Feb 29 10:54:59 nagiosxi nagios: wproc: Registry request: name=Core Worker 63760;pid=63760
Feb 29 10:54:59 nagiosxi nagios: wproc: Registry request: name=Core Worker 63758;pid=63758
Feb 29 10:54:59 nagiosxi nagios: wproc: Registry request: name=Core Worker 63759;pid=63759
Feb 29 10:54:59 nagiosxi nagios: wproc: Registry request: name=Core Worker 63763;pid=63763
Feb 29 10:54:59 nagiosxi nagios: wproc: Registry request: name=Core Worker 63761;pid=63761
Feb 29 10:54:59 nagiosxi nagios: wproc: Registry request: name=Core Worker 63765;pid=63765
Feb 29 10:54:59 nagiosxi nagios: wproc: Registry request: name=Core Worker 63762;pid=63762
Feb 29 10:54:59 nagiosxi nagios: wproc: Registry request: name=Core Worker 63766;pid=63766
Feb 29 10:54:59 nagiosxi nagios: wproc: Registry request: name=Core Worker 63764;pid=63764
Feb 29 10:54:59 nagiosxi nagios: wproc: Registry request: name=Core Worker 63768;pid=63768
Feb 29 10:54:59 nagiosxi nagios: wproc: Registry request: name=Core Worker 63771;pid=63771
Feb 29 10:54:59 nagiosxi nagios: wproc: Registry request: name=Core Worker 63769;pid=63769
Feb 29 10:54:59 nagiosxi nagios: wproc: Registry request: name=Core Worker 63767;pid=63767
Feb 29 10:54:59 nagiosxi nagios: wproc: Registry request: name=Core Worker 63772;pid=63772
Feb 29 10:54:59 nagiosxi nagios: wproc: Registry request: name=Core Worker 63776;pid=63776
Feb 29 10:54:59 nagiosxi nagios: wproc: Registry request: name=Core Worker 63777;pid=63777
Feb 29 10:54:59 nagiosxi nagios: wproc: Registry request: name=Core Worker 63775;pid=63775
Feb 29 10:54:59 nagiosxi nagios: wproc: Registry request: name=Core Worker 63774;pid=63774
Feb 29 10:54:59 nagiosxi nagios: wproc: Registry request: name=Core Worker 63779;pid=63779
Feb 29 10:54:59 nagiosxi nagios: wproc: Registry request: name=Core Worker 63778;pid=63778
Feb 29 10:54:59 nagiosxi nagios: wproc: Registry request: name=Core Worker 63782;pid=63782
Feb 29 10:54:59 nagiosxi nagios: wproc: Registry request: name=Core Worker 63741;pid=63741
Feb 29 10:54:59 nagiosxi nagios: wproc: Registry request: name=Core Worker 63780;pid=63780
Feb 29 10:54:59 nagiosxi nagios: wproc: Registry request: name=Core Worker 63781;pid=63781
Feb 29 10:54:59 nagiosxi nagios: wproc: Registry request: name=Core Worker 63785;pid=63785
Feb 29 10:54:59 nagiosxi nagios: wproc: Registry request: name=Core Worker 63787;pid=63787
Feb 29 10:54:59 nagiosxi nagios: wproc: Registry request: name=Core Worker 63784;pid=63784
Feb 29 10:54:59 nagiosxi nagios: wproc: Registry request: name=Core Worker 63750;pid=63750
Feb 29 10:54:59 nagiosxi nagios: mod_gearman: initialized version 1.5.0b1 (libgearman 1.1.8)
Feb 29 10:54:59 nagiosxi nagios: Event broker module '/usr/lib64/mod_gearman/mod_gearman.o' initialized successfully.
Feb 29 10:54:59 nagiosxi nagios: ndomod: NDOMOD 2.0.0 (02-28-2014) Copyright (c) 2009 Nagios Core Development Team and Community Contributors
Feb 29 10:54:59 nagiosxi nagios: ndomod: Successfully connected to data sink.  0 queued items to flush.
Feb 29 10:54:59 nagiosxi nagios: ndomod registered for process data
Feb 29 10:54:59 nagiosxi nagios: ndomod registered for log data'
Feb 29 10:54:59 nagiosxi nagios: ndomod registered for system command data'
Feb 29 10:54:59 nagiosxi nagios: ndomod registered for event handler data'
Feb 29 10:54:59 nagiosxi nagios: ndomod registered for notification data'
Feb 29 10:54:59 nagiosxi nagios: ndomod registered for comment data'
Feb 29 10:54:59 nagiosxi nagios: ndomod registered for downtime data'
Feb 29 10:54:59 nagiosxi nagios: ndomod registered for flapping data'
Feb 29 10:54:59 nagiosxi nagios: ndomod registered for program status data'
Feb 29 10:54:59 nagiosxi nagios: ndomod registered for host status data'
Feb 29 10:54:59 nagiosxi nagios: ndomod registered for service status data'
Feb 29 10:54:59 nagiosxi nagios: ndomod registered for adaptive program data'
Feb 29 10:54:59 nagiosxi nagios: ndomod registered for adaptive host data'
Feb 29 10:54:59 nagiosxi nagios: ndomod registered for adaptive service data'
Feb 29 10:54:59 nagiosxi nagios: ndomod registered for external command data'
Feb 29 10:54:59 nagiosxi nagios: ndomod registered for aggregated status data'
Feb 29 10:54:59 nagiosxi nagios: ndomod registered for retention data'
Feb 29 10:54:59 nagiosxi nagios: ndomod registered for contact data'
Feb 29 10:54:59 nagiosxi nagios: ndomod registered for contact notification data'
Feb 29 10:54:59 nagiosxi nagios: ndomod registered for acknowledgement data'
Feb 29 10:54:59 nagiosxi nagios: ndomod registered for state change data'
Feb 29 10:54:59 nagiosxi nagios: ndomod registered for contact status data'
Feb 29 10:54:59 nagiosxi nagios: ndomod registered for adaptive contact data'
Feb 29 10:54:59 nagiosxi nagios: Event broker module '/usr/local/nagios/bin/ndomod.o' initialized successfully.
Feb 29 10:55:03 nagiosxi nagios: Successfully launched command file worker with pid 63972
Feb 29 10:55:03 nagiosxi nagios: SERVICE DOWNTIME ALERT: <randomserver>;CPU Load;STARTED; Service has entered a period of scheduled downtime
Feb 29 10:55:03 nagiosxi nagios: SERVICE DOWNTIME ALERT: <randomserver>Independent Management Architecture Service;STARTED; Service has entered a period of scheduled downtime
Feb 29 10:55:03 nagiosxi nagios: SERVICE DOWNTIME ALERT: <randomserver>Print Manager Service;STARTED; Service has entered a period of scheduled downtime
Feb 29 10:55:03 nagiosxi nagios: SERVICE DOWNTIME ALERT: <randomserver>;Disk - C;STARTED; Service has entered a period of scheduled downtime
Feb 29 10:55:03 nagiosxi nagios: SERVICE DOWNTIME ALERT: <randomserver>; Service;STARTED; Service has entered a period of scheduled downtime
Feb 29 10:55:03 nagiosxi nagios: SERVICE DOWNTIME ALERT: <randomserver>; Service;STARTED; Service has entered a period of scheduled downtime
Feb 29 10:55:03 nagiosxi nagios: SERVICE DOWNTIME ALERT: <randomserver>;Manager Service;STARTED; Service has entered a period of scheduled downtime
Feb 29 10:55:03 nagiosxi nagios: SERVICE DOWNTIME ALERT: <randomserver>;Manager Service;STARTED; Service has entered a period of scheduled downtime
Feb 29 10:55:03 nagiosxi nagios: SERVICE DOWNTIME ALERT: <randomserver>;Manager Service;STARTED; Service has entered a period of scheduled downtime
Feb 29 10:55:03 nagiosxi nagios: SERVICE DOWNTIME ALERT: <randomserver>;Monitoring Agent;STARTED; Service has entered a period of scheduled downtime
Feb 29 10:55:03 nagiosxi nagios: SERVICE DOWNTIME ALERT: <randomserver>;Eventlog;STARTED; Service has entered a period of scheduled downtime
<twenty more services entered scheduled downtime>
Feb 29 10:55:11 nagiosxi nagios: HOST ALERT: <somelinuxserver>;UP;SOFT;2;OK - somelinuxserverip: rta 0.160ms, lost 0%
Feb 29 10:55:11 nagiosxi nagios: GLOBAL HOST EVENT HANDLER: ucmcbns01;UP;SOFT;2;xi_host_event_handler
Feb 29 10:55:52 nagiosxi nagios: SERVICE ALERT: <someserver>;ServiceP;CRITICAL;SOFT;1;ADT_I_P: Stopped
Feb 29 10:55:52 nagiosxi nagios: GLOBAL SERVICE EVENT HANDLER: someserver;ServiceP;CRITICAL;SOFT;1;xi_service_event_handler
Feb 29 10:55:53 nagiosxi xinetd[3634]: START: nrpe pid=64116 from=<TestxiserverIPADDRESS>
Feb 29 10:55:53 nagiosxi xinetd[64116]: FAIL: nrpe address from=<TestxiserverIPADDRESS>
Feb 29 10:55:53 nagiosxi xinetd[3634]: EXIT: nrpe status=0 pid=64116 duration=0(sec)
Feb 29 10:55:59 nagiosxi xinetd[3634]: START: nrpe pid=64128 from=<gearmanworker01>
Feb 29 10:55:59 nagiosxi xinetd[3634]: EXIT: nrpe status=0 pid=64128 duration=0(sec)
Feb 29 10:56:02 nagiosxi nagios: Warning: The results of service 'Eventlog' on host 'somehost' are stale by 0d 0h 38m 22s (threshold=0d 0h 5m 0s).  I'm forcing an immediate check of the service.
<A LOT OF FORCED CHECKS later.....>
Feb 29 10:56:02 nagiosxi rsyslogd-2177: imuxsock begins to drop messages from pid 63735 due to rate-limiting
Feb 29 10:56:09 nagiosxi xinetd[3634]: START: nrpe pid=64254 from=<TestxiserverIPADDRESS>
Feb 29 10:56:09 nagiosxi xinetd[64254]: FAIL: nrpe address from=<TestxiserverIPADDRESS>
Feb 29 10:56:09 nagiosxi xinetd[3634]: EXIT: nrpe status=0 pid=64254 duration=0(sec)
Feb 29 10:56:12 nagiosxi rsyslogd-2177: imuxsock lost 11012 messages from pid 63735 due to rate-limiting
Feb 29 10:56:19 nagiosxi xinetd[3634]: START: nrpe pid=64552 from=<nagiosxi dr IP>
Feb 29 10:56:20 nagiosxi xinetd[3634]: EXIT: nrpe status=0 pid=64552 duration=1(sec)
Feb 29 10:56:23 nagiosxi xinetd[3634]: START: nrpe pid=64603 from=<nagiosxi testserver IP>
Feb 29 10:56:23 nagiosxi xinetd[64603]: FAIL: nrpe address from=<nagiosxi testserver IP>
Feb 29 10:56:23 nagiosxi xinetd[3634]: EXIT: nrpe status=0 pid=64603 duration=0(sec)
Feb 29 10:56:24 nagiosxi xinetd[3634]: START: nrpe pid=64604 from=<nagiosxi dr IP>
Feb 29 10:56:24 nagiosxi xinetd[3634]: EXIT: nrpe status=0 pid=64604 duration=0(sec)
Feb 29 10:56:26 nagiosxi xinetd[3634]: START: nrpe pid=64613 from=<nagiosxi dr IP>
Feb 29 10:56:26 nagiosxi xinetd[3634]: EXIT: nrpe status=0 pid=64613 duration=0(sec)
Feb 29 10:56:31 nagiosxi xinetd[3634]: START: nrpe pid=64628 from=<nagiosxi dr IP>
Feb 29 10:56:31 nagiosxi xinetd[3634]: EXIT: nrpe status=0 pid=64628 duration=0(sec)
Feb 29 10:56:33 nagiosxi xinetd[3634]: START: nrpe pid=64636 from=<nagiosxi dr IP>
Feb 29 10:56:33 nagiosxi xinetd[3634]: EXIT: nrpe status=0 pid=64636 duration=0(sec)
Feb 29 10:56:36 nagiosxi xinetd[3634]: START: nrpe pid=64647 from=<gearmanworker2 IP>
Feb 29 10:56:36 nagiosxi xinetd[3634]: START: nrpe pid=64648 from=<gearmanworker1 IP>
Feb 29 10:56:36 nagiosxi xinetd[3634]: START: nrpe pid=64649 from=<gearmanworker3 IP>
Feb 29 10:56:36 nagiosxi xinetd[3634]: START: nrpe pid=64650 from=10.165.145.102
Feb 29 10:56:36 nagiosxi xinetd[3634]: EXIT: nrpe status=0 pid=64647 duration=0(sec)
Feb 29 10:56:36 nagiosxi xinetd[3634]: EXIT: nrpe status=0 pid=64648 duration=0(sec)
Feb 29 10:56:36 nagiosxi xinetd[3634]: EXIT: nrpe status=0 pid=64650 duration=0(sec)
Feb 29 10:56:36 nagiosxi xinetd[3634]: EXIT: nrpe status=0 pid=64649 duration=0(sec)
Feb 29 10:56:36 nagiosxi xinetd[3634]: START: nrpe pid=64666 from=<nagiosxi dr IP>
Feb 29 10:56:36 nagiosxi xinetd[3634]: EXIT: nrpe status=0 pid=64666 duration=0(sec)
Feb 29 10:56:40 nagiosxi xinetd[3634]: START: nrpe pid=64685 from=<nagiosxi dr IP>
Feb 29 10:56:40 nagiosxi xinetd[3634]: EXIT: nrpe status=0 pid=64685 duration=0(sec)
Feb 29 10:56:59 nagiosxi xinetd[3634]: START: nrpe pid=64751 from=<gearmanworker1 IP>
Feb 29 10:56:59 nagiosxi xinetd[3634]: EXIT: nrpe status=0 pid=64751 duration=0(sec)
Feb 29 10:57:09 nagiosxi xinetd[3634]: START: nrpe pid=64880 from=<nagiosxi testserver IP>
Feb 29 10:57:09 nagiosxi xinetd[64880]: FAIL: nrpe address from=<nagiosxi testserver IP>
Feb 29 10:57:09 nagiosxi xinetd[3634]: EXIT: nrpe status=0 pid=64880 duration=0(sec)
Feb 29 10:57:49 nagiosxi xinetd[3634]: START: nrpe pid=65039 from=<nagiosxi testserver IP>
Feb 29 10:57:49 nagiosxi xinetd[65039]: FAIL: nrpe address from=<nagiosxi testserver IP>
Feb 29 10:57:49 nagiosxi xinetd[3634]: EXIT: nrpe status=0 pid=65039 duration=0(sec)
Feb 29 10:58:45 nagiosxi xinetd[3634]: START: nrpe pid=65310 from=<gearmanworker2 IP>
Feb 29 10:58:45 nagiosxi xinetd[3634]: EXIT: nrpe status=0 pid=65310 duration=0(sec)
Feb 29 10:58:52 nagiosxi xinetd[3634]: START: nrpe pid=65324 from=<nagiosxi testserver IP>
Feb 29 10:58:52 nagiosxi xinetd[65324]: FAIL: nrpe address from=<nagiosxi testserver IP>
Feb 29 10:58:52 nagiosxi xinetd[3634]: EXIT: nrpe status=0 pid=65324 duration=0(sec)
Feb 29 10:59:01 nagiosxi auditd[3125]: Audit daemon rotating log files
Feb 29 10:59:06 nagiosxi xinetd[3634]: START: nrpe pid=65455 from=<nagiosxi testserver IP>
Feb 29 10:59:06 nagiosxi xinetd[65455]: FAIL: nrpe address from=<nagiosxi testserver IP>
Feb 29 10:59:06 nagiosxi xinetd[3634]: EXIT: nrpe status=0 pid=65455 duration=0(sec)
Feb 29 10:59:19 nagiosxi xinetd[3634]: START: nrpe pid=65470 from=<nagiosxi dr IP>
Feb 29 10:59:20 nagiosxi xinetd[3634]: EXIT: nrpe status=0 pid=65470 duration=1(sec)
Feb 29 10:59:22 nagiosxi xinetd[3634]: START: nrpe pid=65500 from=<nagiosxi testserver IP>
Feb 29 10:59:22 nagiosxi xinetd[65500]: FAIL: nrpe address from=<nagiosxi testserver IP>
Feb 29 10:59:22 nagiosxi xinetd[3634]: EXIT: nrpe status=0 pid=65500 duration=0(sec)
Feb 29 10:59:24 nagiosxi xinetd[3634]: START: nrpe pid=65526 from=<nagiosxi dr IP>
Feb 29 10:59:25 nagiosxi xinetd[3634]: EXIT: nrpe status=0 pid=65526 duration=1(sec)
Feb 29 10:59:26 nagiosxi xinetd[3634]: START: nrpe pid=65535 from=<nagiosxi dr IP>
Feb 29 10:59:26 nagiosxi xinetd[3634]: EXIT: nrpe status=0 pid=65535 duration=0(sec)
Feb 29 10:59:31 nagiosxi xinetd[3634]: START: nrpe pid=655 from=<nagiosxi dr IP>
Feb 29 10:59:31 nagiosxi xinetd[3634]: EXIT: nrpe status=0 pid=655 duration=0(sec)
Feb 29 10:59:33 nagiosxi xinetd[3634]: START: nrpe pid=662 from=<nagiosxi dr IP>
Feb 29 10:59:33 nagiosxi xinetd[3634]: EXIT: nrpe status=0 pid=662 duration=0(sec)
Feb 29 10:59:36 nagiosxi xinetd[3634]: START: nrpe pid=675 from=<nagiosxi dr IP>
Feb 29 10:59:36 nagiosxi xinetd[3634]: EXIT: nrpe status=0 pid=675 duration=0(sec)
Feb 29 10:59:40 nagiosxi xinetd[3634]: START: nrpe pid=693 from=<nagiosxi dr IP>
Feb 29 10:59:40 nagiosxi xinetd[3634]: EXIT: nrpe status=0 pid=693 duration=0(sec)
Feb 29 10:59:52 nagiosxi nagios: HOST DOWNTIME ALERT: <somehost>;STOPPED; Host has exited from a period of scheduled downtime
Feb 29 10:59:53 nagiosxi xinetd[3634]: START: nrpe pid=757 from=<gearmanworker1 IP>
Feb 29 10:59:53 nagiosxi xinetd[3634]: EXIT: nrpe status=0 pid=757 duration=0(sec)
Feb 29 10:59:53 nagiosxi xinetd[3634]: START: nrpe pid=761 from=<gearmanworker1 IP>
Feb 29 10:59:53 nagiosxi xinetd[3634]: EXIT: nrpe status=0 pid=761 duration=0(sec)
<A LOT OF FORCED CHECKS later.....>
Feb 29 11:00:06 nagiosxi xinetd[916]: FAIL: nrpe address from=<nagiosxi testserver IP>
Feb 29 11:00:06 nagiosxi xinetd[3634]: EXIT: nrpe status=0 pid=916 duration=0(sec)
Feb 29 11:00:29 nagiosxi xinetd[3634]: START: nrpe pid=1041 from=<gearmanworker2 IP>
Feb 29 11:00:29 nagiosxi xinetd[3634]: EXIT: nrpe status=0 pid=1041 duration=0(sec)
Feb 29 11:00:47 nagiosxi xinetd[3634]: START: nrpe pid=1111 from=<gearmantestworker1 IP>
Feb 29 11:00:47 nagiosxi xinetd[1111]: FAIL: nrpe address from=<gearmantestworker1 IP>
Feb 29 11:00:47 nagiosxi xinetd[3634]: EXIT: nrpe status=0 pid=1111 duration=0(sec)
Feb 29 11:01:39 nagiosxi xinetd[3634]: START: nrpe pid=1370 from=<gearmanworker2 IP>
Feb 29 11:01:39 nagiosxi xinetd[3634]: EXIT: nrpe status=0 pid=1370 duration=0(sec)
Feb 29 11:01:50 nagiosxi xinetd[3634]: START: nrpe pid=1439 from=<nagiosxi testserver IP>
Feb 29 11:01:50 nagiosxi xinetd[1439]: FAIL: nrpe address from=<nagiosxi testserver IP>
Feb 29 11:01:50 nagiosxi xinetd[3634]: EXIT: nrpe status=0 pid=1439 duration=0(sec)
<A LOT OF FORCED CHECKS later.....>
Feb 29 11:02:04 nagiosxi xinetd[3634]: START: nrpe pid=1582 from=<nagiosxi testserver IP>
Feb 29 11:02:04 nagiosxi xinetd[1582]: FAIL: nrpe address from=<nagiosxi testserver IP>
Feb 29 11:02:04 nagiosxi xinetd[3634]: EXIT: nrpe status=0 pid=1582 duration=0(sec)
Feb 29 11:02:19 nagiosxi xinetd[3634]: START: nrpe pid=1597 from=<gearmantestworker1 IP>
Feb 29 11:02:19 nagiosxi xinetd[1597]: FAIL: nrpe address from=<gearmantestworker1 IP>
Feb 29 11:02:19 nagiosxi xinetd[3634]: EXIT: nrpe status=0 pid=1597 duration=0(sec)
Feb 29 11:02:19 nagiosxi xinetd[3634]: START: nrpe pid=1603 from=<nagiosxi dr IP>
Feb 29 11:02:20 nagiosxi xinetd[3634]: EXIT: nrpe status=0 pid=1603 duration=1(sec)
Feb 29 11:02:24 nagiosxi xinetd[3634]: START: nrpe pid=1654 from=<nagiosxi dr IP>
Feb 29 11:02:24 nagiosxi xinetd[3634]: EXIT: nrpe status=0 pid=1654 duration=0(sec)
Feb 29 11:02:26 nagiosxi xinetd[3634]: START: nrpe pid=1661 from=<nagiosxi dr IP>
Feb 29 11:02:26 nagiosxi xinetd[3634]: EXIT: nrpe status=0 pid=1661 duration=0(sec)
Feb 29 11:02:31 nagiosxi xinetd[3634]: START: nrpe pid=1666 from=<nagiosxi dr IP>
Feb 29 11:02:31 nagiosxi xinetd[3634]: EXIT: nrpe status=0 pid=1666 duration=0(sec)
Feb 29 11:02:33 nagiosxi xinetd[3634]: START: nrpe pid=1678 from=<nagiosxi dr IP>
Feb 29 11:02:33 nagiosxi xinetd[3634]: EXIT: nrpe status=0 pid=1678 duration=0(sec)
Feb 29 11:02:36 nagiosxi xinetd[3634]: START: nrpe pid=1686 from=<nagiosxi dr IP>
Feb 29 11:02:36 nagiosxi xinetd[3634]: EXIT: nrpe status=0 pid=1686 duration=0(sec)
Feb 29 11:02:40 nagiosxi xinetd[3634]: START: nrpe pid=1701 from=<nagiosxi dr IP>
Feb 29 11:02:40 nagiosxi xinetd[3634]: EXIT: nrpe status=0 pid=1701 duration=0(sec)
Feb 29 11:03:00 nagiosxi xinetd[3634]: START: nrpe pid=1769 from=<gearmanworker1 IP>
Feb 29 11:03:00 nagiosxi xinetd[3634]: EXIT: nrpe status=0 pid=1769 duration=0(sec)
Feb 29 11:03:03 nagiosxi xinetd[3634]: START: nrpe pid=1901 from=<nagiosxi testserver IP>
Feb 29 11:03:03 nagiosxi xinetd[1901]: FAIL: nrpe address from=<nagiosxi testserver IP>
Feb 29 11:03:03 nagiosxi xinetd[3634]: EXIT: nrpe status=0 pid=1901 duration=0(sec)
Feb 29 11:03:46 nagiosxi xinetd[3634]: START: nrpe pid=2042 from=<nagiosxi testserver IP>
Feb 29 11:03:46 nagiosxi xinetd[2042]: FAIL: nrpe address from=<nagiosxi testserver IP>
Feb 29 11:03:46 nagiosxi xinetd[3634]: EXIT: nrpe status=0 pid=2042 duration=0(sec)
Feb 29 11:04:47 nagiosxi xinetd[3634]: START: nrpe pid=2329 from=<gearmantestworker1 IP>
Feb 29 11:04:47 nagiosxi xinetd[2329]: FAIL: nrpe address from=<gearmantestworker1 IP>
Feb 29 11:04:47 nagiosxi xinetd[3634]: EXIT: nrpe status=0 pid=2329 duration=0(sec)
Feb 29 11:05:00 nagiosxi xinetd[3634]: START: nrpe pid=2345 from=<nagiosxi testserver IP>
Feb 29 11:05:00 nagiosxi xinetd[2345]: FAIL: nrpe address from=<nagiosxi testserver IP>
Feb 29 11:05:00 nagiosxi xinetd[3634]: EXIT: nrpe status=0 pid=2345 duration=0(sec)
Feb 29 11:05:16 nagiosxi xinetd[3634]: START: nrpe pid=2735 from=<nagiosxi testserver IP>
Feb 29 11:05:16 nagiosxi xinetd[2735]: FAIL: nrpe address from=<nagiosxi testserver IP>
Feb 29 11:05:16 nagiosxi xinetd[3634]: EXIT: nrpe status=0 pid=2735 duration=0(sec)
Feb 29 11:05:19 nagiosxi xinetd[3634]: START: nrpe pid=2745 from=<nagiosxi dr IP>
Feb 29 11:05:20 nagiosxi xinetd[3634]: EXIT: nrpe status=0 pid=2745 duration=1(sec)
Feb 29 11:05:24 nagiosxi xinetd[3634]: START: nrpe pid=2800 from=<nagiosxi dr IP>
Feb 29 11:05:25 nagiosxi xinetd[3634]: EXIT: nrpe status=0 pid=2800 duration=1(sec)
Feb 29 11:05:26 nagiosxi xinetd[3634]: START: nrpe pid=2809 from=<nagiosxi dr IP>
Feb 29 11:05:26 nagiosxi xinetd[3634]: EXIT: nrpe status=0 pid=2809 duration=0(sec)
Feb 29 11:05:31 nagiosxi xinetd[3634]: START: nrpe pid=2813 from=<nagiosxi dr IP>
Feb 29 11:05:31 nagiosxi xinetd[3634]: EXIT: nrpe status=0 pid=2813 duration=0(sec)
Feb 29 11:05:33 nagiosxi xinetd[3634]: START: nrpe pid=2821 from=<nagiosxi dr IP>
Feb 29 11:05:33 nagiosxi xinetd[3634]: EXIT: nrpe status=0 pid=2821 duration=0(sec)
Feb 29 11:05:36 nagiosxi xinetd[3634]: START: nrpe pid=2837 from=<nagiosxi dr IP>
Feb 29 11:05:36 nagiosxi xinetd[3634]: EXIT: nrpe status=0 pid=2837 duration=0(sec)
Feb 29 11:05:40 nagiosxi xinetd[3634]: START: nrpe pid=2854 from=<nagiosxi dr IP>
Feb 29 11:05:40 nagiosxi xinetd[3634]: EXIT: nrpe status=0 pid=2854 duration=0(sec)
Feb 29 11:06:00 nagiosxi xinetd[3634]: START: nrpe pid=2921 from=<nagiosxi testserver IP>
Feb 29 11:06:00 nagiosxi xinetd[2921]: FAIL: nrpe address from=<nagiosxi testserver IP>
Feb 29 11:06:00 nagiosxi xinetd[3634]: EXIT: nrpe status=0 pid=2921 duration=0(sec)

<A few forced checks...>

Feb 29 11:06:01 nagiosxi nagios: Warning: The check of host 'someserver' looks like it was orphaned (results never came back).  I'm scheduling an immediate check of the host...
<A lot of orphaned checks like this>

Then a lot of orphaned checks like this...

Feb 29 11:06:02 nagiosxi nagios: HOST ALERT: <someserver>;DOWN;SOFT;1;(host check orphaned, is the mod-gearman worker on queue 'host' running?)
Last edited by tmcdonald on Mon Feb 29, 2016 5:17 pm, edited 1 time in total.
Reason: Please use [code][/code] tags around long output
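An excerpt this long is easier to read as a summary. Below is a minimal Python sketch, assuming the three message formats quoted above (the Nagios start-up line, the orphaned host check warning, and the rsyslog rate-limiting notice). The function name `summarize` is made up for illustration. Because rsyslog reported dropping messages from the nagios PID, any counts taken from /var/log/messages are probably a lower bound.

```python
import re
from collections import Counter

# Patterns modelled on the message formats quoted in the post above.
START_RE = re.compile(r"nagios: Nagios (\S+) starting\.\.\. \(PID=(\d+)\)")
ORPHAN_RE = re.compile(
    r"nagios: Warning: The check of host '([^']+)' looks like it was orphaned"
)
DROPPED_RE = re.compile(r"imuxsock lost (\d+) messages from pid (\d+)")


def summarize(lines):
    """Return (startups, orphans, dropped) for an iterable of syslog lines.

    startups: list of (version, pid) for each Nagios start-up line
    orphans:  Counter mapping host name to orphaned-check warnings
    dropped:  Counter mapping pid to messages rsyslog reported as lost
    """
    startups = []
    orphans = Counter()
    dropped = Counter()
    for line in lines:
        m = START_RE.search(line)
        if m:
            startups.append((m.group(1), int(m.group(2))))
            continue
        m = ORPHAN_RE.search(line)
        if m:
            orphans[m.group(1)] += 1
            continue
        m = DROPPED_RE.search(line)
        if m:
            dropped[int(m.group(2))] += int(m.group(1))
    return startups, orphans, dropped


if __name__ == "__main__":
    # Reads the standard syslog file named in the post; needs read permission.
    with open("/var/log/messages", errors="replace") as fh:
        startups, orphans, dropped = summarize(fh)
    print("Nagios start-ups:", startups)
    print("Hosts with orphaned checks:", orphans.most_common(10))
    print("Messages dropped by rsyslog, per pid:", dict(dropped))
```

More than one start-up line in a short window, each with a different PID, would be one hint that a second nagios process was launched while the first was still running.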
tgriep
Madmin
Posts: 9190
Joined: Thu Oct 30, 2014 9:02 am

Re: Nagios XI host check orphaned and duplicate nagios proce

Post by tgriep »

Do you see any errors in the Mod Gearman log files?
Can you check them for anything that could help with this?

We do have an update to Mod Gearman. It installs a newer version, which might help with this issue.
The link to the instructions is below. Take a look and see if that is an option for you.
https://assets.nagios.com/downloads/nag ... ios_XI.pdf
emartine
Posts: 660
Joined: Thu Dec 29, 2011 10:47 am

Re: Nagios XI host check orphaned and duplicate nagios proce

Post by emartine »

The gearman server was set to log errors only, and no errors occurred, so the log file was empty.

This is all I have from the workers during this time frame.

[2016-02-29 10:56:43][64736][DEBUG] child started with pid: 64736
[2016-02-29 10:56:43][64737][DEBUG] child started with pid: 64737
[2016-02-29 10:56:43][64738][DEBUG] child started with pid: 64738
[2016-02-29 10:57:01][64755][DEBUG] child started with pid: 64755
[2016-02-29 10:57:12][64736][DEBUG] got eventhandler job
[2016-02-29 10:57:12][64735][DEBUG] got eventhandler job
[2016-02-29 10:57:12][64737][DEBUG] got eventhandler job
[2016-02-29 10:57:14][64902][DEBUG] child started with pid: 64902
[2016-02-29 10:57:14][64903][DEBUG] child started with pid: 64903
[2016-02-29 10:57:32][64972][DEBUG] child started with pid: 64972
[2016-02-29 10:57:42][64903][DEBUG] got eventhandler job
[2016-02-29 10:57:42][64996][DEBUG] child started with pid: 64996
[2016-02-29 10:57:42][64997][DEBUG] child started with pid: 64997
[2016-02-29 10:57:42][64998][DEBUG] child started with pid: 64998
[2016-02-29 10:57:45][65034][DEBUG] child started with pid: 65034
[2016-02-29 10:58:03][65167][DEBUG] child started with pid: 65167
[2016-02-29 10:58:12][65169][DEBUG] child started with pid: 65169
[2016-02-29 10:58:13][65170][DEBUG] child started with pid: 65170
[2016-02-29 10:58:13][65171][DEBUG] child started with pid: 65171
[2016-02-29 10:58:13][65172][DEBUG] child started with pid: 65172
[2016-02-29 10:58:16][65181][DEBUG] child started with pid: 65181
[2016-02-29 10:58:34][65250][DEBUG] child started with pid: 65250
[2016-02-29 10:58:43][65301][DEBUG] child started with pid: 65301
[2016-02-29 10:58:44][65307][DEBUG] child started with pid: 65307
[2016-02-29 10:58:44][65308][DEBUG] child started with pid: 65308
[2016-02-29 10:58:44][65309][DEBUG] child started with pid: 65309
[2016-02-29 10:58:47][65317][DEBUG] child started with pid: 65317
[2016-02-29 10:59:05][65452][DEBUG] child started with pid: 65452
[2016-02-29 10:59:14][65457][DEBUG] child started with pid: 65457
[2016-02-29 10:59:15][65462][DEBUG] child started with pid: 65462
[2016-02-29 10:59:15][65463][DEBUG] child started with pid: 65463
[2016-02-29 10:59:15][65464][DEBUG] child started with pid: 65464
[2016-02-29 10:59:18][65469][DEBUG] child started with pid: 65469
[2016-02-29 10:59:36][674][DEBUG] child started with pid: 674
[2016-02-29 10:59:43][6867][INFO ] no checks in 2minutes, restarting all workers
[2016-02-29 10:59:47][748][DEBUG] child started with pid: 748
[2016-02-29 10:59:47][749][DEBUG] child started with pid: 749
[2016-02-29 10:59:47][750][DEBUG] child started with pid: 750
[2016-02-29 10:59:47][751][DEBUG] child started with pid: 751
[2016-02-29 10:59:47][752][DEBUG] child started with pid: 752
[2016-02-29 11:00:07][917][DEBUG] child started with pid: 917
[2016-02-29 11:00:18][929][DEBUG] child started with pid: 929
[2016-02-29 11:00:18][930][DEBUG] child started with pid: 930
[2016-02-29 11:00:18][931][DEBUG] child started with pid: 931
[2016-02-29 11:00:18][932][DEBUG] child started with pid: 932
[2016-02-29 11:00:18][933][DEBUG] child started with pid: 933
[2016-02-29 11:00:38][1054][DEBUG] child started with pid: 1054
[2016-02-29 11:00:49][1112][DEBUG] child started with pid: 1112
[2016-02-29 11:00:49][1113][DEBUG] child started with pid: 1113
[2016-02-29 11:00:49][1114][DEBUG] child started with pid: 1114
[2016-02-29 11:00:49][1115][DEBUG] child started with pid: 1115
[2016-02-29 11:00:49][1116][DEBUG] child started with pid: 1116
[2016-02-29 11:01:09][1284][DEBUG] child started with pid: 1284
[2016-02-29 11:01:20][1303][DEBUG] child started with pid: 1303
[2016-02-29 11:01:20][1304][DEBUG] child started with pid: 1304
[2016-02-29 11:01:20][1305][DEBUG] child started with pid: 1305
[2016-02-29 11:01:20][1306][DEBUG] child started with pid: 1306
[2016-02-29 11:01:20][1307][DEBUG] child started with pid: 1307
[2016-02-29 11:01:40][1381][DEBUG] child started with pid: 1381
[2016-02-29 11:01:44][6867][INFO ] no checks in 2minutes, restarting all workers
[2016-02-29 11:01:48][1434][DEBUG] child started with pid: 1434
[2016-02-29 11:01:48][1435][DEBUG] child started with pid: 1435
[2016-02-29 11:01:48][1436][DEBUG] child started with pid: 1436
... same stuff over and over again.
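The worker log above can be condensed the same way. This is a rough Python sketch, assuming every line follows the `[timestamp][pid][LEVEL] message` layout shown in the excerpt; the helper names `parse_line` and `worker_summary` are invented for this example. It counts child start-ups and received jobs and records when the worker logged "restarting all workers".

```python
import re
from datetime import datetime

# Layout taken from the worker log excerpt: [timestamp][pid][LEVEL] message
LINE_RE = re.compile(
    r"^\[(\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2})\]\[(\d+)\]\[(\w+)\s*\] (.*)$"
)


def parse_line(line):
    """Return (timestamp, pid, level, message), or None if the line does not match."""
    m = LINE_RE.match(line.strip())
    if m is None:
        return None
    ts = datetime.strptime(m.group(1), "%Y-%m-%d %H:%M:%S")
    return ts, int(m.group(2)), m.group(3), m.group(4)


def worker_summary(lines):
    """Count child start-ups and jobs, and collect the times of worker restarts."""
    children = 0
    jobs = 0
    restarts = []
    for line in lines:
        rec = parse_line(line)
        if rec is None:
            continue  # skip free-text lines such as "... same stuff ..."
        ts, _pid, _level, msg = rec
        if msg.startswith("child started with pid"):
            children += 1
        elif msg.startswith("got ") and msg.endswith(" job"):
            jobs += 1
        elif "restarting all workers" in msg:
            restarts.append(ts)
    return {"children_started": children, "jobs_received": jobs, "restarts": restarts}


if __name__ == "__main__":
    sample = [
        "[2016-02-29 10:57:12][64736][DEBUG] got eventhandler job",
        "[2016-02-29 10:59:36][674][DEBUG] child started with pid: 674",
        "[2016-02-29 10:59:43][6867][INFO ] no checks in 2minutes, restarting all workers",
        "[2016-02-29 11:01:44][6867][INFO ] no checks in 2minutes, restarting all workers",
    ]
    summary = worker_summary(sample)
    print(summary["children_started"], summary["jobs_received"])
    for earlier, later in zip(summary["restarts"], summary["restarts"][1:]):
        print("seconds between restarts:", (later - earlier).total_seconds())
```

In the excerpt the only jobs logged are eventhandler jobs, which seems consistent with the worker's own "no checks in 2minutes" message: the workers were running but apparently not receiving host or service checks.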
tgriep
Madmin
Posts: 9190
Joined: Thu Oct 30, 2014 9:02 am

Re: Nagios XI host check orphaned and duplicate nagios proce

Post by tgriep »

Was there anything in the log files around this time?

Feb 29 11:06:02 nagiosxi nagios: HOST ALERT: <someserver>;DOWN;SOFT;1;(host check orphaned, is the mod-gearman worker on queue 'host' running?)
emartine
Posts: 660
Joined: Thu Dec 29, 2011 10:47 am

Re: Nagios XI host check orphaned and duplicate nagios proce

Post by emartine »

The same messages repeated until I manually killed the nagios PID, stopped the gearman worker, stopped the gearman server, restarted httpd, and then started nagios, gearmand, and the gearman worker. After that everything was OK.
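The recovery sequence described in this post can be written down as a dry-run plan. The Python sketch below only builds and prints the commands; it does not run them. The service names `mod_gearman_worker` and `gearmand` and the use of the `service` command are assumptions, not taken from the thread, and the PID is a placeholder. The start order simply follows the order in which the services are listed in the post.

```python
def recovery_plan(nagios_pid, worker="mod_gearman_worker", server="gearmand"):
    """Return the manual recovery steps as a list of argv lists (nothing is executed).

    The service names are assumptions; adjust them to match the init scripts
    actually installed on the system.
    """
    return [
        ["kill", str(nagios_pid)],        # kill the stuck nagios process
        ["service", worker, "stop"],      # stop the gearman worker
        ["service", server, "stop"],      # stop the gearman server
        ["service", "httpd", "restart"],  # restart httpd
        ["service", "nagios", "start"],   # start nagios
        ["service", server, "start"],     # start gearmand
        ["service", worker, "start"],     # start the gearman worker
    ]


if __name__ == "__main__":
    # 12345 is a placeholder PID, not a value from the logs.
    for argv in recovery_plan(12345):
        print(" ".join(argv))
```

Printing the plan first makes it easy to review the order before running anything by hand.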
tgriep
Madmin
Posts: 9190
Joined: Thu Oct 30, 2014 9:02 am

Re: Nagios XI host check orphaned and duplicate nagios proce

Post by tgriep »

Have you done the update to Mod Gearman?
emartine
Posts: 660
Joined: Thu Dec 29, 2011 10:47 am

Re: Nagios XI host check orphaned and duplicate nagios proce

Post by emartine »

Which version and where can I get it? ...More like where should I get it from?
lmiltchev
Bugs find me
Posts: 13589
Joined: Mon May 23, 2011 12:15 pm

Re: Nagios XI host check orphaned and duplicate nagios proce

Post by lmiltchev »

Use our ModGearmanInstall.sh script. Please follow our documentation:

https://assets.nagios.com/downloads/nag ... ios_XI.pdf

Let us know if you run into any issues.
emartine
Posts: 660
Joined: Thu Dec 29, 2011 10:47 am

Re: Nagios XI host check orphaned and duplicate nagios proce

Post by emartine »

Thanks. I was able to upgrade the server and worker on the main XI test server, and I was in the process of upgrading further workers when I ran into a problem with one.

I installed per the instructions, but I get the following error when trying to start the worker.

Starting mod_gearman2_worker...stat: No such file or directory
failed

cd /etc/init.d/
-rwxr-xr-x. 1 root root 2760 Mar 30 10:22 mod-gearman2-worker*


It seems the script doesn't stop the previous version of the gearman worker, which causes an error during the installation. (I didn't have the gearman worker running on the main server, which is why the upgrade was successful there.) I had to stop the gearman worker and redo the upgrade.
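One way to catch this before rerunning the installer is to check whether an old worker is still running. The Python sketch below is only an illustration: the process name `mod_gearman_worker` for the previous version is an assumption, the sample `ps` text is hypothetical, and `find_processes` and `list_processes` are made-up helper names. It matches against the full command line (`ps -eo pid,args`), because the short command-name column can be truncated.

```python
import subprocess


def find_processes(ps_output, needle):
    """Return PIDs from `ps -eo pid,args` output whose command line contains needle."""
    pids = []
    for line in ps_output.splitlines():
        parts = line.strip().split(None, 1)
        # The header row is skipped because its first field is not a number.
        if len(parts) == 2 and parts[0].isdigit() and needle in parts[1]:
            pids.append(int(parts[0]))
    return pids


def list_processes():
    """Return the live process table as text (PID and full command line)."""
    result = subprocess.run(
        ["ps", "-eo", "pid,args"], capture_output=True, text=True, check=True
    )
    return result.stdout


if __name__ == "__main__":
    # Hypothetical sample; on a real system pass list_processes() instead.
    sample = """  PID COMMAND
 6867 /usr/bin/mod_gearman_worker
 7001 /usr/bin/mod_gearman2_worker
 7100 sshd
"""
    old_workers = find_processes(sample, "mod_gearman_worker")
    if old_workers:
        print("Old worker still running, stop it before upgrading:", old_workers)
    else:
        print("No old worker found.")
```

The plain substring test does not match `mod_gearman2_worker`, so the upgraded worker is not mistaken for the old one.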