Unable to start nagios - no errors
Re: Unable to start nagios - no errors
yum list installed | grep gearman
gearmand.x86_64 1:0.33-2 @/gearmand-0.33-2.rhel6.x86_64
gearmand-devel.x86_64 1:0.33-2 @/gearmand-devel-0.33-2.rhel6.x86_64
gearmand-server.x86_64 1:0.33-2 @/gearmand-server-0.33-2.rhel6.x86_64
mod_gearman2.x86_64 2.1.1-1.el6 @/mod_gearman2-2.1.1-1.rhel6.x86_64
The worker runs on the server and on 2 other hosts. I have disabled the 2 external hosts for the time being.
What are we looking for at level 3?
-
bheden
- Product Development Manager
- Posts: 179
- Joined: Thu Feb 13, 2014 9:50 am
- Location: Nagios Enterprises
Re: Unable to start nagios - no errors
I'd just like to see more information regarding this line in your original worker log:
[ERROR] worker error: flush(Broken pipe) lost connection to server during send -> libgearman/connection.cc:761
As of May 25th, 2018, all communications with Nagios Enterprises and its employees are covered under our new Privacy Policy.
Nagios Enterprises
Senior Developer
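To check whether that error comes back after a restart, the worker log can be filtered for [ERROR] lines. The following is a minimal sketch, not an official tool: the function name show_worker_errors is hypothetical, and the log path must be supplied by the caller because the thread does not state where the worker log lives.

```shell
# Print every [ERROR] line from a Mod-Gearman worker log, with line numbers.
# The log path is passed in by the caller; no default location is assumed.
show_worker_errors() {
    log_file="$1"
    if [ ! -r "$log_file" ]; then
        echo "cannot read log file: $log_file" >&2
        return 1
    fi
    # -n prefixes each match with its line number; -F treats the pattern literally.
    grep -nF '[ERROR]' "$log_file"
}
```

Call it with the path of your own worker log as the only argument. An exit status of 1 with no output means no [ERROR] lines were found.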
Re: Unable to start nagios - no errors
I stopped and started the worker. I didn't see that error.
[2016-04-08 14:00:10][8647][TRACE] 428 +++>
HqYZpEe5+0bHzBLfuGtrhnos3YHRMgPZHEFYN33EISP0kGyVwojWlEikHuTPND9UgRD+/fcb2/k/D70uvw890EZdtlPzQsOtt6Z62Dc0vln6fb/QnZ20p1mwTofhdiLLjuxezMwGklrS+Q67TSWg9mILJ+SIK5n9Y5uM1FPrD/gDc1fIEOeAmz4P+XK3Pr6BQQtP6Z9RmieOt4xd4kYhjzs=
<+++
[2016-04-08 14:00:10][8647][TRACE] add_job_to_queue() finished successfully: 0 0
[2016-04-08 14:00:10][8647][TRACE] send_result_back() finished successfully
[2016-04-08 14:00:10][8647][TRACE] send_result_back() has no duplicate servers to send to.
[2016-04-08 14:00:10][8647][TRACE] set_state(1)
[2016-04-08 14:00:10][8641][TRACE] idle_sighandler(14)
[2016-04-08 14:00:10][8641][TRACE] clean_worker_exit(0)
[2016-04-08 14:00:10][8641][TRACE] cleaning worker
[2016-04-08 14:00:10][8641][TRACE] cleaning client
[2016-04-08 14:00:11][8640][TRACE] waitpid() worker exited with: 0
[2016-04-08 14:00:11][8640][TRACE] make_new_child(2)
[2016-04-08 14:00:11][8640][TRACE] forking status worker
[2016-04-08 14:00:11][8892][DEBUG] child started with pid: 8892
[2016-04-08 14:00:11][8892][TRACE] status worker client started
[2016-04-08 14:00:11][8892][TRACE] set_worker()
[2016-04-08 14:00:11][8892][TRACE] create_client()
[2016-04-08 14:00:11][8645][TRACE] set_state(0)
[2016-04-08 14:00:11][8645][TRACE] get_job()
[2016-04-08 14:00:11][8645][TRACE] got new job H:<NAGIOSXIHOSTNAME>:36092
[2016-04-08 14:00:11][8645][TRACE] 384 +++>
wuyIXMz+16Yrsk+RkDoGE1fxNhEeIx8oHzgKi1lpVip+V0wkVINNHVNzDLRIQRIJb1kssVve1EZo9eZifaVJ2zU7WU7rydYDb+LIKjDoU1CtwfGa0a4yDuWNWtjzhG+WiE7/GGKdnoLnL5wbHwnd6xn6HTOvABcLTiPoQlepFZH+ipmw3c0SFzMz2yetFBfuP7uBALTxHSCzK2Q8V5De7zi9Kr32EE9aMooEcnfC8s+qpOBN7us1wT0iDtxV1n7bMX75mT182FuKPaWWrcKqr7N6miqzRxrqLYISlb0xt9qL0
<+++
[2016-04-08 14:00:11][8645][TRACE] 287 --->
type=service
result_queue=check_results
host_name=SERVER1
service_description=Disk - C
start_time=1460142011.0
next_check=1460142011.0
core_time=1460142011.965547
timeout=60
command_line=/usr/local/nagios/libexec/check_nt -H SERVER1 -p 1248 -v USEDDISKSPACE -l C -w 85 -c 95
Re: Unable to start nagios - no errors
Try setting result_workers=1 in your mod_gearman_neb.conf. Another customer with a similar issue made this change and it got things working.
Re: Unable to start nagios - no errors
I can't seem to find mod_gearman_neb.conf anywhere.
- Box293
- Too Basu
- Posts: 5126
- Joined: Sun Feb 07, 2010 10:55 pm
- Location: Deniliquin, Australia
Re: Unable to start nagios - no errors
It should be on your XI server at /etc/mod_gearman/mod_gearman_neb.conf, or otherwise at /etc/mod_gearman2/module.conf.
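A quick way to see which of those two files exists and what result_workers is set to is sketched below. The helper name find_result_workers is hypothetical; the two default paths are the ones named in this thread, and other paths can be passed as arguments.

```shell
# Report the result_workers setting from whichever NEB config file exists.
# Default candidates are the two paths mentioned in this thread.
find_result_workers() {
    if [ "$#" -eq 0 ]; then
        set -- /etc/mod_gearman/mod_gearman_neb.conf /etc/mod_gearman2/module.conf
    fi
    found=1
    for conf in "$@"; do
        [ -r "$conf" ] || continue
        # -H prints the file name; anchoring on the key skips commented-out lines.
        if grep -H '^[[:space:]]*result_workers[[:space:]]*=' "$conf"; then
            found=0
        fi
    done
    return "$found"
}
```

Running find_result_workers with no arguments prints a line such as path:result_workers=N for each file that has the setting, and returns non-zero if neither file contains it.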
Re: Unable to start nagios - no errors
Found /etc/mod_gearman2/module.conf, thanks.
result_workers=1 was already set.
-
bheden
Re: Unable to start nagios - no errors
You can turn down the debug verbosity now (reset it to 0 or 1).
So after all this stop/start of all these services, are you still having the same issues? A lot of times, just getting them stopped and started in the right order, as funny as it sounds, can resolve a lot of issues.
If not, can I see the output of
ls -alh /usr/local/nagios/var/rw
and
cat /usr/local/nagiosxi/var/cmdsubsys.log
Re: Unable to start nagios - no errors
Not sure what I did, but it has now started working. Rebooting the server... starting the engine from the web interface vs. the command line? In any case I will be restarting the server again shortly. What priority should the services have so that they come up properly at startup?
ls -alh /usr/local/nagios/var/rw
prw-rw---- 1 nagios nagios 0 Apr 11 10:46 nagios.cmd
srw-rw---- 1 nagios nagios 0 Apr 11 10:46 nagios.qh
cat /usr/local/nagiosxi/var/cmdsubsys.log
.................
Priority settings right now...
# chkconfig: 2345 85 15
# description: Mod-Gearman2 Worker Daemon
# gearmand Startup script for the Gearman server
# chkconfig: - 85 15
# chkconfig: 345 99 01
# description: Nagios network monitor
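The priority lines quoted above come from the "# chkconfig:" header of each init script, which lists the runlevels, the start priority and the stop priority. A small sketch for pulling those values out side by side follows. The function name show_start_priorities is hypothetical, and it assumes the header is written as "# chkconfig: <runlevels> <start> <stop>" with a space after the "#".

```shell
# Print runlevels and start/stop priorities from the chkconfig header
# of each init script given as an argument.
show_start_priorities() {
    for script in "$@"; do
        [ -r "$script" ] || continue
        # Header fields: "#" "chkconfig:" <runlevels> <start> <stop>
        awk -v name="$script" '
            /^#[ \t]+chkconfig:/ {
                print name ": runlevels=" $3 " start=" $4 " stop=" $5
                exit
            }' "$script"
    done
}
```

Pass it the init scripts for gearmand, nagios and the worker. Their exact file names under /etc/init.d are not given in this thread, so check them on your own system first.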
-
bheden
Re: Unable to start nagios - no errors
Those look fine. The order goes gearmand -> nagios -> worker.
Glad it's all working!
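For anyone landing here with the same problem, the gearmand -> nagios -> worker order can be scripted roughly as below. This is a sketch under assumptions, not a supported procedure: the service names gearmand, nagios and mod-gearman2-worker are guesses that may differ on your install (override them via the three arguments), and stopping in reverse order first is my own choice rather than something stated in the thread. Setting DRY_RUN=1 prints the commands instead of running them.

```shell
# Run a command, or just print it when DRY_RUN=1.
run_or_print() {
    if [ "${DRY_RUN:-0}" = "1" ]; then
        echo "$*"
    else
        "$@"
    fi
}

# Restart the stack in the order gearmand -> nagios -> worker.
# Arguments (all optional, assumed defaults): gearmand, nagios, worker service names.
restart_in_order() {
    gearmand_svc="${1:-gearmand}"
    nagios_svc="${2:-nagios}"
    worker_svc="${3:-mod-gearman2-worker}"

    # Stop in reverse order (an assumption); ignore failures from already-stopped services.
    for svc in "$worker_svc" "$nagios_svc" "$gearmand_svc"; do
        run_or_print service "$svc" stop || true
    done
    # Start in the order given in the thread; abort if any start fails.
    for svc in "$gearmand_svc" "$nagios_svc" "$worker_svc"; do
        run_or_print service "$svc" start || return 1
    done
}
```

Try it with DRY_RUN=1 first to confirm the service names match what is installed, then run it as root without DRY_RUN.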