Unable to start nagios - no errors

This support forum board is for support questions relating to Nagios XI, our flagship commercial network monitoring solution.
emartine
Posts: 660
Joined: Thu Dec 29, 2011 10:47 am

Re: Unable to start nagios - no errors

Post by emartine »

yum list installed | grep gearman
gearmand.x86_64 1:0.33-2 @/gearmand-0.33-2.rhel6.x86_64
gearmand-devel.x86_64 1:0.33-2 @/gearmand-devel-0.33-2.rhel6.x86_64
gearmand-server.x86_64 1:0.33-2 @/gearmand-server-0.33-2.rhel6.x86_64
mod_gearman2.x86_64 2.1.1-1.el6 @/mod_gearman2-2.1.1-1.rhel6.x86_64



The worker runs on the XI server and on 2 other hosts. I have disabled the 2 external hosts for the time being.

What are we looking for at level 3?
bheden
Product Development Manager
Posts: 179
Joined: Thu Feb 13, 2014 9:50 am
Location: Nagios Enterprises

Re: Unable to start nagios - no errors

Post by bheden »

I'd just like to see more information regarding this line in your original worker log:

[ERROR] worker error: flush(Broken pipe) lost connection to server during send -> libgearman/connection.cc:761
As of May 25th, 2018, all communications with Nagios Enterprises and its employees are covered under our new Privacy Policy.

Nagios Enterprises
Senior Developer
emartine
Posts: 660
Joined: Thu Dec 29, 2011 10:47 am

Re: Unable to start nagios - no errors

Post by emartine »

I stopped and started the worker. I didn't see that error.

[2016-04-08 14:00:10][8647][TRACE] 428 +++>
HqYZpEe5+0bHzBLfuGtrhnos3YHRMgPZHEFYN33EISP0kGyVwojWlEikHuTPND9UgRD+/fcb2/k/D70uvw890EZdtlPzQsOtt6Z62Dc0vln6fb/QnZ20p1mwTofhdiLLjuxezMwGklrS+Q67TSWg9mILJ+SIK5n9Y5uM1FPrD/gDc1fIEOeAmz4P+XK3Pr6BQQtP6Z9RmieOt4xd4kYhjzs=
<+++
[2016-04-08 14:00:10][8647][TRACE] add_job_to_queue() finished successfully: 0 0
[2016-04-08 14:00:10][8647][TRACE] send_result_back() finished successfully
[2016-04-08 14:00:10][8647][TRACE] send_result_back() has no duplicate servers to send to.
[2016-04-08 14:00:10][8647][TRACE] set_state(1)
[2016-04-08 14:00:10][8641][TRACE] idle_sighandler(14)
[2016-04-08 14:00:10][8641][TRACE] clean_worker_exit(0)
[2016-04-08 14:00:10][8641][TRACE] cleaning worker
[2016-04-08 14:00:10][8641][TRACE] cleaning client
[2016-04-08 14:00:11][8640][TRACE] waitpid() worker exited with: 0
[2016-04-08 14:00:11][8640][TRACE] make_new_child(2)
[2016-04-08 14:00:11][8640][TRACE] forking status worker
[2016-04-08 14:00:11][8892][DEBUG] child started with pid: 8892
[2016-04-08 14:00:11][8892][TRACE] status worker client started
[2016-04-08 14:00:11][8892][TRACE] set_worker()
[2016-04-08 14:00:11][8892][TRACE] create_client()
[2016-04-08 14:00:11][8645][TRACE] set_state(0)
[2016-04-08 14:00:11][8645][TRACE] get_job()
[2016-04-08 14:00:11][8645][TRACE] got new job H:<NAGIOSXIHOSTNAME>:36092
[2016-04-08 14:00:11][8645][TRACE] 384 +++>
wuyIXMz+16Yrsk+RkDoGE1fxNhEeIx8oHzgKi1lpVip+V0wkVINNHVNzDLRIQRIJb1kssVve1EZo9eZifaVJ2zU7WU7rydYDb+LIKjDoU1CtwfGa0a4yDuWNWtjzhG+WiE7/GGKdnoLnL5wbHwnd6xn6HTOvABcLTiPoQlepFZH+ipmw3c0SFzMz2yetFBfuP7uBALTxHSCzK2Q8V5De7zi9Kr32EE9aMooEcnfC8s+qpOBN7us1wT0iDtxV1n7bMX75mT182FuKPaWWrcKqr7N6miqzRxrqLYISlb0xt9qL0
<+++
[2016-04-08 14:00:11][8645][TRACE] 287 --->
type=service
result_queue=check_results
host_name=SERVER1
service_description=Disk - C
start_time=1460142011.0
next_check=1460142011.0
core_time=1460142011.965547
timeout=60
command_line=/usr/local/nagios/libexec/check_nt -H SERVER1 -p 1248 -v USEDDISKSPACE -l C -w 85 -c 95
ssax
Dreams In Code
Posts: 7682
Joined: Wed Feb 11, 2015 12:54 pm

Re: Unable to start nagios - no errors

Post by ssax »

Try setting result_workers=1 in your mod_gearman_neb.conf. Another customer made this change and it resolved a similar issue for them.
emartine
Posts: 660
Joined: Thu Dec 29, 2011 10:47 am

Re: Unable to start nagios - no errors

Post by emartine »

I can't seem to find mod_gearman_neb.conf anywhere.
Box293
Too Basu
Posts: 5126
Joined: Sun Feb 07, 2010 10:55 pm
Location: Deniliquin, Australia

Re: Unable to start nagios - no errors

Post by Box293 »

It should be on your XI server at /etc/mod_gearman/mod_gearman_neb.conf; otherwise it will be /etc/mod_gearman2/module.conf.
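
A quick way to check both locations and show the current setting might look like the sketch below. The two paths are the ones named in this post; the helper name find_neb_conf is made up for illustration, and the grep pattern assumes the option is written as a plain result_workers=... line.

```shell
# Print the first path that exists as a regular file; fail if none does.
find_neb_conf() {
    for f in "$@"; do
        if [ -f "$f" ]; then
            printf '%s\n' "$f"
            return 0
        fi
    done
    return 1
}

# Candidate paths come from the post above; which one exists depends on
# the mod_gearman package installed on the XI server.
if conf=$(find_neb_conf /etc/mod_gearman/mod_gearman_neb.conf /etc/mod_gearman2/module.conf); then
    # Show the line number and current value, or say that it is not set.
    grep -n '^[[:space:]]*result_workers' "$conf" || echo "result_workers not set in $conf"
else
    echo "no mod_gearman NEB config found at the expected paths"
fi
```

If the file turns up somewhere else on your system, pass that path to find_neb_conf instead.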
emartine
Posts: 660
Joined: Thu Dec 29, 2011 10:47 am

Re: Unable to start nagios - no errors

Post by emartine »

Found /etc/mod_gearman2/module.conf, thanks.

result_workers=1 was already set.
bheden
Product Development Manager
Posts: 179
Joined: Thu Feb 13, 2014 9:50 am
Location: Nagios Enterprises

Re: Unable to start nagios - no errors

Post by bheden »

You can turn down the debug verbosity now (reset it to 0 or 1).

So after stopping and starting all these services, are you still having the same issues? Often, just getting them stopped and started in the right order, as funny as it sounds, resolves the problem.

If so, can I see the output of

ls -alh /usr/local/nagios/var/rw
and

cat /usr/local/nagiosxi/var/cmdsubsys.log
emartine
Posts: 660
Joined: Thu Dec 29, 2011 10:47 am

Re: Unable to start nagios - no errors

Post by emartine »

I'm not sure what I did, but it has now started working. Perhaps it was rebooting the server, or starting the engine from the web interface rather than the command line? In any case, I will be restarting the server again shortly. What priority should the services have so that they come up properly at startup?



ls -alh /usr/local/nagios/var/rw
prw-rw---- 1 nagios nagios 0 Apr 11 10:46 nagios.cmd
srw-rw---- 1 nagios nagios 0 Apr 11 10:46 nagios.qh


cat /usr/local/nagiosxi/var/cmdsubsys.log
.................


Priority settings right now...



# chkconfig: 2345 85 15
# description: Mod-Gearman2 Worker Daemon


# gearmand Startup script for the Gearman server
# chkconfig: - 85 15


# chkconfig: 345 99 01
# description: Nagios network monitor
bheden
Product Development Manager
Posts: 179
Joined: Thu Feb 13, 2014 9:50 am
Location: Nagios Enterprises

Re: Unable to start nagios - no errors

Post by bheden »

Those look fine. The order goes gearmand -> nagios -> worker.
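
For a manual restart in that order, something like the following sketch could be used. The service names are assumptions (check /etc/init.d on your system for the real ones), stopping in reverse order is also an assumption rather than something stated in this thread, and restart_in_order is a made-up helper name.

```shell
# Start order from the post above: gearmand -> nagios -> worker.
# ASSUMPTION: these init script names may differ on your system.
START_ORDER="gearmand nagios mod-gearman2-worker"

restart_in_order() {
    # $1 is the command used to control services ("service" on EL6).
    # Pass "echo" for a dry run that only prints the sequence.
    ctl=$1

    # Build the reverse of the start order for the stop phase.
    reversed=""
    for s in $START_ORDER; do
        reversed="$s $reversed"
    done

    for s in $reversed; do
        "$ctl" "$s" stop
    done
    for s in $START_ORDER; do
        "$ctl" "$s" start
    done
}

# Dry run: prints the stop/start sequence without touching any service.
restart_in_order echo
```

Once the printed sequence looks right, run restart_in_order service as root to apply it.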

Glad it's all working!
Locked