Problem mod_gearman distributed nodes
Posted: Thu Mar 21, 2013 7:42 am
by quental
2 machines with CentOS 6, x86_64.
Manual installation.
Hi,
I have a problem with Nagios and mod_gearman. I installed it on two nodes, but when I check the status of the workers, I only see one node:
> gearman_top
Code:
2013-03-21 13:33:56 - localhost:4730 - v0.25
Queue Name | Worker Available | Jobs Waiting | Jobs Running
--------------------------------------------------------------------------------
check_results | 1 | 0 | 0
eventhandler | 11 | 0 | 0
host | 11 | 0 | 0
service | 11 | 0 | 6
worker_nagiossp01.sanit.dom | 1 | 0 | 0
--------------------------------------------------------------------------------
I do not see the second worker, which is on another machine (nagiossp02.sanit.dom).
In the configuration file, I have set the value of the Master node:
server=10.4.235.101:4730
Can you help me figure out what is happening?
Thanks.
Re: Problem mod_gearman distributed nodes
Posted: Thu Mar 21, 2013 9:20 am
by slansing
Is the worker running on the remote server?:
Have you made sure that the keyfile line has been changed on the worker? By default it will not be compatible with the Gearman Server.
Please see the Security section of the following document:
http://assets.nagios.com/downloads/nagi ... ios_XI.pdf
Re: Problem mod_gearman distributed nodes
Posted: Thu Mar 21, 2013 9:24 am
by mguthrie
There should be logs you can look at for both the gearman server and the workers. I would start by increasing the logging output on both ends and then tail both logs to see what's going wrong on the second worker machine. If I remember correctly, there should be a gearman-specific log or directory somewhere in /var/log. You can increase the logging output by editing the settings in the .conf files in /etc/mod_gearman.
Re: Problem mod_gearman distributed nodes
Posted: Thu Mar 21, 2013 10:38 am
by quental
Hi,
I do:
And the process is working OK.
The keyfile is created on both machines and has permissions...
In /var/log/mod_gearman/, there is nothing significant in the log files. I changed the trace level to 3 and still nothing...
I attached the logs of the slave node and the master node...
From the worker node, if I run:
Code:
netstat -anp | grep 4730
tcp 0 1 10.4.235.102:47932 10.4.235.101:4730 SYN_SENT 7074/mod_gearman_wo
tcp 0 1 10.4.235.102:47933 10.4.235.101:4730 SYN_SENT 7073/mod_gearman_wo
tcp 0 1 10.4.235.102:47935 10.4.235.101:4730 SYN_SENT 7077/mod_gearman_wo
tcp 0 1 10.4.235.102:47936 10.4.235.101:4730 SYN_SENT 7076/mod_gearman_wo
tcp 0 1 10.4.235.102:47934 10.4.235.101:4730 SYN_SENT 7078/mod_gearman_wo
tcp 0 1 10.4.235.102:47937 10.4.235.101:4730 SYN_SENT 7075/mod_gearman_wo
Is this OK?
Any suggestions?
Thanks.
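For reference, SYN_SENT means the worker has sent its initial TCP SYN to 10.4.235.101:4730 and has not yet received a reply, so none of these connections is established. A minimal sketch for counting such connections from netstat output is below. The function name count_syn_sent is hypothetical, and the column positions assume the Linux netstat layout shown above.

```shell
# Count connections to a given remote port that are in SYN_SENT.
# Reads `netstat -an`-style lines on stdin; the port defaults to 4730 (assumed gearmand port).
count_syn_sent() {
    port="${1:-4730}"
    awk -v port="$port" '
        # Field 5 is the foreign address, field 6 is the TCP state.
        $6 == "SYN_SENT" && $5 ~ (":" port "$") { n++ }
        END { print n + 0 }
    '
}
```

It could be used as `netstat -an | count_syn_sent 4730`. A count that stays above zero across several runs suggests that packets are being dropped somewhere between the worker and the server, for example by a firewall.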
Re: Problem mod_gearman distributed nodes
Posted: Thu Mar 21, 2013 11:15 am
by lmiltchev
After you set up the key file, did you run:
Code:
service nagios restart
service gearmand restart
service mod_gearman_worker restart
Re: Problem mod_gearman distributed nodes
Posted: Thu Mar 21, 2013 11:40 am
by quental
Hi,
On the master node:
Code:
service nagios start
Starting nagios:[2013-03-21 17:32:43][19459][TRACE] parse_args_line(logfile=/var/log/mod_gearman/mod_gearman_neb.log, 1)
[2013-03-21 17:32:43][19459][TRACE] parse_args_line(server=localhost:4730, 1)
[2013-03-21 17:32:43][19459][TRACE] parse_args_line(eventhandler=yes, 1)
[2013-03-21 17:32:43][19459][TRACE] parse_args_line(services=yes, 1)
[2013-03-21 17:32:43][19459][TRACE] parse_args_line(hosts=no, 1)
[2013-03-21 17:32:43][19459][TRACE] parse_args_line(do_hostchecks=yes, 1)
[2013-03-21 17:32:43][19459][TRACE] parse_args_line(encryption=yes, 1)
[2013-03-21 17:32:43][19459][TRACE] parse_args_line(keyfile=/etc/mod_gearman/gearman_key.txt, 1)
[2013-03-21 17:32:43][19459][TRACE] parse_args_line(use_uniq_jobs=on, 1)
[2013-03-21 17:32:43][19459][TRACE] parse_args_line(localhostgroups=, 1)
[2013-03-21 17:32:43][19459][TRACE] parse_args_line(localservicegroups=, 1)
[2013-03-21 17:32:43][19459][TRACE] parse_args_line(result_workers=1, 1)
[2013-03-21 17:32:43][19459][TRACE] parse_args_line(perfdata=no, 1)
[2013-03-21 17:32:43][19459][TRACE] parse_args_line(perfdata_mode=1, 1)
[2013-03-21 17:32:43][19459][TRACE] parse_args_line(orphan_host_checks=yes, 1)
[2013-03-21 17:32:43][19459][TRACE] parse_args_line(orphan_service_checks=yes, 1)
[2013-03-21 17:32:43][19459][TRACE] parse_args_line(accept_clear_results=no, 1)
[2013-03-21 17:32:43][19459][TRACE] parse_args_line(eventhandler=no, 0)
done.
Code:
service gearmand start
Starting gearmand: [ OK ]
Code:
service mod_gearman_worker start
Starting mod_gearman_worker...OK
On the slave node:
Code:
service mod_gearman_worker start
Starting mod_gearman_worker...OK
Then, if I run gearman_top on the master node:
Code:
2013-03-21 17:33:56 - localhost:4730 - v0.25
Queue Name | Worker Available | Jobs Waiting | Jobs Running
--------------------------------------------------------------------------------
check_results | 1 | 0 | 0
eventhandler | 11 | 0 | 0
host | 11 | 0 | 0
service | 11 | 0 | 6
worker_nagiossp01.sanit.dom | 1 | 0 | 0
--------------------------------------------------------------------------------
The slave node does not appear...
Re: Problem mod_gearman distributed nodes
Posted: Thu Mar 21, 2013 11:44 am
by lmiltchev
Also, can you run the following command on the worker, and show the output:
Re: Problem mod_gearman distributed nodes
Posted: Thu Mar 21, 2013 1:44 pm
by quental
Hi all,
Problem solved!
The problem was port filtering on the worker node.
I ran nmap and got:
Code:
nmap -p 4730 10.4.235.101
Starting Nmap 5.21 ( http://nmap.org ) at 2013-03-21 18:23 CET
Nmap scan report for nagiossp01.sanitas.dom (10.4.235.101)
Host is up (0.00064s latency).
PORT STATE SERVICE
4730/tcp filtered unknown
MAC Address: 00:50:56:83:41:B2 (VMware)
Nmap done: 1 IP address (1 host up) scanned in 0.11 seconds
On the worker node the firewall was active.
I ran:
system-config-firewall-tui
and disabled the internal firewall.
Thank you all for your time.
You can close the case.
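As an alternative to disabling the firewall completely, the gearmand port could be opened on its own. The following is only a sketch, assuming the stock CentOS 6 iptables service and port 4730 as used in this thread. The function name open_gearman_port and the APPLY variable are hypothetical. By default it only prints the commands; it would need to be run as root, with APPLY=1, on whichever machine was filtering the port.

```shell
# Sketch: allow inbound TCP traffic on the gearmand port (CentOS 6 iptables assumed).
# Prints the commands by default; set APPLY=1 and run as root to execute them.
open_gearman_port() {
    port="${1:-4730}"
    # Helper: run the command when APPLY=1, otherwise just print it.
    run() { if [ "${APPLY:-0}" = "1" ]; then "$@"; else echo "$@"; fi; }
    run iptables -I INPUT -p tcp --dport "$port" -j ACCEPT
    run service iptables save
}
```

Running `open_gearman_port 4730` shows what would be executed. Re-running `nmap -p 4730 10.4.235.101` afterwards should show whether the port is still reported as filtered.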

Re: Problem mod_gearman distributed nodes
Posted: Thu Mar 21, 2013 2:01 pm
by lmiltchev
Great! I am glad it works.

I am locking this topic.