Problem mod_gearman distributed nodes

This support forum board is for support questions relating to Nagios XI, our flagship commercial network monitoring solution.
Locked
quental
Posts: 74
Joined: Tue Apr 17, 2012 5:12 am

Problem mod_gearman distributed nodes

Post by quental »

Two machines running CentOS 6 (x86_64), manual installation.

Hi,
I have a problem with Nagios and mod_gearman. I installed it on two nodes, but when I check the status of the workers, I see only one node:

> gearman_top

Code: Select all

2013-03-21 13:33:56  -  localhost:4730   -  v0.25

 Queue Name                    | Worker Available | Jobs Waiting | Jobs Running
--------------------------------------------------------------------------------
 check_results                 |               1  |           0  |           0
 eventhandler                  |              11  |           0  |           0
 host                          |              11  |           0  |           0
 service                       |              11  |           0  |           6
 worker_nagiossp01.sanit.dom   |               1  |           0  |           0
--------------------------------------------------------------------------------
I do not see the second worker, which is on another machine (nagiossp02.sanit.dom).
In the configuration file, I have set the server value to the master node:
server=10.4.235.101:4730

Can you help me figure out what is happening?

Thanks.
slansing
Posts: 7698
Joined: Mon Apr 23, 2012 4:28 pm
Location: Travelling through time and space...

Re: Problem mod_gearman distributed nodes

Post by slansing »

Is the worker running on the remote server?

Code: Select all

service mod_gearman_worker status
Have you made sure that the keyfile line has been changed on the worker? By default it will not be compatible with the Gearman Server.

Please see the Security section of the following document:

http://assets.nagios.com/downloads/nagi ... ios_XI.pdf
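
One quick way to check the keyfile, as a rough sketch: compare a checksum of the key on the Gearman server with the one on each worker. The helper names below are made up for illustration, and the path /etc/mod_gearman/gearman_key.txt is an assumption; use whatever your keyfile= setting points to.

```shell
#!/bin/sh
# Hypothetical helper: print the SHA-256 fingerprint of a keyfile so the
# value can be compared by eye between the Gearman server and each worker.
key_fingerprint() {
    # Read via stdin so only the hash is printed, not the filename.
    sha256sum < "$1" | awk '{print $1}'
}

# Hypothetical helper: succeed only if two keyfiles are byte-for-byte identical
# (useful once both files have been copied to the same machine).
same_key() {
    cmp -s "$1" "$2"
}

# Usage (run on each node and compare the output); the path is an assumption:
#   key_fingerprint /etc/mod_gearman/gearman_key.txt
```

If the fingerprints differ, the worker and the server are not using the same key.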
mguthrie
Posts: 4380
Joined: Mon Jun 14, 2010 10:21 am

Re: Problem mod_gearman distributed nodes

Post by mguthrie »

There should be logs for both the Gearman server and the workers. I would start by increasing the logging output on both ends and then tail both logs to see what is going wrong on the second worker machine. If I remember correctly, there should be a Gearman-specific log or directory somewhere in /var/log. You can increase the logging output by editing the settings in the .conf files under /etc/mod_gearman.
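
With verbose logging turned on, the logs fill up quickly, so a small filter can help. This is only a sketch: it assumes log lines start with a [timestamp][pid][LEVEL] prefix, the function name is hypothetical, and the worker log filename in the usage comment is a guess.

```shell
#!/bin/sh
# Hypothetical helper: drop TRACE and DEBUG lines from a mod_gearman log so
# that anything else (errors, connection problems) stands out.
# Assumes each line starts with a "[timestamp][pid][LEVEL]" prefix.
interesting_lines() {
    grep -Ev '^\[[^]]*\]\[[0-9]+\]\[(TRACE|DEBUG)\]'
}

# Usage; the worker log filename is an assumption:
#   tail -n 200 /var/log/mod_gearman/mod_gearman_worker.log | interesting_lines
```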
quental
Posts: 74
Joined: Tue Apr 17, 2012 5:12 am

Re: Problem mod_gearman distributed nodes

Post by quental »

Hi,

I ran:

Code: Select all

service mod_gearman_worker status
and the process is running OK.

The keyfile exists on both machines and has permissions set.

In /var/log/mod_gearman/, the log files show nothing significant. I changed the trace level to 3 and still nothing.

I have attached the logs of the slave node and the master node.

From the worker node, if I run:

Code: Select all

 netstat -anp | grep 4730
tcp        0      1 10.4.235.102:47932          10.4.235.101:4730           SYN_SENT    7074/mod_gearman_wo
tcp        0      1 10.4.235.102:47933          10.4.235.101:4730           SYN_SENT    7073/mod_gearman_wo
tcp        0      1 10.4.235.102:47935          10.4.235.101:4730           SYN_SENT    7077/mod_gearman_wo
tcp        0      1 10.4.235.102:47936          10.4.235.101:4730           SYN_SENT    7076/mod_gearman_wo
tcp        0      1 10.4.235.102:47934          10.4.235.101:4730           SYN_SENT    7078/mod_gearman_wo
tcp        0      1 10.4.235.102:47937          10.4.235.101:4730           SYN_SENT    7075/mod_gearman_wo
Is this OK?

Any suggestions?

Thanks.
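
Note on reading this output: SYN_SENT means the worker has sent the initial TCP SYN to 10.4.235.101:4730 and has not received a reply yet. Connections that stay in this state usually mean packets are being dropped somewhere between the two hosts, commonly by a firewall, since a port that is merely closed would normally answer with a reset. A small sketch for counting such connections follows; the function name is hypothetical and it assumes the `netstat -an` column layout shown above.

```shell
#!/bin/sh
# Hypothetical helper: count connections stuck in SYN_SENT toward a given
# remote port, reading `netstat -an`-style output on stdin.
count_syn_sent() {
    port="$1"
    # Field 5 is the foreign address (ip:port), field 6 is the TCP state.
    awk -v p=":$port" '
        $6 == "SYN_SENT" && substr($5, length($5) - length(p) + 1) == p { n++ }
        END { print n + 0 }
    '
}

# Usage:
#   netstat -an | count_syn_sent 4730
```

A count that stays above zero across several runs suggests the handshake never completes.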
lmiltchev
Bugs find me
Posts: 13589
Joined: Mon May 23, 2011 12:15 pm

Re: Problem mod_gearman distributed nodes

Post by lmiltchev »

After you set up the key file, did you run:

Code: Select all

service nagios restart
service gearmand restart
service mod_gearman_worker restart
Be sure to check out our Knowledgebase for helpful articles and solutions!
quental
Posts: 74
Joined: Tue Apr 17, 2012 5:12 am

Re: Problem mod_gearman distributed nodes

Post by quental »

Hi,
On the master node:

Code: Select all

service nagios start
Starting nagios:[2013-03-21 17:32:43][19459][TRACE] parse_args_line(logfile=/var/log/mod_gearman/mod_gearman_neb.log, 1)
[2013-03-21 17:32:43][19459][TRACE] parse_args_line(server=localhost:4730, 1)
[2013-03-21 17:32:43][19459][TRACE] parse_args_line(eventhandler=yes, 1)
[2013-03-21 17:32:43][19459][TRACE] parse_args_line(services=yes, 1)
[2013-03-21 17:32:43][19459][TRACE] parse_args_line(hosts=no, 1)
[2013-03-21 17:32:43][19459][TRACE] parse_args_line(do_hostchecks=yes, 1)
[2013-03-21 17:32:43][19459][TRACE] parse_args_line(encryption=yes, 1)
[2013-03-21 17:32:43][19459][TRACE] parse_args_line(keyfile=/etc/mod_gearman/gearman_key.txt, 1)
[2013-03-21 17:32:43][19459][TRACE] parse_args_line(use_uniq_jobs=on, 1)
[2013-03-21 17:32:43][19459][TRACE] parse_args_line(localhostgroups=, 1)
[2013-03-21 17:32:43][19459][TRACE] parse_args_line(localservicegroups=, 1)
[2013-03-21 17:32:43][19459][TRACE] parse_args_line(result_workers=1, 1)
[2013-03-21 17:32:43][19459][TRACE] parse_args_line(perfdata=no, 1)
[2013-03-21 17:32:43][19459][TRACE] parse_args_line(perfdata_mode=1, 1)
[2013-03-21 17:32:43][19459][TRACE] parse_args_line(orphan_host_checks=yes, 1)
[2013-03-21 17:32:43][19459][TRACE] parse_args_line(orphan_service_checks=yes, 1)
[2013-03-21 17:32:43][19459][TRACE] parse_args_line(accept_clear_results=no, 1)
[2013-03-21 17:32:43][19459][TRACE] parse_args_line(eventhandler=no, 0)
 done.

Code: Select all

service gearmand start
Starting gearmand:                                         [  OK  ]

Code: Select all

service mod_gearman_worker start
Starting mod_gearman_worker...OK
On the slave node:

Code: Select all

service mod_gearman_worker start
Starting mod_gearman_worker...OK
Then, if I run on the master node:

Code: Select all

gearman_top

Code: Select all

2013-03-21 17:33:56  -  localhost:4730   -  v0.25

Queue Name                    | Worker Available | Jobs Waiting | Jobs Running
--------------------------------------------------------------------------------
check_results                 |               1  |           0  |           0
eventhandler                  |              11  |           0  |           0
host                          |              11  |           0  |           0
service                       |              11  |           0  |           6
worker_nagiossp01.sanit.dom   |               1  |           0  |           0
--------------------------------------------------------------------------------
The slave node does not appear.
lmiltchev
Bugs find me
Posts: 13589
Joined: Mon May 23, 2011 12:15 pm

Re: Problem mod_gearman distributed nodes

Post by lmiltchev »

Also, can you run the following command on the worker, and show the output:

Code: Select all

iptables -L -n | grep 4730
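
Keep in mind that grepping for 4730 only finds rules that name the port explicitly. On a stock CentOS 6 ruleset, traffic to an unlisted port is typically rejected by a catch-all rule at the end of the INPUT chain, and that rule will not show up in the grep. The sketch below checks for both cases. It assumes the usual `iptables -L -n` column layout (target, prot, opt, source, destination), does not look at the chain policy, and the function name is hypothetical.

```shell
#!/bin/sh
# Hypothetical helper: read `iptables -L INPUT -n` output on stdin and report
# whether any rule names the port, and whether a catch-all REJECT/DROP rule
# is present.
inspect_input_chain() {
    awk -v port="$1" '
        # Rule that names the port explicitly, e.g. "tcp dpt:4730".
        $0 ~ ("dpt:" port "([^0-9]|$)") { named = 1 }
        # Catch-all rule: any protocol, any source, any destination.
        ($1 == "REJECT" || $1 == "DROP") && $2 == "all" &&
            $4 == "0.0.0.0/0" && $5 == "0.0.0.0/0" { blocked = 1 }
        END {
            print "port_rule=" (named ? "yes" : "no")
            print "catch_all_block=" (blocked ? "yes" : "no")
        }
    '
}

# Usage (as root):
#   iptables -L INPUT -n | inspect_input_chain 4730
```

An output of port_rule=no together with catch_all_block=yes would point at the firewall.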
quental
Posts: 74
Joined: Tue Apr 17, 2012 5:12 am

Re: Problem mod_gearman distributed nodes

Post by quental »

Hi all,
Problem solved!

The problem was port filtering on the worker node.

I ran nmap and got:

Code: Select all

nmap -p 4730 10.4.235.101

Starting Nmap 5.21 ( http://nmap.org ) at 2013-03-21 18:23 CET
Nmap scan report for nagiossp01.sanitas.dom (10.4.235.101)
Host is up (0.00064s latency).

PORT     STATE    SERVICE
4730/tcp filtered unknown

MAC Address: 00:50:56:83:41:B2 (VMware)

Nmap done: 1 IP address (1 host up) scanned in 0.11 seconds
The firewall was active on the worker node.

I ran:
system-config-firewall-tui
and disabled the internal firewall.

Thank you all for your time.
You can close the case. :D
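
Disabling the firewall entirely works, but a narrower alternative is to allow only the Gearman port. The following is a sketch, not a tested recipe for this setup: it assumes the stock iptables service on CentOS 6, and the rule belongs on whichever host is filtering port 4730. The helper name is hypothetical, and it only prints the commands so they can be reviewed before being run as root.

```shell
#!/bin/sh
# Hypothetical helper: print (not execute) the commands that would allow
# inbound TCP traffic to the Gearman port from one source address and
# persist the rule across reboots.
print_allow_rule() {
    src="$1"
    port="$2"
    echo "iptables -I INPUT -p tcp -s $src --dport $port -j ACCEPT"
    echo "service iptables save"
}

# Example with the worker address from this thread:
#   print_allow_rule 10.4.235.102 4730
```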
lmiltchev
Bugs find me
Posts: 13589
Joined: Mon May 23, 2011 12:15 pm

Re: Problem mod_gearman distributed nodes

Post by lmiltchev »

Great! I am glad it works. :D I am locking this topic.