Nagios XI 5.5.3 and Mod_Gearman compatibility

salami
Posts: 30
Joined: Tue Jun 26, 2018 4:36 am

Re: Nagios XI 5.5.3 and Mod_Gearman compatibility

Post by salami »

Thanks for your reply.
I ran the command you suggested and attached the result as a text file.
It seems that all of the PHP processes are zombie processes. I need to find out what the parent of these processes is.
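
One way to find the parent of defunct processes is to list every process in state Z together with its PPID and then group by parent. The following is a minimal sketch, assuming a Linux procps `ps`; the function names `filter_zombies` and `count_by_parent` are made up for illustration.

```shell
#!/bin/sh
# Keep only zombie rows from "PID PPID STAT COMMAND" input (STAT starts with Z).
filter_zombies() {
    awk '$3 ~ /^Z/ { print $1, $2, $4 }'
}

# Count zombies per parent PID, most affected parent first.
count_by_parent() {
    awk '{ n[$2]++ } END { for (p in n) print n[p], p }' | sort -rn
}

# Live usage: print "<count> zombies under PID <ppid>: <parent command line>".
ps -eo pid=,ppid=,stat=,comm= | filter_zombies | count_by_parent |
while read -r count ppid; do
    echo "$count zombies under PID $ppid: $(ps -o args= -p "$ppid")"
done
```

The parent command line printed at the end is usually the process that is not reaping its children.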

Also, I tested two kinds of Gearman architecture, as follows:

1. A local worker handles the event handlers (please see the attached picture, 2.png)
2. A remote worker handles the event handlers (please see the attached picture, 1.png)

Neither of them makes a difference in this case.
I currently have the second architecture set up, and the config files are as below:

Local Worker:

Code: Select all

###############################################################################
#
#  Mod-Gearman - distribute checks with gearman
#
#  Copyright (c) 2010 Sven Nierlein
#
#  Worker Module Config
#
###############################################################################

# Identifier, hostname will be used if undefined
#identifier=hostname

# use debug to increase the verbosity of the module.
# Possible values are:
#     0 = only errors
#     1 = debug messages
#     2 = trace messages
#     3 = trace and all gearman related logs are going to stdout.
# Default is 0.
debug=0

# Path to the logfile.
logfile=/var/log/mod_gearman2/mod_gearman_worker.log

# sets the addess of your gearman job server. Can be specified
# more than once to add more server.
server=localhost:4730


# sets the address of your 2nd (duplicate) gearman job server. Can
# be specified more than once o add more servers.
#dupserver=<host>:<port>


# defines if the worker should execute eventhandlers.
eventhandler=no


# defines if the worker should execute
# service checks.
services=no


# defines if the worker should execute
# host checks.
hosts=no


# sets a list of hostgroups which this worker will work
# on. Either specify a comma seperated list or use
# multiple lines.
hostgroups=linux-server
#hostgroups=name2,name3


# sets a list of servicegroups which this worker will
# work on.
#servicegroups=name1,name2,name3

# enables or disables encryption. It is strongly
# advised to not disable encryption. Anybody will be
# able to inject packages to your worker.
# Encryption is enabled by default and you have to
# explicitly disable it.
# When using encryption, you will either have to
# specify a shared password with key=... or a
# keyfile with keyfile=...
# Default is On.
encryption=yes


# A shared password which will be used for
# encryption of data pakets. Should be at least 8
# bytes long. Maximum length is 32 characters.
key=oss_key_nagios


# The shared password will be read from this file.
# Use either key or keyfile. Only the first 32
# characters will be used.
#keyfile=/path/to/secret.file

# Path to the pidfile. Usually set by the init script
#pidfile=/var/mod_gearman2/mod_gearman_worker.pid

# Default job timeout in seconds. Currently this value is only used for
# eventhandler. The worker will use the values from the core for host and
# service checks.
job_timeout=30

# Minimum number of worker processes which should
# run at any time.
min-worker=10

# Maximum number of worker processes which should
# run at any time. You may set this equal to
# min-worker setting to disable dynamic starting of
# workers. When setting this to 1, all services from
# this worker will be executed one after another.
max-worker=1024

# Time after which an idling worker exists
# This parameter controls how fast your waiting workers will
# exit if there are no jobs waiting.
idle-timeout=60

# Controls the amount of jobs a worker will do before he exits
# Use this to control how fast the amount of workers will go down
# after high load times
max-jobs=1000

# max-age is the threshold for discarding too old jobs. When a new job is older
# than this amount of seconds it will not be executed and just discarded. Set to
# zero to disable this check.
#max-age=0

# defines the rate of spawned worker per second as long
# as there are jobs waiting
spawn-rate=5

# Use this option to disable an extra fork for each plugin execution. Disabling
# this option will reduce the load on the worker host but can lead to problems with
# unclean plugin. Default: yes
fork_on_exec=no

# Set a limit based on the 1min load average. When exceding the load limit,
# no new worker will be started until the current load is below the limit.
# No limit will be used when set to 0.
load_limit1=0

# Same as load_limit1 but for the 5min load average.
load_limit5=0

# Same as load_limit1 but for the 15min load average.
load_limit15=0

# Use this option to show stderr output of plugins too.
# Default: yes
show_error_output=yes

# Use dup_results_are_passive to set if the duplicate result send to the dupserver
# will be passive or active.
# Default is yes (passive).
#dup_results_are_passive=yes

# When embedded perl has been compiled in, you can use this
# switch to enable or disable the embedded perl interpreter.
enable_embedded_perl=on

# Default value used when the perl script does not have a
# "nagios: +epn" or "nagios: -epn" set.
# Perl scripts not written for epn support usually fail with epn,
# so its better to set the default to off.
use_embedded_perl_implicitly=off

# Cache compiled perl scripts. This makes the worker process a little
# bit bigger but makes execution of perl scripts even faster.
# When turned off, Mod-Gearman will still use the embedded perl
# interpreter, but will not cache the compiled script.
use_perl_cache=on

# path to p1 file which is used to execute and cache the
# perl scripts run by the embedded perl interpreter
p1_file=/usr/share/mod_gearman2/mod_gearman_p1.pl


# Security
# restrict_path allows you to restrict this worker to only execute plugins
# from these particular folders. Can be used multiple times to specify more
# than one folder.
# Note that when this restriction is active, no shell will be spawned and
# no shell characters ($`'"()|) are allowed in the command line itself.
#restrict_path=/usr/local/plugins/

# Workarounds

# workaround for rc 25 bug
# duplicate jobs from gearmand result in exit code 25 of plugins
# because they are executed twice and get killed because of using
# the same ressource.
# Sending results (when exit code is 25 ) will be skipped with this
# enabled.
workaround_rc_25=off

module.conf:

Code: Select all

###############################################################################
#
#  Mod-Gearman - distribute checks with gearman
#
#  Copyright (c) 2010 Sven Nierlein
#
#  Mod-Gearman NEB Module Config
#
###############################################################################

# use debug to increase the verbosity of the module.
# Possible values are:
#     0 = only errors
#     1 = debug messages
#     2 = trace messages
#     3 = trace and all gearman related logs are going to stdout.
# Default is 0.
debug=2

# Path to the logfile.
logfile=/var/log/mod_gearman2/mod_gearman_neb.log

# sets the addess of your gearman job server. Can be specified
# more than once to add more server.
server=localhost:4730


# sets the address of your 2nd (duplicate) gearman job server. Can
# be specified more than once o add more servers.
#dupserver=<host>:<port>


# defines if the module should distribute execution of
# eventhandlers.
eventhandler=yes


# defines if the module should distribute execution of
# service checks.
services=yes


# defines if the module should distribute execution of
# host checks.
hosts=yes


# sets a list of hostgroups which will go into seperate
# queues. Either specify a comma seperated list or use
# multiple lines.
hostgroups=ayandeh,Day,eghtesad_novin,fereshtegan,gardeshgari,gostaresh,hekmat,shahrdari,tejarat,karafarin,karsazan_ayandeh,naji,sina,melal,maskan,borse_kala,mellat,ansar,mahak,kosar,zamzam,ofogh_kurosh,kpec,refah,vahdat,khavarmiyaneh,other



# sets a list of servicegroups which will go into seperate
# queues.
#servicegroups=name1,name2,name3

# Set this to 'no' if you want Mod-Gearman to only take care of
# servicechecks. No hostchecks will be processed by Mod-Gearman. Use
# this option to disable hostchecks and still have the possibility to
# use hostgroups for easy configuration of your services.
# If set to yes, you still have to define which hostchecks should be
# processed by either using 'hosts' or the 'hostgroups' option.
# Default is Yes.
do_hostchecks=yes

# This settings determines if all eventhandlers go into a single
# 'eventhandlers' queue or into the same queue like normal checks
# would do.
route_eventhandler_like_checks=no

# enables or disables encryption. It is strongly
# advised to not disable encryption. Anybody will be
# able to inject packages to your worker.
# Encryption is enabled by default and you have to
# explicitly disable it.
# When using encryption, you will either have to
# specify a shared password with key=... or a
# keyfile with keyfile=...
# Default is On.
encryption=yes


# A shared password which will be used for
# encryption of data pakets. Should be at least 8
# bytes long. Maximum length is 32 characters.
key=oss_key_nagios


# The shared password will be read from this file.
# Use either key or keyfile. Only the first 32
# characters will be used.
#keyfile=/path/to/secret.file


# use_uniq_jobs
# Using uniq keys prevents the gearman queues from filling up when there
# is no worker. However, gearmand seems to have problems with the uniq
# key and sometimes jobs get stuck in the queue. Set this option to 'off'
# when you run into problems with stuck jobs but make sure your worker
# are running.
use_uniq_jobs=on



###############################################################################
#
# NEB Module Config
#
# the following settings are for the neb module only and
# will be ignored by the worker.
#
###############################################################################

# sets a list of hostgroups which will not be executed
# by gearman. They are just passed through.
# Default is none
localhostgroups=


# sets a list of servicegroups which will not be executed
# by gearman. They are just passed through.
# Default is none
localservicegroups=

# The queue_custom_variable can be used to define the target queue
# by a custom variable in addition to host/servicegroups. When set
# for ex. to 'WORKER' you then could define a '_WORKER' custom
# variable for your hosts and services to directly set the worker
# queue. The host queue is inherited unless overwritten
# by a service custom variable. Set the value of your custom
# variable to 'local' to bypass Mod-Gearman (Same behaviour as in
# localhostgroups/localservicegroups).
#queue_custom_variable=WORKER

# Number of result worker threads. Usually one is
# enough. You may increase the value if your
# result queue is not processed fast enough.
# Default: 1
result_workers=1


# defines if the module should distribute perfdata
# to gearman.
# Note: processing of perfdata is not part of
# mod_gearman. You will need additional worker for
# handling performance data. For example: pnp4nagios
# Performance data is just written to the gearman
# queue.
# Default: no
perfdata=no

# perfdata mode overwrite helps preventing the perdata queue getting to big
# 1 = overwrote
# 2 = append
perfdata_mode=1

# The Mod-Gearman NEB module will submit a fake result for orphaned host
# checks with a message saying there is no worker running for this
# queue. Use this option to get better reporting results, otherwise your
# hosts will keep their last state as long as there is no worker
# running.
# Default: yes
orphan_host_checks=yes

# Same like 'orphan_host_checks' but for services.
# Default: yes
orphan_service_checks=yes

# When accept_clear_results is enabled, the NEB module will accept unencrypted
# results too. This is quite useful if you have lots of passive checks and make
# use of send_gearman/send_multi where you would have to spread the shared key to
# all clients using these tools.
# Default is no.
accept_clear_results=no

Remote worker1 config file: (events handled by this worker)

Code: Select all

###############################################################################
#
#  Mod-Gearman - distribute checks with gearman
#
#  Copyright (c) 2010 Sven Nierlein
#
#  Worker Module Config
#
###############################################################################

# Identifier, hostname will be used if undefined
#identifier=hostname

# use debug to increase the verbosity of the module.
# Possible values are:
#     0 = only errors
#     1 = debug messages
#     2 = trace messages
#     3 = trace and all gearman related logs are going to stdout.
# Default is 0.
debug=0

# Path to the logfile.
logfile=/var/log/mod_gearman2/mod_gearman_worker.log

# sets the addess of your gearman job server. Can be specified
# more than once to add more server.
server=10.47.12.95:4730


# sets the address of your 2nd (duplicate) gearman job server. Can
# be specified more than once o add more servers.
#dupserver=<host>:<port>


# defines if the worker should execute eventhandlers.
eventhandler=yes


# defines if the worker should execute
# service checks.
services=yes


# defines if the worker should execute
# host checks.
hosts=yes


# sets a list of hostgroups which this worker will work
# on. Either specify a comma seperated list or use
# multiple lines.
hostgroups=ayandeh,Day,eghtesad_novin,fereshtegan,gardeshgari,gostaresh,hekmat,shahrdari,tejarat,karafarin,karsazan_ayandeh,naji,sina,melal,maskan,borse_kala,mellat,ansar,mahak,kosar,zamzam,ofogh_kurosh,kpec,refah,vahdat,khavarmiyaneh,other


# sets a list of servicegroups which this worker will
# work on.
#servicegroups=name1,name2,name3

# enables or disables encryption. It is strongly
# advised to not disable encryption. Anybody will be
# able to inject packages to your worker.
# Encryption is enabled by default and you have to
# explicitly disable it.
# When using encryption, you will either have to
# specify a shared password with key=... or a
# keyfile with keyfile=...
# Default is On.
encryption=yes


# A shared password which will be used for
# encryption of data pakets. Should be at least 8
# bytes long. Maximum length is 32 characters.
key=oss_key_nagios


# The shared password will be read from this file.
# Use either key or keyfile. Only the first 32
# characters will be used.
#keyfile=/path/to/secret.file

# Path to the pidfile. Usually set by the init script
#pidfile=/var/mod_gearman2/mod_gearman_worker.pid

# Default job timeout in seconds. Currently this value is only used for
# eventhandler. The worker will use the values from the core for host and
# service checks.
job_timeout=120

# Minimum number of worker processes which should
# run at any time.
min-worker=200

# Maximum number of worker processes which should
# run at any time. You may set this equal to
# min-worker setting to disable dynamic starting of
# workers. When setting this to 1, all services from
# this worker will be executed one after another.
max-worker=1024

# Time after which an idling worker exists
# This parameter controls how fast your waiting workers will
# exit if there are no jobs waiting.
idle-timeout=150

# Controls the amount of jobs a worker will do before he exits
# Use this to control how fast the amount of workers will go down
# after high load times
max-jobs=1000

# max-age is the threshold for discarding too old jobs. When a new job is older
# than this amount of seconds it will not be executed and just discarded. Set to
# zero to disable this check.
#max-age=0

# defines the rate of spawned worker per second as long
# as there are jobs waiting
spawn-rate=10

# Use this option to disable an extra fork for each plugin execution. Disabling
# this option will reduce the load on the worker host but can lead to problems with
# unclean plugin. Default: yes
fork_on_exec=no

# Set a limit based on the 1min load average. When exceding the load limit,
# no new worker will be started until the current load is below the limit.
# No limit will be used when set to 0.
load_limit1=0

# Same as load_limit1 but for the 5min load average.
load_limit5=0

# Same as load_limit1 but for the 15min load average.
load_limit15=0

# Use this option to show stderr output of plugins too.
# Default: yes
show_error_output=yes

# Use dup_results_are_passive to set if the duplicate result send to the dupserver
# will be passive or active.
# Default is yes (passive).
#dup_results_are_passive=yes

# When embedded perl has been compiled in, you can use this
# switch to enable or disable the embedded perl interpreter.
enable_embedded_perl=on

# Default value used when the perl script does not have a
# "nagios: +epn" or "nagios: -epn" set.
# Perl scripts not written for epn support usually fail with epn,
# so its better to set the default to off.
use_embedded_perl_implicitly=off

# Cache compiled perl scripts. This makes the worker process a little
# bit bigger but makes execution of perl scripts even faster.
# When turned off, Mod-Gearman will still use the embedded perl
# interpreter, but will not cache the compiled script.
use_perl_cache=on

# path to p1 file which is used to execute and cache the
# perl scripts run by the embedded perl interpreter
p1_file=/usr/share/mod_gearman2/mod_gearman_p1.pl


# Security
# restrict_path allows you to restrict this worker to only execute plugins
# from these particular folders. Can be used multiple times to specify more
# than one folder.
# Note that when this restriction is active, no shell will be spawned and
# no shell characters ($`'"()|) are allowed in the command line itself.
#restrict_path=/usr/local/plugins/

# Workarounds

# workaround for rc 25 bug
# duplicate jobs from gearmand result in exit code 25 of plugins
# because they are executed twice and get killed because of using
# the same ressource.
# Sending results (when exit code is 25 ) will be skipped with this
# enabled.
workaround_rc_25=off

Remote worker 2 config file:

Code: Select all

###############################################################################
#
#  Mod-Gearman - distribute checks with gearman
#
#  Copyright (c) 2010 Sven Nierlein
#
#  Worker Module Config
#
###############################################################################

# Identifier, hostname will be used if undefined
#identifier=hostname

# use debug to increase the verbosity of the module.
# Possible values are:
#     0 = only errors
#     1 = debug messages
#     2 = trace messages
#     3 = trace and all gearman related logs are going to stdout.
# Default is 0.
debug=0

# Path to the logfile.
logfile=/var/log/mod_gearman2/mod_gearman_worker.log

# sets the addess of your gearman job server. Can be specified
# more than once to add more server.
server=10.47.12.95:4730


# sets the address of your 2nd (duplicate) gearman job server. Can
# be specified more than once o add more servers.
#dupserver=<host>:<port>


# defines if the worker should execute eventhandlers.
eventhandler=no


# defines if the worker should execute
# service checks.
services=no


# defines if the worker should execute
# host checks.
hosts=no


# sets a list of hostgroups which this worker will work
# on. Either specify a comma seperated list or use
# multiple lines.

hostgroups=ayandeh,Day,eghtesad_novin,fereshtegan,gardeshgari,gostaresh,hekmat,shahrdari,tejarat,karafarin,karsazan_ayandeh,naji,sina,melal,maskan,borse_kala,mellat,ansar,mahak,kosar,zamzam,ofogh_kurosh,kpec,refah,vahdat,khavarmiyaneh,other

# sets a list of servicegroups which this worker will
# work on.
#servicegroups=name1,name2,name3
servicegroups=

# enables or disables encryption. It is strongly
# advised to not disable encryption. Anybody will be
# able to inject packages to your worker.
# Encryption is enabled by default and you have to
# explicitly disable it.
# When using encryption, you will either have to
# specify a shared password with key=... or a
# keyfile with keyfile=...
# Default is On.
encryption=yes


# A shared password which will be used for
# encryption of data pakets. Should be at least 8
# bytes long. Maximum length is 32 characters.
key=oss_key_nagios


# The shared password will be read from this file.
# Use either key or keyfile. Only the first 32
# characters will be used.
#keyfile=/path/to/secret.file

# Path to the pidfile. Usually set by the init script
#pidfile=/var/mod_gearman2/mod_gearman_worker.pid

# Default job timeout in seconds. Currently this value is only used for
# eventhandler. The worker will use the values from the core for host and
# service checks.
job_timeout=120

# Minimum number of worker processes which should
# run at any time.
min-worker=50

# Maximum number of worker processes which should
# run at any time. You may set this equal to
# min-worker setting to disable dynamic starting of
# workers. When setting this to 1, all services from
# this worker will be executed one after another.
max-worker=1024

# Time after which an idling worker exists
# This parameter controls how fast your waiting workers will
# exit if there are no jobs waiting.
idle-timeout=150

# Controls the amount of jobs a worker will do before he exits
# Use this to control how fast the amount of workers will go down
# after high load times
max-jobs=1000

# max-age is the threshold for discarding too old jobs. When a new job is older
# than this amount of seconds it will not be executed and just discarded. Set to
# zero to disable this check.
#max-age=0

# defines the rate of spawned worker per second as long
# as there are jobs waiting
spawn-rate=10

# Use this option to disable an extra fork for each plugin execution. Disabling
# this option will reduce the load on the worker host but can lead to problems with
# unclean plugin. Default: yes
fork_on_exec=yes

# Set a limit based on the 1min load average. When exceding the load limit,
# no new worker will be started until the current load is below the limit.
# No limit will be used when set to 0.
load_limit1=0

# Same as load_limit1 but for the 5min load average.
load_limit5=0

# Same as load_limit1 but for the 15min load average.
load_limit15=0

# Use this option to show stderr output of plugins too.
# Default: yes
show_error_output=yes

# Use dup_results_are_passive to set if the duplicate result send to the dupserver
# will be passive or active.
# Default is yes (passive).
#dup_results_are_passive=yes

# When embedded perl has been compiled in, you can use this
# switch to enable or disable the embedded perl interpreter.
enable_embedded_perl=on

# Default value used when the perl script does not have a
# "nagios: +epn" or "nagios: -epn" set.
# Perl scripts not written for epn support usually fail with epn,
# so its better to set the default to off.
use_embedded_perl_implicitly=off

# Cache compiled perl scripts. This makes the worker process a little
# bit bigger but makes execution of perl scripts even faster.
# When turned off, Mod-Gearman will still use the embedded perl
# interpreter, but will not cache the compiled script.
use_perl_cache=on

# path to p1 file which is used to execute and cache the
# perl scripts run by the embedded perl interpreter
p1_file=/usr/share/mod_gearman2/mod_gearman_p1.pl


# Security
# restrict_path allows you to restrict this worker to only execute plugins
# from these particular folders. Can be used multiple times to specify more
# than one folder.
# Note that when this restriction is active, no shell will be spawned and
# no shell characters ($`'"()|) are allowed in the command line itself.
#restrict_path=/usr/local/plugins/

# Workarounds

# workaround for rc 25 bug
# duplicate jobs from gearmand result in exit code 25 of plugins
# because they are executed twice and get killed because of using
# the same ressource.
# Sending results (when exit code is 25 ) will be skipped with this
# enabled.
workaround_rc_25=off
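
Since the NEB module and every worker must share the same `encryption` and `key` settings, a quick comparison of two config files can rule out a mismatch. This is only a sketch; the helper names and the example file paths in the usage line are assumptions, not part of Mod-Gearman.

```shell
#!/bin/sh
# Print the last active (uncommented) value of a setting in a config file.
conf_value() {
    # $1 = config file, $2 = setting name
    awk -F= -v k="$2" '$1 == k { v = substr($0, length(k) + 2) } END { print v }' "$1"
}

# Succeed if a setting has the same value in two config files.
same_setting() {
    # $1 = setting name, $2 and $3 = config files
    [ "$(conf_value "$2" "$1")" = "$(conf_value "$3" "$1")" ]
}

# Usage (paths are assumptions): sh check_conf.sh module.conf worker.conf
if [ "$#" -eq 2 ]; then
    for name in encryption key; do
        if same_setting "$name" "$1" "$2"; then
            echo "$name: match"
        else
            echo "$name: DIFFERENT"
        fi
    done
fi
```

The script only reports whether the values match, so the shared key itself is never printed.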
tgriep
Madmin
Posts: 9177
Joined: Thu Oct 30, 2014 9:02 am

Re: Nagios XI 5.5.3 and Mod_Gearman compatibility

Post by tgriep »

The Gearman config files look fairly standard and should work.

We need to see what is causing the zombie processes.
Let's stop all of the processes and end the zombie processes by running this block of commands as root.

Code: Select all

service nagios stop
service ndo2db stop
service npcd stop
service crond stop
pkill -9 nagios
killall -9 nagios
for i in `ipcs -q | grep nagios |awk '{print $2}'`; do ipcrm -q $i; done
service gearmand restart
service npcd start
service crond start
service ndo2db start
service nagios start
If you start to see the zombie PHP processes again, take a look at the log files in the following locations for any errors.

Code: Select all

/usr/local/nagios/var/nagios.log
The log files in these folders

Code: Select all

/usr/local/nagiosxi/var folder
/var/log/httpd
the /var/log/messages file.
And the Gearman log files

Code: Select all

/var/log/mod_gearman2/mod_gearman_neb.log
/var/log/mod_gearman2/mod_gearman_worker.log
If you want to rule out Gearman, remove the broker line from the nagios.cfg file and restart nagios.
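
Removing the broker line can also be done by commenting it out, which makes it easy to restore later. A minimal sketch, assuming GNU `sed` and that nagios.cfg lives at /usr/local/nagios/etc/nagios.cfg (the path and the function name are assumptions):

```shell
#!/bin/sh
# Comment out any active broker_module line that loads Mod-Gearman.
# A backup of the original file is kept as <file>.bak.
disable_gearman_broker() {
    sed -i.bak -E '/^[[:space:]]*broker_module=.*mod_gearman/ s/^/#/' "$1"
}

# Usage: sh disable_broker.sh /usr/local/nagios/etc/nagios.cfg
if [ "$#" -eq 1 ]; then
    disable_gearman_broker "$1" && grep -n 'broker_module' "$1"
fi
```

Other broker modules are left untouched. Restart Nagios afterwards with `service nagios restart`.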
Be sure to check out our Knowledgebase for helpful articles and solutions!
salami
Posts: 30
Joined: Tue Jun 26, 2018 4:36 am

Re: Nagios XI 5.5.3 and Mod_Gearman compatibility

Post by salami »

Before running the commands you suggested, I ran 'tail -f' on all of the log files, and then ran the suggested commands. I am sharing all of the logs below:

/var/log/mod_gearman2/mod_gearman_worker.log

Code: Select all

[2018-09-22 09:30:56][6164][ERROR] worker error: connect_poll(GEARMAN_TIMEOUT) timeout occurred while trying to connect -> libgearman/connection.cc:109
[2018-09-22 09:30:56][6160][ERROR] worker error: connect_poll(GEARMAN_TIMEOUT) timeout occurred while trying to connect -> libgearman/connection.cc:109
[2018-09-22 09:30:56][6162][ERROR] worker error: connect_poll(GEARMAN_TIMEOUT) timeout occurred while trying to connect -> libgearman/connection.cc:109
[2018-09-22 09:30:56][6169][ERROR] worker error: connect_poll(GEARMAN_TIMEOUT) timeout occurred while trying to connect -> libgearman/connection.cc:109
[2018-09-22 09:30:56][6163][ERROR] worker error: connect_poll(GEARMAN_TIMEOUT) timeout occurred while trying to connect -> libgearman/connection.cc:109
[2018-09-22 09:30:56][6168][ERROR] worker error: connect_poll(GEARMAN_TIMEOUT) timeout occurred while trying to connect -> libgearman/connection.cc:109
[2018-09-22 09:30:56][6171][ERROR] worker error: connect_poll(GEARMAN_TIMEOUT) timeout occurred while trying to connect -> libgearman/connection.cc:109

/usr/local/nagiosxi/var/event_handler.log

After less than a minute, I see the logs below:

Code: Select all

Array
(
    [eventqueue_id] => 1494004
    [event_time] => 2018-09-22 09:30:51
    [event_source] => 2
    [event_type] => 2
    [event_meta] => YToyMzp7czoxNzoibm90aWZpY2F0aW9uLXR5cGUiO3M6NDoiaG9zdCI7czo3OiJjb250YWN0IjtzOjg6ImtheWFuZGVoIjtzOjEyOiJjb250YWN0ZW1haWwiO3M6MTM6InRlc3RAdGVzdC5jb20iO3M6NDoidHlwZSI7czo3OiJQUk9CTEVNIjtzOjk6ImVzY2FsYXRlZCI7czoxOiIwIjtzOjY6ImF1dGhvciI7YjowO3M6ODoiY29tbWVudHMiO2I6MDtzOjQ6Imhvc3QiO3M6MjM6IlNoYWhyZGFyaS0yMDAwMDIwNTUyMjc5IjtzOjExOiJob3N0YWRkcmVzcyI7czoxNToiMTkyLjE2OC4xMzIuMTc3IjtzOjk6Imhvc3RhbGlhcyI7czo5OiJ2cG5fbm9kZXMiO3M6MTU6Imhvc3RkaXNwbGF5bmFtZSI7czoyMzoiU2hhaHJkYXJpLTIwMDAwMjA1NTIyNzkiO3M6OToiaG9zdHN0YXRlIjtzOjQ6IkRPV04iO3M6MTE6Imhvc3RzdGF0ZWlkIjtzOjE6IjEiO3M6MTM6Imxhc3Rob3N0c3RhdGUiO3M6NDoiRE9XTiI7czoxNToibGFzdGhvc3RzdGF0ZWlkIjtzOjE6IjEiO3M6MTM6Imhvc3RzdGF0ZXR5cGUiO3M6NDoiSEFSRCI7czoxNDoiY3VycmVudGF0dGVtcHQiO3M6MToiMSI7czoxMToibWF4YXR0ZW1wdHMiO3M6MToiMSI7czoxMToiaG9zdGV2ZW50aWQiO3M6NDoiMzU3MSI7czoxMzoiaG9zdHByb2JsZW1pZCI7czozOiIzMTEiO3M6MTA6Imhvc3RvdXRwdXQiO3M6NDY6IkNSSVRJQ0FMIC0gMTkyLjE2OC4xMzIuMTc3OiBydGEgbmFuLCBsb3N0IDEwMCUiO3M6MTQ6Imxvbmdob3N0b3V0cHV0IjtiOjA7czo4OiJkYXRldGltZSI7czozMDoiU2F0IFNlcCAyMiAwOTozMTozNSArMDMzMCAyMDE4Ijt9
)
Array
(
    [eventqueue_id] => 1494005
    [event_time] => 2018-09-22 09:30:51
    [event_source] => 2
    [event_type] => 2
    [event_meta] => YToyMzp7czoxNzoibm90aWZpY2F0aW9uLXR5cGUiO3M6NDoiaG9zdCI7czo3OiJjb250YWN0IjtzOjY6InphbXphbSI7czoxMjoiY29udGFjdGVtYWlsIjtzOjEzOiJ0ZXN0QHRlc3QuY29tIjtzOjQ6InR5cGUiO3M6NzoiUFJPQkxFTSI7czo5OiJlc2NhbGF0ZWQiO3M6MToiMCI7czo2OiJhdXRob3IiO2I6MDtzOjg6ImNvbW1lbnRzIjtiOjA7czo0OiJob3N0IjtzOjIzOiJTaGFocmRhcmktMjAwMDAyMDU1MjI4MCI7czoxMToiaG9zdGFkZHJlc3MiO3M6MTU6IjE5Mi4xNjguMTMyLjE3MiI7czo5OiJob3N0YWxpYXMiO3M6OToidnBuX25vZGVzIjtzOjE1OiJob3N0ZGlzcGxheW5hbWUiO3M6MjM6IlNoYWhyZGFyaS0yMDAwMDIwNTUyMjgwIjtzOjk6Imhvc3RzdGF0ZSI7czo0OiJET1dOIjtzOjExOiJob3N0c3RhdGVpZCI7czoxOiIxIjtzOjEzOiJsYXN0aG9zdHN0YXRlIjtzOjQ6IkRPV04iO3M6MTU6Imxhc3Rob3N0c3RhdGVpZCI7czoxOiIxIjtzOjEzOiJob3N0c3RhdGV0eXBlIjtzOjQ6IkhBUkQiO3M6MTQ6ImN1cnJlbnRhdHRlbXB0IjtzOjE6IjEiO3M6MTE6Im1heGF0dGVtcHRzIjtzOjE6IjEiO3M6MTE6Imhvc3RldmVudGlkIjtzOjQ6IjM1NDUiO3M6MTM6Imhvc3Rwcm9ibGVtaWQiO3M6MzoiMjg4IjtzOjEwOiJob3N0b3V0cHV0IjtzOjQ2OiJDUklUSUNBTCAtIDE5Mi4xNjguMTMyLjE3MjogcnRhIG5hbiwgbG9zdCAxMDAlIjtzOjE0OiJsb25naG9zdG91dHB1dCI7YjowO3M6ODoiZGF0ZXRpbWUiO3M6MzA6IlNhdCBTZXAgMjIgMDk6MzE6MzUgKzAzMzAgMjAxOCI7fQ==
)
DELETED LOCKFILE '/usr/local/nagiosxi/var/event_handler.lock'
EVENT HANDLER EXITING
LOCKFILE '/usr/local/nagiosxi/var/event_handler.lock' CREATED
DELETED LOCKFILE '/usr/local/nagiosxi/var/event_handler.lock'
EVENT HANDLER EXITING
LOCKFILE '/usr/local/nagiosxi/var/event_handler.lock' CREATED
DELETED LOCKFILE '/usr/local/nagiosxi/var/event_handler.lock'
EVENT HANDLER EXITING
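
The `event_meta` values in this log are base64-encoded PHP serialized arrays, so they can be decoded to see which notification each event refers to. A small sketch, assuming coreutils `base64`; the helper names are made up for illustration.

```shell
#!/bin/sh
# Decode a base64 event_meta value read from stdin.
decode_event_meta() {
    base64 -d
}

# Print every quoted string (keys and string values) of a PHP-serialized
# array read from stdin, one per line. Entries look like s:<len>:"<text>".
list_strings() {
    grep -o 's:[0-9]*:"[^"]*"' | cut -d'"' -f2
}

# Decode the first 32 characters of the first event_meta value in the log above.
printf '%s' 'YToyMzp7czoxNzoibm90aWZpY2F0aW9u' | decode_event_meta
echo
```

Piping a complete `event_meta` value through `decode_event_meta | list_strings` lists fields such as the host name, host state and plugin output.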

/var/log/mod_gearman2/mod_gearman_neb.log
This log keeps generating the entries below until the event handler exits. After the event handler exits, no more log entries are generated.

Code: Select all

host Job -> 7, 801
[2018-09-22 09:31:39][6349][TRACE] handle_host_check(7)
[2018-09-22 09:31:39][6349][TRACE] ---------------
host Job -> 7, 801
[2018-09-22 09:31:39][6349][TRACE] handle_host_check(7)
[2018-09-22 09:31:39][6349][TRACE] ---------------
host Job -> 7, 801
[2018-09-22 09:31:39][6349][TRACE] handle_host_check(7)
[2018-09-22 09:31:39][6349][TRACE] ---------------
host Job -> 7, 801
[2018-09-22 09:31:39][6349][TRACE] handle_host_check(7)
[2018-09-22 09:31:39][6349][TRACE] ---------------
host Job -> 7, 801
[2018-09-22 09:31:39][6349][TRACE] handle_host_check(7)
[2018-09-22 09:31:39][6349][TRACE] ---------------
host Job -> 7, 801
[2018-09-22 09:31:39][6349][TRACE] handle_host_check(7)
[2018-09-22 09:31:39][6349][TRACE] ---------------
host Job -> 7, 801
[2018-09-22 09:31:39][6349][TRACE] handle_host_check(7)
[2018-09-22 09:31:39][6349][TRACE] ---------------
host Job -> 7, 801
[2018-09-22 09:31:39][6349][TRACE] handle_host_check(7)
[2018-09-22 09:31:39][6349][TRACE] ---------------
host Job -> 7, 801
[2018-09-22 09:31:39][6349][TRACE] handle_host_check(7)
[2018-09-22 09:31:39][6349][TRACE] ---------------
host Job -> 7, 801
[2018-09-22 09:31:39][6349][TRACE] handle_host_check(7)
[2018-09-22 09:31:39][6349][TRACE] ---------------
host Job -> 7, 801
[2018-09-22 09:31:39][6349][TRACE] handle_host_check(7)
[2018-09-22 09:31:39][6349][TRACE] ---------------
host Job -> 7, 801
[2018-09-22 09:31:39][6349][TRACE] handle_host_check(7)
[2018-09-22 09:31:39][6349][TRACE] ---------------
host Job -> 7, 801
[2018-09-22 09:31:39][6349][TRACE] handle_host_check(7)
[2018-09-22 09:31:39][6349][TRACE] ---------------
host Job -> 7, 801
[2018-09-22 09:31:39][6349][TRACE] handle_host_check(7)
[2018-09-22 09:31:39][6349][TRACE] ---------------
host Job -> 7, 801
[2018-09-22 09:31:39][6349][TRACE] handle_host_check(7)
[2018-09-22 09:31:39][6349][TRACE] ---------------
host Job -> 7, 801

The other log files just show that nothing has been processed since the event handler exited.

I'm following up on the issue and will share the result with you ASAP.

Thanks
tgriep
Madmin
Posts: 9177
Joined: Thu Oct 30, 2014 9:02 am

Re: Nagios XI 5.5.3 and Mod_Gearman compatibility

Post by tgriep »

The timeout errors from the Worker log files suggest that the worker cannot connect to the Gearman server.

Make sure port 4730 is open on the Gearman server and that the IP address in the worker configuration file is that of the Gearman server.
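
A quick way to test this from the worker host is to attempt a TCP connection to the job server. The sketch below assumes `bash` and coreutils `timeout` are installed; `port_open` is a hypothetical helper name.

```shell
#!/bin/sh
# Return 0 if a TCP connection to host:port succeeds within 5 seconds.
# Relies on bash's /dev/tcp pseudo-device, so bash must be installed.
port_open() {
    timeout 5 bash -c 'exec 3<>"/dev/tcp/$0/$1"' "$1" "$2" 2>/dev/null
}

# Usage: sh check_port.sh 10.47.12.95 4730
if [ "$#" -eq 2 ]; then
    if port_open "$1" "$2"; then
        echo "$1:$2 is reachable"
    else
        echo "$1:$2 is NOT reachable"
    fi
fi
```

If the gearmand client tools are available on the worker, `gearadmin --status` pointed at the same host and port may give a more detailed view of the queues.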
salami
Posts: 30
Joined: Tue Jun 26, 2018 4:36 am

Re: Nagios XI 5.5.3 and Mod_Gearman compatibility

Post by salami »

I'm sure about the connection, the port, and the IP address in the config files.

I reinstalled the OS with CentOS 6, installed Nagios XI 5.4.13 instead of 5.5.3, and integrated mod-gearman with it again, and everything is OK in this setup. It seems that when we downgrade Core on Nagios XI 5.5.3 from 4.4.2 to 4.2.4, something goes wrong, and Nagios XI 5.5.3 does not appear to be compatible with mod-gearman (after downgrading Core to 4.2.4). So I concluded that I should use an older version of Nagios XI if I need large-scale monitoring and need mod-gearman to distribute the checks.

Suggestions:
OS: CentOS 6
Nagios XI version: 5.4.13
mod-gearman version: 2.1.1-1, with gearman server 0.33

One last question: is there a way to upgrade an old version such as Nagios XI 5.4.13 to the latest one without upgrading Core?


Thanks
tgriep
Madmin
Posts: 9177
Joined: Thu Oct 30, 2014 9:02 am

Re: Nagios XI 5.5.3 and Mod_Gearman compatibility

Post by tgriep »

When upgrading to XI 5.5.4, the upgrade script looks in the nagios.cfg file for the Mod Gearman broker module settings, and if it finds them, it will not upgrade Core during the upgrade.
When upgrading through the web interface, this happens automatically. During a manual upgrade, it asks you whether you still want to upgrade Core.
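
Before starting an upgrade, the same condition can be checked by hand. This is only an approximation of what the upgrade script looks for, since its exact logic is not shown here; the nagios.cfg path in the usage line and the function name are assumptions.

```shell
#!/bin/sh
# Return 0 if the given nagios.cfg has an active broker_module line for Mod-Gearman.
has_gearman_broker() {
    grep -Eq '^[[:space:]]*broker_module=.*mod_gearman' "$1"
}

# Usage: sh check_broker.sh /usr/local/nagios/etc/nagios.cfg
if [ "$#" -eq 1 ]; then
    if has_gearman_broker "$1"; then
        echo "Mod-Gearman broker line found"
    else
        echo "No Mod-Gearman broker line found"
    fi
fi
```

A commented-out broker line does not count as a match in this sketch.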
salami
Posts: 30
Joined: Tue Jun 26, 2018 4:36 am

Re: Nagios XI 5.5.3 and Mod_Gearman compatibility

Post by salami »

Thumbs up.
Issue resolved. I upgraded Nagios XI 5.4.13 to 5.5.4 and now everything is working fine.


So we can draw a conclusion:
If we have a large-scale environment and need mod-gearman, we should install an older version that is compatible with mod-gearman (such as 5.4.13) and integrate it with the latest version of mod-gearman; then we can upgrade to the latest version of Nagios XI. We should not install the latest version of Nagios XI and then downgrade Core to an older version.

Thank you so much. Please lock the topic.
Locked