mod_gearman orphaned status on nagios 4.1

Posted: Wed Nov 25, 2015 7:53 am
by yesilyurtav
Hello all,
We are testing mod_gearman with Nagios 4.1, so far with only a few hosts. The scenario is as follows.

Sometimes monitored hosts stop responding to SNMP or go down temporarily. After a while (10-15 minutes), the worker cannot handle the results of these service checks, and the Nagios GUI reports the services as orphaned. That is expected, because the worker cannot reach the host due to the SNMP problem.

Later the monitored host comes back and SNMP responds normally again, but the status of these services in Nagios stays "orphaned" unless I delete retention.dat and restart Nagios, or manually click "Re-schedule the next check of this service" in the Nagios GUI.

What could the problem be?

From nagios config,

check_for_orphaned_services=1
check_for_orphaned_hosts=1

check_service_freshness=1
retain_state_information=1
retention_update_interval=60
use_retained_program_state=1
use_retained_scheduling_info=1

but

obsess_over_services=0
obsess_over_hosts=0

Maybe I have to enable the obsess options?
If I enable them, what should the ocsp_command option be?

Regards.

Re: mod_gearman orphaned status on nagios 4.1

Posted: Wed Nov 25, 2015 2:19 pm
by jolson
Are you certain that you don't have multiple Nagios instances running on your primary Nagios box?

ps -ef | grep /bin/nag
You should only see one process with a PPID of 1. Multiple Nagios instances can cause the orphan issue.
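
The PPID check can also be scripted rather than read by eye. This is a minimal sketch, not an official tool: the function name count_nagios_roots is made up for illustration, and it assumes the binary path ends in /bin/nagios and that input comes from `ps -eo pid,ppid,args` (PID, PPID, then the command line).

```shell
#!/bin/sh
# Hypothetical helper: count Nagios processes whose parent is init (PPID 1).
# Expects `ps -eo pid,ppid,args` style lines on stdin:
#   PID PPID COMMAND [ARGS...]
count_nagios_roots() {
    # $2 is the PPID, $3 is the executable path.
    # "n + 0" makes awk print 0 instead of an empty line when nothing matches.
    awk '$2 == 1 && $3 ~ /\/bin\/nagios$/ { n++ } END { print n + 0 }'
}
```

Run it as `ps -eo pid,ppid,args | count_nagios_roots`. A result of 1 means a single Nagios daemon; anything higher suggests more than one instance was started.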

Re: mod_gearman orphaned status on nagios 4.1

Posted: Thu Nov 26, 2015 6:50 am
by yesilyurtav
Output is,

nagios 26090 1 0 Nov25 ? 00:01:11 /appl/nagios/bin/nagios -d /appl/nagios/etc/nagios.cfg
nagios 26092 26090 0 Nov25 ? 00:00:00 /appl/nagios/bin/nagios --worker /appl/nagios/var/rw/nagios.qh
nagios 26093 26090 0 Nov25 ? 00:00:00 /appl/nagios/bin/nagios --worker /appl/nagios/var/rw/nagios.qh
nagios 26094 26090 0 Nov25 ? 00:00:00 /appl/nagios/bin/nagios --worker /appl/nagios/var/rw/nagios.qh
nagios 26095 26090 0 Nov25 ? 00:00:00 /appl/nagios/bin/nagios --worker /appl/nagios/var/rw/nagios.qh
nagios 26096 26090 0 Nov25 ? 00:00:00 /appl/nagios/bin/nagios --worker /appl/nagios/var/rw/nagios.qh
nagios 26097 26090 0 Nov25 ? 00:00:00 /appl/nagios/bin/nagios --worker /appl/nagios/var/rw/nagios.qh
nagios 26098 26090 0 Nov25 ? 00:00:00 /appl/nagios/bin/nagios --worker /appl/nagios/var/rw/nagios.qh
nagios 26099 26090 0 Nov25 ? 00:00:06 /appl/nagios/bin/nagios -d /appl/nagios/etc/nagios.cfg

How can I run only one Nagios instance? Is there a setting for this in nagios.cfg?

Regards.

Re: mod_gearman orphaned status on nagios 4.1

Posted: Thu Nov 26, 2015 7:29 pm
by Box293
yesilyurtav wrote: Output is,

nagios 26090 1 0 Nov25 ? 00:01:11 /appl/nagios/bin/nagios -d /appl/nagios/etc/nagios.cfg
nagios 26092 26090 0 Nov25 ? 00:00:00 /appl/nagios/bin/nagios --worker /appl/nagios/var/rw/nagios.qh
nagios 26093 26090 0 Nov25 ? 00:00:00 /appl/nagios/bin/nagios --worker /appl/nagios/var/rw/nagios.qh
nagios 26094 26090 0 Nov25 ? 00:00:00 /appl/nagios/bin/nagios --worker /appl/nagios/var/rw/nagios.qh
nagios 26095 26090 0 Nov25 ? 00:00:00 /appl/nagios/bin/nagios --worker /appl/nagios/var/rw/nagios.qh
nagios 26096 26090 0 Nov25 ? 00:00:00 /appl/nagios/bin/nagios --worker /appl/nagios/var/rw/nagios.qh
nagios 26097 26090 0 Nov25 ? 00:00:00 /appl/nagios/bin/nagios --worker /appl/nagios/var/rw/nagios.qh
nagios 26098 26090 0 Nov25 ? 00:00:00 /appl/nagios/bin/nagios --worker /appl/nagios/var/rw/nagios.qh
nagios 26099 26090 0 Nov25 ? 00:00:06 /appl/nagios/bin/nagios -d /appl/nagios/etc/nagios.cfg

How can I run only one Nagios instance? Is there a setting for this in nagios.cfg?

Regards.

This output is normal: the first process has PID 26090, and all the others are children of PID 26090.

Can you post your mod-gearman server and worker config files, please?

Re: mod_gearman orphaned status on nagios 4.1

Posted: Fri Nov 27, 2015 5:13 am
by yesilyurtav
On the Nagios server, the NEB module configuration is:

eventhandler=yes
hosts=yes
do_hostchecks=yes
encryption=no
use_uniq_jobs=on
### neb module config ###
localhostgroups=localhost-server
result_workers=10
perfdata=yes
perfdata_mode=1
orphan_host_checks=yes
orphan_service_checks=yes
accept_clear_results=no

Worker config:

debug-result=yes
eventhandler=yes
services=yes
hosts=yes
encryption=no
job_timeout=60
min-worker=5
max-worker=500
idle-timeout=30
max-jobs=1000
spawn-rate=1
fork_on_exec=no
load_limit1=0
load_limit5=0
load_limit15=0
show_error_output=yes
enable_embedded_perl=on
use_embedded_perl_implicitly=off
use_perl_cache=on
p1_file=/appl/mod_gearman/share/mod_gearman2/mod_gearman_p1.pl
workaround_rc_25=off

On the Nagios server I installed mod_gearman 1.4, which is compatible with Nagios 4 only, but on the workers I installed mod_gearman worker 2.1.5.
gearmand version 0.25 is installed.

Regards.

Re: mod_gearman orphaned status on nagios 4.1

Posted: Sun Nov 29, 2015 9:46 pm
by Box293
yesilyurtav wrote: On the Nagios server I installed mod_gearman 1.4, which is compatible with Nagios 4 only, but on the workers I installed mod_gearman worker 2.1.5.
gearmand version 0.25 is installed.

Where did you get the 1.4 version from? Are you following a specific guide?

The version used in this guide is the minimum I would use on core 4:
http://sites.box293.com/nagios/guides/m ... -nagios-xi