host check orphaned, is the mod-gearman worker on queue

Post by **rtsupport** » Mon Apr 02, 2018 8:08 am

OS == Linux Server release 6.9
Nagios == Nagios XI 5.4.13

We are facing an issue with Windows servers only: when adding a Windows server, we get an error.

-- Adding server using CCM > host > New Host

Using this method, we get an error on the host (host check orphaned, is the mod-gearman worker on queue 'hostgroup_admin_locale_wb_xgi' running?). All service checks are scheduled, but the last check is not updating.

I have attached a screenshot of the error.

-- Adding server using Config Wizard.

Using this method, the server is in an OK state, but all services are in an UNKNOWN state with the error (Server port must be an integer).

I have attached a screenshot of this as well.

We have checked from both the terminal and the GUI, and we are able to query the NSClient++ version:
[nagios@usa** libexec]$ ./check_nt -H usa**** -p 12489 -v CLIENTVERSION
NSClient++ 0.4.3.88 2015-01-11

[nagios@us***** libexec]$ ./check_nrpe -H ********
I (0.4.3.88 2015-01-11) seem to be doing fine...

The collector server status in gearman_top looks fine:

2018-04-02 08:54:01 - localhost:4730 - v1.1.12

Queue Name | Worker Available | Jobs Waiting | Jobs Running
-----------------------------------------------------------------------------------------
check_results | 5 | 0 | 0
hostgroup_admin_locale_lv_xgi | 368 | 0 | 0
hostgroup_admin_locale_rm_xgi | 368 | 0 | 0
hostgroup_admin_locale_wb_epn | 368 | 0 | 0
hostgroup_admin_locale_wb_xgi | 368 | 0 | 0
hostgroup_admin_locale_wv_epn | 368 | 0 | 0
hostgroup_admin_locale_wv_xgi | 368 | 0 | 0
hostgroup_nagios_infrastructure_XGI | 368 | 0 | 0
hostgroup_nagios_infrastructure_xgi_wb | 368 | 0 | 0
hostgroup_nagios_infrastructure_xgi_wv | 368 | 0 | 0
worker_************************ | 1 | 0 | 0
-----------------------------------------------------------------------------------------

We have also checked the suggestions in this article:
https://support.nagios.com/kb/article/n ... ng-19.html

1. The check is failing to be scheduled or executed -- verified

2. ndo2db is failing to insert the check result into the "nagios" mysql database -- no errors in the mysqld logs

3. Check for multiple Nagios processes -- multiple processes are running

Please advise.

Post by **cdienger** » Mon Apr 02, 2018 1:37 pm

Can you provide the output of the "ps -ef | grep nagios.cfg | grep -v grep" command please? If you are seeing multiple parent processes then you should stop one with:

service nagios stop
killall -9 nagios
service nagios start
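
As a rough sketch of how to read that ps output: a daemonized Nagios parent normally has PPID 1, while its child has the parent's PID as its PPID, so counting matching lines whose third column is 1 shows whether more than one parent is running. The function name count_nagios_parents is made up for illustration, and the PPID-1 assumption may not hold under other init systems or inside containers.

```shell
# Hypothetical helper: reads `ps -ef` output on stdin and prints how many
# Nagios daemons (lines mentioning nagios.cfg) have a parent PID of 1.
count_nagios_parents() {
    awk '/nagios\.cfg/ && $3 == 1 { n++ } END { print n + 0 }'
}

# Usage (run on the XI server):
#   ps -ef | count_nagios_parents
# A result greater than 1 suggests duplicate parent processes.
```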

Post by **rtsupport** » Tue Apr 03, 2018 4:53 am

Only two processes are running, the parent and the child:

[nagios@usa0******** ~]$ ps -ef | grep nagios.cfg | grep -v grep
nagios 6407 1 0 Apr02 ? 00:01:57 /usr/local/nagios/bin/nagios -d /usr/local/nagios/etc/nagios.cfg
nagios 6525 6407 0 Apr02 ? 00:00:05 /usr/local/nagios/bin/nagios -d /usr/local/nagios/etc/nagios.cfg

Post by **cdienger** » Tue Apr 03, 2018 11:29 am

Should the check be sent to a remote worker or should it be executed on the local XI machine?

The orphan message means that the command is sent to the hostgroup_admin_locale_wb_xgi queue but there isn't a worker configured to pull from it.
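
One way to check this from the command line is to look up the "Worker Available" column for the queue in a gearman_top table like the one posted earlier in this thread. The sketch below only parses that pipe-separated layout; the function name queue_workers is hypothetical, and piping from gearman_top in batch mode (-b) is an assumption about your build.

```shell
# Hypothetical helper: reads a gearman_top status table on stdin and prints
# the "Worker Available" count for the queue named in $1 (nothing if absent).
queue_workers() {
    awk -F'|' -v q="$1" '
        {
            name = $1; workers = $2
            gsub(/^[ \t]+|[ \t]+$/, "", name)   # trim the queue name
            gsub(/[ \t]/, "", workers)          # strip padding around the count
            if (name == q) print workers
        }'
}

# Usage, assuming gearman_top supports batch mode (-b) on your system:
#   gearman_top -b | queue_workers hostgroup_admin_locale_wb_xgi
# An empty result or 0 would mean no worker is registered for that queue.
```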

The port message means that the check_nt command isn't configured properly. Specifically, the port passed with the -p option isn't an integer.
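
To illustrate what "isn't an integer" can look like in practice, here is a minimal sketch of a port check. It assumes, without confirmation from this thread, that the usual cause is an argument that arrives empty or as an unexpanded macro such as a literal $ARG1$. The function name is_valid_port is hypothetical.

```shell
# Hypothetical helper: succeeds only if $1 is a whole number in 1..65535.
is_valid_port() {
    case "$1" in
        ''|*[!0-9]*) return 1 ;;   # empty, or contains a non-digit
    esac
    [ "$1" -ge 1 ] && [ "$1" -le 65535 ]
}

# Usage: test the value that ends up after -p in the check_nt command line.
#   is_valid_port 12489 && echo "port looks fine"
#   is_valid_port '$ARG1$' || echo "macro was not expanded"
```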

https://support.nagios.com/kb/article/n ... s-484.html covers mod gearman workers and queues. Please review this and, if you still have problems, PM me a profile and also provide the worker.conf (from all servers) and module.conf. Please also indicate the names of the hosts you're configuring these checks for and which state they're currently in (orphan or bad integer).

Post by **gormank** » Wed Apr 04, 2018 11:15 am

All this is just FYI. No need to respond.

I ran into the following message this morning, which is basically the same as that in post #1 above: "host check orphaned, is the mod-gearman worker on queue 'host' running?"

I restarted gearmand, then set up a monthly cron job:

cat /etc/cron.monthly/nagios
# redirect stdout first, then stderr, so that both are discarded
service gearmand stop > /dev/null 2>&1
service gearmand start > /dev/null 2>&1
service nagios restart > /dev/null 2>&1

Then I made the script executable:
chmod 755 /etc/cron.monthly/nagios

The Nagios host had been up for 98 days, and all checks had been failing for 13 days. Apparently this was due to gearmand being half dead and not forwarding messages to Nagios. The nagios restart isn't really needed in the cron job (as far as I know), but I put it there just to be sure.

This system is old at 2014R2.6, and is the only one using gearman...

Post by **cdienger** » Wed Apr 04, 2018 2:51 pm

Thanks for the input @gormank.

@rtsupport, please let us know if you have any updates or questions.

Post by **rtsupport** » Thu Apr 05, 2018 7:18 am

Thank you @gormank for the suggestion.

We had already restarted the full Nagios stack in the sequence below, which includes a gearmand restart, but that did not resolve the issue. Since yesterday we have also seen one Linux server go into the same error ("Host check orphaned, is the mod-gearman worker on queue 'host' running?"). Earlier it was specific to Windows servers only.

Full restart -
service nagios stop
sudo /sbin/service httpd stop
sudo /etc/init.d/npcd stop
service ndo2db stop
sudo /sbin/service mysqld stop
sudo /etc/init.d/postgresql stop
sudo /sbin/service gearmand stop

Then start all services in reverse order.

@cdienger

Just to add: we recently upgraded our system from Nagios XI 2014R2.7 to Nagios XI 5.4.13. We have also compared module.conf and worker.conf with PRD, and both are identical. For your reference, I have attached both config files in a PM to you.

Post by **cdienger** » Thu Apr 05, 2018 3:47 pm

hostgroup_admin_locale_wb_xgi isn't found in the mod_gearman_worker.conf. Set the hostgroups line to:

hostgroups=nagios_infrastructure_xgi_wv,nagios_infrastructure_xgi_wb,nagios_infrastructure_XGI,admin_locale_wb_epn,admin_locale_wb_xgi,admin_locale_wv_epn,admin_locale_wv_xgi,admin_locale_rm_xgi,admin_locale_lv_xgi, hostgroup_admin_locale_wb_xgi

then restart the worker and let us know your results.

Post by **rtsupport** » Fri Apr 06, 2018 9:51 am

Hi,

I can see that the group (admin_locale_wb_xgi) is already present in the hostgroups line, at line 60 of the file:

--------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
# sets a list of hostgroups which this worker will work
# on. Either specify a comma seperated list or use
# multiple lines.
#hostgroups={comma delimited list of hostgroups that are checked by gearman}
#hostgroups=nagios_infrastructure_XGI,admin_locale_wb_epn,admin_locale_wb_xgi,admin_locale_wv_epn,admin_locale_wv_xgi
hostgroups=nagios_infrastructure_xgi_wv,nagios_infrastructure_xgi_wb,nagios_infrastructure_XGI,admin_locale_wb_epn,admin_locale_wb_xgi,admin_locale_wv_epn,admin_locale_wv_xgi,admin_locale_rm_xgi,admin_locale_lv_xgi
--------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
I have attached the gearman_top output as well, which also confirms that the group is added.
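
For anyone wanting to double-check this kind of thing without reading the file by eye, the following is a minimal sketch that tests whether a hostgroup name appears on an active (uncommented) hostgroups= line. The function name worker_has_hostgroup and the config path in the usage comment are assumptions. Note that in the output above the queue names carry a hostgroup_ prefix while the config lists the bare hostgroup names; the sketch simply checks whichever name it is given.

```shell
# Hypothetical helper: reads a worker config on stdin and succeeds if the
# hostgroup named in $1 is listed on an uncommented hostgroups= line.
worker_has_hostgroup() {
    grep -E '^[[:space:]]*hostgroups[[:space:]]*=' |
        sed -e 's/^[^=]*=//' |
        tr ',' '\n' |
        sed -e 's/^[[:space:]]*//' -e 's/[[:space:]]*$//' |
        grep -Fxq -- "$1"
}

# Usage (the path is an assumption; adjust it to your install):
#   worker_has_hostgroup admin_locale_wb_xgi < /etc/mod_gearman/mod_gearman_worker.conf \
#       && echo "configured" || echo "missing"
```

Matching whole entries (grep -x) keeps a short name such as admin_locale from matching a longer one by accident.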

I also noticed that all servers that are working fine have the host group "nagios_infrastructure". When I added this group to a server that was not working, it started working. When I add any host group listed in the worker file, I get the error on the host (host check orphaned...) and all service checks stop.

I also noticed that server X had the host group admin_locale_wb_xgi, and when I changed it to admin_locale_wv_xgi, it still throws the error for the old group (please see the attached error for a better understanding). When I add the host group "nagios_infrastructure", it starts working fine.

Post by **cdienger** » Fri Apr 06, 2018 3:52 pm

The worker is configured with two servers:

server=xx.xx.xx.31:4730
server=xx.xxx.xxx.37:4730

Was the neb config provided from xx.xx.xxx.31? Is there only one worker? Try editing the worker config so that it only has one server line and then restart both the server and worker services.
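
A quick way to confirm how many server lines a worker config actually has is to list the active ones. This is only a sketch: the function name list_gearman_servers and the path in the usage comment are assumptions, and it does not split a comma-separated list on a single server= line.

```shell
# Hypothetical helper: reads a worker config on stdin and prints the value of
# each uncommented server= line, one per line.
list_gearman_servers() {
    sed -n -e 's/^[[:space:]]*server[[:space:]]*=[[:space:]]*//p'
}

# Usage (the path is an assumption; adjust it to your install):
#   list_gearman_servers < /etc/mod_gearman/mod_gearman_worker.conf
# More than one line of output means the worker polls several gearmand servers.
```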
