Re: host check orphaned

Posted: Fri Mar 13, 2015 1:50 pm
by bosecorp
The only way I know to run it from the gearman server is through the GUI: go to home==>host details, open the host, and force a check.

is there a way to do it from the command line?

Re: host check orphaned

Posted: Fri Mar 13, 2015 1:56 pm
by jdalrymple
jdalrymple wrote: If using XI defaults, log in to the gearman server and execute:

/usr/local/nagios/libexec/check_icmp -H hostip -p 5
where hostip is the IP address of the host to be monitored. Your path may differ from the one I included, but check_icmp should be the plugin being used.

Re: host check orphaned

Posted: Fri Mar 13, 2015 1:58 pm
by bosecorp
it works from the command line

# /usr/local/nagios/libexec/check_icmp -H 10.103.120.12 -p 5
OK - 10.103.120.12: rta 2.395ms, lost 0%|rta=2.395ms;200.000;500.000;0; pl=0%;40;80;;
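
For anyone comparing manual results across several hosts, the plugin output above can be split into its status text and performance data. This is only a minimal sketch: the function name parse_plugin_output is made up for illustration, and it assumes simple perfdata labels without quotes or spaces.

```python
def parse_plugin_output(line):
    """Split one line of Nagios plugin output into status text and perfdata.

    Assumes simple labels (no quoted labels containing spaces).
    """
    text, _, perf = line.partition("|")
    metrics = {}
    for item in perf.split():
        label, _, rest = item.partition("=")
        fields = rest.split(";")
        metrics[label] = {
            "value": fields[0],
            "warn": fields[1] if len(fields) > 1 else "",
            "crit": fields[2] if len(fields) > 2 else "",
        }
    return text.strip(), metrics


# The line pasted in the post above.
sample = ("OK - 10.103.120.12: rta 2.395ms, lost 0%"
          "|rta=2.395ms;200.000;500.000;0; pl=0%;40;80;;")
status, metrics = parse_plugin_output(sample)
print(status)
print(metrics["rta"])
print(metrics["pl"])
```

Here the round-trip time and packet loss are both well under their warning thresholds, which matches the OK status.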

Re: host check orphaned

Posted: Fri Mar 13, 2015 2:04 pm
by jdalrymple
Earlier you sent us a tail of your gearman log, but I don't think it was long enough, and the problem may be visible in the full log. Can you zip up and PM us a copy of your gearman log from a server with failing host checks?

Gearman log is here:

/var/log/gearmand.log
Thanks

Re: host check orphaned

Posted: Fri Mar 13, 2015 2:28 pm
by bosecorp
I have the log file. How do I send you the file?

Edit: never mind, I just PM'd you the file.

What I am noticing with some devices is that if I force the check using the GUI, it will become green, but then after a while it will become orphaned again.

I am going to PM you the mod_gearman log.

Question: do I need to update gearmand as well? I only updated mod_gearman.

Re: host check orphaned

Posted: Mon Mar 16, 2015 1:07 pm
by jdalrymple
bosecorp

Nothing in your full log points anywhere other than the path we've been going down all along: the gearman worker simply isn't returning results. But you said the check works from the gearman server, so the next step is to look at mod_gearman_worker.log on the gearman worker server. Can you share the last 20 or so lines of it?

Re: host check orphaned

Posted: Mon Mar 16, 2015 1:44 pm
by bosecorp
[2015-03-15 15:33:57][4468][INFO ] no checks in 2minutes, restarting all workers
[2015-03-15 15:35:58][4468][INFO ] no checks in 2minutes, restarting all workers
[2015-03-15 15:37:59][4468][INFO ] no checks in 2minutes, restarting all workers
[2015-03-15 15:40:00][4468][INFO ] no checks in 2minutes, restarting all workers
[2015-03-15 15:42:01][4468][INFO ] no checks in 2minutes, restarting all workers
[2015-03-15 15:44:02][4468][INFO ] no checks in 2minutes, restarting all workers
[2015-03-15 15:46:03][4468][INFO ] no checks in 2minutes, restarting all workers
[2015-03-15 15:48:04][4468][INFO ] no checks in 2minutes, restarting all workers
[2015-03-15 15:50:05][4468][INFO ] no checks in 2minutes, restarting all workers
[2015-03-15 15:52:06][4468][INFO ] no checks in 2minutes, restarting all workers
[2015-03-15 15:54:07][4468][INFO ] no checks in 2minutes, restarting all workers
[2015-03-15 15:56:08][4468][INFO ] no checks in 2minutes, restarting all workers
[2015-03-15 15:58:09][4468][INFO ] no checks in 2minutes, restarting all workers
[2015-03-15 16:15:29][4468][INFO ] mod_gearman worker exited
[2015-03-15 16:18:27][2842][INFO ] mod_gearman worker daemon started with pid 2842
[2015-03-15 16:18:28][2842][INFO ] no checks in 2minutes, restarting all workers
[2015-03-15 16:20:29][2842][INFO ] no checks in 2minutes, restarting all workers
[2015-03-15 16:21:19][2842][INFO ] mod_gearman worker exited
[2015-03-15 16:22:19][4142][INFO ] mod_gearman worker daemon started with pid 4142
[2015-03-15 16:22:20][4142][INFO ] no checks in 2minutes, restarting all workers
[2015-03-16 01:51:37][4142][INFO ] no checks in 2minutes, restarting all workers
[2015-03-16 01:53:38][4142][INFO ] no checks in 2minutes, restarting all workers
[2015-03-16 01:55:39][4142][INFO ] no checks in 2minutes, restarting all workers
[2015-03-16 01:57:40][4142][INFO ] no checks in 2minutes, restarting all workers
[2015-03-16 01:59:41][4142][INFO ] no checks in 2minutes, restarting all workers
[2015-03-16 02:01:42][4142][INFO ] no checks in 2minutes, restarting all workers
[2015-03-16 02:03:43][4142][INFO ] no checks in 2minutes, restarting all workers
[2015-03-16 02:05:44][4142][INFO ] no checks in 2minutes, restarting all workers
[2015-03-16 02:07:45][4142][INFO ] no checks in 2minutes, restarting all workers
[2015-03-16 02:09:46][4142][INFO ] no checks in 2minutes, restarting all workers
[2015-03-16 02:11:47][4142][INFO ] no checks in 2minutes, restarting all workers
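
A log like the one above is easier to read when summarized per daemon pid. The following is a rough sketch, assuming the [timestamp][pid][level] message layout seen in this paste; the helper name summarize is hypothetical and the sample lines are copied from the post.

```python
import re
from collections import Counter

# Matches lines such as:
# [2015-03-15 15:58:09][4468][INFO ] no checks in 2minutes, restarting all workers
LINE_RE = re.compile(r"\[([^\]]+)\]\[(\d+)\]\[(\w+)\s*\] (.*)")


def summarize(lines):
    """Count idle-restart messages per pid and the number of daemon starts."""
    idle = Counter()
    starts = 0
    for line in lines:
        match = LINE_RE.match(line.strip())
        if not match:
            continue  # skip anything that is not in the expected layout
        _timestamp, pid, _level, message = match.groups()
        if message.startswith("no checks in"):
            idle[pid] += 1
        elif "daemon started" in message:
            starts += 1
    return idle, starts


sample = [
    "[2015-03-15 15:58:09][4468][INFO ] no checks in 2minutes, restarting all workers",
    "[2015-03-15 16:15:29][4468][INFO ] mod_gearman worker exited",
    "[2015-03-15 16:18:27][2842][INFO ] mod_gearman worker daemon started with pid 2842",
    "[2015-03-15 16:18:28][2842][INFO ] no checks in 2minutes, restarting all workers",
]
idle, starts = summarize(sample)
print(dict(idle), starts)
```

If every pid shows only idle restarts and no other activity, the worker is running but never receiving jobs, which is consistent with it not reaching the job server.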

Re: host check orphaned

Posted: Mon Mar 16, 2015 1:57 pm
by jdalrymple
Which worker is this? From the worker server, what is the output of:

check_gearman -H nagmonus1 -q worker_`hostname`

Re: host check orphaned

Posted: Mon Mar 16, 2015 2:39 pm
by bosecorp
# check_gearman -H nagmonus1 -q worker_`hostname`
check_gearman WARNING - failed to connect to nagmonus1:4730 - No route to host
Queue worker_nagmonus1 not found
root@nagmonus1:(03-16 15:37): /root

that doesn't sound good.

Re: host check orphaned

Posted: Mon Mar 16, 2015 2:42 pm
by jdalrymple
It may be bad, or it may not be. When you specified your job server, did you use the hostname or the IP address? If you used the IP address, you can simply replace nagmonus1 with the proper IP in the command I supplied earlier. If you used nagmonus1 in your mod_gearman_worker.conf, though, this machine definitely won't be able to report back to the job server.

Does that make sense?
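
To separate a name or routing problem from a gearman problem, a plain TCP connection test against the job server port can help. This is a sketch only: can_connect is a hypothetical helper, and port 4730 is taken from the check_gearman error message earlier in the thread.

```python
import socket


def can_connect(host, port=4730, timeout=3.0):
    """Try a plain TCP connection; return (success, detail message).

    Name-resolution failures, refused connections, missing routes and
    timeouts all surface as OSError subclasses.
    """
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True, "connected"
    except OSError as exc:
        return False, str(exc)
```

Run from the worker server, can_connect("nagmonus1") and the same call with the job server's IP address can be compared. If only the IP address works, the hostname is the problem; if neither works, look at routing or a firewall between the two machines.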