Fresh out of Advanced training and need help (modGearman)

This support forum board is for support questions relating to Nagios XI, our flagship commercial network monitoring solution.
dlukinski
Posts: 1130
Joined: Tue Oct 06, 2015 9:42 am

Fresh out of Advanced training and need help (modGearman)

Post by dlukinski »

Hello Nagios Support

I am fresh out of the Advanced class (I could not get Mod-Gearman to work in it) and need your help with installing Mod-Gearman:

First, I am trying to set up distributed monitoring, so I installed CentOS on a remote machine and ran the script provided by MikeWeber to set up a worker on it.
- What do I do on XI to make it talk to the remote worker? The explanations did not work for me, and no install was shown :-\

Again, we are looking at a load-balanced / distributed checks scenario, where we would install 5-10 proxies (mod_gearman) to offload checks from XI (or run them from another geographic location). We were told "it is easy, just follow the manual", but it is neither easy nor clear.

Thank you
Box293
Too Basu
Posts: 5126
Joined: Sun Feb 07, 2010 10:55 pm
Location: Deniliquin, Australia

Re: Fresh out of Advanced training and need help (modGearman

Post by Box293 »

This guide should be followed for installing Mod-Gearman in XI and also on the workers:

https://support.nagios.com/kb/article.php?id=225

This guide should answer all of your other questions:

https://support.nagios.com/kb/article.php?id=484
As of May 25th, 2018, all communications with Nagios Enterprises and its employees are covered under our new Privacy Policy.
dlukinski
Posts: 1130
Joined: Tue Oct 06, 2015 9:42 am

Re: Fresh out of Advanced training and need help (modGearman

Post by dlukinski »

Thank you (Great manual)

Two important questions:

1) In training we were told to install (and keep upgrading) Nagios and custom plugins on the worker (if used).
Your manual does not mention installing plugins?

2) What if I do not want the XI-side Mod-Gearman to run anything (only to use it for connecting to a number of remote workers that run checks from/for remote locations)?
- How do I edit the XI-side conf file? (NO to hosts and services?)
bheden
Product Development Manager
Posts: 179
Joined: Thu Feb 13, 2014 9:50 am
Location: Nagios Enterprises

Re: Fresh out of Advanced training and need help (modGearman

Post by bheden »

1) You install the plugins just as you would on a normal Nagios instance. ModGearman takes the checks from Nagios and executes them remotely (as if they were local) - so as long as everything is the same on the remote worker, you won't have any issues.

2) If you don't want the worker that is present on the Nagios XI box to perform any checks, then disable it: run service mod-gearman2-worker stop or service mod_gearman_worker stop, depending on the version you're using. Make sure to disable it from running at boot time as well (chkconfig mod-gearman2-worker off or chkconfig mod_gearman_worker off).

2a) There are exceptions, though. Sometimes you need the XI server to handle some of its own checks (localhost, for example). If that is the case, then in the ModGearman module configuration you need to change settings like "localhostgroups" or "localservicegroups", and make sure that the hosts/services you don't want ModGearman to check are in those hostgroups or servicegroups.
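
As a rough sketch of 2a, the relevant lines in the ModGearman module configuration on the XI server might look like the following. The group names xi-local-hosts and xi-local-services are made up for illustration; use whichever hostgroups/servicegroups you actually put those objects in.

```cfg
# module.conf on the Nagios XI server (sketch; group names are hypothetical)

# Host checks for members of these hostgroups are NOT handed to ModGearman;
# Nagios on the XI box runs them itself.
localhostgroups=xi-local-hosts

# Same idea for service checks that belong to these servicegroups.
localservicegroups=xi-local-services
```

Assuming the module is otherwise set to distribute host and service checks, everything that is not in one of those groups still goes into the ModGearman queues, so it needs at least one running worker to pick it up.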

Hope this helps.

Nagios Enterprises
Senior Developer
Box293
Too Basu
Posts: 5126
Joined: Sun Feb 07, 2010 10:55 pm
Location: Deniliquin, Australia

Re: Fresh out of Advanced training and need help (modGearman

Post by Box293 »

dlukinski wrote:1) In training we were told to install (and keep upgrading) Nagios and custom plugins on the worker (if used).
Your manual does not mention installing plugins?
Good feedback, I'll get the manual updated.
dlukinski wrote: 2) What if I do not want the XI-side Mod-Gearman to run anything (only to use it for connecting to a number of remote workers that run checks from/for remote locations)?
- How do I edit the XI-side conf file? (NO to hosts and services?)
In addition to what @bheden said ...
First, you must be absolutely sure there are no checks that need to be executed locally.
On your XI server, simply configure the mod-gearman2-worker service to not start. If it never starts, it can never execute checks; only the connected workers will execute checks.
dlukinski
Posts: 1130
Joined: Tue Oct 06, 2015 9:42 am

Re: Fresh out of Advanced training and need help (modGearman

Post by dlukinski »


I'm not sure I understand the concept for XI itself:
- I want to execute checks on XI by XI (not Mod-Gearman)
- I want SOME checks to be executed by the remote Mod-Gearman workers.

The idea is as follows:

1) XI runs most of the checks
2) Some service checks are done by remote Mod-Gearman workers (like Perl-based DB checks)
3) Some other checks (created as sets of the same checks for different workers) are done by very remote Mod-Gearman workers for the purpose of geographic validation (the same check against the same host/service is run from multiple locations to make sure it is accessible and available)

What do we do in this case?

--------------------------------------------------
I am seeing some mod_gearman defunct messages in the top command on the workers.

When hosts are moved out of the local group, checks stop working on XI (orphaned hosts message).

Are there more case scenarios with explanations to read about?

------------------------------------------------

Say I want to run a remote custom plugin which requires Java. Do I copy the plugin and install Java on the worker?

Say I want to run remote Oracle checks. Do I install the Oracle Instant Client on the worker?
rkennedy
Posts: 6579
Joined: Mon Oct 05, 2015 11:45 am

Re: Fresh out of Advanced training and need help (modGearman

Post by rkennedy »

dlukinski wrote: ------------------------------------------------------

I'm not sure I understand the concept for XI itself:
- I want to execute checks on XI by XI (not Mod-Gearman)
- I want SOME checks to be executed by the remote Mod-Gearman workers.
The idea is as follows:

1) XI runs most of the checks
2) Some service checks are done by remote Mod-Gearman workers (like Perl-based DB checks)
3) Some other checks (created as sets of the same checks for different workers) are done by very remote Mod-Gearman workers for the purpose of geographic validation (the same check against the same host/service is run from multiple locations to make sure it is accessible and available)

What do we do in this case?
This is possible, but you won't be able to route checks based on the type of plugin (for example, "all Perl checks"). The way to assign a specific gearman worker to a set of checks is by using host groups / service groups. See this page for more information - https://assets.nagios.com/presentations ... earman.pdf (page 20)
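
For example, a minimal sketch of routing the DB checks through a service group might look like this. The servicegroup name db-checks, the host dbhost1 and the check command check_db_example are placeholders rather than anything from your configuration, and required directives are left out for brevity.

```cfg
# --- Nagios object configuration on the XI server ---
define servicegroup {
   servicegroup_name   db-checks
   alias               db-checks
}

define service {
   host_name             dbhost1
   service_description   DB Check
   check_command         check_db_example
   servicegroups         db-checks
}

# --- module.conf on the XI server ---
# Create a dedicated queue for this servicegroup
servicegroups=db-checks

# --- worker.conf on the worker(s) that should run these checks ---
# Only take jobs from the db-checks queue, not the generic queues
hosts=no
services=no
servicegroups=db-checks
```

The same pattern should carry over to the geographic checks: one servicegroup per location, each listed in module.conf, with every remote worker subscribed only to the group for its own location.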
dlukinski wrote: --------------------------------------------------
I am seeing some mod_gearman defunct messages in the top command on the workers.

When hosts are moved out of the local group, checks stop working on XI (orphaned hosts message).

Are there more case scenarios with explanations to read about?
Can you please elaborate, or show us a screenshot of what you're seeing?

dlukinski wrote: ------------------------------------------------

Say I want to run a remote custom plugin which requires Java. Do I copy the plugin and install Java on the worker?

Say I want to run remote Oracle checks. Do I install the Oracle Instant Client on the worker?
Yes, all the gearman workers will need the same setup as the Nagios XI machine. This includes the plugins and their dependencies.
Former Nagios Employee
dlukinski
Posts: 1130
Joined: Tue Oct 06, 2015 9:42 am

Re: Fresh out of Advanced training and need help (modGearman

Post by dlukinski »

Let's start from scratch

A. It seems I could use either Troy's or Mike's approach to mod_gearman, but not both (the installations and manuals are incompatible when followed directly, at least with my lack of experience)
- I am going to use Troy's manuals (he is helping), because Mike did not help me via email nor did he clarify things, so I cannot proceed with something questionable / unsupported.

I want to use mod_gearman for distributed monitoring

- I install x amount of remote workers (same key)
- I install the same plugins and dependencies on all workers
- I monitor the workers (optional) and set them as parents for hosts under remote monitoring (optional)
--------------------------------------------------------------------------------------
Now
What do we do with the module.conf and worker.conf files?

- Configure the remote worker.conf with services/hosts=NO and include 1 or more host groups & 1 or more service groups per worker (please confirm)
- I DO NOT touch module.conf on workers (please confirm - Troy's manual has a paragraph that starts with editing worker.conf and ends with "save module.conf")
-------------------------------------------------------------------------------------
- On the XI server, I edit module.conf, but NOT worker.conf (please confirm)?
- I create local host and local service groups & move ALL hosts/services under them to prevent gearman execution on XI (fails in my case)
- If I stop gearman on XI, will I still be able to check remotely with workers (group-defined)?

What about hosts or services I want to be checked by remote workers?
- Do I move them out of the XI local groups and into remote groups (hosts/services)? In this case I get hosts "down" but their services working (unsure whether it is XI or the remote worker making the checks)


Thank you
bheden
Product Development Manager
Posts: 179
Joined: Thu Feb 13, 2014 9:50 am
Location: Nagios Enterprises

Re: Fresh out of Advanced training and need help (modGearman

Post by bheden »

The following should help you, as long as I understand what it is that you're trying to do:

Yes, install x amount of remote workers that all share the same key.

Yes, install the same plugins and dependencies on all workers (I like to verify they are working by executing the plugins locally with the arguments that will be passed, but that's just *my* preference).

You should definitely monitor the workers! The whole point of this setup, though, is that you don't need to limit 1 ModGearman worker to performing checks for just 1 hostgroup or servicegroup. You can have multiple ModGearman workers performing the checks - this comes in handy when you need to take one machine down to upgrade, or for other miscellaneous maintenance. If you're limiting some checks to only be performed by 1 worker, then what you suggest here is a good idea. I like to give each queue (specific to hostgroups / servicegroups) at LEAST 2 workers (for the reasons I just mentioned).

So, before we go any further, let's define an example scenario.

You have 5 servers! You have 1 XI/ModGearman server (it has a few checks that need to be performed locally ONLY), 2 Worker Servers dedicated to performing checks on a specific hostgroup (we'll define this in a moment), and then 2 more Worker Servers dedicated to performing ALL other checks. In order, we'll call these servers: nagios, hostgroup_worker1, hostgroup_worker2, catchall_worker1, catchall_worker2 - this'll make the following breakdown a lot easier to understand.

On server nagios, your nagios configuration for this example should look something like this (obviously this exact configuration isn't going to work, since I left out some required directives for brevity - but I think it should get my point across):

Code: Select all

define hostgroup {
   hostgroup_name   local-only
   alias            local-only
}

define hostgroup {
   hostgroup_name   specific-only
   alias            specific-only
}

define host {
   host_name    localhost
   alias        localhost
   address      127.0.0.1
   hostgroups   local-only
}

define host {
   host_name    host1
   alias        host1
   address      192.168.1.1
   hostgroups   specific-only
}

define host {
   host_name    host2
   alias        host2
   address      192.168.1.2
   hostgroups   specific-only
}

define host {
   host_name    host3
   alias        host3
   address      192.168.1.3
}

define host {
   host_name    host4
   alias        host4
   address      192.168.1.4
}
Here we have defined 2 hostgroups (local-only and specific-only) and 5 hosts (localhost, host1, host2, host3 and host4). localhost belongs to the local-only hostgroup, host1 and host2 belong to the specific-only hostgroup, and host3 and host4 don't belong to any hostgroup.

With those definitions, we want localhost checks to not be picked up by ModGearman, we want checks for host1 and host2 to be performed by servers hostgroup_worker1 and hostgroup_worker2, and we want checks for host3 and host4 to be performed by servers catchall_worker1 and catchall_worker2.

In order to accomplish that: on server nagios, in the /etc/mod_gearman2/module.conf, you should have the following directives set:

Code: Select all

hostgroups=specific-only
localhostgroups=local-only
Then restart nagios. Those settings will create a hostgroup queue specifically for the specific-only hostgroup, and declare that anything in the local-only hostgroup should not be picked up by ModGearman! Make sure your ModGearman worker is not running on this machine - and it's a good idea to disable it at boot with:

Code: Select all

chkconfig mod-gearman2-worker off
On hostgroup_worker1 and hostgroup_worker2 (these configurations should be identical!), in the /etc/mod_gearman2/worker.conf, you should have the following directives set:

Code: Select all

hosts=no
services=no
hostgroups=specific-only
You won't need to change the default settings on catchall_worker1 and catchall_worker2 in order to accomplish anything for this example (other than making sure your key matches!).
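
For reference, here is a sketch of the relevant worker.conf lines on the catch-all workers, assuming the packaged defaults haven't been changed (check your own file, since defaults can differ between versions):

```cfg
# worker.conf on catchall_worker1 and catchall_worker2 (assumed defaults)

# Serve the generic host and service queues, i.e. every check that is not
# routed to a dedicated hostgroup/servicegroup queue and is not local-only.
hosts=yes
services=yes

# No hostgroups= or servicegroups= lines here: these workers do not
# subscribe to the specific-only queue.
```

In this example host3 and host4 end up in the generic host queue, so at least one of the catch-all workers has to be running for those checks to be executed.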

Hope this helps clear things up for you!
dlukinski
Posts: 1130
Joined: Tue Oct 06, 2015 9:42 am

Re: Fresh out of Advanced training and need help (modGearman

Post by dlukinski »

Made the changes, except for listing 2 x host groups and 2 x service groups (we've got 2 test remote workers)
- In our case we require the same servers and portals to be tested from various locations (Free XI is another option, but too limited in terms of hosts allowed)
- so for worker A we need a set of services "A"
- for worker B we need a copy of set "A", renamed to "B" and running on B.
- and so on for a dozen workers.

----------------------------------------------------------------------------------------

This still does not explain what to do with hosts in between local and remote groups.
I get "(host check orphaned, is the mod-gearman worker on queue 'host' running?)" when removing a host from the local group and adding it to the remote one. It happens only in 1 of 2 cases (1 or 2 workers)
- unsure what is missing. Really, this is not about the host, because it has to stay local, while only the services must be run by the remote worker

Thank you
Locked