NagiosXI+Remote-Workers-(Distributed Monitoring)

maartin.pii
Posts: 84
Joined: Wed May 18, 2016 1:39 pm

Post by maartin.pii »

Hi Guys,

I am currently deploying the following scenario:

- 1 VM - NagiosXI 5.3.3
- 6 VMs - Mod_Gearman (Nagios Remote Workers)

I have already taken a look at this paper -> https://assets.nagios.com/downloads/nag ... ios_XI.pdf

However, I have some questions I would like advice on. Each worker will take on a specific role, for example:

- Worker 1 will check network devices (hosts and services)
- Worker 2 will check Linux services (OS services only)
- Worker 3 will check Linux services (application services only)
- Worker 4 will check hosts/services in the DMZ (OS and application services), etc.

What I am not sure about is how to configure this scenario.

Ideas:

- I was thinking of creating one hostgroup per worker and having each worker check only its hostgroup. But if I do that, will the worker also check the services for that hostgroup, or just run the host alive check?
- Is it better to create servicegroups? In that case, can I be sure that assigning a servicegroup to a remote worker will not make every host in that servicegroup try to execute all of the services in it?

I hope I have explained myself clearly. Any recommendations would be useful.

Thanks in advance guys, you rock!
tgriep
Madmin
Posts: 9190
Joined: Thu Oct 30, 2014 9:02 am

Re: NagiosXI+Remote-Workers-(Distributed Monitoring)

Post by tgriep »

If you set up a worker to use only one hostgroup, that worker will check the hosts in it and all of their services without you having to specify them.
If you need more granularity for service checks, that is where you would create a servicegroup and apply it to the Gearman worker.
For more details, take a look at this KB article.
https://support.nagios.com/kb/article.php?id=484
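As a rough sketch of the hostgroup approach, assuming a hostgroup called linux_servers (a made-up name for this example) and the hostgroups option from the Mod-Gearman configuration, both sides would reference the group:

```
# NEB module configuration on the Nagios XI server:
# put checks for this hostgroup into their own queue
hostgroups=linux_servers

# Worker configuration on the remote worker:
# pull jobs from that hostgroup's queue
hostgroups=linux_servers
```

The servicegroups option is used the same way for a servicegroup.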
Be sure to check out our Knowledgebase for helpful articles and solutions!
dwhitfield
Former Nagios Staff
Posts: 4583
Joined: Wed Sep 21, 2016 10:29 am
Location: NoLo, Minneapolis, MN

Re: NagiosXI+Remote-Workers-(Distributed Monitoring)

Post by dwhitfield »

In addition to what @tgriep said, you might find the videos at https://www.youtube.com/user/nagiosvide ... od_gearman useful.

Re: NagiosXI+Remote-Workers-(Distributed Monitoring)

Post by maartin.pii »

Thanks, guys.

Following up on what @tgriep said: is it possible to set up one worker to check a hostgroup and another worker to check a servicegroup? I did something similar in the past and started getting 'orphaned' checks for the hostgroup, which ran on the local worker instead of a remote one.

Re: NagiosXI+Remote-Workers-(Distributed Monitoring)

Post by maartin.pii »

tgriep wrote:If you setup the workers to use only one hostgroup, that worker will test the host and all of the services for those hosts without having to specify them.
If you need more granularity for a service check, that is where you would create a service group and apply that to the gearman worker.
For more details, take a look at this KB article.
https://support.nagios.com/kb/article.php?id=484
Hey Tom - I've checked the KB article you recommended, and I am still having the same issue with it.

- I could use servicegroups, but that only works for checks that share the same plugins. In my scenario, where I have one remote worker for OS checks and another for application checks, this won't work, because I'd have to create a servicegroup for CPU load, a servicegroup for root partition space, and a servicegroup for the memory check (for example), and then a servicegroup for the Tomcat listening port, one for Tomcat service availability, one for Oracle tablespaces, etc.

Does that make sense? I don't think this is an option. What could I do instead?

Re: NagiosXI+Remote-Workers-(Distributed Monitoring)

Post by tgriep »

To use Mod Gearman efficiently, you would have to install the plugins on all of the workers and then you will not have to create all of the service groups.
I am guessing that the remote sites will be running the same checks as the other sites so installing all of the plugins on all the workers will give you the most flexibility.
bheden
Product Development Manager
Posts: 179
Joined: Thu Feb 13, 2014 9:50 am
Location: Nagios Enterprises

Re: NagiosXI+Remote-Workers-(Distributed Monitoring)

Post by bheden »

Using the configuration documentation found at https://labs.consol.de/nagios/mod-gearm ... figuration we can easily figure out what you need.

Let's say you have the following example configuration in Nagios:

Code: Select all

define host{
   host_name                switch
   address                  192.168.1.1
   ...
}

define host{
   host_name                firewall
   address                  192.168.1.2
   ...
}

define service{
   host_name                switch
   service_description      switch_svc1
   ...
}

define service{
   host_name                firewall
   service_description      firewall_svc1
   ...
}

define host{
   host_name                linux1
   address                  192.168.2.1
   ...
}

define host{
   host_name                linux2
   address                  192.168.2.2
   ...
}

define service{
   host_name                linux1
   service_description      linux1_cpu
   ...
}

define service{
   host_name                linux1
   service_description      linux1_httpd
   ...
}

define service{
   host_name                linux2
   service_description      linux2_cpu
   ...
}

define service{
   host_name                linux2
   service_description      linux2_httpd
   ...
}

define host{
   host_name                dmz_host
   address                  192.168.3.1
   ...
}

define service{
   host_name                dmz_host
   service_description      dmz_host_svc1
   ...
}
That gives you the following hosts: switch, firewall, linux1, linux2, and dmz_host. Each has at least one service, but each Linux host has an OS-related service and an application-related service.
Now, we can break this into logical groups based on what you're trying to accomplish:

Code: Select all

define hostgroup{
    hostgroup_name      network_devices
    members             switch,firewall
}

define hostgroup{
    hostgroup_name      linux_servers
    members             linux1,linux2
}

define servicegroup{
    servicegroup_name   linux_server_os_services
    members             linux1,linux1_cpu, linux2,linux2_cpu
}

define servicegroup{
    servicegroup_name   linux_server_application_services
    members             linux1,linux1_httpd, linux2,linux2_httpd
}

define hostgroup{
    hostgroup_name      dmz
    members             dmz_host
}
maartin.pii wrote:for example: Worker 1 will check network devices //host and services - Worker 2 will check Linux services//Just OS's services - Worker 3 will check Linux Services //Just applications services - Worker 4 will check host/services from the DMZ //OS and Apps Services,etc
Based on your quote, we can supply the following configurations to each of the workers to achieve our desired results:

ModGearman NEB configuration

Code: Select all

hosts=yes
services=yes
hostgroups=network_devices,dmz
servicegroups=linux_server_os_services,linux_server_application_services
Worker 1 ModGearman worker configuration

Code: Select all

hosts=no
services=no
hostgroups=network_devices
Worker 2 ModGearman worker configuration

Code: Select all

hosts=no
services=no
servicegroups=linux_server_os_services
Worker 3 ModGearman worker configuration

Code: Select all

hosts=no
services=no
servicegroups=linux_server_application_services
Worker 4 ModGearman worker configuration

Code: Select all

hosts=no
services=no
hostgroups=dmz
To specifically answer your questions:
maartin.pii wrote:I was thinking on creating 1 Host Group per worker and deploy the worker just to check hostgroup, for example. But I am afraid that I am not sure if I do that, will the worker check the services too for that hostgroup? Or just the host check alive ?
Yes, the worker checks both. When a host belongs to a hostgroup, its host check and the checks for all of its services are distributed to the worker designated for that hostgroup.
maartin.pii wrote:Is better to create servicegroups? In that case, I am sure that creating a servicegroup for a remote worker will not do that every host on that service group would try to execute all services within that service group?
I wouldn't say it's always better, but in this particular case it makes the most sense. Keep in mind that when you use servicegroups, the host checks for the hosts those services belong to are not distributed to the servicegroup's ModGearman worker!

And finally, a few important things to know and remember:
  * The hosts=no and services=no lines don't stop the execution of host and service checks on the worker. When specified on the worker side, they stop the generic host and service queues (hosts and services that aren't covered by a hostgroup or servicegroup queue) from being distributed to that worker.
  * You'll need an additional worker that has hosts=yes and services=yes specified. This will catch all of the remaining hosts and services you don't specify so that their work can be distributed as well.
  * Caution! The configuration above doesn't specify anywhere for the host checks for linux1 and linux2 to be executed. So if you DON'T add a worker with at least hosts=yes specified, the host checks for those two hosts WILL BE ORPHANED.
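For reference, a minimal sketch of that extra catch-all worker, using only the two options named above (calling it "Worker 5" is just an assumed label for this example):

```
# Worker 5 ModGearman worker configuration (assumed extra worker)
# Takes jobs from the generic host and service queues
hosts=yes
services=yes
```

With the NEB configuration shown earlier, the host checks for linux1 and linux2 are not covered by any hostgroup queue, so this is the worker that would pick them up.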
Hope this helps clear things up!
As of May 25th, 2018, all communications with Nagios Enterprises and its employees are covered under our new Privacy Policy.

Nagios Enterprises
Senior Developer

Re: NagiosXI+Remote-Workers-(Distributed Monitoring)

Post by maartin.pii »

THANK YOU VERY MUCH @bheden for your explanation - It was REALLY clear!

Just to pull everything together and close this thread, I have one final question about something I still find a bit confusing.

Let's suppose that I have the following config:

Code: Select all

define host{
   host_name                linux1
   address                  192.168.2.1
   ...
}

define host{
   host_name                linux2
   address                  192.168.2.2
   ...
}

define service{
   host_name                linux1
   service_description      linux1_cpu
   ...
}

define service{
   host_name                linux1
   service_description      linux1_mysqld
   ...
}

define service{
   host_name                linux2
   service_description      linux2_cpu
   ...
}

define service{
   host_name                linux2
   service_description      linux2_httpd
   ...
}
And then:

Code: Select all


define servicegroup{
    servicegroup_name   linux_server_application_services
    members             linux1,linux1_mysqld, linux2,linux2_httpd
}

My question is the following: would creating this kind of servicegroup make the mysqld check run against linux2 and the httpd check run against linux1?

What I want to be sure of is that this kind of logical group won't 'mix' the checks. For example, if I only want httpd checked on linux2 and mysqld checked on linux1, creating this servicegroup should not cause every check in the servicegroup to run against all of the servers.

Does that make sense?

Thanks a lot!



Re: NagiosXI+Remote-Workers-(Distributed Monitoring)

Post by bheden »

maartin.pii wrote:My issue here is the following, creating this kind of service group would make that linux2 check mysqld check and linux1 check httpd?

What I want to be sure is that doing this kind of logic group won't make that for example If I only want httpd to be checked on linux2 and mysqld to be checked on linux1 generating this service group this won't cause to 'mix' the checks or to execute all the checks defined on that service group to all the servers.
If you define your servicegroup as you posted, then you're exactly right: they won't mix the way you describe. The members directive is a list of host,service pairs, so each service stays tied to the host it is paired with.

Alternatively, you can add hosts and services to hostgroups and servicegroups from their own definitions instead, if you'd prefer:

Code: Select all

define service{
   host_name                linux1
   service_description      linux1_mysqld
   servicegroups            linux_server_application_services
   ...
}

define service{
   host_name                linux2
   service_description      linux2_httpd
   servicegroups            linux_server_application_services
   ...
}
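The host side works the same way. As a sketch, assuming you want linux1 in the linux_servers hostgroup from the earlier example, the standard hostgroups directive goes in the host definition (the comment line stands in for the remaining directives, like the ... in the other examples):

```
define host{
   host_name                linux1
   address                  192.168.2.1
   hostgroups               linux_servers
   # remaining host directives omitted
}
```

If you list memberships this way, you can leave the host out of the members line in the hostgroup definition.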
See here for more goodies: https://assets.nagios.com/downloads/nag ... tions.html

Re: NagiosXI+Remote-Workers-(Distributed Monitoring)

Post by maartin.pii »

Hi Guys,

I have done the following configuration:


- Nagios XI (1VM)
- Remote Workers (6VMs)

----------------------------

In my NEB config I have listed all of my hostgroups and servicegroups in the hostgroups/servicegroups settings. I have also defined some hostgroups to be checked locally as local hostgroups.

On my remote workers, I have set up 4 of them to work with hostgroups and 2 of them to work with servicegroups.

As you recommended, I have also defined host=yes and services=yes on one of them.

However, what I am seeing is that the host checks for some servers that are part of the servicegroups are 'orphaned', or they flap between orphaned and OK.

If I force the checks to run, they are OK. But after some time they return to the orphaned state.

I will upload my workers configurations.

Regards,