bheden wrote:After reviewing the configuration files you posted, a few things stick out:
You do not have all of the hostgroups from the NEB config listed in the worker configs. Some are missing.
Those will be orphaned.
You have some hostgroups specified in both the hostgroups directive and the localhostgroups directive. I've actually never seen anyone do that, but I assume those could end up orphaned as well.
Remember that whatever hostgroup and servicegroup you define to split into a queue in the NEB config HAS TO HAVE A CORRESPONDING DEFINITION on one of the workers. (Or it will be orphaned).
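To illustrate the matching rule above, here is a minimal sketch, assuming a mod_gearman-style setup (the file comments and group names below are hypothetical, not taken from the posted configs):

```ini
# NEB module config on the Nagios server
# every hostgroup/servicegroup listed here is split into its own queue
hostgroups=windows-hosts,dmz-hosts
servicegroups=linux-os-services

# worker config on one of the remote workers
# every queue created above must be claimed by at least one worker;
# a queue with no worker definition leaves its checks orphaned
hostgroups=windows-hosts
```

A second worker would then claim the remaining groups (dmz-hosts, linux-os-services) so that no queue is left without a consumer.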
-
Hi @bheden - First of all, thanks for answering.
- You do not have all of the hostgroups from the NEB config listed in the worker configs. Some are missing.
Those will be orphaned.
Yep, you are right. However, all of the hostgroups that are listed in the NEB config but not in the worker configs are hostgroups that are no longer in use; I only kept them there for company legacy reasons. I could clear them from the NEB config, no problem.
You have some hostgroups specified in both the hostgroups directive and the localhostgroups directive. I've actually never seen anyone do that, but I assume those could end up orphaned as well.
Right, I have to correct that. However, the servers defined in those hostgroups are not the ones causing my problems. I will correct it anyway.
-----------------------------
I think I know what is happening here. If you read the entire thread, I opened this post because I had a difficult scenario and wanted to be sure that all my configs followed Nagios best practices. To summarize, this is my entire scenario:
1 - Nagios XI (1 VM)
|
|- Remote Worker 1 // Windows hostgroups (checks everything)
|- Remote Worker 2 // Linux servicegroups (checks Linux/Unix OS services, e.g. Load, Disk)
|- Remote Worker 3 // Linux servicegroups (checks Linux/Unix application services, e.g. Tomcat, Httpd)
|- Remote Worker 4 // DB servicegroups (checks Oracle/MSSQL/MySQL services)
|- Remote Worker 5 // Networking hostgroups (checks everything)
|- Remote Worker 6 // DMZ hostgroups (checks everything)
Note: on Remote Worker 5 I have specified 'services=yes' and 'hosts=yes' so it can pick up any checks that could otherwise end up orphaned.
What I am seeing is that everything works OK (I have NO orphaned services) except for the 'check host alive' check on the Linux/Unix hosts. Since I created servicegroups for all the service checks but no group covering the host checks, those host checks end up orphaned.
If I force them to run, they return an OK state. However, a few minutes later they start flapping again between OK and orphaned.
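If that diagnosis is right, one possible fix (a sketch only; the hostgroup/servicegroup names below are hypothetical and assume the Linux/Unix hosts can share a hostgroup) is to route the host checks by hostgroup and have one of the existing workers claim that queue:

```ini
# NEB config: in addition to the servicegroups, route the Linux/Unix
# host checks into their own queue by hostgroup
hostgroups=linux-hosts

# Remote Worker 2 config: claim the host-check queue alongside
# the Linux OS servicegroups it already handles
hostgroups=linux-hosts
servicegroups=linux-os-services
```

That way every 'check host alive' for those hosts lands in a queue that has a consumer instead of going orphaned.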
Does that make sense? Is there any way to resolve this?
I thought that having a worker with hosts=yes and services=yes would prevent this problem. Otherwise, could I force those host checks to run locally?
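Alternatively, if running those host checks locally on the Nagios XI server is acceptable, the NEB module supports a localhostgroups directive for exactly that (hostgroup name hypothetical):

```ini
# NEB config: host checks for these hostgroups are executed locally
# by the Nagios core instead of being submitted to a worker queue
localhostgroups=linux-hosts
```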
Regards