I'm sorry if this is not the correct place to post this. I could not post in the Enterprise customer support forum... maybe because my account is new?
We are currently setting up cluster checks as part of our monitoring infrastructure, and for some reason we cannot get the service to display all the hosts/services listed (delimited by commas). Say we have over 150 hosts listed along with a service. That should be fine, but then we see maybe 120 hosts. If we reverse-sort that same list, we might see only 8 hosts listed in the OK state. All other states stay at zero.
The number of missing hosts differs depending on which end of the full list we sort from, and we have checked and double-triple-quadruple-checked the syntax. We see no reason why the check is refusing to see all hosts. We have also tried removing a single random host from the list, just to see what would happen, and suddenly the service could see only 1 host. This baffles us.
We've not been able to find any rhyme or reason to this strangeness, and we've otherwise been able to set up monitoring across our SDLC environments without running into any issues.
Is it something in the code of the check_cluster check itself...?
check_cluster not seeing all hosts in delimited list
-
aniyahqueen
- Posts: 3
- Joined: Wed Nov 16, 2016 6:54 pm
Re: check_cluster not seeing all hosts in delimited list
For the customer forum, please contact [email protected] to be given access to post there.
Could you post your service definitions so that we can see how check_cluster is being used?
Former Nagios Employee
-
aniyahqueen
- Posts: 3
- Joined: Wed Nov 16, 2016 6:54 pm
Re: check_cluster not seeing all hosts in delimited list
I will send an email for access to the customer forum.
For the time being, yes... we followed the instructions at this URL to set up the command: https://assets.nagios.com/downloads/nag ... sters.html
*** Please note that the real service lists over 150 hosts, not just the 2 shown in the service example.
This is the command in the commands.cfg file:
Code: Select all
define command {
command_name check_service_cluster
command_line /usr/local/nagios/libexec/check_cluster --service -l $ARG1$ -w $ARG2$ -c $ARG3$ -d $ARG4$
}
And here is an example of the service we are using:
Code: Select all
define service {
host_name cluster_checks
service_description PRD_Cluster-Service-Status
use linux-cluster-service-prod-standard
check_command check_service_cluster!"Cluster_Service_Status"!25!140!\
$SERVICESTATEID:HOST01:Service Client$,\
$SERVICESTATEID:HOST02:Service Client$,\
notification_period 24x7
notifications_enabled 1
contact_groups linux-emailonly
register 1
}
Also, we started out with the delimited list without line continuations, but saw the same missing hosts either way. We settled on line continuations so that we could more easily spot any typos, though typos were not the issue, since we used a loop to generate the list. We can add hosts up to 150, but once we reach 150, any further hosts we add do not show in the service status. It does not matter which host we add next or in what order; we cannot seem to get past 150. We have another two services that do the same thing, but with 161 hosts.
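Since the list was generated by a loop, a small script can also report how long the resulting `-d` argument is. This is only a rough sketch in Python: the host names (`HOST01` through `HOST150`) and the helper name `build_cluster_data` are made up for illustration, and it measures the unexpanded macro text as written in the config, not whatever Nagios passes to the plugin after macro substitution.

```python
def build_cluster_data(hosts, service):
    """Join one $SERVICESTATEID:host:service$ macro per host with commas."""
    return ",".join(f"$SERVICESTATEID:{host}:{service}$" for host in hosts)


# Hypothetical host names; substitute the real inventory here.
hosts = [f"HOST{i:02d}" for i in range(1, 151)]
data = build_cluster_data(hosts, "Service Client")

# Report the size of the delimited field as it appears in the service definition.
print(f"{len(hosts)} hosts -> {len(data)} characters in the -d argument")
```

Running this against the real host list and service name gives a number to compare with any documented length limit.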
-
aniyahqueen
- Posts: 3
- Joined: Wed Nov 16, 2016 6:54 pm
Re: check_cluster not seeing all hosts in delimited list
I think we may have just realized what it is... Is there a character limit on the value of the delimited field? We went ahead and shortened the name of the service, and results from more hosts then rendered.
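If a character limit is the cause, the effect of shortening the service name can be estimated with a quick calculation. The sketch below is an assumption-laden illustration, not a description of how Nagios behaves: `ASSUMED_LIMIT` is a placeholder value (the real limit, if any, should be taken from the Nagios documentation), the host names are hypothetical, and it assumes the limit applies to the unexpanded macro list.

```python
def entry_length(host, service):
    """Length of one $SERVICESTATEID:host:service$ macro."""
    return len(f"$SERVICESTATEID:{host}:{service}$")


def hosts_that_fit(hosts, service, limit):
    """Count how many leading hosts fit in `limit` characters, commas included."""
    used = 0
    count = 0
    for host in hosts:
        # Every entry after the first needs one extra character for the comma.
        needed = entry_length(host, service) + (1 if count else 0)
        if used + needed > limit:
            break
        used += needed
        count += 1
    return count


ASSUMED_LIMIT = 4096  # placeholder only, NOT a documented Nagios value
hosts = [f"HOST{i:03d}" for i in range(1, 201)]  # hypothetical host names

for service in ("Service Client", "SvcCli"):
    fit = hosts_that_fit(hosts, service, ASSUMED_LIMIT)
    print(f"{service!r}: {fit} of {len(hosts)} hosts fit")
```

A shorter service description shrinks every entry in the list, so more hosts fit under the same limit, which would match what we saw.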
Re: check_cluster not seeing all hosts in delimited list
You are probably running into this: https://support.nagios.com/kb/article.php?id=478
Former Nagios employee