Monitoring Nagios Log Server
Re: Monitoring Nagios Log Server
Yes, I can use NRPE to get the JSON and parse it. Are there any other URIs I could use? For example, what are the status checks for Elasticsearch and Logstash, the ones where you report green or red? Is the status from the URL 'http://localhost:9200/_cluster/health?pretty=true' the indicator?
jolson wrote:By default, Nagios Log Server won't allow you to query that information from the outside. Your best bet is to use a plugin like NRPE to perform local queries.
For instance, the following will return proper Java results if you run it on Nagios Log Server:
Code: Select all
curl -XGET 'localhost:9200/_nodes/jvm?pretty'
Basic health check:
Code: Select all
curl -XGET 'http://localhost:9200/_cluster/health?pretty=true'
These queries don't work from the outside for security purposes.
You could easily use a plugin like NRPE to launch these queries locally - if that's something you're interested in getting data on. Otherwise, I can recommend the following plugin: https://github.com/anchor/nagios-plugin-elasticsearch
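As a rough illustration of the NRPE approach discussed in this post, the sketch below maps the "status" field of the cluster health document to Nagios plugin exit codes. The green/yellow/red to OK/WARNING/CRITICAL mapping, the function names, and the 10-second timeout are assumptions made for this example; they are not taken from Nagios Log Server or from the plugin linked above.

```python
import json
import urllib.request

# Standard Nagios plugin exit codes.
OK, WARNING, CRITICAL, UNKNOWN = 0, 1, 2, 3

# Assumed mapping for this sketch: adjust to your own alerting policy.
STATUS_TO_STATE = {"green": OK, "yellow": WARNING, "red": CRITICAL}
STATE_NAMES = {OK: "OK", WARNING: "WARNING", CRITICAL: "CRITICAL", UNKNOWN: "UNKNOWN"}


def evaluate_health(body):
    """Map a _cluster/health JSON body to (exit_code, message)."""
    try:
        health = json.loads(body)
    except ValueError:
        return UNKNOWN, "UNKNOWN: response is not valid JSON"
    if not isinstance(health, dict):
        return UNKNOWN, "UNKNOWN: unexpected response shape"
    status = health.get("status")
    code = STATUS_TO_STATE.get(status, UNKNOWN)
    message = "%s: cluster status is %s, %s unassigned shards" % (
        STATE_NAMES[code], status, health.get("unassigned_shards", "?"))
    return code, message


def fetch_health(url="http://localhost:9200/_cluster/health"):
    """Fetch the health document; intended to run locally on the Log Server node."""
    with urllib.request.urlopen(url, timeout=10) as response:
        return response.read().decode("utf-8")


def main():
    """Run the check once and return the Nagios exit code."""
    try:
        code, message = evaluate_health(fetch_health())
    except OSError as exc:  # connection refused, timeout, HTTP error
        code, message = CRITICAL, "CRITICAL: cannot reach Elasticsearch (%s)" % exc
    print(message)
    return code
```

To use it as a plugin, end the script with `sys.exit(main())` (after `import sys`) and register it as an NRPE command on the Log Server node, so the query stays local as recommended above.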
Re: Monitoring Nagios Log Server
It might be easier for you to use check_procs or similar to monitor Elasticsearch and Logstash - but if you're set on using the API, the call is as follows:
Code: Select all
http://192.168.x.x/nagioslogserver/index.php/api/system/status?subsystem=logstash&token=xxxxxxx
Here, the token is the one used by your NLS user - you can find it by using a developer console and watching the token sent while pressing 'restart' on Logstash. I'd be wary about sending this token across your network from Nagios - so you may be better off using NRPE.
Re: Monitoring Nagios Log Server
Hint of September's talk: We're using check_procs to check for running processes via NRPE....
Eric Loyd • http://everwatch.global • 844.240.EVER • @EricLoyd
Re: Monitoring Nagios Log Server
I believe that is the correct way to go about monitoring the processes. It allows for future expansion (monitoring other processes or using different plugins on the same server), and would likely be more reliable than using the API calls.
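For readers who have not used check_procs, here is a minimal sketch of the kind of test it performs: count the running processes whose command line contains a given name, and go CRITICAL if there are too few. This is not check_procs itself. It assumes a Linux /proc filesystem, and the function names and the substring match on "elasticsearch" and "logstash" are assumptions made for the example.

```python
import os

OK, CRITICAL = 0, 2  # standard Nagios plugin exit codes


def list_cmdlines(proc_root="/proc"):
    """Return the command line of every readable process (Linux only)."""
    cmdlines = []
    for entry in os.listdir(proc_root):
        if not entry.isdigit():
            continue  # not a process directory
        try:
            with open(os.path.join(proc_root, entry, "cmdline"), "rb") as fh:
                raw = fh.read()
        except OSError:
            continue  # process exited or is not readable
        cmdlines.append(raw.replace(b"\0", b" ").decode("utf-8", "replace").strip())
    return cmdlines


def check_process(name, cmdlines, minimum=1):
    """Return (exit_code, message) for processes whose command line contains name."""
    count = sum(1 for line in cmdlines if name in line)
    state = "OK" if count >= minimum else "CRITICAL"
    code = OK if count >= minimum else CRITICAL
    return code, "PROCS %s: %d process(es) matching '%s'" % (state, count, name)


def main():
    """Check both services and return the worst exit code."""
    cmdlines = list_cmdlines()
    worst = OK
    for name in ("elasticsearch", "logstash"):
        code, message = check_process(name, cmdlines)
        print(message)
        worst = max(worst, code)
    return worst
```

In practice the stock check_procs plugin called through NRPE already covers this; the sketch only shows what the check amounts to.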
Re: Monitoring Nagios Log Server
Just to give you an update: I set up the monitors using NRPE as well as the API-with-token combo. Everything is working great, and I'm getting all the data I need. It actually caught a status change on the cluster.
So my question is the following: all 4 nodes in my cluster are up, but the cluster status is yellow while the instance statuses are green.
I see this:
Active Primary Shards 161
Active Shards 207
Relocating Shards 0
Initializing Shards 8
Unassigned Shards 107
Is this due to the unassigned shards? How is the warning state determined?
Re: Monitoring Nagios Log Server
This is certainly due to the unassigned and still-initializing shards. Cluster health states are described as follows:
green
All primary and replica shards are allocated. Your cluster is 100% operational.
yellow
All primary shards are allocated, but at least one replica is missing. No data is missing, so search results will still be complete. However, your high availability is compromised to some degree. If more shards disappear, you might lose data. Think of yellow as a warning that should prompt investigation.
red
At least one primary shard (and all of its replicas) is missing. This means that you are missing data: searches will return partial results, and indexing into that shard will return an exception.
Let's take a look at your cluster health in more detail. Please run the following on your CLI and return the results to us:
Code: Select all
curl 'localhost:9200/_cluster/health?level=indices&pretty'
Re: Monitoring Nagios Log Server
Code: Select all
{
"cluster_name" : "xxxxxxxxxxxxxxxxxxxxxxxxxx",
"status" : "yellow",
"timed_out" : false,
"number_of_nodes" : 4,
"number_of_data_nodes" : 4,
"active_primary_shards" : 161,
"active_shards" : 264,
"relocating_shards" : 0,
"initializing_shards" : 8,
"unassigned_shards" : 50,
"indices" : {
"nagioslogserver" : {
"status" : "yellow",
"number_of_shards" : 1,
"number_of_replicas" : 1,
"active_primary_shards" : 1,
"active_shards" : 1,
"relocating_shards" : 0,
"initializing_shards" : 0,
"unassigned_shards" : 1
},
"logstash-2015.06.19" : {
"status" : "yellow",
"number_of_shards" : 5,
"number_of_replicas" : 1,
"active_primary_shards" : 5,
"active_shards" : 9,
"relocating_shards" : 0,
"initializing_shards" : 0,
"unassigned_shards" : 1
},
"logstash-2015.06.28" : {
"status" : "yellow",
"number_of_shards" : 5,
"number_of_replicas" : 1,
"active_primary_shards" : 5,
"active_shards" : 5,
"relocating_shards" : 0,
"initializing_shards" : 0,
"unassigned_shards" : 5
},
"nagioslogserver_log" : {
"status" : "green",
"number_of_shards" : 5,
"number_of_replicas" : 1,
"active_primary_shards" : 5,
"active_shards" : 10,
"relocating_shards" : 0,
"initializing_shards" : 0,
"unassigned_shards" : 0
},
"logstash-2015.06.29" : {
"status" : "yellow",
"number_of_shards" : 5,
"number_of_replicas" : 1,
"active_primary_shards" : 5,
"active_shards" : 7,
"relocating_shards" : 0,
"initializing_shards" : 0,
"unassigned_shards" : 3
},
"logstash-2015.06.26" : {
"status" : "yellow",
"number_of_shards" : 5,
"number_of_replicas" : 1,
"active_primary_shards" : 5,
"active_shards" : 5,
"relocating_shards" : 0,
"initializing_shards" : 0,
"unassigned_shards" : 5
},
"logstash-2015.06.27" : {
"status" : "yellow",
"number_of_shards" : 5,
"number_of_replicas" : 1,
"active_primary_shards" : 5,
"active_shards" : 6,
"relocating_shards" : 0,
"initializing_shards" : 0,
"unassigned_shards" : 4
},
"logstash-2015.06.24" : {
"status" : "yellow",
"number_of_shards" : 5,
"number_of_replicas" : 1,
"active_primary_shards" : 5,
"active_shards" : 7,
"relocating_shards" : 0,
"initializing_shards" : 0,
"unassigned_shards" : 3
},
"logstash-2015.06.25" : {
"status" : "yellow",
"number_of_shards" : 5,
"number_of_replicas" : 1,
"active_primary_shards" : 5,
"active_shards" : 5,
"relocating_shards" : 0,
"initializing_shards" : 0,
"unassigned_shards" : 5
},
"logstash-2015.06.22" : {
"status" : "yellow",
"number_of_shards" : 5,
"number_of_replicas" : 1,
"active_primary_shards" : 5,
"active_shards" : 7,
"relocating_shards" : 0,
"initializing_shards" : 0,
"unassigned_shards" : 3
},
"logstash-2015.06.23" : {
"status" : "yellow",
"number_of_shards" : 5,
"number_of_replicas" : 1,
"active_primary_shards" : 5,
"active_shards" : 5,
"relocating_shards" : 0,
"initializing_shards" : 0,
"unassigned_shards" : 5
},
"logstash-2015.06.03" : {
"status" : "green",
"number_of_shards" : 5,
"number_of_replicas" : 1,
"active_primary_shards" : 5,
"active_shards" : 10,
"relocating_shards" : 0,
"initializing_shards" : 0,
"unassigned_shards" : 0
},
"logstash-2015.06.20" : {
"status" : "yellow",
"number_of_shards" : 5,
"number_of_replicas" : 1,
"active_primary_shards" : 5,
"active_shards" : 6,
"relocating_shards" : 0,
"initializing_shards" : 2,
"unassigned_shards" : 2
},
"logstash-2015.06.02" : {
"status" : "green",
"number_of_shards" : 5,
"number_of_replicas" : 1,
"active_primary_shards" : 5,
"active_shards" : 10,
"relocating_shards" : 0,
"initializing_shards" : 0,
"unassigned_shards" : 0
},
"logstash-2015.06.21" : {
"status" : "yellow",
"number_of_shards" : 5,
"number_of_replicas" : 1,
"active_primary_shards" : 5,
"active_shards" : 6,
"relocating_shards" : 0,
"initializing_shards" : 0,
"unassigned_shards" : 4
},
"logstash-2015.06.01" : {
"status" : "green",
"number_of_shards" : 5,
"number_of_replicas" : 1,
"active_primary_shards" : 5,
"active_shards" : 10,
"relocating_shards" : 0,
"initializing_shards" : 0,
"unassigned_shards" : 0
},
"logstash-2015.06.07" : {
"status" : "green",
"number_of_shards" : 5,
"number_of_replicas" : 1,
"active_primary_shards" : 5,
"active_shards" : 10,
"relocating_shards" : 0,
"initializing_shards" : 0,
"unassigned_shards" : 0
},
"logstash-2015.06.06" : {
"status" : "green",
"number_of_shards" : 5,
"number_of_replicas" : 1,
"active_primary_shards" : 5,
"active_shards" : 10,
"relocating_shards" : 0,
"initializing_shards" : 0,
"unassigned_shards" : 0
},
"logstash-2015.06.05" : {
"status" : "green",
"number_of_shards" : 5,
"number_of_replicas" : 1,
"active_primary_shards" : 5,
"active_shards" : 10,
"relocating_shards" : 0,
"initializing_shards" : 0,
"unassigned_shards" : 0
},
"logstash-2015.06.04" : {
"status" : "green",
"number_of_shards" : 5,
"number_of_replicas" : 1,
"active_primary_shards" : 5,
"active_shards" : 10,
"relocating_shards" : 0,
"initializing_shards" : 0,
"unassigned_shards" : 0
},
"logstash-2015.06.09" : {
"status" : "green",
"number_of_shards" : 5,
"number_of_replicas" : 1,
"active_primary_shards" : 5,
"active_shards" : 10,
"relocating_shards" : 0,
"initializing_shards" : 0,
"unassigned_shards" : 0
},
"logstash-2015.06.08" : {
"status" : "green",
"number_of_shards" : 5,
"number_of_replicas" : 1,
"active_primary_shards" : 5,
"active_shards" : 10,
"relocating_shards" : 0,
"initializing_shards" : 0,
"unassigned_shards" : 0
},
"logstash-2015.06.15" : {
"status" : "yellow",
"number_of_shards" : 5,
"number_of_replicas" : 1,
"active_primary_shards" : 5,
"active_shards" : 7,
"relocating_shards" : 0,
"initializing_shards" : 1,
"unassigned_shards" : 2
},
"logstash-2015.06.16" : {
"status" : "yellow",
"number_of_shards" : 5,
"number_of_replicas" : 1,
"active_primary_shards" : 5,
"active_shards" : 8,
"relocating_shards" : 0,
"initializing_shards" : 0,
"unassigned_shards" : 2
},
"logstash-2015.06.17" : {
"status" : "yellow",
"number_of_shards" : 5,
"number_of_replicas" : 1,
"active_primary_shards" : 5,
"active_shards" : 7,
"relocating_shards" : 0,
"initializing_shards" : 0,
"unassigned_shards" : 3
},
"logstash-2015.06.18" : {
"status" : "yellow",
"number_of_shards" : 5,
"number_of_replicas" : 1,
"active_primary_shards" : 5,
"active_shards" : 8,
"relocating_shards" : 0,
"initializing_shards" : 0,
"unassigned_shards" : 2
},
"logstash-2015.06.11" : {
"status" : "green",
"number_of_shards" : 5,
"number_of_replicas" : 1,
"active_primary_shards" : 5,
"active_shards" : 10,
"relocating_shards" : 0,
"initializing_shards" : 0,
"unassigned_shards" : 0
},
"logstash-2015.05.31" : {
"status" : "green",
"number_of_shards" : 5,
"number_of_replicas" : 1,
"active_primary_shards" : 5,
"active_shards" : 10,
"relocating_shards" : 0,
"initializing_shards" : 0,
"unassigned_shards" : 0
},
"logstash-2015.06.12" : {
"status" : "green",
"number_of_shards" : 5,
"number_of_replicas" : 1,
"active_primary_shards" : 5,
"active_shards" : 10,
"relocating_shards" : 0,
"initializing_shards" : 0,
"unassigned_shards" : 0
},
"logstash-2015.06.13" : {
"status" : "yellow",
"number_of_shards" : 5,
"number_of_replicas" : 1,
"active_primary_shards" : 5,
"active_shards" : 7,
"relocating_shards" : 0,
"initializing_shards" : 3,
"unassigned_shards" : 0
},
"logstash-2015.06.14" : {
"status" : "yellow",
"number_of_shards" : 5,
"number_of_replicas" : 1,
"active_primary_shards" : 5,
"active_shards" : 8,
"relocating_shards" : 0,
"initializing_shards" : 2,
"unassigned_shards" : 0
},
"kibana-int" : {
"status" : "green",
"number_of_shards" : 5,
"number_of_replicas" : 1,
"active_primary_shards" : 5,
"active_shards" : 10,
"relocating_shards" : 0,
"initializing_shards" : 0,
"unassigned_shards" : 0
},
"logstash-2015.06.10" : {
"status" : "green",
"number_of_shards" : 5,
"number_of_replicas" : 1,
"active_primary_shards" : 5,
"active_shards" : 10,
"relocating_shards" : 0,
"initializing_shards" : 0,
"unassigned_shards" : 0
}
}
}
Last edited by tmcdonald on Mon Jun 29, 2015 3:29 pm, edited 1 time in total.
Reason: Please wrap your long output in [code][/code] tags for readability
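Output this long is easier to read once summarized. The sketch below derives a green/yellow/red state for each index from its shard counts, following the definitions given earlier in the thread, and lists the indices that are not green. The function names are made up for this example, and SAMPLE contains only three of the indices from the output above.

```python
def expected_status(index):
    """Derive green/yellow/red for one index from its shard counts."""
    primaries = index["number_of_shards"]
    total_copies = primaries * (1 + index["number_of_replicas"])
    if index["active_primary_shards"] < primaries:
        return "red"      # at least one primary shard is not active
    if index["active_shards"] < total_copies:
        return "yellow"   # primaries are fine, at least one replica is not active
    return "green"


def summarize(health):
    """Return (sorted non-green index names, total unassigned shards)."""
    indices = health.get("indices", {})
    non_green = sorted(name for name, idx in indices.items()
                       if idx["status"] != "green")
    unassigned = sum(idx["unassigned_shards"] for idx in indices.values())
    return non_green, unassigned


# Three entries copied from the level=indices output above.
SAMPLE = {
    "indices": {
        "nagioslogserver": {
            "status": "yellow", "number_of_shards": 1, "number_of_replicas": 1,
            "active_primary_shards": 1, "active_shards": 1, "unassigned_shards": 1},
        "logstash-2015.06.19": {
            "status": "yellow", "number_of_shards": 5, "number_of_replicas": 1,
            "active_primary_shards": 5, "active_shards": 9, "unassigned_shards": 1},
        "nagioslogserver_log": {
            "status": "green", "number_of_shards": 5, "number_of_replicas": 1,
            "active_primary_shards": 5, "active_shards": 10, "unassigned_shards": 0},
    }
}

print(summarize(SAMPLE))  # → (['logstash-2015.06.19', 'nagioslogserver'], 2)
```

For these three entries, `expected_status` agrees with the "status" value Elasticsearch reported, which matches the explanation that yellow here means missing replicas rather than missing data.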
Re: Monitoring Nagios Log Server
The issue lies in the numbers below. I notice that the unassigned shard count has decreased from the metric you posted earlier.
Latest output:
"initializing_shards" : 8,
"unassigned_shards" : 50,
Earlier post:
Initializing Shards 8
Unassigned Shards 107
This is a good sign. It means that the shards without homes are being assigned to instances of Nagios Log Server properly. Keep an eye on your cluster - if the health isn't green in a day or so, I want you to show us another capture of the index status:
Code: Select all
curl 'localhost:9200/_cluster/health?level=indices&pretty'
Re: Monitoring Nagios Log Server
Got it, thanks. Do you know what triggers this behavior? And how can I prevent it?
jolson wrote:The issue lies in the numbers below. I notice that the unassigned shard count has decreased from the metric you posted earlier.
Latest output:
"initializing_shards" : 8,
"unassigned_shards" : 50,
Earlier post:
Initializing Shards 8
Unassigned Shards 107
This is a good sign. It means that the shards without homes are being assigned to instances of Nagios Log Server properly. Keep an eye on your cluster - if the health isn't green in a day or so, I want you to show us another capture of the index status:
Code: Select all
curl 'localhost:9200/_cluster/health?level=indices&pretty'
For now, as long as the 'unassigned shards' number is going down, we're on track to a green cluster state. Ultimately this means that your shards are moving between your instances for load balance and availability purposes - this movement takes some time.
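The "unassigned shards number is going down" criterion can be checked mechanically by comparing two health snapshots taken some time apart. The following is a minimal sketch under that assumption; the function name and the "recovering" / "stalled" / "done" labels are invented for the example, and the two snapshots use the numbers posted in this thread.

```python
def recovery_trend(previous, current):
    """Compare two health snapshots; return (label, drop in unassigned shards)."""
    delta = previous["unassigned_shards"] - current["unassigned_shards"]
    if current["unassigned_shards"] == 0 and current.get("initializing_shards", 0) == 0:
        return "done", delta        # nothing left to assign or initialize
    if delta > 0:
        return "recovering", delta  # unassigned count is going down
    return "stalled", delta         # no progress (or the count went up)


earlier = {"initializing_shards": 8, "unassigned_shards": 107}
later = {"initializing_shards": 8, "unassigned_shards": 50}
print(recovery_trend(earlier, later))  # → ('recovering', 57)
```

A "stalled" result across several consecutive checks would be the cue to capture the index status again, as requested above.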