Re: Monitoring Nagios Log Server
Posted: Wed Jun 24, 2015 1:18 pm
Support for Nagios products and services
https://support.nagios.com/forum/
jolson wrote:By default, Nagios Log Server won't allow you to query that information from the outside. Your best bet is to use a plugin like NRPE to perform local queries.
For instance, the following will return proper Java results if you run it on Nagios Log Server:
Code: Select all
curl -XGET 'localhost:9200/_nodes/jvm?pretty'
Basic health check:
Code: Select all
curl -XGET 'http://localhost:9200/_cluster/health?pretty=true'
These queries don't work from the outside for security purposes.
You could easily use a plugin like NRPE to launch these queries locally - if that's something you're interested in getting data on. Otherwise, I can recommend the following plugin: https://github.com/anchor/nagios-plugin-elasticsearch
Yes, I can use NRPE to get the JSON and parse it. Are there any other URIs I could use? For example, what checks do you use for the status of Elasticsearch and Logstash, where you guys report green or red? Is the status from the URL 'http://localhost:9200/_cluster/health?pretty=true' the indicator?
Code: Select all
http://192.168.x.x/nagioslogserver/index.php/api/system/status?subsystem=logstash&token=xxxxxxx
Hint of September's talk: We're using check_procs to check for running processes via NRPE....
I believe that is the correct way to go about monitoring the processes. It allows for future expansion (monitor other processes/use different plugins on the same server), and would likely be more reliable than using the API calls.
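For anyone who wants to wrap the local health query in their own NRPE check, here is a minimal sketch in Python. It is not the plugin linked above and not how Nagios Log Server computes its own status. The green/yellow/red to OK/WARNING/CRITICAL mapping and the names evaluate, fetch_health and main are assumptions made for illustration; only the exit codes (0, 1, 2, 3) are the standard Nagios plugin convention.

```python
import json
import urllib.request

# Standard Nagios plugin exit codes.
OK, WARNING, CRITICAL, UNKNOWN = 0, 1, 2, 3

# Assumed mapping from Elasticsearch cluster status to Nagios state.
STATUS_TO_STATE = {"green": OK, "yellow": WARNING, "red": CRITICAL}
LABELS = {OK: "OK", WARNING: "WARNING", CRITICAL: "CRITICAL", UNKNOWN: "UNKNOWN"}


def evaluate(health):
    """Return (exit_code, message) for a parsed _cluster/health response."""
    status = health.get("status")
    code = STATUS_TO_STATE.get(status, UNKNOWN)
    message = "{0} - cluster status is {1}, {2} unassigned shards".format(
        LABELS[code], status, health.get("unassigned_shards", "?"))
    return code, message


def fetch_health(url="http://localhost:9200/_cluster/health", timeout=10):
    """Query the health endpoint locally; it is not reachable from outside."""
    with urllib.request.urlopen(url, timeout=timeout) as response:
        return json.loads(response.read().decode("utf-8"))


def main():
    """Print one status line and return the Nagios exit code."""
    try:
        code, message = evaluate(fetch_health())
    except (OSError, ValueError) as exc:
        # Connection failures and malformed JSON both end up as UNKNOWN.
        code, message = UNKNOWN, "UNKNOWN - could not read cluster health: {0}".format(exc)
    print(message)
    return code
```

To use it as a plugin you would save it on the Log Server instance, end the script with sys.exit(main()) (after importing sys), and point an NRPE command at it. Keeping evaluate separate from the HTTP call makes the mapping easy to adjust, for example if you would rather treat yellow as OK while shards are still being assigned.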
Is this due to unassigned shards? How is the warning state determined?
This is certainly due to the uninitialized shards. Cluster health states are described as follows:
green
All primary and replica shards are allocated. Your cluster is 100% operational.
yellow
All primary shards are allocated, but at least one replica is missing. No data is missing, so search results will still be complete. However, your high availability is compromised to some degree. If more shards disappear, you might lose data. Think of yellow as a warning that should prompt investigation.
red
At least one primary shard (and all of its replicas) is missing. This means that you are missing data: searches will return partial results, and indexing into that shard will return an exception.
Let's take a look at your cluster health in more detail. Please run the following on your CLI and return the results to us:
Code: Select all
curl 'localhost:9200/_cluster/health?level=indices&pretty'
Code: Select all
{
"cluster_name" : "xxxxxxxxxxxxxxxxxxxxxxxxxx",
"status" : "yellow",
"timed_out" : false,
"number_of_nodes" : 4,
"number_of_data_nodes" : 4,
"active_primary_shards" : 161,
"active_shards" : 264,
"relocating_shards" : 0,
"initializing_shards" : 8,
"unassigned_shards" : 50,
"indices" : {
"nagioslogserver" : {
"status" : "yellow",
"number_of_shards" : 1,
"number_of_replicas" : 1,
"active_primary_shards" : 1,
"active_shards" : 1,
"relocating_shards" : 0,
"initializing_shards" : 0,
"unassigned_shards" : 1
},
"logstash-2015.06.19" : {
"status" : "yellow",
"number_of_shards" : 5,
"number_of_replicas" : 1,
"active_primary_shards" : 5,
"active_shards" : 9,
"relocating_shards" : 0,
"initializing_shards" : 0,
"unassigned_shards" : 1
},
"logstash-2015.06.28" : {
"status" : "yellow",
"number_of_shards" : 5,
"number_of_replicas" : 1,
"active_primary_shards" : 5,
"active_shards" : 5,
"relocating_shards" : 0,
"initializing_shards" : 0,
"unassigned_shards" : 5
},
"nagioslogserver_log" : {
"status" : "green",
"number_of_shards" : 5,
"number_of_replicas" : 1,
"active_primary_shards" : 5,
"active_shards" : 10,
"relocating_shards" : 0,
"initializing_shards" : 0,
"unassigned_shards" : 0
},
"logstash-2015.06.29" : {
"status" : "yellow",
"number_of_shards" : 5,
"number_of_replicas" : 1,
"active_primary_shards" : 5,
"active_shards" : 7,
"relocating_shards" : 0,
"initializing_shards" : 0,
"unassigned_shards" : 3
},
"logstash-2015.06.26" : {
"status" : "yellow",
"number_of_shards" : 5,
"number_of_replicas" : 1,
"active_primary_shards" : 5,
"active_shards" : 5,
"relocating_shards" : 0,
"initializing_shards" : 0,
"unassigned_shards" : 5
},
"logstash-2015.06.27" : {
"status" : "yellow",
"number_of_shards" : 5,
"number_of_replicas" : 1,
"active_primary_shards" : 5,
"active_shards" : 6,
"relocating_shards" : 0,
"initializing_shards" : 0,
"unassigned_shards" : 4
},
"logstash-2015.06.24" : {
"status" : "yellow",
"number_of_shards" : 5,
"number_of_replicas" : 1,
"active_primary_shards" : 5,
"active_shards" : 7,
"relocating_shards" : 0,
"initializing_shards" : 0,
"unassigned_shards" : 3
},
"logstash-2015.06.25" : {
"status" : "yellow",
"number_of_shards" : 5,
"number_of_replicas" : 1,
"active_primary_shards" : 5,
"active_shards" : 5,
"relocating_shards" : 0,
"initializing_shards" : 0,
"unassigned_shards" : 5
},
"logstash-2015.06.22" : {
"status" : "yellow",
"number_of_shards" : 5,
"number_of_replicas" : 1,
"active_primary_shards" : 5,
"active_shards" : 7,
"relocating_shards" : 0,
"initializing_shards" : 0,
"unassigned_shards" : 3
},
"logstash-2015.06.23" : {
"status" : "yellow",
"number_of_shards" : 5,
"number_of_replicas" : 1,
"active_primary_shards" : 5,
"active_shards" : 5,
"relocating_shards" : 0,
"initializing_shards" : 0,
"unassigned_shards" : 5
},
"logstash-2015.06.03" : {
"status" : "green",
"number_of_shards" : 5,
"number_of_replicas" : 1,
"active_primary_shards" : 5,
"active_shards" : 10,
"relocating_shards" : 0,
"initializing_shards" : 0,
"unassigned_shards" : 0
},
"logstash-2015.06.20" : {
"status" : "yellow",
"number_of_shards" : 5,
"number_of_replicas" : 1,
"active_primary_shards" : 5,
"active_shards" : 6,
"relocating_shards" : 0,
"initializing_shards" : 2,
"unassigned_shards" : 2
},
"logstash-2015.06.02" : {
"status" : "green",
"number_of_shards" : 5,
"number_of_replicas" : 1,
"active_primary_shards" : 5,
"active_shards" : 10,
"relocating_shards" : 0,
"initializing_shards" : 0,
"unassigned_shards" : 0
},
"logstash-2015.06.21" : {
"status" : "yellow",
"number_of_shards" : 5,
"number_of_replicas" : 1,
"active_primary_shards" : 5,
"active_shards" : 6,
"relocating_shards" : 0,
"initializing_shards" : 0,
"unassigned_shards" : 4
},
"logstash-2015.06.01" : {
"status" : "green",
"number_of_shards" : 5,
"number_of_replicas" : 1,
"active_primary_shards" : 5,
"active_shards" : 10,
"relocating_shards" : 0,
"initializing_shards" : 0,
"unassigned_shards" : 0
},
"logstash-2015.06.07" : {
"status" : "green",
"number_of_shards" : 5,
"number_of_replicas" : 1,
"active_primary_shards" : 5,
"active_shards" : 10,
"relocating_shards" : 0,
"initializing_shards" : 0,
"unassigned_shards" : 0
},
"logstash-2015.06.06" : {
"status" : "green",
"number_of_shards" : 5,
"number_of_replicas" : 1,
"active_primary_shards" : 5,
"active_shards" : 10,
"relocating_shards" : 0,
"initializing_shards" : 0,
"unassigned_shards" : 0
},
"logstash-2015.06.05" : {
"status" : "green",
"number_of_shards" : 5,
"number_of_replicas" : 1,
"active_primary_shards" : 5,
"active_shards" : 10,
"relocating_shards" : 0,
"initializing_shards" : 0,
"unassigned_shards" : 0
},
"logstash-2015.06.04" : {
"status" : "green",
"number_of_shards" : 5,
"number_of_replicas" : 1,
"active_primary_shards" : 5,
"active_shards" : 10,
"relocating_shards" : 0,
"initializing_shards" : 0,
"unassigned_shards" : 0
},
"logstash-2015.06.09" : {
"status" : "green",
"number_of_shards" : 5,
"number_of_replicas" : 1,
"active_primary_shards" : 5,
"active_shards" : 10,
"relocating_shards" : 0,
"initializing_shards" : 0,
"unassigned_shards" : 0
},
"logstash-2015.06.08" : {
"status" : "green",
"number_of_shards" : 5,
"number_of_replicas" : 1,
"active_primary_shards" : 5,
"active_shards" : 10,
"relocating_shards" : 0,
"initializing_shards" : 0,
"unassigned_shards" : 0
},
"logstash-2015.06.15" : {
"status" : "yellow",
"number_of_shards" : 5,
"number_of_replicas" : 1,
"active_primary_shards" : 5,
"active_shards" : 7,
"relocating_shards" : 0,
"initializing_shards" : 1,
"unassigned_shards" : 2
},
"logstash-2015.06.16" : {
"status" : "yellow",
"number_of_shards" : 5,
"number_of_replicas" : 1,
"active_primary_shards" : 5,
"active_shards" : 8,
"relocating_shards" : 0,
"initializing_shards" : 0,
"unassigned_shards" : 2
},
"logstash-2015.06.17" : {
"status" : "yellow",
"number_of_shards" : 5,
"number_of_replicas" : 1,
"active_primary_shards" : 5,
"active_shards" : 7,
"relocating_shards" : 0,
"initializing_shards" : 0,
"unassigned_shards" : 3
},
"logstash-2015.06.18" : {
"status" : "yellow",
"number_of_shards" : 5,
"number_of_replicas" : 1,
"active_primary_shards" : 5,
"active_shards" : 8,
"relocating_shards" : 0,
"initializing_shards" : 0,
"unassigned_shards" : 2
},
"logstash-2015.06.11" : {
"status" : "green",
"number_of_shards" : 5,
"number_of_replicas" : 1,
"active_primary_shards" : 5,
"active_shards" : 10,
"relocating_shards" : 0,
"initializing_shards" : 0,
"unassigned_shards" : 0
},
"logstash-2015.05.31" : {
"status" : "green",
"number_of_shards" : 5,
"number_of_replicas" : 1,
"active_primary_shards" : 5,
"active_shards" : 10,
"relocating_shards" : 0,
"initializing_shards" : 0,
"unassigned_shards" : 0
},
"logstash-2015.06.12" : {
"status" : "green",
"number_of_shards" : 5,
"number_of_replicas" : 1,
"active_primary_shards" : 5,
"active_shards" : 10,
"relocating_shards" : 0,
"initializing_shards" : 0,
"unassigned_shards" : 0
},
"logstash-2015.06.13" : {
"status" : "yellow",
"number_of_shards" : 5,
"number_of_replicas" : 1,
"active_primary_shards" : 5,
"active_shards" : 7,
"relocating_shards" : 0,
"initializing_shards" : 3,
"unassigned_shards" : 0
},
"logstash-2015.06.14" : {
"status" : "yellow",
"number_of_shards" : 5,
"number_of_replicas" : 1,
"active_primary_shards" : 5,
"active_shards" : 8,
"relocating_shards" : 0,
"initializing_shards" : 2,
"unassigned_shards" : 0
},
"kibana-int" : {
"status" : "green",
"number_of_shards" : 5,
"number_of_replicas" : 1,
"active_primary_shards" : 5,
"active_shards" : 10,
"relocating_shards" : 0,
"initializing_shards" : 0,
"unassigned_shards" : 0
},
"logstash-2015.06.10" : {
"status" : "green",
"number_of_shards" : 5,
"number_of_replicas" : 1,
"active_primary_shards" : 5,
"active_shards" : 10,
"relocating_shards" : 0,
"initializing_shards" : 0,
"unassigned_shards" : 0
}
}
}
"initializing_shards" : 8,
"unassigned_shards" : 50,
The issue lies in the numbers above. I notice that this number has decreased from the metric you posted earlier:
Initializing Shards 8
Unassigned Shards 107
This is a good sign. It means that the shards without homes are being assigned to instances of Nagios Log Server properly. Keep an eye on your cluster - if the health isn't green in a day or so, I want you to show us another capture of the index status:
Code: Select all
curl 'localhost:9200/_cluster/health?level=indices&pretty'
Got it, thanks. Do you know what triggers this behavior? And how can I prevent it?
For now, as long as the 'unassigned shards' number is going down, we're on track to a green cluster state. Ultimately this means that your shards are moving between your instances for load balance and availability purposes - this movement takes some time.
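If you would rather not eyeball the full level=indices output each time, a small script can pull out the indices that still have unassigned shards and compare two captures. This is only a sketch: the function names unassigned_by_index and is_recovering are made up for illustration, and the sample data is trimmed down from the figures quoted in this thread rather than being a complete capture.

```python
import json


def unassigned_by_index(health):
    """Map index name -> unassigned shard count, for indices that have any."""
    indices = health.get("indices", {})
    return {
        name: info["unassigned_shards"]
        for name, info in indices.items()
        if info.get("unassigned_shards", 0) > 0
    }


def is_recovering(earlier, later):
    """True when the cluster-wide unassigned shard count went down."""
    return later["unassigned_shards"] < earlier["unassigned_shards"]


# Earlier capture: only the two counters quoted in the thread.
earlier = {"initializing_shards": 8, "unassigned_shards": 107}

# Later capture: a trimmed excerpt of the JSON posted above.
later = json.loads("""
{
  "status": "yellow",
  "initializing_shards": 8,
  "unassigned_shards": 50,
  "indices": {
    "nagioslogserver": {"status": "yellow", "unassigned_shards": 1},
    "logstash-2015.06.28": {"status": "yellow", "unassigned_shards": 5},
    "kibana-int": {"status": "green", "unassigned_shards": 0}
  }
}
""")

print(is_recovering(earlier, later))
for name, count in sorted(unassigned_by_index(later).items()):
    print(name, count)
```

In practice you would feed it the saved output of the curl command above from two different times, and investigate only if the unassigned count stops going down.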