Hi,
I am planning to use Nagios Core to monitor Nagios Log Server. Is there a way to monitor the Elasticsearch cluster itself using Nagios Core, other than monitoring the services?
For example:
I want to monitor the cluster status. Whenever it becomes "yellow" or "red", I want to generate an alert via Nagios Core. How can this be done?
Likewise, I want to monitor all the possible scenarios regarding Elasticsearch.
Can somebody give me a little guidance, please?
Thanks
Luke.
Nagios core to monitor nagios log server
Re: Nagios core to monitor nagios log server
Hello Luke,
Thanks for reaching out. I understand that you would like to monitor your Nagios Log Server cluster from Nagios Core.
I looked into the 'check_es_system.sh' plugin from the Nagios Exchange. I downloaded the script and added it to my /usr/local/nagios/libexec/ plugin directory.
Running the script from the command line shows the check types you can select with the -t switch, along with the other options.
To trigger notifications on the cluster status, add the switches -t status -w 1 -c 2 to the check command. The script defines its exit codes as follows:
Code: Select all
#Variables and defaults
STATE_OK=0 # define the exit code if status is OK
STATE_WARNING=1 # define the exit code if status is Warning
STATE_CRITICAL=2 # define the exit code if status is Critical
STATE_UNKNOWN=3 # define the exit code if status is Unknown
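As a rough sketch of how these exit codes come into play, a small wrapper could map the cluster status string to the matching code. The function name es_status_to_exit is hypothetical and is not part of check_es_system.sh; the mapping (yellow to warning, red to critical) is an assumption based on the original question.

```shell
#!/bin/sh
# Hypothetical helper (not part of check_es_system.sh): map an Elasticsearch
# cluster status to the standard Nagios plugin exit codes.
STATE_OK=0
STATE_WARNING=1
STATE_CRITICAL=2
STATE_UNKNOWN=3

es_status_to_exit() {
    # $1 = cluster status string as reported by Elasticsearch
    case "$1" in
        green)  echo "OK - cluster status is green";       return "$STATE_OK" ;;
        yellow) echo "WARNING - cluster status is yellow"; return "$STATE_WARNING" ;;
        red)    echo "CRITICAL - cluster status is red";   return "$STATE_CRITICAL" ;;
        *)      echo "UNKNOWN - unexpected status '$1'";   return "$STATE_UNKNOWN" ;;
    esac
}
```

Nagios Core only looks at the exit code and the first line of output, so a script that ends with es_status_to_exit "$status"; exit $? would raise a warning on yellow and a critical on red.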
Some notes to pay attention to:
- Make sure that your path is added to the export PATH=$PATH line in the script.
- I found that I had to install the 'jq' or 'jshon' JSON parser as well, through apt or yum.
- You may also need some config updates in elasticsearch.yml, depending on your environment. Test with the curl command:
Code: Select all
curl -X GET http://[ipaddress]:9200
Code: Select all
./check_es_system.sh -H [myipaddress] -t status
ES SYSTEM OK - Elasticsearch Cluster "a59627d3-3649-4445-b4e4-7a947e74975d" is green (2 nodes, null data nodes, 184 shards, 16421343 docs)|total_nodes=2;;;; data_nodes=null;;;; total_shards=184;;;; relocating_shards=0;;;; initializing_shards=0;;;; unassigned_shards=0;;;; docs=16421343;;;;
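Everything after the | in that output is performance data, which can also be picked apart in a script. A minimal sketch, assuming the label=value;;;; layout shown above; the helper name perf_value is made up for illustration:

```shell
#!/bin/sh
# Hypothetical helper: print the value of one performance-data label from a
# Nagios plugin output line ("text|label=value;warn;crit;min;max ...").
perf_value() {
    # $1 = full plugin output line, $2 = perfdata label to look up
    printf '%s\n' "$1" \
        | sed 's/^[^|]*|//' \
        | tr ' ' '\n' \
        | sed -n "s/^$2=\([^;]*\).*/\1/p"
}
```

For example, perf_value "$(./check_es_system.sh -H [myipaddress] -t status)" unassigned_shards would print only the unassigned shard count.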
Thanks,
Perry
Re: Nagios core to monitor nagios log server
Here are a couple of curl requests to get the cluster status. I imagine you can replace localhost with the FQDN of the host, use check_http, and look for "green" in the output.
I'd also check the status of the httpd, logstash, and elasticsearch services with check_init_service.
$ curl -X GET 'http://localhost:9200/_cluster/health?pretty'
{
"cluster_name" : "14ea04c0-cc4f-44d4-96e2-1e731227a935",
"status" : "green",
"timed_out" : false,
"number_of_nodes" : 3,
"number_of_data_nodes" : 3,
"active_primary_shards" : 156,
"active_shards" : 312,
"relocating_shards" : 0,
"initializing_shards" : 0,
"unassigned_shards" : 0,
"delayed_unassigned_shards" : 0,
"number_of_pending_tasks" : 0,
"number_of_in_flight_fetch" : 0
}
$ curl 'localhost:9200/_cat/health?v'
epoch timestamp cluster status node.total node.data shards pri relo init unassign pending_tasks
1596151687 23:28:07 14ea04c0-cc4f-44d4-96e2-1e731227a935 green 3 3 312 156 0 0 0 0
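If you go the scripted route with the _cat/health output, something like the following could pull out just the status column. This is only a sketch: the function name cat_health_status is hypothetical, and it assumes the header row shown above is present (the ?v parameter).

```shell
#!/bin/sh
# Hypothetical helper: read `_cat/health?v` output on stdin and print the value
# in the "status" column, located by header name rather than fixed position.
cat_health_status() {
    awk 'NR == 1 { for (i = 1; i <= NF; i++) if ($i == "status") col = i; next }
         NR == 2 && col { print $col }'
}
```

Usage would look like curl -s 'localhost:9200/_cat/health?v' | cat_health_status. Looking the column up by name keeps the script working if the column order differs between Elasticsearch versions.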
Re: Nagios core to monitor nagios log server
Hello Luke,
Thanks for following up. You are correct that you could implement a script that outputs alerts based on matching arguments in the plugin.
For a basic example, you could use the following line and add a filter to match on the output:
For a simple filter: curl -v 'http://localhost:9200/_cluster/health?' 2>&1 | grep -Eo 'green|yellow|red'
Output of curl -v 'http://localhost:9200/_cluster/health?' 2>&1:
Code: Select all
{"cluster_name":"a59627d3-3649-4445-b4e4-7a947e74975d","status":"green","timed_out":false,"number_of_nodes":2,"number_of_data_nodes":2,"active_primary_shards":97,"active_shards":194,"relocating_shards":0,"initializing_shards":0,"unassigned_shards":0,"delayed_unassigned_shards":0,"number_of_pending_tasks":0,"number_of_in_flight_fetch":0}
(there are other ways to filter the status)
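One such alternative, sketched here with plain sed so it needs no extra parser, is to match only the value of the "status" key instead of any color word in the output. The function name es_json_status is hypothetical; where jq is installed, jq -r '.status' would do the same job.

```shell
#!/bin/sh
# Hypothetical helper: read the _cluster/health JSON on stdin and print only the
# value of the "status" key, so color words elsewhere in the output cannot match.
es_json_status() {
    sed -n 's/.*"status"[[:space:]]*:[[:space:]]*"\([a-z]*\)".*/\1/p' | head -n 1
}
```

Usage would look like curl -s 'http://localhost:9200/_cluster/health' | es_json_status. It handles both the compact output above and the ?pretty form, since it only depends on the "status" key and its quoted value.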
thanks,
Perry
Re: Nagios core to monitor nagios log server
Thank you so much for all your comments and support. I have started on the configuration and will update you soon.
Regards
Luke.