Adding a node, Elasticsearch and Logstash down in GUI

This support forum board is for support questions relating to Nagios Log Server, our solution for managing and monitoring critical log data.
polarbear1
Posts: 73
Joined: Mon Apr 13, 2015 4:26 pm

Adding a node, Elasticsearch and Logstash down in GUI

Post by polarbear1 »

Hi,

Background:
I added a 2nd node and it worked fine. Then I started messing with the elasticsearch config and broke it to the point where it would not recover, so I ended up deleting everything and reinstalling from scratch. I was able to add the new node and it shows up in the GUI, but the GUI reports both elasticsearch and logstash as down. Checking the machine itself shows that both services are running fine.
status.PNG
I am able to access the web UI from either hostname and it shows the same story on both. If nag2 (the new node) really thought elasticsearch was down, wouldn't it take me to an error page? I tried restarting (just the services, and then the whole box) with no luck. Also note the lack of a "delete" (garbage can) icon in the Actions column. In the System Status section of the admin screen, the Instance drop-down only offers nag1 (the original node), and the same goes for the Per Instance (Advanced) section, regardless of which hostname I access the GUI from.

I am not denying the fact that maybe I screwed up something with my deleting and reinstalling and I'm not against doing it again, just want to make sure I'm not missing anything.



EDIT - After restarting elasticsearch I do get this error:

Code: Select all

[root@schpnag2 ~]# service elasticsearch restart
Stopping elasticsearch:                                    [  OK  ]
Starting elasticsearch:                                    [  OK  ]
[root@schpnag2 ~]# Exception in thread ">output" org.elasticsearch.client.transport.NoNodeAvailableException: No node available
        at org.elasticsearch.client.transport.TransportClientNodesService.execute(org/elasticsearch/client/transport/TransportClientNodesService.java:219)
        at org.elasticsearch.client.transport.support.InternalTransportIndicesAdminClient.execute(org/elasticsearch/client/transport/support/InternalTransportIndicesAdminClient.java:85)
        at org.elasticsearch.client.support.AbstractIndicesAdminClient.getTemplates(org/elasticsearch/client/support/AbstractIndicesAdminClient.java:544)
        at org.elasticsearch.action.admin.indices.template.get.GetIndexTemplatesRequestBuilder.doExecute(org/elasticsearch/action/admin/indices/template/get/GetIndexTemplatesRequestBuilder.java:41)
        at org.elasticsearch.action.ActionRequestBuilder.execute(org/elasticsearch/action/ActionRequestBuilder.java:85)
        at org.elasticsearch.action.ActionRequestBuilder.execute(org/elasticsearch/action/ActionRequestBuilder.java:59)
        at org.elasticsearch.action.ActionRequestBuilder.get(org/elasticsearch/action/ActionRequestBuilder.java:67)
        at java.lang.reflect.Method.invoke(java/lang/reflect/Method.java:606)
        at RUBY.template_exists?(/usr/local/nagioslogserver/logstash/lib/logstash/outputs/elasticsearch/protocol.rb:231)
        at RUBY.template_install(/usr/local/nagioslogserver/logstash/lib/logstash/outputs/elasticsearch/protocol.rb:21)
        at RUBY.register(/usr/local/nagioslogserver/logstash/lib/logstash/outputs/elasticsearch.rb:259)
        at org.jruby.RubyArray.each(org/jruby/RubyArray.java:1613)
        at RUBY.outputworker(/usr/local/nagioslogserver/logstash/lib/logstash/pipeline.rb:220)
        at RUBY.start_outputs(/usr/local/nagioslogserver/logstash/lib/logstash/pipeline.rb:152)
        at java.lang.Thread.run(java/lang/Thread.java:745)
jolson
Attack Rabbit
Posts: 2560
Joined: Thu Feb 12, 2015 12:40 pm

Re: Adding a node, Elasticsearch and Logstash down in GUI

Post by jolson »

Please run the following on both nodes:

Code: Select all

cat /usr/local/nagioslogserver/var/cluster_hosts
cat /usr/local/nagioslogserver/var/cluster_uuid
cat /usr/local/nagioslogserver/logstash/etc/conf.d/999_outputs.conf
From the above we should be able to tell where things might be going wrong.
Twits Blog
Show me a man who lives alone and has a perpetually clean kitchen, and 8 times out of 9 I'll show you a man with detestable spiritual qualities.
polarbear1
Posts: 73
Joined: Mon Apr 13, 2015 4:26 pm

Re: Adding a node, Elasticsearch and Logstash down in GUI

Post by polarbear1 »

Node 1

Code: Select all

[root@schpnag1 ~]# cat /usr/local/nagioslogserver/var/cluster_hosts
localhost
192.168.1.175
192.168.1.249

[root@schpnag1 ~]# cat /usr/local/nagioslogserver/var/cluster_uuid
4f703585-84ab-40e0-9ff9-f72c904bdc38

[root@schpnag1 ~]# cat /usr/local/nagioslogserver/logstash/etc/conf.d/999_outputs.conf
#
# Logstash Configuration File
# Dynamically created by Nagios Log Server
#
# DO NOT EDIT THIS FILE. IT WILL BE OVERWRITTEN.
#
# Created Wed, 04 Mar 2015 15:27:23 -0600
#

#
# Required output for Nagios Log Server
#

output {
    elasticsearch {
        cluster => '4f703585-84ab-40e0-9ff9-f72c904bdc38'
        host => 'localhost'
        index_type => '%{type}'
        node_name => ''
        protocol => 'transport'
        workers => 4
    }
}

#
# Global outputs
#



#
# Local outputs
#



Node 2

Code: Select all

[root@schpnag2 ~]# cat /usr/local/nagioslogserver/var/cluster_hosts
localhost

schpnag1
[root@schpnag2 ~]# cat /usr/local/nagioslogserver/var/cluster_uuid
4f703585-84ab-40e0-9ff9-f72c904bdc38

[root@schpnag2 ~]# cat /usr/local/nagioslogserver/logstash/etc/conf.d/999_outputs.conf
#
# Logstash Configuration File
# Dynamically created by Nagios Log Server
#
# DO NOT EDIT THIS FILE. IT WILL BE OVERWRITTEN.
#
# Created Tue, 14 Jul 2015 13:55:36 -0500
#

#
# Required output for Nagios Log Server
#

output {
    elasticsearch {
        cluster => 'c3003aac-586b-4be2-8581-e73938592447'
        host => 'localhost'
        index_type => '%{type}'
        node_name => '843eb4bb-fb4a-4166-9f69-a1cfd529a18d'
        protocol => 'transport'
        workers => 4
    }
}

#
# Global outputs
#



#
# Local outputs
#



So no big surprise there, something about how the cluster was set up for node 2 seems to be off...
jolson
Attack Rabbit
Posts: 2560
Joined: Thu Feb 12, 2015 12:40 pm

Re: Adding a node, Elasticsearch and Logstash down in GUI

Post by jolson »

[root@schpnag2 ~]# cat /usr/local/nagioslogserver/var/cluster_hosts
localhost

This throws me off. Please modify this file to include localhost, the local IP, and the IP of the other node, the same way your other node is configured:

cat /usr/local/nagioslogserver/var/cluster_hosts
localhost
192.168.1.175
192.168.1.249

The following is also interesting. Node 1:

output {
elasticsearch {
cluster => '4f703585-84ab-40e0-9ff9-f72c904bdc38'

Node 2:

output {
elasticsearch {
cluster => 'c3003aac-586b-4be2-8581-e73938592447'

Be sure that the 'cluster' field in this file is set to the cluster UUID (in this case, the proper UUID is likely 4f703585-84ab-40e0-9ff9-f72c904bdc38).

After changing those two things, you should see the Web GUI respond more politely. :)
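
If you would rather not compare the files by eye, here is a minimal Python sketch of those two checks. The function names (read_setting, check_node) are made up for illustration and are not part of Nagios Log Server, and the sample values are simply copied from the node 2 output posted above. On a real box you would read the three files from disk instead of using the inline strings.

```python
import re


def read_setting(conf_text, key):
    """Return the quoted value of `key => '...'` from the config text, or None."""
    pattern = r"^\s*" + re.escape(key) + r"\s*=>\s*'([^']*)'"
    match = re.search(pattern, conf_text, re.MULTILINE)
    return match.group(1) if match else None


def check_node(conf_text, cluster_uuid, hosts_text, expected_hosts):
    """List human-readable problems; an empty list means both checks passed."""
    problems = []
    wanted = cluster_uuid.strip()
    cluster = read_setting(conf_text, "cluster")
    if cluster != wanted:
        problems.append(f"cluster is {cluster!r}, expected {wanted!r}")
    # Ignore blank lines and surrounding whitespace in cluster_hosts.
    hosts = {line.strip() for line in hosts_text.splitlines() if line.strip()}
    missing = [h for h in expected_hosts if h not in hosts]
    if missing:
        problems.append("cluster_hosts is missing: " + ", ".join(missing))
    return problems


# Sample values copied from the node 2 output earlier in this thread.
NODE2_CONF = """output {
    elasticsearch {
        cluster => 'c3003aac-586b-4be2-8581-e73938592447'
        host => 'localhost'
    }
}"""
NODE2_HOSTS = "localhost\n\nschpnag1"
CLUSTER_UUID = "4f703585-84ab-40e0-9ff9-f72c904bdc38\n"
EXPECTED_HOSTS = ["localhost", "192.168.1.175", "192.168.1.249"]

for problem in check_node(NODE2_CONF, CLUSTER_UUID, NODE2_HOSTS, EXPECTED_HOSTS):
    print(problem)
```

With the values above it should report both the mismatched cluster UUID and the two missing IPs. Once the files are corrected the list should come back empty.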
polarbear1
Posts: 73
Joined: Mon Apr 13, 2015 4:26 pm

Re: Adding a node, Elasticsearch and Logstash down in GUI

Post by polarbear1 »

Made the changes, and did a full system restart. No dice.

NODE 2 (after the changes):

Code: Select all

[root@schpnag2 ~]# cat /usr/local/nagioslogserver/logstash/etc/conf.d/999_outputs.conf
#
# Logstash Configuration File
# Dynamically created by Nagios Log Server
#
# DO NOT EDIT THIS FILE. IT WILL BE OVERWRITTEN.
#
# Created Tue, 14 Jul 2015 13:55:36 -0500
#

#
# Required output for Nagios Log Server
#

output {
    elasticsearch {
        cluster => '4f703585-84ab-40e0-9ff9-f72c904bdc38'
        host => 'localhost'
        index_type => '%{type}'
        node_name => '843eb4bb-fb4a-4166-9f69-a1cfd529a18d'
        protocol => 'transport'
        workers => 4
    }
}

#
# Global outputs
#



#
# Local outputs
#

Code: Select all

[root@schpnag2 ~]# cat /usr/local/nagioslogserver/var/cluster_hosts
localhost
192.168.1.175
192.168.1.249

One thing I didn't notice earlier is that on NODE 1 the node_name field is blank:

Code: Select all

[root@schpnag1 ~]# cat /usr/local/nagioslogserver/logstash/etc/conf.d/999_outputs.conf
#
# Logstash Configuration File
# Dynamically created by Nagios Log Server
#
# DO NOT EDIT THIS FILE. IT WILL BE OVERWRITTEN.
#
# Created Wed, 04 Mar 2015 15:27:23 -0600
#

#
# Required output for Nagios Log Server
#

output {
    elasticsearch {
        cluster => '4f703585-84ab-40e0-9ff9-f72c904bdc38'
        host => 'localhost'
        index_type => '%{type}'
        node_name => ''
        protocol => 'transport'
        workers => 4
    }
}

#
# Global outputs
#



#
# Local outputs
#
jolson
Attack Rabbit
Posts: 2560
Joined: Thu Feb 12, 2015 12:40 pm

Re: Adding a node, Elasticsearch and Logstash down in GUI

Post by jolson »

One thing I didn't notice earlier is that on NODE 1 the node_name field is blank:
Good catch. Ensure that the node_name field matches up with what you see in the node_uuid file:

node_name should be set to the value of:

Code: Select all

cat /usr/local/nagioslogserver/var/node_uuid
After making that change, restart logstash on node 1. Does the behavior change?
polarbear1
Posts: 73
Joined: Mon Apr 13, 2015 4:26 pm

Re: Adding a node, Elasticsearch and Logstash down in GUI

Post by polarbear1 »

Fixed NODE 1 to reflect the node name. As far as I can tell all 3 files on the 2 nodes now reflect each other, and both nodes have been rebooted. Still no go. What are some other possible variables that might be causing this?

Code: Select all

[root@schpnag1 ~]# cat /usr/local/nagioslogserver/logstash/etc/conf.d/999_outputs.conf
#
# Logstash Configuration File
# Dynamically created by Nagios Log Server
#
# DO NOT EDIT THIS FILE. IT WILL BE OVERWRITTEN.
#
# Created Wed, 04 Mar 2015 15:27:23 -0600
#

#
# Required output for Nagios Log Server
#

output {
    elasticsearch {
        cluster => '4f703585-84ab-40e0-9ff9-f72c904bdc38'
        host => 'localhost'
        index_type => '%{type}'
        node_name => 'ea9ddcd0-c0a5-4d5d-a802-e741d9c51a5b'
        protocol => 'transport'
        workers => 4
    }
}

#
# Global outputs
#



#
# Local outputs
#

jolson
Attack Rabbit
Posts: 2560
Joined: Thu Feb 12, 2015 12:40 pm

Re: Adding a node, Elasticsearch and Logstash down in GUI

Post by jolson »

I can't think of a reason this might be happening. Let's collect some additional information.

Run the following on both nodes:

Code: Select all

curl 'localhost:9200/_cat/master?v'
Run the following on one node:

Code: Select all

curl 'localhost:9200/_cat/nodes?v'
curl 'localhost:9200/_cat/pending_tasks?v'
curl -XGET 'localhost:9200/_cat/recovery?v'
curl -XGET 'http://localhost:9200/_cluster/health?pretty=true'
polarbear1
Posts: 73
Joined: Mon Apr 13, 2015 4:26 pm

Re: Adding a node, Elasticsearch and Logstash down in GUI

Post by polarbear1 »

Same on both nodes

Code: Select all

[root@schpnag1 ~]# curl 'localhost:9200/_cat/master?v'
id                     host     ip            node
qF8dekxASSKDwhE39PDwjg schpnag1 192.168.1.175 ea9ddcd0-c0a5-4d5d-a802-e741d9c51a5b

[root@schpnag2 ~]# curl 'localhost:9200/_cat/master?v'
id                     host     ip            node
qF8dekxASSKDwhE39PDwjg schpnag1 192.168.1.175 ea9ddcd0-c0a5-4d5d-a802-e741d9c51a5b

Code: Select all

[root@schpnag1 ~]# curl 'localhost:9200/_cat/nodes?v'
host     ip            heap.percent ram.percent load node.role master name
schpnag1 192.168.1.175           36          61 0.52 d         *      ea9ddcd0-c0a5-4d5d-a802-e741d9c51a5b
schpnag2 127.0.0.1               37          60 0.06 d         m      843eb4bb-fb4a-4166-9f69-a1cfd529a18d

[root@schpnag1 ~]# curl 'localhost:9200/_cat/pending_tasks?v'
insertOrder timeInQueue priority source

[root@schpnag1 ~]# curl -XGET 'http://localhost:9200/_cluster/health?pretty=true'
{
  "cluster_name" : "4f703585-84ab-40e0-9ff9-f72c904bdc38",
  "status" : "yellow",
  "timed_out" : false,
  "number_of_nodes" : 2,
  "number_of_data_nodes" : 2,
  "active_primary_shards" : 726,
  "active_shards" : 1451,
  "relocating_shards" : 0,
  "initializing_shards" : 0,
  "unassigned_shards" : 1
}
The recovery output was too long to insert in the message, so I added it as an attachment.
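
For anyone who wants to pick the output above apart programmatically, here is a rough Python sketch. It only parses the text pasted in this post and does not talk to Elasticsearch. The helper names (parse_cat_table, loopback_nodes) are made up for illustration, and the whitespace split assumes no column is empty. One thing it surfaces is that schpnag2 is listed with 127.0.0.1 rather than its LAN address. Whether that has anything to do with the GUI problem is not confirmed anywhere in this thread.

```python
import ipaddress
import json


def parse_cat_table(text):
    """Turn whitespace-separated _cat output (header row first) into a list of dicts."""
    rows = [line.split() for line in text.strip().splitlines()]
    header, body = rows[0], rows[1:]
    return [dict(zip(header, row)) for row in body]


def loopback_nodes(nodes):
    """Return the host names of nodes whose listed IP is a loopback address."""
    return [n["host"] for n in nodes if ipaddress.ip_address(n["ip"]).is_loopback]


# Output copied from the _cat/nodes and _cluster/health calls in this post.
CAT_NODES = """host     ip            heap.percent ram.percent load node.role master name
schpnag1 192.168.1.175           36          61 0.52 d         *      ea9ddcd0-c0a5-4d5d-a802-e741d9c51a5b
schpnag2 127.0.0.1               37          60 0.06 d         m      843eb4bb-fb4a-4166-9f69-a1cfd529a18d
"""
HEALTH = """{
  "cluster_name" : "4f703585-84ab-40e0-9ff9-f72c904bdc38",
  "status" : "yellow",
  "timed_out" : false,
  "number_of_nodes" : 2,
  "number_of_data_nodes" : 2,
  "active_primary_shards" : 726,
  "active_shards" : 1451,
  "relocating_shards" : 0,
  "initializing_shards" : 0,
  "unassigned_shards" : 1
}"""

nodes = parse_cat_table(CAT_NODES)
health = json.loads(HEALTH)

print("nodes listed with a loopback IP:", loopback_nodes(nodes))
print("cluster status:", health["status"])
print("unassigned shards:", health["unassigned_shards"])
```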
jolson
Attack Rabbit
Posts: 2560
Joined: Thu Feb 12, 2015 12:40 pm

Re: Adding a node, Elasticsearch and Logstash down in GUI

Post by jolson »

Any interesting data from the following logs on either node?

Code: Select all

cat /var/log/elasticsearch/*.log
cat /var/log/logstash/logstash.log
tail -n20 /var/log/httpd/error_log
tail -n20 /var/log/httpd/access_log
tail -f /usr/local/nagioslogserver/var/jobs.log
tail -f /usr/local/nagioslogserver/var/poller.log
Please note that poller.log and jobs.log will need to be tailed for a few minutes to see any good output.
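
If the logs are large, a small filter can help narrow them down before posting. The following is only a sketch: the keyword list is an assumption about what counts as "interesting", the function name interesting_lines is made up, and the sample lines are taken from the restart output in the first post rather than from a real log file.

```python
KEYWORDS = ("error", "warn", "exception")  # assumed keywords, adjust to taste


def interesting_lines(lines, keywords=KEYWORDS):
    """Return (line_number, text) pairs for lines containing any keyword, ignoring case."""
    hits = []
    for number, line in enumerate(lines, start=1):
        lowered = line.lower()
        if any(keyword in lowered for keyword in keywords):
            hits.append((number, line.rstrip("\n")))
    return hits


# Sample lines taken from the restart output in the first post of this thread.
SAMPLE = [
    "Starting elasticsearch:                                    [  OK  ]\n",
    'Exception in thread ">output" org.elasticsearch.client.transport.NoNodeAvailableException: No node available\n',
    "        at java.lang.Thread.run(java/lang/Thread.java:745)\n",
]

for number, text in interesting_lines(SAMPLE):
    print(f"{number}: {text}")
```

To run it against a real file, pass an open file object in place of SAMPLE, for example the handle returned by open("/var/log/logstash/logstash.log").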