Cluster 2nd Node OFF

teirekos · Post by **teirekos** » Fri Mar 20, 2015 9:34 am

OK, I have deleted the corrupted index (I have the backup anyway...).
I have also attached the shard info from both nodes, as requested.

Thanks.

tmcdonald · Post by **tmcdonald** » Fri Mar 20, 2015 3:11 pm

One possibly related GitHub issue (for our reference going forward) and a few questions/requests:

https://github.com/elastic/elasticsearch/issues/9212

What OS and version is this?
Was this a full install from source or a pre-built VM?
Please run the following and post the output: time curl 'localhost:9200/_nodes/_local/stats?pretty'
What's your Java version? /usr/bin/java -version
Please run and show us the output: cat /etc/hosts
Please (re-)run the timezone script manually: /usr/local/nagioslogserver/scripts/change_timezone.sh

scottwilkerson · Post by **scottwilkerson** » Fri Mar 20, 2015 3:55 pm

scottwilkerson wrote: teirekos,

Let's make the following change to your elasticsearch configuration, /usr/local/nagioslogserver/elasticsearch/config/elasticsearch.yml

On each instance, change this:
# discovery.zen.minimum_master_nodes: 1
to this:
discovery.zen.minimum_master_nodes: 2
Then let's restart elasticsearch on each instance:
service elasticsearch restart
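
For context on why the value is 2: the usual guidance for Elasticsearch zen discovery is to set minimum_master_nodes to a quorum of the master-eligible nodes, that is (N / 2) + 1 with integer division. That rule comes from general Elasticsearch guidance rather than from this thread, and the function name `quorum` is only illustrative. A minimal sketch:

```shell
#!/bin/sh
# quorum N: print the majority of N master-eligible nodes,
# i.e. (N / 2) + 1 using integer division.
quorum() {
    echo $(( $1 / 2 + 1 ))
}
```

For the two-node cluster discussed here, `quorum 2` gives 2, which matches the value above. Note that with a quorum of 2 on two nodes, no master can be elected while either node is down.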

Back in this post we should have set the minimum number of master nodes to 2; however, I noticed that your latest cluster health output shows:

"number_of_nodes":1

Can you run the following on each node to clarify?

grep minimum_master_nodes /usr/local/nagioslogserver/elasticsearch/config/elasticsearch.yml

Also, can we see the output of this:

curl -XGET localhost:9200/_cluster/settings

Thanks
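
To confirm that both nodes have joined after a restart, the "number_of_nodes" field can be pulled out of the cluster health response. The sketch below assumes the JSON arrives on standard input (for example from `curl -s localhost:9200/_cluster/health`); `nodes_from_health` is a hypothetical helper name and the text-based parsing is deliberately simple.

```shell
#!/bin/sh
# nodes_from_health: read cluster health JSON on stdin and print the
# numeric value of "number_of_nodes". The quoted key keeps it from
# matching "number_of_data_nodes".
nodes_from_health() {
    sed -n 's/.*"number_of_nodes"[[:space:]]*:[[:space:]]*\([0-9][0-9]*\).*/\1/p' | head -n 1
}
```

Usage on a node would look like `curl -s localhost:9200/_cluster/health | nodes_from_health`; for a healthy two-node cluster the expected value is 2.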

teirekos · Post by **teirekos** » Mon Mar 23, 2015 10:45 am

What OS and version is this?
CentOS release 6.6 (Final)

Was this a full install from source or a pre-built VM?
vSphere OVF template from the Nagios Log Server download page

Everything else is in the attached text document...

Thanx.

jolson · Post by **jolson** » Mon Mar 23, 2015 11:13 am

I am curious why the following is in your /etc/hosts file:

Node A
-------
10.1.11.10 NagiosLogServer.teiresias.gr

Node B
-------
10.1.11.11 NagiosLogServer2.teiresias.gr

Is there a reason you have added each host to its own hosts file? Was this done automatically? If not, I recommend removing that entry and restarting elasticsearch. I am currently examining the other logs that you have given us.

teirekos · Post by **teirekos** » Tue Mar 24, 2015 8:56 am

Yes, I remembered why I put this line in /etc/hosts. I say "remembered" because I removed it from my second node, rebooted it, and then tried to run the timezone script, which gave the following:
--------------------------
[root@NagiosLogServer2 ~]# /usr/local/nagioslogserver/scripts/change_timezone.sh -z Europe/Athens
Stopping httpd: [ OK ]
Starting httpd: httpd: apr_sockaddr_info_get() failed for NagiosLogServer2
httpd: Could not reliably determine the server's fully qualified domain name, using 127.0.0.1 for ServerName
[ OK ]
Restarting Logstash Daemon: [ OK ]
[ OK ]
All timezone configurations updated to "Europe/Athens"
[root@NagiosLogServer2 ~]# Exception in thread ">output" org.elasticsearch.client.transport.NoNodeAvailableException: No node available
at org.elasticsearch.client.transport.TransportClientNodesService.execute(org/elasticsearch/client/transport/TransportClientNodesService.java:219)
at org.elasticsearch.client.transport.support.InternalTransportIndicesAdminClient.execute(org/elasticsearch/client/transport/support/InternalTransportIndicesAdminClient.java:85)
at org.elasticsearch.client.support.AbstractIndicesAdminClient.getTemplates(org/elasticsearch/client/support/AbstractIndicesAdminClient.java:544)
at org.elasticsearch.action.admin.indices.template.get.GetIndexTemplatesRequestBuilder.doExecute(org/elasticsearch/action/admin/indices/template/get/GetIndexTemplatesRequestBuilder.java:41)
at org.elasticsearch.action.ActionRequestBuilder.execute(org/elasticsearch/action/ActionRequestBuilder.java:85)
at org.elasticsearch.action.ActionRequestBuilder.execute(org/elasticsearch/action/ActionRequestBuilder.java:59)
at org.elasticsearch.action.ActionRequestBuilder.get(org/elasticsearch/action/ActionRequestBuilder.java:67)
at java.lang.reflect.Method.invoke(java/lang/reflect/Method.java:606)
at RUBY.template_exists?(/usr/local/nagioslogserver/logstash/lib/logstash/outputs/elasticsearch/protocol.rb:231)
at RUBY.template_install(/usr/local/nagioslogserver/logstash/lib/logstash/outputs/elasticsearch/protocol.rb:21)
at RUBY.register(/usr/local/nagioslogserver/logstash/lib/logstash/outputs/elasticsearch.rb:259)
at org.jruby.RubyArray.each(org/jruby/RubyArray.java:1613)
at RUBY.outputworker(/usr/local/nagioslogserver/logstash/lib/logstash/pipeline.rb:220)
at RUBY.start_outputs(/usr/local/nagioslogserver/logstash/lib/logstash/pipeline.rb:152)
at java.lang.Thread.run(java/lang/Thread.java:745)
-----------------
Anyway, I have added "ServerName" in /etc/httpd/conf/httpd.conf and httpd now restarts fine, but when I run the timezone script again I still get the exception above!
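
The `apr_sockaddr_info_get() failed` message generally means httpd could not resolve the machine's own hostname, which is consistent with the entry having been removed from /etc/hosts. A rough way to check whether a given name is listed in a hosts file is sketched below. `host_in_file` is a made-up helper name, and it only inspects the file, not DNS.

```shell
#!/bin/sh
# host_in_file NAME [FILE]: succeed if NAME appears as a hostname or
# alias in FILE (default /etc/hosts). Comment lines and trailing
# "# ..." comments are ignored; the address column is not matched.
host_in_file() {
    name=$1
    file=${2:-/etc/hosts}
    awk -v n="$name" '
        /^[[:space:]]*#/ { next }
        {
            for (i = 2; i <= NF; i++) {
                if ($i ~ /^#/) break
                if ($i == n) found = 1
            }
        }
        END { exit (found ? 0 : 1) }
    ' "$file"
}
```

For example, `host_in_file "$(hostname)" && echo listed || echo missing` reports whether the local hostname has an entry. The match is exact, so a short name and its fully qualified form are treated as different names.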

scottwilkerson · Post by **scottwilkerson** » Tue Mar 24, 2015 11:31 am

On both of your nodes, edit the following file:
/usr/local/nagioslogserver/elasticsearch/config/elasticsearch.yml

Change this:

# discovery.zen.minimum_master_nodes: 2

to this:

discovery.zen.minimum_master_nodes: 2

Then restart elasticsearch:

service elasticsearch restart
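
The edit above can also be scripted. The following is a sketch only: `set_min_masters` is a hypothetical helper, not something shipped with Nagios Log Server, and it assumes the setting sits on a single line of its own, commented out or not. It saves a `.bak` copy of the file before changing it.

```shell
#!/bin/sh
# set_min_masters FILE COUNT: enable discovery.zen.minimum_master_nodes
# in FILE and set it to COUNT.
set_min_masters() {
    file=$1
    count=$2
    key='discovery.zen.minimum_master_nodes'

    # Keep a backup copy next to the original before touching it.
    cp "$file" "$file.bak" || return 1

    if grep -Eq "^[#[:space:]]*${key}:" "$file"; then
        # Rewrite the existing line, whether or not it is commented out.
        sed -E "s/^[#[:space:]]*${key}:.*/${key}: ${count}/" "$file.bak" > "$file"
    else
        # No such line yet: append one.
        printf '%s: %s\n' "$key" "$count" >> "$file"
    fi
}
```

On each node this would be called as `set_min_masters /usr/local/nagioslogserver/elasticsearch/config/elasticsearch.yml 2`, followed by the restart command above.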

teirekos · Post by **teirekos** » Tue Mar 31, 2015 3:42 am

The cluster seems to be fine. Please close the thread. Thanks a lot for your help.

ssax · Post by **ssax** » Tue Mar 31, 2015 8:59 am

I'm glad it's working for you. Marking as resolved and locking the topic now.

Nagios Support Forum
