Cannot add new install to existing cluster
-
mark.payne
- Posts: 22
- Joined: Mon Sep 14, 2015 11:25 pm
Cannot add new install to existing cluster
I am evaluating NLS and managed to get a single server installed and in an acceptable state.
Now I am trying to add another server to the cluster.
I add the hostname and cluster ID, then click Finish Installation, but it keeps saying "Could not establish connection, this could be due to a slow connection, or you may want to re-enter your cluster information."
Before I tried this, I made sure that TCP ports 9300-9400 are open both ways, and I can telnet from the new server to the existing server on port 9300 successfully.
So I ran a tcpdump on the existing server and ran the connect to existing cluster step on the new server again: nothing coming in at all. It has to be the new server at this point.
I then ran a tcpdump on the new server and reran the connect to existing cluster step, but still nothing is being sent out.
The Finish Installation button for connecting to the existing cluster is not doing anything... Please help.
Running CentOS 7.1
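For anyone repeating these checks, here is a minimal sketch of the two tests described above: a TCP reachability probe for port 9300 and a capture of traffic on the 9300-9400 range. The function names and the EXISTING_NODE placeholder are illustrative assumptions, not part of NLS, and the probe relies on bash's /dev/tcp support.

```shell
#!/usr/bin/env bash

# Return 0 if a TCP connection to HOST:PORT succeeds within 3 seconds.
# Uses bash's built-in /dev/tcp redirection, so telnet is not required.
port_open() {
    local host=$1 port=$2
    timeout 3 bash -c "exec 3<>/dev/tcp/${host}/${port}" 2>/dev/null
}

# Show any traffic on the cluster port range (needs root).
# Defined here but not called, so sourcing this file captures nothing.
watch_cluster_traffic() {
    tcpdump -ni any tcp portrange 9300-9400
}

# Example usage (EXISTING_NODE is a placeholder for the existing server's address):
#   port_open "$EXISTING_NODE" 9300 && echo "reachable" || echo "not reachable"
#   port_open 127.0.0.1 9300 || echo "nothing listening locally on 9300"
```

Running the probe against 127.0.0.1 on the new server is a quick way to see whether anything is listening on 9300 locally before looking at the network in between.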
- Box293
- Too Basu
- Posts: 5126
- Joined: Sun Feb 07, 2010 10:55 pm
- Location: Deniliquin, Australia
Re: Cannot add new install to existing cluster
What is the output of this on both servers?
Code:
cat /usr/local/nagioslogserver/var/cluster_hosts
As of May 25th, 2018, all communications with Nagios Enterprises and its employees are covered under our new Privacy Policy.
-
mark.payne
- Posts: 22
- Joined: Mon Sep 14, 2015 11:25 pm
Re: Cannot add new install to existing cluster
Existing server:
localhost
192.168.xxx.131
New server:
localhost
192.168.xxx.131
192.168.xxx.131
192.168.xxx.131
192.168.xxx.131
192.168.xxx.131
xxx being the same value for all.
- Box293
- Too Basu
- Posts: 5126
- Joined: Sun Feb 07, 2010 10:55 pm
- Location: Deniliquin, Australia
Re: Cannot add new install to existing cluster
OK so that looks ok.
What is the output of this on both servers?
Code:
netstat -na | grep LISTEN
Can you show us the firewall status and rules?
Also, as a test, are you able to stop the local firewall on CentOS?
-
mark.payne
- Posts: 22
- Joined: Mon Sep 14, 2015 11:25 pm
Re: Cannot add new install to existing cluster
Existing server:
tcp 0 0 0.0.0.0:22 0.0.0.0:* LISTEN
tcp 0 0 127.0.0.1:25 0.0.0.0:* LISTEN
tcp6 0 0 :::2056 :::* LISTEN
tcp6 0 0 :::5544 :::* LISTEN
tcp6 0 0 :::2057 :::* LISTEN
tcp6 0 0 127.0.0.1:9200 :::* LISTEN
tcp6 0 0 :::80 :::* LISTEN
tcp6 0 0 :::9300 :::* LISTEN
tcp6 0 0 :::22 :::* LISTEN
tcp6 0 0 :::3515 :::* LISTEN
New Server:
tcp 0 0 0.0.0.0:22 0.0.0.0:* LISTEN
tcp 0 0 127.0.0.1:25 0.0.0.0:* LISTEN
tcp6 0 0 :::80 :::* LISTEN
tcp6 0 0 :::22 :::* LISTEN
Stopping the firewall doesn't help.
As explained in the first post, tcpdump on the new server doesn't show any TCP traffic when trying to "Connect to Existing Cluster".
It's like it's not even trying to do anything.
-
mark.payne
- Posts: 22
- Joined: Mon Sep 14, 2015 11:25 pm
Re: Cannot add new install to existing cluster
I noticed that when I try to connect to the existing cluster, the elasticsearch service exits and starts again.
Below is the log:
[2015-10-30 17:27:03,024][INFO ][node ] [ead9d787-7f47-41c8-b504-76ccabba3bd2] version[1.6.0], pid[4966], build[cdd3ac4/2015-06-09T13:36:34Z]
[2015-10-30 17:27:03,024][INFO ][node ] [ead9d787-7f47-41c8-b504-76ccabba3bd2] initializing ...
[2015-10-30 17:27:03,038][INFO ][plugins ] [ead9d787-7f47-41c8-b504-76ccabba3bd2] loaded [knapsack-1.5.2.0-f340ad1], sites []
[2015-10-30 17:27:03,076][ERROR][bootstrap ] Exception
org.elasticsearch.ElasticsearchIllegalStateException: Failed to created node environment
at org.elasticsearch.node.internal.InternalNode.<init>(InternalNode.java:164)
at org.elasticsearch.node.NodeBuilder.build(NodeBuilder.java:159)
at org.elasticsearch.bootstrap.Bootstrap.setup(Bootstrap.java:77)
at org.elasticsearch.bootstrap.Bootstrap.main(Bootstrap.java:245)
at org.elasticsearch.bootstrap.Elasticsearch.main(Elasticsearch.java:32)
Caused by: java.nio.file.AccessDeniedException: /mnt/sdb1/nagioslogserver/elasticsearch/data/88244341-3928-4ea7-9363-93d3c9b771ca
at sun.nio.fs.UnixException.translateToIOException(UnixException.java:84)
at sun.nio.fs.UnixException.rethrowAsIOException(UnixException.java:102)
at sun.nio.fs.UnixException.rethrowAsIOException(UnixException.java:107)
at sun.nio.fs.UnixFileSystemProvider.createDirectory(UnixFileSystemProvider.java:383)
at java.nio.file.Files.createDirectory(Files.java:630)
at java.nio.file.Files.createAndCheckIsDirectory(Files.java:734)
at java.nio.file.Files.createDirectories(Files.java:720)
at org.elasticsearch.env.NodeEnvironment.<init>(NodeEnvironment.java:126)
at org.elasticsearch.node.internal.InternalNode.<init>(InternalNode.java:162)
... 4 more
After I installed I added another drive and moved the directory and data to the new drive.
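The AccessDeniedException above points at a directory the Elasticsearch process could not create under the relocated data path. A minimal sketch for checking this follows, assuming the service runs as the nagios user (an assumption based on the directory ownership reported later in this thread). The helper name can_create_in is hypothetical.

```shell
#!/usr/bin/env bash

# Return 0 if the current user can create entries inside DIR,
# i.e. DIR exists and is both writable and searchable; non-zero otherwise.
can_create_in() {
    local dir=$1
    [ -d "$dir" ] && [ -w "$dir" ] && [ -x "$dir" ]
}

# Example usage on the server that logged the exception
# ("nagios" as the service account is an assumption):
#   sudo -u nagios bash -c "$(declare -f can_create_in); \
#       can_create_in /mnt/sdb1/nagioslogserver/elasticsearch/data && echo ok || echo denied"
#
# To see owner and mode of every directory along the path, root first:
#   namei -l /mnt/sdb1/nagioslogserver/elasticsearch/data
```

Checking every component of the path matters because a restrictive parent directory, such as a freshly created mount point, blocks access even when the data directory itself has the right owner.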
- Box293
- Too Basu
- Posts: 5126
- Joined: Sun Feb 07, 2010 10:55 pm
- Location: Deniliquin, Australia
Re: Cannot add new install to existing cluster
Great thanks for that, sometimes we need to double check things just to be sure.
mark.payne wrote: After I installed I added another drive and moved the directory and data to the new drive.
Was this on the new or existing server?
Did you follow any documented procedure for this?
What are the permissions of:
Code:
ll /mnt/sdb1/nagioslogserver/elasticsearch/data/88244341-3928-4ea7-9363-93d3c9b771ca
From the server that you are seeing this in the logs.
-
mark.payne
- Posts: 22
- Joined: Mon Sep 14, 2015 11:25 pm
Re: Cannot add new install to existing cluster
Reverted the move of the data directory and can now add to the cluster.
Moved the data back to the new drive and all is well.
I guess I shouldn't be moving data around until the cluster is set up...
How this would work when you have large amounts of data already to replicate and the drive you initially installed NLS on doesn't have enough space not sure.
One thing is that there doesn't seem to be any documentation around firewall rules for cluster communication. Might want to add some documentation around this.
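As a rough illustration of the firewall rules mentioned above, the following sketch prints the firewalld commands that would open the 9300-9400 TCP range used for cluster communication in this thread. It assumes a CentOS 7 host running firewalld. The function name and the APPLY switch are hypothetical, and this is not official Nagios documentation.

```shell
#!/usr/bin/env bash

# Print the firewalld commands that open the cluster port range.
# With APPLY=1 the commands are executed instead of printed (needs root).
open_cluster_ports() {
    local range=${1:-9300-9400}
    local run=echo
    if [ "${APPLY:-0}" = "1" ]; then
        run=""
    fi
    $run firewall-cmd --permanent --add-port="${range}/tcp"
    $run firewall-cmd --reload
}

# Example usage:
#   open_cluster_ports             # dry run, only prints the commands
#   APPLY=1 open_cluster_ports     # apply on this node
#   firewall-cmd --list-ports      # confirm the range is listed afterwards
```

The dry run is the default so the commands can be reviewed before anything changes on the host. The same rules would need to be applied on every node in the cluster.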
-
mark.payne
- Posts: 22
- Joined: Mon Sep 14, 2015 11:25 pm
Re: Cannot add new install to existing cluster
Box293 wrote:
Great thanks for that, sometimes we need to double check things just to be sure.
mark.payne wrote: After I installed I added another drive and moved the directory and data to the new drive.
Was this on the new or existing server?
Did you follow any documented procedure for this?
What are the permissions of:
Code:
ll /mnt/sdb1/nagioslogserver/elasticsearch/data/88244341-3928-4ea7-9363-93d3c9b771ca
From the server that you are seeing this in the logs.
Moved data on new server.
Yes, I followed the documentation. Same thing I had already done on the existing server.
Permissions are drwxr-xr-x 3 nagios users 4096 Oct 30 17:45 nodes
Cluster status is Green
Re: Cannot add new install to existing cluster
mark.payne wrote: How this would work when you have large amounts of data already to replicate and the drive you initially installed NLS on doesn't have enough space not sure.
I'm not sure I understand this question. Could you clarify a bit please?
mark.payne wrote: Cluster status is Green
Happy to hear it! That must mean that your instances connected properly?