Unsuccessful Add of Instance
Posted: Mon Sep 09, 2019 5:13 pm
by rocheryderm
Hello
Trying to add 3 more NLS instances to my existing instance.
See attached images and profile.
Every time I run through the "Connect to an Existing Cluster" choice from the new instances, I get this error:
Code:
"Could not establish connection, this could be due to a slow connection, or you may want to re-enter your cluster information."
- I am never able to progress past this point.
I am specifically not sending anything to the servers from clients so the servers have nothing else to do but respond to this choice. They are not busy at all.
On the existing Instance, it LOOKS like the command was successful, because the additional instances do show up.
What should I be looking at? Any hints/tips?
I can't find anything obvious in the logs in /var/log/elasticsearch
Mike
Support edit: Downloaded system-profile.tar.gz and shared with team
Re: Unsuccessful Add of Instance
Posted: Tue Sep 10, 2019 8:04 am
by rocheryderm
Looking at this again... second instance still claims that Elasticsearch isn't running, but data has certainly balanced evenly between both servers... (see screenshot attached)
Re: Unsuccessful Add of Instance
Posted: Tue Sep 10, 2019 10:29 am
by mbellerue
From the profile and that screenshot, it looks like the addition was mostly successful; the two instances just aren't able to communicate fully. Is there a firewall running on the second server? Can you reach TCP port 9300 on the second server from the first server?
Code:
telnet <SecondServerNameOrIP> 9300
or
Code:
nmap <SecondServerNameOrIP> -p 9300
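If neither tool is available on the host, the same reachability test can be sketched in a few lines of Python using only the standard library. This is a minimal sketch, not part of Nagios Log Server; the function name port_is_open and the 3-second default timeout are assumptions chosen for illustration.

```python
import socket


def port_is_open(host: str, port: int, timeout: float = 3.0) -> bool:
    """Return True if a TCP connection to host:port succeeds within timeout.

    Hypothetical helper: it only proves that something accepts TCP
    connections on that port, not that Elasticsearch is healthy.
    """
    try:
        # create_connection resolves the name and performs the TCP handshake.
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and name-resolution failures.
        return False
```

Run from the first server, port_is_open("<SecondServerNameOrIP>", 9300) returning True would correspond to nmap reporting the port as open.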
Re: Unsuccessful Add of Instance
Posted: Tue Sep 10, 2019 1:08 pm
by rocheryderm
Telnet is banned here in favor of SSH, so here's the nmap response:
Code:
[root@rbbusnls1p ~]# nmap rbbusnls2p -p 9300
Starting Nmap 6.40 ( http://nmap.org ) at 2019-09-10 14:06 EDT
Nmap scan report for rbbusnls2p (151.120.113.52)
Host is up (0.00016s latency).
PORT STATE SERVICE
9300/tcp open vrace
MAC Address: 00:17:A4:77:04:1C (Hewlett-Packard Company)
Nmap done: 1 IP address (1 host up) scanned in 0.10 seconds
[root@rbbusnls1p ~]#
I'm fairly certain this is not a firewall issue. Blade-to-blade traffic, so latency is minimal as we aren't even leaving the chassis.
Re: Unsuccessful Add of Instance
Posted: Tue Sep 10, 2019 2:18 pm
by mbellerue
If you point your browser to that hostname, rbbusnls2p, does it bring up the Log Server page?
Re: Unsuccessful Add of Instance
Posted: Tue Sep 10, 2019 3:05 pm
by rocheryderm
Yes, both the master and the secondary instance respond correctly in the browser.
Re: Unsuccessful Add of Instance
Posted: Tue Sep 10, 2019 3:50 pm
by rocheryderm
Is there a way to complete this manually? The suspense is killing me.
Re: Unsuccessful Add of Instance
Posted: Tue Sep 10, 2019 4:41 pm
by mbellerue
The thing is, it looks like the join is complete, and the interface just thinks things aren't connected. On the second server, go to Admin -> Instance Status, and let's see if it's using any storage.
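If the web interface is not usable, the cluster state can also be read straight from Elasticsearch's HTTP API. The following is a rough sketch, assuming the API answers on http://localhost:9200 (the Elasticsearch default, so it has to be run on the server itself); fetch_health and summarize_health are hypothetical helper names, not Nagios Log Server commands.

```python
import json
from urllib.request import urlopen


def fetch_health(base_url: str = "http://localhost:9200", timeout: float = 10.0) -> dict:
    """GET /_cluster/health and return the decoded JSON document.

    base_url is an assumption; adjust it if Elasticsearch listens elsewhere.
    """
    with urlopen(f"{base_url}/_cluster/health", timeout=timeout) as resp:
        return json.load(resp)


def summarize_health(health: dict) -> str:
    """Condense a cluster health document into one readable line."""
    return (
        f"status={health['status']} "
        f"nodes={health['number_of_nodes']} "
        f"data_nodes={health['number_of_data_nodes']} "
        f"unassigned_shards={health['unassigned_shards']}"
    )
```

Calling print(summarize_health(fetch_health())) on either server should show whether both nodes are counted in the cluster. For per-node disk usage, the _cat/allocation?v endpoint on the same port should give a comparable view.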
Re: Unsuccessful Add of Instance
Posted: Tue Sep 10, 2019 4:51 pm
by rocheryderm
Hi... my other screenshots should have demonstrated that, but here's another one.
Wait a minute, I can't do anything on the second server; it's stuck at the Install or Connect screen.
It looks like both nodes are active.
Additionally, here are excerpts from the Elasticsearch log files (/var/log/elasticsearch...), first from the master node and then from the secondary node:
Code:
[2019-09-10 17:42:11,992][INFO ][cluster.service ] [77596958-30db-4cb4-bf11-09e114a44012] removed {[fb689df9-6555-48f7-ada4-70fb56095c6f][hc3JnLGjTq60ID_icK0gdg][rbbusnls2p][inet[/151.120.113.52:9300]]{max_local_storage_nodes=1},}, reason: zen-disco-node_left([fb689df9-6555-48f7-ada4-70fb56095c6f][hc3JnLGjTq60ID_icK0gdg][rbbusnls2p][inet[/151.120.113.52:9300]]{max_local_storage_nodes=1})
[2019-09-10 17:42:40,509][INFO ][cluster.service ] [77596958-30db-4cb4-bf11-09e114a44012] added {[3fcf7609-0455-49a5-81c3-c7baaa51776a][BgCAY0VpQxC3PxPonH5gBQ][rbbusnls2p][inet[/151.120.113.52:9300]]{max_local_storage_nodes=1},}, reason: zen-disco-receive(join from node[[3fcf7609-0455-49a5-81c3-c7baaa51776a][BgCAY0VpQxC3PxPonH5gBQ][rbbusnls2p][inet[/151.120.113.52:9300]]{max_local_storage_nodes=1}])
Code:
[2019-09-10 17:42:15,906][INFO ][node ] [fb689df9-6555-48f7-ada4-70fb56095c6f] stopping ...
[2019-09-10 17:42:16,033][INFO ][node ] [fb689df9-6555-48f7-ada4-70fb56095c6f] stopped
[2019-09-10 17:42:16,033][INFO ][node ] [fb689df9-6555-48f7-ada4-70fb56095c6f] closing ...
[2019-09-10 17:42:16,041][INFO ][node ] [fb689df9-6555-48f7-ada4-70fb56095c6f] closed
[2019-09-10 17:42:37,424][INFO ][node ] [3fcf7609-0455-49a5-81c3-c7baaa51776a] version[1.7.6], pid[51593], build[c730b59/2016-11-18T15:21:16Z]
[2019-09-10 17:42:37,425][INFO ][node ] [3fcf7609-0455-49a5-81c3-c7baaa51776a] initializing ...
[2019-09-10 17:42:37,503][INFO ][plugins ] [3fcf7609-0455-49a5-81c3-c7baaa51776a] loaded [knapsack-1.7.3.0-d0ea246], sites []
[2019-09-10 17:42:37,543][INFO ][env ] [3fcf7609-0455-49a5-81c3-c7baaa51776a] using [1] data paths, mounts [[/usr/local/nagioslogserver/elasticsearch/data (/dev/mapper/vg00-lv_elastic)]], net usable_space [944.7gb], net total_space [1.6tb], types [xfs]
[2019-09-10 17:42:41,009][INFO ][node ] [3fcf7609-0455-49a5-81c3-c7baaa51776a] initialized
[2019-09-10 17:42:41,010][INFO ][node ] [3fcf7609-0455-49a5-81c3-c7baaa51776a] starting ...
[2019-09-10 17:42:41,370][INFO ][transport ] [3fcf7609-0455-49a5-81c3-c7baaa51776a] bound_address {inet[/0.0.0.0:9300]}, publish_address {inet[/151.120.113.52:9300]}
[2019-09-10 17:42:41,386][INFO ][discovery ] [3fcf7609-0455-49a5-81c3-c7baaa51776a] 15edd11f-8263-4eb7-9054-8ace66feebb6/BgCAY0VpQxC3PxPonH5gBQ
[2019-09-10 17:42:44,555][INFO ][cluster.service ] [3fcf7609-0455-49a5-81c3-c7baaa51776a] detected_master [77596958-30db-4cb4-bf11-09e114a44012][4BAV0gZFRkmvGxVxvB1ycA][rbbusnls1p][inet[/151.120.113.51:9300]]{max_local_storage_nodes=1}, added {[77596958-30db-4cb4-bf11-09e114a44012][4BAV0gZFRkmvGxVxvB1ycA][rbbusnls1p][inet[/151.120.113.51:9300]]{max_local_storage_nodes=1},}, reason: zen-disco-receive(from master [[77596958-30db-4cb4-bf11-09e114a44012][4BAV0gZFRkmvGxVxvB1ycA][rbbusnls1p][inet[/151.120.113.51:9300]]{max_local_storage_nodes=1}])
[2019-09-10 17:42:44,925][INFO ][http ] [3fcf7609-0455-49a5-81c3-c7baaa51776a] bound_address {inet[/127.0.0.1:9200]}, publish_address {inet[localhost/127.0.0.1:9200]}
[2019-09-10 17:42:44,926][INFO ][node ] [3fcf7609-0455-49a5-81c3-c7baaa51776a] started
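In the excerpts above, rbbusnls2p leaves the cluster under one node name (fb689df9-...) and rejoins about half a minute later under a different one (3fcf7609-...). Whether that is related to the interface error is not established here, but events like these are easy to miss by eye. Below is a minimal sketch for pulling them out of a log; the regular expression is written against the line layout shown in this post and is an assumption about the format, and membership_events is a hypothetical helper name.

```python
import re

# Matches cluster.service lines that report a node being added or removed,
# based on the log layout quoted above (an assumption, not a documented format).
EVENT = re.compile(
    r"^\[(?P<time>[^\]]+)\]\[INFO\s*\]\[cluster\.service\s*\].*?"
    r"(?P<action>added|removed) \{\[(?P<name>[^\]]+)\]"
    r"\[(?P<node_id>[^\]]+)\]\[(?P<host>[^\]]+)\]"
)


def membership_events(lines):
    """Return (time, action, host, node name) for each join or leave event."""
    events = []
    for line in lines:
        match = EVENT.search(line)
        if match:
            events.append(
                (match["time"], match["action"], match["host"], match["name"])
            )
    return events
```

Passing it the lines of the master's log file yields one tuple per join or leave, which makes it straightforward to see when a host comes back under a new node name.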
Re: Unsuccessful Add of Instance
Posted: Tue Sep 10, 2019 5:10 pm
by rocheryderm
Way back when I was toying purely with Elasticsearch... sometimes my queries from Kibana would fail after 30000ms.
This process does take 30-45 seconds (from hitting enter to getting the error) - is it possible a query is timing out or failing?
I've scoured every log file I can find. It looks like your code to add this node to the cluster is encrypted, so I can't be of much more help.
Is there something I can do to manually complete this, without having to use the GUI?