Added second NLS server to cluster but...
Posted: Tue May 12, 2015 8:06 am
by krobertson71
Seeing this. It looks like data is being replicated, but the nodes show offline to each other. The cluster status is green, hence the confusion. I tried restarting the browser, clearing the cache, and re-applying the configuration; nothing helped.
newcluster.PNG
SystemStatus.PNG
Under System Status, only the first instance, nagilgp01, is shown. Again, that is the case on both servers.
Re: Added second NLS server to cluster but...
Posted: Tue May 12, 2015 9:19 am
by jolson
Please run the following commands on both of your NLS nodes:
Code: Select all
cat /usr/local/nagioslogserver/var/cluster_hosts
sestatus
curl 'localhost:9200/_cat/master?v'
tail -f /usr/local/nagioslogserver/var/poller.log
What browser are you using?
Re: Added second NLS server to cluster but...
Posted: Tue May 12, 2015 10:25 am
by krobertson71
Here you go. Running Firefox 37.0.2
Code: Select all
cat /usr/local/nagioslogserver/var/cluster_hosts
localhost
10.0.103.180
10.136.132.107
Code: Select all
[nagios@nagilgp01 ~]$ sestatus
SELinux status: disabled
Code: Select all
[nagios@nagilgp01 ~]$ curl 'localhost:9200/_cat/master?v'
id host ip node
dYVMjW-KTPavYvFoNiRaTA nagilgp02 10.136.132.107 11fe29cc-9353-4cc1-a368-14a0b6977937
Code: Select all
Updating Cluster Hosts File
Updating Elasticsearch with instance...
Updating Cluster Hosts File
Updating Elasticsearch with instance...
Updating Cluster Hosts File
Updating Elasticsearch with instance...
Updating Cluster Hosts File
Updating Elasticsearch with instance...
Finished Polling.
Re: Added second NLS server to cluster but...
Posted: Tue May 12, 2015 10:29 am
by jolson
Could you please run the commands on your other node as well?
Re: Added second NLS server to cluster but...
Posted: Tue May 12, 2015 11:23 am
by krobertson71
Here you go. Also the first command is leading me to believe we do have an issue.
Code: Select all
cat /usr/local/nagioslogserver/var/cluster_hosts
localhost
Code: Select all
-bash-4.1$ curl 'localhost:9200/_cat/master?v'
id host ip node
dYVMjW-KTPavYvFoNiRaTA nagilgp02 10.136.132.107 11fe29cc-9353-4cc1-a368-14a0b6977937
Code: Select all
-bash-4.1$ tail -f /usr/local/nagioslogserver/var/poller.log
tail: cannot open `/usr/local/nagioslogserver/var/poller.log' for reading: No such file or directory
tail: no files remaining
-bash-4.1$ cd /usr/local/nagioslogserver/
-bash-4.1$ ll
total 32
drwxr-xr-x 7 nagios nagios 4096 May 11 14:39 elasticsearch
drwxrwxr-x 2 nagios nagios 4096 May 11 14:39 etc
drwxr-xr-x 9 nagios nagios 4096 May 11 14:39 logstash
drwxrwxr-x 2 nagios nagios 4096 May 11 14:39 mibs
drwxrwxr-x 2 nagios nagios 4096 May 11 14:39 scripts
drwxrwxr-x 2 nagios nagios 4096 May 11 14:39 snapshots
drwxrwxr-x 3 nagios nagios 4096 May 11 23:03 tmp
drwxrwxr-x 2 nagios nagios 4096 May 11 23:02 var
-bash-4.1$ cd var
-bash-4.1$ ll
total 12
-rwxrwxr-x 1 nagios nagios 34 May 11 23:02 cluster_hosts
-rw-rw-r-- 1 nagios nagios 36 May 11 23:02 cluster_uuid
-rw-rw-r-- 1 nagios nagios 37 May 11 23:02 node_uuid
-bash-4.1$
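For comparison, the `_cat/master` output is identical on both nodes (same id, master nagilgp02), which suggests both joined the same Elasticsearch cluster even though the `cluster_hosts` files differ. A small sketch for pulling out just the master id so the two nodes can be compared quickly; `master_id` is a hypothetical helper name, not part of Nagios Log Server:

```shell
# master_id: read the output of `curl 'localhost:9200/_cat/master?v'` on stdin
# and print the id column of the first data row (the row after the header).
# Hypothetical helper for illustration only.
master_id() {
    awk 'NR == 2 { print $1 }'
}

# Usage on each node (then compare the two printed ids by eye):
#   curl -s 'localhost:9200/_cat/master?v' | master_id
```

If the two ids differ, the nodes have elected different masters and are not in the same cluster.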
Re: Added second NLS server to cluster but...
Posted: Tue May 12, 2015 11:27 am
by jolson
krobertson71 wrote: Here you go. Also the first command is leading me to believe we do have an issue.
You are correct - please add the following to the 'cluster_hosts' file on node 2 (the one you posted second):
Code: Select all
echo "10.0.103.180" >> /usr/local/nagioslogserver/var/cluster_hosts
echo "10.136.132.107" >> /usr/local/nagioslogserver/var/cluster_hosts
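Note that `>>` appends unconditionally, so running these lines a second time would leave duplicate entries in the file. A minimal sketch of an append that is safe to re-run; `add_cluster_host` is a hypothetical function name, and it assumes the file ends with a newline:

```shell
# add_cluster_host FILE HOST
# Append HOST to FILE only if that exact line is not already present.
# Hypothetical helper for illustration only.
add_cluster_host() {
    file=$1
    host=$2
    # -q: quiet, -x: match the whole line, -F: treat HOST as a literal string
    if ! grep -qxF -- "$host" "$file" 2>/dev/null; then
        printf '%s\n' "$host" >> "$file"
    fi
}

# Usage:
#   add_cluster_host /usr/local/nagioslogserver/var/cluster_hosts 10.0.103.180
#   add_cluster_host /usr/local/nagioslogserver/var/cluster_hosts 10.136.132.107
```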
After adding that information, please re-run your tests to see whether or not that helped. If your issue isn't resolved, restart elasticsearch on both nodes.
Thanks!
Re: Added second NLS server to cluster but...
Posted: Tue May 12, 2015 1:49 pm
by krobertson71
Still having the same issue: nagilgp02 shows red for elasticsearch and logstash, and when I go to select instances only 01 is listed. All the files are as they should be per your changes.
I do notice there is no poller.log on the 02 server.
Any ideas?
Re: Added second NLS server to cluster but...
Posted: Tue May 12, 2015 2:16 pm
by krobertson71
Maybe this is working as expected. I am able to log into nagilgp02.dcri.duke.net just fine and look at all the events and dashboards as they were defined on lgp01 before.
This is the instance status screen. Elasticsearch and Logstash are running according to ps -ef:
Code: Select all
bash-4.1$ ps -ef | grep -i elastic
nagios 1519 1 4 13:46 ? 00:03:52 /usr/bin/java -Xms7975m -Xmx7975m -Xss256k -Djava.awt.headless=true -XX:+UseParNewGC -XX:+UseConcMarkSweepGC -XX:CMSInitiatingOccupancyFraction=75 -XX:+UseCMSInitiatingOccupancyOnly -XX:+HeapDumpOnOutOfMemoryError -XX:+DisableExplicitGC -Des.cluster.name=907e60a9-dc29-411e-96e8-2dfe503e0867 -Des.node.name=11fe29cc-9353-4cc1-a368-14a0b6977937 -Des.discovery.zen.ping.unicast.hosts=localhost,10.0.103.180,10.136.132.107 -Delasticsearch -Des.pidfile=/var/run/elasticsearch/elasticsearch.pid -Des.path.home=/usr/local/nagioslogserver/elasticsearch -cp :/usr/local/nagioslogserver/elasticsearch/lib/elasticsearch-1.3.2.jar:/usr/local/nagioslogserver/elasticsearch/lib/*:/usr/local/nagioslogserver/elasticsearch/lib/sigar/* -Des.default.path.home=/usr/local/nagioslogserver/elasticsearch -Des.default.path.logs=/var/log/elasticsearch -Des.default.path.data=/usr/local/nagioslogserver/elasticsearch/data -Des.default.path.work=/usr/local/nagioslogserver/tmp/elasticsearch -Des.default.path.conf=/usr/local/nagioslogserver/elasticsearch/config org.elasticsearch.bootstrap.Elasticsearch
NLS-InstanceOverview.png
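The `-Des.discovery.zen.ping.unicast.hosts` flag in the process line above lists the hosts this Elasticsearch instance was started with. A rough sketch for extracting that list so it can be compared against `cluster_hosts`; `unicast_hosts` is a hypothetical helper, and it assumes the flag appears as a single space-delimited argument, as it does in the output above:

```shell
# unicast_hosts: read `ps -ef` output on stdin and print, one per line, the
# hosts given in -Des.discovery.zen.ping.unicast.hosts=...
# Hypothetical helper for illustration only.
unicast_hosts() {
    # Put each argument on its own line, keep only the value of the
    # unicast hosts flag, then split the comma-separated list.
    tr ' ' '\n' \
        | sed -n 's/^-Des\.discovery\.zen\.ping\.unicast\.hosts=//p' \
        | tr ',' '\n'
}

# Usage (the [e] keeps grep from matching its own process line):
#   ps -ef | grep -i '[e]lastic' | unicast_hosts
#   cat /usr/local/nagioslogserver/var/cluster_hosts
```

In the output posted above, the flag already contains localhost, 10.0.103.180 and 10.136.132.107.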
Re: Added second NLS server to cluster but...
Posted: Tue May 12, 2015 2:30 pm
by jolson
It's interesting that elasticsearch isn't being detected properly from the GUI.
Does any functionality seem impacted? I'd like to see a screenshot of your 'Cluster Status' page.
As far as detection is concerned, please perform the following procedure:
Log into Node 1 and navigate to 'Administration -> System Status'. Select the instance that you're on. Are all of the buttons showing green? Select the other instance and report what displays.
Log into Node 2 and navigate to 'Administration -> System Status'. Select the instance that you're on. Are all of the buttons showing green? Select the other instance and report what displays.
My assumption is that if you're logged into Node 1, you can't see the status of Node 2 - and vice versa. I would like you to confirm this.
Re: Added second NLS server to cluster but...
Posted: Wed May 13, 2015 8:42 am
by krobertson71
Actually, on both nagilgp01 and 02 under Administration --> System Status they only show nagilgp01.
This is from nagilgp02 host. Nagilgp01 is the only option available, same if I am on the 01 server.
nls-systemstatus.png
Here is the Cluster Status page you requested. I think it is working normally as I can use the 02 web gui just fine. I just cannot control 01 from 02 and vice versa.
NSL-ClusterStatus.png
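To cross-check the Cluster Status page from the command line, the number of nodes Elasticsearch itself sees can be counted. This is only a sketch: it assumes the `_cat/nodes` endpoint and its `h` column selector are available in this Elasticsearch version, and `count_nodes` is a hypothetical helper name:

```shell
# count_nodes: read `_cat/nodes` output (without a header row) on stdin and
# print how many nodes it lists. Hypothetical helper for illustration only.
count_nodes() {
    # Each non-empty line describes one node. grep -c exits non-zero when
    # the count is 0, so "|| true" keeps the function's exit status clean.
    grep -c . || true
}

# Usage on either node:
#   curl -s 'localhost:9200/_cat/nodes?h=host,ip' | count_nodes
```

A result of 2 on both nodes would be consistent with the cluster itself being healthy and the problem being limited to instance detection in the GUI.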