Added second NLS server to cluster but...

This support forum board is for support questions relating to Nagios Log Server, our solution for managing and monitoring critical log data.
krobertson71
Posts: 444
Joined: Tue Feb 11, 2014 10:16 pm


Post by krobertson71 »

I'm seeing this: data appears to be replicating, but the two nodes show as offline to each other. The cluster status is green, hence the confusion. I tried restarting the browser, clearing the cache, and re-applying the configuration, with no change.
newcluster.PNG
SystemStatus.PNG
Under system status they only show the first instance, nagilgp01. Again that is on both servers.
jolson
Attack Rabbit
Posts: 2560
Joined: Thu Feb 12, 2015 12:40 pm

Re: Added second NLS server to cluster but...

Post by jolson »

Please run the following commands on both of your NLS nodes:

Code: Select all

cat /usr/local/nagioslogserver/var/cluster_hosts
sestatus
curl 'localhost:9200/_cat/master?v'
tail -f /usr/local/nagioslogserver/var/poller.log
What browser are you using?
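To compare the `_cat/master` result between nodes without reading the full table, a small helper can pull out just the `host` column. This is only a sketch: `master_host` is a hypothetical function name, and it assumes the four-column `id host ip node` layout with a single header line that `?v` prints.

```shell
#!/bin/sh
# master_host (hypothetical helper): read `_cat/master?v` output on stdin
# and print the host column of the first data row.
# Assumes the layout "id host ip node" preceded by one header line.
master_host() {
    awk 'NR == 2 { print $2 }'
}

# Example (run on each node and compare the two results):
#   curl -s 'localhost:9200/_cat/master?v' | master_host
```

If both nodes print the same host name, they agree on the elected master, which suggests they have joined the same cluster.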
Twits Blog
Show me a man who lives alone and has a perpetually clean kitchen, and 8 times out of 9 I'll show you a man with detestable spiritual qualities.
krobertson71
Posts: 444
Joined: Tue Feb 11, 2014 10:16 pm

Re: Added second NLS server to cluster but...

Post by krobertson71 »

Here you go. Running Firefox 37.0.2


Code: Select all

 cat /usr/local/nagioslogserver/var/cluster_hosts
localhost
10.0.103.180
10.136.132.107

Code: Select all

[nagios@nagilgp01 ~]$ sestatus
SELinux status:                 disabled

Code: Select all

[nagios@nagilgp01 ~]$ curl 'localhost:9200/_cat/master?v'
id                     host      ip             node
dYVMjW-KTPavYvFoNiRaTA nagilgp02 10.136.132.107 11fe29cc-9353-4cc1-a368-14a0b6977937

Code: Select all

Updating Cluster Hosts File
Updating Elasticsearch with instance...
Updating Cluster Hosts File
Updating Elasticsearch with instance...
Updating Cluster Hosts File
Updating Elasticsearch with instance...
Updating Cluster Hosts File
Updating Elasticsearch with instance...
Finished Polling.
jolson
Attack Rabbit
Posts: 2560
Joined: Thu Feb 12, 2015 12:40 pm

Re: Added second NLS server to cluster but...

Post by jolson »

Could you please run the commands on your other node as well?
krobertson71
Posts: 444
Joined: Tue Feb 11, 2014 10:16 pm

Re: Added second NLS server to cluster but...

Post by krobertson71 »

Here you go. Also the first command is leading me to believe we do have an issue.

Code: Select all

cat /usr/local/nagioslogserver/var/cluster_hosts
localhost

Code: Select all

SELinux status:                 disabled

Code: Select all

-bash-4.1$ curl 'localhost:9200/_cat/master?v'
id                     host      ip             node
dYVMjW-KTPavYvFoNiRaTA nagilgp02 10.136.132.107 11fe29cc-9353-4cc1-a368-14a0b6977937

Code: Select all

-bash-4.1$ tail -f /usr/local/nagioslogserver/var/poller.log
tail: cannot open `/usr/local/nagioslogserver/var/poller.log' for reading: No such file or directory
tail: no files remaining
-bash-4.1$ cd /usr/local/nagioslogserver/
-bash-4.1$ ll
total 32
drwxr-xr-x 7 nagios nagios 4096 May 11 14:39 elasticsearch
drwxrwxr-x 2 nagios nagios 4096 May 11 14:39 etc
drwxr-xr-x 9 nagios nagios 4096 May 11 14:39 logstash
drwxrwxr-x 2 nagios nagios 4096 May 11 14:39 mibs
drwxrwxr-x 2 nagios nagios 4096 May 11 14:39 scripts
drwxrwxr-x 2 nagios nagios 4096 May 11 14:39 snapshots
drwxrwxr-x 3 nagios nagios 4096 May 11 23:03 tmp
drwxrwxr-x 2 nagios nagios 4096 May 11 23:02 var
-bash-4.1$ cd var
-bash-4.1$ ll
total 12
-rwxrwxr-x 1 nagios nagios 34 May 11 23:02 cluster_hosts
-rw-rw-r-- 1 nagios nagios 36 May 11 23:02 cluster_uuid
-rw-rw-r-- 1 nagios nagios 37 May 11 23:02 node_uuid
-bash-4.1$
jolson
Attack Rabbit
Posts: 2560
Joined: Thu Feb 12, 2015 12:40 pm

Re: Added second NLS server to cluster but...

Post by jolson »

krobertson71 wrote: "Here you go. Also the first command is leading me to believe we do have an issue."

You are correct - please add the following to the 'cluster_hosts' file on node 2 (the one you posted second):

Code: Select all

echo "10.0.103.180" >> /usr/local/nagioslogserver/var/cluster_hosts
echo "10.136.132.107" >> /usr/local/nagioslogserver/var/cluster_hosts
After adding that information, please re-run your tests to see whether or not that helped. If your issue isn't resolved, restart elasticsearch on both nodes:

Code: Select all

service elasticsearch restart
Thanks!
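Because `>>` appends unconditionally, running the `echo` commands above a second time would add duplicate lines. A minimal sketch of an idempotent variant follows; `add_cluster_host` is a hypothetical helper name, and it assumes the file already ends with a newline.

```shell
#!/bin/sh
# add_cluster_host FILE HOST (hypothetical helper): append HOST to FILE only
# if no existing line matches it exactly, so re-running is harmless.
add_cluster_host() {
    file=$1
    host=$2
    # -x: match the whole line, -F: fixed string (dots are not wildcards)
    grep -qxF -- "$host" "$file" 2>/dev/null || printf '%s\n' "$host" >> "$file"
}

# Example, using the path and addresses from this thread:
#   add_cluster_host /usr/local/nagioslogserver/var/cluster_hosts 10.0.103.180
#   add_cluster_host /usr/local/nagioslogserver/var/cluster_hosts 10.136.132.107
```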
krobertson71
Posts: 444
Joined: Tue Feb 11, 2014 10:16 pm

Re: Added second NLS server to cluster but...

Post by krobertson71 »

I'm still having the same issue: nagilgp02 shows red for Elasticsearch and Logstash, and when I go to select an instance, only 01 is listed. All the files are as they should be per your changes.


I do notice there is no poller.log on the 02 server.

Any ideas?
krobertson71
Posts: 444
Joined: Tue Feb 11, 2014 10:16 pm

Re: Added second NLS server to cluster but...

Post by krobertson71 »

Maybe this is working as expected. I am able to log into nagilgp02.dcri.duke.net just fine and look at all the events and dashboards as they were defined on lgp01 before.

This is the instance status screen. Elasticsearch and Logstash are running according to ps -ef

Code: Select all

bash-4.1$ ps -ef | grep -i elastic
nagios    1519     1  4 13:46 ?        00:03:52 /usr/bin/java -Xms7975m -Xmx7975m -Xss256k -Djava.awt.headless=true -XX:+UseParNewGC -XX:+UseConcMarkSweepGC -XX:CMSInitiatingOccupancyFraction=75 -XX:+UseCMSInitiatingOccupancyOnly -XX:+HeapDumpOnOutOfMemoryError -XX:+DisableExplicitGC -Des.cluster.name=907e60a9-dc29-411e-96e8-2dfe503e0867 -Des.node.name=11fe29cc-9353-4cc1-a368-14a0b6977937 -Des.discovery.zen.ping.unicast.hosts=localhost,10.0.103.180,10.136.132.107 -Delasticsearch -Des.pidfile=/var/run/elasticsearch/elasticsearch.pid -Des.path.home=/usr/local/nagioslogserver/elasticsearch -cp :/usr/local/nagioslogserver/elasticsearch/lib/elasticsearch-1.3.2.jar:/usr/local/nagioslogserver/elasticsearch/lib/*:/usr/local/nagioslogserver/elasticsearch/lib/sigar/* -Des.default.path.home=/usr/local/nagioslogserver/elasticsearch -Des.default.path.logs=/var/log/elasticsearch -Des.default.path.data=/usr/local/nagioslogserver/elasticsearch/data -Des.default.path.work=/usr/local/nagioslogserver/tmp/elasticsearch -Des.default.path.conf=/usr/local/nagioslogserver/elasticsearch/config org.elasticsearch.bootstrap.Elasticsearch

NLS-InstanceOverview.png
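The `ps` output above includes the `-Des.discovery.zen.ping.unicast.hosts=...` argument, which shows the host list the running Elasticsearch process was started with. As a rough sketch, that list can be extracted and compared against `cluster_hosts`; `unicast_hosts` is a hypothetical helper name, and it assumes the argument contains no spaces.

```shell
#!/bin/sh
# unicast_hosts (hypothetical helper): read a process listing on stdin and
# print the value of -Des.discovery.zen.ping.unicast.hosts, one host per line.
unicast_hosts() {
    tr ' ' '\n' |
        sed -n 's/^-Des\.discovery\.zen\.ping\.unicast\.hosts=//p' |
        tr ',' '\n'
}

# Example: compare what the running JVM was given against the file on disk.
#   ps -ef | grep -i '[e]lastic' | unicast_hosts
#   cat /usr/local/nagioslogserver/var/cluster_hosts
```

If the two lists differ, the process was likely started before the file was edited, and a restart of elasticsearch would be needed to pick up the change.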
jolson
Attack Rabbit
Posts: 2560
Joined: Thu Feb 12, 2015 12:40 pm

Re: Added second NLS server to cluster but...

Post by jolson »

It's interesting that elasticsearch isn't being detected properly from the GUI.

Does any functionality seem impacted? I'd like to see a screenshot of your 'Cluster Status' page.

As far as detection is concerned, please perform the following procedure:

Log into Node 1 and navigate to 'Administration -> System Status'. Select the instance that you're on. Are all of the buttons showing green? Select the other instance and report what displays.

Log into Node 2 and navigate to 'Administration -> System Status'. Select the instance that you're on. Are all of the buttons showing green? Select the other instance and report what displays.

My assumption is that if you're logged into Node 1, you can't see the status of Node 2 - and vice versa. I would like you to confirm this.
krobertson71
Posts: 444
Joined: Tue Feb 11, 2014 10:16 pm

Re: Added second NLS server to cluster but...

Post by krobertson71 »

Actually, on both nagilgp01 and 02 under Administration --> System Status they only show nagilgp01.


This is from the nagilgp02 host. nagilgp01 is the only option available; the same is true if I am on the 01 server.
nls-systemstatus.png
Here is the Cluster Status page you requested. I think it is working normally as I can use the 02 web gui just fine. I just cannot control 01 from 02 and vice versa.
NSL-ClusterStatus.png