
Cluster Health status yellow Indices not showing on 1 leg

Posted: Wed Sep 02, 2015 5:32 pm
by Jklre
I recently set up 2 separate clusters for Nagios Log Server. The new servers are showing their health as yellow, and on one of the legs the indices are not showing up, while they are showing up on the other leg.

Cluster Health
Status: Yellow
Timed Out: false
# Instances: 2
# Data Instances: 2
Active Primary Shards: 61
Active Shards: 91
Relocating Shards: 0
Initializing Shards: 0
Unassigned Shards: 31
1.jpg
Any idea what's causing this? Thank you.
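
A quick arithmetic check on the figures above, offered as a sketch rather than a diagnosis. It assumes that all 31 unassigned shards are replicas, which is what a yellow status normally means in Elasticsearch (every primary is allocated, at least one replica is not). The variable names are made up for illustration.

```shell
# Numbers copied from the Cluster Health panel above.
active_primary=61
active_total=91
unassigned=31

# Any active shard that is not a primary must be a replica copy.
active_replicas=$((active_total - active_primary))

# Total copies the cluster wants to hold (assumption: nothing else is missing).
total_shards=$((active_total + unassigned))

echo "active replicas:    $active_replicas"
echo "total shard copies: $total_shards"
echo "copies per primary: $((total_shards / active_primary))"
```

Under that assumption the numbers fit 61 primaries with one replica each: 30 replicas are active and 31 are still waiting to be assigned.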

Re: Cluster Health status yellow Indices not showing on 1 le

Posted: Wed Sep 02, 2015 7:16 pm
by Box293
What version of Nagios Log Server?
Jklre wrote: I recently set up 2 separate clusters for Nagios Log Server. The new servers are showing their health as yellow, and on one of the legs the indices are not showing up, while they are showing up on the other leg.
The two separate clusters will have no relationship with each other.

To clarify: how many nodes does each cluster have?

Re: Cluster Health status yellow Indices not showing on 1 le

Posted: Thu Sep 03, 2015 11:07 am
by Jklre
I have 3 different clusters (one per datacenter), with 2 nodes each.

Re: Cluster Health status yellow Indices not showing on 1 le

Posted: Thu Sep 03, 2015 11:26 am
by jolson
Jklre,

To get this straight:

You have a 3-instance cluster, and while two of the instances appear to be functioning properly, one of them appears to be disconnected. Is that correct?

Re: Cluster Health status yellow Indices not showing on 1 le

Posted: Thu Sep 03, 2015 3:19 pm
by Jklre
Actually, not quite.

I have 3 different clusters, all with 2 nodes each. (See the bad Photoshop visual; an X means the indices are not showing on that node.)
2.jpg
On the other 2 clusters, one node on each is not showing the indices. Thank you.

Re: Cluster Health status yellow Indices not showing on 1 le

Posted: Thu Sep 03, 2015 3:57 pm
by jolson
I have a couple of hunches about what might be happening here.

On *one* of your clusters, I'd like to see the output of the following on each instance:

Code:

cat /usr/local/nagioslogserver/var/node_uuid
cat /usr/local/nagioslogserver/var/cluster_uuid
cat /usr/local/nagioslogserver/var/cluster_hosts
ip a
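
If it helps to capture everything in one go, here is a minimal sketch that prints the three files requested above with a header for each. The function name `show_cluster_files` and the demo directory are hypothetical; only the default path comes from the commands above.

```shell
# Hypothetical helper: print the three cluster files with a header each.
# The default directory is the one used in the commands above.
show_cluster_files() {
  dir="${1:-/usr/local/nagioslogserver/var}"
  for name in node_uuid cluster_uuid cluster_hosts; do
    echo "== $name =="
    if [ -r "$dir/$name" ]; then
      cat "$dir/$name"
    else
      echo "(not readable)"
    fi
  done
}

# Demo against a throwaway directory with placeholder values,
# so the sketch also runs on a machine without Nagios Log Server.
demo_dir=$(mktemp -d)
echo "node-uuid-placeholder" > "$demo_dir/node_uuid"
echo "cluster-uuid-placeholder" > "$demo_dir/cluster_uuid"
printf 'localhost\n' > "$demo_dir/cluster_hosts"
report=$(show_cluster_files "$demo_dir")
echo "$report"
rm -rf "$demo_dir"
```

On a real instance, calling `show_cluster_files` with no argument reads from the default path; `ip a` would still be run separately.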

Re: Cluster Health status yellow Indices not showing on 1 le

Posted: Thu Sep 03, 2015 4:40 pm
by Jklre
Here's the output from one of the clusters.

cat /usr/local/nagioslogserver/var/node_uuid
d2372b37-3f94-4d4b-896d-d62812c1806a
cat /usr/local/nagioslogserver/var/cluster_uuid
886c2610-e1e0-4f9f-a33c-5de17cff9435
cat /usr/local/nagioslogserver/var/cluster_hosts
localhost
*.*.20.93
*.*.20.94
ip a
1: lo: <LOOPBACK,UP,LOWER_UP> mtu 16436 qdisc noqueue state UNKNOWN
    link/loopback 00:00:00:00:00:00 brd 00:00:00:00:00:00
    inet 127.0.0.1/8 scope host lo
    inet6 ::1/128 scope host
       valid_lft forever preferred_lft forever
2: eth0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc pfifo_fast state UP qlen 1000
    link/ether 00:50:56:95:1d:28 brd ff:ff:ff:ff:ff:ff
    inet *.*.20.93/23 brd *.*.21.255 scope global eth0
    inet6 fe80::250:56ff:fe95:1d28/64 scope link
       valid_lft forever preferred_lft forever

cat /usr/local/nagioslogserver/var/node_uuid
20370d97-c74e-4d34-9933-0646a06bf34e
cat /usr/local/nagioslogserver/var/cluster_uuid
886c2610-e1e0-4f9f-a33c-5de17cff9435
cat /usr/local/nagioslogserver/var/cluster_hosts
localhost

UNLS01lxv
*.*.20.93
*.*.20.94
ip a
1: lo: <LOOPBACK,UP,LOWER_UP> mtu 16436 qdisc noqueue state UNKNOWN
    link/loopback 00:00:00:00:00:00 brd 00:00:00:00:00:00
    inet 127.0.0.1/8 scope host lo
    inet6 ::1/128 scope host
       valid_lft forever preferred_lft forever
2: eth0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc pfifo_fast state UP qlen 1000
    link/ether 00:50:56:95:06:0a brd ff:ff:ff:ff:ff:ff
    inet *.*.20.94/23 brd *.*.21.255 scope global eth0
    inet6 fe80::250:56ff:fe95:60a/64 scope link
       valid_lft forever preferred_lft forever

Re: Cluster Health status yellow Indices not showing on 1 le

Posted: Thu Sep 03, 2015 4:52 pm
by jolson
Try removing the following from your cluster_hosts file:
UNLS01lxv

Does the cluster UUID 886c2610-e1e0-4f9f-a33c-5de17cff9435 match the UUID you see on the 'Administration -> Cluster Status' page of the Web GUI?
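
For the cluster_hosts cleanup suggested above, a minimal sketch of one way to preview the result before touching the real file. The helper name `strip_host` is hypothetical, and the sample file only mirrors the output posted earlier; it prints to stdout and does not modify anything.

```shell
# Hypothetical helper: print a hosts file without one exact entry
# and without blank lines. Nothing is written back to the file.
strip_host() {
  # $1 = path to the hosts file, $2 = exact line to drop
  grep -v -x -F -e "$2" "$1" | grep -v '^[[:space:]]*$'
}

# Demo on a temporary copy that mirrors the second node's file above.
sample=$(mktemp)
printf 'localhost\n\nUNLS01lxv\n*.*.20.93\n*.*.20.94\n' > "$sample"
cleaned=$(strip_host "$sample" UNLS01lxv)
echo "$cleaned"
rm -f "$sample"
```

If the preview looks right, the same edit can be made by hand in /usr/local/nagioslogserver/var/cluster_hosts; keeping a backup copy of the file first is a sensible precaution.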

Re: Cluster Health status yellow Indices not showing on 1 le

Posted: Tue Sep 08, 2015 3:05 pm
by Jklre
The UUID does match: 886c2610-e1e0-4f9f-a33c-5de17cff9435

The cluster_hosts file now matches on both nodes:

localhost
172.24.20.93
172.24.20.94

I restarted each leg, and now UNLS01LXV shows: "No results. There were no results because no indices were found that match your selected time span."

Re: Cluster Health status yellow Indices not showing on 1 le

Posted: Tue Sep 08, 2015 4:12 pm
by jolson
Jklre wrote: I did a restart of each leg and now on UNLS01LXV "No results There were no results because no indices were found that match your selected time span"
Did the indices populate properly on the 'Index Status' page? Are the date and time on your two servers synchronized?

Code:

date

Please run the following command on each leg and report the output:

Code:

curl -s localhost:9200/_cat/shards
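
The shard listing can be long, so here is a small sketch for summarizing it. It assumes the usual `_cat/shards` column order (index, shard, prirep, state, ...); the function name `count_unassigned` is hypothetical and the sample lines are invented for the demo, not real output from this cluster.

```shell
# Hypothetical helper: count UNASSIGNED shards, split by primary (p) and replica (r).
# Reads _cat/shards style text on stdin; column 3 is prirep, column 4 is state.
count_unassigned() {
  awk '$4 == "UNASSIGNED" { n[$3]++ }
       END { printf "primary=%d replica=%d\n", n["p"], n["r"] }'
}

# Invented sample lines, only to show the expected input shape.
sample='logstash-2015.09.08 0 p STARTED 1200 1.1mb 172.24.20.93 node-a
logstash-2015.09.08 0 r UNASSIGNED
logstash-2015.09.08 1 p STARTED 1180 1.0mb 172.24.20.93 node-a
logstash-2015.09.08 1 r UNASSIGNED'

summary=$(printf '%s\n' "$sample" | count_unassigned)
echo "$summary"

# On a real leg the same function would be fed from the command above:
#   curl -s localhost:9200/_cat/shards | count_unassigned
```

A non-zero primary count would point to something more serious than a yellow status; replica-only counts match the Cluster Health figures earlier in the thread.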