nagioslogserver_history stuck INITIALIZING
Posted: Fri Feb 12, 2021 9:42 am
Hi,
This morning I went to log into Nagios Log Server and was unable to. I eventually worked out that it is built on a floating IP, so if one instance fails it jumps to the other, but this is where the fun began. I was able to log into the individual instances by going directly to their https://servername/nagioslogserver URL rather than our DNS entry. When I logged into the servers I noticed they both had different instance IDs, and in the Instance Status window each server was missing the other, as if they had started to act independently and left the cluster.
Anyway, I powered off the second instance and left the primary alone for 10 minutes. I then powered the second server back on and the two instances became one cluster again. But now I was faced with this mess (the first cluster health output below).
I let Elasticsearch do its thing. I left it for a couple of hours and, sure enough, most of the unassigned shards had once again found their place (the second cluster health output below).
But it has now been a couple of hours and I am still stuck with one nagioslogserver_history primary shard INITIALIZING and its replica UNASSIGNED (the shard listing below), and I am unsure how to proceed. Instance status is still red.
Any ideas? Do I just delete the nagioslogserver_history index (the DELETE command at the end of this post), or will that break it even more?
Code: Select all
[root@naglp01 ~]# curl -XGET 'http://localhost:9200/_cluster/health?pretty=true'
{
"cluster_name" : "8e96de2d-514c-4909-8b28-b596c70b50e0",
"status" : "red",
"timed_out" : false,
"number_of_nodes" : 2,
"number_of_data_nodes" : 2,
"active_primary_shards" : 59,
"active_shards" : 66,
"relocating_shards" : 0,
"initializing_shards" : 4,
"unassigned_shards" : 52,
"delayed_unassigned_shards" : 0,
"number_of_pending_tasks" : 0,
"number_of_in_flight_fetch" : 0
}
Code: Select all
[root@naglp01 ~]# curl -XGET 'http://localhost:9200/_cluster/health?pretty=true'
{
"cluster_name" : "8e96de2d-514c-4909-8b28-b596c70b50e0",
"status" : "red",
"timed_out" : false,
"number_of_nodes" : 2,
"number_of_data_nodes" : 2,
"active_primary_shards" : 55,
"active_shards" : 110,
"relocating_shards" : 0,
"initializing_shards" : 1,
"unassigned_shards" : 1,
"delayed_unassigned_shards" : 0,
"number_of_pending_tasks" : 0,
"number_of_in_flight_fetch" : 0
}
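To compare the two health snapshots above without eyeballing the JSON, something like the following shell sketch could pull out just the fields that matter. The helper name health_field is made up for illustration, and it assumes the pretty-printed layout shown above (one "key" : value pair per line); it is not a general JSON parser. The sample values are taken from the second output.

```shell
#!/bin/sh
# health_field KEY: read _cluster/health?pretty=true JSON on stdin and
# print the value of KEY. Hypothetical helper; relies on the
# one-pair-per-line pretty-printed layout, not on real JSON parsing.
health_field() {
    sed -n 's/^[[:space:]]*"'"$1"'"[[:space:]]*:[[:space:]]*"\{0,1\}\([^",[:space:]]*\)"\{0,1\},\{0,1\}[[:space:]]*$/\1/p'
}

# Sample input: three fields copied from the second health output above.
sample='{
"status" : "red",
"initializing_shards" : 1,
"unassigned_shards" : 1
}'

# Print each field of interest as key=value.
for key in status initializing_shards unassigned_shards; do
    printf '%s=%s\n' "$key" "$(printf '%s\n' "$sample" | health_field "$key")"
done

# Against a live node, the same helper could be fed from curl, e.g.:
#   curl -s 'http://localhost:9200/_cluster/health?pretty=true' | health_field status
```

Run against both snapshots, this makes the change easy to see: unassigned_shards drops from 52 to 1 while status stays red.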
Code: Select all
[root@naglp02 ~]# curl -s -XGET http://localhost:9200/_cat/shards?v | egrep 'UNASSIGNED|INITIALIZING'
nagioslogserver_history 2 p INITIALIZING 10.31.10.152 e6cd8034-67c9-4b5a-b913-3808cd5caf13
nagioslogserver_history 2 r UNASSIGNED
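As a variation on the egrep filter above, here is a rough sketch that matches on the state column only, so an index or node name containing one of those words would not cause a false hit. The function name stuck_shards is hypothetical, the column order is assumed to be the one in the listing above (index, shard, prirep, state, ...), and the example_index line is a made-up STARTED shard added purely for contrast.

```shell
#!/bin/sh
# stuck_shards: read `_cat/shards?v` output on stdin and print
# "index shard prirep state" for every shard whose state is not STARTED.
# Hypothetical helper; assumes state is the 4th whitespace-separated column.
stuck_shards() {
    awk '$1 != "index" && $4 != "STARTED" { print $1, $2, $3, $4 }'
}

# Sample input: an assumed header row, one invented STARTED shard,
# and the two problem shards from the listing above.
printf '%s\n' \
    'index shard prirep state docs store ip node' \
    'example_index 0 p STARTED 10 1kb 10.0.0.1 node-a' \
    'nagioslogserver_history 2 p INITIALIZING 10.31.10.152 e6cd8034-67c9-4b5a-b913-3808cd5caf13' \
    'nagioslogserver_history 2 r UNASSIGNED' | stuck_shards

# Against a live node this could be fed from curl, e.g.:
#   curl -s 'http://localhost:9200/_cat/shards?v' | stuck_shards
```

Note that this would also list RELOCATING shards, since it reports anything that is not STARTED.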
Code: Select all
curl -XDELETE 'http://localhost:9200/nagioslogserver_history/'