nagioslogserver_history stuck INITIALIZING

This support forum board is for support questions relating to Nagios Log Server, our solution for managing and monitoring critical log data.
Locked
danniiffxi
Posts: 121
Joined: Tue Jan 30, 2018 3:29 am
Location: UK

nagioslogserver_history stuck INITIALIZING

Post by danniiffxi »

Hi,

This morning I was unable to log into Nagios Log Server. I eventually worked out that it is built on a floating IP, so if one instance fails it jumps to the other, but this is where the fun began. I was able to log into the individual instances by going directly to their https://servername/nagioslogserver URL rather than our DNS entry. When I logged in, I noticed that the two servers had different instance IDs, and in the Instance Status window each server was missing the other, as if they had started to act independently and left the cluster.

Anyway, I powered off the second instance and left the primary alone for 10 minutes. I then powered the second server back on and the two instances became one cluster again. But now I was faced with this mess:

Code: Select all

[root@naglp01 ~]# curl -XGET 'http://localhost:9200/_cluster/health?pretty=true'
{
  "cluster_name" : "8e96de2d-514c-4909-8b28-b596c70b50e0",
  "status" : "red",
  "timed_out" : false,
  "number_of_nodes" : 2,
  "number_of_data_nodes" : 2,
  "active_primary_shards" : 59,
  "active_shards" : 66,
  "relocating_shards" : 0,
  "initializing_shards" : 4,
  "unassigned_shards" : 52,
  "delayed_unassigned_shards" : 0,
  "number_of_pending_tasks" : 0,
  "number_of_in_flight_fetch" : 0
}
I let Elasticsearch do its thing and left it for a couple of hours, and sure enough, most of the unassigned shards had once again found their place.

Code: Select all

[root@naglp01 ~]# curl -XGET 'http://localhost:9200/_cluster/health?pretty=true'
{
  "cluster_name" : "8e96de2d-514c-4909-8b28-b596c70b50e0",
  "status" : "red",
  "timed_out" : false,
  "number_of_nodes" : 2,
  "number_of_data_nodes" : 2,
  "active_primary_shards" : 55,
  "active_shards" : 110,
  "relocating_shards" : 0,
  "initializing_shards" : 1,
  "unassigned_shards" : 1,
  "delayed_unassigned_shards" : 0,
  "number_of_pending_tasks" : 0,
  "number_of_in_flight_fetch" : 0
}
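
For anyone following along, the three fields that matter in the health output above (status, initializing_shards, unassigned_shards) can be pulled out with a small shell helper. This is only a sketch: `health_field` and `health_summary` are hypothetical names, not part of Nagios Log Server or Elasticsearch, and it assumes the pretty-printed layout shown above with one field per line.

```shell
# Print the value of one field from pretty-printed _cluster/health JSON on stdin.
# Assumes one "name" : value pair per line, as in the output above.
health_field() {
  awk -v f="\"$1\"" '$1 == f { v = $3; gsub(/[",]/, "", v); print v }'
}

# Read the health JSON once from stdin and print a one-line summary.
health_summary() {
  json=$(cat)
  printf 'status=%s initializing=%s unassigned=%s\n' \
    "$(printf '%s\n' "$json" | health_field status)" \
    "$(printf '%s\n' "$json" | health_field initializing_shards)" \
    "$(printf '%s\n' "$json" | health_field unassigned_shards)"
}

# Example use (localhost:9200 is the address used in this thread):
#   curl -s 'http://localhost:9200/_cluster/health?pretty=true' | health_summary
```

Run against the second health output above, this would report the red status together with the one initializing and one unassigned shard.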
 
But it has been a couple of hours since then and I am still stuck with the following, and I am unsure how to proceed. The instance status is still red.

Code: Select all

[root@naglp02 ~]# curl -s -XGET http://localhost:9200/_cat/shards?v | egrep 'UNASSIGNED|INITIALIZING'
nagioslogserver_history 2     p      INITIALIZING                 10.31.10.152 e6cd8034-67c9-4b5a-b913-3808cd5caf13
nagioslogserver_history 2     r      UNASSIGNED
Any ideas? Do I just

Code: Select all

curl -XDELETE 'http://localhost:9200/nagioslogserver_history/'
or will that break it even more?
vtrac
Posts: 903
Joined: Tue Oct 27, 2020 1:35 pm

Re: nagioslogserver_history stuck INITIALIZING

Post by vtrac »

Hi danniiffxi,
Is the IP "10.31.10.152" (below) correct for the "p" (primary) server?
You mentioned that it uses a floating IP, which is why I ask.

Code: Select all

[root@naglp02 ~]# curl -s -XGET http://localhost:9200/_cat/shards?v | egrep 'UNASSIGNED|INITIALIZING'
nagioslogserver_history 2     p      INITIALIZING                 10.31.10.152 e6cd8034-67c9-4b5a-b913-3808cd5caf13
nagioslogserver_history 2     r      UNASSIGNED
Let's try taking down both Log Servers completely (wait a couple of minutes), then bringing up just the primary ("p").

Run the command you used (below) and keep checking until all primary ("p") shards have initialized and show "STARTED" (assigned), then bring the replica ("r") server up.

Code: Select all

curl -XGET 'http://localhost:9200/_cat/shards?v' | egrep 'UNASSIGNED|INITIALIZING'
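
The check described above (wait until every primary shard is STARTED before bringing the second server up) could be scripted roughly as follows. This is a sketch only: `pending_primaries` and `wait_for_primaries` are made-up helper names, and the 30-second interval and the localhost:9200 default are assumptions. It relies on the `_cat/shards?v` column order seen earlier in the thread (index, shard, prirep, state).

```shell
# List primary shards that are not yet STARTED.
# Reads `_cat/shards?v` output on stdin; the header row is skipped
# automatically because its third column is "prirep", not "p".
pending_primaries() {
  awk '$3 == "p" && $4 != "STARTED" { print $1, $2, $4 }'
}

# Poll until no primary shard is pending. $1 is the base URL (assumed default below).
# An unreachable or empty response is treated as "not ready yet", not as success.
wait_for_primaries() {
  url="${1:-http://localhost:9200}/_cat/shards?v"
  while :; do
    if shards=$(curl -s -f "$url") && [ -n "$shards" ]; then
      pending=$(printf '%s\n' "$shards" | pending_primaries)
      [ -z "$pending" ] && return 0
      printf '%s\n' "$pending"   # show what is still outstanding
    fi
    sleep 30
  done
}

# Example use, run on the primary before powering on the second server:
#   wait_for_primaries && echo "all primaries STARTED"
```

Anything other than STARTED is counted as pending here, which is stricter than strictly necessary but matches the advice to wait for every primary before starting the replica node.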
Regards,
Vinh
danniiffxi
Posts: 121
Joined: Tue Jan 30, 2018 3:29 am
Location: UK

Re: nagioslogserver_history stuck INITIALIZING

Post by danniiffxi »

Hi Vinh

Thanks, it is all working now. I did as you said: all shards on the primary started within a few minutes. I then powered on the secondary, the status went to Yellow, and after a few hours it went from Yellow to Green.
scottwilkerson
DevOps Engineer
Posts: 19396
Joined: Tue Nov 15, 2011 3:11 pm
Location: Nagios Enterprises

Re: nagioslogserver_history stuck INITIALIZING

Post by scottwilkerson »

Great!

Locking thread
Former Nagios employee