adding extra NLS datapath

This support forum board is for support questions relating to Nagios Log Server, our solution for managing and monitoring critical log data.
rocheryderm
Posts: 69
Joined: Fri Jul 13, 2018 1:09 pm

Re: adding extra NLS datapath

Post by rocheryderm »

ZOOMING!!!! THANKS!
scottwilkerson
DevOps Engineer
Posts: 19396
Joined: Tue Nov 15, 2011 3:11 pm
Location: Nagios Enterprises

Re: adding extra NLS datapath

Post by scottwilkerson »

rocheryderm wrote: ZOOMING!!!! THANKS!
Awesome!
Former Nagios employee
Creator:
Human Design Website
Get Your Human Design Chart

Re: adding extra NLS datapath

Post by rocheryderm »

So... recovery seems to have hung up.

The cluster status is red, 1006 shards are still unassigned, and it has stopped assigning them. It's been this way for a few hours now. All unassigned shards have this status:

"UNASSIGNED CLUSTER_RECOVERED"

Any thoughts?

Code:

# curl -XGET 'localhost:9200/_cluster/health?pretty'
{
  "cluster_name" : "15edd11f-8263-4eb7-9054-8ace66feebb6",
  "status" : "red",
  "timed_out" : false,
  "number_of_nodes" : 4,
  "number_of_data_nodes" : 4,
  "active_primary_shards" : 1315,
  "active_shards" : 2630,
  "relocating_shards" : 0,
  "initializing_shards" : 0,
  "unassigned_shards" : 1006,
  "delayed_unassigned_shards" : 0,
  "number_of_pending_tasks" : 0,
  "number_of_in_flight_fetch" : 0
}

# curl -XGET 'http://localhost:9200/_cluster/settings?pretty'
{
  "persistent" : { },
  "transient" : {
    "cluster" : {
      "routing" : {
        "allocation" : {
          "node_concurrent_recoveries" : "6"
        }
      }
    },
    "indices" : {
      "recovery" : {
        "max_bytes_per_sec" : "250mb"
      }
    }
  }
}
I've been trying to figure out exactly why the shards are unassigned with this command:

Code:

curl -XGET 'localhost:9200/_cluster/allocation/explain?pretty' -d '
{ "index": "randomindexname-2019.09.23",
 "shard": 0,
 "primary": true
}
';
but I keep getting this error no matter which index I query, and I don't understand why.

Code:

{
  "error" : "IndexMissingException[[_cluster] missing]",
  "status" : 404
}

I'm running out of ideas. Sadly, I haven't created snapshot repositories yet so recovering that way is not an option.
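
A note on the 404 above: `IndexMissingException[[_cluster] missing]` suggests the node treated `_cluster` as an index name, which is what happens on Elasticsearch releases that predate the allocation explain API (it was added in 5.0). Assuming that is the case here, a rough substitute is to pull the unassigned rows and their reason out of `_cat/shards`. This is only a sketch: `list_unassigned` is a hypothetical helper name, the host and port are copied from the commands above, and the `unassigned.reason` column is assumed to exist on this version (the "UNASSIGNED CLUSTER_RECOVERED" status quoted earlier suggests it does).

```shell
# Read _cat/shards output on stdin and print one line per unassigned shard.
# Expected columns: index shard prirep state unassigned.reason
# Prints "-" when no reason column is present for a row.
list_unassigned() {
    awk '$4 == "UNASSIGNED" { print $1, $2, $3, ($5 == "" ? "-" : $5) }'
}

# Against a live node (host and port are assumptions):
# curl -s 'localhost:9200/_cat/shards?h=index,shard,prirep,state,unassigned.reason' | list_unassigned
```

Piping the result through `wc -l` should give a count comparable to `unassigned_shards` in the health output.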

Re: adding extra NLS datapath

Post by scottwilkerson »

How much free space do you have on each of the nodes? If any instance is more than 70% utilized, it could cause issues.

Code:

df -h
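
To make the 70% check repeatable across nodes, the `df` output can be filtered instead of read by eye. A minimal sketch, assuming POSIX-format output from `df -P` and mount points without spaces; `over_threshold` is a hypothetical helper name, and the default of 70 simply mirrors the figure mentioned above:

```shell
# Read `df -P` output on stdin and print "mountpoint capacity" for every
# filesystem whose utilization is strictly above the given percentage.
over_threshold() {
    limit="${1:-70}"
    awk -v limit="$limit" '
        NR > 1 {                      # skip the header row
            use = $5                  # Capacity column, e.g. "27%"
            sub(/%/, "", use)
            if (use + 0 > limit) print $6, $5
        }'
}

# On each node:
# df -P | over_threshold 70
```

No output means no filesystem on that node is above the limit.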
rocheryderm wrote: Sadly, I haven't created snapshot repositories yet so recovering that way is not an option.
This is VERY unfortunate.

Re: adding extra NLS datapath

Post by rocheryderm »

I'm OK with data loss; the system is not really in production. I would just like to avoid gaps if possible.

PLENTY of space. I've just added 4TB to every node, and df -h shows the highest utilization is 27%.

No worries there.

Here's an example of shard allocation on one of the indices with unassigned shards:

Code:

curl -XGET 'localhost:9200/_cat/shards/somerandomindex'
somerandomindex 2 r STARTED    3005596 2gb 151.120.113.53 39e75611-f913-4be5-969e-b6ad41fd5437
somerandomindex 2 p STARTED    3005596 2gb 151.120.113.51 77596958-30db-4cb4-bf11-09e114a44012
somerandomindex 0 p UNASSIGNED
somerandomindex 0 r UNASSIGNED
somerandomindex 3 p UNASSIGNED
somerandomindex 3 r UNASSIGNED
somerandomindex 1 p UNASSIGNED
somerandomindex 1 r UNASSIGNED
somerandomindex 4 p STARTED    3006740 2gb 151.120.113.51 77596958-30db-4cb4-bf11-09e114a44012
somerandomindex 4 r STARTED    3006740 2gb 151.120.113.54 dda7f85c-6641-4b98-b573-fbdf7121c025
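
With over a thousand unassigned shards, it helps to separate shards that still have one live copy from shards that have none at all, like 0, 1 and 3 in the listing above. The sketch below assumes the default `_cat/shards` column order shown in that listing (index, shard, prirep, state, ...); `missing_shards` is a hypothetical helper name. It counts a copy as live when its state is STARTED or RELOCATING.

```shell
# Read default _cat/shards output on stdin and print "index shard" for every
# shard number that has no live copy (neither primary nor replica).
missing_shards() {
    awk '
        { key = $1 " " $2; seen[key] = 1 }
        $4 == "STARTED" || $4 == "RELOCATING" { live[key] = 1 }
        END { for (k in seen) if (!(k in live)) print k }
    ' | sort
}

# Against a live node (host and port are assumptions):
# curl -s 'localhost:9200/_cat/shards' | missing_shards
```

Any index that appears in this output has lost data for the listed shard numbers and cannot return to green by reallocation alone.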

Re: adding extra NLS datapath

Post by scottwilkerson »

Somehow you don't have either a primary or a replica for shards 0, 1, or 3 of somerandomindex.

Given this, and no backup, I can only suggest you close that index.
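
Closing an index can be done through Elasticsearch's close index API (`POST /<index>/_close`). A cautious sketch for doing this across many indices is to print the commands first and review them before running anything. `close_commands` is a hypothetical helper name, and the default host and port are assumptions copied from the commands earlier in the thread.

```shell
# Read index names on stdin (one per line, duplicates allowed) and print,
# without executing, one close command per distinct index.
close_commands() {
    host="${1:-localhost:9200}"
    sort -u | while read -r index; do
        if [ -n "$index" ]; then
            printf "curl -XPOST '%s/%s/_close'\n" "$host" "$index"
        fi
    done
}

# Review the output, then pipe it to `sh` only if it looks right:
# printf 'somerandomindex\n' | close_commands
```

Printing rather than executing is deliberate here, since closing takes the whole index offline for searches, including the shards that are still healthy.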

Re: adding extra NLS datapath

Post by rocheryderm »

scottwilkerson wrote: Somehow you don't have either a primary or a replica for shards 0, 1, or 3 of somerandomindex.

Given this, and no backup, I can only suggest you close that index.
Ugh.

OK, well, I guess I'll have to get repositories and snapshots configured.

If I close corrupt indices, will curator still function against them when I purge old indices?
Last edited by rocheryderm on Fri Nov 15, 2019 3:37 pm, edited 1 time in total.

Re: adding extra NLS datapath

Post by scottwilkerson »

rocheryderm wrote: If I close corrupt indices, will conductor still function against them when I purge old indices?
I don't know what you mean by conductor.

Re: adding extra NLS datapath

Post by rocheryderm »

sorry.

CURATOR

Re: adding extra NLS datapath

Post by scottwilkerson »

Yes, curator will still delete closed indexes when it is time to do so.

Worth noting: it will not back them up, which it could not do anyway because they are incomplete.