Hello,
I think I might be experiencing an issue after a recent logstash failure.
I ran the following command on the primary node in my Log Server cluster: curl -XGET 'http://localhost:9200/_cat/shards?v'
It appears that the nagioslogserver_log index is stuck INITIALIZING, and it's keeping my cluster status at Yellow.
Please see attached export from Log Server.
Not sure what to do to get it back to Green. Also, I'm having trouble with snapshots again: once we try to snapshot 30 days of indices, the system just can't handle it and we get logstash failures.
nagioslogserver_log INITIALIZING - after logstash failure
Re: nagioslogserver_log INITIALIZING - after logstash failure
Was the output modified to remove the node information (UUID/IP)? I would start by restarting the elasticsearch service on the node(s) where the replicas are found, and if that doesn't do the trick, restart the service on the primary (after the first server's service comes back up).
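A sketch of that sequence, assuming a standard service named elasticsearch (adjust the service command to your init system):

```shell
# On the node holding the stuck replica shards:
service elasticsearch restart

# Once the node rejoins, watch the stuck shards recover
# (anything not STARTED is still relocating/initializing):
curl -s -XGET 'http://localhost:9200/_cat/shards?v' | grep -v STARTED

# Check overall cluster health until status returns to green:
curl -s -XGET 'http://localhost:9200/_cluster/health?pretty'
```

If the replicas still won't initialize after that, repeat the restart on the node holding the primary shards.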
Re: nagioslogserver_log INITIALIZING - after logstash failure
Yes, I omitted the IPs. I didn't think they would matter for my question.
Thank you, I'll try that.
Re: nagioslogserver_log INITIALIZING - after logstash failure
Unfortunately, that did not work. There are still two log shards that are stuck initializing.
I even took the cluster completely down and brought it back up.
Any other ideas?
Re: nagioslogserver_log INITIALIZING - after logstash failure
Actually, disregard my last post. We're back. It looks like taking down Log Server completely and bringing it back up did the trick. There's a snapshot running now and the system is Green.
Now onto my second question: we seem to be having a lot of inconsistency with our snapshots. We're fine doing anything between 14 and 29 days, but as soon as we try to snapshot 30 days, logstash starts failing or the system becomes unresponsive.
I used this article to resolve our issues before: https://support.nagios.com/kb/article/n ... g-576.html
Is it possible we need to increase those numbers even further?
Thank you.
Re: nagioslogserver_log INITIALIZING - after logstash failure
If you see those errors in the console, you can try increasing those variables again to see if it helps.
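For illustration only, on many Log Server installs of this era the heap settings live in the sysconfig files below; the exact paths, variable names, and values are assumptions, so verify them against the KB article before editing:

```shell
# Check the current heap settings (paths/variables are assumptions):
grep HEAP /etc/sysconfig/elasticsearch   # e.g. ES_HEAP_SIZE=2g
grep HEAP /etc/sysconfig/logstash        # e.g. LS_HEAP_SIZE=1g

# After raising the values, restart the services so they take effect:
service elasticsearch restart
service logstash restart
```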
Also, check the log files in the following folders when the server is having the issue and if you find any errors, post them here.
/var/log/logstash/ and /var/log/elasticsearch/
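For example, a quick scan of both directories for error-level messages (exact log file names vary by version):

```shell
# Search recursively for common failure signatures and show the last matches:
grep -ri 'error\|exception\|outofmemory' \
    /var/log/logstash/ /var/log/elasticsearch/ | tail -n 50
```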
Re: nagioslogserver_log INITIALIZING - after logstash failure
Thank you.
What command would I use to see whether a snapshot is currently running, or the status of a previous snapshot?
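One way to check is Elasticsearch's snapshot status API; the repository name nlsrep below is taken from the cluster-state output later in this thread, so substitute yours if it differs:

```shell
# Show any snapshots currently running, in any repository:
curl -s -XGET 'http://localhost:9200/_snapshot/_status?pretty'

# List all snapshots in a repository with their state
# (SUCCESS, IN_PROGRESS, FAILED, ...):
curl -s -XGET 'http://localhost:9200/_snapshot/nlsrep/_all?pretty'
```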
Re: nagioslogserver_log INITIALIZING - after logstash failure
What does it mean when the Command Subsystem says WAITING for the snapshot, but Snapshots and Maintenance still shows the snapshot IN PROGRESS?
This is the issue we keep seeing. The subsystem says the snapshot is complete, but when we go to check, the system locks up and says the snapshot is still running.
Re: nagioslogserver_log INITIALIZING - after logstash failure
I ran the following command and received this output. Currently, my log server system is completely locked up...
[root@nagioslscc2 xxxxxxx]# curl -s -XGET 'http://localhost:9200/_cluster/state?pretty' | grep snapshot -A 100
"snapshot" : {
"_timestamp" : {
"enabled" : true
},
"properties" : {
"path" : {
"type" : "string"
},
"auto" : {
"type" : "long"
},
"filename" : {
"type" : "string"
},
"created" : {
"type" : "long"
},
"name" : {
"type" : "string"
},
"clean_filename" : {
"type" : "string"
}
}
},
"_default_" : {
"_timestamp" : {
"enabled" : true
},
"properties" : { }
},
"commands" : {
"_timestamp" : {
"enabled" : true
},
"properties" : {
"last_run_status" : {
"type" : "string"
},
"run_time" : {
"type" : "long"
},
"created" : {
"type" : "string"
},
"active" : {
"type" : "long"
},
"type" : {
"type" : "string"
},
"created_by" : {
"type" : "string"
},
"command" : {
"type" : "string"
},
"last_run_output" : {
"type" : "string"
},
"last_run_time" : {
"type" : "string"
},
"frequency" : {
"type" : "string"
},
"args" : {
"properties" : {
"sh_id" : {
"type" : "string"
},
"path" : {
"type" : "string"
},
"timezone" : {
"type" : "string"
},
"sh_created" : {
"type" : "long"
},
"id" : {
"type" : "string"
}
}
},
"node" : {
"index" : "not_analyzed",
"type" : "string"
},
"modified_by" : {
"type" : "string"
},
"modified" : {
"type" : "string"
},
"status" : {
"type" : "string"
}
}
},
"node" : {
--
"snapshots" : {
"snapshots" : [ {
"repository" : "nlsrep",
"snapshot" : "curator-20181221123022",
"include_global_state" : true,
"state" : "STARTED",
"indices" : [ "logstash-2018.11.21", "logstash-2018.11.22", "logstash-2018.11.23", "logstash-2018.11.24", "logstash-2018.11.25", "logstash-2018.11.26", "logstash-2018.11.27", "logstash-2018.11.28", "logstash-2018.11.29", "logstash-2018.11.30", "logstash-2018.12.01", "logstash-2018.12.02", "logstash-2018.12.03", "logstash-2018.12.04", "logstash-2018.12.05", "logstash-2018.12.06", "logstash-2018.12.07", "logstash-2018.12.08", "logstash-2018.12.09", "logstash-2018.12.10", "logstash-2018.12.11", "logstash-2018.12.12", "logstash-2018.12.13", "logstash-2018.12.14", "logstash-2018.12.15", "logstash-2018.12.16", "logstash-2018.12.17", "logstash-2018.12.18", "logstash-2018.12.19", "logstash-2018.12.20" ],
"shards" : [ {
"index" : "logstash-2018.11.22",
"shard" : 2,
"state" : "SUCCESS",
"node" : "OMSbsV-8RXeTUT1LcgeotQ"
}, {
"index" : "logstash-2018.12.03",
"shard" : 2,
"state" : "SUCCESS",
"node" : "OMSbsV-8RXeTUT1LcgeotQ"
}, {
"index" : "logstash-2018.12.15",
"shard" : 3,
"state" : "SUCCESS",
"node" : "OMSbsV-8RXeTUT1LcgeotQ"
}, {
"index" : "logstash-2018.11.22",
"shard" : 1,
"state" : "SUCCESS",
"node" : "OMSbsV-8RXeTUT1LcgeotQ"
}, {
"index" : "logstash-2018.12.03",
"shard" : 1,
"state" : "SUCCESS",
"node" : "OMSbsV-8RXeTUT1LcgeotQ"
}, {
"index" : "logstash-2018.12.15",
"shard" : 2,
"state" : "SUCCESS",
"node" : "OMSbsV-8RXeTUT1LcgeotQ"
}, {
"index" : "logstash-2018.11.22",
"shard" : 4,
"state" : "SUCCESS",
"node" : "OMSbsV-8RXeTUT1LcgeotQ"
}, {
"index" : "logstash-2018.12.03",
"shard" : 0,
"state" : "SUCCESS",
"node" : "OMSbsV-8RXeTUT1LcgeotQ"
}, {
"index" : "logstash-2018.12.15",
"shard" : 1,
"state" : "SUCCESS",
"node" : "OMSbsV-8RXeTUT1LcgeotQ"
}, {
"index" : "logstash-2018.11.22",
"shard" : 3,
"state" : "SUCCESS",
"node" : "OMSbsV-8RXeTUT1LcgeotQ"
}, {
"index" : "logstash-2018.12.15",
"shard" : 0,
"state" : "SUCCESS",
"node" : "OMSbsV-8RXeTUT1LcgeotQ"
}, {
"index" : "logstash-2018.12.02",
"shard" : 4,
"state" : "SUCCESS",
"node" : "OMSbsV-8RXeTUT1LcgeotQ"
}, {
"index" : "logstash-2018.11.23",
"shard" : 1,
"state" : "SUCCESS",
"node" : "OMSbsV-8RXeTUT1LcgeotQ"
}, {
"index" : "logstash-2018.12.02",
"shard" : 3,
"state" : "SUCCESS",
"node" : "OMSbsV-8RXeTUT1LcgeotQ"
}, {
"index" : "logstash-2018.12.14",
"shard" : 4,
"state" : "SUCCESS",
"node" : "OMSbsV-8RXeTUT1LcgeotQ"
}, {
"index" : "logstash-2018.11.23",
"shard" : 0,
"state" : "SUCCESS",
"node" : "OMSbsV-8RXeTUT1LcgeotQ"
}, {
"index" : "logstash-2018.12.02",
"shard" : 2,
"state" : "SUCCESS",
"node" : "OMSbsV-8RXeTUT1LcgeotQ"
}, {
"index" : "logstash-2018.12.14",
"shard" : 3,
"state" : "SUCCESS",
"node" : "OMSbsV-8RXeTUT1LcgeotQ"
}, {
"index" : "logstash-2018.11.23",
"shard" : 3,
"state" : "SUCCESS",
"node" : "OMSbsV-8RXeTUT1LcgeotQ"
}, {
"index" : "logstash-2018.12.04",
Re: nagioslogserver_log INITIALIZING - after logstash failure
See the attached screenshots for the issue described. Two snapshots show IN PROGRESS simultaneously, even though the Command Subsystem says WAITING.