What is the output when you run:
ps aux | grep curator
?
Are the snapshots for 12-20-2018 and 12-21-2018 still in the IN_PROGRESS state? When snapshots are in that state, how does that compare with the output of the above command, and with the status shown when running the command:
curl -XGET 'http://localhost:9200/_snapshot/nlsrep/_all?pretty'
?
I wonder if the curator command (which is started by the snapshot_maintenance job) is just taking a very long time to run and jobs are then piling up. Or it may be crashing - in which case looking at the Elasticsearch logs in /var/log/elasticsearch/<CLUSTER_UUID>.log can usually give us more info as to why.
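As a quick way to spot stuck snapshots, here is a minimal sketch that flags any snapshot still IN_PROGRESS and how long it has been running, given the JSON returned by the _snapshot call above. The field names (snapshots, state, start_time_in_millis) follow the Elasticsearch snapshot API; the sample response here is hypothetical.

```python
import json
import time

# Hypothetical, trimmed-down sample of what the
# _snapshot/nlsrep/_all call might return.
sample = json.loads("""
{
  "snapshots": [
    {"snapshot": "logstash-2018.12.20", "state": "SUCCESS",
     "start_time_in_millis": 1545264000000},
    {"snapshot": "logstash-2018.12.21", "state": "IN_PROGRESS",
     "start_time_in_millis": 1545350400000}
  ]
}
""")

now_ms = int(time.time() * 1000)
for snap in sample["snapshots"]:
    if snap["state"] == "IN_PROGRESS":
        # Elapsed time since the snapshot started, in hours.
        hours = (now_ms - snap["start_time_in_millis"]) / 3600000.0
        print("%s has been IN_PROGRESS for %.1f hours" % (snap["snapshot"], hours))
```

If a snapshot has been IN_PROGRESS for many hours while the curator process from the ps output is still alive, that points at a slow run rather than a crash.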
nagioslogserver_log INITIALIZING - after logstash failure
Re: nagioslogserver_log INITIALIZING - after logstash failure
As of May 25th, 2018, all communications with Nagios Enterprises and its employees are covered under our new Privacy Policy.
Re: nagioslogserver_log INITIALIZING - after logstash failure
Here's the output for the first command:
[root@nagioslscc2 rferebee]# ps aux | grep curator
nagios 407 0.0 0.0 106076 1164 ? S Jan03 0:00 /bin/sh /usr/local/nagioslogserver/scripts/curator.sh optimize indices --older-than 2 --time-unit days --timestring %Y.%m.%d
nagios 408 0.0 0.0 215992 10920 ? S Jan03 0:00 /usr/bin/python /usr/bin/curator optimize indices --older-than 2 --time-unit days --timestring %Y.%m.%d
root 27715 0.0 0.0 2668 164 pts/0 D+ 13:46 0:00 grep curator
See attached for the .log file. I'm not sure what I'm looking for in there.
Re: nagioslogserver_log INITIALIZING - after logstash failure
It's sticking on the optimize command, which can require a lot of resources for minimal benefit - https://www.elastic.co/guide/en/elastic ... imize.html. Disable optimization by setting it to 0 under Admin > System > Snapshots & Maintenance > Maintenance Settings > Optimize Indexes older than.
Re: nagioslogserver_log INITIALIZING - after logstash failure
Will that affect the amount of storage needed to take a snapshot? Or, the storage on the system overall?
Re: nagioslogserver_log INITIALIZING - after logstash failure
It shouldn't have any noticeable impact on storage needs.
Re: nagioslogserver_log INITIALIZING - after logstash failure
Ok, we will give this a try. Please leave this thread open until I can confirm the Snapshots are working as intended.
Thanks for all your help!
Re: nagioslogserver_log INITIALIZING - after logstash failure
No problem. Keep us posted!
Re: nagioslogserver_log INITIALIZING - after logstash failure
Ok, snapshots appear to be working now. 2 days without issue.
My question now, why all of a sudden can we no longer optimize indexes? We've had that option set at '2 days' for well over a year. This is more of a curiosity question than anything else.
Re: nagioslogserver_log INITIALIZING - after logstash failure
The number of indices, documents, and segments in the database will affect how quickly the optimize command runs. Perhaps older indices were not getting optimized as expected, leading to a backlog of optimization work (anything older than 2 days by default), until it finally impacted snapshots. Segment information can be gathered from the command line. This gathers everything:
curl -XGET http://localhost:9200/_segments?pretty
and this for specific indices:
curl -XGET http://localhost:9200/logstash-YYYY.MM. ... nts?pretty
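The raw _segments output is verbose, so a short sketch like the following can total up segments per index from that JSON to show which indices never got merged down. The nesting (indices -> shards -> shard copies -> segments) follows the Elasticsearch indices segments API; the sample response is hypothetical and trimmed.

```python
import json

# Hypothetical, trimmed sample of a _segments response.
sample = json.loads("""
{
  "indices": {
    "logstash-2018.12.20": {
      "shards": {
        "0": [{"segments": {"_0": {}, "_1": {}, "_2": {}}}],
        "1": [{"segments": {"_0": {}}}]
      }
    }
  }
}
""")

# An index fully optimized to one segment per shard would show a
# total equal to its shard count; much higher totals mean the
# optimize job still has work to do there.
for index, data in sample["indices"].items():
    total = sum(len(copy["segments"])
                for shard in data["shards"].values()
                for copy in shard)
    print("%s: %d segments" % (index, total))
```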
Re: nagioslogserver_log INITIALIZING - after logstash failure
Turning off optimization seems to have a big impact on our storage. Ever since I set it to 0 days our snapshot repository has grown from 7% free space to 2% free space in roughly 9 days.
Instead of setting it at 0 days, we're going to set it at 15 days to see if perhaps the optimization task won't take as long. Hopefully the snapshots will finish over the weekend without failure.