Here's what I got when I ran that command.
[root@nagioslscc2 rferebee]# curl -XGET http://localhost:9200/_cluster/settings
{"persistent":{"cluster":{"routing":{"allocation":{"disk":{"watermark":{"low":"99%"}}}}}},"transient":{"plugin":{"knapsack":{"export":{"state":"[]"}}}}}
It appears my predecessor made some sort of change to the cluster settings.
Re: Nagios user java command using over 200% CPU
It looks like there was an attempt, anyway. The format is different from what we'd expect and, judging by the logged messages, not effective. You can run the command that was provided to overwrite it.
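In case it saves a lookup, the override would look something like this; this is a sketch assuming you want the stock Elasticsearch default of 85% for the low disk watermark (use the values from the command in the earlier post if they differ):

curl -XPUT 'http://localhost:9200/_cluster/settings' -d '{
  "persistent" : {
    "cluster.routing.allocation.disk.watermark.low" : "85%"
  }
}'

On Elasticsearch 5.0 and later you can instead set the value to null to drop the persistent override entirely, and newer releases also require -H 'Content-Type: application/json' on the request.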
Re: Nagios user java command using over 200% CPU
Ok, I made the change yesterday. A snapshot ran last night, and neither logstash nor elasticsearch failed.
However, it looks like the snapshot is still in progress, which is odd. Typically, if it hasn't failed, it's done by now.
In the Command Subsystem it shows State = Waiting, but when I run

curl -s -XGET 'http://localhost:9200/_cluster/state?pretty' | grep snapshot -A 100

I get the following output:
"snapshot" : {
"_timestamp" : {
"enabled" : true
},
"properties" : {
"path" : {
"type" : "string"
},
"auto" : {
"type" : "long"
},
"filename" : {
"type" : "string"
},
"created" : {
"type" : "long"
},
"name" : {
"type" : "string"
},
"clean_filename" : {
"type" : "string"
}
}
},
"_default_" : {
"_timestamp" : {
"enabled" : true
},
"properties" : { }
},
"commands" : {
"_timestamp" : {
"enabled" : true
},
"properties" : {
"last_run_status" : {
"type" : "string"
},
"run_time" : {
"type" : "long"
},
"created" : {
"type" : "string"
},
"active" : {
"type" : "long"
},
"type" : {
"type" : "string"
},
"created_by" : {
"type" : "string"
},
"command" : {
"type" : "string"
},
"last_run_output" : {
"type" : "string"
},
"last_run_time" : {
"type" : "string"
},
"frequency" : {
"type" : "string"
},
"args" : {
"properties" : {
"sh_id" : {
"type" : "string"
},
"path" : {
"type" : "string"
},
"timezone" : {
"type" : "string"
},
"sh_created" : {
"type" : "long"
},
"id" : {
"type" : "string"
}
}
},
"node" : {
"index" : "not_analyzed",
"type" : "string"
},
"modified_by" : {
"type" : "string"
},
"modified" : {
"type" : "string"
},
"status" : {
"type" : "string"
}
}
},
"node" : {
--
"snapshots" : {
"snapshots" : [ {
"repository" : "NLSREPCC",
"snapshot" : "curator-20190214064611",
"include_global_state" : true,
"state" : "STARTED",
"indices" : [ "logstash-2019.01.15", "logstash-2019.01.16", "logstash-2019.01.17", "logstash-2019.01.18", "logstash-2019.01.19", "logstash-2019.01.20", "logstash-2019.01.21", "logstash-2019.01.22", "logstash-2019.01.23", "logstash-2019.01.24", "logstash-2019.01.25", "logstash-2019.01.26", "logstash-2019.01.27", "logstash-2019.01.28", "logstash-2019.01.29", "logstash-2019.01.30", "logstash-2019.01.31", "logstash-2019.02.01", "logstash-2019.02.02", "logstash-2019.02.03", "logstash-2019.02.04", "logstash-2019.02.05", "logstash-2019.02.06", "logstash-2019.02.07", "logstash-2019.02.08", "logstash-2019.02.09", "logstash-2019.02.10", "logstash-2019.02.11", "logstash-2019.02.12", "logstash-2019.02.13" ],
"shards" : [ {
"index" : "logstash-2019.02.02",
"shard" : 4,
"state" : "SUCCESS",
"node" : "XiP7eUflQ9uppcZnHxZP0A"
}, {
"index" : "logstash-2019.02.02",
"shard" : 3,
"state" : "SUCCESS",
"node" : "XiP7eUflQ9uppcZnHxZP0A"
}, {
"index" : "logstash-2019.02.02",
"shard" : 2,
"state" : "SUCCESS",
"node" : "XiP7eUflQ9uppcZnHxZP0A"
}, {
"index" : "logstash-2019.01.18",
"shard" : 4,
"state" : "SUCCESS",
"node" : "XiP7eUflQ9uppcZnHxZP0A"
}, {
"index" : "logstash-2019.01.22",
"shard" : 0,
"state" : "SUCCESS",
"node" : "XiP7eUflQ9uppcZnHxZP0A"
}, {
"index" : "logstash-2019.01.18",
"shard" : 1,
"state" : "SUCCESS",
"node" : "XiP7eUflQ9uppcZnHxZP0A"
}, {
"index" : "logstash-2019.01.18",
"shard" : 0,
"state" : "SUCCESS",
"node" : "XiP7eUflQ9uppcZnHxZP0A"
}, {
"index" : "logstash-2019.01.18",
"shard" : 3,
"state" : "SUCCESS",
"node" : "XiP7eUflQ9uppcZnHxZP0A"
}, {
"index" : "logstash-2019.01.18",
"shard" : 2,
"state" : "SUCCESS",
"node" : "XiP7eUflQ9uppcZnHxZP0A"
}, {
"index" : "logstash-2019.01.21",
"shard" : 4,
"state" : "SUCCESS",
"node" : "XiP7eUflQ9uppcZnHxZP0A"
}, {
"index" : "logstash-2019.01.21",
"shard" : 1,
"state" : "SUCCESS",
"node" : "XiP7eUflQ9uppcZnHxZP0A"
}, {
"index" : "logstash-2019.01.21",
"shard" : 0,
"state" : "SUCCESS",
"node" : "XiP7eUflQ9uppcZnHxZP0A"
}, {
"index" : "logstash-2019.01.21",
"shard" : 3,
"state" : "SUCCESS",
"node" : "XiP7eUflQ9uppcZnHxZP0A"
}, {
"index" : "logstash-2019.01.21",
"shard" : 2,
"state" : "SUCCESS",
"node" : "XiP7eUflQ9uppcZnHxZP0A"
}, {
"index" : "logstash-2019.02.03",
"shard" : 2,
"state" : "SUCCESS",
"node" : "XiP7eUflQ9uppcZnHxZP0A"
}, {
"index" : "logstash-2019.02.03",
"shard" : 1,
"state" : "SUCCESS",
"node" : "XiP7eUflQ9uppcZnHxZP0A"
}, {
"index" : "logstash-2019.02.03",
"shard" : 0,
"state" : "SUCCESS",
"node" : "XiP7eUflQ9uppcZnHxZP0A"
}, {
"index" : "logstash-2019.02.03",
"shard" : 4,
"state" : "SUCCESS",
"node" : "XiP7eUflQ9uppcZnHxZP0A"
}, {
"index" : "logstash-2019.02.03",
"shard" : 3,
"state" : "SUCCESS",
"node" : "XiP7eUflQ9uppcZnHxZP0A"
}, {
"index" : "logstash-2019.01.17",
Re: Nagios user java command using over 200% CPU
Can you post a screenshot of the snapshot & maintenance settings?
Re: Nagios user java command using over 200% CPU
I cannot access the Snapshot & Maintenance settings while a snapshot is in progress. This is one of the issues I've been trying to get resolved with your team. It causes the GUI to lock up.
Re: Nagios user java command using over 200% CPU
Let's grab it once it becomes available, and in the meantime please PM me a profile. This can be generated under Admin > System > Command Subsystem. If it is too large to PM, please open a ticket at https://support.nagios.com/tickets/ and attach it there.
Re: Nagios user java command using over 200% CPU
It finally came up! See attached.
Re: Nagios user java command using over 200% CPU
I typically recommend setting the optimization option to 0 to disable it, since it consumes resources (disk space, CPU, memory) and tends to cause more problems like this one than the benefit it provides. The main benefit is quicker restart times for the elasticsearch service.
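Since the underlying complaint here is the java process pegging the CPU, the hot threads API is a quick way to see what elasticsearch is actually busy with (merging, snapshotting, etc.); this is standard elasticsearch, nothing Log Server specific:

curl -s -XGET 'http://localhost:9200/_nodes/hot_threads'

If merge threads dominate that output while the hang is happening, that points back at the optimize step.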
Re: Nagios user java command using over 200% CPU
Are there any storage ramifications we need to worry about? I think we tried this about a month ago and it appeared that our snapshot size grew quite a bit.
Re: Nagios user java command using over 200% CPU
Optimization does merge segments so that there are fewer of them, which can be more efficient for storage. Looking at the data again, though, it may not actually be the optimization step that is causing the hang. The next time you see it hang, please run "ps aux | grep curator" and gather another profile.
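If you want to quantify what the merging buys you, segment counts per index can be compared before and after an optimize; a quick check, using one of the index names from the snapshot output above:

curl -s 'http://localhost:9200/_cat/segments/logstash-2019.02.13?v'

Each row is one segment in one shard, so fewer rows after an optimize means the merge did its job.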