Logstash/Elasticsearch Services Crashing
Posted: Wed Apr 29, 2020 12:23 pm
Hello,
The elasticsearch and logstash services keep crashing. Used memory on the server keeps increasing until one or both services crash, then returns to normal. Sometimes only the elasticsearch service crashes, and sometimes elasticsearch crashes first and logstash follows. If I reboot the server, everything runs fine until memory gets low again. I am using version 2.1.6 of Nagios Log Server. The VM is running CentOS 7 with 16 GB RAM and 16 vCPUs, and the system has a 4 TB disk for the indexes. I have 705 devices sending logs to a single instance of Nagios Log Server. Any suggestions on how to troubleshoot further?
Last entry in /var/log/logstash/logstash.log:
{:timestamp=>"2020-04-28T19:16:49.402000-0700", :message=>"syslog listener died", :protocol=>:udp, :address=>"0.0.0.0:1514", :exception=>#<SocketError: recvfrom: name or service not known>, :backtrace=>["/usr/local/nagioslogserver/logstash/vendor/bundle/jruby/1.9/gems/logstash-input-syslog-2.0.5/lib/logstash/inputs/syslog.rb:138:in `udp_listener'", "/usr/local/nagioslogserver/logstash/vendor/bundle/jruby/1.9/gems/logstash-input-syslog-2.0.5/lib/logstash/inputs/syslog.rb:117:in `server'", "/usr/local/nagioslogserver/logstash/vendor/bundle/jruby/1.9/gems/logstash-input-syslog-2.0.5/lib/logstash/inputs/syslog.rb:97:in `run'"], :level=>:warn}
Last entry in /var/log/elasticsearch/c2180ab3-41d3-42ab-bbd4-dfd9f9f655fb.log:
[2020-04-28 18:58:48,225][WARN ][index.merge.scheduler ] [7a763a0a-1a65-4738-9485-b96612b66187] [logstash-2020.04.24][1] failed to merge
org.apache.lucene.store.AlreadyClosedException: refusing to delete any files: this IndexWriter hit an unrecoverable exception
at org.apache.lucene.index.IndexFileDeleter.ensureOpen(IndexFileDeleter.java:354)
at org.apache.lucene.index.IndexFileDeleter.deleteFile(IndexFileDeleter.java:719)
at org.apache.lucene.index.IndexFileDeleter.refresh(IndexFileDeleter.java:451)
at org.apache.lucene.index.IndexWriter.merge(IndexWriter.java:3826)
at org.apache.lucene.index.ConcurrentMergeScheduler.doMerge(ConcurrentMergeScheduler.java:409)
at org.apache.lucene.index.TrackingConcurrentMergeScheduler.doMerge(TrackingConcurrentMergeScheduler.java:107)
at org.apache.lucene.index.ConcurrentMergeScheduler$MergeThread.run(ConcurrentMergeScheduler.java:486)
Caused by: java.lang.OutOfMemoryError: Java heap space
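Since the trace ends in a Java heap space error, one way to check whether the Elasticsearch heap is what fills up before each crash would be to poll the node stats API and log heap usage over time. The following is only a minimal sketch, not something taken from the Nagios Log Server tooling: it assumes Elasticsearch answers HTTP on localhost:9200 and that the `_nodes/stats/jvm` response carries `jvm.mem.heap_used_percent` per node. The function names, the 60-second interval, and the 85% warning threshold are hypothetical choices for illustration.

```python
import json
import time
import urllib.request

# Assumption: Elasticsearch's HTTP API is reachable locally on port 9200.
STATS_URL = "http://localhost:9200/_nodes/stats/jvm"


def heap_usage(stats):
    """Return {node_name: heap_used_percent} from a node-stats response dict."""
    usage = {}
    for node_id, node in stats.get("nodes", {}).items():
        mem = node.get("jvm", {}).get("mem", {})
        # Fall back to the node id if the response carries no node name.
        name = node.get("name", node_id)
        usage[name] = mem.get("heap_used_percent")
    return usage


def fetch_stats(url=STATS_URL, timeout=10):
    """Fetch and decode the JVM section of the node stats API."""
    with urllib.request.urlopen(url, timeout=timeout) as resp:
        return json.loads(resp.read().decode("utf-8"))


def watch(interval=60, warn_at=85):
    """Print heap usage for every node once per interval (hypothetical defaults)."""
    while True:
        stamp = time.strftime("%Y-%m-%d %H:%M:%S")
        for name, pct in heap_usage(fetch_stats()).items():
            flag = " <-- high" if pct is not None and pct >= warn_at else ""
            print(f"{stamp} {name} heap {pct}%{flag}")
        time.sleep(interval)
```

Calling `watch()` in a terminal and leaving it running until the next crash would show whether heap usage climbs steadily toward 100% beforehand, which would point at heap sizing or index load rather than the logstash listener error.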
Thanks.