Logs stop coming in
Posted: Tue Dec 16, 2014 12:42 pm
This is probably related to my last post, where I thought there was something wrong with just the backups (http://support.nagios.com/forum/viewtop ... 38&t=30496).
First I checked the logstash logs (below). Then I checked the elasticsearch logs and dumped everything since the time the logs stopped coming in (see attached).

The elasticsearch logs said something about being out of memory, so I checked the memory usage on the system (free output below). It looks OK-ish to me: all of the memory has been requested, but it is not actually in use (hence the large free value on the -/+ buffers/cache line), and no swap is in use, so there is plenty free.

I restarted elasticsearch (/etc/init.d/elasticsearch restart) and everything went back to normal. However, I don't want to have to monitor for this and restart elasticsearch all the time.

If there are any other diagnostics I can run, or config information I can provide, I would be happy to.
Logstash logs:
Code:
{:timestamp=>"2014-12-16T11:16:28.570000-0500", :message=>"Failed to flush outgoing items", :outgoing_count=>5000, :exception=>#<RuntimeError: Non-OK response code from Elasticsearch: 500>, :backtrace=>["/usr/local/nagioslogserver/logstash/lib/logstash/outputs/elasticsearch/protocol.rb:127:in `bulk_ftw'", "/usr/local/nagioslogserver/logstash/lib/logstash/outputs/elasticsearch/protocol.rb:80:in `bulk'", "/usr/local/nagioslogserver/logstash/lib/logstash/outputs/elasticsearch.rb:315:in `flush'", "/usr/local/nagioslogserver/logstash/vendor/bundle/jruby/1.9/gems/stud-0.0.17/lib/stud/buffer.rb:219:in `buffer_flush'", "org/jruby/RubyHash.java:1339:in `each'", "/usr/local/nagioslogserver/logstash/vendor/bundle/jruby/1.9/gems/stud-0.0.17/lib/stud/buffer.rb:216:in `buffer_flush'", "/usr/local/nagioslogserver/logstash/vendor/bundle/jruby/1.9/gems/stud-0.0.17/lib/stud/buffer.rb:193:in `buffer_flush'", "/usr/local/nagioslogserver/logstash/vendor/bundle/jruby/1.9/gems/stud-0.0.17/lib/stud/buffer.rb:159:in `buffer_receive'", "/usr/local/nagioslogserver/logstash/lib/logstash/outputs/elasticsearch.rb:311:in `receive'", "/usr/local/nagioslogserver/logstash/lib/logstash/outputs/base.rb:86:in `handle'", "/usr/local/nagioslogserver/logstash/lib/logstash/outputs/base.rb:78:in `worker_setup'"], :level=>:warn}
{:timestamp=>"2014-12-16T11:19:48.816000-0500", :message=>"Failed to flush outgoing items", :outgoing_count=>5000, :exception=>#<Errno::ECONNRESET: Connection reset by peer - Connection reset by peer>, :backtrace=>["org/jruby/RubyIO.java:3016:in `sysread'", "/usr/local/nagioslogserver/logstash/vendor/bundle/jruby/1.9/gems/ftw-0.0.39/lib/ftw/connection.rb:243:in `read'", "/usr/local/nagioslogserver/logstash/vendor/bundle/jruby/1.9/gems/ftw-0.0.39/lib/ftw/protocol.rb:30:in `read_http_message'", "/usr/local/nagioslogserver/logstash/vendor/bundle/jruby/1.9/gems/ftw-0.0.39/lib/ftw/request.rb:93:in `execute'", "/usr/local/nagioslogserver/logstash/vendor/bundle/jruby/1.9/gems/ftw-0.0.39/lib/ftw/agent.rb:325:in `execute'", "/usr/local/nagioslogserver/logstash/vendor/bundle/jruby/1.9/gems/ftw-0.0.39/lib/ftw/agent.rb:217:in `post!'", "/usr/local/nagioslogserver/logstash/lib/logstash/outputs/elasticsearch/protocol.rb:106:in `bulk_ftw'", "/usr/local/nagioslogserver/logstash/lib/logstash/outputs/elasticsearch/protocol.rb:80:in `bulk'", "/usr/local/nagioslogserver/logstash/lib/logstash/outputs/elasticsearch.rb:315:in `flush'", "/usr/local/nagioslogserver/logstash/vendor/bundle/jruby/1.9/gems/stud-0.0.17/lib/stud/buffer.rb:219:in `buffer_flush'", "org/jruby/RubyHash.java:1339:in `each'", "/usr/local/nagioslogserver/logstash/vendor/bundle/jruby/1.9/gems/stud-0.0.17/lib/stud/buffer.rb:216:in `buffer_flush'", "/usr/local/nagioslogserver/logstash/vendor/bundle/jruby/1.9/gems/stud-0.0.17/lib/stud/buffer.rb:193:in `buffer_flush'", "/usr/local/nagioslogserver/logstash/vendor/bundle/jruby/1.9/gems/stud-0.0.17/lib/stud/buffer.rb:159:in `buffer_receive'", "/usr/local/nagioslogserver/logstash/lib/logstash/outputs/elasticsearch.rb:311:in `receive'", "/usr/local/nagioslogserver/logstash/lib/logstash/outputs/base.rb:86:in `handle'", "/usr/local/nagioslogserver/logstash/lib/logstash/outputs/base.rb:78:in `worker_setup'"], :level=>:warn}
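For reference, this is roughly how I've been checking whether elasticsearch itself is the problem when logstash starts throwing those 500s (a sketch; it assumes elasticsearch is listening on the default localhost:9200):

```shell
#!/bin/sh
# Probe the cluster-health endpoint; a 500 from the bulk API usually
# means elasticsearch itself is unhappy, not logstash.
# Assumes elasticsearch on the default localhost:9200.
health_json=$(curl -s -m 10 'http://localhost:9200/_cluster/health')

# Pull the "status" field out of the JSON ("green", "yellow" or "red").
status=$(printf '%s' "$health_json" | sed -n 's/.*"status":"\([a-z]*\)".*/\1/p')
echo "cluster status: ${status:-unreachable}"
```

When the logs stopped coming in, this either timed out or came back red; after the restart it goes back to green.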
Memory usage on the system:
Code:
             total       used       free     shared    buffers     cached
Mem:         15898      15734        164          0        155      12785
-/+ buffers/cache:       2793      13105
Swap:         8015          0       8015
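One thing I realized while digging (and maybe why free looks fine): the out-of-memory messages in elasticsearch's own log are usually about the JVM heap, which is capped well below total RAM, so system memory can be mostly free while the heap is exhausted. The heap can be checked over the same HTTP API (again assuming the default localhost:9200; the field names are from the 1.x node-stats output):

```shell
#!/bin/sh
# The "out of memory" in elasticsearch's log normally refers to the
# JVM heap, not system RAM, so free(1) looking healthy doesn't rule it out.
# Assumes elasticsearch on the default localhost:9200.
curl -s 'http://localhost:9200/_nodes/stats/jvm?pretty' \
    | grep -E '"heap_used_percent"|"heap_max_in_bytes"'
```

If heap_used_percent sits up in the 90s before the failures, raising the heap (ES_HEAP_SIZE — I'm assuming it's set in the elasticsearch sysconfig/init environment on this install) toward half of RAM is the usual advice.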
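In the meantime, so I don't have to babysit it by hand, I could cron something like this as a stopgap (a sketch only — it assumes the default localhost:9200 and the same init script I used above; a fix for the underlying out-of-memory is still what I'm after):

```shell
#!/bin/sh
# Stopgap watchdog: if elasticsearch stops answering HTTP, restart it.
# Assumes the default localhost:9200 and /etc/init.d/elasticsearch.
if ! curl -s -m 10 'http://localhost:9200/_cluster/health' | grep -q '"status"'; then
    logger "elasticsearch not responding; restarting"
    /etc/init.d/elasticsearch restart
fi
```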