Logstash Crashing after 2015R2.0 update
Posted: Mon Jul 20, 2015 9:25 am
I'd been running nice and happy on the previous stable release, and all of my config remains the same, but ever since updating to the new release my Logstash has been crashing fairly regularly. I have 2 nodes in my cluster, and both are doing this.
So obviously an "OutOfMemory" error, but ES_HEAP_SIZE is set to 8GB (half of the 16GB total in the machine), and it was stable for the longest time before the update.
Below is the crash output, followed by my /etc/sysconfig/elasticsearch and /etc/sysconfig/logstash:

crash:
Code:
Exception in thread "input|syslog|tcp|192.168.1.52:50526}" java.lang.ArrayIndexOutOfBoundsException: -1
at org.jruby.runtime.ThreadContext.popRubyClass(ThreadContext.java:702)
at org.jruby.runtime.ThreadContext.postYield(ThreadContext.java:1269)
at org.jruby.runtime.ContextAwareBlockBody.post(ContextAwareBlockBody.java:29)
at org.jruby.runtime.Interpreted19Block.yield(Interpreted19Block.java:198)
at org.jruby.runtime.Interpreted19Block.call(Interpreted19Block.java:125)
at org.jruby.runtime.Block.call(Block.java:101)
at org.jruby.RubyProc.call(RubyProc.java:290)
at org.jruby.RubyProc.call(RubyProc.java:228)
at org.jruby.internal.runtime.RubyRunnable.run(RubyRunnable.java:99)
at java.lang.Thread.run(Thread.java:745)
ConcurrencyError: interrupted waiting for mutex: null
lock at org/jruby/ext/thread/Mutex.java:94
execute_task_once at /usr/local/nagioslogserver/logstash/vendor/bundle/jruby/1.9/gems/concurrent-ruby-0.8.0-java/lib/concurrent/delay.rb:83
wait at /usr/local/nagioslogserver/logstash/vendor/bundle/jruby/1.9/gems/concurrent-ruby-0.8.0-java/lib/concurrent/delay.rb:60
value at /usr/local/nagioslogserver/logstash/vendor/bundle/jruby/1.9/gems/concurrent-ruby-0.8.0-java/lib/concurrent/obligation.rb:47
global_timer_set at /usr/local/nagioslogserver/logstash/vendor/bundle/jruby/1.9/gems/concurrent-ruby-0.8.0-java/lib/concurrent/configuration.rb:58
finalize_global_executors at /usr/local/nagioslogserver/logstash/vendor/bundle/jruby/1.9/gems/concurrent-ruby-0.8.0-java/lib/concurrent/configuration.rb:137
Concurrent at /usr/local/nagioslogserver/logstash/vendor/bundle/jruby/1.9/gems/concurrent-ruby-0.8.0-java/lib/concurrent/configuration.rb:165
Error: Your application used more memory than the safety cap of 500M.
Specify -J-Xmx####m to increase it (#### = cap size in MB).
Specify -w for full OutOfMemoryError stack trace
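Reading that error again, the 500M "safety cap" looks like it belongs to the JVM running Logstash itself, not Elasticsearch, so I'm guessing ES_HEAP_SIZE never applied to it. If I'm reading it right, the knob would be LS_HEAP_SIZE in /etc/sysconfig/logstash (currently commented out in my config below) — something like this, which I have NOT applied yet, just a guess:

```shell
# /etc/sysconfig/logstash -- hypothetical change, not applied yet.
# The 500M "safety cap" in the error is Logstash's own JVM heap,
# which is independent of Elasticsearch's ES_HEAP_SIZE=8g.
LS_HEAP_SIZE="1g"
```

Happy to be corrected if the supported way to raise it on Log Server is different.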
Code:
Jul 20, 2015 9:36:19 AM org.elasticsearch.plugins.PluginsService <init>
INFO: [ea9ddcd0-c0a5-4d5d-a802-e741d9c51a5b] loaded [], sites []
Jul 20, 2015 9:36:21 AM org.elasticsearch.plugins.PluginsService <init>
INFO: [ea9ddcd0-c0a5-4d5d-a802-e741d9c51a5b] loaded [], sites []
Jul 20, 2015 9:36:21 AM org.elasticsearch.plugins.PluginsService <init>
INFO: [ea9ddcd0-c0a5-4d5d-a802-e741d9c51a5b] loaded [], sites []
Jul 20, 2015 9:36:21 AM org.elasticsearch.plugins.PluginsService <init>
INFO: [ea9ddcd0-c0a5-4d5d-a802-e741d9c51a5b] loaded [], sites []
Jul 20, 2015 9:36:21 AM org.elasticsearch.plugins.PluginsService <init>
INFO: [ea9ddcd0-c0a5-4d5d-a802-e741d9c51a5b] loaded [], sites []
Jul 20, 2015 9:38:39 AM org.elasticsearch.transport.netty.NettyInternalESLogger warn
WARNING: Unexpected exception in the selector loop.
java.lang.OutOfMemoryError: GC overhead limit exceeded
Jul 20, 2015 9:37:57 AM org.elasticsearch.transport.netty.NettyInternalESLogger warn
WARNING: Unexpected exception in the selector loop.
java.lang.OutOfMemoryError: GC overhead limit exceeded
Error: Your application used more memory than the safety cap of 500M.
Error: Your application used more memory than the safety cap of 500M.
Jul 20, 2015 9:39:31 AM org.elasticsearch.transport.netty.NettyTransport exceptionCaught
WARNING: [ea9ddcd0-c0a5-4d5d-a802-e741d9c51a5b] exception caught on transport layer [[id: 0x1a0c7404, /127.0.0.1:55679 => localhost/127.0.0.1:9300]], closing connection
java.lang.OutOfMemoryError: Java heap space
Jul 20, 2015 9:39:31 AM org.elasticsearch.transport.netty.NettyTransport exceptionCaught
WARNING: [ea9ddcd0-c0a5-4d5d-a802-e741d9c51a5b] exception caught on transport layer [[id: 0x7c0fc4f6, /127.0.0.1:55740 => localhost/127.0.0.1:9300]], closing connection
java.lang.OutOfMemoryError: GC overhead limit exceeded
Jul 20, 2015 9:39:31 AM org.elasticsearch.client.transport.TransportClientNodesService$SimpleNodeSampler doSample
INFO: [ea9ddcd0-c0a5-4d5d-a802-e741d9c51a5b] failed to get node info for [#transport#-1][schpnag1][inet[localhost/127.0.0.1:9300]], disconnecting...
org.elasticsearch.transport.ReceiveTimeoutTransportException: [][inet[localhost/127.0.0.1:9300]][cluster:monitor/nodes/info] request_id [10] timed out after [24496ms]
at org.elasticsearch.transport.TransportService$TimeoutHandler.run(TransportService.java:529)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
at java.lang.Thread.run(Thread.java:745)
/etc/sysconfig/elasticsearch:
Code:
# Directory where the Elasticsearch binary distribution resides
APP_DIR="/usr/local/nagioslogserver"
ES_HOME="$APP_DIR/elasticsearch"
# Heap Size (defaults to 256m min, 1g max)
ES_HEAP_SIZE=8g
# Heap new generation
#ES_HEAP_NEWSIZE=
# max direct memory
#ES_DIRECT_SIZE=
# Additional Java OPTS
#ES_JAVA_OPTS=
# Maximum number of open files
MAX_OPEN_FILES=65535
# Maximum amount of locked memory
MAX_LOCKED_MEMORY=unlimited
# Maximum number of VMA (Virtual Memory Areas) a process can own
MAX_MAP_COUNT=262144
# Elasticsearch log directory
LOG_DIR=/var/log/elasticsearch
# Elasticsearch data directory
DATA_DIR="/nagios/data"
# Elasticsearch work directory
WORK_DIR="$APP_DIR/tmp/elasticsearch"
# Elasticsearch conf directory
CONF_DIR="$ES_HOME/config"
# Elasticsearch configuration file (elasticsearch.yml)
CONF_FILE="$ES_HOME/config/elasticsearch.yml"
# User to run as, change this to a specific elasticsearch user if possible
# Also make sure, this user can write into the log directories in case you change them
# This setting only works for the init script, but has to be configured separately for systemd startup
ES_USER=nagios
ES_GROUP=nagios
# Configure restart on package upgrade (true, every other setting will lead to not restarting)
#RESTART_ON_UPGRADE=true
if [ "x$1" == "xstart" -o "x$1" == "xrestart" -o "x$1" == "xreload" -o "x$1" == "xforce-reload" ];then
GET_ES_CONFIG_MESSAGE="$( php $APP_DIR/scripts/get_es_config.php )"
GET_ES_CONFIG_RETURN=$?
if [ "$GET_ES_CONFIG_RETURN" != "0" ]; then
echo $GET_ES_CONFIG_MESSAGE
exit 1
else
ES_JAVA_OPTS="$GET_ES_CONFIG_MESSAGE"
fi
fi
/etc/sysconfig/logstash:
Code:
###############################
# Default settings for logstash
###############################
# Override Java location
#JAVACMD=/usr/bin/java
# Set a home directory
APP_DIR=/usr/local/nagioslogserver
LS_HOME="$APP_DIR/logstash"
# set ES_CLUSTER
ES_CLUSTER=$(cat $APP_DIR/var/cluster_uuid)
# Arguments to pass to java
#LS_HEAP_SIZE="256m"
LS_JAVA_OPTS="-Djava.io.tmpdir=$APP_DIR/tmp"
# Logstash filter worker threads
#LS_WORKER_THREADS=1
# pidfiles aren't used for upstart; this is for sysv users.
#LS_PIDFILE=/var/run/logstash.pid
# user id to be invoked as; for upstart: edit /etc/init/logstash.conf
LS_USER=nagios
LS_GROUP=nagios
# logstash logging
#LS_LOG_FILE=/var/log/logstash/logstash.log
#LS_USE_GC_LOGGING="true"
# logstash configuration directory
LS_CONF_DIR="$LS_HOME/etc/conf.d"
# Open file limit; cannot be overridden in upstart
#LS_OPEN_FILES=2048
# Nice level
#LS_NICE=0
# Increase Filter workers to 4 threads
LS_OPTS=" -w 4"
if [ "x$1" == "xstart" -o "x$1" == "xrestart" -o "x$1" == "xreload" ];then
GET_LOGSTASH_CONFIG_MESSAGE=$( php /usr/local/nagioslogserver/scripts/get_logstash_config.php )
GET_LOGSTASH_CONFIG_RETURN=$?
if [ "$GET_LOGSTASH_CONFIG_RETURN" != "0" ]; then
echo $GET_LOGSTASH_CONFIG_MESSAGE
exit 1
fi
fi
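If it helps with diagnosis: a quick way to confirm which -Xmx each JVM actually picked up would be something like the below (the grep pattern is just my sketch; process names assume the stock setup shown above):

```shell
# List the heap flag of every running Java process. Given the configs
# above I'd expect elasticsearch to show -Xmx8g, while logstash would
# only show its default cap, since LS_HEAP_SIZE is commented out.
ps -eo args | grep '[j]ava' | grep -oE '[-]Xmx[0-9]+[mg]'
```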