Log Server EC2 instance upgrade to 1.4 fails

Posted: Thu Jan 14, 2016 3:13 pm
by beachbar
I am trying to upgrade the standard EC2 image instance to 1.4.0. I followed the upgrade instructions and the installation reported success, but the Elasticsearch database will not start, and the following messages appear repeatedly on the console until a reboot. After the reboot, the database remains offline and will not start using service elasticsearch start. I tried upgrading both an existing EC2 image instance with data and a brand-new image instance, and the same thing happened to both.

Jan 14, 2016 2:29:31 PM org.apache.http.impl.execchain.RetryExec execute
INFO: Retrying request to {}->http://localhost:9200
Jan 14, 2016 2:29:31 PM org.apache.http.impl.execchain.RetryExec execute
INFO: I/O exception (org.apache.http.conn.HttpHostConnectException) caught when processing request to {}->http://localhost:9200: Connect to localhost:9200 [localhost/127.0.0.1] failed: Connection refused
Jan 14, 2016 2:29:31 PM org.apache.http.impl.execchain.RetryExec execute
INFO: Retrying request to {}->http://localhost:9200
Jan 14, 2016 2:29:31 PM org.apache.http.impl.execchain.RetryExec execute
INFO: I/O exception (org.apache.http.conn.HttpHostConnectException) caught when processing request to {}->http://localhost:9200: Connect to localhost:9200 [localhost/127.0.0.1] failed: Connection refused
Jan 14, 2016 2:29:31 PM org.apache.http.impl.execchain.RetryExec execute
INFO: Retrying request to {}->http://localhost:9200
Jan 14, 2016 2:29:31 PM org.apache.http.impl.execchain.RetryExec execute
INFO: I/O exception (org.apache.http.conn.HttpHostConnectException) caught when processing request to {}->http://localhost:9200: Connect to localhost:9200 [localhost/127.0.0.1] failed: Connection refused

Re: Log Server EC2 instance upgrade to 1.4 fails

Posted: Thu Jan 14, 2016 4:14 pm
by jolson
It almost seems like elasticsearch isn't binding to localhost. Could you please get the output of the following for me?

Code:

cat /etc/*release*
cat /usr/local/nagioslogserver/elasticsearch/config/*.yml
cat /etc/hosts
cat /etc/sysconfig/elasticsearch /etc/sysconfig/logstash
tail -n200 /var/log/elasticsearch/*.log
sestatus

Re: Log Server EC2 instance upgrade to 1.4 fails

Posted: Thu Jan 14, 2016 4:55 pm
by beachbar
These are from the new EC2 image instance, which I installed, verified was working from the image, and then ran through the upgrade process.

Code:

[~]$ cat /etc/*release*
CentOS release 6.7 (Final)
CentOS release 6.7 (Final)
CentOS release 6.7 (Final)
cpe:/o:centos:linux:6:GA

Code:

[~]$ cat /usr/local/nagioslogserver/elasticsearch/config/*.yml
##################### Elasticsearch Configuration Example #####################

# This file contains an overview of various configuration settings,
# targeted at operations staff. Application developers should
# consult the guide at <http://elasticsearch.org/guide>.
#
# The installation procedure is covered at
# <http://elasticsearch.org/guide/en/elasticsearch/reference/current/setup.html>.
#
# Elasticsearch comes with reasonable defaults for most settings,
# so you can try it out without bothering with configuration.
#
# Most of the time, these defaults are just fine for running a production
# cluster. If you're fine-tuning your cluster, or wondering about the
# effect of certain configuration option, please _do ask_ on the
# mailing list or IRC channel [http://elasticsearch.org/community].

# Any element in the configuration can be replaced with environment variables
# by placing them in ${...} notation. For example:
#
# node.rack: ${RACK_ENV_VAR}

# For information on supported formats and syntax for the config file, see
# <http://elasticsearch.org/guide/en/elasticsearch/reference/current/setup-configuration.html>


################################### Cluster ###################################

# Cluster name identifies your cluster for auto-discovery. If you're running
# multiple clusters on the same network, make sure you're using unique names.
#
cluster.name: nagios_elasticsearch


#################################### Node #####################################

# Node names are generated dynamically on startup, so you're relieved
# from configuring them manually. You can tie this node to a specific name:
#
# node.name: "Franz Kafka"

# Every node can be configured to allow or deny being eligible as the master,
# and to allow or deny to store the data.
#
# Allow this node to be eligible as a master node (enabled by default):
#
# node.master: true
#
# Allow this node to store data (enabled by default):
#
# node.data: true

# You can exploit these settings to design advanced cluster topologies.
#
# 1. You want this node to never become a master node, only to hold data.
#    This will be the "workhorse" of your cluster.
#
# node.master: false
# node.data: true
#
# 2. You want this node to only serve as a master: to not store any data and
#    to have free resources. This will be the "coordinator" of your cluster.
#
# node.master: true
# node.data: false
#
# 3. You want this node to be neither master nor data node, but
#    to act as a "search load balancer" (fetching data from nodes,
#    aggregating results, etc.)
#
# node.master: false
# node.data: false

# Use the Cluster Health API [http://localhost:9200/_cluster/health], the
# Node Info API [http://localhost:9200/_nodes] or GUI tools
# such as <http://www.elasticsearch.org/overview/marvel/>,
# <http://github.com/karmi/elasticsearch-paramedic>,
# <http://github.com/lukas-vlcek/bigdesk> and
# <http://mobz.github.com/elasticsearch-head> to inspect the cluster state.

Code:

[~]$ cat /etc/hosts
127.0.0.1               localhost.localdomain localhost
::1             localhost6.localdomain6 localhost6

[~]$ cat /etc/sysconfig/elasticsearch /etc/sysconfig/logstash
# Directory where the Elasticsearch binary distribution resides
APP_DIR="/usr/local/nagioslogserver"
ES_HOME="$APP_DIR/elasticsearch"

# Heap Size (defaults to 256m min, 1g max)
# Nagios Log Server Default to 0.5 physical Memory
ES_HEAP_SIZE=$(expr $(free -m|awk '/^Mem:/{print $2}') / 2 )m

# Heap new generation
#ES_HEAP_NEWSIZE=

# max direct memory
#ES_DIRECT_SIZE=

# Additional Java OPTS
#ES_JAVA_OPTS=

# Maximum number of open files
MAX_OPEN_FILES=65535

# Maximum amount of locked memory
MAX_LOCKED_MEMORY=unlimited

# Maximum number of VMA (Virtual Memory Areas) a process can own
MAX_MAP_COUNT=262144

# Elasticsearch log directory
LOG_DIR=/var/log/elasticsearch

# Elasticsearch data directory
DATA_DIR="$ES_HOME/data"

# Elasticsearch work directory
WORK_DIR="$APP_DIR/tmp/elasticsearch"

# Elasticsearch conf directory
CONF_DIR="$ES_HOME/config"

# Elasticsearch configuration file (elasticsearch.yml)
CONF_FILE="$ES_HOME/config/elasticsearch.yml"

# User to run as, change this to a specific elasticsearch user if possible
# Also make sure, this user can write into the log directories in case you change them
# This setting only works for the init script, but has to be configured separately for systemd startup
ES_USER=nagios
ES_GROUP=nagios

# Configure restart on package upgrade (true, every other setting will lead to not restarting)
#RESTART_ON_UPGRADE=true

if [ "x$1" == "xstart" -o "x$1" == "xrestart" -o "x$1" == "xreload" -o "x$1" == "xforce-reload" ];then
        GET_ES_CONFIG_MESSAGE="$( php $APP_DIR/scripts/get_es_config.php )"
        GET_ES_CONFIG_RETURN=$?

        if [ "$GET_ES_CONFIG_RETURN" != "0" ]; then
                echo $GET_ES_CONFIG_MESSAGE
                exit 1
        else
                ES_JAVA_OPTS="$GET_ES_CONFIG_MESSAGE"
        fi
fi
###############################
# Default settings for logstash
###############################

# Override Java location
#JAVACMD=/usr/bin/java

# Set a home directory
APP_DIR=/usr/local/nagioslogserver
LS_HOME="$APP_DIR/logstash"

# set ES_CLUSTER
ES_CLUSTER=$(cat $APP_DIR/var/cluster_uuid)

# Arguments to pass to java
#LS_HEAP_SIZE="256m"
LS_JAVA_OPTS="-Djava.io.tmpdir=$APP_DIR/tmp"

# Logstash filter worker threads
#LS_WORKER_THREADS=1

# pidfiles aren't used for upstart; this is for sysv users.
#LS_PIDFILE=/var/run/logstash.pid

# user id to be invoked as; for upstart: edit /etc/init/logstash.conf
LS_USER=nagios
LS_GROUP=nagios

# logstash logging
#LS_LOG_FILE=/var/log/logstash/logstash.log
#LS_USE_GC_LOGGING="true"

# logstash configuration directory
LS_CONF_DIR="$LS_HOME/etc/conf.d"

# Open file limit; cannot be overridden in upstart
#LS_OPEN_FILES=2048

# Nice level
#LS_NICE=0

# Increate Filter workers to 4 threads
LS_OPTS=" -w 4"

if [ "x$1" == "xstart" -o "x$1" == "xrestart" -o "x$1" == "xreload" ];then
        GET_LOGSTASH_CONFIG_MESSAGE=$( php /usr/local/nagioslogserver/scripts/get_logstash_config.php )
        GET_LOGSTASH_CONFIG_RETURN=$?
        if [ "$GET_LOGSTASH_CONFIG_RETURN" != "0" ]; then
                echo $GET_LOGSTASH_CONFIG_MESSAGE
                exit 1
        fi

Code:

[~]$ tail -n200 /var/log/elasticsearch/*.log
==> /var/log/elasticsearch/e5484b24-18a8-419a-bfea-a7de2bb2defe_index_indexing_slowlog.log <==

==> /var/log/elasticsearch/e5484b24-18a8-419a-bfea-a7de2bb2defe_index_search_slowlog.log <==

==> /var/log/elasticsearch/e5484b24-18a8-419a-bfea-a7de2bb2defe.log <==
[2016-01-14 14:11:46,375][WARN ][common.jna               ] Unable to lock JVM memory (ENOMEM). This can result in part of the JVM being swapped out. Increase RLIMIT_MEMLOCK (ulimit).
[2016-01-14 14:11:47,272][INFO ][node                     ] [358bc8b4-69db-4f31-8311-70ea54fffe63] version[1.3.2], pid[1622], build[dee175d/2014-08-13T14:29:30Z]
[2016-01-14 14:11:47,273][INFO ][node                     ] [358bc8b4-69db-4f31-8311-70ea54fffe63] initializing ...
[2016-01-14 14:11:48,592][INFO ][plugins                  ] [358bc8b4-69db-4f31-8311-70ea54fffe63] loaded [knapsack-1.3.2.0-d5501ef], sites []
[2016-01-14 14:12:57,095][WARN ][common.jna               ] Unable to lock JVM memory (ENOMEM). This can result in part of the JVM being swapped out. Increase RLIMIT_MEMLOCK (ulimit).
[2016-01-14 14:12:57,450][INFO ][node                     ] [358bc8b4-69db-4f31-8311-70ea54fffe63] version[1.3.2], pid[848], build[dee175d/2014-08-13T14:29:30Z]
[2016-01-14 14:12:57,450][INFO ][node                     ] [358bc8b4-69db-4f31-8311-70ea54fffe63] initializing ...
[2016-01-14 14:12:57,499][INFO ][plugins                  ] [358bc8b4-69db-4f31-8311-70ea54fffe63] loaded [knapsack-1.3.2.0-d5501ef], sites []
[2016-01-14 14:13:03,841][INFO ][node                     ] [358bc8b4-69db-4f31-8311-70ea54fffe63] initialized
[2016-01-14 14:13:03,841][INFO ][node                     ] [358bc8b4-69db-4f31-8311-70ea54fffe63] starting ...
[2016-01-14 14:13:04,008][INFO ][transport                ] [358bc8b4-69db-4f31-8311-70ea54fffe63] bound_address {inet[/0:0:0:0:0:0:0:0:9300]}, publish_address {inet[/172.31.15.220:9300]}
[2016-01-14 14:13:04,016][INFO ][discovery                ] [358bc8b4-69db-4f31-8311-70ea54fffe63] e5484b24-18a8-419a-bfea-a7de2bb2defe/mLE1LMNmTEKwBANNIMQbUw
[2016-01-14 14:13:07,074][INFO ][cluster.service          ] [358bc8b4-69db-4f31-8311-70ea54fffe63] new_master [358bc8b4-69db-4f31-8311-70ea54fffe63][mLE1LMNmTEKwBANNIMQbUw][ip-172-31-15-220.us-west-2.compute.internal][inet[/172.31.15.220:9300]]{max_local_storage_nodes=1}, reason: zen-disco-join (elected_as_master)
[2016-01-14 14:13:07,112][INFO ][http                     ] [358bc8b4-69db-4f31-8311-70ea54fffe63] bound_address {inet[/127.0.0.1:9200]}, publish_address {inet[localhost/127.0.0.1:9200]}
[2016-01-14 14:13:07,113][INFO ][node                     ] [358bc8b4-69db-4f31-8311-70ea54fffe63] started
[2016-01-14 14:13:07,146][INFO ][gateway                  ] [358bc8b4-69db-4f31-8311-70ea54fffe63] recovered [0] indices into cluster_state
[2016-01-14 14:16:51,190][INFO ][cluster.metadata         ] [358bc8b4-69db-4f31-8311-70ea54fffe63] [nagioslogserver] creating index, cause [auto(index api)], shards [1]/[1], mappings [cf_option, node, reactor_server, snapshot, alert, _default_, query, commands, snmp_reactor, nrdp_server, user]
[2016-01-14 14:16:52,101][INFO ][cluster.metadata         ] [358bc8b4-69db-4f31-8311-70ea54fffe63] [nagioslogserver_log] creating index, cause [auto(index api)], shards [5]/[1], mappings []
[2016-01-14 14:16:52,594][INFO ][cluster.metadata         ] [358bc8b4-69db-4f31-8311-70ea54fffe63] [nagioslogserver_log] update_mapping [SECURITY] (dynamic)
[2016-01-14 14:16:52,953][INFO ][cluster.metadata         ] [358bc8b4-69db-4f31-8311-70ea54fffe63] [nagioslogserver] update_mapping [node] (dynamic)
[2016-01-14 14:17:02,612][INFO ][cluster.metadata         ] [358bc8b4-69db-4f31-8311-70ea54fffe63] [nagioslogserver_log] update_mapping [POLLER] (dynamic)
[2016-01-14 14:17:03,107][INFO ][cluster.metadata         ] [358bc8b4-69db-4f31-8311-70ea54fffe63] [nagioslogserver_log] update_mapping [JOBS] (dynamic)
[2016-01-14 14:17:18,668][INFO ][cluster.metadata         ] [358bc8b4-69db-4f31-8311-70ea54fffe63] [kibana-int] creating index, cause [auto(index api)], shards [5]/[1], mappings []
[2016-01-14 14:17:18,991][INFO ][cluster.metadata         ] [358bc8b4-69db-4f31-8311-70ea54fffe63] [kibana-int] update_mapping [dashboard] (dynamic)
[2016-01-14 14:18:08,133][INFO ][cluster.metadata         ] [358bc8b4-69db-4f31-8311-70ea54fffe63] [logstash-2016.01.14] creating index, cause [auto(bulk api)], shards [5]/[1], mappings [_default_]
[2016-01-14 14:18:08,480][INFO ][cluster.metadata         ] [358bc8b4-69db-4f31-8311-70ea54fffe63] [logstash-2016.01.14] update_mapping [syslog] (dynamic)
[2016-01-14 14:18:12,431][INFO ][cluster.metadata         ] [358bc8b4-69db-4f31-8311-70ea54fffe63] [nagioslogserver_log] update_mapping [SECURITY] (dynamic)
[2016-01-14 14:21:45,454][INFO ][node                     ] [358bc8b4-69db-4f31-8311-70ea54fffe63] stopping ...
[2016-01-14 14:21:45,774][INFO ][node                     ] [358bc8b4-69db-4f31-8311-70ea54fffe63] stopped
[2016-01-14 14:21:45,774][INFO ][node                     ] [358bc8b4-69db-4f31-8311-70ea54fffe63] closing ...
[2016-01-14 14:21:45,801][INFO ][node                     ] [358bc8b4-69db-4f31-8311-70ea54fffe63] closed

Code:

[~]$ sestatus
SELinux status:                 disabled

Re: Log Server EC2 instance upgrade to 1.4 fails

Posted: Thu Jan 14, 2016 5:30 pm
by jolson
Try opening the elasticsearch.yml file and adding the following lines:

Code:

discovery.zen.ping.multicast.enabled: false
discovery.zen.ping.unicast.hosts: ["localhost"]
http.host: "localhost"
transport.tcp.compress: true
bootstrap.mlockall: true
node.max_local_storage_nodes: 1

After adding the above lines, give it a restart:

Code:

service elasticsearch restart

Does this change the behavior at all? I'm not sure why those values wouldn't be set by default if you ran a normal install - they should be. Give it a try and let me know, thanks!

Re: Log Server EC2 instance upgrade to 1.4 fails

Posted: Fri Jan 15, 2016 9:32 am
by beachbar
Those lines are in there. Apparently my previously posted cat output was cut off for some reason. Here are the contents of that file:

Code:

[~]$ cat /usr/local/nagioslogserver/elasticsearch/config/elasticsearch.yml
##################### Elasticsearch Configuration Example #####################

# This file contains an overview of various configuration settings,
# targeted at operations staff. Application developers should
# consult the guide at <http://elasticsearch.org/guide>.
#
# The installation procedure is covered at
# <http://elasticsearch.org/guide/en/elasticsearch/reference/current/setup.html>.
#
# Elasticsearch comes with reasonable defaults for most settings,
# so you can try it out without bothering with configuration.
#
# Most of the time, these defaults are just fine for running a production
# cluster. If you're fine-tuning your cluster, or wondering about the
# effect of certain configuration option, please _do ask_ on the
# mailing list or IRC channel [http://elasticsearch.org/community].

# Any element in the configuration can be replaced with environment variables
# by placing them in ${...} notation. For example:
#
# node.rack: ${RACK_ENV_VAR}

# For information on supported formats and syntax for the config file, see
# <http://elasticsearch.org/guide/en/elasticsearch/reference/current/setup-configuration.html>


################################### Cluster ###################################

# Cluster name identifies your cluster for auto-discovery. If you're running
# multiple clusters on the same network, make sure you're using unique names.
#
cluster.name: nagios_elasticsearch


#################################### Node #####################################

# Node names are generated dynamically on startup, so you're relieved
# from configuring them manually. You can tie this node to a specific name:
#
# node.name: "Franz Kafka"

# Every node can be configured to allow or deny being eligible as the master,
# and to allow or deny to store the data.
#
# Allow this node to be eligible as a master node (enabled by default):
#
# node.master: true
#
# Allow this node to store data (enabled by default):
#
# node.data: true

# You can exploit these settings to design advanced cluster topologies.
#
# 1. You want this node to never become a master node, only to hold data.
#    This will be the "workhorse" of your cluster.
#
# node.master: false
# node.data: true
#
# 2. You want this node to only serve as a master: to not store any data and
#    to have free resources. This will be the "coordinator" of your cluster.
#
# node.master: true
# node.data: false
#
# 3. You want this node to be neither master nor data node, but
#    to act as a "search load balancer" (fetching data from nodes,
#    aggregating results, etc.)
#
# node.master: false
# node.data: false

# Use the Cluster Health API [http://localhost:9200/_cluster/health], the
# Node Info API [http://localhost:9200/_nodes] or GUI tools
# such as <http://www.elasticsearch.org/overview/marvel/>,
# <http://github.com/karmi/elasticsearch-paramedic>,
# <http://github.com/lukas-vlcek/bigdesk> and
# <http://mobz.github.com/elasticsearch-head> to inspect the cluster state.

# A node can have generic attributes associated with it, which can later be used
# for customized shard allocation filtering, or allocation awareness. An attribute
# is a simple key value pair, similar to node.key: value, here is an example:
#
# node.rack: rack314

# By default, multiple nodes are allowed to start from the same installation location
# to disable it, set the following:
node.max_local_storage_nodes: 1


#################################### Index ####################################

# You can set a number of options (such as shard/replica options, mapping
# or analyzer definitions, translog settings, ...) for indices globally,
# in this file.
#
# Note, that it makes more sense to configure index settings specifically for
# a certain index, either when creating it or by using the index templates API.
#
# See <http://elasticsearch.org/guide/en/elasticsearch/reference/current/index-modules.html> and
# <http://elasticsearch.org/guide/en/elasticsearch/reference/current/indices-create-index.html>
# for more information.

# Set the number of shards (splits) of an index (5 by default):
#
# index.number_of_shards: 5

# Set the number of replicas (additional copies) of an index (1 by default):
#
# index.number_of_replicas: 1

# Note, that for development on a local machine, with small indices, it usually
# makes sense to "disable" the distributed features:
#
# index.number_of_shards: 1
# index.number_of_replicas: 0

# These settings directly affect the performance of index and search operations
# in your cluster. Assuming you have enough machines to hold shards and
# replicas, the rule of thumb is:
#
# 1. Having more *shards* enhances the _indexing_ performance and allows to
#    _distribute_ a big index across machines.
# 2. Having more *replicas* enhances the _search_ performance and improves the
#    cluster _availability_.
#
# The "number_of_shards" is a one-time setting for an index.
#
# The "number_of_replicas" can be increased or decreased anytime,
# by using the Index Update Settings API.
#
# Elasticsearch takes care about load balancing, relocating, gathering the
# results from nodes, etc. Experiment with different settings to fine-tune
# your setup.

# Use the Index Status API (<http://localhost:9200/A/_status>) to inspect
# the index status.


#################################### Paths ####################################

# Path to directory containing configuration (this file and logging.yml):
#
# path.conf: /path/to/conf

# Path to directory where to store index data allocated for this node.
#
# path.data: /path/to/data
#
# Can optionally include more than one location, causing data to be striped across
# the locations (a la RAID 0) on a file level, favouring locations with most free
# space on creation. For example:
#
# path.data: /path/to/data1,/path/to/data2

# Path to temporary files:
#
# path.work: /path/to/work

# Path to log files:
#
# path.logs: /path/to/logs

# Path to where plugins are installed:
#
# path.plugins: /path/to/plugins


#################################### Plugin ###################################

# If a plugin listed here is not installed for current node, the node will not start.
#
# plugin.mandatory: mapper-attachments,lang-groovy


################################### Memory ####################################

# Elasticsearch performs poorly when JVM starts swapping: you should ensure that
# it _never_ swaps.
#
# Set this property to true to lock the memory:
#
bootstrap.mlockall: true

# Make sure that the ES_MIN_MEM and ES_MAX_MEM environment variables are set
# to the same value, and that the machine has enough memory to allocate
# for Elasticsearch, leaving enough memory for the operating system itself.
#
# You should also make sure that the Elasticsearch process is allowed to lock
# the memory, eg. by using `ulimit -l unlimited`.


############################## Network And HTTP ###############################

# Elasticsearch, by default, binds itself to the 0.0.0.0 address, and listens
# on port [9200-9300] for HTTP traffic and on port [9300-9400] for node-to-node
# communication. (the range means that if the port is busy, it will automatically
# try the next port).

# Set the bind address specifically (IPv4 or IPv6):
#
# network.bind_host: 192.168.0.1

# Set the address other nodes will use to communicate with this node. If not
# set, it is automatically derived. It must point to an actual IP address.
#
# network.publish_host: 192.168.0.1

# Set both 'bind_host' and 'publish_host':
#
# network.host: 192.168.0.1

# Set a custom port for the node to node communication (9300 by default):
#
# transport.tcp.port: 9300

# Enable compression for all communication between nodes (disabled by default):
#
transport.tcp.compress: true

# Set a custom port to listen for HTTP traffic:
#
# http.port: 9200

# Set a custom allowed content length:
#
# http.max_content_length: 100mb

# Disable HTTP completely:
#
# http.enabled: false

# Set the HTTP host to listen to
#
http.host: "localhost"

################################### Gateway ###################################

# The gateway allows for persisting the cluster state between full cluster
# restarts. Every change to the state (such as adding an index) will be stored
# in the gateway, and when the cluster starts up for the first time,
# it will read its state from the gateway.

# There are several types of gateway implementations. For more information, see
# <http://elasticsearch.org/guide/en/elasticsearch/reference/current/modules-gateway.html>.

# The default gateway type is the "local" gateway (recommended):
#
# gateway.type: local

# Settings below control how and when to start the initial recovery process on
# a full cluster restart (to reuse as much local data as possible when using shared
# gateway).

# Allow recovery process after N nodes in a cluster are up:
#
# gateway.recover_after_nodes: 1

# Set the timeout to initiate the recovery process, once the N nodes
# from previous setting are up (accepts time value):
#
# gateway.recover_after_time: 5m

# Set how many nodes are expected in this cluster. Once these N nodes
# are up (and recover_after_nodes is met), begin recovery process immediately
# (without waiting for recover_after_time to expire):
#
# gateway.expected_nodes: 2


############################# Recovery Throttling #############################

# These settings allow to control the process of shards allocation between
# nodes during initial recovery, replica allocation, rebalancing,
# or when adding and removing nodes.

# Set the number of concurrent recoveries happening on a node:
#
# 1. During the initial recovery
#
# cluster.routing.allocation.node_initial_primaries_recoveries: 4
#
# 2. During adding/removing nodes, rebalancing, etc
#
# cluster.routing.allocation.node_concurrent_recoveries: 2

# Set to throttle throughput when recovering (eg. 100mb, by default 20mb):
#
# indices.recovery.max_bytes_per_sec: 20mb

# Set to limit the number of open concurrent streams when
# recovering a shard from a peer:
#
# indices.recovery.concurrent_streams: 5


################################## Discovery ##################################

# Discovery infrastructure ensures nodes can be found within a cluster
# and master node is elected. Multicast discovery is the default.

# Set to ensure a node sees N other master eligible nodes to be considered
# operational within the cluster. Its recommended to set it to a higher value
# than 1 when running more than 2 nodes in the cluster.
#
# discovery.zen.minimum_master_nodes: 1

# Set the time to wait for ping responses from other nodes when discovering.
# Set this option to a higher value on a slow or congested network
# to minimize discovery failures:
#
# discovery.zen.ping.timeout: 3s

# For more information, see
# <http://elasticsearch.org/guide/en/elasticsearch/reference/current/modules-discovery-zen.html>

# Unicast discovery allows to explicitly control which nodes will be used
# to discover the cluster. It can be used when multicast is not present,
# or to restrict the cluster communication-wise.
#
# 1. Disable multicast discovery (enabled by default):
#
discovery.zen.ping.multicast.enabled: false
#
# 2. Configure an initial list of master nodes in the cluster
#    to perform discovery when new nodes (master or data) are started:
#
discovery.zen.ping.unicast.hosts: ["localhost"]

# EC2 discovery allows to use AWS EC2 API in order to perform discovery.
#
# You have to install the cloud-aws plugin for enabling the EC2 discovery.
#
# For more information, see
# <http://elasticsearch.org/guide/en/elasticsearch/reference/current/modules-discovery-ec2.html>
#
# See <http://elasticsearch.org/tutorials/elasticsearch-on-ec2/>
# for a step-by-step tutorial.

# GCE discovery allows to use Google Compute Engine API in order to perform discovery.
#
# You have to install the cloud-gce plugin for enabling the GCE discovery.
#
# For more information, see <https://github.com/elasticsearch/elasticsearch-cloud-gce>.

# Azure discovery allows to use Azure API in order to perform discovery.
#
# You have to install the cloud-azure plugin for enabling the Azure discovery.
#
# For more information, see <https://github.com/elasticsearch/elasticsearch-cloud-azure>.

################################## Slow Log ##################################

# Shard level query and fetch threshold logging.

#index.search.slowlog.threshold.query.warn: 10s
#index.search.slowlog.threshold.query.info: 5s
#index.search.slowlog.threshold.query.debug: 2s
#index.search.slowlog.threshold.query.trace: 500ms

#index.search.slowlog.threshold.fetch.warn: 1s
#index.search.slowlog.threshold.fetch.info: 800ms
#index.search.slowlog.threshold.fetch.debug: 500ms
#index.search.slowlog.threshold.fetch.trace: 200ms

#index.indexing.slowlog.threshold.index.warn: 10s
#index.indexing.slowlog.threshold.index.info: 5s
#index.indexing.slowlog.threshold.index.debug: 2s
#index.indexing.slowlog.threshold.index.trace: 500ms

################################## GC Logging ################################

#monitor.jvm.gc.young.warn: 1000ms
#monitor.jvm.gc.young.info: 700ms
#monitor.jvm.gc.young.debug: 400ms

#monitor.jvm.gc.old.warn: 10s
#monitor.jvm.gc.old.info: 5s
#monitor.jvm.gc.old.debug: 2s

Re: Log Server EC2 instance upgrade to 1.4 fails

Posted: Fri Jan 15, 2016 3:11 pm
by jolson
How much memory and disk space is your server configured with?

Code:

free -m
df -h

[2016-01-14 14:11:46,375][WARN ][common.jna ] Unable to lock JVM memory (ENOMEM). This can result in part of the JVM being swapped out. Increase RLIMIT_MEMLOCK (ulimit).

This warning is concerning, but shouldn't cause Elasticsearch to shut down. After Elasticsearch stops, take a look at /var/log/messages for any relevant information:

Code:

service elasticsearch restart
tail -n30 /var/log/messages
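
If the tail of /var/log/messages turns out to be dominated by the kernel's per-process table, a filter along these lines can narrow it to the OOM-killer verdicts. This is only a sketch: the function name oom_kills is made up for illustration, and the two patterns assume the kernel message wording typically seen on CentOS 6 kernels, which other kernels may phrase differently.

```shell
# Hypothetical helper (not part of Nagios Log Server): print only the
# OOM-killer verdict lines from a syslog file.
# $1: path to the log file, e.g. /var/log/messages
oom_kills() {
    grep -E 'Out of memory: Kill process|Killed process [0-9]+' "$1"
}

# Usage (run as root, since /var/log/messages is usually root-only):
#   oom_kills /var/log/messages
```

No output from the filter does not prove the absence of memory pressure; it only means no line in that file matched the assumed patterns.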

Re: Log Server EC2 instance upgrade to 1.4 fails

Posted: Mon Jan 18, 2016 10:12 am
by beachbar
Here is the output you requested. Thanks.

[~]$ free -m
             total       used       free     shared    buffers     cached
Mem:           590        413        177          0         12        153
-/+ buffers/cache:        247        343
Swap:          255         16        239

[~]$ df -h
Filesystem      Size  Used Avail Use% Mounted on
rootfs           99G  2.8G   91G   3% /
udev            276M  112K  276M   1% /dev
tmpfs           296M     0  296M   0% /dev/shm
/dev/xvde1       99G  2.8G   91G   3% /
none            296M     0  296M   0% /dev/shm

[~]$ sudo service elasticsearch restart
Stopping elasticsearch: [FAILED]
Starting elasticsearch: [ OK ]

[~]$ sudo tail -n30 /var/log/messages
Jan 18 10:09:52 ip-172-31-15-220 kernel: [ 907] 0 907 29325 2 0 0 0 crond
Jan 18 10:09:52 ip-172-31-15-220 kernel: [ 908] 48 908 59892 3 0 0 0 httpd
Jan 18 10:09:52 ip-172-31-15-220 kernel: [ 910] 48 910 59892 3 0 0 0 httpd
Jan 18 10:09:52 ip-172-31-15-220 kernel: [ 912] 48 912 59892 3 0 0 0 httpd
Jan 18 10:09:52 ip-172-31-15-220 kernel: [ 914] 48 914 59892 3 0 0 0 httpd
Jan 18 10:09:52 ip-172-31-15-220 kernel: [ 916] 48 916 59892 3 0 0 0 httpd
Jan 18 10:09:52 ip-172-31-15-220 kernel: [ 929] 0 929 32793 1 0 0 0 runuser
Jan 18 10:09:52 ip-172-31-15-220 kernel: [ 931] 502 931 390121 758 0 0 0 java
Jan 18 10:09:52 ip-172-31-15-220 kernel: [ 1025] 0 1025 1019 1 0 0 0 agetty
Jan 18 10:09:52 ip-172-31-15-220 kernel: [ 1026] 0 1026 1016 1 0 0 0 mingetty
Jan 18 10:09:52 ip-172-31-15-220 kernel: [ 1028] 0 1028 1016 1 0 0 0 mingetty
Jan 18 10:09:52 ip-172-31-15-220 kernel: [ 1030] 0 1030 1016 1 0 0 0 mingetty
Jan 18 10:09:52 ip-172-31-15-220 kernel: [ 1032] 0 1032 1016 1 0 0 0 mingetty
Jan 18 10:09:52 ip-172-31-15-220 kernel: [ 1034] 0 1034 1016 1 0 0 0 mingetty
Jan 18 10:09:52 ip-172-31-15-220 kernel: [ 1036] 0 1036 1016 1 0 0 0 mingetty
Jan 18 10:09:52 ip-172-31-15-220 kernel: [ 1584] 0 1584 4346 1 0 0 0 anacron
Jan 18 10:09:52 ip-172-31-15-220 kernel: [ 1971] 48 1971 59892 2 0 0 0 httpd
Jan 18 10:09:52 ip-172-31-15-220 kernel: [ 2020] 0 2020 23544 6 0 0 0 sshd
Jan 18 10:09:52 ip-172-31-15-220 kernel: [ 2022] 500 2022 23544 1 0 0 0 sshd
Jan 18 10:09:52 ip-172-31-15-220 kernel: [ 2023] 500 2023 27076 4 0 0 0 bash
Jan 18 10:09:52 ip-172-31-15-220 kernel: [ 2219] 502 2219 34017 2 0 0 0 crond
Jan 18 10:09:52 ip-172-31-15-220 kernel: [ 2220] 502 2220 34017 2 0 0 0 crond
Jan 18 10:09:52 ip-172-31-15-220 kernel: [ 2221] 502 2221 26515 2 0 0 0 sh
Jan 18 10:09:52 ip-172-31-15-220 kernel: [ 2222] 502 2222 26515 2 0 0 0 sh
Jan 18 10:09:52 ip-172-31-15-220 kernel: [ 2223] 502 2223 54587 51 0 0 0 php
Jan 18 10:09:52 ip-172-31-15-220 kernel: [ 2224] 502 2224 54587 39 0 0 0 php
Jan 18 10:09:52 ip-172-31-15-220 kernel: [ 2315] 502 2315 323212 136094 0 0 0 java
Jan 18 10:09:52 ip-172-31-15-220 kernel: [ 2328] 500 2328 27367 55 0 0 0 sudo
Jan 18 10:09:52 ip-172-31-15-220 kernel: Out of memory: Kill process 2315 (java) score 637 or sacrifice child
Jan 18 10:09:52 ip-172-31-15-220 kernel: Killed process 2315, UID 502, (java) total-vm:1292848kB, anon-rss:479556kB, file-rss:64820kB

Re: Log Server EC2 instance upgrade to 1.4 fails

Posted: Mon Jan 18, 2016 1:04 pm
by jolson
Jan 18 10:09:52 ip-172-31-15-220 kernel: Out of memory: Kill process 2315 (java) score 637 or sacrifice child
Jan 18 10:09:52 ip-172-31-15-220 kernel: Killed process 2315, UID 502, (java) total-vm:1292848kB, anon-rss:479556kB, file-rss:64820kB

Your server is running out of memory (free -m shows only 590 MB total), so the kernel is killing Elasticsearch. Because it's a memory-heavy program, it's going to want between 4 and 8 GB of memory to run properly.
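
To put rough numbers on this: the /etc/sysconfig/elasticsearch posted earlier sets ES_HEAP_SIZE to half of the total reported by free -m, and the kernel reports the killed process's resident memory in kB. The sketch below only redoes that arithmetic with the figures from this thread. The helper names half_mem_mb and kb_to_mb are invented for illustration and are not part of any Nagios or Elasticsearch tooling.

```shell
# Hypothetical helpers for illustration only.

# Mirrors the ES_HEAP_SIZE expression from /etc/sysconfig/elasticsearch:
# half of total physical memory, in MB (integer division).
# $1: the "Mem:" total from `free -m`
half_mem_mb() {
    expr "$1" / 2
}

# Converts a kB figure from the kernel log to whole MB (integer division).
# $1: a value such as the anon-rss number from the "Killed process" line
kb_to_mb() {
    expr "$1" / 1024
}

echo "heap request: $(half_mem_mb 590)m"       # 590 MB total from free -m
echo "java anon-rss: $(kb_to_mb 479556) MB"    # anon-rss:479556kB from the log
```

With these figures the JVM is asked for a 295 MB heap on a 590 MB machine, and its resident set had reached roughly 468 MB when the kernel killed it, which leaves very little for everything else running on the instance.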

Re: Log Server EC2 instance upgrade to 1.4 fails

Posted: Mon Jan 18, 2016 3:41 pm
by beachbar
That resolved the issue. I increased memory and everything is working well. Thank you for your help!

Re: Log Server EC2 instance upgrade to 1.4 fails

Posted: Mon Jan 18, 2016 3:52 pm
by tmcdonald
I'll be closing this thread now, but feel free to open another if you need anything in the future!