Nagios Log Server - Logstash process dying

Problem Description

The Logstash process keeps dying, and on the console of your server you observe the following:

INFO: [6a7ce4ea-e1b9-47a1-af18-1c4d47243d20] failed to get node info for [#transport#-1][sa585][inet[localhost/127.0.0.1:9300]], disconnecting...
java.lang.OutOfMemoryError: GC overhead limit exceeded

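Even after adding more memory to your server you continue to receive these errors. The most likely reason is that the error refers to the Java heap available to Logstash, which is limited by the Logstash startup settings rather than by the total memory in the server.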

Logstash Tuning

Two parameters in the /etc/init.d/logstash file can be adjusted to resolve this issue.

In a terminal session on your Log Server instance(s) execute the following command to open the file in vi:

vi /etc/init.d/logstash

When using the vi editor, to make changes press i on the keyboard first to enter insert mode. Press Esc to exit insert mode.

Update the file by changing the following lines so that they read:

LS_HEAP_SIZE="1000m"
LS_OPEN_FILES=65535

When you have finished making the changes, save the changes in vi by typing:

:wq

and press Enter.
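
As an alternative to editing the file interactively, the same two changes can be scripted. The following is a minimal sketch, not an official Nagios procedure. It assumes GNU sed is available and that the init script already contains uncommented lines beginning with LS_HEAP_SIZE= and LS_OPEN_FILES= at the start of a line. The function name set_ls_limits is hypothetical.

```shell
# Hypothetical helper: set the heap size and open-file limit in a Logstash init script.
# Assumes the file already has lines starting with LS_HEAP_SIZE= and LS_OPEN_FILES=.
set_ls_limits() {
    # Keep a backup copy next to the original before changing anything.
    cp -p "$1" "$1.bak"
    # Replace each assignment in place with the values used in this article.
    sed -i \
        -e 's/^LS_HEAP_SIZE=.*/LS_HEAP_SIZE="1000m"/' \
        -e 's/^LS_OPEN_FILES=.*/LS_OPEN_FILES=65535/' \
        "$1"
}

# On the Log Server you might run:
#   set_ls_limits /etc/init.d/logstash
```

If the file does not contain matching lines, sed leaves it unchanged, so it is worth confirming the result before restarting the service.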

Restart the service for the changes to take effect, using the commands below:

RHEL 7+ | CentOS 7+ | Debian | Ubuntu 16/18/20

systemctl daemon-reload
systemctl restart logstash.service

After making these changes the Logstash process should no longer die.
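
To confirm that the new values are in place and the service is running, a quick check along these lines may help. This is a sketch under the assumption that the settings appear as uncommented lines at the start of a line in the init script. The function name show_ls_limits is hypothetical.

```shell
# Hypothetical helper: print the active (uncommented) heap and open-file
# settings found in the given init script.
show_ls_limits() {
    grep -E '^LS_(HEAP_SIZE|OPEN_FILES)=' "$1"
}

# On the Log Server you might run:
#   show_ls_limits /etc/init.d/logstash
#   systemctl is-active logstash.service
```

The first command should list the two values set earlier, and systemctl is-active reports "active" while the service is running.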

Final Thoughts

For any support related questions please visit the Nagios Support Forums at:

http://support.nagios.com/forum/