Nagios Support Forum

Posted: **Tue Dec 18, 2018 9:13 pm**

Hi,

I am getting the following message in the logstash logs.

{:timestamp=>"2018-12-19T08:34:50.357000+0800", :message=>"Attempted to send a bulk request to Elasticsearch configured at '[\"http://localhost:9200\"]', but Elasticsearch appears to be unreachable or down!", :error_message=>"Connection reset", :class=>"Manticore::SocketException", :level=>:error}

This message is generated almost every 2 seconds.

I can also see the following messages in /var/log/messages:

Dec 19 10:05:34 pl-pd-nls1 logstash: Dec 19, 2018 10:05:34 AM org.apache.http.impl.execchain.RetryExec execute
Dec 19 10:05:34 pl-pd-nls1 logstash: INFO: Retrying request to {}->http://localhost:9200
Dec 19 10:05:35 pl-pd-nls1 logstash: Dec 19, 2018 10:05:35 AM org.apache.http.impl.execchain.RetryExec execute
Dec 19 10:05:35 pl-pd-nls1 logstash: INFO: I/O exception (java.net.SocketException) caught when processing request to {}->http://localhost:9200: Connection reset

These messages are also generated almost every 2 seconds.

I can also see the following message in the elasticsearch logs:

[2018-12-19 10:14:06,646][WARN ][http.netty ] [12c3bc39-1cf0-4dfd-a2c8-d9ec25fd25e8] Caught exception while handling client http traffic, closing connection [id: 0x848a19a0, /127.0.0.1:47880 => /127.0.0.1:9200]
org.elasticsearch.common.netty.handler.codec.frame.TooLongFrameException: HTTP content length exceeded 104857600 bytes.

However, when I checked, logstash and elasticsearch were still running, and I am able to telnet to localhost 9200. I have a 4-core CPU in my cluster, and CPU usage constantly varies between 100% and 200%.

I suspect I am receiving quite a high volume of syslog data, but we need these logs to be stored in NLS.

Can you please help me to resolve this issue? What configurations need to be optimized?

Thank you
Luke.

Posted: **Wed Dec 19, 2018 4:52 pm**

Hello, @lukedevon. Could you send in your System Profile so I can review it?
Attached is a script to gather a profile from the command line.
Copy the script to the machine and from the command line run:

chmod 755 profile.sh
./profile.sh

It will generate a file called system-profile.tar.gz in /tmp.

profile.sh

Posted: **Wed Dec 19, 2018 10:40 pm**

Hi

Thanks for the support. I have sent you the files by PM.

Current memory usage (in MB):

              total        used        free      shared  buff/cache   available
Mem:          32012       18902         250         129       12859       12433
Swap:         65531           0       65531

About to die.

Regards
Luke.

Posted: **Thu Dec 20, 2018 4:56 pm**

@lukedevon, let's increase the HTTP content length setting in:

/usr/local/nagioslogserver/elasticsearch/config/elasticsearch.yml

Open the file with a text editor, and find the following setting:

# http.max_content_length: 100mb

Uncomment the line and change the value to 500mb:

http.max_content_length: 500mb

Save the file and run the following commands:

service elasticsearch restart
service logstash restart
service httpd restart
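
For reference, the 104857600 bytes in the TooLongFrameException is the 100mb default written out in bytes. Here is a minimal sketch of that arithmetic, assuming binary units (1mb = 1024 * 1024 bytes), which is consistent with the figure in the log. The helper name to_bytes is hypothetical and is not part of Nagios Log Server or Elasticsearch.

```shell
#!/bin/sh
# Hypothetical helper: convert a size such as "100mb" to bytes.
# Assumes binary units (1mb = 1024 * 1024 bytes), which matches the
# 104857600-byte limit reported in the TooLongFrameException.
to_bytes() {
    size=$(printf '%s' "$1" | tr 'A-Z' 'a-z')
    number=${size%[kmg]b}          # strip a trailing kb/mb/gb suffix
    case "$size" in
        *kb) echo $(( $number * 1024 )) ;;
        *mb) echo $(( $number * 1024 * 1024 )) ;;
        *gb) echo $(( $number * 1024 * 1024 * 1024 )) ;;
        *)   echo "$size" ;;       # plain byte count, passed through
    esac
}

to_bytes 100mb   # prints 104857600 (the old limit)
to_bytes 500mb   # prints 524288000 (the new limit)
```

A single bulk request from logstash larger than the first figure is what triggered the connection reset. With the new setting the ceiling moves to the second figure.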

Posted: **Fri Dec 21, 2018 1:31 am**

Hi,

I configured it as you suggested, and the error has now cleared. However, after some time, I can see that log reception is getting slower and free memory on the NLS instances is very low, sometimes between 200 and 500 MB.

Swap is also being used.

Can you please help me fix this? Do I need to increase the physical memory?

Currently, each NLS instance has 32 GB of RAM.

Regards
Luke.

Posted: **Fri Dec 21, 2018 2:48 pm**

@lukedevon, yes, I suggest going up to 64 GB of RAM if possible.
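
When comparing memory before and after the upgrade, the "available" column can be more telling than "free", since buff/cache memory can largely be reclaimed. Here is a rough sketch, assuming the seven-field Mem: line layout from the free output posted earlier in this thread. The function name mem_available_pct is hypothetical.

```shell
#!/bin/sh
# Hypothetical helper: reads `free -m` style output on stdin and prints
# the share of total memory that is still available, as a whole percent.
# Assumes "total" is field 2 and "available" is field 7 of the Mem: line.
mem_available_pct() {
    awk '/^Mem:/ { printf "%d\n", $7 * 100 / $2 }'
}

# Figures posted earlier in this thread (32 GB node):
printf 'Mem: 32012 18902 250 129 12859 12433\n' | mem_available_pct   # prints 38

# On a live node, assuming `free` reports an "available" column:
# free -m | mem_available_pct
```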

Elasticsearch appears to be unreachable or down!
