Elasticsearch appears to be unreachable or down!

This support forum board is for support questions relating to Nagios Log Server, our solution for managing and monitoring critical log data.
Locked
lukedevon
Posts: 143
Joined: Sat Mar 24, 2018 9:15 am

Elasticsearch appears to be unreachable or down!

Post by lukedevon »

Hi,

I am getting the following message in the logstash logs.

{:timestamp=>"2018-12-19T08:34:50.357000+0800", :message=>"Attempted to send a bulk request to Elasticsearch configured at '[\"http://localhost:9200\"]',
but Elasticsearch appears to be unreachable or down!", :error_message=>"Connection reset", :class=>"Manticore::SocketException", :level=>:error}

This message is generated almost every 2 seconds.

I can also see the following messages in /var/log/messages:

Dec 19 10:05:34 pl-pd-nls1 logstash: Dec 19, 2018 10:05:34 AM org.apache.http.impl.execchain.RetryExec execute
Dec 19 10:05:34 pl-pd-nls1 logstash: INFO: Retrying request to {}->http://localhost:9200
Dec 19 10:05:35 pl-pd-nls1 logstash: Dec 19, 2018 10:05:35 AM org.apache.http.impl.execchain.RetryExec execute
Dec 19 10:05:35 pl-pd-nls1 logstash: INFO: I/O exception (java.net.SocketException) caught when processing request to {}->http://localhost:9200: Connection reset


These messages are also generated almost every 2 seconds.

I can also see the following message in the Elasticsearch logs:

[2018-12-19 10:14:06,646][WARN ][http.netty ] [12c3bc39-1cf0-4dfd-a2c8-d9ec25fd25e8] Caught exception while handling client http traffic, closing connection [id: 0x848a19a0, /127.0.0.1:47880 => /127.0.0.1:9200]
org.elasticsearch.common.netty.handler.codec.frame.TooLongFrameException: HTTP content length exceeded 104857600 bytes.



However, when I checked, Logstash and Elasticsearch were still running, and I am able to telnet to localhost 9200. I have 4-core CPUs in my cluster, and CPU usage constantly varies between 100% and 200%.

I suspect I am receiving quite a high volume of syslog data, but it seems we need these logs to be stored in NLS.

Can you please help me to resolve this issue? What configurations need to be optimized?

Thank you
Luke.
npolovenko
Support Tech
Posts: 3457
Joined: Mon May 15, 2017 5:00 pm

Re: Elasticsearch appears to be unreachable or down!

Post by npolovenko »

Hello, @lukedevon. Could you send in your System Profile so I can review it?
Attached is a script to gather a profile from the command line.
Copy the script to the machine and from the command line run:
chmod 755 profile.sh
./profile.sh
It will generate a file called system-profile.tar.gz in /tmp.
profile.sh
As of May 25th, 2018, all communications with Nagios Enterprises and its employees are covered under our new Privacy Policy.
lukedevon
Posts: 143
Joined: Sat Mar 24, 2018 9:15 am

Re: Elasticsearch appears to be unreachable or down!

Post by lukedevon »

Hi

Thanks for the support. I have PMed you the files.

Current memory usage:

              total        used        free      shared  buff/cache   available
Mem:          32012       18902         250         129       12859       12433
Swap:         65531           0       65531


About to die.

Regards
Luke.
npolovenko
Support Tech
Posts: 3457
Joined: Mon May 15, 2017 5:00 pm

Re: Elasticsearch appears to be unreachable or down!

Post by npolovenko »

@lukedevon, let's increase the HTTP content length setting in:
/usr/local/nagioslogserver/elasticsearch/config/elasticsearch.yml
Open the file with a text editor, and find the following setting:
# http.max_content_length: 100mb
Uncomment the line and change the value to 500mb:
http.max_content_length: 500mb
Save the file and run the following commands:
service elasticsearch restart
service logstash restart
service httpd restart
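
As a sanity check on the numbers: the 104857600 bytes in the Elasticsearch warning is exactly 100 MB (100 x 1024 x 1024), which matches the commented-out 100mb value above. The edit itself could be sketched from the command line roughly as follows. This is only a sketch: it works on a scratch file created with mktemp rather than the real elasticsearch.yml, and the sed pattern assumes the line looks exactly like the commented one shown above.

```shell
# The limit reported in the Elasticsearch log, converted from bytes to MB.
limit_mb=$(( 104857600 / 1024 / 1024 ))
echo "current limit: ${limit_mb}mb"

# Scratch copy standing in for elasticsearch.yml (assumption: the real file
# contains the commented-out line exactly as quoted in this thread).
conf=$(mktemp)
printf '%s\n' '# http.max_content_length: 100mb' > "$conf"

# Uncomment the setting and raise it to 500mb.
sed 's/^#[[:space:]]*http\.max_content_length:.*/http.max_content_length: 500mb/' \
    "$conf" > "$conf.new" && mv "$conf.new" "$conf"

# Confirm the change took effect.
result=$(grep '^http.max_content_length' "$conf")
echo "$result"

rm -f "$conf"
```

On a live system, back up the real file first, apply the same substitution to it, and then run the three restart commands above.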
lukedevon
Posts: 143
Joined: Sat Mar 24, 2018 9:15 am

Re: Elasticsearch appears to be unreachable or down!

Post by lukedevon »

Hi,

I configured it as you suggested and the error has now cleared. But after some time, log receiving gets slower and free memory on the NLS instances is very low, sometimes 200 - 500 MB.

Swap is also being used.

Can you please help me fix this? Do I need to increase physical memory?

Currently, each NLS instance has 32 GB of RAM.

Regards
Luke.
npolovenko
Support Tech
Posts: 3457
Joined: Mon May 15, 2017 5:00 pm

Re: Elasticsearch appears to be unreachable or down!

Post by npolovenko »

@lukedevon, yes, I suggest going up to 64 GB of RAM if possible.
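
When judging memory pressure, it may also help to look at the "available" column of free rather than "free", since Linux counts page cache under buff/cache and can reclaim much of it. A rough sketch using the Mem: line posted earlier in this thread, assuming those figures are in MB (as from free -m); the variable names are illustrative only:

```shell
# Sample line copied from the free output earlier in this thread (assumed MB).
mem_line='Mem: 32012 18902 250 129 12859 12433'

# Split into positional fields: label total used free shared buff/cache available
set -- $mem_line
total=$2
free_mb=$4
available=$7

# Share of RAM that is still available, as a whole-number percentage.
pct=$(( available * 100 / total ))
echo "free=${free_mb}MB available=${available}MB (${pct}% of ${total}MB)"
```

To use live values instead of the sample, replace the first assignment with mem_line=$(free -m | grep '^Mem:').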