
Interface slow down

Posted: Thu Jan 15, 2015 8:36 pm
by stecino
Hello,

My interface has slowed down a great deal. This is what I have; each node has 4 vCPUs and 8GB of RAM:

IP           Hostname  Port  Load (1m, 5m, 15m)  CPU  Mem Used  Mem Free  Storage Total  Storage Avail
10.xx.x.247  xxx2nls2  9300  1.31, 0.83, 0.77    45%  23%       76%       639.8GB        536.3GB
10.xx.x.246  xxx2nls1  9300  0.39, 0.22, 0.32    14%  25%       74%       639.8GB        532.8GB
10.yy.y.246  yyy2nls1  9300  0.18, 0.34, 0.37     9%  23%       76%       639.8GB        532.7GB
10.yy.y.147  yyy2nls2  9300  2.13, 1.65, 1.71    19%  26%       73%       639.8GB        534.8GB

Elasticsearch and Logstash report as running on all four nodes.


590,538,616 Documents
135.7GB Primary Size
265.4GB Total Size
4 Data Instances
272 Total Shards
28 Indices


4 Total Instances
0 Client
4 Master/Data
16 Processors
0% Process CPU
5.00 GB Memory Used
0 bytes Swap
2,559.20 GB Total Storage
2,272.31 GB Free Storage
8.39 GB Data Read
8.73 GB Data Written
17.12 GB I/O Size

This is what top shows:

1213 nagios 20 0 37.1g 2.1g 934m S 194.2 26.9 4051:37 /usr/bin/java -Xms256m -Xmx1g -Xss256k -Djava.awt.headless=true -XX:+UseParNewGC -XX:+UseConcMarkSweepGC -XX:CMSInitiatingOccupancyFraction=75 -XX:+UseCMSInitiatingOccupancyO
28899 nagios 39 19 2966m 392m 11m S 104.1 4.9 1634:49 /usr/bin/java -Djava.io.tmpdir=/usr/local/nagioslogserver/tmp -Xmx500m -XX:+UseParNewGC -XX:+UseConcMarkSweepGC -Djava.awt.headless=true -XX:CMSInitiatingOccupancyFraction=75

One of the processes is Logstash; the other is Elasticsearch.
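One thing I notice from the flags: the Elasticsearch process was launched with -Xms256m -Xmx1g, so its heap is capped at 1GB even though each node has 8GB of RAM. I assume the heap could be raised through the service's environment config, something like this (the file path is my guess for a stock install; half of physical memory is the usual rule of thumb):

```
# /etc/sysconfig/elasticsearch (path may differ on your install)
# With 8GB RAM per node, a heap of roughly half physical memory is a common target
ES_HEAP_SIZE=4g
```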

Has anyone run into issues like this?

Re: Interface slow down

Posted: Fri Jan 16, 2015 2:13 pm
by cmerchant
How many servers are you collecting logs from?

How many documents are you collecting per day/hour/minute?

How long do you retain your log data?

Have you considered filtering your inbound data to the log server cluster?

Are all of your nodes on the same VMware server?

Your CPU % levels would indicate you are I/O bound; it could be disk or network.

Re: Interface slow down

Posted: Fri Jan 16, 2015 6:12 pm
by stecino
cmerchant wrote:How many servers are you collecting logs from?

How many documents are you collecting per day/hour/minute?

How long do you retain your log data?

Have you considered filtering your inbound data to the log server cluster?

Are all of your nodes on the same VMware server?

Your CPU % levels would indicate you are I/O bound; it could be disk or network.
At the moment I have 80 log sources. This includes server-level logs as well as application logs.
I am collecting anywhere between 45-90 million documents a day.
So far I have it set to 30-day retention.
What kind of filtering would you propose?
All my nodes have identical resources.
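For scale, that daily volume works out to a sustained ingest rate in the high hundreds of documents per second:

```shell
# Rough ingest rate from the figures above: 45-90 million documents/day
# 86400 seconds per day; integer arithmetic is close enough here
echo "low:  $((45000000 / 86400)) docs/sec"
echo "high: $((90000000 / 86400)) docs/sec"
```

That is roughly 520-1041 documents per second across the cluster.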

Re: Interface slow down

Posted: Mon Jan 19, 2015 5:37 pm
by abrist
stecino wrote:What kind of filtering would you propose?
It all depends on what you need. Take a look at the logs coming from a source. Are there any lines you feel you do not need?
As you may be I/O bound, have you thought about spinning up a second instance or increasing the speed of your disk subsystem?
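As a sketch, a Logstash filter along these lines could discard events before they are indexed; the DEBUG match here is just a placeholder, so substitute a pattern that fits lines you actually don't need:

```
filter {
  # Hypothetical example: drop debug-level messages before indexing
  if [message] =~ /DEBUG/ {
    drop { }
  }
}
```

Every event you drop at the filter stage is one less document Elasticsearch has to write and store, so this directly reduces disk I/O.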

Re: Interface slow down

Posted: Tue Jan 20, 2015 1:13 am
by stecino
abrist wrote:
stecino wrote:What kind of filtering would you propose?
It all depends on what you need. Take a look at the logs coming from a source. Are there any lines you feel you do not need?
As you may be I/O bound, have you thought about spinning up a second instance or increasing the speed of your disk subsystem?
I updated setup-linux.sh to use UDP as opposed to TCP, and I also made sure that log sources were sending to a cluster node on the same network.
This addressed the issue.

Re: Interface slow down

Posted: Tue Jan 20, 2015 10:33 am
by tmcdonald
As long as you understand and accept the risks associated with UDP, that will reduce the overhead somewhat.
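For anyone else making this change: if your log sources forward via rsyslog, the difference is a single character in the forwarding rule; one @ sends over UDP, two send over TCP (host and port below are placeholders):

```
# /etc/rsyslog.conf forwarding examples
*.* @logserver.example.com:514     # UDP: lower overhead, no delivery guarantee
*.* @@logserver.example.com:514    # TCP: reliable delivery, more connection overhead
```

With UDP, messages sent while the server is busy or unreachable are silently lost, which is the risk to weigh against the reduced overhead.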