Log Server Slow
We are in the middle of sizing our log server cluster. Currently we are doing about 17 million hits per 24 hours, and our log server web interface is almost unusable. I didn't set up the backup repository until today, but do the backup/maintenance jobs run as soon as the repository and backup/maintenance settings are configured? I suspect that with the size of the current data and nothing being closed, it is just struggling.
We have a two node cluster....
1. How do I keep an eye on the maintenance task and know what it is doing?
2. I am trying to determine a proper configuration for our usage. It seems that over time, the RAM gets consumed. I started with 4GB, then went to 8GB, and have now settled at 10GB.
3. How much CPU should I give the machine? I tested with 2 and 4, which doesn't seem to be sufficient.
4. What can I do from here to make NLS usable? I am just no longer able to navigate around the web interface.
Thanks in advance. We are looking for some guidance on making sure we size the cluster right.
Re: Log Server Slow
Q: Do the backup/maintenance jobs run as soon as the repository and backup/maintenance settings are configured?
A: I ran a test on my end, and when I configured my backup settings the backup did not start. The backup did start when I ran the 'backup_maintenance' Command from the 'Command Subsystem' menu. It looks like a backup has to either be initiated manually from the Command Subsystem, or it has to reach its daily backup time.
Q. How do I keep an eye on the maintenance task and know what it is doing?
A: To my knowledge there is no way to actively 'watch' the maintenance command. Maintenance is performed through an Elasticsearch extension called 'Curator'.
Q. I am trying to determine a proper configuration for our usage. It seems that over time, the RAM gets consumed. I started with 4GB, then went to 8GB, and have now settled at 10GB.
A: It's very hard to give hardware recommendations without understanding all of the variables in your network. In general, it is recommended that half of the RAM of the server is allocated to your Elasticsearch Java heap. You can configure this by uncommenting the '#ES_HEAP_SIZE=2g' variable and setting it accordingly, followed by restarting Elasticsearch:
Code: Select all
vi /etc/sysconfig/elasticsearch
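To make the half-of-RAM guideline concrete, here is a minimal sketch of the arithmetic. The MemTotal figure (in kB) is the one from the `top` output posted further down this thread; on a live node you could read it with `grep MemTotal /proc/meminfo`.

```shell
# Sketch: turn the "half of total RAM" guideline into a concrete
# ES_HEAP_SIZE value. Sample MemTotal (kB) matches the ~10 GB node
# discussed in this thread.
mem_total_kb=10261676

# Half the RAM, converted to whole gigabytes (rounded to nearest).
heap_gb=$(( (mem_total_kb / 2048 + 512) / 1024 ))

echo "ES_HEAP_SIZE=${heap_gb}g"   # value to set in /etc/sysconfig/elasticsearch
```

On a 10 GB node this works out to 5g, which matches the -Xms5g/-Xmx5g flags visible in the ps output later in the thread.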
Q. How much CPU should I give the machine? I tested with 2 and 4, which doesn't seem to be sufficient.
A: One thing I will point out, performance-wise, is that there is only a marginal benefit to 2 nodes over a single node, as all data is indexed on both instances; the real load-reduction benefit comes with 3+ nodes, since indexing will always only happen on 2 instances. If you are not able to add a third node, I recommend increasing the CPU and RAM amounts until you see a benefit.
Q. What can I do from here to make NLS usable? I am just no longer able to navigate around the web interface.
A: I assume that the GUI is unavailable due to high resource load. Can you post the output of the following please:
Code: Select all
top
free -m
df -h
ps aux
Re: Log Server Slow
Thanks for the detailed reply to my questions.
Q: Do the backup/maintenance jobs run as soon as the repository and backup/maintenance settings are configured?
I've left this for about a day, but it did not execute. I suspect that the nagios user does not have permission to the CIFS share. I am not sure if there are logs to check further?
After setting the heap size it is working much better. I was also reviewing Elasticsearch's configuration and found their article on the mlockall memory setting (http://www.elasticsearch.org/guide/en/e ... ation.html). It looks like mlockall is set to true in the config:
Code: Select all
bootstrap.mlockall: true
However, when checking the value, it is returning false. Does this have any additional benefits or impact? It is meant to prevent Elasticsearch from swapping, but I am not familiar enough with Elasticsearch to know.
Code: Select all
"cluster_name" : "87f95151-7003-42fc-a76a-bc101723dfc0",
"nodes" : {
"_aEDEqC3Q2-_eMBdzI-RJA" : {
"name" : "51638f3e-29a3-4ead-8aa6-ee0811017f01",
"transport_address" : "inet[/10.242.13.78:9300]",
"host" : "localhost.localdomain",
"ip" : "127.0.0.1",
"version" : "1.3.2",
"build" : "dee175d",
"http_address" : "inet[localhost/127.0.0.1:9200]",
"attributes" : {
"max_local_storage_nodes" : "1"
},
"process" : {
"refresh_interval_in_millis" : 1000,
"id" : 1390,
"max_file_descriptors" : 65535,
"mlockall" : false
}
},
"jeUh0fIzTzeUdpqlPY9puQ" : {
"name" : "16fcc224-849a-405f-bfaf-8321387b7294",
"transport_address" : "inet[/10.242.13.77:9300]",
"host" : "nls01",
"ip" : "127.0.0.1",
"version" : "1.3.2",
"build" : "dee175d",
"http_address" : "inet[localhost/127.0.0.1:9200]",
"attributes" : {
"max_local_storage_nodes" : "1"
},
"process" : {
"refresh_interval_in_millis" : 1000,
"id" : 1451,
"max_file_descriptors" : 65535,
"mlockall" : false
For now, I've closed off some indices manually and configured the heap size, seems to be running much smoother now.
Re: Log Server Slow
I've left this for about a day, but it did not execute. I suspect that the nagios user does not have permission to the CIFS share. I am not sure if there are logs to check further?
There are no logs that I am aware of - can you run the following command and return the output please:
Code: Select all
ll -d /examplebackup/
Does this have any additional benefits or impact? This is to prevent Elasticsearch from swapping. I am not familiar enough with Elasticsearch to know. Looks like mlock is set to true in the config, however when checking the value, it is returning false.
Just FYI, I got it to turn on properly by performing the following:
Code: Select all
service elasticsearch stop
ulimit -l unlimited
service elasticsearch start
curl http://localhost:9200/_nodes/process?pretty
"mlockall" : trueIt looks like it wasn't on because the nagios user didn't have perms to lock memory. Feel free to turn it on if you'd like, but please be careful while doing so.
Re: Log Server Slow
With the heap memory settings and the maintenance job, the server seems to run better now. Looking at the memory usage over time, it is still swapping, but I am not sure if this is a concern.
Looks like the backup jobs are now running and the repository is showing the backups.
Below are some stats in case you see something odd with memory usage:
Code: Select all
Tasks: 161 total, 1 running, 160 sleeping, 0 stopped, 0 zombie
Cpu(s): 2.5%us, 1.1%sy, 3.8%ni, 92.6%id, 0.0%wa, 0.0%hi, 0.1%si, 0.0%st
Mem: 10261676k total, 10106288k used, 155388k free, 88616k buffers
Swap: 262136k total, 82528k used, 179608k free, 4873052k cached
PID USER PR NI VIRT RES SHR S %CPU %MEM TIME+ COMMAND
1474 root 39 19 3985m 281m 5836 S 34.6 2.8 288:27.01 java
1402 nagios 20 0 25.0g 4.3g 206m S 16.3 44.0 253:35.73 java
23068 root 20 0 15028 1352 988 R 0.3 0.0 0:00.01 top
1 root 20 0 19232 1304 1112 S 0.0 0.0 0:01.44 init
2 root 20 0 0 0 0 S 0.0 0.0 0:00.01 kthreadd
Code: Select all
total used free shared buffers cached
Mem: 10021 9892 129 0 86 4781
-/+ buffers/cache: 5024 4996
Swap: 255 80 175
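One way to read that free -m output: the first row counts buffers and page cache as "used", while the "-/+ buffers/cache" row subtracts them, which is the number that matters for memory pressure. A sketch of that arithmetic using the figures above (values in MB, so results can differ from the printed row by a megabyte of rounding):

```shell
# Recompute the "-/+ buffers/cache" row from the first row of free -m.
# Figures are the ones posted above, in MB.
mem_total=10021
mem_used=9892
buffers=86
cached=4781

real_used=$(( mem_used - buffers - cached ))  # memory processes actually hold
real_free=$(( mem_total - real_used ))        # free + reclaimable cache

echo "real used: ${real_used} MB, effectively free: ${real_free} MB"
```

With roughly 4.8 GB reclaimable as cache and only about 80 MB of swap in use, the small amount of swapping shown here is generally not alarming by itself, though sustained swap growth would be.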
Code: Select all
Filesystem Size Used Avail Use% Mounted on
rootfs 133G 52G 80G 40% /
devtmpfs 4.9G 148K 4.9G 1% /dev
tmpfs 4.9G 0 4.9G 0% /dev/shm
/dev/sda1 133G 52G 80G 40% /
//10.242.13.110/repo/
100G 32G 69G 32% /mnt/reposhare
Code: Select all
USER PID %CPU %MEM VSZ RSS TTY STAT START TIME COMMAND
root 1 0.0 0.0 19232 1304 ? Ss Mar05 0:01 /sbin/init
root 2 0.0 0.0 0 0 ? S Mar05 0:00 [kthreadd]
root 3 0.0 0.0 0 0 ? S Mar05 0:00 [migration/0]
root 4 0.0 0.0 0 0 ? S Mar05 0:06 [ksoftirqd/0]
root 5 0.0 0.0 0 0 ? S Mar05 0:00 [migration/0]
root 6 0.0 0.0 0 0 ? S Mar05 0:00 [watchdog/0]
root 7 0.0 0.0 0 0 ? S Mar05 0:00 [migration/1]
root 8 0.0 0.0 0 0 ? S Mar05 0:00 [migration/1]
root 9 0.0 0.0 0 0 ? S Mar05 0:03 [ksoftirqd/1]
root 10 0.0 0.0 0 0 ? S Mar05 0:00 [watchdog/1]
root 11 0.0 0.0 0 0 ? S Mar05 0:00 [migration/2]
root 12 0.0 0.0 0 0 ? S Mar05 0:00 [migration/2]
root 13 0.0 0.0 0 0 ? S Mar05 0:04 [ksoftirqd/2]
root 14 0.0 0.0 0 0 ? S Mar05 0:00 [watchdog/2]
root 15 0.0 0.0 0 0 ? S Mar05 0:00 [migration/3]
root 16 0.0 0.0 0 0 ? S Mar05 0:00 [migration/3]
root 17 0.0 0.0 0 0 ? S Mar05 0:03 [ksoftirqd/3]
root 18 0.0 0.0 0 0 ? S Mar05 0:00 [watchdog/3]
root 19 0.0 0.0 0 0 ? S Mar05 0:00 [migration/4]
root 20 0.0 0.0 0 0 ? S Mar05 0:00 [migration/4]
root 21 0.0 0.0 0 0 ? S Mar05 0:03 [ksoftirqd/4]
root 22 0.0 0.0 0 0 ? S Mar05 0:00 [watchdog/4]
root 23 0.0 0.0 0 0 ? S Mar05 0:00 [migration/5]
root 24 0.0 0.0 0 0 ? S Mar05 0:00 [migration/5]
root 25 0.0 0.0 0 0 ? S Mar05 0:03 [ksoftirqd/5]
root 26 0.0 0.0 0 0 ? S Mar05 0:00 [watchdog/5]
root 27 0.0 0.0 0 0 ? S Mar05 0:04 [events/0]
root 28 0.0 0.0 0 0 ? S Mar05 0:04 [events/1]
root 29 0.0 0.0 0 0 ? S Mar05 0:28 [events/2]
root 30 0.0 0.0 0 0 ? S Mar05 0:04 [events/3]
root 31 0.0 0.0 0 0 ? S Mar05 0:04 [events/4]
root 32 0.0 0.0 0 0 ? S Mar05 0:07 [events/5]
root 33 0.0 0.0 0 0 ? S Mar05 0:00 [cgroup]
root 34 0.0 0.0 0 0 ? S Mar05 0:00 [khelper]
root 35 0.0 0.0 0 0 ? S Mar05 0:00 [netns]
root 36 0.0 0.0 0 0 ? S Mar05 0:00 [async/mgr]
root 37 0.0 0.0 0 0 ? S Mar05 0:00 [pm]
root 38 0.0 0.0 0 0 ? S Mar05 0:00 [sync_supers]
root 39 0.0 0.0 0 0 ? S Mar05 0:26 [bdi-default]
root 40 0.0 0.0 0 0 ? S Mar05 0:00 [kintegrityd/0]
root 41 0.0 0.0 0 0 ? S Mar05 0:00 [kintegrityd/1]
root 42 0.0 0.0 0 0 ? S Mar05 0:00 [kintegrityd/2]
root 43 0.0 0.0 0 0 ? S Mar05 0:00 [kintegrityd/3]
root 44 0.0 0.0 0 0 ? S Mar05 0:00 [kintegrityd/4]
root 45 0.0 0.0 0 0 ? S Mar05 0:00 [kintegrityd/5]
root 46 0.0 0.0 0 0 ? S Mar05 0:01 [kblockd/0]
root 47 0.0 0.0 0 0 ? S Mar05 0:01 [kblockd/1]
root 48 0.0 0.0 0 0 ? S Mar05 0:01 [kblockd/2]
root 49 0.0 0.0 0 0 ? S Mar05 0:01 [kblockd/3]
root 50 0.0 0.0 0 0 ? S Mar05 0:01 [kblockd/4]
root 51 0.0 0.0 0 0 ? S Mar05 0:01 [kblockd/5]
root 52 0.0 0.0 0 0 ? S Mar05 0:00 [kacpid]
root 53 0.0 0.0 0 0 ? S Mar05 0:00 [kacpi_notify]
root 54 0.0 0.0 0 0 ? S Mar05 0:00 [kacpi_hotplug]
root 55 0.0 0.0 0 0 ? S Mar05 0:00 [ata/0]
root 56 0.0 0.0 0 0 ? S Mar05 0:00 [ata/1]
root 57 0.0 0.0 0 0 ? S Mar05 0:00 [ata/2]
root 58 0.0 0.0 0 0 ? S Mar05 0:00 [ata/3]
root 59 0.0 0.0 0 0 ? S Mar05 0:00 [ata/4]
root 60 0.0 0.0 0 0 ? S Mar05 0:00 [ata/5]
root 61 0.0 0.0 0 0 ? S Mar05 0:00 [ata_aux]
root 62 0.0 0.0 0 0 ? S Mar05 0:00 [ksuspend_usbd]
root 63 0.0 0.0 0 0 ? S Mar05 0:00 [khubd]
root 64 0.0 0.0 0 0 ? S Mar05 0:00 [kseriod]
root 65 0.0 0.0 0 0 ? S Mar05 0:00 [md/0]
root 66 0.0 0.0 0 0 ? S Mar05 0:00 [md/1]
root 67 0.0 0.0 0 0 ? S Mar05 0:00 [md/2]
root 68 0.0 0.0 0 0 ? S Mar05 0:00 [md/3]
root 69 0.0 0.0 0 0 ? S Mar05 0:00 [md/4]
root 70 0.0 0.0 0 0 ? S Mar05 0:00 [md/5]
root 71 0.0 0.0 0 0 ? S Mar05 0:00 [md_misc/0]
root 72 0.0 0.0 0 0 ? S Mar05 0:00 [md_misc/1]
root 73 0.0 0.0 0 0 ? S Mar05 0:00 [md_misc/2]
root 74 0.0 0.0 0 0 ? S Mar05 0:00 [md_misc/3]
root 75 0.0 0.0 0 0 ? S Mar05 0:00 [md_misc/4]
root 76 0.0 0.0 0 0 ? S Mar05 0:00 [md_misc/5]
root 77 0.0 0.0 0 0 ? S Mar05 0:00 [khungtaskd]
root 78 0.0 0.0 0 0 ? S Mar05 0:28 [kswapd0]
root 79 0.0 0.0 0 0 ? SN Mar05 0:00 [ksmd]
root 80 0.0 0.0 0 0 ? SN Mar05 0:04 [khugepaged]
root 81 0.0 0.0 0 0 ? S Mar05 0:00 [aio/0]
root 82 0.0 0.0 0 0 ? S Mar05 0:00 [aio/1]
root 83 0.0 0.0 0 0 ? S Mar05 0:00 [aio/2]
root 84 0.0 0.0 0 0 ? S Mar05 0:00 [aio/3]
root 85 0.0 0.0 0 0 ? S Mar05 0:00 [aio/4]
root 86 0.0 0.0 0 0 ? S Mar05 0:00 [aio/5]
root 87 0.0 0.0 0 0 ? S Mar05 0:00 [crypto/0]
root 88 0.0 0.0 0 0 ? S Mar05 0:00 [crypto/1]
root 89 0.0 0.0 0 0 ? S Mar05 0:00 [crypto/2]
root 90 0.0 0.0 0 0 ? S Mar05 0:00 [crypto/3]
root 91 0.0 0.0 0 0 ? S Mar05 0:00 [crypto/4]
root 92 0.0 0.0 0 0 ? S Mar05 0:00 [crypto/5]
root 97 0.0 0.0 0 0 ? S Mar05 0:00 [kthrotld/0]
root 98 0.0 0.0 0 0 ? S Mar05 0:00 [kthrotld/1]
root 99 0.0 0.0 0 0 ? S Mar05 0:00 [kthrotld/2]
root 100 0.0 0.0 0 0 ? S Mar05 0:00 [kthrotld/3]
root 101 0.0 0.0 0 0 ? S Mar05 0:00 [kthrotld/4]
root 102 0.0 0.0 0 0 ? S Mar05 0:00 [kthrotld/5]
root 103 0.0 0.0 0 0 ? S Mar05 0:00 [pciehpd]
root 105 0.0 0.0 0 0 ? S Mar05 0:00 [kpsmoused]
root 106 0.0 0.0 0 0 ? S Mar05 0:00 [usbhid_resumer]
root 185 0.0 0.0 0 0 ? S Mar05 0:00 [scsi_eh_0]
root 186 0.0 0.0 0 0 ? S Mar05 0:00 [scsi_eh_1]
root 225 0.0 0.0 0 0 ? S Mar05 0:01 [mpt_poll_0]
root 226 0.0 0.0 0 0 ? S Mar05 0:00 [mpt/0]
root 227 0.0 0.0 0 0 ? S Mar05 0:00 [scsi_eh_2]
root 412 0.0 0.0 0 0 ? S Mar05 0:22 [jbd2/sda1-8]
root 413 0.0 0.0 0 0 ? S Mar05 0:00 [ext4-dio-unwrit]
root 414 0.0 0.0 0 0 ? S Mar05 0:00 [ext4-dio-unwrit]
root 415 0.0 0.0 0 0 ? S Mar05 0:00 [ext4-dio-unwrit]
root 416 0.0 0.0 0 0 ? S Mar05 0:00 [ext4-dio-unwrit]
root 417 0.0 0.0 0 0 ? S Mar05 0:00 [ext4-dio-unwrit]
root 418 0.0 0.0 0 0 ? S Mar05 0:00 [ext4-dio-unwrit]
root 491 0.0 0.0 11044 340 ? S<s Mar05 0:00 /sbin/udevd -d
root 667 0.0 0.0 0 0 ? S Mar05 0:01 [vmmemctl]
root 836 0.0 0.0 11040 364 ? S< Mar05 0:00 /sbin/udevd -d
root 838 0.0 0.0 0 0 ? S Mar05 0:00 [kstriped]
root 871 0.0 0.0 0 0 ? S Mar05 0:34 [flush-8:0]
root 885 0.0 0.0 0 0 ? S Mar05 0:01 [kauditd]
root 1228 0.1 0.0 179068 2444 ? S Mar05 1:10 /usr/sbin/vmtoolsd
root 1314 0.0 0.0 93176 712 ? S<sl Mar05 0:05 auditd
root 1330 0.0 0.0 333460 4796 ? Sl Mar05 0:04 /sbin/rsyslogd -i /var/run/syslogd.pid -c 5
root 1365 0.0 0.0 66628 612 ? Ss Mar05 0:00 /usr/sbin/sshd
root 1373 0.0 0.0 22180 864 ? Ss Mar05 0:00 xinetd -stayalive -pidfile /var/run/xinetd.pid
ntp 1381 0.0 0.0 30732 1540 ? Ss Mar05 0:00 ntpd -u ntp:ntp -p /var/run/ntpd.pid -g
nagios 1402 22.2 43.9 26211812 4512072 ? Sl Mar05 254:02 /usr/bin/java -Xms5g -Xmx5g -Xss256k -Djava.awt.headless=true -XX:+UseParNewGC -XX:+UseConcMarkSweepGC -XX:CMSInitiatingOccupancyFraction=75 -XX:+UseCMSInitiatingOccupancyOnly -XX:+HeapDumpOnOutOfMemoryError -XX:+DisableExplicitGC -Des.cluster.name=87f95151-7003-42fc-a76a-bc101723dfc0 -Des.node.name=16fcc224-849a-405f-bfaf-8321387b7294 -Des.discovery.zen.ping.unicast.hosts=10.242.13.77,10.242.13.78 -Delasticsearch -Des.pidfile=/var/run/elasticsearch/elasticsearch.pid -Des.path.home=/usr/local/nagioslogserver/elasticsearch -cp :/usr/local/nagioslogserver/elasticsearch/lib/elasticsearch-1.3.2.jar:/usr/local/nagioslogserver/elasticsearch/lib/*:/usr/local/nagioslogserver/elasticsearch/lib/sigar/* -Des.default.path.home=/usr/local/nagioslogserver/elasticsearch -Des.default.path.logs=/var/log/elasticsearch -Des.default.path.data=/usr/local/nagioslogserver/elasticsearch/data -Des.default.path.work=/usr/local/nagioslogserver/tmp/elasticsearch -Des.default.path.conf=/usr/local/nagioslogserver/elasticsearch/config org.elasticsearch.bootstrap.Elasticsearch
root 1418 0.0 0.0 83080 852 ? Ss Mar05 0:02 sendmail: accepting connections
smmsp 1426 0.0 0.0 78664 768 ? Ss Mar05 0:00 sendmail: Queue runner@01:00:00 for /var/spool/clientmqueue
root 1445 0.0 0.0 238316 6184 ? Ss Mar05 0:03 /usr/sbin/httpd
root 1453 0.0 0.0 117296 748 ? Ss Mar05 0:01 crond
root 1464 0.0 0.0 131172 1116 ? SN Mar05 0:00 runuser -s /bin/sh -c exec /usr/local/nagioslogserver/logstash/bin/logstash agent -f /usr/local/nagioslogserver/logstash/etc/conf.d -l /var/log/logstash/logstash.log -w 4 root
apache 1466 0.0 0.1 245188 12336 ? S Mar05 0:12 /usr/sbin/httpd
apache 1467 0.0 0.1 245692 12724 ? S Mar05 0:12 /usr/sbin/httpd
apache 1468 0.0 0.1 244924 12412 ? S Mar05 0:12 /usr/sbin/httpd
apache 1469 0.0 0.1 245188 11888 ? S Mar05 0:12 /usr/sbin/httpd
apache 1470 0.0 0.1 244032 12564 ? S Mar05 0:12 /usr/sbin/httpd
apache 1471 0.0 0.5 289660 53244 ? S Mar05 0:13 /usr/sbin/httpd
apache 1472 0.0 0.1 244964 11792 ? S Mar05 0:13 /usr/sbin/httpd
apache 1473 0.0 0.1 243108 11200 ? S Mar05 0:12 /usr/sbin/httpd
root 1474 25.3 2.8 4081068 288104 ? SNl Mar05 288:54 /usr/bin/java -Djava.io.tmpdir=/usr/local/nagioslogserver/tmp -Xmx500m -XX:+UseParNewGC -XX:+UseConcMarkSweepGC -Djava.awt.headless=true -XX:CMSInitiatingOccupancyFraction=75 -XX:+UseCMSInitiatingOccupancyOnly -jar /usr/local/nagioslogserver/logstash/vendor/jar/jruby-complete-1.7.11.jar -I/usr/local/nagioslogserver/logstash/lib /usr/local/nagioslogserver/logstash/lib/logstash/runner.rb agent -f /usr/local/nagioslogserver/logstash/etc/conf.d -l /var/log/logstash/logstash.log -w 4
root 1543 0.0 0.0 4064 508 tty1 Ss+ Mar05 0:00 /sbin/mingetty /dev/tty1
root 1545 0.0 0.0 4064 508 tty2 Ss+ Mar05 0:00 /sbin/mingetty /dev/tty2
root 1547 0.0 0.0 4064 508 tty3 Ss+ Mar05 0:00 /sbin/mingetty /dev/tty3
root 1548 0.0 0.0 11040 328 ? S< Mar05 0:00 /sbin/udevd -d
root 1550 0.0 0.0 4064 508 tty4 Ss+ Mar05 0:00 /sbin/mingetty /dev/tty4
root 1552 0.0 0.0 4064 508 tty5 Ss+ Mar05 0:00 /sbin/mingetty /dev/tty5
root 1554 0.0 0.0 4064 508 tty6 Ss+ Mar05 0:00 /sbin/mingetty /dev/tty6
root 1849 0.0 0.0 94224 3404 ? Ss Mar05 0:08 sshd: root@pts/0
root 1851 0.0 0.0 108304 1844 pts/0 Ss Mar05 0:00 -bash
root 6810 0.0 0.0 0 0 ? S Mar05 0:00 [cifsiod]
root 6862 0.0 0.0 0 0 ? S Mar05 0:28 [cifsd]
apache 11583 0.0 0.1 246044 13640 ? S Mar05 0:10 /usr/sbin/httpd
root 21426 0.0 0.0 0 0 ? S< 10:32 0:00 [kslowd005]
root 21431 0.0 0.0 0 0 ? S< 10:33 0:00 [kslowd004]
root 23158 0.0 0.0 136064 1312 ? S 10:53 0:00 CROND
root 23159 0.0 0.0 136064 1316 ? S 10:53 0:00 CROND
nagios 23160 0.0 0.0 106060 1272 ? Ss 10:53 0:00 /bin/sh -c /usr/bin/php -q /var/www/html/nagioslogserver/www/index.php poller > /usr/local/nagioslogserver/var/poller.log 2>&1
nagios 23161 0.0 0.0 106060 1268 ? Ss 10:53 0:00 /bin/sh -c /usr/bin/php -q /var/www/html/nagioslogserver/www/index.php jobs > /usr/local/nagioslogserver/var/jobs.log 2>&1
nagios 23162 0.1 0.1 216520 12168 ? S 10:53 0:00 /usr/bin/php -q /var/www/html/nagioslogserver/www/index.php jobs
nagios 23163 0.1 0.1 216008 11512 ? S 10:53 0:00 /usr/bin/php -q /var/www/html/nagioslogserver/www/index.php poller
root 23202 0.0 0.0 110236 1168 pts/0 R+ 10:53 0:00 ps aux
Re: Log Server Slow
Looks good to me. I spoke with a developer yesterday about this thread; his recommendation is as follows:
Code: Select all
vi /etc/sysconfig/elasticsearch
uncomment ES_HEAP_SIZE=1g - replace '1g' with half of your total RAM amount (in that node)
uncomment "MAX_LOCKED_MEMORY" and change the value to "MAX_LOCKED_MEMORY=unlimited"
service elasticsearch restart
This will ensure that mlockall will be persistent through reboots, and you do not have to run the 'ulimit' command for this to work.
MAX_LOCKED_MEMORY=unlimited is going to be the default setting moving forward. Thanks for bringing this up!
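For reference, the two "uncomment and set" steps can also be scripted. The following is only a sketch: it runs against a demo copy of the file created on the spot, the commented-out default values are placeholders, and the 5g heap matches the 10 GB nodes in this thread. On a real node you would point CONF at /etc/sysconfig/elasticsearch (after backing it up) and choose your own heap size.

```shell
# Sketch: apply the recommended edits non-interactively. Runs against a
# demo copy created here; the commented defaults below are placeholders,
# not necessarily your file's stock values.
CONF=./elasticsearch.sysconfig.demo
printf '%s\n' '#ES_HEAP_SIZE=2g' '#MAX_LOCKED_MEMORY=64kb' > "$CONF"

# Uncomment and set each variable (GNU sed in-place edit).
sed -i 's/^#*ES_HEAP_SIZE=.*/ES_HEAP_SIZE=5g/' "$CONF"
sed -i 's/^#*MAX_LOCKED_MEMORY=.*/MAX_LOCKED_MEMORY=unlimited/' "$CONF"

# Show the resulting settings.
grep -E '^(ES_HEAP_SIZE|MAX_LOCKED_MEMORY)=' "$CONF"
# followed by: service elasticsearch restart
```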
Re: Log Server Slow
Thanks again. I have made the changes and will watch for a few days.
mlockall is now showing true. Cheers.
scottwilkerson - DevOps Engineer - Posts: 19396 - Joined: Tue Nov 15, 2011 3:11 pm - Location: Nagios Enterprises
Re: Log Server Slow
Awesome. Let us know if anything else pops up.
Re: Log Server Slow
Just reporting in. The changes made, along with proper maintenance and backup schedules in place, resolved my issue. We've got over 42 million hits in 24 hours and it is running very smoothly. Thanks!