Log Server Slow
We are in the middle of sizing our log server cluster. Currently we are doing about 17 million hits per 24 hours, and our log server web interface is almost unusable. I didn't set up the backup repository until today, but do the backup/maintenance jobs run as soon as the repository and backup/maintenance settings are configured? I suspect that with the size of the current data and nothing being closed, it is just struggling.
We have a two node cluster....
1. How do I keep an eye on the maintenance task and know what it is doing?
2. I am trying to determine a proper configuration for our usage. It seems that over time, the RAM gets consumed. I started with 4GB, then went to 8GB, and have now settled at 10GB.
3. How much CPU should I give the machine? I tested with 2 and 4, which doesn't seem to be sufficient.
4. What can I do from here to make NLS usable? I am just no longer able to navigate around the web interface.
Thanks in advance. We are looking for some guidance on making sure we size the cluster right.
Re: Log Server Slow
Q: Do the backup/maintenance jobs run as soon as the repository and backup/maintenance settings are configured?
A: I ran a test on my end, and when I configured my backup settings the backup did not start. The backup did start when I ran the 'backup_maintenance' Command from the 'Command Subsystem' menu. It looks like a backup has to either be initiated manually from the Command Subsystem, or it has to reach its daily backup time.
Q. How do I keep an eye on the maintenance task and know what it is doing?
A: To my knowledge there is no way to actively 'watch' the maintenance command. Maintenance is performed through an Elasticsearch extension called 'Curator'.
Q. I am trying to determine a proper configuration for our usage. It seems that over time, the RAM gets consumed. I started with 4GB, then went to 8GB, and have now settled at 10GB.
A: It's very hard to give hardware recommendations without understanding all of the variables in your network. In general, it is recommended that half of the RAM of the server is allocated to your Elasticsearch Java heap. You can configure this by uncommenting the '#ES_HEAP_SIZE=2g' variable and setting it accordingly, followed by restarting Elasticsearch:
Code: Select all
vi /etc/sysconfig/elasticsearch
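To make the half-of-RAM guideline concrete, here is a minimal sketch of the arithmetic. The MemTotal figure (in kB) is the one from the `top` output posted further down this thread; on a live node you could read it with `grep MemTotal /proc/meminfo`.

```shell
# Sketch: turn the "half of total RAM" guideline into a concrete
# ES_HEAP_SIZE value. Sample MemTotal (kB) matches the ~10 GB node
# discussed in this thread.
mem_total_kb=10261676

# Half the RAM, converted to whole gigabytes (rounded to nearest).
heap_gb=$(( (mem_total_kb / 2048 + 512) / 1024 ))

echo "ES_HEAP_SIZE=${heap_gb}g"   # value to set in /etc/sysconfig/elasticsearch
```

On a 10 GB node this works out to 5g, which matches the -Xms5g/-Xmx5g flags visible in the ps output later in the thread.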
Q. How much CPU should I give the machine? I tested with 2 and 4, which doesn't seem to be sufficient.
A: One thing I will point out, performance-wise, is that there is only a marginal benefit to 2 nodes over a single node, as all data is indexed on both instances; the real load-reduction benefit comes with 3+ nodes, since indexing will always only happen on 2 instances. If you are not able to add a third node, I recommend increasing the CPU and RAM amounts until you see a benefit.
Q. What can I do from here to make NLS usable? I am just no longer able to navigate around the web interface.
A: I assume that the GUI is unavailable due to high resource load. Can you post the output of the following please:
Code: Select all
top
free -m
df -h
ps aux
Re: Log Server Slow
Thanks for the detailed reply to my questions.
Q: Do the backup/maintenance jobs run as soon as the repository and backup/maintenance settings are configured?
I've left this for about a day, but it did not execute. I suspect that the nagios user does not have permission to the CIFS share. I am not sure if there are logs to check further?
After setting the heap size it is working much better. I was also reviewing Elasticsearch's configuration and found their article on the mlockall memory setting (http://www.elasticsearch.org/guide/en/e ... ation.html). It looks like mlockall is set to true in the config:
Code: Select all
bootstrap.mlockall: true
However, when checking the value, it is returning false. Does this have any additional benefits or impact? It is meant to prevent Elasticsearch from swapping, but I am not familiar enough with Elasticsearch to know.
Code: Select all
"cluster_name" : "87f95151-7003-42fc-a76a-bc101723dfc0",
"nodes" : {
"_aEDEqC3Q2-_eMBdzI-RJA" : {
"name" : "51638f3e-29a3-4ead-8aa6-ee0811017f01",
"transport_address" : "inet[/10.242.13.78:9300]",
"host" : "localhost.localdomain",
"ip" : "127.0.0.1",
"version" : "1.3.2",
"build" : "dee175d",
"http_address" : "inet[localhost/127.0.0.1:9200]",
"attributes" : {
"max_local_storage_nodes" : "1"
},
"process" : {
"refresh_interval_in_millis" : 1000,
"id" : 1390,
"max_file_descriptors" : 65535,
"mlockall" : false
}
},
"jeUh0fIzTzeUdpqlPY9puQ" : {
"name" : "16fcc224-849a-405f-bfaf-8321387b7294",
"transport_address" : "inet[/10.242.13.77:9300]",
"host" : "nls01",
"ip" : "127.0.0.1",
"version" : "1.3.2",
"build" : "dee175d",
"http_address" : "inet[localhost/127.0.0.1:9200]",
"attributes" : {
"max_local_storage_nodes" : "1"
},
"process" : {
"refresh_interval_in_millis" : 1000,
"id" : 1451,
"max_file_descriptors" : 65535,
"mlockall" : false
For now, I've closed off some indices manually and configured the heap size, seems to be running much smoother now.
Re: Log Server Slow
I've left this for about a day, but it did not execute. I suspect that the nagios user does not have permission to the CIFS share. I am not sure if there are logs to check further?
There are no logs that I am aware of - can you run the following command and return the output please:
Code: Select all
ll -d /examplebackup/
Does this have any additional benefits or impact? This is to prevent Elasticsearch from swapping. I am not familiar enough with Elasticsearch to know. Looks like mlock is set to true in the config, however when checking the value, it is returning false.
Just FYI, I got it to turn on properly by performing the following:
Code: Select all
service elasticsearch stop
ulimit -l unlimited
service elasticsearch start
curl http://localhost:9200/_nodes/process?pretty
"mlockall" : trueIt looks like it wasn't on because the nagios user didn't have perms to lock memory. Feel free to turn it on if you'd like, but please be careful while doing so.
Re: Log Server Slow
With the heap memory settings and the maintenance job, the server seems to run better now. Looking at the memory usage over time, it is still swapping, but I am not sure if this is a concern.
Looks like the backup jobs are now running and the repository is showing the backups.
Below are some stats in case you see something odd with memory usage:
Code: Select all
Tasks: 161 total, 1 running, 160 sleeping, 0 stopped, 0 zombie
Cpu(s): 2.5%us, 1.1%sy, 3.8%ni, 92.6%id, 0.0%wa, 0.0%hi, 0.1%si, 0.0%st
Mem: 10261676k total, 10106288k used, 155388k free, 88616k buffers
Swap: 262136k total, 82528k used, 179608k free, 4873052k cached
PID USER PR NI VIRT RES SHR S %CPU %MEM TIME+ COMMAND
1474 root 39 19 3985m 281m 5836 S 34.6 2.8 288:27.01 java
1402 nagios 20 0 25.0g 4.3g 206m S 16.3 44.0 253:35.73 java
23068 root 20 0 15028 1352 988 R 0.3 0.0 0:00.01 top
1 root 20 0 19232 1304 1112 S 0.0 0.0 0:01.44 init
2 root 20 0 0 0 0 S 0.0 0.0 0:00.01 kthreadd
Code: Select all
total used free shared buffers cached
Mem: 10021 9892 129 0 86 4781
-/+ buffers/cache: 5024 4996
Swap: 255 80 175
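One way to read that free -m output: the first row counts buffers and page cache as "used", while the "-/+ buffers/cache" row subtracts them, which is the number that matters for memory pressure. A sketch of that arithmetic using the figures above (values in MB, so results can differ from the printed row by a megabyte of rounding):

```shell
# Recompute the "-/+ buffers/cache" row from the first row of free -m.
# Figures are the ones posted above, in MB.
mem_total=10021
mem_used=9892
buffers=86
cached=4781

real_used=$(( mem_used - buffers - cached ))  # memory processes actually hold
real_free=$(( mem_total - real_used ))        # free + reclaimable cache

echo "real used: ${real_used} MB, effectively free: ${real_free} MB"
```

With roughly 4.8 GB reclaimable as cache and only about 80 MB of swap in use, the small amount of swapping shown here is generally not alarming by itself, though sustained swap growth would be.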
Code: Select all
Filesystem Size Used Avail Use% Mounted on
rootfs 133G 52G 80G 40% /
devtmpfs 4.9G 148K 4.9G 1% /dev
tmpfs 4.9G 0 4.9G 0% /dev/shm
/dev/sda1 133G 52G 80G 40% /
//10.242.13.110/repo/
100G 32G 69G 32% /mnt/reposhare
Code: Select all
USER PID %CPU %MEM VSZ RSS TTY STAT START TIME COMMAND
root 1 0.0 0.0 19232 1304 ? Ss Mar05 0:01 /sbin/init
root 2 0.0 0.0 0 0 ? S Mar05 0:00 [kthreadd]
root 3 0.0 0.0 0 0 ? S Mar05 0:00 [migration/0]
root 4 0.0 0.0 0 0 ? S Mar05 0:06 [ksoftirqd/0]
root 5 0.0 0.0 0 0 ? S Mar05 0:00 [migration/0]
root 6 0.0 0.0 0 0 ? S Mar05 0:00 [watchdog/0]
root 7 0.0 0.0 0 0 ? S Mar05 0:00 [migration/1]
root 8 0.0 0.0 0 0 ? S Mar05 0:00 [migration/1]
root 9 0.0 0.0 0 0 ? S Mar05 0:03 [ksoftirqd/1]
root 10 0.0 0.0 0 0 ? S Mar05 0:00 [watchdog/1]
root 11 0.0 0.0 0 0 ? S Mar05 0:00 [migration/2]
root 12 0.0 0.0 0 0 ? S Mar05 0:00 [migration/2]
root 13 0.0 0.0 0 0 ? S Mar05 0:04 [ksoftirqd/2]
root 14 0.0 0.0 0 0 ? S Mar05 0:00 [watchdog/2]
root 15 0.0 0.0 0 0 ? S Mar05 0:00 [migration/3]
root 16 0.0 0.0 0 0 ? S Mar05 0:00 [migration/3]
root 17 0.0 0.0 0 0 ? S Mar05 0:03 [ksoftirqd/3]
root 18 0.0 0.0 0 0 ? S Mar05 0:00 [watchdog/3]
root 19 0.0 0.0 0 0 ? S Mar05 0:00 [migration/4]
root 20 0.0 0.0 0 0 ? S Mar05 0:00 [migration/4]
root 21 0.0 0.0 0 0 ? S Mar05 0:03 [ksoftirqd/4]
root 22 0.0 0.0 0 0 ? S Mar05 0:00 [watchdog/4]
root 23 0.0 0.0 0 0 ? S Mar05 0:00 [migration/5]
root 24 0.0 0.0 0 0 ? S Mar05 0:00 [migration/5]
root 25 0.0 0.0 0 0 ? S Mar05 0:03 [ksoftirqd/5]
root 26 0.0 0.0 0 0 ? S Mar05 0:00 [watchdog/5]
root 27 0.0 0.0 0 0 ? S Mar05 0:04 [events/0]
root 28 0.0 0.0 0 0 ? S Mar05 0:04 [events/1]
root 29 0.0 0.0 0 0 ? S Mar05 0:28 [events/2]
root 30 0.0 0.0 0 0 ? S Mar05 0:04 [events/3]
root 31 0.0 0.0 0 0 ? S Mar05 0:04 [events/4]
root 32 0.0 0.0 0 0 ? S Mar05 0:07 [events/5]
root 33 0.0 0.0 0 0 ? S Mar05 0:00 [cgroup]
root 34 0.0 0.0 0 0 ? S Mar05 0:00 [khelper]
root 35 0.0 0.0 0 0 ? S Mar05 0:00 [netns]
root 36 0.0 0.0 0 0 ? S Mar05 0:00 [async/mgr]
root 37 0.0 0.0 0 0 ? S Mar05 0:00 [pm]
root 38 0.0 0.0 0 0 ? S Mar05 0:00 [sync_supers]
root 39 0.0 0.0 0 0 ? S Mar05 0:26 [bdi-default]
root 40 0.0 0.0 0 0 ? S Mar05 0:00 [kintegrityd/0]
root 41 0.0 0.0 0 0 ? S Mar05 0:00 [kintegrityd/1]
root 42 0.0 0.0 0 0 ? S Mar05 0:00 [kintegrityd/2]
root 43 0.0 0.0 0 0 ? S Mar05 0:00 [kintegrityd/3]
root 44 0.0 0.0 0 0 ? S Mar05 0:00 [kintegrityd/4]
root 45 0.0 0.0 0 0 ? S Mar05 0:00 [kintegrityd/5]
root 46 0.0 0.0 0 0 ? S Mar05 0:01 [kblockd/0]
root 47 0.0 0.0 0 0 ? S Mar05 0:01 [kblockd/1]
root 48 0.0 0.0 0 0 ? S Mar05 0:01 [kblockd/2]
root 49 0.0 0.0 0 0 ? S Mar05 0:01 [kblockd/3]
root 50 0.0 0.0 0 0 ? S Mar05 0:01 [kblockd/4]
root 51 0.0 0.0 0 0 ? S Mar05 0:01 [kblockd/5]
root 52 0.0 0.0 0 0 ? S Mar05 0:00 [kacpid]
root 53 0.0 0.0 0 0 ? S Mar05 0:00 [kacpi_notify]
root 54 0.0 0.0 0 0 ? S Mar05 0:00 [kacpi_hotplug]
root 55 0.0 0.0 0 0 ? S Mar05 0:00 [ata/0]
root 56 0.0 0.0 0 0 ? S Mar05 0:00 [ata/1]
root 57 0.0 0.0 0 0 ? S Mar05 0:00 [ata/2]
root 58 0.0 0.0 0 0 ? S Mar05 0:00 [ata/3]
root 59 0.0 0.0 0 0 ? S Mar05 0:00 [ata/4]
root 60 0.0 0.0 0 0 ? S Mar05 0:00 [ata/5]
root 61 0.0 0.0 0 0 ? S Mar05 0:00 [ata_aux]
root 62 0.0 0.0 0 0 ? S Mar05 0:00 [ksuspend_usbd]
root 63 0.0 0.0 0 0 ? S Mar05 0:00 [khubd]
root 64 0.0 0.0 0 0 ? S Mar05 0:00 [kseriod]
root 65 0.0 0.0 0 0 ? S Mar05 0:00 [md/0]
root 66 0.0 0.0 0 0 ? S Mar05 0:00 [md/1]
root 67 0.0 0.0 0 0 ? S Mar05 0:00 [md/2]
root 68 0.0 0.0 0 0 ? S Mar05 0:00 [md/3]
root 69 0.0 0.0 0 0 ? S Mar05 0:00 [md/4]
root 70 0.0 0.0 0 0 ? S Mar05 0:00 [md/5]
root 71 0.0 0.0 0 0 ? S Mar05 0:00 [md_misc/0]
root 72 0.0 0.0 0 0 ? S Mar05 0:00 [md_misc/1]
root 73 0.0 0.0 0 0 ? S Mar05 0:00 [md_misc/2]
root 74 0.0 0.0 0 0 ? S Mar05 0:00 [md_misc/3]
root 75 0.0 0.0 0 0 ? S Mar05 0:00 [md_misc/4]
root 76 0.0 0.0 0 0 ? S Mar05 0:00 [md_misc/5]
root 77 0.0 0.0 0 0 ? S Mar05 0:00 [khungtaskd]
root 78 0.0 0.0 0 0 ? S Mar05 0:28 [kswapd0]
root 79 0.0 0.0 0 0 ? SN Mar05 0:00 [ksmd]
root 80 0.0 0.0 0 0 ? SN Mar05 0:04 [khugepaged]
root 81 0.0 0.0 0 0 ? S Mar05 0:00 [aio/0]
root 82 0.0 0.0 0 0 ? S Mar05 0:00 [aio/1]
root 83 0.0 0.0 0 0 ? S Mar05 0:00 [aio/2]
root 84 0.0 0.0 0 0 ? S Mar05 0:00 [aio/3]
root 85 0.0 0.0 0 0 ? S Mar05 0:00 [aio/4]
root 86 0.0 0.0 0 0 ? S Mar05 0:00 [aio/5]
root 87 0.0 0.0 0 0 ? S Mar05 0:00 [crypto/0]
root 88 0.0 0.0 0 0 ? S Mar05 0:00 [crypto/1]
root 89 0.0 0.0 0 0 ? S Mar05 0:00 [crypto/2]
root 90 0.0 0.0 0 0 ? S Mar05 0:00 [crypto/3]
root 91 0.0 0.0 0 0 ? S Mar05 0:00 [crypto/4]
root 92 0.0 0.0 0 0 ? S Mar05 0:00 [crypto/5]
root 97 0.0 0.0 0 0 ? S Mar05 0:00 [kthrotld/0]
root 98 0.0 0.0 0 0 ? S Mar05 0:00 [kthrotld/1]
root 99 0.0 0.0 0 0 ? S Mar05 0:00 [kthrotld/2]
root 100 0.0 0.0 0 0 ? S Mar05 0:00 [kthrotld/3]
root 101 0.0 0.0 0 0 ? S Mar05 0:00 [kthrotld/4]
root 102 0.0 0.0 0 0 ? S Mar05 0:00 [kthrotld/5]
root 103 0.0 0.0 0 0 ? S Mar05 0:00 [pciehpd]
root 105 0.0 0.0 0 0 ? S Mar05 0:00 [kpsmoused]
root 106 0.0 0.0 0 0 ? S Mar05 0:00 [usbhid_resumer]
root 185 0.0 0.0 0 0 ? S Mar05 0:00 [scsi_eh_0]
root 186 0.0 0.0 0 0 ? S Mar05 0:00 [scsi_eh_1]
root 225 0.0 0.0 0 0 ? S Mar05 0:01 [mpt_poll_0]
root 226 0.0 0.0 0 0 ? S Mar05 0:00 [mpt/0]
root 227 0.0 0.0 0 0 ? S Mar05 0:00 [scsi_eh_2]
root 412 0.0 0.0 0 0 ? S Mar05 0:22 [jbd2/sda1-8]
root 413 0.0 0.0 0 0 ? S Mar05 0:00 [ext4-dio-unwrit]
root 414 0.0 0.0 0 0 ? S Mar05 0:00 [ext4-dio-unwrit]
root 415 0.0 0.0 0 0 ? S Mar05 0:00 [ext4-dio-unwrit]
root 416 0.0 0.0 0 0 ? S Mar05 0:00 [ext4-dio-unwrit]
root 417 0.0 0.0 0 0 ? S Mar05 0:00 [ext4-dio-unwrit]
root 418 0.0 0.0 0 0 ? S Mar05 0:00 [ext4-dio-unwrit]
root 491 0.0 0.0 11044 340 ? S<s Mar05 0:00 /sbin/udevd -d
root 667 0.0 0.0 0 0 ? S Mar05 0:01 [vmmemctl]
root 836 0.0 0.0 11040 364 ? S< Mar05 0:00 /sbin/udevd -d
root 838 0.0 0.0 0 0 ? S Mar05 0:00 [kstriped]
root 871 0.0 0.0 0 0 ? S Mar05 0:34 [flush-8:0]
root 885 0.0 0.0 0 0 ? S Mar05 0:01 [kauditd]
root 1228 0.1 0.0 179068 2444 ? S Mar05 1:10 /usr/sbin/vmtoolsd
root 1314 0.0 0.0 93176 712 ? S<sl Mar05 0:05 auditd
root 1330 0.0 0.0 333460 4796 ? Sl Mar05 0:04 /sbin/rsyslogd -i /var/run/syslogd.pid -c 5
root 1365 0.0 0.0 66628 612 ? Ss Mar05 0:00 /usr/sbin/sshd
root 1373 0.0 0.0 22180 864 ? Ss Mar05 0:00 xinetd -stayalive -pidfile /var/run/xinetd.pid
ntp 1381 0.0 0.0 30732 1540 ? Ss Mar05 0:00 ntpd -u ntp:ntp -p /var/run/ntpd.pid -g
nagios 1402 22.2 43.9 26211812 4512072 ? Sl Mar05 254:02 /usr/bin/java -Xms5g -Xmx5g -Xss256k -Djava.awt.headless=true -XX:+UseParNewGC -XX:+UseConcMarkSweepGC -XX:CMSInitiatingOccupancyFraction=75 -XX:+UseCMSInitiatingOccupancyOnly -XX:+HeapDumpOnOutOfMemoryError -XX:+DisableExplicitGC -Des.cluster.name=87f95151-7003-42fc-a76a-bc101723dfc0 -Des.node.name=16fcc224-849a-405f-bfaf-8321387b7294 -Des.discovery.zen.ping.unicast.hosts=10.242.13.77,10.242.13.78 -Delasticsearch -Des.pidfile=/var/run/elasticsearch/elasticsearch.pid -Des.path.home=/usr/local/nagioslogserver/elasticsearch -cp :/usr/local/nagioslogserver/elasticsearch/lib/elasticsearch-1.3.2.jar:/usr/local/nagioslogserver/elasticsearch/lib/*:/usr/local/nagioslogserver/elasticsearch/lib/sigar/* -Des.default.path.home=/usr/local/nagioslogserver/elasticsearch -Des.default.path.logs=/var/log/elasticsearch -Des.default.path.data=/usr/local/nagioslogserver/elasticsearch/data -Des.default.path.work=/usr/local/nagioslogserver/tmp/elasticsearch -Des.default.path.conf=/usr/local/nagioslogserver/elasticsearch/config org.elasticsearch.bootstrap.Elasticsearch
root 1418 0.0 0.0 83080 852 ? Ss Mar05 0:02 sendmail: accepting connections
smmsp 1426 0.0 0.0 78664 768 ? Ss Mar05 0:00 sendmail: Queue runner@01:00:00 for /var/spool/clientmqueue
root 1445 0.0 0.0 238316 6184 ? Ss Mar05 0:03 /usr/sbin/httpd
root 1453 0.0 0.0 117296 748 ? Ss Mar05 0:01 crond
root 1464 0.0 0.0 131172 1116 ? SN Mar05 0:00 runuser -s /bin/sh -c exec /usr/local/nagioslogserver/logstash/bin/logstash agent -f /usr/local/nagioslogserver/logstash/etc/conf.d -l /var/log/logstash/logstash.log -w 4 root
apache 1466 0.0 0.1 245188 12336 ? S Mar05 0:12 /usr/sbin/httpd
apache 1467 0.0 0.1 245692 12724 ? S Mar05 0:12 /usr/sbin/httpd
apache 1468 0.0 0.1 244924 12412 ? S Mar05 0:12 /usr/sbin/httpd
apache 1469 0.0 0.1 245188 11888 ? S Mar05 0:12 /usr/sbin/httpd
apache 1470 0.0 0.1 244032 12564 ? S Mar05 0:12 /usr/sbin/httpd
apache 1471 0.0 0.5 289660 53244 ? S Mar05 0:13 /usr/sbin/httpd
apache 1472 0.0 0.1 244964 11792 ? S Mar05 0:13 /usr/sbin/httpd
apache 1473 0.0 0.1 243108 11200 ? S Mar05 0:12 /usr/sbin/httpd
root 1474 25.3 2.8 4081068 288104 ? SNl Mar05 288:54 /usr/bin/java -Djava.io.tmpdir=/usr/local/nagioslogserver/tmp -Xmx500m -XX:+UseParNewGC -XX:+UseConcMarkSweepGC -Djava.awt.headless=true -XX:CMSInitiatingOccupancyFraction=75 -XX:+UseCMSInitiatingOccupancyOnly -jar /usr/local/nagioslogserver/logstash/vendor/jar/jruby-complete-1.7.11.jar -I/usr/local/nagioslogserver/logstash/lib /usr/local/nagioslogserver/logstash/lib/logstash/runner.rb agent -f /usr/local/nagioslogserver/logstash/etc/conf.d -l /var/log/logstash/logstash.log -w 4
root 1543 0.0 0.0 4064 508 tty1 Ss+ Mar05 0:00 /sbin/mingetty /dev/tty1
root 1545 0.0 0.0 4064 508 tty2 Ss+ Mar05 0:00 /sbin/mingetty /dev/tty2
root 1547 0.0 0.0 4064 508 tty3 Ss+ Mar05 0:00 /sbin/mingetty /dev/tty3
root 1548 0.0 0.0 11040 328 ? S< Mar05 0:00 /sbin/udevd -d
root 1550 0.0 0.0 4064 508 tty4 Ss+ Mar05 0:00 /sbin/mingetty /dev/tty4
root 1552 0.0 0.0 4064 508 tty5 Ss+ Mar05 0:00 /sbin/mingetty /dev/tty5
root 1554 0.0 0.0 4064 508 tty6 Ss+ Mar05 0:00 /sbin/mingetty /dev/tty6
root 1849 0.0 0.0 94224 3404 ? Ss Mar05 0:08 sshd: root@pts/0
root 1851 0.0 0.0 108304 1844 pts/0 Ss Mar05 0:00 -bash
root 6810 0.0 0.0 0 0 ? S Mar05 0:00 [cifsiod]
root 6862 0.0 0.0 0 0 ? S Mar05 0:28 [cifsd]
apache 11583 0.0 0.1 246044 13640 ? S Mar05 0:10 /usr/sbin/httpd
root 21426 0.0 0.0 0 0 ? S< 10:32 0:00 [kslowd005]
root 21431 0.0 0.0 0 0 ? S< 10:33 0:00 [kslowd004]
root 23158 0.0 0.0 136064 1312 ? S 10:53 0:00 CROND
root 23159 0.0 0.0 136064 1316 ? S 10:53 0:00 CROND
nagios 23160 0.0 0.0 106060 1272 ? Ss 10:53 0:00 /bin/sh -c /usr/bin/php -q /var/www/html/nagioslogserver/www/index.php poller > /usr/local/nagioslogserver/var/poller.log 2>&1
nagios 23161 0.0 0.0 106060 1268 ? Ss 10:53 0:00 /bin/sh -c /usr/bin/php -q /var/www/html/nagioslogserver/www/index.php jobs > /usr/local/nagioslogserver/var/jobs.log 2>&1
nagios 23162 0.1 0.1 216520 12168 ? S 10:53 0:00 /usr/bin/php -q /var/www/html/nagioslogserver/www/index.php jobs
nagios 23163 0.1 0.1 216008 11512 ? S 10:53 0:00 /usr/bin/php -q /var/www/html/nagioslogserver/www/index.php poller
root 23202 0.0 0.0 110236 1168 pts/0 R+ 10:53 0:00 ps aux
Re: Log Server Slow
Looks good to me. I spoke with a developer yesterday about this thread; his recommendation is as follows:
Code: Select all
vi /etc/sysconfig/elasticsearch
uncomment ES_HEAP_SIZE=1g - replace '1g' with half of your total RAM amount (in that node)
uncomment "MAX_LOCKED_MEMORY" and change the value to "MAX_LOCKED_MEMORY=unlimited"
service elasticsearch restart
This will ensure that mlockall will be persistent through reboots, and you do not have to run the 'ulimit' command for this to work.
MAX_LOCKED_MEMORY=unlimited is going to be the default setting moving forward. Thanks for bringing this up!
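For reference, the two "uncomment and set" steps can also be scripted. The following is only a sketch: it runs against a demo copy of the file created on the spot, the commented-out default values are placeholders, and the 5g heap matches the 10 GB nodes in this thread. On a real node you would point CONF at /etc/sysconfig/elasticsearch (after backing it up) and choose your own heap size.

```shell
# Sketch: apply the recommended edits non-interactively. Runs against a
# demo copy created here; the commented defaults below are placeholders,
# not necessarily your file's stock values.
CONF=./elasticsearch.sysconfig.demo
printf '%s\n' '#ES_HEAP_SIZE=2g' '#MAX_LOCKED_MEMORY=64kb' > "$CONF"

# Uncomment and set each variable (GNU sed in-place edit).
sed -i 's/^#*ES_HEAP_SIZE=.*/ES_HEAP_SIZE=5g/' "$CONF"
sed -i 's/^#*MAX_LOCKED_MEMORY=.*/MAX_LOCKED_MEMORY=unlimited/' "$CONF"

# Show the resulting settings.
grep -E '^(ES_HEAP_SIZE|MAX_LOCKED_MEMORY)=' "$CONF"
# followed by: service elasticsearch restart
```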
Re: Log Server Slow
Thanks again. I have made the changes and will watch for a few days.
mlockall is now showing true. Cheers.
scottwilkerson - DevOps Engineer - Posts: 19396 - Joined: Tue Nov 15, 2011 3:11 pm - Location: Nagios Enterprises
Re: Log Server Slow
Awesome. Let us know if anything else pops up.
Re: Log Server Slow
Just reporting in. The changes made, along with proper maintenance and backup schedules in place, resolved my issue. We've got over 42 million hits in 24 hours and it is running very smoothly. Thanks!