Multiple (40+) poller cron jobs running
-
GhostRider2110
- Posts: 193
- Joined: Thu Oct 30, 2014 8:04 am
- Location: Indiana
- Contact:
Re: Multiple (40+) poller cron jobs running
PM with file attached sent. Thanks
See-ya
Mitch
Re: Multiple (40+) poller cron jobs running
Mitch,
I should have included this in my previous post - could you also run the following:
pending tasks:
Code: Select all
curl 'localhost:9200/_cat/pending_tasks?v'
see recovery:
Code: Select all
curl -XGET 'localhost:9200/_cat/recovery?v'
A tail of the jobs logs (a few minutes):
Code: Select all
tail -f /usr/local/nagioslogserver/var/jobs.log
check knapsack state:
Code: Select all
curl -XPOST 'http://localhost:9200/_export/state'
Best,
Jesse
-
GhostRider2110
Re: Multiple (40+) poller cron jobs running
Pending Tasks:
Code: Select all
insertOrder timeInQueue priority source
recovery
(file attached)
Code: Select all
[root@IGAnagioslog ~]# tail -f /usr/local/nagioslogserver/var/jobs.log
Processed 0 node jobs.
Processed 0 global jobs.
tail: /usr/local/nagioslogserver/var/jobs.log: file truncated
Processed 0 node jobs.
Processed 0 global jobs.
tail: /usr/local/nagioslogserver/var/jobs.log: file truncated
Processed 0 node jobs.
Processed 0 global jobs.
tail: /usr/local/nagioslogserver/var/jobs.log: file truncated
Processed 0 node jobs.
Processed 0 global jobs.
tail: /usr/local/nagioslogserver/var/jobs.log: file truncated
Processed 0 node jobs.
Processed 0 global jobs.
tail: /usr/local/nagioslogserver/var/jobs.log: file truncated
Processed 0 node jobs.
Processed 0 global jobs.
tail: /usr/local/nagioslogserver/var/jobs.log: file truncated
Processed 0 node jobs.
Processed 0 global jobs.
tail: /usr/local/nagioslogserver/var/jobs.log: file truncated
Processed 0 node jobs.
Processed 0 global jobs.
tail: /usr/local/nagioslogserver/var/jobs.log: file truncated
Processed 0 node jobs.
Processed 0 global jobs.
tail: /usr/local/nagioslogserver/var/jobs.log: file truncated
Processed 0 node jobs.
Processed 0 global jobs.
tail: /usr/local/nagioslogserver/var/jobs.log: file truncated
Processed 0 node jobs.
Processed 0 global jobs.
tail: /usr/local/nagioslogserver/var/jobs.log: file truncated
knapsack state:
Code: Select all
{"count":1,"states":[{"mode":"export","started":"2015-04-21T17:06:47.142Z","path":"file:///store/backups/nagioslogserver/1429636007/kibana-int.tar.gz","node_name":"bb8f313e-98b6-4e1d-8ac4-19e6421ac511"}]}
See-ya
Mitch
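For anyone following along, a minimal sketch of pulling the start time and path of each pending export out of the `_export/state` response. It assumes the response is a single line with `started` immediately followed by `path`, as in the output above; `list_exports` is a hypothetical helper name, not part of Nagios Log Server.

```shell
#!/bin/sh
# list_exports: read an _export/state JSON response on stdin and print
# "<started> <path>" for every export entry it contains.
# Assumption: "started" is directly followed by "path", as in the response
# posted in this thread. A real JSON parser is more robust than sed.
list_exports() {
    tr '{' '\n' | sed -n 's/.*"started":"\([^"]*\)","path":"\([^"]*\)".*/\1 \2/p'
}

# Example against a live node (not run here):
#   curl -s -XPOST 'http://localhost:9200/_export/state' | list_exports
```

An entry whose `started` timestamp is several days old would suggest an export that never finished rather than one still in progress.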
-
GhostRider2110
Re: Multiple (40+) poller cron jobs running
Let me know when you want me to try to restart/reset. ps -ef is now showing:
Code: Select all
nagios 1640 1638 16 Apr16 ? 1-23:21:14 /usr/bin/java -Djava.io.tmpdir=/usr/local/nagioslogserver/tmp -Xmx500m -XX:+UseParNewGC -XX:+UseConcMarkSweepGC -Djava.awt.headless=true
root 1690 1 0 Apr16 tty1 00:00:00 /sbin/mingetty /dev/tty1
root 1692 1 0 Apr16 tty2 00:00:00 /sbin/mingetty /dev/tty2
root 1694 1 0 Apr16 tty3 00:00:00 /sbin/mingetty /dev/tty3
root 1696 1 0 Apr16 tty4 00:00:00 /sbin/mingetty /dev/tty4
root 1698 1 0 Apr16 tty5 00:00:00 /sbin/mingetty /dev/tty5
root 1701 457 0 Apr16 ? 00:00:00 /sbin/udevd -d
root 1702 457 0 Apr16 ? 00:00:00 /sbin/udevd -d
root 1703 1 0 Apr16 tty6 00:00:00 /sbin/mingetty /dev/tty6
root 1716 2 0 Apr16 ? 00:04:49 [flush-253:0]
root 2733 1613 0 Apr21 ? 00:00:00 CROND
nagios 2735 2733 0 Apr21 ? 00:00:00 /bin/sh -c /usr/bin/php -q /var/www/html/nagioslogserver/www/index.php jobs > /usr/local/nagioslogserver/var/jobs.log 2>&1
nagios 2738 2735 0 Apr21 ? 00:00:02 /usr/bin/php -q /var/www/html/nagioslogserver/www/index.php jobs
nagios 2799 2738 0 Apr21 ? 00:01:38 /bin/sh /usr/local/nagioslogserver/scripts/create_backup.sh
root 9385 1506 0 09:56 ? 00:00:00 sshd: root@pts/1
root 9387 9385 0 09:56 pts/1 00:00:00 -bash
apache 11539 1602 0 Apr26 ? 00:00:09 /usr/sbin/httpd
apache 11540 1602 0 Apr26 ? 00:00:09 /usr/sbin/httpd
apache 11541 1602 0 Apr26 ? 00:00:09 /usr/sbin/httpd
apache 11542 1602 0 Apr26 ? 00:00:08 /usr/sbin/httpd
apache 11543 1602 0 Apr26 ? 00:00:09 /usr/sbin/httpd
apache 11544 1602 0 Apr26 ? 00:00:09 /usr/sbin/httpd
apache 11545 1602 0 Apr26 ? 00:00:09 /usr/sbin/httpd
apache 11546 1602 0 Apr26 ? 00:00:09 /usr/sbin/httpd
apache 15651 1602 0 07:51 ? 00:00:01 /usr/sbin/httpd
root 17593 1613 0 10:14 ? 00:00:00 CROND
root 17594 1613 0 10:14 ? 00:00:00 CROND
nagios 17595 17594 0 10:14 ? 00:00:00 /bin/sh -c /usr/bin/php -q /var/www/html/nagioslogserver/www/index.php poller > /usr/local/nagioslogserver/var/poller.log 2>&1
nagios 17596 17595 0 10:14 ? 00:00:00 /usr/bin/php -q /var/www/html/nagioslogserver/www/index.php poller
nagios 17597 17593 0 10:14 ? 00:00:00 /bin/sh -c /usr/bin/php -q /var/www/html/nagioslogserver/www/index.php jobs > /usr/local/nagioslogserver/var/jobs.log 2>&1
nagios 17599 17597 0 10:14 ? 00:00:00 /usr/bin/php -q /var/www/html/nagioslogserver/www/index.php jobs
nagios 17681 2799 0 10:14 ? 00:00:00 sleep 5
nagios 17682 20187 0 10:14 ? 00:00:00 sleep 5
nagios 17709 26852 0 10:14 ? 00:00:00 sleep 5
nagios 17716 55509 0 10:14 ? 00:00:00 sleep 5
nagios 17717 37393 0 10:14 ? 00:00:00 sleep 5
nagios 17721 22517 0 10:14 ? 00:00:00 sleep 5
nagios 17725 46589 0 10:14 ? 00:00:00 sleep 5
root 17726 63780 0 10:14 pts/0 00:00:00 ps -ef
root 20146 1613 0 Apr26 ? 00:00:00 CROND
nagios 20149 20146 0 Apr26 ? 00:00:00 /bin/sh -c /usr/bin/php -q /var/www/html/nagioslogserver/www/index.php jobs > /usr/local/nagioslogserver/var/jobs.log 2>&1
nagios 20150 20149 0 Apr26 ? 00:00:00 /usr/bin/php -q /var/www/html/nagioslogserver/www/index.php jobs
nagios 20187 20150 0 Apr26 ? 00:00:26 /bin/sh /usr/local/nagioslogserver/scripts/create_backup.sh
root 22466 1613 0 Apr27 ? 00:00:00 CROND
nagios 22470 22466 0 Apr27 ? 00:00:00 /bin/sh -c /usr/bin/php -q /var/www/html/nagioslogserver/www/index.php jobs > /usr/local/nagioslogserver/var/jobs.log 2>&1
nagios 22471 22470 0 Apr27 ? 00:00:00 /usr/bin/php -q /var/www/html/nagioslogserver/www/index.php jobs
nagios 22517 22471 0 Apr27 ? 00:00:12 /bin/sh /usr/local/nagioslogserver/scripts/create_backup.sh
root 26830 1613 0 Apr25 ? 00:00:00 CROND
nagios 26832 26830 0 Apr25 ? 00:00:00 /bin/sh -c /usr/bin/php -q /var/www/html/nagioslogserver/www/index.php jobs > /usr/local/nagioslogserver/var/jobs.log 2>&1
nagios 26833 26832 0 Apr25 ? 00:00:01 /usr/bin/php -q /var/www/html/nagioslogserver/www/index.php jobs
nagios 26852 26833 0 Apr25 ? 00:00:39 /bin/sh /usr/local/nagioslogserver/scripts/create_backup.sh
root 37130 1613 0 Apr24 ? 00:00:00 CROND
nagios 37133 37130 0 Apr24 ? 00:00:00 /bin/sh -c /usr/bin/php -q /var/www/html/nagioslogserver/www/index.php jobs > /usr/local/nagioslogserver/var/jobs.log 2>&1
nagios 37135 37133 0 Apr24 ? 00:00:01 /usr/bin/php -q /var/www/html/nagioslogserver/www/index.php jobs
nagios 37393 37135 0 Apr24 ? 00:00:55 /bin/sh /usr/local/nagioslogserver/scripts/create_backup.sh
root 46403 1613 0 Apr23 ? 00:00:00 CROND
nagios 46406 46403 0 Apr23 ? 00:00:00 /bin/sh -c /usr/bin/php -q /var/www/html/nagioslogserver/www/index.php jobs > /usr/local/nagioslogserver/var/jobs.log 2>&1
nagios 46408 46406 0 Apr23 ? 00:00:01 /usr/bin/php -q /var/www/html/nagioslogserver/www/index.php jobs
nagios 46589 46408 0 Apr23 ? 00:01:07 /bin/sh /usr/local/nagioslogserver/scripts/create_backup.sh
root 55316 1613 0 Apr22 ? 00:00:00 CROND
nagios 55318 55316 0 Apr22 ? 00:00:00 /bin/sh -c /usr/bin/php -q /var/www/html/nagioslogserver/www/index.php jobs > /usr/local/nagioslogserver/var/jobs.log 2>&1
nagios 55320 55318 0 Apr22 ? 00:00:02 /usr/bin/php -q /var/www/html/nagioslogserver/www/index.php jobs
nagios 55509 55320 0 Apr22 ? 00:01:21 /bin/sh /usr/local/nagioslogserver/scripts/create_backup.sh
root 63778 1506 0 Apr27 ? 00:00:00 sshd: root@pts/0
root 63780 63778 0 Apr27 pts/0 00:00:00 -bash
See-ya
Mitch
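A quick way to confirm the pattern in the listing above, as a sketch: count the `create_backup.sh` shells and the `sleep 5` children in `ps -ef` output. `count_stuck` is a made-up helper name; it only assumes the `ps -ef` column layout shown above.

```shell
#!/bin/sh
# count_stuck: read `ps -ef` output on stdin and report how many
# create_backup.sh shells and how many "sleep 5" processes it lists.
count_stuck() {
    awk '
        /create_backup\.sh$/ { backups++ }
        NF >= 2 && $(NF-1) == "sleep" && $NF == "5" { sleeps++ }
        END { printf "backup_scripts=%d sleeps=%d\n", backups, sleeps }
    '
}

# Example (not run here):
#   ps -ef | count_stuck
```

Matching counts, one `sleep 5` per `create_backup.sh`, would point at every backup script sitting in the same wait loop.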
Re: Multiple (40+) poller cron jobs running
Mitch,
I very much appreciate all of the information. I'll try to get some traction here and see what I can do. Feel free to restart your jobs if you need the backups up and working - but I would like to keep your machine in a 'broken' state so that we can troubleshoot this, since it's a problem that has been experienced before. Of course, restarting elasticsearch isn't the solution we're looking for.
Can you think of anything significant that may have impacted or otherwise interrupted the backup process?
-
GhostRider2110
Re: Multiple (40+) poller cron jobs running
Have been looking at this. This does seem to be a different issue than earlier. Here it is just the backups that are not completing. Before, the poller was hanging.
If I am following the code in the backup script correctly, it seems to be hanging on the elasticsearch export part. There is only one sleep command in the code, and I have the same number of "sleep 5" processes as "create_backup.sh" processes.
Code: Select all
# Wait for elasticsearch export jobs to finish...
echo "Waiting for backup. This may take a while."
count=3
while [[ $count -gt 0 ]]; do
    curl -s -XPOST 'http://localhost:9200/_export/state' > state.json
    count=$(python -m jsonselect.__main__ .count < state.json)
    echo -n "."
    sleep 5
done
Also what is strange, I configured in the "Backup & Maintenance" screen to use the repository "nagiosls1" and that is defined as /usr/local/nagioslogserver_bkuprep file system.
So why would it want to use "BACKUP_DIR="/store/backups/nagioslogserver"" to place the backups?
Now I am just wondering if I should have started a new thread on this since I do believe it to be a different problem than what this thread started out with.
See-ya
Mitch
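For comparison, a sketch of the same wait loop with an upper bound, so that a stuck export cannot keep the script (and its cron parent) alive forever. This is not the shipped create_backup.sh; `export_count`, `wait_for_exports` and `fetch_state` are hypothetical names, and the count is pulled out with sed instead of jsonselect only to keep the example self-contained.

```shell
#!/bin/sh
# export_count: print the "count" field of an _export/state response (stdin).
export_count() {
    sed -n 's/.*"count":\([0-9][0-9]*\).*/\1/p'
}

# fetch_state: query the local node, same call as the loop quoted above.
fetch_state() {
    curl -s -XPOST 'http://localhost:9200/_export/state'
}

# wait_for_exports FETCH_CMD MAX_TRIES DELAY
# Poll until count reaches 0; give up after MAX_TRIES checks.
# Returns 0 when no exports are pending, 1 on timeout.
wait_for_exports() {
    wfe_fetch=$1
    wfe_max=$2
    wfe_delay=$3
    wfe_tries=0
    while [ "$wfe_tries" -lt "$wfe_max" ]; do
        wfe_count=$("$wfe_fetch" | export_count)
        # An empty or unparsable reply is treated as "still pending".
        if [ "${wfe_count:-1}" -eq 0 ]; then
            return 0
        fi
        wfe_tries=$((wfe_tries + 1))
        sleep "$wfe_delay"
    done
    echo "export still pending after $wfe_max checks" >&2
    return 1
}

# Example (not run here): 120 checks, 5 seconds apart, about 10 minutes.
#   wait_for_exports fetch_state 120 5 || exit 1
```

With a bound like this, a hung export would surface as one failed backup per day instead of a growing pile of sleeping processes.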
Re: Multiple (40+) poller cron jobs running
Mitch,
To be clear - the backup script that is hanging is /usr/local/nagioslogserver/scripts/create_backup.sh. This script controls configuration backup, not the backup of your actual logs. If you take a look at the filesizes of the backups in /store/backups/nagioslogserver/ you will see what I mean:
Code: Select all
[root@localhost ~]# ls -lh /store/backups/nagioslogserver/
-rw-r--r-- 1 nagios users 155K Apr 8 09:55 nagioslogserver.2015-04-07.1428419172.tar.gz
-rw-r--r-- 1 nagios users 300K Apr 8 10:06 nagioslogserver.2015-04-08.1428505577.tar.gz
-rw-r--r-- 1 nagios users 1.3M Apr 15 10:06 nagioslogserver.2015-04-15.1429110402.tar.gz
-rw-r--r-- 1 nagios users 1.5M Apr 16 10:06 nagioslogserver.2015-04-16.1429196807.tar.gz
-rw-r--r-- 1 nagios users 1.6M Apr 17 10:07 nagioslogserver.2015-04-17.1429283221.tar.gz
These backups are not controlled from the 'Backup and Maintenance' screen - they are pre-configured and automatic. For more information on this issue, another customer is experiencing similar symptoms: http://support.nagios.com/forum/viewtop ... 38&t=32218
I am having trouble reproducing this problem in lab, and due to that I can't open a bug report on it. I need to know how/why this is happening. Any chance you could collect the following information for me?
-The amount of nodes in your cluster
-The amount of logs coming in per 10 minutes (you can see this on your dashboard)
-Any custom logstash inputs/filters/outputs? If so, could you list them?
-Any customizations on the box you can think of? E.g. NRPE installed, non-default syslogger, etc.
Could you please post the results of these commands:
Code: Select all
cat /var/log/elasticsearch/*.log
Code: Select all
cat /var/log/httpd/error_log
Code: Select all
ls -lh /store/backups/nagioslogserver/
Thanks for your patience here Mitch, I'm hoping we can hunt whatever this is down and have it resolved soon.
-
GhostRider2110
Re: Multiple (40+) poller cron jobs running
jolson wrote:
Mitch,
To be clear - the backup script that is hanging is /usr/local/nagioslogserver/scripts/create_backup.sh. This script controls configuration backup, not the backup of your actual logs. If you take a look at the filesizes of the backups in /store/backups/nagioslogserver/ you will see what I mean:
Code: Select all
[root@localhost ~]# ls -lh /store/backups/nagioslogserver/
-rw-r--r-- 1 nagios users 155K Apr 8 09:55 nagioslogserver.2015-04-07.1428419172.tar.gz
-rw-r--r-- 1 nagios users 300K Apr 8 10:06 nagioslogserver.2015-04-08.1428505577.tar.gz
-rw-r--r-- 1 nagios users 1.3M Apr 15 10:06 nagioslogserver.2015-04-15.1429110402.tar.gz
-rw-r--r-- 1 nagios users 1.5M Apr 16 10:06 nagioslogserver.2015-04-16.1429196807.tar.gz
-rw-r--r-- 1 nagios users 1.6M Apr 17 10:07 nagioslogserver.2015-04-17.1429283221.tar.gz
These backups are not controlled from the 'Backup and Maintenance' screen - they are pre-configured and automatic. For more information on this issue, another customer is experiencing similar symptoms: http://support.nagios.com/forum/viewtop ... 38&t=32218
Thanks for the explanation. Then I have another issue since I'm not getting any backups in the setup repository space, but that is for another time.
jolson wrote:
I am having trouble reproducing this problem in lab, and due to that I can't open a bug report on it. I need to know how/why this is happening. Any chance you could collect the following information for me?
-The amount of nodes in your cluster
Only 1
jolson wrote:
-The amount of logs coming in per 10 minutes (you can see this on your dashboard)
count per 10m | (7442402 hits) Is that what you are asking for?
jolson wrote:
-Any custom logstash inputs/filters/outputs? If so, could you list them?
Code: Select all
#
# Logstash Configuration File
# Dynamically created by Nagios Log Server
#
# DO NOT EDIT THIS FILE. IT WILL BE OVERWRITTEN.
#
# Created Wed, 29 Apr 2015 12:39:20 -0400
#
#
# Global Configuration
#
input {
syslog {
type => 'syslog'
port => 5544
}
tcp {
type => 'eventlog'
port => 3515
codec => json {
charset => 'CP1252'
}
}
tcp {
type => 'import_raw'
tags => 'import_raw'
port => 2056
}
tcp {
type => 'import_json'
tags => 'import_json'
port => 2057
codec => json
}
syslog {
type => 'syslog'
port => 514
}
syslog {
type => 'asa'
port => 6544
}
}
filter {
if [program] == 'apache_access' {
grok {
match => [ 'message', '%{COMBINEDAPACHELOG}']
}
date {
match => [ 'timestamp', 'dd/MMM/yyyy:HH:mm:ss Z' ]
}
mutate {
replace => [ 'type', 'apache_access' ]
convert => [ 'bytes', 'integer' ]
convert => [ 'response', 'integer' ]
}
}
if [program] == 'apache_error' {
grok {
match => [ 'message', '\[(?<timestamp>%{DAY:day} %{MONTH:month} %{MONTHDAY} %{TIME} %{YEAR})\] \[%{WORD:class}\] \[%{WORD:originator} %{IP:clientip}\] %{GREEDYDATA:errmsg}']
}
mutate {
replace => [ 'type', 'apache_error' ]
}
}
if [program] == 'TrexSyncPubRep' {
mutate {
replace => [ 'type', 'TrexSyncPubRep' ]
}
}
if [type] == 'asa' {
grok{
match => ['message', '%{SYSLOG5424PRI}%%{WORD:LogType}-%{INT:LogSeverity}-%{INT:LogMessageNumber}: Group = %{IPORHOST:Group}, Username = %{IPORHOST:username}, IP = %{IP:IPAddress}, Session disconnected. Session Type: %{WORD:SessionType}, Duration: %{CUSTOM1:DurationDays=[0-9]?}%{CUSTOM2=d? ?}%{INT:DurationHours:int}h:%{INT:DurationMinutes:int}m:%{INT:DurationSeconds:int}s, Bytes xmt: %{INT:BytesTransmitted:int}, Bytes rcv: %{INT:BytesReceived:int}, Reason: %{GREEDYDATA:Reason}']
}
geoip {
source => "IPAddress"
}
}
if [program] == 'apache_access' {
geoip {
source => 'clientip'
}
}
if [program] == 'TrexSyncRep' {
mutate {
replace => [ 'type', 'TrexSyncRep' ]
}
}
if [program] == 'Jupiter_log' {
mutate {
replace => [ 'type', 'Jupiter' ]
}
}
if [program] == 'diablo_in1_video_management' {
mutate {
replace => [ 'type', 'diablo' ]
}
}
if [program] == 'PUB_API_ACCESS' {
mutate {
replace => [ 'type', 'APIaccess' ]
}
}
if [program] == 'sudo' {
mutate {
replace => [ 'type', 'sudo' ]
}
}
if [program] == 'opt_lrms_logs_cmgopher' {
mutate {
replace => [ 'type', 'CMGopher_LRMS' ]
}
}
if [program] == 'var_opt_lrms_log_uam' {
mutate {
replace => [ 'type', 'UAMGopher_LRMS' ]
}
}
if [program] == 'opt_lrms_logs_uam' {
mutate {
replace => [ 'type', 'UAMGopher_LRMS' ]
}
}
if [program] == 'opt_lrms_logs_cm' {
mutate {
replace => [ 'type', 'CM_LRMS' ]
}
}
if [program] == 'Epsy_log' {
mutate {
replace => [ 'type', 'Epsy_log' ]
}
}
if [program] == 'Wowzastream_access' {
mutate {
replace => [ 'type', 'wowzastream' ]
}
}
if [program] == 'Wowzastream_error' {
mutate {
replace => [ 'type', 'wowzastream' ]
}
}
}
#
# Local Configuration
#
jolson wrote:
-Any customizations on the box you can think of? E.g. NRPE installed, non-default syslogger, etc.
I do have the Nagios XI client installed on it, and except for OS security patches there is nothing else I can think of. I did move the base install to a larger disk, but that was a while back and things have been working since then.
jolson wrote:
Could you please post the results of these commands:
Code: Select all
cat /var/log/elasticsearch/*.log
Code: Select all
[root@IGAnagioslog local]# cat /var/log/elasticsearch/*.log
[2015-04-29 03:57:35,898][INFO ][cluster.metadata ] [bb8f313e-98b6-4e1d-8ac4-19e6421ac511] [logstash-2015.04.29] update_mapping [apache_access] (dynamic)
[2015-04-29 05:14:40,084][INFO ][cluster.metadata ] [bb8f313e-98b6-4e1d-8ac4-19e6421ac511] [logstash-2015.04.29] update_mapping [eventlog] (dynamic)
[2015-04-29 05:14:51,559][INFO ][cluster.metadata ] [bb8f313e-98b6-4e1d-8ac4-19e6421ac511] [logstash-2015.04.29] update_mapping [eventlog] (dynamic)
[2015-04-29 05:14:53,163][INFO ][cluster.metadata ] [bb8f313e-98b6-4e1d-8ac4-19e6421ac511] [logstash-2015.04.29] update_mapping [eventlog] (dynamic)
[2015-04-29 05:15:58,783][INFO ][cluster.metadata ] [bb8f313e-98b6-4e1d-8ac4-19e6421ac511] [logstash-2015.04.29] update_mapping [eventlog] (dynamic)
[2015-04-29 05:17:32,378][INFO ][cluster.metadata ] [bb8f313e-98b6-4e1d-8ac4-19e6421ac511] [logstash-2015.04.29] update_mapping [eventlog] (dynamic)
[2015-04-29 08:20:42,535][INFO ][cluster.metadata ] [bb8f313e-98b6-4e1d-8ac4-19e6421ac511] [logstash-2015.04.29] update_mapping [eventlog] (dynamic)
[2015-04-29 08:20:43,490][INFO ][cluster.metadata ] [bb8f313e-98b6-4e1d-8ac4-19e6421ac511] [logstash-2015.04.29] update_mapping [eventlog] (dynamic)
jolson wrote:
Code: Select all
cat /var/log/httpd/error_log
Code: Select all
[root@IGAnagioslog local]# cat /var/log/httpd/error_log
[Sun Apr 26 03:34:03 2015] [notice] Digest: generating secret for digest authentication ...
[Sun Apr 26 03:34:03 2015] [notice] Digest: done
[Sun Apr 26 03:34:03 2015] [notice] Apache/2.2.15 (Unix) DAV/2 PHP/5.3.3 configured -- resuming normal operations
jolson wrote:
Code: Select all
ls -lh /store/backups/nagioslogserver/
Code: Select all
[root@IGAnagioslog local]# ls -lh /store/backups/nagioslogserver/
total 451M
drwxrwxrwx 2 nagios users 4.0K Apr 21 13:06 1429636007
drwxrwxrwx 2 nagios users 4.0K Apr 22 13:06 1429722407
drwxrwxrwx 2 nagios users 4.0K Apr 23 13:06 1429808811
drwxrwxrwx 2 nagios users 4.0K Apr 24 13:06 1429895212
drwxrwxrwx 2 nagios users 4.0K Apr 25 13:07 1429981621
drwxrwxrwx 2 nagios users 4.0K Apr 26 13:07 1430068022
drwxrwxrwx 2 nagios users 4.0K Apr 27 13:07 1430154427
drwxrwxrwx 2 nagios users 4.0K Apr 28 13:07 1430240831
-rw-r--r-- 1 nagios users 21M Mar 27 13:06 nagioslogserver.2015-03-27.1427475951.tar.gz
-rw-r--r-- 1 nagios users 5.3M Mar 28 13:05 nagioslogserver.2015-03-28.1427562351.tar.gz
-rw-r--r-- 1 nagios users 5.6M Mar 29 13:05 nagioslogserver.2015-03-29.1427648752.tar.gz
-rw-r--r-- 1 nagios users 21M Mar 30 13:06 nagioslogserver.2015-03-30.1427735162.tar.gz
-rw-r--r-- 1 nagios users 5.6M Mar 31 13:06 nagioslogserver.2015-03-31.1427821562.tar.gz
-rw-r--r-- 1 nagios users 21M Apr 1 13:06 nagioslogserver.2015-04-01.1427907966.tar.gz
-rw-r--r-- 1 nagios users 21M Apr 2 13:06 nagioslogserver.2015-04-02.1427994366.tar.gz
-rw-r--r-- 1 nagios users 21M Apr 3 13:06 nagioslogserver.2015-04-03.1428080766.tar.gz
-rw-r--r-- 1 nagios users 21M Apr 4 13:06 nagioslogserver.2015-04-04.1428167167.tar.gz
-rw-r--r-- 1 nagios users 21M Apr 5 13:06 nagioslogserver.2015-04-05.1428253572.tar.gz
-rw-r--r-- 1 nagios users 21M Apr 6 13:06 nagioslogserver.2015-04-06.1428339977.tar.gz
-rw-r--r-- 1 nagios users 21M Apr 7 13:06 nagioslogserver.2015-04-07.1428426381.tar.gz
-rw-r--r-- 1 nagios users 5.6M Apr 8 13:06 nagioslogserver.2015-04-08.1428512781.tar.gz
-rw-r--r-- 1 nagios users 21M Apr 9 13:06 nagioslogserver.2015-04-09.1428599182.tar.gz
-rw-r--r-- 1 nagios users 21M Apr 10 13:06 nagioslogserver.2015-04-10.1428685587.tar.gz
-rw-r--r-- 1 nagios users 21M Apr 11 13:06 nagioslogserver.2015-04-11.1428771991.tar.gz
-rw-r--r-- 1 nagios users 21M Apr 12 13:06 nagioslogserver.2015-04-12.1428858392.tar.gz
-rw-r--r-- 1 nagios users 21M Apr 13 13:07 nagioslogserver.2015-04-13.1428944796.tar.gz
-rw-r--r-- 1 nagios users 21M Apr 14 13:07 nagioslogserver.2015-04-14.1429031196.tar.gz
-rw-r--r-- 1 nagios users 21M Apr 15 13:07 nagioslogserver.2015-04-15.1429117596.tar.gz
-rw-r--r-- 1 nagios users 21M Apr 16 13:07 nagioslogserver.2015-04-16.1429203997.tar.gz
-rw-r--r-- 1 nagios users 21M Apr 17 13:07 nagioslogserver.2015-04-17.1429290397.tar.gz
-rw-r--r-- 1 nagios users 21M Apr 18 13:07 nagioslogserver.2015-04-18.1429376797.tar.gz
-rw-r--r-- 1 nagios users 21M Apr 19 13:07 nagioslogserver.2015-04-19.1429463201.tar.gz
-rw-r--r-- 1 nagios users 21M Apr 20 13:07 nagioslogserver.2015-04-20.1429549602.tar.gz
[root@IGAnagioslog local]# df -kh
Filesystem Size Used Avail Use% Mounted on
rootfs 99G 5.7G 92G 6% /
devtmpfs 2.0G 208K 2.0G 1% /dev
tmpfs 2.0G 0 2.0G 0% /dev/shm
/dev/sda1 99G 5.7G 92G 6% /
/dev/mapper/vg_nagioslog-lv_nagioslog_data
477G 256G 222G 54% /usr/local/nagioslogserver
/dev/mapper/vg_nagioslog-lv_nagioslog_rep
49G 33M 49G 1% /usr/local/nagioslogserver_bkuprep
jolson wrote:
Thanks for your patience here Mitch, I'm hoping we can hunt whatever this is down and have it resolved soon.
No problem.
See-ya
Mitch
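The epoch-named directories in the listing above look like work directories left behind by runs that never finished (the tarballs stop on Apr 20 and the directories start on Apr 21). A small sketch for listing them with the day each one was started; it assumes GNU `date` (for `-d @SECONDS`), and `epoch_to_day` and `list_leftover_dirs` are hypothetical helper names.

```shell
#!/bin/sh
# epoch_to_day SECONDS: print the UTC calendar day for a Unix timestamp.
# Assumption: GNU date, which accepts -d @SECONDS.
epoch_to_day() {
    date -u -d "@$1" +%Y-%m-%d
}

# list_leftover_dirs DIR: print "<name> <day>" for every subdirectory of DIR
# whose name consists only of digits (an epoch timestamp).
list_leftover_dirs() {
    for d in "$1"/*/; do
        [ -d "$d" ] || continue
        name=$(basename "$d")
        case $name in
            *[!0-9]*) continue ;;   # skip names with non-digit characters
        esac
        printf '%s %s\n' "$name" "$(epoch_to_day "$name")"
    done
}

# Example (not run here):
#   list_leftover_dirs /store/backups/nagioslogserver
```

The day is reported in UTC, so it can differ from the local timestamps `ls` shows for runs close to midnight.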
Re: Multiple (40+) poller cron jobs running
Mitch,
In the other thread, it was pointed out that his MAX_LOCKED_MEMORY variable was not set to 'unlimited'. Could you please cat your elasticsearch config file and show us what that variable is set to?
Code: Select all
cat /etc/sysconfig/elasticsearch
I would like you to change: MAX_LOCKED_MEMORY=X
to: MAX_LOCKED_MEMORY=unlimited
I know we discussed this change in a previous thread, but I'm not sure if it was implemented.
After this change, restart elasticsearch.
Code: Select all
service elasticsearch restart
I am ultimately looking for similarities between the two systems, and so far there is nothing that stands out. The MAX_LOCKED_MEMORY could be notable if yours was not set to unlimited as well.
-
GhostRider2110
Re: Multiple (40+) poller cron jobs running
Sorry, it was set a while back:
Code: Select all
[root@IGAnagioslog nagioslogserver]# cat /etc/sysconfig/elasticsearch
# Directory where the Elasticsearch binary distribution resides
APP_DIR="/usr/local/nagioslogserver"
ES_HOME="$APP_DIR/elasticsearch"
# Heap Size (defaults to 256m min, 1g max)
ES_HEAP_SIZE=2g
# Heap new generation
#ES_HEAP_NEWSIZE=
# max direct memory
#ES_DIRECT_SIZE=
# Additional Java OPTS
#ES_JAVA_OPTS=
# Maximum number of open files
MAX_OPEN_FILES=65535
# Maximum amount of locked memory
MAX_LOCKED_MEMORY=unlimited
# Maximum number of VMA (Virtual Memory Areas) a process can own
MAX_MAP_COUNT=262144
# Elasticsearch log directory
LOG_DIR=/var/log/elasticsearch
# Elasticsearch data directory
DATA_DIR="$ES_HOME/data"
# Elasticsearch work directory
WORK_DIR="$APP_DIR/tmp/elasticsearch"
# Elasticsearch conf directory
CONF_DIR="$ES_HOME/config"
# Elasticsearch configuration file (elasticsearch.yml)
CONF_FILE="$ES_HOME/config/elasticsearch.yml"
# User to run as, change this to a specific elasticsearch user if possible
# Also make sure, this user can write into the log directories in case you change them
# This setting only works for the init script, but has to be configured separately for systemd startup
ES_USER=nagios
ES_GROUP=nagios
# Configure restart on package upgrade (true, every other setting will lead to not restarting)
#RESTART_ON_UPGRADE=true
if [ "x$1" == "xstart" -o "x$1" == "xrestart" -o "x$1" == "xreload" -o "x$1" == "xforce-reload" ];then
    GET_ES_CONFIG_MESSAGE="$( php $APP_DIR/scripts/get_es_config.php )"
    GET_ES_CONFIG_RETURN=$?
    if [ "$GET_ES_CONFIG_RETURN" != "0" ]; then
        echo $GET_ES_CONFIG_MESSAGE
        exit 1
    else
        ES_JAVA_OPTS="$GET_ES_CONFIG_MESSAGE"
    fi
fi
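The check asked for above can also be scripted. A minimal sketch, assuming the sysconfig file uses plain `NAME=value` lines as in the listing above; `get_setting` is a hypothetical helper name.

```shell
#!/bin/sh
# get_setting FILE NAME: print the value of the last uncommented NAME=value
# line in FILE, with surrounding double quotes removed.
get_setting() {
    sed -n "s/^$2=//p" "$1" | tail -n 1 | sed 's/^"\(.*\)"$/\1/'
}

# Example (not run here):
#   [ "$(get_setting /etc/sysconfig/elasticsearch MAX_LOCKED_MEMORY)" = unlimited ] \
#       || echo "MAX_LOCKED_MEMORY is not unlimited"
```

This only reads the file. The value applies to the running process after elasticsearch has been restarted, and the limit in effect can be compared against the "Max locked memory" row in /proc/PID/limits for the java process.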