Re: Logs stop coming in

Posted: Wed Jan 21, 2015 11:41 am
by tmcdonald
I am trying to confirm whether you and the OP in this topic are the same person. We can't close the topic if the OP's issue has not been resolved.

Re: Logs stop coming in

Posted: Wed Jan 21, 2015 4:31 pm
by globalgiving
This is OP. The other poster was not me or someone in my organization.

I increased the heap size and set the max locked memory as per the response I got, way back on the first page of this thread.

That did in fact solve my problem. Since then I have not hit an issue where logs stopped coming in.

However, checking the system running NLS today, I see that there are a lot of create_backup.sh processes running. They have been building up for a long time. They have not yet built up to the point of causing Elasticsearch to stop working, but I am guessing that they will eventually get there.

Process list on the NLS server:

Code:

nagios     707  0.0  0.0 100904   568 ?        S    16:04   0:00 sleep 5
nagios     708  0.0  0.0 100904   568 ?        S    16:04   0:00 sleep 5
nagios     709  0.0  0.0 100904   568 ?        S    16:04   0:00 sleep 5
nagios     710  0.0  0.0 100904   568 ?        S    16:04   0:00 sleep 5
nagios     711  0.0  0.0 100904   572 ?        S    16:04   0:00 sleep 5
nagios     712  0.0  0.0 100904   568 ?        S    16:04   0:00 sleep 5
nagios     713  0.0  0.0 100904   572 ?        S    16:04   0:00 sleep 5
nagios     714  0.0  0.0 100904   568 ?        S    16:04   0:00 sleep 5
nagios     715  0.0  0.0 100904   572 ?        S    16:04   0:00 sleep 5
nagios     716  0.0  0.0 100904   568 ?        S    16:04   0:00 sleep 5
root       733  0.0  0.0      0     0 ?        S     2014   7:48 [jbd2/sda3-8]
root       734  0.0  0.0      0     0 ?        S     2014   0:00 [ext4-dio-unwrit]
nagios     752  0.0  0.0 100904   572 ?        S    16:04   0:00 sleep 5
nagios     753  0.0  0.0 100904   568 ?        S    16:04   0:00 sleep 5
nagios     754  0.0  0.0 100904   568 ?        S    16:04   0:00 sleep 5
nagios     755  0.0  0.0 100904   568 ?        S    16:04   0:00 sleep 5
nagios     756  0.0  0.0 100904   572 ?        S    16:04   0:00 sleep 5
nagios     757  0.0  0.0 100904   568 ?        S    16:04   0:00 sleep 5
nagios     758  0.0  0.0 100904   568 ?        S    16:04   0:00 sleep 5
nagios     759  0.0  0.0 100904   568 ?        S    16:04   0:00 sleep 5
nagios     760  0.0  0.0 100904   568 ?        S    16:04   0:00 sleep 5
nagios     761  0.0  0.0 100904   568 ?        S    16:04   0:00 sleep 5
nagios     762  0.0  0.0 100904   568 ?        S    16:04   0:00 sleep 5
root       764  0.0  0.0 140176  1728 ?        S    16:04   0:00 CROND
root       765  0.0  0.0 140176  1728 ?        S    16:04   0:00 CROND
nagios     766  0.0  0.0 106092  1136 ?        Ss   16:04   0:00 /bin/sh -c /usr/bin/php -q /var/www/html/nagioslogserver/www/index.php poller > /usr/local/nagioslogserver/var/poller.log 2>&1
nagios     767  2.0  0.0 216176 10636 ?        S    16:04   0:00 /usr/bin/php -q /var/www/html/nagioslogserver/www/index.php poller
nagios     768  0.0  0.0 106092  1132 ?        Ss   16:04   0:00 /bin/sh -c /usr/bin/php -q /var/www/html/nagioslogserver/www/index.php jobs > /usr/local/nagioslogserver/var/jobs.log 2>&1
nagios     769  2.0  0.0 216176 10708 ?        S    16:04   0:00 /usr/bin/php -q /var/www/html/nagioslogserver/www/index.php jobs
root       822  0.0  0.0  10736   784 ?        S<s   2014   0:00 /sbin/udevd -d
root      1596  0.0  0.0      0     0 ?        S     2014   0:00 [kjournald]
root      1630  0.0  0.0      0     0 ?        S     2014  10:29 [flush-8:0]
root      1640  0.0  0.0      0     0 ?        S     2014   0:52 [kauditd]
apache    1706  0.0  0.0 246188 15564 ?        S    Jan18   0:17 /usr/sbin/httpd
apache    1707  0.0  0.1 247164 16420 ?        S    Jan18   0:16 /usr/sbin/httpd
apache    1708  0.0  0.0 244768 13892 ?        S    Jan18   0:16 /usr/sbin/httpd
apache    1709  0.0  0.0 244772 13832 ?        S    Jan18   0:16 /usr/sbin/httpd
apache    1711  0.0  0.0 246592 14264 ?        S    Jan18   0:16 /usr/sbin/httpd
apache    1712  0.0  0.0 243236 12520 ?        S    Jan18   0:16 /usr/sbin/httpd
apache    1713  0.0  0.0 241592 11176 ?        S    Jan18   0:16 /usr/sbin/httpd
apache    1714  0.0  0.0 243256 11556 ?        S    Jan18   0:16 /usr/sbin/httpd
root      1772  0.0  0.0      0     0 ?        S     2014  12:28 [bond0]
root      1951  0.0  0.0  93200   872 ?        S<sl  2014   2:55 auditd
root      1967  0.0  0.0 485592  8308 ?        Sl    2014   1:57 /sbin/rsyslogd -i /var/run/syslogd.pid -c 5
root      1981  0.0  0.0  10876   696 ?        Ss    2014  21:16 irqbalance --pid=/var/run/irqbalance.pid
rpc       1995  0.0  0.0  18976   816 ?        Ss    2014   0:04 rpcbind
rpcuser   2013  0.0  0.0  23348  1192 ?        Ss    2014   0:00 rpc.statd
dbus      2127  0.0  0.0  21404   684 ?        Ss    2014   0:00 dbus-daemon --system
root      2171  0.0  0.0   4080   576 ?        Ss    2014   0:00 /usr/sbin/acpid
root      2243  0.0  0.0 385736  3368 ?        Ssl   2014   0:42 automount --pid-file /var/run/autofs.pid
root      2260  0.0  0.0 197948  4700 ?        S     2014  16:11 /usr/sbin/snmpd -LS0-6d -Lf /dev/null -p /var/run/snmpd.pid
root      2272  0.0  0.0  66616  1136 ?        Ss    2014   0:00 /usr/sbin/sshd
root      2280  0.0  0.0  22180   848 ?        Ss    2014   0:00 xinetd -stayalive -pidfile /var/run/xinetd.pid
root      2418  0.0  0.0  83060  2496 ?        Ss    2014   1:12 sendmail: accepting connections
smmsp     2426  0.0  0.0  78656  2004 ?        Ss    2014   0:00 sendmail: Queue runner@01:00:00 for /var/spool/clientmqueue
root      2449  0.0  0.0 110316   944 ?        Ss    2014   0:00 /usr/sbin/abrtd
root      2491  0.0  0.0 238860  7880 ?        Ss    2014   2:02 /usr/sbin/httpd
root      2499  0.0  0.0 117292  1260 ?        Ss    2014   0:36 crond
root      2518  0.0  0.0 131172  1372 ?        SN    2014   0:00 runuser -s /bin/sh -c exec /usr/local/nagioslogserver/logstash/bin/logstash agent -f /usr/local/nagioslogserver/logstash/etc/conf.d -l /var/log/lo
nagios    2520  8.1  5.0 14044052 817496 ?     SNsl  2014 4830:27 /usr/bin/java -Djava.io.tmpdir=/usr/local/nagioslogserver/tmp -Xmx500m -XX:+UseParNewGC -XX:+UseConcMarkSweepGC -Djava.awt.headless=true -XX:CMSI
root      2581  0.0  0.0 342036  7580 ?        Sl    2014  12:55 /usr/bin/python /usr/bin/fail2ban-server -b -s /var/run/fail2ban/fail2ban.sock
root      2583  0.0  0.0   9364  1040 ?        S     2014   1:23 /usr/libexec/gam_server
root      2598  0.0  0.0  21540   484 ?        Ss    2014   0:00 /usr/sbin/atd
root      2612  0.0  0.0   4064   528 tty1     Ss+   2014   0:00 /sbin/mingetty /dev/tty1
root      2614  0.0  0.0   4064   528 tty2     Ss+   2014   0:00 /sbin/mingetty /dev/tty2
root      2616  0.0  0.0   4064   528 tty3     Ss+   2014   0:00 /sbin/mingetty /dev/tty3
root      2618  0.0  0.0   4064   532 tty4     Ss+   2014   0:00 /sbin/mingetty /dev/tty4
root      2620  0.0  0.0   4064   532 tty5     Ss+   2014   0:00 /sbin/mingetty /dev/tty5
root      2622  0.0  0.0   4064   528 tty6     Ss+   2014   0:00 /sbin/mingetty /dev/tty6
root      2624  0.0  0.0  10732   748 ?        S<    2014   0:00 /sbin/udevd -d
root      2625  0.0  0.0  10732   736 ?        S<    2014   0:00 /sbin/udevd -d
root      7540  0.0  0.0 140176  1728 ?        S    Jan09   0:00 CROND
nagios    7545  0.0  0.0 106092  1132 ?        Ss   Jan09   0:00 /bin/sh -c /usr/bin/php -q /var/www/html/nagioslogserver/www/index.php jobs > /usr/local/nagioslogserver/var/jobs.log 2>&1
nagios    7546  0.0  0.0 218504 10960 ?        S    Jan09   0:03 /usr/bin/php -q /var/www/html/nagioslogserver/www/index.php jobs
nagios    7779  0.0  0.0 106096  1292 ?        S    Jan09   2:22 /bin/sh /usr/local/nagioslogserver/scripts/create_backup.sh
root      8271  0.0  0.0 140176  1728 ?        S    Jan01   0:00 CROND
nagios    8277  0.0  0.0 106092  1136 ?        Ss   Jan01   0:00 /bin/sh -c /usr/bin/php -q /var/www/html/nagioslogserver/www/index.php jobs > /usr/local/nagioslogserver/var/jobs.log 2>&1
nagios    8278  0.0  0.0 219020 11356 ?        S    Jan01   0:05 /usr/bin/php -q /var/www/html/nagioslogserver/www/index.php jobs
nagios    8299  0.0  0.0 106096  1292 ?        S    Jan01   3:53 /bin/sh /usr/local/nagioslogserver/scripts/create_backup.sh
root      9399  0.0  0.0 140176  1728 ?        S    Jan15   0:00 CROND
nagios    9401  0.0  0.0 106092  1136 ?        Ss   Jan15   0:00 /bin/sh -c /usr/bin/php -q /var/www/html/nagioslogserver/www/index.php jobs > /usr/local/nagioslogserver/var/jobs.log 2>&1
nagios    9403  0.0  0.0 218248 10960 ?        S    Jan15   0:01 /usr/bin/php -q /var/www/html/nagioslogserver/www/index.php jobs
nagios    9924  0.0  0.0 106096  1292 ?        S    Jan15   1:11 /bin/sh /usr/local/nagioslogserver/scripts/create_backup.sh
root     11056  0.0  0.0 140176  1728 ?        S    Jan13   0:00 CROND
nagios   11058  0.0  0.0 106092  1132 ?        Ss   Jan13   0:00 /bin/sh -c /usr/bin/php -q /var/www/html/nagioslogserver/www/index.php jobs > /usr/local/nagioslogserver/var/jobs.log 2>&1
nagios   11064  0.0  0.0 218248 10744 ?        S    Jan13   0:02 /usr/bin/php -q /var/www/html/nagioslogserver/www/index.php jobs
nagios   11508  0.0  0.0 106096  1292 ?        S    Jan13   1:34 /bin/sh /usr/local/nagioslogserver/scripts/create_backup.sh
root     11547  0.0  0.0 140176  1728 ?        S    Jan08   0:00 CROND
nagios   11552  0.0  0.0 106092  1132 ?        Ss   Jan08   0:00 /bin/sh -c /usr/bin/php -q /var/www/html/nagioslogserver/www/index.php jobs > /usr/local/nagioslogserver/var/jobs.log 2>&1
nagios   11555  0.0  0.0 218504 10976 ?        S    Jan08   0:03 /usr/bin/php -q /var/www/html/nagioslogserver/www/index.php jobs
nagios   11732  0.0  0.0 106096  1296 ?        S    Jan08   2:34 /bin/sh /usr/local/nagioslogserver/scripts/create_backup.sh
root     12996  0.0  0.0 140176  1728 ?        S    Jan18   0:00 CROND
nagios   13000  0.0  0.0 106092  1132 ?        Ss   Jan18   0:00 /bin/sh -c /usr/bin/php -q /var/www/html/nagioslogserver/www/index.php jobs > /usr/local/nagioslogserver/var/jobs.log 2>&1
nagios   13001  0.0  0.0 218248 10912 ?        S    Jan18   0:00 /usr/bin/php -q /var/www/html/nagioslogserver/www/index.php jobs
apache   13157  0.0  0.0 241072 10604 ?        S    12:38   0:02 /usr/sbin/httpd
nagios   13683  0.0  0.0 106096  1292 ?        S    Jan18   0:37 /bin/sh /usr/local/nagioslogserver/scripts/create_backup.sh
root     14317  0.0  0.0 140176  1728 ?        S    Jan20   0:00 CROND
nagios   14321  0.0  0.0 106092  1132 ?        Ss   Jan20   0:00 /bin/sh -c /usr/bin/php -q /var/www/html/nagioslogserver/www/index.php jobs > /usr/local/nagioslogserver/var/jobs.log 2>&1
nagios   14324  0.0  0.0 218248 10880 ?        S    Jan20   0:00 /usr/bin/php -q /var/www/html/nagioslogserver/www/index.php jobs
nagios   15077  0.0  0.0 106096  1292 ?        S    Jan20   0:14 /bin/sh /usr/local/nagioslogserver/scripts/create_backup.sh
root     15111  0.0  0.0 140176  1728 ?        S    Jan11   0:00 CROND
nagios   15114  0.0  0.0 106092  1136 ?        Ss   Jan11   0:00 /bin/sh -c /usr/bin/php -q /var/www/html/nagioslogserver/www/index.php jobs > /usr/local/nagioslogserver/var/jobs.log 2>&1
nagios   15116  0.0  0.0 218504 10924 ?        S    Jan11   0:02 /usr/bin/php -q /var/www/html/nagioslogserver/www/index.php jobs
nagios   15460  0.0  0.0 106096  1296 ?        S    Jan11   1:58 /bin/sh /usr/local/nagioslogserver/scripts/create_backup.sh
root     18635  0.0  0.0 140176  1728 ?        S    Jan06   0:00 CROND
nagios   18639  0.0  0.0 106092  1132 ?        Ss   Jan06   0:00 /bin/sh -c /usr/bin/php -q /var/www/html/nagioslogserver/www/index.php jobs > /usr/local/nagioslogserver/var/jobs.log 2>&1
nagios   18643  0.0  0.0 218504 11004 ?        S    Jan06   0:04 /usr/bin/php -q /var/www/html/nagioslogserver/www/index.php jobs
nagios   18762  0.0  0.0 106096  1292 ?        S    Jan06   2:57 /bin/sh /usr/local/nagioslogserver/scripts/create_backup.sh
apache   18784  0.0  0.0 241072 10420 ?        S    Jan20   0:16 /usr/sbin/httpd
root     19786  0.0  0.0 140176  1728 ?        S    Jan17   0:00 CROND
nagios   19788  0.0  0.0 106092  1132 ?        Ss   Jan17   0:00 /bin/sh -c /usr/bin/php -q /var/www/html/nagioslogserver/www/index.php jobs > /usr/local/nagioslogserver/var/jobs.log 2>&1
nagios   19792  0.0  0.0 218248 10932 ?        S    Jan17   0:01 /usr/bin/php -q /var/www/html/nagioslogserver/www/index.php jobs
root     20016  0.0  0.0 140176  1728 ?        S    Jan04   0:00 CROND
nagios   20019  0.0  0.0 106092  1132 ?        Ss   Jan04   0:00 /bin/sh -c /usr/bin/php -q /var/www/html/nagioslogserver/www/index.php jobs > /usr/local/nagioslogserver/var/jobs.log 2>&1
nagios   20020  0.0  0.0 219020 11304 ?        S    Jan04   0:04 /usr/bin/php -q /var/www/html/nagioslogserver/www/index.php jobs
nagios   20086  0.0  0.0 106096  1296 ?        S    Jan04   3:20 /bin/sh /usr/local/nagioslogserver/scripts/create_backup.sh
nagios   20439  0.0  0.0 106096  1296 ?        S    Jan17   0:49 /bin/sh /usr/local/nagioslogserver/scripts/create_backup.sh
root     20811  0.0  0.0 140176  1728 ?        S    Jan07   0:00 CROND
nagios   20815  0.0  0.0 106092  1128 ?        Ss   Jan07   0:00 /bin/sh -c /usr/bin/php -q /var/www/html/nagioslogserver/www/index.php jobs > /usr/local/nagioslogserver/var/jobs.log 2>&1
nagios   20818  0.0  0.0 218504 10992 ?        S    Jan07   0:03 /usr/bin/php -q /var/www/html/nagioslogserver/www/index.php jobs
nagios   20987  0.0  0.0 106096  1296 ?        S    Jan07   2:44 /bin/sh /usr/local/nagioslogserver/scripts/create_backup.sh
apache   21858  0.0  0.0 246628 14808 ?        S    Jan20   0:16 /usr/sbin/httpd
root     24883  0.0  0.0 140176  1728 ?        S    Jan10   0:00 CROND
nagios   24886  0.0  0.0 106092  1132 ?        Ss   Jan10   0:00 /bin/sh -c /usr/bin/php -q /var/www/html/nagioslogserver/www/index.php jobs > /usr/local/nagioslogserver/var/jobs.log 2>&1
nagios   24889  0.0  0.0 218504 10944 ?        S    Jan10   0:03 /usr/bin/php -q /var/www/html/nagioslogserver/www/index.php jobs
nagios   25198  0.0  0.0 106096  1296 ?        S    Jan10   2:10 /bin/sh /usr/local/nagioslogserver/scripts/create_backup.sh
root     27361  0.0  0.0 140176  1728 ?        S    Jan19   0:00 CROND
root     27362  0.0  0.0 140176  1728 ?        S    Jan12   0:00 CROND
nagios   27366  0.0  0.0 106092  1136 ?        Ss   Jan12   0:00 /bin/sh -c /usr/bin/php -q /var/www/html/nagioslogserver/www/index.php jobs > /usr/local/nagioslogserver/var/jobs.log 2>&1
nagios   27367  0.0  0.0 106092  1132 ?        Ss   Jan19   0:00 /bin/sh -c /usr/bin/php -q /var/www/html/nagioslogserver/www/index.php jobs > /usr/local/nagioslogserver/var/jobs.log 2>&1
nagios   27370  0.0  0.0 218504 10912 ?        S    Jan12   0:02 /usr/bin/php -q /var/www/html/nagioslogserver/www/index.php jobs
nagios   27372  0.0  0.0 218248 10896 ?        S    Jan19   0:00 /usr/bin/php -q /var/www/html/nagioslogserver/www/index.php jobs
nagios   27735  0.0  0.0 106096  1292 ?        S    Jan12   1:46 /bin/sh /usr/local/nagioslogserver/scripts/create_backup.sh
nagios   28088  0.0  0.0 106096  1296 ?        S    Jan19   0:25 /bin/sh /usr/local/nagioslogserver/scripts/create_backup.sh
root     32334  0.0  0.0 140176  1728 ?        S    Jan03   0:00 CROND
nagios   32338  0.0  0.0 106092  1132 ?        Ss   Jan03   0:00 /bin/sh -c /usr/bin/php -q /var/www/html/nagioslogserver/www/index.php jobs > /usr/local/nagioslogserver/var/jobs.log 2>&1
nagios   32340  0.0  0.0 219020 11316 ?        S    Jan03   0:05 /usr/bin/php -q /var/www/html/nagioslogserver/www/index.php jobs
nagios   32390  0.0  0.0 106096  1292 ?        S    Jan03   3:31 /bin/sh /usr/local/nagioslogserver/scripts/create_backup.sh
root     35912  0.0  0.0 140176  1728 ?        S    Jan05   0:00 CROND
nagios   35918  0.0  0.0 106092  1128 ?        Ss   Jan05   0:00 /bin/sh -c /usr/bin/php -q /var/www/html/nagioslogserver/www/index.php jobs > /usr/local/nagioslogserver/var/jobs.log 2>&1
nagios   35921  0.0  0.0 219020 11288 ?        S    Jan05   0:04 /usr/bin/php -q /var/www/html/nagioslogserver/www/index.php jobs
nagios   36003  0.0  0.0 106096  1292 ?        S    Jan05   3:07 /bin/sh /usr/local/nagioslogserver/scripts/create_backup.sh
root     36918  0.0  0.0 140176  1728 ?        S    Jan02   0:00 CROND
nagios   36921  0.0  0.0 106092  1132 ?        Ss   Jan02   0:00 /bin/sh -c /usr/bin/php -q /var/www/html/nagioslogserver/www/index.php jobs > /usr/local/nagioslogserver/var/jobs.log 2>&1
nagios   36923  0.0  0.0 219020 11332 ?        S    Jan02   0:05 /usr/bin/php -q /var/www/html/nagioslogserver/www/index.php jobs
nagios   36950  0.0  0.0 106096  1296 ?        S    Jan02   3:45 /bin/sh /usr/local/nagioslogserver/scripts/create_backup.sh
nagios   37015 14.0 58.6 49121612 9548632 ?    SLl   2014 4459:41 /usr/bin/java -Xms8g -Xmx8g -Xss256k -Djava.awt.headless=true -XX:+UseParNewGC -XX:+UseConcMarkSweepGC -XX:CMSInitiatingOccupancyFraction=75 -XX:
root     39376  0.0  0.0 140176  1728 ?        S    Jan16   0:00 CROND
nagios   39380  0.0  0.0 106092  1128 ?        Ss   Jan16   0:00 /bin/sh -c /usr/bin/php -q /var/www/html/nagioslogserver/www/index.php jobs > /usr/local/nagioslogserver/var/jobs.log 2>&1
nagios   39382  0.0  0.0 218248 10940 ?        S    Jan16   0:01 /usr/bin/php -q /var/www/html/nagioslogserver/www/index.php jobs
nagios   39938  0.0  0.0 106096  1292 ?        S    Jan16   1:00 /bin/sh /usr/local/nagioslogserver/scripts/create_backup.sh
root     47038  0.0  0.0 140176  1728 ?        S    Jan14   0:00 CROND
nagios   47042  0.0  0.0 106092  1132 ?        Ss   Jan14   0:00 /bin/sh -c /usr/bin/php -q /var/www/html/nagioslogserver/www/index.php jobs > /usr/local/nagioslogserver/var/jobs.log 2>&1
nagios   47045  0.0  0.0 218248 10980 ?        S    Jan14   0:02 /usr/bin/php -q /var/www/html/nagioslogserver/www/index.php jobs
nagios   47545  0.0  0.0 106096  1296 ?        S    Jan14   1:24 /bin/sh /usr/local/nagioslogserver/scripts/create_backup.sh
root     47765  0.0  0.0 140176  1728 ?        S    11:00   0:00 CROND
nagios   47769  0.0  0.0 106092  1132 ?        Ss   11:00   0:00 /bin/sh -c /usr/bin/php -q /var/www/html/nagioslogserver/www/index.php jobs > /usr/local/nagioslogserver/var/jobs.log 2>&1
nagios   47773  0.0  0.0 218248 11124 ?        S    11:00   0:00 /usr/bin/php -q /var/www/html/nagioslogserver/www/index.php jobs
root     48439  0.0  0.0 140176  1728 ?        S    16:03   0:00 CROND
nagios   48440  0.0  0.0 106092  1132 ?        Ss   16:03   0:00 /bin/sh -c /usr/bin/php -q /var/www/html/nagioslogserver/www/index.php poller > /usr/local/nagioslogserver/var/poller.log 2>&1
nagios   48443  0.0  0.0 216176 10640 ?        S    16:03   0:00 /usr/bin/php -q /var/www/html/nagioslogserver/www/index.php poller
nagios   48586  0.0  0.0 106096  1308 ?        S    11:00   0:02 /bin/sh /usr/local/nagioslogserver/scripts/create_backup.sh

Output of ll /store/backups/nagioslogserver:

Code:

drwxrwxrwx 2 nagios nagios    4096 Jan  1 11:00 1420128007
drwxrwxrwx 2 nagios nagios    4096 Jan  2 11:00 1420214412
drwxrwxrwx 2 nagios nagios    4096 Jan  3 11:00 1420300816
drwxrwxrwx 2 nagios nagios    4096 Jan  4 11:00 1420387216
drwxrwxrwx 2 nagios nagios    4096 Jan  5 11:00 1420473617
drwxrwxrwx 2 nagios nagios    4096 Jan  6 11:00 1420560022
drwxrwxrwx 2 nagios nagios    4096 Jan  7 11:00 1420646426
drwxrwxrwx 2 nagios nagios    4096 Jan  8 11:00 1420732827
drwxrwxrwx 2 nagios nagios    4096 Jan  9 11:00 1420819232
drwxrwxrwx 2 nagios nagios    4096 Jan 10 11:00 1420905636
drwxrwxrwx 2 nagios nagios    4096 Jan 11 11:00 1420992036
drwxrwxrwx 2 nagios nagios    4096 Jan 12 11:00 1421078437
drwxrwxrwx 2 nagios nagios    4096 Jan 13 11:00 1421164842
drwxrwxrwx 2 nagios nagios    4096 Jan 14 11:00 1421251241
drwxrwxrwx 2 nagios nagios    4096 Jan 15 11:00 1421337641
drwxrwxrwx 2 nagios nagios    4096 Jan 16 11:00 1421424043
drwxrwxrwx 2 nagios nagios    4096 Jan 17 11:00 1421510447
drwxrwxrwx 2 nagios nagios    4096 Jan 18 11:00 1421596846
drwxrwxrwx 2 nagios nagios    4096 Jan 19 11:00 1421683246
drwxrwxrwx 2 nagios nagios    4096 Jan 20 11:00 1421769646
drwxrwxrwx 2 nagios nagios    4096 Jan 21 11:00 1421856046
-rw-r--r-- 1 nagios nagios 5319621 Dec 30 11:00 nagioslogserver.2014-12-30.1419955201.tar.gz
-rw-r--r-- 1 nagios nagios 5323235 Dec 31 11:00 nagioslogserver.2014-12-31.1420041602.tar.gz

All the state.json files contain the same thing:

Code:

{"count":2,"states":[{"mode":"export","started":"2015-01-01T16:00:07.430Z","path":"file:///store/backups/nagioslogserver/1420128007/nagioslogserver.tar.gz","node_name":"3cb77924-3178-4ba3-8952-1f329efab29c"},{"mode":"export","started":"2015-01-14T16:00:41.732Z","path":"file:///store/backups/nagioslogserver/1421251241/nagioslogserver.tar.gz","node_name":"3cb77924-3178-4ba3-8952-1f329efab29c"}]}

Where should I look from here to figure out why these processes are not cleaning up?
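
For anyone comparing the backup directories against state.json: the directory names look like Unix epoch seconds, and the two that appear in state.json (1420128007 and 1421251241) line up to the second with the "started" values. A minimal sketch for checking that, assuming GNU date (for -d @N) and a state.json in the single-line format shown above. epoch_to_utc and state_started are hypothetical helper names, not part of NLS:

```shell
# Convert an epoch-seconds directory name to a UTC timestamp (GNU date assumed).
epoch_to_utc() {
    date -u -d "@$1" +%Y-%m-%dT%H:%M:%SZ
}

# Print every "started" value found in state.json content read from stdin.
state_started() {
    grep -o '"started":"[^"]*"' | cut -d'"' -f4
}

# Example usage (the state.json location is a placeholder; adjust it to your system):
#   epoch_to_utc 1420128007          # prints 2015-01-01T16:00:07Z
#   state_started < /path/to/state.json
```

A directory whose converted name has no matching "started" entry may be worth a closer look, though nothing in this thread confirms how create_backup.sh uses that file.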

Re: Logs stop coming in

Posted: Wed Jan 21, 2015 4:59 pm
by cmerchant
There might not be enough space to finish the backup (tar.gz). Could you run the following command on your NLS server:

Code:

df -h
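
Since df -h reports every mounted filesystem, it can be quicker to point df at the backup directory itself. A rough sketch, assuming the backups live under /store/backups/nagioslogserver as in the listing earlier in the thread. avail_kb is a hypothetical helper name:

```shell
# Read `df -Pk` output on stdin and print the "Available" column (in KiB)
# from the first filesystem line. -P forces one line per filesystem,
# -k forces 1024-byte blocks.
avail_kb() {
    awk 'NR == 2 { print $4 }'
}

# Example usage:
#   df -Pk /store/backups/nagioslogserver | avail_kb
```

Comparing that number with the size of a recent completed archive gives a rough idea of whether space could be the limiting factor.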

Re: Logs stop coming in

Posted: Thu Jan 22, 2015 3:24 pm
by globalgiving
Plenty of space:

Code:

$ df -h
Filesystem            Size  Used Avail Use% Mounted on
/dev/sda3             908G  127G  736G  15% /
tmpfs                 7.8G     0  7.8G   0% /dev/shm
/dev/sda1             243M   57M  173M  25% /boot

Re: Logs stop coming in

Posted: Thu Jan 22, 2015 4:00 pm
by scottwilkerson
Let's try this:

Code:

service elasticsearch restart

Re: Logs stop coming in

Posted: Thu Jan 22, 2015 4:18 pm
by globalgiving
So that did cause all of the backup processes to complete. The /store/backups/nagioslogserver directory now has completed backups in it. I will post tomorrow if tonight's backup gets stuck.
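
To check whether a later backup gets stuck again, one option is to look for create_backup.sh processes that have outlived the daily run. A minimal sketch, assuming procps-style ps aux output like the listing earlier in the thread, where the START column shows HH:MM only for processes started within roughly the last 24 hours. stale_backups is a hypothetical helper name:

```shell
# Read `ps aux` output on stdin and print "PID START" for every
# create_backup.sh process whose START column is not in HH:MM form,
# i.e. one that (by the assumption above) is more than about a day old.
stale_backups() {
    awk '/create_backup\.sh/ && $9 !~ /^[0-9][0-9]:[0-9][0-9]$/ { print $2, $9 }'
}

# Example usage:
#   ps aux | stale_backups
```

This is only a heuristic based on how START is displayed. An empty result means no backup script older than about a day was found, not that the current run is healthy.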

Re: Logs stop coming in

Posted: Fri Jan 23, 2015 10:51 am
by scottwilkerson
We have seen this in-house too and are looking to resolve it in a future version.