Logs stop coming in

tmcdonald
Posts: 9117
Joined: Mon Sep 23, 2013 8:40 am

Re: Logs stop coming in

Post by tmcdonald »

I am trying to confirm whether you and the OP in this topic are the same person. We can't close the topic if the OP's issue has not been resolved.
Former Nagios employee
globalgiving
Posts: 25
Joined: Thu Aug 28, 2014 9:57 am
Location: Washington, DC

Re: Logs stop coming in

Post by globalgiving »

This is OP. The other poster was not me or someone in my organization.

I increased the heap size and set the max locked memory as per the response I got, way back on the first page of this thread.

That did in fact solve my problem: logs have not stopped coming in since.

However, checking the system running NLS today, I see a lot of create_backup.sh processes running. They have been building up for a long time. It has not reached the point of causing elasticsearch to stop working, but I am guessing it will eventually get there.

Process list on the NLS server:

Code: Select all

nagios     707  0.0  0.0 100904   568 ?        S    16:04   0:00 sleep 5
nagios     708  0.0  0.0 100904   568 ?        S    16:04   0:00 sleep 5
nagios     709  0.0  0.0 100904   568 ?        S    16:04   0:00 sleep 5
nagios     710  0.0  0.0 100904   568 ?        S    16:04   0:00 sleep 5
nagios     711  0.0  0.0 100904   572 ?        S    16:04   0:00 sleep 5
nagios     712  0.0  0.0 100904   568 ?        S    16:04   0:00 sleep 5
nagios     713  0.0  0.0 100904   572 ?        S    16:04   0:00 sleep 5
nagios     714  0.0  0.0 100904   568 ?        S    16:04   0:00 sleep 5
nagios     715  0.0  0.0 100904   572 ?        S    16:04   0:00 sleep 5
nagios     716  0.0  0.0 100904   568 ?        S    16:04   0:00 sleep 5
root       733  0.0  0.0      0     0 ?        S     2014   7:48 [jbd2/sda3-8]
root       734  0.0  0.0      0     0 ?        S     2014   0:00 [ext4-dio-unwrit]
nagios     752  0.0  0.0 100904   572 ?        S    16:04   0:00 sleep 5
nagios     753  0.0  0.0 100904   568 ?        S    16:04   0:00 sleep 5
nagios     754  0.0  0.0 100904   568 ?        S    16:04   0:00 sleep 5
nagios     755  0.0  0.0 100904   568 ?        S    16:04   0:00 sleep 5
nagios     756  0.0  0.0 100904   572 ?        S    16:04   0:00 sleep 5
nagios     757  0.0  0.0 100904   568 ?        S    16:04   0:00 sleep 5
nagios     758  0.0  0.0 100904   568 ?        S    16:04   0:00 sleep 5
nagios     759  0.0  0.0 100904   568 ?        S    16:04   0:00 sleep 5
nagios     760  0.0  0.0 100904   568 ?        S    16:04   0:00 sleep 5
nagios     761  0.0  0.0 100904   568 ?        S    16:04   0:00 sleep 5
nagios     762  0.0  0.0 100904   568 ?        S    16:04   0:00 sleep 5
root       764  0.0  0.0 140176  1728 ?        S    16:04   0:00 CROND
root       765  0.0  0.0 140176  1728 ?        S    16:04   0:00 CROND
nagios     766  0.0  0.0 106092  1136 ?        Ss   16:04   0:00 /bin/sh -c /usr/bin/php -q /var/www/html/nagioslogserver/www/index.php poller > /usr/local/nagioslogserver/var/poller.log 2>&1
nagios     767  2.0  0.0 216176 10636 ?        S    16:04   0:00 /usr/bin/php -q /var/www/html/nagioslogserver/www/index.php poller
nagios     768  0.0  0.0 106092  1132 ?        Ss   16:04   0:00 /bin/sh -c /usr/bin/php -q /var/www/html/nagioslogserver/www/index.php jobs > /usr/local/nagioslogserver/var/jobs.log 2>&1
nagios     769  2.0  0.0 216176 10708 ?        S    16:04   0:00 /usr/bin/php -q /var/www/html/nagioslogserver/www/index.php jobs
root       822  0.0  0.0  10736   784 ?        S<s   2014   0:00 /sbin/udevd -d
root      1596  0.0  0.0      0     0 ?        S     2014   0:00 [kjournald]
root      1630  0.0  0.0      0     0 ?        S     2014  10:29 [flush-8:0]
root      1640  0.0  0.0      0     0 ?        S     2014   0:52 [kauditd]
apache    1706  0.0  0.0 246188 15564 ?        S    Jan18   0:17 /usr/sbin/httpd
apache    1707  0.0  0.1 247164 16420 ?        S    Jan18   0:16 /usr/sbin/httpd
apache    1708  0.0  0.0 244768 13892 ?        S    Jan18   0:16 /usr/sbin/httpd
apache    1709  0.0  0.0 244772 13832 ?        S    Jan18   0:16 /usr/sbin/httpd
apache    1711  0.0  0.0 246592 14264 ?        S    Jan18   0:16 /usr/sbin/httpd
apache    1712  0.0  0.0 243236 12520 ?        S    Jan18   0:16 /usr/sbin/httpd
apache    1713  0.0  0.0 241592 11176 ?        S    Jan18   0:16 /usr/sbin/httpd
apache    1714  0.0  0.0 243256 11556 ?        S    Jan18   0:16 /usr/sbin/httpd
root      1772  0.0  0.0      0     0 ?        S     2014  12:28 [bond0]
root      1951  0.0  0.0  93200   872 ?        S<sl  2014   2:55 auditd
root      1967  0.0  0.0 485592  8308 ?        Sl    2014   1:57 /sbin/rsyslogd -i /var/run/syslogd.pid -c 5
root      1981  0.0  0.0  10876   696 ?        Ss    2014  21:16 irqbalance --pid=/var/run/irqbalance.pid
rpc       1995  0.0  0.0  18976   816 ?        Ss    2014   0:04 rpcbind
rpcuser   2013  0.0  0.0  23348  1192 ?        Ss    2014   0:00 rpc.statd
dbus      2127  0.0  0.0  21404   684 ?        Ss    2014   0:00 dbus-daemon --system
root      2171  0.0  0.0   4080   576 ?        Ss    2014   0:00 /usr/sbin/acpid
root      2243  0.0  0.0 385736  3368 ?        Ssl   2014   0:42 automount --pid-file /var/run/autofs.pid
root      2260  0.0  0.0 197948  4700 ?        S     2014  16:11 /usr/sbin/snmpd -LS0-6d -Lf /dev/null -p /var/run/snmpd.pid
root      2272  0.0  0.0  66616  1136 ?        Ss    2014   0:00 /usr/sbin/sshd
root      2280  0.0  0.0  22180   848 ?        Ss    2014   0:00 xinetd -stayalive -pidfile /var/run/xinetd.pid
root      2418  0.0  0.0  83060  2496 ?        Ss    2014   1:12 sendmail: accepting connections
smmsp     2426  0.0  0.0  78656  2004 ?        Ss    2014   0:00 sendmail: Queue runner@01:00:00 for /var/spool/clientmqueue
root      2449  0.0  0.0 110316   944 ?        Ss    2014   0:00 /usr/sbin/abrtd
root      2491  0.0  0.0 238860  7880 ?        Ss    2014   2:02 /usr/sbin/httpd
root      2499  0.0  0.0 117292  1260 ?        Ss    2014   0:36 crond
root      2518  0.0  0.0 131172  1372 ?        SN    2014   0:00 runuser -s /bin/sh -c exec /usr/local/nagioslogserver/logstash/bin/logstash agent -f /usr/local/nagioslogserver/logstash/etc/conf.d -l /var/log/lo
nagios    2520  8.1  5.0 14044052 817496 ?     SNsl  2014 4830:27 /usr/bin/java -Djava.io.tmpdir=/usr/local/nagioslogserver/tmp -Xmx500m -XX:+UseParNewGC -XX:+UseConcMarkSweepGC -Djava.awt.headless=true -XX:CMSI
root      2581  0.0  0.0 342036  7580 ?        Sl    2014  12:55 /usr/bin/python /usr/bin/fail2ban-server -b -s /var/run/fail2ban/fail2ban.sock
root      2583  0.0  0.0   9364  1040 ?        S     2014   1:23 /usr/libexec/gam_server
root      2598  0.0  0.0  21540   484 ?        Ss    2014   0:00 /usr/sbin/atd
root      2612  0.0  0.0   4064   528 tty1     Ss+   2014   0:00 /sbin/mingetty /dev/tty1
root      2614  0.0  0.0   4064   528 tty2     Ss+   2014   0:00 /sbin/mingetty /dev/tty2
root      2616  0.0  0.0   4064   528 tty3     Ss+   2014   0:00 /sbin/mingetty /dev/tty3
root      2618  0.0  0.0   4064   532 tty4     Ss+   2014   0:00 /sbin/mingetty /dev/tty4
root      2620  0.0  0.0   4064   532 tty5     Ss+   2014   0:00 /sbin/mingetty /dev/tty5
root      2622  0.0  0.0   4064   528 tty6     Ss+   2014   0:00 /sbin/mingetty /dev/tty6
root      2624  0.0  0.0  10732   748 ?        S<    2014   0:00 /sbin/udevd -d
root      2625  0.0  0.0  10732   736 ?        S<    2014   0:00 /sbin/udevd -d
root      7540  0.0  0.0 140176  1728 ?        S    Jan09   0:00 CROND
nagios    7545  0.0  0.0 106092  1132 ?        Ss   Jan09   0:00 /bin/sh -c /usr/bin/php -q /var/www/html/nagioslogserver/www/index.php jobs > /usr/local/nagioslogserver/var/jobs.log 2>&1
nagios    7546  0.0  0.0 218504 10960 ?        S    Jan09   0:03 /usr/bin/php -q /var/www/html/nagioslogserver/www/index.php jobs
nagios    7779  0.0  0.0 106096  1292 ?        S    Jan09   2:22 /bin/sh /usr/local/nagioslogserver/scripts/create_backup.sh
root      8271  0.0  0.0 140176  1728 ?        S    Jan01   0:00 CROND
nagios    8277  0.0  0.0 106092  1136 ?        Ss   Jan01   0:00 /bin/sh -c /usr/bin/php -q /var/www/html/nagioslogserver/www/index.php jobs > /usr/local/nagioslogserver/var/jobs.log 2>&1
nagios    8278  0.0  0.0 219020 11356 ?        S    Jan01   0:05 /usr/bin/php -q /var/www/html/nagioslogserver/www/index.php jobs
nagios    8299  0.0  0.0 106096  1292 ?        S    Jan01   3:53 /bin/sh /usr/local/nagioslogserver/scripts/create_backup.sh
root      9399  0.0  0.0 140176  1728 ?        S    Jan15   0:00 CROND
nagios    9401  0.0  0.0 106092  1136 ?        Ss   Jan15   0:00 /bin/sh -c /usr/bin/php -q /var/www/html/nagioslogserver/www/index.php jobs > /usr/local/nagioslogserver/var/jobs.log 2>&1
nagios    9403  0.0  0.0 218248 10960 ?        S    Jan15   0:01 /usr/bin/php -q /var/www/html/nagioslogserver/www/index.php jobs
nagios    9924  0.0  0.0 106096  1292 ?        S    Jan15   1:11 /bin/sh /usr/local/nagioslogserver/scripts/create_backup.sh
root     11056  0.0  0.0 140176  1728 ?        S    Jan13   0:00 CROND
nagios   11058  0.0  0.0 106092  1132 ?        Ss   Jan13   0:00 /bin/sh -c /usr/bin/php -q /var/www/html/nagioslogserver/www/index.php jobs > /usr/local/nagioslogserver/var/jobs.log 2>&1
nagios   11064  0.0  0.0 218248 10744 ?        S    Jan13   0:02 /usr/bin/php -q /var/www/html/nagioslogserver/www/index.php jobs
nagios   11508  0.0  0.0 106096  1292 ?        S    Jan13   1:34 /bin/sh /usr/local/nagioslogserver/scripts/create_backup.sh
root     11547  0.0  0.0 140176  1728 ?        S    Jan08   0:00 CROND
nagios   11552  0.0  0.0 106092  1132 ?        Ss   Jan08   0:00 /bin/sh -c /usr/bin/php -q /var/www/html/nagioslogserver/www/index.php jobs > /usr/local/nagioslogserver/var/jobs.log 2>&1
nagios   11555  0.0  0.0 218504 10976 ?        S    Jan08   0:03 /usr/bin/php -q /var/www/html/nagioslogserver/www/index.php jobs
nagios   11732  0.0  0.0 106096  1296 ?        S    Jan08   2:34 /bin/sh /usr/local/nagioslogserver/scripts/create_backup.sh
root     12996  0.0  0.0 140176  1728 ?        S    Jan18   0:00 CROND
nagios   13000  0.0  0.0 106092  1132 ?        Ss   Jan18   0:00 /bin/sh -c /usr/bin/php -q /var/www/html/nagioslogserver/www/index.php jobs > /usr/local/nagioslogserver/var/jobs.log 2>&1
nagios   13001  0.0  0.0 218248 10912 ?        S    Jan18   0:00 /usr/bin/php -q /var/www/html/nagioslogserver/www/index.php jobs
apache   13157  0.0  0.0 241072 10604 ?        S    12:38   0:02 /usr/sbin/httpd
nagios   13683  0.0  0.0 106096  1292 ?        S    Jan18   0:37 /bin/sh /usr/local/nagioslogserver/scripts/create_backup.sh
root     14317  0.0  0.0 140176  1728 ?        S    Jan20   0:00 CROND
nagios   14321  0.0  0.0 106092  1132 ?        Ss   Jan20   0:00 /bin/sh -c /usr/bin/php -q /var/www/html/nagioslogserver/www/index.php jobs > /usr/local/nagioslogserver/var/jobs.log 2>&1
nagios   14324  0.0  0.0 218248 10880 ?        S    Jan20   0:00 /usr/bin/php -q /var/www/html/nagioslogserver/www/index.php jobs
nagios   15077  0.0  0.0 106096  1292 ?        S    Jan20   0:14 /bin/sh /usr/local/nagioslogserver/scripts/create_backup.sh
root     15111  0.0  0.0 140176  1728 ?        S    Jan11   0:00 CROND
nagios   15114  0.0  0.0 106092  1136 ?        Ss   Jan11   0:00 /bin/sh -c /usr/bin/php -q /var/www/html/nagioslogserver/www/index.php jobs > /usr/local/nagioslogserver/var/jobs.log 2>&1
nagios   15116  0.0  0.0 218504 10924 ?        S    Jan11   0:02 /usr/bin/php -q /var/www/html/nagioslogserver/www/index.php jobs
nagios   15460  0.0  0.0 106096  1296 ?        S    Jan11   1:58 /bin/sh /usr/local/nagioslogserver/scripts/create_backup.sh
root     18635  0.0  0.0 140176  1728 ?        S    Jan06   0:00 CROND
nagios   18639  0.0  0.0 106092  1132 ?        Ss   Jan06   0:00 /bin/sh -c /usr/bin/php -q /var/www/html/nagioslogserver/www/index.php jobs > /usr/local/nagioslogserver/var/jobs.log 2>&1
nagios   18643  0.0  0.0 218504 11004 ?        S    Jan06   0:04 /usr/bin/php -q /var/www/html/nagioslogserver/www/index.php jobs
nagios   18762  0.0  0.0 106096  1292 ?        S    Jan06   2:57 /bin/sh /usr/local/nagioslogserver/scripts/create_backup.sh
apache   18784  0.0  0.0 241072 10420 ?        S    Jan20   0:16 /usr/sbin/httpd
root     19786  0.0  0.0 140176  1728 ?        S    Jan17   0:00 CROND
nagios   19788  0.0  0.0 106092  1132 ?        Ss   Jan17   0:00 /bin/sh -c /usr/bin/php -q /var/www/html/nagioslogserver/www/index.php jobs > /usr/local/nagioslogserver/var/jobs.log 2>&1
nagios   19792  0.0  0.0 218248 10932 ?        S    Jan17   0:01 /usr/bin/php -q /var/www/html/nagioslogserver/www/index.php jobs
root     20016  0.0  0.0 140176  1728 ?        S    Jan04   0:00 CROND
nagios   20019  0.0  0.0 106092  1132 ?        Ss   Jan04   0:00 /bin/sh -c /usr/bin/php -q /var/www/html/nagioslogserver/www/index.php jobs > /usr/local/nagioslogserver/var/jobs.log 2>&1
nagios   20020  0.0  0.0 219020 11304 ?        S    Jan04   0:04 /usr/bin/php -q /var/www/html/nagioslogserver/www/index.php jobs
nagios   20086  0.0  0.0 106096  1296 ?        S    Jan04   3:20 /bin/sh /usr/local/nagioslogserver/scripts/create_backup.sh
nagios   20439  0.0  0.0 106096  1296 ?        S    Jan17   0:49 /bin/sh /usr/local/nagioslogserver/scripts/create_backup.sh
root     20811  0.0  0.0 140176  1728 ?        S    Jan07   0:00 CROND
nagios   20815  0.0  0.0 106092  1128 ?        Ss   Jan07   0:00 /bin/sh -c /usr/bin/php -q /var/www/html/nagioslogserver/www/index.php jobs > /usr/local/nagioslogserver/var/jobs.log 2>&1
nagios   20818  0.0  0.0 218504 10992 ?        S    Jan07   0:03 /usr/bin/php -q /var/www/html/nagioslogserver/www/index.php jobs
nagios   20987  0.0  0.0 106096  1296 ?        S    Jan07   2:44 /bin/sh /usr/local/nagioslogserver/scripts/create_backup.sh
apache   21858  0.0  0.0 246628 14808 ?        S    Jan20   0:16 /usr/sbin/httpd
root     24883  0.0  0.0 140176  1728 ?        S    Jan10   0:00 CROND
nagios   24886  0.0  0.0 106092  1132 ?        Ss   Jan10   0:00 /bin/sh -c /usr/bin/php -q /var/www/html/nagioslogserver/www/index.php jobs > /usr/local/nagioslogserver/var/jobs.log 2>&1
nagios   24889  0.0  0.0 218504 10944 ?        S    Jan10   0:03 /usr/bin/php -q /var/www/html/nagioslogserver/www/index.php jobs
nagios   25198  0.0  0.0 106096  1296 ?        S    Jan10   2:10 /bin/sh /usr/local/nagioslogserver/scripts/create_backup.sh
root     27361  0.0  0.0 140176  1728 ?        S    Jan19   0:00 CROND
root     27362  0.0  0.0 140176  1728 ?        S    Jan12   0:00 CROND
nagios   27366  0.0  0.0 106092  1136 ?        Ss   Jan12   0:00 /bin/sh -c /usr/bin/php -q /var/www/html/nagioslogserver/www/index.php jobs > /usr/local/nagioslogserver/var/jobs.log 2>&1
nagios   27367  0.0  0.0 106092  1132 ?        Ss   Jan19   0:00 /bin/sh -c /usr/bin/php -q /var/www/html/nagioslogserver/www/index.php jobs > /usr/local/nagioslogserver/var/jobs.log 2>&1
nagios   27370  0.0  0.0 218504 10912 ?        S    Jan12   0:02 /usr/bin/php -q /var/www/html/nagioslogserver/www/index.php jobs
nagios   27372  0.0  0.0 218248 10896 ?        S    Jan19   0:00 /usr/bin/php -q /var/www/html/nagioslogserver/www/index.php jobs
nagios   27735  0.0  0.0 106096  1292 ?        S    Jan12   1:46 /bin/sh /usr/local/nagioslogserver/scripts/create_backup.sh
nagios   28088  0.0  0.0 106096  1296 ?        S    Jan19   0:25 /bin/sh /usr/local/nagioslogserver/scripts/create_backup.sh
root     32334  0.0  0.0 140176  1728 ?        S    Jan03   0:00 CROND
nagios   32338  0.0  0.0 106092  1132 ?        Ss   Jan03   0:00 /bin/sh -c /usr/bin/php -q /var/www/html/nagioslogserver/www/index.php jobs > /usr/local/nagioslogserver/var/jobs.log 2>&1
nagios   32340  0.0  0.0 219020 11316 ?        S    Jan03   0:05 /usr/bin/php -q /var/www/html/nagioslogserver/www/index.php jobs
nagios   32390  0.0  0.0 106096  1292 ?        S    Jan03   3:31 /bin/sh /usr/local/nagioslogserver/scripts/create_backup.sh
root     35912  0.0  0.0 140176  1728 ?        S    Jan05   0:00 CROND
nagios   35918  0.0  0.0 106092  1128 ?        Ss   Jan05   0:00 /bin/sh -c /usr/bin/php -q /var/www/html/nagioslogserver/www/index.php jobs > /usr/local/nagioslogserver/var/jobs.log 2>&1
nagios   35921  0.0  0.0 219020 11288 ?        S    Jan05   0:04 /usr/bin/php -q /var/www/html/nagioslogserver/www/index.php jobs
nagios   36003  0.0  0.0 106096  1292 ?        S    Jan05   3:07 /bin/sh /usr/local/nagioslogserver/scripts/create_backup.sh
root     36918  0.0  0.0 140176  1728 ?        S    Jan02   0:00 CROND
nagios   36921  0.0  0.0 106092  1132 ?        Ss   Jan02   0:00 /bin/sh -c /usr/bin/php -q /var/www/html/nagioslogserver/www/index.php jobs > /usr/local/nagioslogserver/var/jobs.log 2>&1
nagios   36923  0.0  0.0 219020 11332 ?        S    Jan02   0:05 /usr/bin/php -q /var/www/html/nagioslogserver/www/index.php jobs
nagios   36950  0.0  0.0 106096  1296 ?        S    Jan02   3:45 /bin/sh /usr/local/nagioslogserver/scripts/create_backup.sh
nagios   37015 14.0 58.6 49121612 9548632 ?    SLl   2014 4459:41 /usr/bin/java -Xms8g -Xmx8g -Xss256k -Djava.awt.headless=true -XX:+UseParNewGC -XX:+UseConcMarkSweepGC -XX:CMSInitiatingOccupancyFraction=75 -XX:
root     39376  0.0  0.0 140176  1728 ?        S    Jan16   0:00 CROND
nagios   39380  0.0  0.0 106092  1128 ?        Ss   Jan16   0:00 /bin/sh -c /usr/bin/php -q /var/www/html/nagioslogserver/www/index.php jobs > /usr/local/nagioslogserver/var/jobs.log 2>&1
nagios   39382  0.0  0.0 218248 10940 ?        S    Jan16   0:01 /usr/bin/php -q /var/www/html/nagioslogserver/www/index.php jobs
nagios   39938  0.0  0.0 106096  1292 ?        S    Jan16   1:00 /bin/sh /usr/local/nagioslogserver/scripts/create_backup.sh
root     47038  0.0  0.0 140176  1728 ?        S    Jan14   0:00 CROND
nagios   47042  0.0  0.0 106092  1132 ?        Ss   Jan14   0:00 /bin/sh -c /usr/bin/php -q /var/www/html/nagioslogserver/www/index.php jobs > /usr/local/nagioslogserver/var/jobs.log 2>&1
nagios   47045  0.0  0.0 218248 10980 ?        S    Jan14   0:02 /usr/bin/php -q /var/www/html/nagioslogserver/www/index.php jobs
nagios   47545  0.0  0.0 106096  1296 ?        S    Jan14   1:24 /bin/sh /usr/local/nagioslogserver/scripts/create_backup.sh
root     47765  0.0  0.0 140176  1728 ?        S    11:00   0:00 CROND
nagios   47769  0.0  0.0 106092  1132 ?        Ss   11:00   0:00 /bin/sh -c /usr/bin/php -q /var/www/html/nagioslogserver/www/index.php jobs > /usr/local/nagioslogserver/var/jobs.log 2>&1
nagios   47773  0.0  0.0 218248 11124 ?        S    11:00   0:00 /usr/bin/php -q /var/www/html/nagioslogserver/www/index.php jobs
root     48439  0.0  0.0 140176  1728 ?        S    16:03   0:00 CROND
nagios   48440  0.0  0.0 106092  1132 ?        Ss   16:03   0:00 /bin/sh -c /usr/bin/php -q /var/www/html/nagioslogserver/www/index.php poller > /usr/local/nagioslogserver/var/poller.log 2>&1
nagios   48443  0.0  0.0 216176 10640 ?        S    16:03   0:00 /usr/bin/php -q /var/www/html/nagioslogserver/www/index.php poller
nagios   48586  0.0  0.0 106096  1308 ?        S    11:00   0:02 /bin/sh /usr/local/nagioslogserver/scripts/create_backup.sh
Output of ll /store/backups/nagioslogserver:

Code: Select all

drwxrwxrwx 2 nagios nagios    4096 Jan  1 11:00 1420128007
drwxrwxrwx 2 nagios nagios    4096 Jan  2 11:00 1420214412
drwxrwxrwx 2 nagios nagios    4096 Jan  3 11:00 1420300816
drwxrwxrwx 2 nagios nagios    4096 Jan  4 11:00 1420387216
drwxrwxrwx 2 nagios nagios    4096 Jan  5 11:00 1420473617
drwxrwxrwx 2 nagios nagios    4096 Jan  6 11:00 1420560022
drwxrwxrwx 2 nagios nagios    4096 Jan  7 11:00 1420646426
drwxrwxrwx 2 nagios nagios    4096 Jan  8 11:00 1420732827
drwxrwxrwx 2 nagios nagios    4096 Jan  9 11:00 1420819232
drwxrwxrwx 2 nagios nagios    4096 Jan 10 11:00 1420905636
drwxrwxrwx 2 nagios nagios    4096 Jan 11 11:00 1420992036
drwxrwxrwx 2 nagios nagios    4096 Jan 12 11:00 1421078437
drwxrwxrwx 2 nagios nagios    4096 Jan 13 11:00 1421164842
drwxrwxrwx 2 nagios nagios    4096 Jan 14 11:00 1421251241
drwxrwxrwx 2 nagios nagios    4096 Jan 15 11:00 1421337641
drwxrwxrwx 2 nagios nagios    4096 Jan 16 11:00 1421424043
drwxrwxrwx 2 nagios nagios    4096 Jan 17 11:00 1421510447
drwxrwxrwx 2 nagios nagios    4096 Jan 18 11:00 1421596846
drwxrwxrwx 2 nagios nagios    4096 Jan 19 11:00 1421683246
drwxrwxrwx 2 nagios nagios    4096 Jan 20 11:00 1421769646
drwxrwxrwx 2 nagios nagios    4096 Jan 21 11:00 1421856046
-rw-r--r-- 1 nagios nagios 5319621 Dec 30 11:00 nagioslogserver.2014-12-30.1419955201.tar.gz
-rw-r--r-- 1 nagios nagios 5323235 Dec 31 11:00 nagioslogserver.2014-12-31.1420041602.tar.gz
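To pick out the backups that were never packed, a rough sketch like the following could list the leftover directories. It assumes, based on the listing above, that a purely numeric directory name is a backup still waiting to become a tar.gz; list_unfinished_backups is a made-up helper name, not something shipped with NLS.

```shell
#!/bin/sh
# Hypothetical helper: print the names of purely numeric subdirectories of
# the given backup directory. Assumption: such a directory is a backup that
# has not yet been packed into a tar.gz.
list_unfinished_backups() {
    dir=$1
    for d in "$dir"/*/; do
        # If the glob matched nothing it stays literal and fails this test.
        [ -d "$d" ] || continue
        name=$(basename "$d")
        case $name in
            *[!0-9]*) ;;            # skip names containing a non-digit
            *) echo "$name" ;;
        esac
    done
}
```

It would be called as list_unfinished_backups /store/backups/nagioslogserver.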
All the state.json files contain the same thing:

Code: Select all

{"count":2,"states":[{"mode":"export","started":"2015-01-01T16:00:07.430Z","path":"file:///store/backups/nagioslogserver/1420128007/nagioslogserver.tar.gz","node_name":"3cb77924-3178-4ba3-8952-1f329efab29c"},{"mode":"export","started":"2015-01-14T16:00:41.732Z","path":"file:///store/backups/nagioslogserver/1421251241/nagioslogserver.tar.gz","node_name":"3cb77924-3178-4ba3-8952-1f329efab29c"}]}
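For comparing the files quickly, the export start times can be pulled out with a small sketch like this one. It uses only grep and sed, so no JSON tool is assumed, and it relies on the file being a single compact line as shown above; extract_started is a hypothetical name.

```shell
#!/bin/sh
# Hypothetical helper: print every "started" timestamp found in a
# state.json document read from standard input, one per line.
extract_started() {
    grep -o '"started":"[^"]*"' | sed 's/^"started":"//; s/"$//'
}
```

Usage would be something like extract_started < state.json, run in whichever directory holds the file.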
Where should I look from here to figure out why these processes are not cleaning up?
Justin Rupp
Senior Systems Ninja
GlobalGiving Foundation
cmerchant
Posts: 546
Joined: Wed Sep 24, 2014 11:19 am

Re: Logs stop coming in

Post by cmerchant »

There might not be enough disk space to finish the backup (tar.gz). Could you run the following command on your NLS server:

Code: Select all

df -h
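If it is easier to check only the filesystem that holds the backups, df also accepts a path. A minimal sketch, assuming the backups live under /store/backups/nagioslogserver as in the listing earlier in the thread (backup_free_kb is a hypothetical helper name):

```shell
#!/bin/sh
# Hypothetical helper: print the available space, in 1K blocks, on the
# filesystem that contains the given path.
backup_free_kb() {
    # -P forces the POSIX one-line-per-filesystem layout and -k reports
    # 1K blocks; column 4 of the second line is the available space.
    df -Pk "$1" | awk 'NR == 2 { print $4 }'
}
```

For example: backup_free_kb /store/backups/nagioslogserver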
globalgiving
Posts: 25
Joined: Thu Aug 28, 2014 9:57 am
Location: Washington, DC

Re: Logs stop coming in

Post by globalgiving »

Plenty of space:

Code: Select all

$ df -h
Filesystem            Size  Used Avail Use% Mounted on
/dev/sda3             908G  127G  736G  15% /
tmpfs                 7.8G     0  7.8G   0% /dev/shm
/dev/sda1             243M   57M  173M  25% /boot
scottwilkerson
DevOps Engineer
Posts: 19396
Joined: Tue Nov 15, 2011 3:11 pm
Location: Nagios Enterprises

Re: Logs stop coming in

Post by scottwilkerson »

Let's try this:

Code: Select all

service elasticsearch restart
Former Nagios employee
globalgiving
Posts: 25
Joined: Thu Aug 28, 2014 9:57 am
Location: Washington, DC

Re: Logs stop coming in

Post by globalgiving »

So that did cause all of the backup processes to complete. The /store/backups/nagioslogserver directory now has completed backups in it. I will post tomorrow if tonight's backup gets stuck.
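To check for leftovers after the next run, something along these lines could be used. It is only a sketch: it assumes `ps aux` output in the same column layout as the process list posted earlier, and list_backup_procs is a made-up name.

```shell
#!/bin/sh
# Hypothetical helper: read `ps aux`-style lines on standard input and print
# the PID and start time/date of every create_backup.sh process.
list_backup_procs() {
    # In `ps aux` output, column 2 is the PID and column 9 is the START field.
    awk '/create_backup\.sh/ { print $2, $9 }'
}
```

Running ps aux | list_backup_procs shortly after the scheduled backup should ideally print nothing, or only the run that is still in progress. Entries with a start date from earlier days would suggest the backup got stuck again.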
scottwilkerson
DevOps Engineer
Posts: 19396
Joined: Tue Nov 15, 2011 3:11 pm
Location: Nagios Enterprises

Re: Logs stop coming in

Post by scottwilkerson »

We have seen this in-house too and are looking to resolve it in a future version.