Re: Snapshots
Posted: Thu May 28, 2015 1:14 am
by teirekos
I have two nodes.
Below is the mount output from the other node:
[root@NagiosLogServer2 ~]# mount
rootfs on / type rootfs (rw)
proc on /proc type proc (rw,relatime)
sysfs on /sys type sysfs (rw,relatime)
devtmpfs on /dev type devtmpfs (rw,relatime,size=8224900k,nr_inodes=2056225,mode=755)
devpts on /dev/pts type devpts (rw,relatime,gid=5,mode=620,ptmxmode=000)
tmpfs on /dev/shm type tmpfs (rw,relatime)
/dev/sda1 on / type ext4 (rw,noatime,barrier=1,data=ordered)
/proc/bus/usb on /proc/bus/usb type usbfs (rw,relatime)
none on /proc/sys/fs/binfmt_misc type binfmt_misc (rw,relatime)
sunrpc on /var/lib/nfs/rpc_pipefs type rpc_pipefs (rw,relatime)
10.1.11.10:/NLSBackup on /NLSBackup type nfs (rw,noatime,vers=3,rsize=1048576,wsize=1048576,namlen=255,acregmin=1800,acregmax=1800,acdirmin=1800,acdirmax=1800,hard,nolock,proto=tcp,timeo=600,retrans=2,sec=sys,mountaddr=10.1.11.10,mountvers=3,mountport=42491,mountproto=tcp,local_lock=all,addr=10.1.11.10)
Re: Snapshots
Posted: Thu May 28, 2015 2:34 pm
by tgriep
On NagiosLogServer2, can you delete a file while logged in as the nagios user?
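One way to run that check is sketched below. It is only a sketch: the function name and the probe file name are made up for illustration, and it tests whichever user runs it, so it would need to be run after switching to the nagios user.

```shell
# Hypothetical helper (not part of Nagios Log Server): returns 0 if the
# current user can create and then delete a file inside the given directory.
can_create_and_delete() {
    dir="$1"
    probe="$dir/.nls_write_probe.$$"   # made-up probe file name
    # Creating the file fails if the directory is missing or not writable.
    touch "$probe" 2>/dev/null || return 1
    # Deleting it exercises the same permission the cleanup job needs.
    rm -f "$probe" 2>/dev/null || return 1
    [ ! -e "$probe" ]
}
```

For example, after `su - nagios`, something like `can_create_and_delete /NLSBackup && echo OK` would confirm both operations on the backup mount.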
Re: Snapshots
Posted: Fri May 29, 2015 6:09 am
by teirekos
Yes I can.
Re: Snapshots
Posted: Fri May 29, 2015 11:47 am
by tgriep
Can you go to the Command Subsystem screen and post a screen capture of it?
You may want to try clicking on the "Reset All Jobs" button in case they were not running correctly.
Re: Snapshots
Posted: Tue Jun 02, 2015 3:50 am
by teirekos
I've reset all jobs and then ran Backup & Maintenance (screenshots attached). My backup filesystem is still at 100%:
[root@NagiosLogServer2 ~]# df -h
Filesystem Size Used Avail Use% Mounted on
rootfs 99G 92G 6.4G 94% /
devtmpfs 7.9G 148K 7.9G 1% /dev
tmpfs 7.9G 0 7.9G 0% /dev/shm
/dev/sda1 99G 92G 6.4G 94% /
10.1.11.10:/NLSBackup
79G 75G 1.0M 100% /NLSBackup
---
I have noticed something about the structure of the files inside my backup filesystem (/NLSBackup):
[root@NagiosLogServer NLSBackup]# ls -ltr
total 128
-rw-r--r-- 1 nagios users 61 May 25 07:51 metadata-logstash-2015.05.24
-rw-r--r-- 1 nagios users 189 May 25 07:52 snapshot-logstash-2015.05.24
-rw-r--r-- 1 nagios users 61 May 26 14:31 metadata-logstash-2015.05.03
-rw-r--r-- 1 nagios users 189 May 26 14:32 snapshot-logstash-2015.05.03
-rw-r--r-- 1 nagios users 61 May 26 14:32 metadata-logstash-2015.05.04
-rw-r--r-- 1 nagios users 192 May 26 14:33 snapshot-logstash-2015.05.04
-rw-r--r-- 1 nagios users 61 May 26 14:33 metadata-logstash-2015.05.05
-rw-r--r-- 1 nagios users 188 May 26 14:34 snapshot-logstash-2015.05.05
-rw-r--r-- 1 nagios users 61 May 26 14:34 metadata-logstash-2015.05.06
-rw-r--r-- 1 nagios users 190 May 26 14:36 snapshot-logstash-2015.05.06
-rw-r--r-- 1 nagios users 61 May 26 14:36 metadata-logstash-2015.05.07
-rw-r--r-- 1 nagios users 190 May 26 14:38 snapshot-logstash-2015.05.07
-rw-r--r-- 1 nagios users 61 May 26 14:38 metadata-logstash-2015.05.08
-rw-r--r-- 1 nagios users 190 May 26 14:40 snapshot-logstash-2015.05.08
-rw-r--r-- 1 nagios users 61 May 26 14:40 metadata-logstash-2015.05.09
-rw-r--r-- 1 nagios users 189 May 26 14:41 snapshot-logstash-2015.05.09
-rw-r--r-- 1 nagios users 61 May 26 14:41 metadata-logstash-2015.05.10
-rw-r--r-- 1 nagios users 190 May 26 14:42 snapshot-logstash-2015.05.10
-rw-r--r-- 1 nagios users 61 May 26 14:42 metadata-logstash-2015.05.11
-rw-r--r-- 1 nagios users 189 May 26 14:43 snapshot-logstash-2015.05.11
-rw-r--r-- 1 nagios users 61 May 26 14:43 metadata-logstash-2015.05.12
-rw-r--r-- 1 nagios users 190 May 26 14:45 snapshot-logstash-2015.05.12
-rw-r--r-- 1 nagios users 61 May 26 14:45 metadata-logstash-2015.05.13
-rw-r--r-- 1 nagios users 190 May 26 14:47 snapshot-logstash-2015.05.13
-rw-r--r-- 1 nagios users 61 May 26 14:47 metadata-logstash-2015.05.14
-rw-r--r-- 1 nagios users 190 May 26 14:49 snapshot-logstash-2015.05.14
-rw-r--r-- 1 nagios users 61 May 26 14:49 metadata-logstash-2015.05.15
-rw-r--r-- 1 nagios users 191 May 26 14:52 snapshot-logstash-2015.05.15
-rw-r--r-- 1 nagios users 61 May 26 14:52 metadata-logstash-2015.05.16
-rw-r--r-- 1 nagios users 190 May 26 14:53 snapshot-logstash-2015.05.16
-rw-r--r-- 1 nagios users 61 May 26 14:53 metadata-logstash-2015.05.25
drwxr-xr-x 80 nagios nagios 4096 May 26 14:53 indices
The file "index" is missing!
I am also attaching indices.txt, which lists the files inside the indices folder. Are all these logstash directories needed for 10 days of snapshots (as configured in the GUI: "Delete backups older than 10 days")?
Re: Snapshots
Posted: Tue Jun 02, 2015 10:48 am
by jolson
Let's try the following.
Manually delete the following backups:
-rw-r--r-- 1 nagios users 61 May 26 14:31 metadata-logstash-2015.05.03
-rw-r--r-- 1 nagios users 189 May 26 14:32 snapshot-logstash-2015.05.03
-rw-r--r-- 1 nagios users 61 May 26 14:32 metadata-logstash-2015.05.04
-rw-r--r-- 1 nagios users 192 May 26 14:33 snapshot-logstash-2015.05.04
Really, any backups are fine - I selected the above because they're the oldest.
After the backups have been deleted, tail jobs.log in follow mode on all of your nodes and force a backup from the GUI.
Code:
tail -f /usr/local/nagioslogserver/var/jobs.log
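Before deleting anything, it may also be worth checking for backups that look incomplete. In the listing posted above, metadata-logstash-2015.05.25 has no matching snapshot-logstash-2015.05.25 file, which might point to an interrupted snapshot (that is a guess, not something the logs confirm). A minimal sketch that lists such cases, assuming the metadata-*/snapshot-* naming shown in this thread; the function name is made up:

```shell
# Hypothetical helper: print every backup name that has a metadata-* file
# but no matching snapshot-* file in the given repository directory.
list_unpaired_metadata() {
    repo="$1"
    for meta in "$repo"/metadata-*; do
        [ -e "$meta" ] || continue            # the glob matched nothing
        name="${meta##*/metadata-}"           # strip the path and the "metadata-" prefix
        [ -e "$repo/snapshot-$name" ] || printf '%s\n' "$name"
    done
}
```

Running `list_unpaired_metadata /NLSBackup` prints one backup name per line and prints nothing when every metadata file has a snapshot file next to it.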
Re: Snapshots
Posted: Wed Jun 03, 2015 4:14 am
by teirekos
Before doing that, can you please tell me a safe way to back up those files, e.g. to an external drive, so that I still have access to the logs when restoring? Which files exactly do I have to move out? Only "snapshot-logstash-2015.05.03" and "metadata-logstash-2015.05.03"? What about the "logstash-2015.05.03" and "logstash-2015.05.04" directories under /indices? How important is the missing "index" file?
Thanx
Re: Snapshots
Posted: Wed Jun 03, 2015 10:54 am
by jolson
You will need to move the snapshot file and the metadata file out of the backup directory, and also move the matching directory under 'indices' somewhere else. The change should be reflected in the Web GUI immediately, and the same applies if you later move those files back to their original places.
You may need to perform the moves as the 'nagios' user. If I had a backup called logstash-2015.04.04, my commands might look something like this.
Code:
su - nagios
cd /mnt/nlsback
mv *logstash-2015.04.04 /home/nagios
mv indices/logstash-2015.04.04/ /home/nagios/
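If several dates need to be moved, the same steps can be wrapped in a small function. This is only a sketch: the function name is made up, the paths are whatever you pass in, and unlike the commands above it keeps the index directory under an indices/ subfolder of the destination so the layout can be copied back unchanged.

```shell
# Hypothetical helper: move one backup (snapshot file, metadata file and
# index directory) out of the repository into a destination directory.
# Usage: move_backup_aside REPO NAME DEST
move_backup_aside() {
    repo="$1"; name="$2"; dest="$3"
    mkdir -p "$dest/indices" || return 1
    # Move the two small files at the top level of the repository.
    for f in "snapshot-$name" "metadata-$name"; do
        if [ -e "$repo/$f" ]; then
            mv "$repo/$f" "$dest/" || return 1
        fi
    done
    # Move the matching index directory, if it exists.
    if [ -d "$repo/indices/$name" ]; then
        mv "$repo/indices/$name" "$dest/indices/" || return 1
    fi
}
```

For example, as the nagios user: `move_backup_aside /mnt/nlsback logstash-2015.04.04 /home/nagios/nls-saved`, where the destination folder is just an example name. The destination needs enough free space for the index directory.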
Re: Snapshots
Posted: Thu Jun 04, 2015 9:15 am
by teirekos
I followed your instructions and manually deleted two dates.
I forced Backup & Maintenance from the GUI, and I am attaching backup_main_out.txt from the jobs.log tail.
Then, as you can see from that file, I deleted the problematic logstash-2015.05.30 and reran Backup & Maintenance, and it worked fine (backup_main_out2.txt attached). My question is what caused the problem in the first place, since I already had it before 30/05. The problem resulted in my backup storage filling to 100%.
Thanx.
Re: Snapshots
Posted: Thu Jun 04, 2015 9:33 am
by jolson
According to your jobs.log, curator attempted to create a snapshot (logstash-05-20-2015) and failed. The automatic deletion then failed as well, which caused the backup process to stop.
I have a hunch that the creation failed in the first place due to a lack of space on your backup drive. Any chance that you could move/delete all irrelevant snapshots off of the backup drive and re-run the backup script?
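To check the free-space hunch before re-running the backup, a small filter over `df -P` output could flag mounts that are nearly full. A sketch only: the function name and the threshold are made up, and it assumes mount points without spaces.

```shell
# Hypothetical helper: read `df -P` output on stdin and print
# "<mount point> <use%>" for every filesystem at or above the threshold.
report_full_mounts() {
    threshold="$1"
    # -P keeps each filesystem on one line, so column 5 is always the
    # capacity (e.g. "94%") and column 6 the mount point.
    awk -v t="$threshold" 'NR > 1 {
        pct = $5
        sub(/%$/, "", pct)
        if (pct + 0 >= t) print $6, pct
    }'
}
```

For example, `df -P /NLSBackup | report_full_mounts 90` prints a line only if the backup mount is at 90% or more.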
Was your index file re-created?
If this is pressing and you would like a remote session, I'd be happy to set one up with you. Just email [email protected] with a reference to this forum post.