Backup snapshots disappeared

batzos · Post by **batzos** » Wed Feb 17, 2016 9:08 am

I have a two-instance cluster. I stopped and restarted the elasticsearch service from the CLI of the 1st server, and since then I cannot see my backup snapshots in "Backup & Maintenance". I reset all jobs from the "Command Subsystem", but nothing changed. The snapshots are not visible on either instance. I do not know whether the following has an impact, but for the last few weeks, in the 1st instance I get the message from the system status that both elasticsearch and logstash are stopped (!), and in the second one only logstash is stopped.

jolson · Post by **jolson** » Wed Feb 17, 2016 10:36 am

> in the 1st instance I get the message from the system status that both elasticsearch and logstash are stopped

That could matter. I'm interested in the following information from that node:

Code: Select all

df -h
free -m
top | head -n5
cat /etc/sysconfig/logstash
tail -n300 /var/log/elasticsearch/*.log

batzos · Post by **batzos** » Tue Feb 23, 2016 3:01 am

I may have found the reason for the "disappearance" of the snapshots from my first server in the cluster. This cluster consists of 2 servers in the LAN. After setting it up, I installed another Nagios Log Server in DMZi. I have assigned to the server in DMZi the same backup repository as this one for the 1st cluster, even though it is not part of the first cluster; it is a separate NLS. I guess, when I restarted elasticsearch in the 1st one, the snapshots were lost and now I can see them all in the DMZi NLS. How can I get them back to the 1st server?
Below are the results of the commands:

Code: Select all

[root@ ~]# df -h
Filesystem            Size  Used Avail Use% Mounted on
/dev/mapper/vg01-rootvol
                      252G   20G  220G   8% /
tmpfs                 7.8G     0  7.8G   0% /dev/shm
/dev/sda1             248M   76M  160M  33% /boot
[root@eicillp095 ~]#



[root@ ~]# free -m
             total       used       free     shared    buffers     cached
Mem:         15947      15684        263          0        220       4799
-/+ buffers/cache:      10663       5283
Swap:         1023         83        940
[root@eicillp095 ~]#

[root@ ~]# top | head -n5
top - 08:53:08 up 47 days, 10 min,  1 user,  load average: 0.09, 0.04, 0.01
Tasks: 199 total,   1 running, 198 sleeping,   0 stopped,   0 zombie
Cpu(s):  2.6%us,  0.8%sy,  0.4%ni, 96.0%id,  0.1%wa,  0.0%hi,  0.0%si,  0.0%st
Mem:  16330152k total, 16060412k used,   269740k free,   225932k buffers
Swap:  1048572k total,    85872k used,   962700k free,  4915352k cached


[root@ ~]# cat /etc/sysconfig/logstash
###############################
# Default settings for logstash
###############################

# Override Java location
#JAVACMD=/usr/bin/java

# Set a home directory
APP_DIR=/usr/local/nagioslogserver
LS_HOME="$APP_DIR/logstash"

# set ES_CLUSTER
ES_CLUSTER=$(cat $APP_DIR/var/cluster_uuid)

# Arguments to pass to java
#LS_HEAP_SIZE="256m"
LS_JAVA_OPTS="-Djava.io.tmpdir=$APP_DIR/tmp"

# Logstash filter worker threads
#LS_WORKER_THREADS=1

# pidfiles aren't used for upstart; this is for sysv users.
#LS_PIDFILE=/var/run/logstash.pid

# user id to be invoked as; for upstart: edit /etc/init/logstash.conf
LS_USER=root
LS_GROUP=nagios

# logstash logging
#LS_LOG_FILE=/var/log/logstash/logstash.log
#LS_USE_GC_LOGGING="true"

# logstash configuration directory
LS_CONF_DIR="$LS_HOME/etc/conf.d"

# Open file limit; cannot be overridden in upstart
#LS_OPEN_FILES=2048

# Nice level
#LS_NICE=0

# Increase Filter workers to 4 threads
LS_OPTS=" -w 4"

if [ "x$1" == "xstart" -o "x$1" == "xrestart" -o "x$1" == "xreload" ];then
        GET_LOGSTASH_CONFIG_MESSAGE=$( php /usr/local/nagioslogserver/scripts/get_logstash_config.php )
        GET_LOGSTASH_CONFIG_RETURN=$?
        if [ "$GET_LOGSTASH_CONFIG_RETURN" != "0" ]; then
                echo $GET_LOGSTASH_CONFIG_MESSAGE
                exit 1
        fi
fi

[root@ ~]# tail -n300 /var/log/elasticsearch/*.log
==> /var/log/elasticsearch/5a2aeff9-fe3d-4f48-bd79-118614f9436d_index_indexing_slowlog.log <==

==> /var/log/elasticsearch/5a2aeff9-fe3d-4f48-bd79-118614f9436d_index_search_slowlog.log <==

==> /var/log/elasticsearch/5a2aeff9-fe3d-4f48-bd79-118614f9436d.log <==

jolson · Post by **jolson** » Tue Feb 23, 2016 5:17 pm

> I have assigned to the server in DMZi the same backup repository as this one for the 1st cluster

What backup repository are you using - I assume an NFS server or similar? Also, I'm interested in the mount-point you're using on Nagios Log Server.

> I guess, when I restarted elasticsearch in the 1st one, the snapshots were lost and now I can see them all in the DMZi NLS

You mean your old snapshots appear on the second cluster and not the first? That's very strange behavior. It's worth noting that two distinct clusters must never be connected to the same share, because they can overwrite each other's data.

If you disconnect the second cluster from your backup repository, I'm willing to bet that your first cluster will re-acquire the data once it runs another backup.
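
A minimal sketch of how the overlap could be confirmed from the command line, using the Elasticsearch snapshot API to list the repositories registered on a cluster and the snapshots it sees in one of them. It assumes Elasticsearch answers on http://localhost:9200 on the node where you run it (an assumption; override `ES_URL` if yours differs), and the repository name `my_repo` is hypothetical.

```shell
#!/bin/sh
# Assumption: Elasticsearch is reachable here from the node you are logged into.
ES_URL="${ES_URL:-http://localhost:9200}"

# URL that lists every snapshot repository registered on this cluster.
repo_list_url() {
    printf '%s/_snapshot/_all?pretty\n' "$ES_URL"
}

# URL that lists every snapshot stored in one repository.
# $1 = repository name (hypothetical; use a name returned by the call above).
snapshot_list_url() {
    printf '%s/_snapshot/%s/_all?pretty\n' "$ES_URL" "$1"
}

# Run on one node of each cluster and compare the results:
#   curl -s "$(repo_list_url)"
#   curl -s "$(snapshot_list_url my_repo)"
```

If both clusters report a repository with the same "location" setting, they are sharing one store, which is the situation described above.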

batzos · Post by **batzos** » Thu Mar 10, 2016 11:21 am

I am using a CIFS share with NFS access. By mount point, do you mean the "location" in the backup configuration? In that case it is: /net/logs.../.../...
The snapshots disappeared from both instances of the same cluster.
Regarding the bet, you would have lost it, because the backups reappeared only when I removed the repository from the primary instance and remounted it. It did not work when I removed it from the 2nd cluster. Anyway, all is back to normal now.
Some last questions before you close this thread.
- When we add new instances to the same cluster, is the backup repository automatically assigned to them as it is on the first instance, or do we have to do it manually each time? I did it manually on the 2nd instance, and in the backup snapshots list, under "Created / Name (Click )", I get "N/A i" instead of the name; if I click on the i icon I get the name.
- I also got as a first entry a "curator" for logs of the past 10 days (I have set indexes to close after 10 days). Is that because a backup is taken every day for the current logs and not after the period that is set for closing them? I have lost the backup of a 15-day period before that, though, but that is no problem since they are test logs.

jolson · Post by **jolson** » Thu Mar 10, 2016 5:29 pm

> - When we add new instances to the same cluster, is the backup repository automatically assigned to them as it is on the first instance, or do we have to do it manually each time? I did it manually on the 2nd instance, and in the backup snapshots list, under "Created / Name (Click )", I get "N/A i" instead of the name; if I click on the i icon I get the name.

Currently the process of mounting your backup share must be repeated manually on each new instance added to the cluster. This is because any instance in the cluster may pick up and run the backup job, and there's no telling which instance it will be.
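
Since any instance may end up running the backup job, a quick check on each node can catch a missing or read-only mount before a snapshot fails. A rough sketch, assuming a POSIX shell; the function name and the probe file name are made up for illustration, and the path you pass should be the repository location configured for your cluster. It is best run as the user the elasticsearch service runs as, since that is the user that needs write access.

```shell
#!/bin/sh
# check_backup_dir DIR  (hypothetical helper, not part of Nagios Log Server)
# Prints "ok" if DIR exists and a file can be created in it,
# otherwise prints the reason and returns 1.
check_backup_dir() {
    dir="$1"
    probe="$dir/.backup_write_probe.$$"

    if [ ! -d "$dir" ]; then
        echo "missing: $dir"
        return 1
    fi

    # Try a real write instead of trusting permission bits, since
    # network shares can refuse writes that look allowed locally.
    if ! ( : > "$probe" ) 2>/dev/null; then
        echo "not writable: $dir"
        return 1
    fi

    rm -f "$probe"
    echo "ok"
}

# Usage on each instance, with your own repository location:
#   check_backup_dir /path/to/backup/repository
```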

The backup process was upgraded in Nagios Log Server 1.4.0, and the 'N/A' fields you are seeing are likely from backups taken before the upgrade. I'll refrain from betting - but did you recently upgrade your cluster? All of your old backups will still function properly, but they will have missing information (indicated by N/A) that is present in newer backups.

> - I also got as a first entry a "curator" for logs of the past 10 days (I have set indexes to close after 10 days). Is that because a backup is taken every day for the current logs and not after the period that is set for closing them? I have lost the backup of a 15-day period before that, though, but that is no problem since they are test logs.

Would you please send us a screenshot of what this looks like on your end? The new backups will list all of the indices backed up every time the backup process runs - the new process is incremental. Even though it looks like there are duplicate backups being taken daily, there are not. A screenshot would help clarify your question.

Thanks!

Jesse

Nagios Support Forum
