backups not functioning after update to 1.4.0
Posted: Thu Feb 04, 2016 7:57 am
by krobertson71
I found the other thread where the old backups were showing as n/a. I created an 'oldbackups' directory and copied everything into it.
Last night I then scheduled the backup job to run about an hour later. There is still nothing in the backup snapshots. Screenshot below.
nlsbackups-1.png
I have verified that the nagios user can write to that directory by touching a file and then deleting it.
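The touch-and-delete check described above can be sketched as a small shell function. This is only an illustration: the function name `check_writable` is made up here, and the check is only meaningful when run as the user that actually writes the snapshots (assumed to be nagios on this system).

```shell
#!/bin/sh
# check_writable DIR: create a temporary file in DIR, then remove it.
# Hypothetical helper name; prints a one-line result and returns 0 or 1.
check_writable() {
    dir=$1
    # mktemp fails if DIR does not exist or is not writable by this user
    tmp=$(mktemp "$dir/.write-test.XXXXXX" 2>/dev/null) || {
        echo "cannot write to $dir"
        return 1
    }
    rm -f "$tmp"
    echo "write OK in $dir"
}
```

For example, `check_writable /backups` run as the nagios user on each node. Note that this only proves local write access; it says nothing about whether the other nodes see the same directory.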
Re: backups not functioning after update to 1.4.0
Posted: Thu Feb 04, 2016 3:11 pm
by jolson
Try running the following on your command line:
Code: Select all
which curator
ls -l /usr/lib/python2.6/site-packages/curator/curator.py
curator --help
curator snapshot --repository nlsback indices --prefix logstash
I'm interested in seeing the error output. Thanks!
Re: backups not functioning after update to 1.4.0
Posted: Thu Feb 04, 2016 3:22 pm
by krobertson71
Here you go
Code: Select all
[nagios@nagilgp01 ~]$ which curator
/usr/bin/curator
Code: Select all
[nagios@nagilgp01 ~]$ ls -l /usr/lib/python2.6/site-packages/curator/curator.py
-rw-r--r-- 1 root root 80 Feb 2 11:49 /usr/lib/python2.6/site-packages/curator/curator.py
Code: Select all
[nagios@nagilgp01 ~]$ curator --help
Usage: curator [OPTIONS] COMMAND [ARGS]...

  Curator for Elasticsearch indices.

  See http://elastic.co/guide/en/elasticsearch/client/curator/current

Options:
  --host TEXT         Elasticsearch host.
  --url_prefix TEXT   Elasticsearch http url prefix.
  --port INTEGER      Elasticsearch port.
  --use_ssl           Connect to Elasticsearch through SSL.
  --certificate TEXT  Path to certificate to use for SSL validation.
                      (OPTIONAL)
  --ssl-no-validate   Do not validate SSL certificate
  --http_auth TEXT    Use Basic Authentication ex: user:pass
  --timeout INTEGER   Connection timeout in seconds.
  --master-only       Only operate on elected master node.
  --dry-run           Do not perform any changes.
  --debug             Debug mode
  --loglevel TEXT     Log level
  --logfile TEXT      log file
  --logformat TEXT    Log output format [default|logstash].
  --quiet             Suppress command-line output.
  --version           Show the version and exit.
  --help              Show this message and exit.

Commands:
  alias       Index Aliasing
  allocation  Index Allocation
  bloom       Disable bloom filter cache
  close       Close indices
  delete      Delete indices or snapshots
  open        Open indices
  optimize    Optimize Indices
  replicas    Replica Count Per-shard
  seal        Seal indices (Synced flush: ES 1.6.0+ only)
  show        Show indices or snapshots
  snapshot    Take snapshots of indices (Backup)
Code: Select all
[nagios@nagilgp01 ~]$ curator snapshot --repository nlsback indices --prefix logstash
2016-02-04 15:21:02,522 INFO Job starting: snapshot indices
2016-02-04 15:21:02,522 WARNING Overriding default connection timeout. New timeout: 21600
2016-02-04 15:21:02,581 INFO Action snapshot will be performed on the following indices: [u'logstash-2016.01.26', u'logstash-2016.01.27', u'logstash-2016.01.28', u'logstash-2016.01.29', u'logstash-2016.01.30', u'logstash-2016.01.31', u'logstash-2016.02.01', u'logstash-2016.02.02', u'logstash-2016.02.03', u'logstash-2016.02.04', u'logstash-2016.02.05']
2016-02-04 15:21:02,905 ERROR Failed to verify all nodes have repository access.
2016-02-04 15:21:02,906 WARNING Job did not complete successfully.
That last one seems to be the issue. I never had a problem with this before; we only had the "old version" backup issues with partials, etc.
Re: backups not functioning after update to 1.4.0
Posted: Thu Feb 04, 2016 7:55 pm
by krobertson71
I feel like a moron. I've been busy today and just saw that I did not put in the name of my repository. It makes no difference, though: still the same error.
The /backups directory is owned by nagios:nagios. Group nagios contains apache and nagios.
Code: Select all
[nagios@nagilgp01 ~]$ curator snapshot --repository backups indices --prefix logstash
2016-02-04 19:51:43,604 INFO Job starting: snapshot indices
2016-02-04 19:51:43,605 WARNING Overriding default connection timeout. New timeout: 21600
2016-02-04 19:51:43,627 INFO Action snapshot will be performed on the following indices: [u'logstash-2016.01.26', u'logstash-2016.01.27', u'logstash-2016.01.28', u'logstash-2016.01.29', u'logstash-2016.01.30', u'logstash-2016.01.31', u'logstash-2016.02.01', u'logstash-2016.02.02', u'logstash-2016.02.03', u'logstash-2016.02.04', u'logstash-2016.02.05']
2016-02-04 19:51:43,987 ERROR Failed to verify all nodes have repository access.
2016-02-04 19:51:43,988 WARNING Job did not complete successfully.
Re: backups not functioning after update to 1.4.0
Posted: Thu Feb 04, 2016 9:12 pm
by Box293
Can you confirm that /backups is an NFS share (or a similar common share) mounted on all log server nodes?
Re: backups not functioning after update to 1.4.0
Posted: Fri Feb 05, 2016 10:32 am
by krobertson71
I can confirm it is not, and it was not in the past either. Each node has its own /backups directory. In the past, each node would back up to its local /backups directory.
Re: backups not functioning after update to 1.4.0
Posted: Fri Feb 05, 2016 2:00 pm
by hsmith
Have you verified the permissions on the "/backups" directory across all of the nodes?
Re: backups not functioning after update to 1.4.0
Posted: Fri Feb 05, 2016 4:45 pm
by krobertson71
Yes, they are both the same.
node 1
Code: Select all
drwxrwxr-x 2 nagios nagios 4096 Feb 4 22:40 backups
nagios group has 'apache' and 'nagios' as members
node 2
Code: Select all
drwxrwxr-x 3 nagios nagios 4096 Feb 4 22:42 backups
same as above
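To make the comparison above repeatable, the mode, owner, and group can be reduced to one line per node. This is a minimal sketch; `dir_perms` is a hypothetical helper name, and it simply trims `ls -ld` output to the three fields compared in this post.

```shell
#!/bin/sh
# dir_perms DIR: print "mode owner group" for DIR, taken from `ls -ld`.
# Hypothetical helper; fields 1, 3 and 4 of the long listing.
dir_perms() {
    ls -ld "$1" | awk '{print $1, $3, $4}'
}
```

Running `dir_perms /backups` on each node and comparing the lines shows whether the permissions match. Identical permissions on separate local directories do not make them a shared repository, though.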
Re: backups not functioning after update to 1.4.0
Posted: Fri Feb 05, 2016 4:57 pm
by krobertson71
Also, the drives have plenty of free space.
Re: backups not functioning after update to 1.4.0
Posted: Sun Feb 07, 2016 8:05 pm
by Box293
Box293 wrote: Can you confirm that /backups is an NFS share (or a similar common share) mounted on all log server nodes?
krobertson71 wrote: I can confirm it is not, and it was not in the past either. Each node has its own /backups directory. In the past, each node would back up to its local /backups directory.
This is the source of your problem.
This is explained on the screen where you add a backup repository:
Screenshot.png
While it may have worked in previous versions, you need to correct this so that it is a shared repository. The backup process is executed on ONE of the nodes, not on every node. Hence all nodes need access to a shared repository.
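One way to check the shared-repository requirement above is sketched below. It is an illustration under assumptions: the helper names are made up, `fs_type` relies on GNU `stat` as found on Linux, the list of network filesystem names is not exhaustive, and `verify_repo` assumes Elasticsearch answers on localhost:9200 and that the installed version provides the snapshot repository verify endpoint.

```shell
#!/bin/sh
# Print the filesystem type backing a path (GNU stat on Linux).
fs_type() {
    stat -f -c '%T' "$1"
}

# Succeed if a filesystem type name looks like a network share.
# Illustrative list only; extend it for other shared filesystems.
is_shared_fs() {
    case "$1" in
        nfs*|cifs|smb*) return 0 ;;
        *) return 1 ;;
    esac
}

# Ask Elasticsearch to verify the named repository on all nodes.
# Host and port are assumptions; pass the repository name as $1.
verify_repo() {
    curl -s -XPOST "http://localhost:9200/_snapshot/$1/_verify?pretty"
}
```

On each node, `is_shared_fs "$(fs_type /backups)" && echo shared || echo local` gives a quick hint about whether /backups is a network mount. Once the share is mounted everywhere, `verify_repo backups` from any one node should confirm that every node can reach the repository.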