Backup & Maintenance

Posted: Tue Mar 31, 2015 3:20 pm
by blariv
Hi,

I'm trying to set up the backup repository, but nothing ever shows up in the snapshots.

I have a 2-node cluster (both nodes are downloaded appliances).
Both servers have access to the NFS mount and can read/write to the share.
I have reset all jobs.
A screenshot of the config is attached.
I have tailed the log while running the backup job, and it completes.

Tail:

Running command do_backups with args ' ' for job id: backups
SUCCESS
Running command run_alerts with args ' ' for job id: run_all_alerts
SUCCESS

Permissions:

[root@nagioslogserver ~]# ll -d /logrepo/logserver/
drwxr-xr-x 2 nagios users 1024 Mar 30 13:43 /logrepo/logserver/
[root@nagioslogserver ~]# ll -d /logrepo/
drwxrwxrwx 6 nagios nagios 1024 Mar 26 10:34 /logrepo

Please let me know what else I can check.

Thanks,

Brian

Re: Backup & Maintenance

Posted: Tue Mar 31, 2015 3:25 pm
by jolson
Please tail -f /usr/local/nagioslogserver/var/jobs.log and force a backup_maintenance command from the web GUI (Command Subsystem). What appears in the jobs.log?
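
If jobs.log is long, a small script can pull out only the final exception line of each Python traceback. This is a minimal Python 3 sketch, assuming tracebacks in the log follow the standard layout (a "Traceback (most recent call last):" header, indented frames, then an unindented exception line). The function name traceback_errors and the sample text are made up for illustration and are not part of Nagios Log Server.

```python
def traceback_errors(log_text):
    """Return the final exception line of every Python traceback in log_text."""
    errors = []
    in_traceback = False
    for line in log_text.splitlines():
        if line.startswith("Traceback (most recent call last):"):
            in_traceback = True
        elif in_traceback and line and not line.startswith((" ", "\t")):
            # The first unindented line after the stack frames is the exception.
            errors.append(line)
            in_traceback = False
    return errors


# Hypothetical sample text, not taken from a real jobs.log.
sample_log = """some ordinary log line
Traceback (most recent call last):
  File "example.py", line 1, in <module>
    main()
ValueError: example failure
another ordinary log line
"""

print(traceback_errors(sample_log))
```

To try it on the real file, read /usr/local/nagioslogserver/var/jobs.log into a string and pass that string to traceback_errors.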

Re: Backup & Maintenance

Posted: Tue Mar 31, 2015 3:44 pm
by blariv
Attached.

Re: Backup & Maintenance

Posted: Tue Mar 31, 2015 3:52 pm
by jolson
It looks like several of your indices are closed, and backups cannot be taken of closed indices. Please see the attached image. What is your "Close indexes older than" setting set to?

It looks like the index logstash-2015.03.12 caused the Python traceback we see in your jobs.log file:

Code:

Traceback (most recent call last):
  File "/usr/lib/python2.6/site-packages/curator/curator.py", line 736, in <module>
    main()
  File "/usr/lib/python2.6/site-packages/curator/curator.py", line 731, in main
    arguments.func(client, **argdict)
  File "/usr/lib/python2.6/site-packages/curator/curator.py", line 585, in command_loop
    skipped = op(client, index_name, **kwargs)
  File "/usr/lib/python2.6/site-packages/curator/curator.py", line 406, in _create_snapshot
    client.snapshot.create(repository=repository, snapshot=snap_name, body=body, wait_for_completion=wait_for_completion)
  File "/usr/lib/python2.6/site-packages/elasticsearch/client/utils.py", line 68, in _wrapped
    return func(*args, params=params, **kwargs)
  File "/usr/lib/python2.6/site-packages/elasticsearch/client/snapshot.py", line 22, in create
    repository, snapshot), params=params, body=body)
  File "/usr/lib/python2.6/site-packages/elasticsearch/transport.py", line 307, in perform_request
    status, headers, data = connection.perform_request(method, url, params, body, ignore=ignore, timeout=timeout)
  File "/usr/lib/python2.6/site-packages/elasticsearch/connection/http_urllib3.py", line 86, in perform_request
    self._raise_error(response.status, raw_data)
  File "/usr/lib/python2.6/site-packages/elasticsearch/connection/base.py", line 102, in _raise_error
    raise HTTP_EXCEPTIONS.get(status_code, TransportError)(status_code, error_message, additional_info)
elasticsearch.exceptions.TransportError: TransportError(503, u'ConcurrentSnapshotExecutionException[[logrepo:logstash-2015.03.12] a snapshot is already running]')

So it looks like you may have a snapshot stuck in a running state. Please run the following and PM the output to me:

Code:

curl -s 'http://localhost:9200/_cluster/state?pretty'
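
If you would like to look through the output yourself first, here is a minimal Python 3 sketch that walks the cluster state JSON and lists snapshot entries that have not finished. It assumes each snapshot entry is an object with "snapshot" and "state" keys, and that INIT and STARTED are the unfinished states. The function name find_unfinished_snapshots and the sample data are made up for illustration, so treat it as a starting point rather than a supported tool.

```python
# Assumption: these state values mean the snapshot has not finished yet.
UNFINISHED_STATES = {"INIT", "STARTED"}


def find_unfinished_snapshots(node):
    """Recursively collect objects that look like snapshot entries
    (they have a "snapshot" key) and whose "state" is unfinished."""
    found = []
    if isinstance(node, dict):
        if "snapshot" in node and node.get("state") in UNFINISHED_STATES:
            found.append(node)
        for value in node.values():
            found.extend(find_unfinished_snapshots(value))
    elif isinstance(node, list):
        for item in node:
            found.extend(find_unfinished_snapshots(item))
    return found


# Hypothetical, hand-written stand-in for parsed cluster state output.
sample_state = {
    "metadata": {
        "snapshots": {
            "snapshots": [
                {"snapshot": "example-snapshot-1", "state": "INIT",
                 "indices": ["example-index-1"]},
                {"snapshot": "example-snapshot-2", "state": "SUCCESS",
                 "indices": ["example-index-2"]},
            ]
        }
    }
}

for entry in find_unfinished_snapshots(sample_state):
    print(entry["snapshot"], entry["state"], entry.get("indices"))
```

For real data, save the curl output to a file, load it with json.load, and pass the result to find_unfinished_snapshots. The search is recursive so that it does not depend on exactly where the snapshot list sits in the cluster state.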

Re: Backup & Maintenance

Posted: Tue Mar 31, 2015 4:02 pm
by blariv
They are set to close indices older than 20 days. PM sent.

Re: Backup & Maintenance

Posted: Tue Mar 31, 2015 4:17 pm
by jolson
Please re-send that PM - it was corrupted. Thanks!

Re: Backup & Maintenance

Posted: Wed Apr 01, 2015 9:26 am
by jolson
blariv,

It looks like there's a backup stuck in the INIT state, which could cause the backup/maintenance holdups we are seeing. The snapshot information is as follows:

Code:

"snapshot" : "logstash-2015.03.02",
"state" : "INIT",
"indices" : [ "logstash-2015.03.02" ],

Could you please try deleting the snapshot and re-running your backup_maintenance job? Be sure to tail jobs.log while you run backup_maintenance.

The command to delete the snapshot is as follows:

Code:

curl -XDELETE "localhost:9200/_snapshot/logrepo/logstash-2015.03.02"
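
If you prefer a script to curl, the same request can be prepared in Python 3. This is only a sketch under the assumption that Elasticsearch is listening on localhost:9200 as in the curl command above. The helper names snapshot_url and build_delete_request are hypothetical, and the block builds the request without sending it.

```python
from urllib.parse import quote
from urllib.request import Request


def snapshot_url(repository, snapshot, host="localhost:9200"):
    """Build the /_snapshot/<repository>/<snapshot> URL used by the curl command."""
    return "http://{0}/_snapshot/{1}/{2}".format(
        host, quote(repository, safe=""), quote(snapshot, safe="")
    )


def build_delete_request(repository, snapshot, host="localhost:9200"):
    """Prepare (but do not send) the DELETE request for one snapshot."""
    return Request(snapshot_url(repository, snapshot, host), method="DELETE")


request = build_delete_request("logrepo", "logstash-2015.03.02")
print(request.get_method(), request.full_url)
# To actually send it, pass the request to urllib.request.urlopen(request).
```

Keeping the request building separate from the sending makes it easy to print and double-check the URL before anything is deleted.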

Let me know if that works for you. Thanks!

Re: Backup & Maintenance

Posted: Wed Apr 01, 2015 12:35 pm
by blariv
It worked! Thank you.

Re: Backup & Maintenance

Posted: Wed Apr 01, 2015 12:53 pm
by jolson
Glad I could help. I'll lock this thread and mark it as resolved - if you need any further help please open a new thread. Thanks! ;)