Backup & Maintenance

This support forum board is for support questions relating to Nagios Log Server, our solution for managing and monitoring critical log data.
blariv
Posts: 190
Joined: Wed Sep 26, 2012 11:55 am

Backup & Maintenance

Post by blariv »

Hi,

I'm trying to set up the backup repository, but nothing ever shows up in the snapshots.

I have a 2-node cluster (both downloaded appliances)
Both servers have access to the NFS mount and can read/write to the share
I have reset all jobs
Screenshot of config attached
I have tailed the log while running the backup job and it completes

Tail:

Running command do_backups with args ' ' for job id: backups
SUCCESS
Running command run_alerts with args ' ' for job id: run_all_alerts
SUCCESS

Permissions:

[root@nagioslogserver ~]# ll -d /logrepo/logserver/
drwxr-xr-x 2 nagios users 1024 Mar 30 13:43 /logrepo/logserver/
[root@nagioslogserver ~]# ll -d /logrepo/
drwxrwxrwx 6 nagios nagios 1024 Mar 26 10:34 /logrepo

Please let me know what else I can check.

Thanks,

Brian
jolson
Attack Rabbit
Posts: 2560
Joined: Thu Feb 12, 2015 12:40 pm

Re: Backup & Maintenance

Post by jolson »

Please tail -f /usr/local/nagioslogserver/var/jobs.log and force a backup_maintenance command from the web GUI (Command Subsystem). What appears in the jobs.log?
Twits Blog
Show me a man who lives alone and has a perpetually clean kitchen, and 8 times out of 9 I'll show you a man with detestable spiritual qualities.
blariv
Posts: 190
Joined: Wed Sep 26, 2012 11:55 am

Re: Backup & Maintenance

Post by blariv »

attached
jolson
Attack Rabbit
Posts: 2560
Joined: Thu Feb 12, 2015 12:40 pm

Re: Backup & Maintenance

Post by jolson »

It looks like several of your indices are closed, and closed indices cannot be backed up. Please see the attached image: what is your "Close indexes older than" setting?

It looks like the index logstash-2015.03.12 caused the Python traceback we see in your jobs.log file:

Code:

Traceback (most recent call last):
  File "/usr/lib/python2.6/site-packages/curator/curator.py", line 736, in <module>
    main()
  File "/usr/lib/python2.6/site-packages/curator/curator.py", line 731, in main
    arguments.func(client, **argdict)
  File "/usr/lib/python2.6/site-packages/curator/curator.py", line 585, in command_loop
    skipped = op(client, index_name, **kwargs)
  File "/usr/lib/python2.6/site-packages/curator/curator.py", line 406, in _create_snapshot
    client.snapshot.create(repository=repository, snapshot=snap_name, body=body, wait_for_completion=wait_for_completion)
  File "/usr/lib/python2.6/site-packages/elasticsearch/client/utils.py", line 68, in _wrapped
    return func(*args, params=params, **kwargs)
  File "/usr/lib/python2.6/site-packages/elasticsearch/client/snapshot.py", line 22, in create
    repository, snapshot), params=params, body=body)
  File "/usr/lib/python2.6/site-packages/elasticsearch/transport.py", line 307, in perform_request
    status, headers, data = connection.perform_request(method, url, params, body, ignore=ignore, timeout=timeout)
  File "/usr/lib/python2.6/site-packages/elasticsearch/connection/http_urllib3.py", line 86, in perform_request
    self._raise_error(response.status, raw_data)
  File "/usr/lib/python2.6/site-packages/elasticsearch/connection/base.py", line 102, in _raise_error
    raise HTTP_EXCEPTIONS.get(status_code, TransportError)(status_code, error_message, additional_info)
elasticsearch.exceptions.TransportError: TransportError(503, u'ConcurrentSnapshotExecutionException[[logrepo:logstash-2015.03.12] a snapshot is already running]')

So it looks like you may have a snapshot stuck in a running state. Please run the following and PM the output to me:

Code:

curl -s 'http://localhost:9200/_cluster/state?pretty'
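
For anyone who wants to scan that output locally instead of reading it by eye, here is a minimal Python 3 sketch. It walks the JSON and reports any entry that has both a "snapshot" and a "state" field and whose state still looks in progress. The set of in-progress states and the nesting of the sample document are assumptions for illustration, and the helper names are hypothetical, not part of Nagios Log Server or Elasticsearch.

```python
import json

# Assumption: these state names mean a snapshot has not finished yet.
IN_PROGRESS_STATES = {"INIT", "STARTED"}


def find_snapshot_entries(node):
    """Recursively collect every dict that has both "snapshot" and "state" keys."""
    found = []
    if isinstance(node, dict):
        if "snapshot" in node and "state" in node:
            found.append(node)
        for value in node.values():
            found.extend(find_snapshot_entries(value))
    elif isinstance(node, list):
        for item in node:
            found.extend(find_snapshot_entries(item))
    return found


def in_progress_snapshots(cluster_state):
    """Return the snapshot entries whose state is listed in IN_PROGRESS_STATES."""
    return [
        entry
        for entry in find_snapshot_entries(cluster_state)
        if entry["state"] in IN_PROGRESS_STATES
    ]


# Hypothetical, heavily trimmed sample; the real output is much larger and
# its exact nesting may differ between versions.
SAMPLE_STATE = json.loads("""
{
  "metadata": {
    "indices": {"logstash-2015.03.12": {"state": "open"}},
    "snapshots": {
      "snapshots": [
        {"repository": "logrepo", "snapshot": "example-snapshot",
         "state": "INIT", "indices": ["logstash-2015.03.12"]},
        {"repository": "logrepo", "snapshot": "finished-snapshot",
         "state": "SUCCESS", "indices": ["logstash-2015.03.11"]}
      ]
    }
  }
}
""")

for entry in in_progress_snapshots(SAMPLE_STATE):
    print(entry["snapshot"], entry["state"])
```

To run it against real data, save the curl output to a file, load it with json.load, and pass the result to in_progress_snapshots in place of SAMPLE_STATE.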
blariv
Posts: 190
Joined: Wed Sep 26, 2012 11:55 am

Re: Backup & Maintenance

Post by blariv »

They are set to close when older than 20 days. PM sent.
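
As a rough illustration of what a 20-day setting implies, the Python 3 sketch below lists which daily logstash-YYYY.MM.DD indices would fall past the cutoff on a given day. It assumes "older than" means the date in the index name is strictly earlier than today minus the configured number of days; the product may calculate this differently, and the function names and the example date are hypothetical.

```python
from datetime import date, datetime, timedelta

INDEX_PREFIX = "logstash-"  # naming pattern seen in this thread
DATE_FORMAT = "%Y.%m.%d"


def index_date(index_name):
    """Parse the date out of a name such as logstash-2015.03.12."""
    return datetime.strptime(index_name[len(INDEX_PREFIX):], DATE_FORMAT).date()


def older_than(index_names, days, today):
    """Return the names whose date is strictly earlier than today - days."""
    cutoff = today - timedelta(days=days)
    return sorted(
        name
        for name in index_names
        if name.startswith(INDEX_PREFIX) and index_date(name) < cutoff
    )


# The date below is a hypothetical "today" chosen only for this example.
names = ["logstash-2015.03.12", "logstash-2015.03.02"]
print(older_than(names, days=20, today=date(2015, 3, 30)))
```

Since closed indices cannot be backed up, any index this reports is one that a backup job would be unable to snapshot under the stated assumption.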
jolson
Attack Rabbit
Posts: 2560
Joined: Thu Feb 12, 2015 12:40 pm

Re: Backup & Maintenance

Post by jolson »

Please re-send that PM - it was corrupted. Thanks!
jolson
Attack Rabbit
Posts: 2560
Joined: Thu Feb 12, 2015 12:40 pm

Re: Backup & Maintenance

Post by jolson »

blariv,

It looks like there's a backup stuck in the INIT state, which could cause the backup/maintenance holdups we are seeing. The snapshot information is as follows:

Code:

"snapshot" : "logstash-2015.03.02",
"state" : "INIT",
"indices" : [ "logstash-2015.03.02" ],

Could you please try deleting the snapshot and re-running your backup_maintenance job? Be sure to tail jobs.log while you run backup_maintenance.

The command to delete the snapshot is as follows:

Code:

curl -XDELETE "localhost:9200/_snapshot/logrepo/logstash-2015.03.02"
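
If you would rather script this than use curl, the same DELETE request can be sketched with the Python 3 standard library. This assumes Python 3 is available (the appliance in this thread ships Python 2.6, so it may need to run from another machine that can reach port 9200), and build_delete_request and delete_snapshot are hypothetical helper names.

```python
import urllib.request
from urllib.parse import quote


def build_delete_request(repository, snapshot, base_url="http://localhost:9200"):
    """Build a DELETE request for <base_url>/_snapshot/<repository>/<snapshot>."""
    url = "{}/_snapshot/{}/{}".format(
        base_url.rstrip("/"),
        quote(repository, safe=""),
        quote(snapshot, safe=""),
    )
    return urllib.request.Request(url, method="DELETE")


def delete_snapshot(repository, snapshot, base_url="http://localhost:9200", timeout=30):
    """Send the DELETE request and return (HTTP status, response body)."""
    request = build_delete_request(repository, snapshot, base_url)
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return response.status, response.read().decode("utf-8")


# Building the request does not contact the server, so it is safe to inspect.
request = build_delete_request("logrepo", "logstash-2015.03.02")
print(request.get_method(), request.full_url)

# To actually delete the snapshot, call:
#   status, body = delete_snapshot("logrepo", "logstash-2015.03.02")
```

Keeping the request builder separate from the network call makes it easy to check the URL before anything is deleted.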

Let me know if that works for you. Thanks!
blariv
Posts: 190
Joined: Wed Sep 26, 2012 11:55 am

Re: Backup & Maintenance

Post by blariv »

It worked! Thank you.
jolson
Attack Rabbit
Posts: 2560
Joined: Thu Feb 12, 2015 12:40 pm

Re: Backup & Maintenance

Post by jolson »

Glad I could help. I'll lock this thread and mark it as resolved - if you need any further help please open a new thread. Thanks! ;)