backup
-
pccwglobalit
- Posts: 105
- Joined: Wed Mar 11, 2015 9:00 pm
Re: backup
thanks.
now the nagios uid on nls1 is 8004 and on nls2 it is 8005.
when nls1 writes something, nls2 shows the owner as 8004. but nls2 can also write to the nfs share - will that affect the backup?
or can every node write to nfs without needing the same UID?
thanks for your help.
Re: backup
Personally, I would get the UIDs synchronized - I believe it's the best solution here, and it will help avoid a lot of potential complications. Any chance you can change the 'nagios' user UIDs on your NLS nodes?
This is a great resource regarding how to do so: https://muffinresearch.co.uk/linux-chan ... -for-user/
I tested the below on one of my NLS nodes, and I cannot see any problems. I changed the 'nagios' UID from 500 to 501:
Gracefully shut down NLS and company:

service elasticsearch stop
service logstash stop
service httpd stop
service crond stop

Pkill hanging processes:

pkill -u nagios

Change UID permissions:

usermod -u 501 nagios
find / -user 500 -exec chown -h 501 {} \;

Start up all processes:

service elasticsearch start
service logstash start
service httpd start
service crond start

You will have to change GID permissions if the GID is messed up as well - but in general, 'nagios' should belong to GID 100 (users).

-
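After the chown pass, it's worth confirming the change took on every node. A quick sanity-check sketch - the old UID (500) and the log-server path are taken from the example above; adjust them if yours differ:

```shell
# Print the numeric UID of 'nagios' on this node - run it on every NLS
# node and compare; the numbers must match for NFS ownership to line up.
id -u nagios 2>/dev/null || echo 'user nagios not found on this node'

# List any files still owned by the old numeric UID (500 here) -
# anything printed was missed by the chown pass above.
if [ -d /usr/local/nagioslogserver ]; then
  find /usr/local/nagioslogserver -user 500 | head
fi
```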
pccwglobalit
Re: backup
thanks. this is a great way to do that, i think.
by the way, will the backup job back up all of the logs?
Re: backup
Correct - the backup job should back up all of the logs to the repository. This way, you are able to restore them from backup if you wind up losing some information.
You can read more about the process here: http://assets.nagios.com/downloads/nagi ... enance.pdf
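If you want to verify what has been captured, the snapshots in the repository can be listed directly from Elasticsearch. A sketch - it assumes the backup repository is named 'nlsbackup' (the name that appears in the snapshot errors later in this thread) and that Elasticsearch is listening on localhost:9200:

```shell
# List every snapshot stored in the 'nlsbackup' repository, with its state
# (SUCCESS / IN_PROGRESS / FAILED) and the indexes each one contains.
curl -s -XGET 'http://localhost:9200/_snapshot/nlsbackup/_all?pretty' \
  || echo 'cluster not reachable from this host'
```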
-
pccwglobalit
Re: backup
i just started the backup and only two dates have been showing IN_PROGRESS since last night. it has been 24 hours already.
Name State Indexes Actions
logstash-2015.02.11 IN_PROGRESS logstash-2015.02.11 restore delete
logstash-2014.12.18 IN_PROGRESS logstash-2014.12.18 restore delete

backups Waiting SUCCESS 05/11/2015 17:50:13 1 day 05/12/2015 15:50:13 System Edit
backup_maintenance Waiting SUCCESS 05/12/2015 02:14:23 1 day 05/13/2015 02:14:23 System Edit
is it successful, or do we need to wait more time?
Re: backup
How large are the shards that you're backing up? The speed will depend on your storage medium and on the amount of data being snapshotted.
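As a side note, per-index on-disk sizes can be checked with the cat API, which helps estimate how long a snapshot should take (assuming Elasticsearch is listening on localhost:9200):

```shell
# 'store.size' in the output is the on-disk size of each index - the large
# logstash-* indexes are the ones that make snapshots slow.
curl -s -XGET 'http://localhost:9200/_cat/indices?v' \
  || echo 'cluster not reachable from this host'
```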
Please run the following command and report the status to us:
curl -s -XGET 'http://localhost:9200/_cluster/state?pretty' | grep snapshot -A 100

-
pccwglobalit
Re: backup
please find the log
Re: backup
It looks like your 'logstash-2015.02.11' snapshot might be stuck initializing. Let's try killing the snapshot and re-running your backup while following the logs:

curator delete --prefix logstash-2015.02.11 --older-than 1

Start a tail of jobs.log on all of your nodes:

tail -f /usr/local/nagioslogserver/var/jobs.log

jobs.log will be 0b in size because it's truncated often. This is normal. Please run the above command before continuing.

Navigate to 'Administration -> Command Subsystem' and run the 'Reset all Jobs' command. After that is finished, run the 'Backup and Maintenance' command. Let us know the results. Thanks!
-
pccwglobalit
Re: backup
2015-05-13 14:26:55,905 INFO Job starting...
2015-05-13 14:26:55,913 INFO Beginning DELETE operations...
2015-05-13 14:26:55,928 ERROR Could not find a valid timestamp for logstash-2015.02.11 with timestring %Y.%m.%d
2015-05-13 14:26:55,928 INFO DELETE index operations completed.
2015-05-13 14:26:55,928 INFO Done in 0:00:00.044038.
Running command do_maintenance with args ' ' for job id: backup_maintenance
2015-05-13 14:28:47,236 INFO Job starting...
2015-05-13 14:28:47,251 INFO Beginning BLOOM operations...
2015-05-13 14:28:47,287 INFO Attempting to disable bloom filter for index logstash-2014.09.08.
2015-05-13 14:28:47,291 INFO Skipping index logstash-2014.09.08: Already closed.
2015-05-13 14:28:47,291 INFO Attempting to disable bloom filter for index logstash-2014.09.09.
2015-05-13 14:28:47,293 INFO Skipping index logstash-2014.09.09: Already closed.
2015-05-13 14:28:47,293 INFO Attempting to disable bloom filter for index logstash-2014.09.10.
2015-05-13 14:28:47,295 INFO Skipping index logstash-2014.09.10: Already closed.
2015-05-13 14:28:47,295 INFO Attempting to disable bloom filter for index logstash-2014.09.11.
2015-05-13 14:28:47,296 INFO Skipping index logstash-2014.09.11: Already closed.
2015-05-13 14:28:47,297 INFO Attempting to disable bloom filter for index logstash-2014.09.12.
2015-05-13 14:28:47,298 INFO Skipping index logstash-2014.09.12: Already closed.
2015-05-13 14:28:47,298 INFO Attempting to disable bloom filter for index logstash-2014.09.13.
2015-05-13 14:28:47,300 INFO Skipping index logstash-2014.09.13: Already closed.
2015-05-13 14:28:47,300 INFO Attempting to disable bloom filter for index logstash-2014.09.14.
2015-05-13 14:28:47,302 INFO Skipping index logstash-2014.09.14: Already closed.
2015-05-13 14:32:16,692 INFO Attempting to optimize index logstash-2015.02.23.
2015-05-13 14:32:16,985 INFO Skipping index logstash-2015.02.23: Already optimized.
2015-05-13 14:32:16,986 INFO Attempting to optimize index logstash-2015.02.24.
Traceback (most recent call last):
File "/usr/lib/python2.7/site-packages/curator/curator.py", line 736, in <module>
main()
File "/usr/lib/python2.7/site-packages/curator/curator.py", line 731, in main
arguments.func(client, **argdict)
File "/usr/lib/python2.7/site-packages/curator/curator.py", line 585, in command_loop
skipped = op(client, index_name, **kwargs)
File "/usr/lib/python2.7/site-packages/curator/curator.py", line 406, in _create_snapshot
client.snapshot.create(repository=repository, snapshot=snap_name, body=body, wait_for_completion=wait_for_completion)
File "/usr/lib/python2.7/site-packages/elasticsearch/client/utils.py", line 68, in _wrapped
return func(*args, params=params, **kwargs)
File "/usr/lib/python2.7/site-packages/elasticsearch/client/snapshot.py", line 22, in create
repository, snapshot), params=params, body=body)
File "/usr/lib/python2.7/site-packages/elasticsearch/transport.py", line 301, in perform_request
status, headers, data = connection.perform_request(method, url, params, body, ignore=ignore, timeout=timeout)
File "/usr/lib/python2.7/site-packages/elasticsearch/connection/http_urllib3.py", line 82, in perform_request
self._raise_error(response.status, raw_data)
File "/usr/lib/python2.7/site-packages/elasticsearch/connection/base.py", line 102, in _raise_error
raise HTTP_EXCEPTIONS.get(status_code, TransportError)(status_code, error_message, additional_info)
elasticsearch.exceptions.TransportError: TransportError(503, u'ConcurrentSnapshotExecutionException[[nlsbackup:logstash-2015.02.13] a snapshot is already running]')
Re: backup
Is this where your backup procedure stopped? The error "[nlsbackup:logstash-2015.02.13] a snapshot is already running" suggests so. If that's the case, let's try killing the above snapshot and re-running the backup procedure once more, as per my previous post:

curator delete --prefix logstash-2015.02.13 --older-than 1
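Before re-running the backup, you can also confirm that no snapshot is still executing. A sketch using the snapshot status API (assumes Elasticsearch on localhost:9200):

```shell
# An empty "snapshots" list in the response means nothing is currently
# running, so it is safe to kick off 'Backup and Maintenance' again.
curl -s -XGET 'http://localhost:9200/_snapshot/_status?pretty' \
  || echo 'cluster not reachable from this host'
```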