backup

This support forum board is for support questions relating to Nagios Log Server, our solution for managing and monitoring critical log data.
pccwglobalit
Posts: 105
Joined: Wed Mar 11, 2015 9:00 pm

Re: backup

Post by pccwglobalit »

Thanks.
The nagios UID is currently 8004 on nls1 and 8005 on nls2.
When nls1 writes something, it shows up on nls2 as owned by UID 8004. But nls2 can also write to the NFS share; will that affect the backup?
Or is it enough that every node can write to the NFS share, with no need for the same UID?
Thanks for your help.
jolson
Attack Rabbit
Posts: 2560
Joined: Thu Feb 12, 2015 12:40 pm

Re: backup

Post by jolson »

Personally, I would get the UIDs synchronized. I believe it's the best solution here, and it will help us avoid a lot of potential complications. Any chance you can change the 'nagios' user UIDs on your NLS nodes?

This is a great resource regarding how to do so: https://muffinresearch.co.uk/linux-chan ... -for-user/
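
Before changing anything, it may help to confirm what each node currently reports. The following is a minimal sketch, not an NLS tool: `uid_matches` is a hypothetical helper name, and 8004 is simply nls1's value from the question, assumed here to be the UID chosen as the common one.

```shell
#!/bin/sh
# uid_matches is a hypothetical helper (not part of NLS).
# It succeeds if the given user has the expected numeric UID on this host,
# returns 1 if the UID differs, and 2 if the user does not exist.
uid_matches() {
    actual=$(id -u "$1" 2>/dev/null) || return 2
    [ "$actual" = "$2" ]
}

# Example: run on every node with the UID you want all nodes to share.
if uid_matches nagios 8004; then
    echo "nagios UID is 8004 on $(uname -n)"
else
    echo "nagios UID differs or the user is missing on $(uname -n)"
fi
```

Running this on each node shows which ones still need the change described below.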

I tested the steps below on one of my NLS nodes, and I cannot see any problems. I changed the 'nagios' UID from 500 to 501:

Gracefully shut down NLS and company:

Code: Select all

service elasticsearch stop
service logstash stop
service httpd stop
service crond stop

Kill any hanging processes:

Code: Select all

pkill -u nagios

Change the UID and file ownership:

Code: Select all

usermod -u 501 nagios
find / -user 500 -exec chown -h 501 {} \;

Start up all processes:

Code: Select all

service elasticsearch start
service logstash start
service httpd start
service crond start

If the GID is wrong as well, you will have to fix group ownership too, but in general 'nagios' should belong to GID 100 (users).
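
The steps above can be collected into a dry-run script for review before touching a node. This is only a sketch under assumptions: `plan_uid_change` is a hypothetical helper name, and it prints the commands from this post rather than running them, so nothing is stopped or changed.

```shell
#!/bin/sh
# plan_uid_change is a hypothetical helper (not part of NLS).
# It only PRINTS the commands from the procedure above; it runs nothing.
plan_uid_change() {
    user=$1; old_uid=$2; new_uid=$3

    # Stop the services that run as (or depend on) the user.
    for svc in elasticsearch logstash httpd crond; do
        printf '%s\n' "service $svc stop"
    done

    # Kill leftovers, change the UID, then re-own files still held by the old UID.
    printf '%s\n' "pkill -u $user"
    printf '%s\n' "usermod -u $new_uid $user"
    printf '%s\n' "find / -user $old_uid -exec chown -h $new_uid {} \\;"

    # Bring the services back up in the same order.
    for svc in elasticsearch logstash httpd crond; do
        printf '%s\n' "service $svc start"
    done
}

# Example: print the plan for the 500 -> 501 change described above.
plan_uid_change nagios 500 501
```

Reviewing the printed plan first makes it easier to catch a wrong UID before `find` starts rewriting ownership across the filesystem.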
Twits Blog
Show me a man who lives alone and has a perpetually clean kitchen, and 8 times out of 9 I'll show you a man with detestable spiritual qualities.

Post by pccwglobalit »

Thanks, I think this is a great way to do it.

By the way, will the backup job back up all of the logs?

Post by jolson »

Correct - the backup job should back up all of the logs to the repository. This way, you are able to restore them from backup if you wind up losing some information.

You can read more about the process here: http://assets.nagios.com/downloads/nagi ... enance.pdf

Post by pccwglobalit »

I just started the backup, and only two dates have been showing as in progress since last night. It has been 24 hours already.

Name                 State        Indexes              Actions
logstash-2015.02.11  IN_PROGRESS  logstash-2015.02.11  restore delete
logstash-2014.12.18  IN_PROGRESS  logstash-2014.12.18  restore delete

backups             Waiting  SUCCESS  05/11/2015 17:50:13  1 day  05/12/2015 15:50:13  System  Edit
backup_maintenance  Waiting  SUCCESS  05/12/2015 02:14:23  1 day  05/13/2015 02:14:23  System  Edit

Was it successful, or do we need to wait longer?

Post by jolson »

How large are the shards that you're backing up? The speed will depend on your storage medium, and on the size of your information.

Please run the following command and report the status to us:

Code: Select all

curl -s -XGET 'http://localhost:9200/_cluster/state?pretty' | grep snapshot -A 100
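
If the cluster state output is hard to read, the snapshot states can be tallied with a small filter. This is a sketch under assumptions: `summarize_states` is a hypothetical helper name, and the `_snapshot/_status` endpoint shown in the usage comment is assumed to be available in the Elasticsearch version bundled with your NLS install.

```shell
#!/bin/sh
# summarize_states is a hypothetical helper (not part of NLS or Elasticsearch).
# It reads JSON on stdin and prints each "state" value with how often it occurs,
# one "STATE COUNT" pair per line, sorted by state name.
summarize_states() {
    grep -o '"state" *: *"[A-Z_]*"' |
        sed 's/.*"\([A-Z_]*\)"$/\1/' |
        sort | uniq -c |
        awk '{ print $2, $1 }'
}

# Usage (assumes a reachable node and the snapshot status endpoint):
#   curl -s -XGET 'http://localhost:9200/_snapshot/_status?pretty' | summarize_states
```

A snapshot that stays in the same state across several runs, with no change in between, is a hint that it may be stuck rather than slow.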

Post by pccwglobalit »

Please find the log attached.

Post by jolson »

It looks like your 'logstash-2015.02.11' snapshot might be stuck initializing. Let's try killing the snapshot and re-running your backup while following the logs:

Code: Select all

curator delete --prefix logstash-2015.02.11 --older-than 1

Start a tail of jobs.log on all of your nodes:

Code: Select all

tail -f /usr/local/nagioslogserver/var/jobs.log

jobs.log will often be 0 bytes in size because it is truncated frequently. This is normal. Please run the above command before continuing.

Navigate to 'Administration -> Command Subsystem' and run the 'Reset all Jobs' command. After that is finished, run the 'Backup and Maintenance' command. Let us know the results. Thanks!

Post by pccwglobalit »

2015-05-13 14:26:55,905 INFO Job starting...
2015-05-13 14:26:55,913 INFO Beginning DELETE operations...
2015-05-13 14:26:55,928 ERROR Could not find a valid timestamp for logstash-2015.02.11 with timestring %Y.%m.%d
2015-05-13 14:26:55,928 INFO DELETE index operations completed.
2015-05-13 14:26:55,928 INFO Done in 0:00:00.044038.


Running command do_maintenance with args ' ' for job id: backup_maintenance
2015-05-13 14:28:47,236 INFO Job starting...
2015-05-13 14:28:47,251 INFO Beginning BLOOM operations...
2015-05-13 14:28:47,287 INFO Attempting to disable bloom filter for index logstash-2014.09.08.
2015-05-13 14:28:47,291 INFO Skipping index logstash-2014.09.08: Already closed.
2015-05-13 14:28:47,291 INFO Attempting to disable bloom filter for index logstash-2014.09.09.
2015-05-13 14:28:47,293 INFO Skipping index logstash-2014.09.09: Already closed.
2015-05-13 14:28:47,293 INFO Attempting to disable bloom filter for index logstash-2014.09.10.
2015-05-13 14:28:47,295 INFO Skipping index logstash-2014.09.10: Already closed.
2015-05-13 14:28:47,295 INFO Attempting to disable bloom filter for index logstash-2014.09.11.
2015-05-13 14:28:47,296 INFO Skipping index logstash-2014.09.11: Already closed.
2015-05-13 14:28:47,297 INFO Attempting to disable bloom filter for index logstash-2014.09.12.
2015-05-13 14:28:47,298 INFO Skipping index logstash-2014.09.12: Already closed.
2015-05-13 14:28:47,298 INFO Attempting to disable bloom filter for index logstash-2014.09.13.
2015-05-13 14:28:47,300 INFO Skipping index logstash-2014.09.13: Already closed.
2015-05-13 14:28:47,300 INFO Attempting to disable bloom filter for index logstash-2014.09.14.
2015-05-13 14:28:47,302 INFO Skipping index logstash-2014.09.14: Already closed.


2015-05-13 14:32:16,692 INFO Attempting to optimize index logstash-2015.02.23.
2015-05-13 14:32:16,985 INFO Skipping index logstash-2015.02.23: Already optimized.
2015-05-13 14:32:16,986 INFO Attempting to optimize index logstash-2015.02.24.
Traceback (most recent call last):
  File "/usr/lib/python2.7/site-packages/curator/curator.py", line 736, in <module>
    main()
  File "/usr/lib/python2.7/site-packages/curator/curator.py", line 731, in main
    arguments.func(client, **argdict)
  File "/usr/lib/python2.7/site-packages/curator/curator.py", line 585, in command_loop
    skipped = op(client, index_name, **kwargs)
  File "/usr/lib/python2.7/site-packages/curator/curator.py", line 406, in _create_snapshot
    client.snapshot.create(repository=repository, snapshot=snap_name, body=body, wait_for_completion=wait_for_completion)
  File "/usr/lib/python2.7/site-packages/elasticsearch/client/utils.py", line 68, in _wrapped
    return func(*args, params=params, **kwargs)
  File "/usr/lib/python2.7/site-packages/elasticsearch/client/snapshot.py", line 22, in create
    repository, snapshot), params=params, body=body)
  File "/usr/lib/python2.7/site-packages/elasticsearch/transport.py", line 301, in perform_request
    status, headers, data = connection.perform_request(method, url, params, body, ignore=ignore, timeout=timeout)
  File "/usr/lib/python2.7/site-packages/elasticsearch/connection/http_urllib3.py", line 82, in perform_request
    self._raise_error(response.status, raw_data)
  File "/usr/lib/python2.7/site-packages/elasticsearch/connection/base.py", line 102, in _raise_error
    raise HTTP_EXCEPTIONS.get(status_code, TransportError)(status_code, error_message, additional_info)
elasticsearch.exceptions.TransportError: TransportError(503, u'ConcurrentSnapshotExecutionException[[nlsbackup:logstash-2015.02.13] a snapshot is already running]')

Post by jolson »

[[nlsbackup:logstash-2015.02.13] a snapshot is already running]
Is this where your backup procedure stopped? If so, let's try killing the above snapshot and re-running the backup procedure once more as per my previous post.

Code: Select all

curator delete --prefix logstash-2015.02.13 --older-than 1

Start the jobs.log tail, and re-run the backup command from the GUI.
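
Note that the job log earlier in the thread reports "DELETE index operations" for the curator command, so it may not touch the running snapshot itself. As an alternative sketch, Elasticsearch's snapshot API aborts a running snapshot when that snapshot is deleted. The helper names `snapshot_url` and `abort_snapshot` are hypothetical, the repository name `nlsbackup` is taken from the error message, and the script only prints the request unless `DRY_RUN=0` is set. Deleting also removes that snapshot from the repository, so this only makes sense for one that is stuck or incomplete.

```shell
#!/bin/sh
# snapshot_url and abort_snapshot are hypothetical helpers (not part of NLS).
ES_URL=${ES_URL:-http://localhost:9200}

# Build the URL of one snapshot: <ES_URL>/_snapshot/<repository>/<snapshot>
snapshot_url() {
    printf '%s/_snapshot/%s/%s\n' "$ES_URL" "$1" "$2"
}

# Deleting a snapshot that is still running aborts it.
# By default (DRY_RUN unset or 1) the request is only printed, not sent.
abort_snapshot() {
    url=$(snapshot_url "$1" "$2")
    if [ "${DRY_RUN:-1}" = "1" ]; then
        printf 'curl -s -XDELETE %s\n' "$url"
    else
        curl -s -XDELETE "$url"
    fi
}

# Example: show the request for the snapshot named in the error above.
abort_snapshot nlsbackup logstash-2015.02.13
```

Once the printed request looks right, running it again with `DRY_RUN=0` sends it to the node.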