backup
Re: backup
Okay - now we are at a point where jobs.log should output the information that we need to get this issue moving.
Please take a tail of jobs.log from all nodes - it needs to be a follow tail because the jobs.log files are truncated very often.
Before attempting a backup job, use the following command on every node:
tail -f /usr/local/nagioslogserver/var/jobs.log
Then, force a backup_maintenance command from the Command Subsystem. One of your nodes will take the job and run with it - it should display errors if the backup process isn't working. Please let me know what those errors are. Thanks!
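While watching the tail, it can also help to confirm that Log Server's snapshot repository is actually registered in Elasticsearch. A quick check (a sketch, assuming Elasticsearch on its default port 9200 and the default repository name nls-backup):

```shell
# List every snapshot repository registered in the cluster; an empty
# result means snapshot jobs will fail with RepositoryMissingException.
curl -XGET 'http://localhost:9200/_snapshot/_all?pretty'

# Query the 'nls-backup' repository directly (the default name used by
# Log Server's backup job); a 404 here confirms the repository is missing.
curl -XGET 'http://localhost:9200/_snapshot/nls-backup?pretty'
```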
-
pccwglobalit
- Posts: 105
- Joined: Wed Mar 11, 2015 9:00 pm
Re: backup
For the backup, which node will perform it? Is there any script I can run in command mode?
I see that the backup directory in /usr/local/nagioslogserver/scripts/create_backup.sh is BACKUP_DIR="/store/backups/nagioslogserver", which is different from my setting in the GUI.
Can I select the day and do the backup manually in command mode?
Thanks
Re: backup
Running create_backup.sh isn't what you want to do. It only backs up the configuration, not the logs.
Here is what you need to do.
Log in to each of the nodes as root in a shell and run the following:
Code: Select all
tail -f /usr/local/nagioslogserver/var/jobs.log
Then, on one of the nodes, log in to Log Server's GUI and go to "Administration" > "Command Subsystem".
Click "Edit" for the backups, change the "Next Run Time" to be 5 minutes in the future, and save the settings.
On the node where the backup runs, the tail -f output should show the backup happening; post that here so we can look at the errors.
Be sure to check out our Knowledgebase for helpful articles and solutions!
pccwglobalit
Re: backup
Here is the output:
2015-05-06 13:21:14,173 ERROR Error: TransportError(404, u'RemoteTransportException[[181841a1-d717-437c-bd36-6d4a8344abe6][inet[/192.168.78.10:9300]]
[cluster/snapshot/get]]; nested: RepositoryMissingException[[nls-backup] missing]; ')
2015-05-06 13:22:14,946 INFO Attempting to optimize index logstash-2015.04.20.
Traceback (most recent call last):
File "/usr/lib/python2.7/site-packages/curator/curator.py", line 736, in <module>
main()
File "/usr/lib/python2.7/site-packages/curator/curator.py", line 731, in main
arguments.func(client, **argdict)
File "/usr/lib/python2.7/site-packages/curator/curator.py", line 585, in command_loop
skipped = op(client, index_name, **kwargs)
File "/usr/lib/python2.7/site-packages/curator/curator.py", line 406, in _create_snapshot
client.snapshot.create(repository=repository, snapshot=snap_name, body=body, wait_for_completion=wait_for_completion)
File "/usr/lib/python2.7/site-packages/elasticsearch/client/utils.py", line 68, in _wrapped
return func(*args, params=params, **kwargs)
File "/usr/lib/python2.7/site-packages/elasticsearch/client/snapshot.py", line 22, in create
repository, snapshot), params=params, body=body)
File "/usr/lib/python2.7/site-packages/elasticsearch/transport.py", line 307, in perform_request
status, headers, data = connection.perform_request(method, url, params, body, ignore=ignore, timeout=timeout)
File "/usr/lib/python2.7/site-packages/elasticsearch/connection/http_urllib3.py", line 86, in perform_request
self._raise_error(response.status, raw_data)
File "/usr/lib/python2.7/site-packages/elasticsearch/connection/base.py", line 102, in _raise_error
raise HTTP_EXCEPTIONS.get(status_code, TransportError)(status_code, error_message, additional_info)
elasticsearch.exceptions.NotFoundError: TransportError(404, u'RemoteTransportException[[181841a1-d717-437c-bd36-6d4a8344abe6][inet[/192.168.78.10:9300]][
cluster/snapshot/create]]; nested: RepositoryMissingException[[nls-backup] missing]; ')
2015-05-06 13:22:02,289 INFO logstash-2015.02.05: Successfully closed.
2015-05-06 13:22:02,290 INFO logstash-2015.02.06 is within the threshold period (90 days).
2015-05-06 13:22:02,290 INFO logstash-2015.02.07 is within the threshold period (90 days).
2015-05-06 13:22:02,290 INFO logstash-2015.02.08 is within the threshold period (90 days).
2015-05-06 13:22:02,290 INFO logstash-2015.02.09 is within the threshold period (90 days).
2015-05-06 13:22:02,291 INFO logstash-2015.02.10 is within the threshold period (90 days).
2015-05-06 13:22:02,291 INFO logstash-2015.02.11 is within the threshold period (90 days).
pccwglobalit
Re: backup
We have four nodes, can we force one selected one to do backup?
Re: backup
Currently, backups are run as 'global jobs': one node in your cluster will randomly pick the job up and run it. This distributes the jobs, and is also the reason why all nodes need access to the backup repository.
There are a few errors that I want to bring to your attention.
2015-05-06 13:22:14,946 INFO Attempting to optimize index logstash-2015.04.20.
Traceback (most recent call last):
It's possible that this index is corrupt, since it's generating a traceback. You could remove it with the following command:
Code: Select all
curl -XDELETE 'http://localhost:9200/logstash-2015.04.20/'
After deletion, try running the backup again.
elasticsearch.exceptions.NotFoundError: TransportError(404, u'RemoteTransportException[[181841a1-d717-437c-bd36-6d4a8344abe6][inet[/192.168.78.10:9300]][cluster/snapshot/create]]; nested: RepositoryMissingException[[nls-backup] missing]; ')
This error indicates that your repository could be missing. Please ensure that the 'nagios' user has read and write privileges to your repository.
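If the repository really is missing from Elasticsearch, one way to re-register it by hand is the snapshot API. This is a sketch; the repository name and path below are taken from the defaults mentioned in this thread and may differ on your system:

```shell
# Register a shared-filesystem snapshot repository named 'nls-backup'.
# 'location' must be the backup directory configured in the GUI, and it
# must be mounted and writable by the 'nagios' user on every node.
curl -XPUT 'http://localhost:9200/_snapshot/nls-backup' -d '{
  "type": "fs",
  "settings": {
    "location": "/store/backups/nagioslogserver"
  }
}'
```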
pccwglobalit
Re: backup
Will this remove all logs from 4-20?
Can we fix one node to do the backup?
Re: backup
That is correct - all log information from 4-20 would be erased.
All nodes will need to have proper access to the repository to do backups - not just one - because the backup job can be picked up by any single node.
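One way to verify that each node really does have access is a quick write test as the 'nagios' user. The path below is the default backup directory used as an example; substitute your configured one:

```shell
# Run on each node: confirm the 'nagios' user can create and delete a
# file in the backup repository. A failure here will also fail the job.
sudo -u nagios touch /store/backups/nagioslogserver/.write_test \
  && echo "write OK" \
  || echo "write FAILED"
sudo -u nagios rm -f /store/backups/nagioslogserver/.write_test
```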
pccwglobalit
Re: backup
Our backup repository is on an NFS server, and the UIDs on the four nodes are different. How can we solve this? Change all hosts to the same UID, or is there another option?
Re: backup
The most manageable solution would be to use something similar to NIS to manage consistent UIDs across your network. You can change the UIDs manually, but that can cause a lot of pain administratively.
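To see how far apart the nodes currently are, compare the numeric IDs on each one; NFS records ownership by numeric UID, so a mismatch here is exactly what breaks shared access:

```shell
# Run on each node and compare the output: the uid/gid numbers for the
# 'nagios' user must match across all four nodes for NFS file ownership
# to line up.
id nagios
```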
There are many other ways to approach this problem. Below is one such way:
You can avoid the problem of UIDs entirely by granting NFS permissions based on the IP address of the client in question. Let's assume three servers: NFS, NLS1, and NLS2, where NLS1 has an IP of 10.0.0.1 and NLS2 has an IP of 10.0.0.2.
On the NFS server:
Code: Select all
cat /etc/exports
/nlsbackup 10.0.0.1/32(rw) 10.0.0.2/32(rw)
On NLS1 and NLS2:
Code: Select all
cat /etc/fstab
NFSIP:/nlsbackup /mnt/nlsback nfs defaults 0 0
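After editing those files, the export has to be re-read on the NFS server and the share mounted on each node. A minimal sequence (the mount point is just the example path from the fstab line above):

```shell
# On the NFS server: re-export everything listed in /etc/exports.
exportfs -ra

# On each Log Server node: create the mount point, mount all /etc/fstab
# entries, then confirm the NFS share is attached.
mkdir -p /mnt/nlsback
mount -a
df -h /mnt/nlsback
```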