backup

This support forum board is for support questions relating to Nagios Log Server, our solution for managing and monitoring critical log data.
jolson
Attack Rabbit
Posts: 2560
Joined: Thu Feb 12, 2015 12:40 pm

Re: backup

Post by jolson »

Okay - now we are at a point where jobs.log should output the information that we need to get this issue moving.

Please take a tail of jobs.log on all nodes - it needs to be a follow tail (tail -f) because the jobs.log files are truncated very often.

Before attempting a backup job, use the following command on every node:
tail -f /usr/local/nagioslogserver/var/jobs.log

Then, force a backup_maintenance command from the Command Subsystem. One of your nodes will take the job and run with it - it should display errors if the backup process isn't working. Please let me know what those errors are. Thanks!
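If you have shell access to all nodes, one way to watch every tail at once from a single terminal is a loop like the following (a sketch - nls1 through nls4 are placeholder hostnames for your actual nodes):

```shell
# Placeholder hostnames - substitute your real Log Server node addresses.
for node in nls1 nls2 nls3 nls4; do
    # Prefix each line with its node name so the streams stay distinguishable.
    ssh root@"$node" 'tail -f /usr/local/nagioslogserver/var/jobs.log' \
        | sed "s/^/[$node] /" &
done
wait    # Ctrl-C stops all four tails
```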
Twits Blog
Show me a man who lives alone and has a perpetually clean kitchen, and 8 times out of 9 I'll show you a man with detestable spiritual qualities.
pccwglobalit
Posts: 105
Joined: Wed Mar 11, 2015 9:00 pm

Re: backup

Post by pccwglobalit »

For the backup, which node will perform it? Is there any script I can run in command mode?
I see that /usr/local/nagioslogserver/scripts/create_backup.sh uses the backup directory BACKUP_DIR="/store/backups/nagioslogserver", which is different from my setting in the GUI.

Can I select the day and run the backup manually in command mode?

thanks
tgriep
Madmin
Posts: 9190
Joined: Thu Oct 30, 2014 9:02 am

Re: backup

Post by tgriep »

Running create_backup.sh isn't what you want to do. It only backs up the configuration, not the logs.
Here is what you need to do.
Log in to each of the nodes as root in a shell and run the following.

Code: Select all

tail -f /usr/local/nagioslogserver/var/jobs.log
Then, on one of the nodes, log in to Log Server's GUI and go to "Administration" > "Command Subsystem".
Click "Edit" for the backups, change the "Next Run Time" to 5 minutes in the future, and save the settings.

On whichever node picks up the job, the running tail -f command should show the backup happening; post that output here so we can look at the errors.
Be sure to check out our Knowledgebase for helpful articles and solutions!
pccwglobalit
Posts: 105
Joined: Wed Mar 11, 2015 9:00 pm

Re: backup

Post by pccwglobalit »

Here it is:
2015-05-06 13:21:14,173 ERROR Error: TransportError(404, u'RemoteTransportException[[181841a1-d717-437c-bd36-6d4a8344abe6][inet[/192.168.78.10:9300]]
[cluster/snapshot/get]]; nested: RepositoryMissingException[[nls-backup] missing]; ')

2015-05-06 13:22:14,946 INFO Attempting to optimize index logstash-2015.04.20.
Traceback (most recent call last):
File "/usr/lib/python2.7/site-packages/curator/curator.py", line 736, in <module>
main()
File "/usr/lib/python2.7/site-packages/curator/curator.py", line 731, in main
arguments.func(client, **argdict)
File "/usr/lib/python2.7/site-packages/curator/curator.py", line 585, in command_loop
skipped = op(client, index_name, **kwargs)
File "/usr/lib/python2.7/site-packages/curator/curator.py", line 406, in _create_snapshot
client.snapshot.create(repository=repository, snapshot=snap_name, body=body, wait_for_completion=wait_for_completion)
File "/usr/lib/python2.7/site-packages/elasticsearch/client/utils.py", line 68, in _wrapped
return func(*args, params=params, **kwargs)
File "/usr/lib/python2.7/site-packages/elasticsearch/client/snapshot.py", line 22, in create
repository, snapshot), params=params, body=body)
File "/usr/lib/python2.7/site-packages/elasticsearch/transport.py", line 307, in perform_request
status, headers, data = connection.perform_request(method, url, params, body, ignore=ignore, timeout=timeout)
File "/usr/lib/python2.7/site-packages/elasticsearch/connection/http_urllib3.py", line 86, in perform_request
self._raise_error(response.status, raw_data)
File "/usr/lib/python2.7/site-packages/elasticsearch/connection/base.py", line 102, in _raise_error
raise HTTP_EXCEPTIONS.get(status_code, TransportError)(status_code, error_message, additional_info)
elasticsearch.exceptions.NotFoundError: TransportError(404, u'RemoteTransportException[[181841a1-d717-437c-bd36-6d4a8344abe6][inet[/192.168.78.10:9300]][
cluster/snapshot/create]]; nested: RepositoryMissingException[[nls-backup] missing]; ')


2015-05-06 13:22:02,289 INFO logstash-2015.02.05: Successfully closed.
2015-05-06 13:22:02,290 INFO logstash-2015.02.06 is within the threshold period (90 days).
2015-05-06 13:22:02,290 INFO logstash-2015.02.07 is within the threshold period (90 days).
2015-05-06 13:22:02,290 INFO logstash-2015.02.08 is within the threshold period (90 days).
2015-05-06 13:22:02,290 INFO logstash-2015.02.09 is within the threshold period (90 days).
2015-05-06 13:22:02,291 INFO logstash-2015.02.10 is within the threshold period (90 days).
2015-05-06 13:22:02,291 INFO logstash-2015.02.11 is within the threshold period (90 days).
pccwglobalit
Posts: 105
Joined: Wed Mar 11, 2015 9:00 pm

Re: backup

Post by pccwglobalit »

We have four nodes; can we force a specific one to do the backup?
jolson
Attack Rabbit
Posts: 2560
Joined: Thu Feb 12, 2015 12:40 pm

Re: backup

Post by jolson »

We have four nodes; can we force a specific one to do the backup?
Currently backups are run as 'global jobs' - this means that one node in your cluster will randomly pick the job up and run it. This distributes the jobs, and is also the reason why all nodes need access to the backup repository.

There are a few errors that I want to bring to your attention:
2015-05-06 13:22:14,946 INFO Attempting to optimize index logstash-2015.04.20.
Traceback (most recent call last):
It's possible that this index is corrupt, since it's generating a traceback. You could remove it with the following command:
curl -XDELETE 'http://localhost:9200/logstash-2015.04.20/'
After deletion, try running the backup again.
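If you would rather confirm the index is actually unhealthy before deleting it, one way (assuming Elasticsearch is answering on the default port 9200) is to ask for that index's cluster health:

```shell
# A "red" status would support the corruption theory; green or yellow
# suggests the traceback may have another cause.
curl -s 'http://localhost:9200/_cluster/health/logstash-2015.04.20?pretty'
```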
elasticsearch.exceptions.NotFoundError: TransportError(404, u'RemoteTransportException[[181841a1-d717-437c-bd36-6d4a8344abe6][inet[/192.168.78.10:9300]][
cluster/snapshot/create]]; nested: RepositoryMissingException[[nls-backup] missing]; ')
This error indicates that your repository could be missing. Please ensure that the 'nagios' user has read and write privileges to your repository.
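If the repository really is unregistered, one possible fix is to re-register it with Elasticsearch. This is a sketch: the repository name is taken from the error message, and the location is an assumption - it must match the backup directory configured in the GUI and be reachable from every node.

```shell
# Re-register the 'nls-backup' filesystem repository.
# The location below is an assumption - use the directory from your GUI settings.
curl -XPUT 'http://localhost:9200/_snapshot/nls-backup' -d '{
    "type": "fs",
    "settings": {
        "location": "/store/backups/nagioslogserver"
    }
}'

# Confirm the cluster can now see the repository:
curl -XGET 'http://localhost:9200/_snapshot/nls-backup?pretty'
```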
pccwglobalit
Posts: 105
Joined: Wed Mar 11, 2015 9:00 pm

Re: backup

Post by pccwglobalit »

Will this remove all logs from 4-20?

Can we fix one node to do the backup?
jolson
Attack Rabbit
Posts: 2560
Joined: Thu Feb 12, 2015 12:40 pm

Re: backup

Post by jolson »

Will this remove all logs from 4-20?
That is correct - all log information from 4-20 would be erased.
Can we fix one node to do the backup?
All nodes - not just one - will need proper access to the repository to do backups, because the backup job can be picked up by any single node.
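A quick sanity check you could run on every node (the path is assumed from create_backup.sh - substitute your GUI-configured backup directory) to prove the nagios user can write to the repository:

```shell
# Attempt a write as the nagios user, then clean up the marker file.
sudo -u nagios touch /store/backups/nagioslogserver/.write_test \
    && echo 'write OK' \
    && sudo -u nagios rm -f /store/backups/nagioslogserver/.write_test
```

If any node fails this check, that node will fail the backup job whenever it happens to pick it up.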
pccwglobalit
Posts: 105
Joined: Wed Mar 11, 2015 9:00 pm

Re: backup

Post by pccwglobalit »

Our backup repository is on an NFS server, and the nagios UID is different on each of the four nodes. How can we solve this - change all hosts to the same UID, or is there another option?
jolson
Attack Rabbit
Posts: 2560
Joined: Thu Feb 12, 2015 12:40 pm

Re: backup

Post by jolson »

Our backup repository is on an NFS server, and the nagios UID is different on each of the four nodes. How can we solve this - change all hosts to the same UID, or is there another option?
The most manageable solution would be to use something like NIS to maintain consistent UIDs across your network. You can change the UIDs manually, but that can cause a lot of administrative pain.
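One quick way to see how far apart the nodes currently are is to compare the nagios account on each of them:

```shell
# Run on every node (or via ssh) - the uid and gid values must match
# across nodes for file ownership on the NFS share to line up.
id nagios
```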

There are many other ways to approach this problem. Below is one such way:

You can avoid the UID problem entirely by granting NFS permissions based on the IP address of the client in question. Let's assume three servers - NFS, NLS1, and NLS2. NLS1 has an IP of 10.0.0.1 and NLS2 has an IP of 10.0.0.2.

NFS:

Code: Select all

cat /etc/exports
/nlsbackup            10.0.0.1/32(rw) 10.0.0.2/32(rw)
NLS1 & NLS2:

Code: Select all

cat /etc/fstab
NFSIP:/nlsbackup /mnt/nlsback nfs defaults 0 0
This should work appropriately, but requires a little more management overhead. It also relies on source IP addresses for security - which could be undesirable. Let me know if you need any help along the way. Thanks!
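For completeness, the commands to apply those two files (a sketch - the mount point is taken from the fstab line above):

```shell
# On the NFS server, after editing /etc/exports:
exportfs -ra            # re-export all shares

# On each Log Server node, after adding the fstab entry:
mkdir -p /mnt/nlsback   # create the mount point if it doesn't exist
mount -a                # mount everything listed in /etc/fstab
df -h /mnt/nlsback      # confirm the NFS share is mounted
```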