Backup Jobs
Posted: Wed Aug 03, 2016 7:55 pm
by jspink
I've got an NFS mount point configured on all 10 of my instances, and I can create and delete files/directories in that mount point from each instance with no issues. My backup jobs are set up with the repository, and the scheduled job reports "successful" each night.
The tail of job.log shows variations of the below across the 10 nodes:
Code: Select all
  % Total    % Received % Xferd  Average Speed   Time    Time     Time  Current
                                 Dload  Upload   Total   Spent    Left  Speed
  0    99    0    99    0    55   3143   1746 --:--:-- --:--:-- --:--:--  1466
  % Total    % Received % Xferd  Average Speed   Time    Time     Time  Current
                                 Dload  Upload   Total   Spent    Left  Speed
  0    99    0    99    0    55   3188   1771 --:--:-- --:--:-- --:--:--  1419
Yet no backups ever show up under snapshots for me.
Pretty new to the backup portion of NLS, so it's very possible I've missed something somewhere.
result of:
curl -s -XGET localhost:9200/_nodes/?pretty | grep path -A7
Code: Select all
"path" : {
  "data" : "/extdata/data",
  "work" : "/usr/local/nagioslogserver/tmp/elasticsearch",
  "home" : "/usr/local/nagioslogserver/elasticsearch",
  "conf" : "/usr/local/nagioslogserver/elasticsearch/config",
  "logs" : "/var/log/elasticsearch",
  "repo" : "/"
},
Where do I start on the troubleshooting of this?
Re: Backup Jobs
Posted: Wed Aug 03, 2016 9:02 pm
by Box293
What is the output of these commands on ALL of your nodes:
Code: Select all
grep nag /etc/passwd
grep nag /etc/group
The following KB article will help with troubleshooting:
https://support.nagios.com/kb/article.php?id=494
Let us know what steps you followed in the troubleshooting article and what output was produced.
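If it helps when comparing all 10 nodes, here is a minimal sketch that pulls just the numeric uid and gid for one account out of a passwd-format file. The function name uid_gid_of is made up for illustration, and the idea that the numbers should match on every node is an assumption about why the comparison matters for a shared mount.

```shell
# Print "uid:gid" for a user from a passwd-format file.
# uid_gid_of is a hypothetical helper name; the file defaults to /etc/passwd.
uid_gid_of() {
    user=$1
    file=${2:-/etc/passwd}
    # Fields in passwd are colon separated: name:password:uid:gid:...
    awk -F: -v u="$user" '$1 == u { print $3 ":" $4 }' "$file"
}

# Example: run this on each node and compare the output.
# uid_gid_of nagios
```

Any node that prints a different pair than the others would be the first place to look.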
Re: Backup Jobs
Posted: Thu Aug 04, 2016 6:49 am
by jspink
Box293 wrote:What is the output of these commands on ALL of your nodes:
CODE: SELECT ALL
grep nag /etc/passwd
grep nag /etc/group
Return from each of the nodes is as follows:
Code: Select all
grep nag /etc/passwd
nagios:x:500:100::/home/nagios:/bin/bash
grep nag /etc/group
apache:x:48:nagios
nagios:x:500:nagios,apache
Working through the steps in the KB and will return results.
Re: Backup Jobs
Posted: Thu Aug 04, 2016 6:58 am
by jspink
Appears to be permissions of some sort:
Code: Select all
[root@xxx-nag01 ~]# curator snapshot --repository "Backups" indices --all-indices
2016-08-04 07:57:12,616 INFO Job starting: snapshot indices
2016-08-04 07:57:12,616 WARNING Overriding default connection timeout. New timeout: 21600
2016-08-04 07:57:12,649 INFO Matching all indices. Ignoring flags other than --exclude.
2016-08-04 07:57:12,649 INFO Action snapshot will be performed on the following indices: [u'kibana-int', u'logstash-2016.07.19', u'logstash-2016.07.20', u'logstash-2016.07.21', u'logstash-2016.07.22', u'logstash-2016.07.23', u'logstash-2016.07.24', u'logstash-2016.07.25', u'logstash-2016.07.26', u'logstash-2016.07.27', u'logstash-2016.07.28', u'logstash-2016.07.29', u'logstash-2016.07.30', u'logstash-2016.07.31', u'logstash-2016.08.01', u'logstash-2016.08.02', u'logstash-2016.08.03', u'logstash-2016.08.04', u'nagioslogserver', u'nagioslogserver_log']
2016-08-04 07:57:14,085 ERROR Failed to verify all nodes have repository access.
2016-08-04 07:57:14,085 WARNING Job did not complete successfully.
Reviewing each of the nodes for permissions yet again
Re: Backup Jobs
Posted: Thu Aug 04, 2016 7:18 am
by jspink
If I'm missing something here, please let me know. I'm a Windows admin with limited but growing *nix skills.
I've used fstab to mount a share using an AD user/pass that has full permissions to the share.
I can access the mount from each node, and as shown below, I can create a folder (and delete other folders created on other nodes) from each node.
See below for permissions.
Yet I'm still getting the notice "Failed to verify all nodes have repository access."
Code: Select all
[root@xxx-nag10 backups]# mkdir nag10
[root@xxx-nag10 backups]# ls -la
total 8
drwxr-xr-x 1 root root 4096 Aug 4 08:15 .
dr-xr-xr-x. 25 root root 4096 Aug 2 15:28 ..
drwxr-xr-x 0 root root 0 Aug 4 08:14 nag01
drwxr-xr-x 0 root root 0 Aug 4 08:13 nag02
drwxr-xr-x 0 root root 0 Aug 4 08:13 nag03
drwxr-xr-x 0 root root 0 Aug 4 08:13 nag04
drwxr-xr-x 0 root root 0 Aug 4 08:14 nag05
drwxr-xr-x 0 root root 0 Aug 4 08:14 nag06
drwxr-xr-x 0 root root 0 Aug 4 08:14 nag07
drwxr-xr-x 0 root root 0 Aug 4 08:14 nag08
drwxr-xr-x 0 root root 0 Aug 4 08:15 nag09
drwxr-xr-x 0 root root 0 Aug 4 08:15 nag10
[root@xxx-nag10 backups]#
Re: Backup Jobs
Posted: Thu Aug 04, 2016 10:01 am
by jspink
I've reviewed permissions, and verified every way I know how - any further suggestions on troubleshooting?
Re: Backup Jobs
Posted: Thu Aug 04, 2016 10:54 am
by rkennedy
It looks like the directory is owned by root:root, and other does NOT have write permission. You'll want to change the ownership to nagios:nagios, since the backups run as the nagios user.
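To check this yourself, a rough sketch along these lines reads the mode bits of a directory and reports whether "other" has write permission. The function name other_can_write is hypothetical, it assumes GNU stat (stat -c), and /backups in the commented examples is only a placeholder for whatever path your repository actually uses.

```shell
# Succeed (exit 0) if the "other" class has write permission on a path.
# other_can_write is a hypothetical helper name; requires GNU stat.
other_can_write() {
    mode=$(stat -c '%a' "$1") || return 2
    # The last octal digit holds the "other" bits; value 2 is the write bit.
    last=${mode#"${mode%?}"}
    [ $(( last & 2 )) -ne 0 ]
}

# Example (placeholder path):
# other_can_write /backups && echo "other can write" || echo "other cannot write"
#
# A more direct check is to try a write as the nagios user:
# sudo -u nagios touch /backups/.write_test && echo "nagios can write"
```

Depending on how the share is mounted, ownership may be controlled by the mount options rather than by chown, so treat this as a diagnostic only.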
Re: Backup Jobs
Posted: Thu Aug 04, 2016 11:07 am
by jspink
rkennedy wrote:It looks like the directory is owned by root:root, and other does NOT have write permission. You'll want to change the ownership to nagios:nagios, since the backups run as the nagios user.
Not sure where to do that.
Re: Backup Jobs
Posted: Thu Aug 04, 2016 11:12 am
by rkennedy
jspink wrote:rkennedy wrote:It looks like the directory is owned by root:root, and other does NOT have write permission. You'll want to change the ownership to nagios:nagios, since the backups run as the nagios user.
Not sure where to do that.
Going based off of this, where you list your permissions on your mounted folder -
Code: Select all
[root@xxx-nag10 backups]# mkdir nag10
[root@xxx-nag10 backups]# ls -la
total 8
drwxr-xr-x 1 root root 4096 Aug 4 08:15 .
What path are you using on both machines for backups? That's the folder that will need to be writable by nagios. This command should return it -
curl -XGET "localhost:9200/_snapshot?pretty"
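If the JSON that comes back is long, a small filter like the one below can pull out just the repository path. This is a sketch only: repo_locations is a made-up name, and it assumes the response contains a "location" setting for a filesystem-type repository, which may not match your output exactly.

```shell
# Read _snapshot JSON on stdin and print each "location" value, one per line.
# repo_locations is a hypothetical helper name.
repo_locations() {
    sed -n 's/.*"location"[[:space:]]*:[[:space:]]*"\([^"]*\)".*/\1/p'
}

# Example:
# curl -s -XGET "localhost:9200/_snapshot?pretty" | repo_locations
```

Whatever path it prints is the one to check for write access as the nagios user on every node.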
Re: Backup Jobs
Posted: Thu Aug 04, 2016 12:09 pm
by jspink
Path is /backups.
I have 10 nodes total that all mount that at the root.
The nag01 to nag10 folders inside /backups were created from each node to ensure I had write access.