Backup Jobs

This support forum board is for support questions relating to Nagios Log Server, our solution for managing and monitoring critical log data.
jspink
Posts: 43
Joined: Wed Nov 25, 2015 3:27 pm

Backup Jobs

Post by jspink »

I've got an NFS mount point configured on all 10 of my instances, and I can create and delete files/directories in that mount point from each instance with no issues. My backup jobs are set up with the repository, and the scheduled job reports "successful" each night.

The tail of job.log shows variations of the below across all 10 nodes:

Code: Select all

  % Total    % Received % Xferd  Average Speed   Time    Time     Time  Current
                                 Dload  Upload   Total   Spent    Left  Speed
  0    99    0    99    0    55   3143   1746 --:--:-- --:--:-- --:--:--  1466
  % Total    % Received % Xferd  Average Speed   Time    Time     Time  Current
                                 Dload  Upload   Total   Spent    Left  Speed
  0    99    0    99    0    55   3188   1771 --:--:-- --:--:-- --:--:--  1419
Yet no backups ever show up under snapshots for me.
I'm pretty new to the backup portion of NLS, so it's very possible I've missed something somewhere.


result of:
curl -s -XGET localhost:9200/_nodes/?pretty | grep path -A7

Code: Select all

 "path" : {
          "data" : "/extdata/data",
          "work" : "/usr/local/nagioslogserver/tmp/elasticsearch",
          "home" : "/usr/local/nagioslogserver/elasticsearch",
          "conf" : "/usr/local/nagioslogserver/elasticsearch/config",
          "logs" : "/var/log/elasticsearch",
          "repo" : "/"
        },



Where do I start troubleshooting this?
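One starting point is to confirm that every node reports the same "repo" path. This is a minimal sketch, assuming the pretty-printed `_nodes` output has the shape shown above; `repo_paths` is a hypothetical helper name, not something shipped with Nagios Log Server. As far as I understand Elasticsearch, a filesystem repository has to live at or below a path listed under "repo", so a node with a different value would be worth a closer look.

```shell
#!/bin/sh
# repo_paths: hypothetical helper, not part of Nagios Log Server.
# Reads pretty-printed JSON from `GET /_nodes` on stdin and prints the
# value of every "repo" line, one line per node.
repo_paths() {
    sed -n 's/^[[:space:]]*"repo"[[:space:]]*:[[:space:]]*"\(.*\)".*$/\1/p'
}

# Example (run on any node); ideally every node prints the same path:
#   curl -s -XGET 'localhost:9200/_nodes/?pretty' | repo_paths | sort | uniq -c
```

With the output posted above, each node would print `/`, which covers any mount point on the box.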
Nagios Log Server: 10 Instances - 3,916,302,797 documents last check in 180 shards
Box293
Too Basu
Posts: 5126
Joined: Sun Feb 07, 2010 10:55 pm
Location: Deniliquin, Australia

Re: Backup Jobs

Post by Box293 »

What is the output of these commands on ALL of your nodes:

Code: Select all

grep nag /etc/passwd
grep nag /etc/group
The following KB article will help with troubleshooting:

https://support.nagios.com/kb/article.php?id=494

Let us know what steps you followed in the troubleshooting article and what output was produced.
jspink
Posts: 43
Joined: Wed Nov 25, 2015 3:27 pm

Re: Backup Jobs

Post by jspink »

Box293 wrote:What is the output of these commands on ALL of your nodes:
grep nag /etc/passwd
grep nag /etc/group
Return from each of the nodes is as follows:

Code: Select all

grep nag /etc/passwd
nagios:x:500:100::/home/nagios:/bin/bash

grep nag /etc/group
apache:x:48:nagios
nagios:x:500:nagios,apache
I'm working through the steps in the KB article and will report the results.
jspink
Posts: 43
Joined: Wed Nov 25, 2015 3:27 pm

Re: Backup Jobs

Post by jspink »

Appears to be permissions of some sort:

Code: Select all

[root@xxx-nag01 ~]# curator snapshot --repository "Backups" indices --all-indices
2016-08-04 07:57:12,616 INFO      Job starting: snapshot indices
2016-08-04 07:57:12,616 WARNING   Overriding default connection timeout.  New timeout: 21600
2016-08-04 07:57:12,649 INFO      Matching all indices. Ignoring flags other than --exclude.
2016-08-04 07:57:12,649 INFO      Action snapshot will be performed on the following indices: [u'kibana-int', u'logstash-2016.07.19', u'logstash-2016.07.20', u'logstash-2016.07.21', u'logstash-2016.07.22', u'logstash-2016.07.23', u'logstash-2016.07.24', u'logstash-2016.07.25', u'logstash-2016.07.26', u'logstash-2016.07.27', u'logstash-2016.07.28', u'logstash-2016.07.29', u'logstash-2016.07.30', u'logstash-2016.07.31', u'logstash-2016.08.01', u'logstash-2016.08.02', u'logstash-2016.08.03', u'logstash-2016.08.04', u'nagioslogserver', u'nagioslogserver_log']
2016-08-04 07:57:14,085 ERROR     Failed to verify all nodes have repository access.
2016-08-04 07:57:14,085 WARNING   Job did not complete successfully.
I'm reviewing the permissions on each of the nodes yet again.
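The same verification can be requested from Elasticsearch directly, which may give more detail than the curator log line. This is a sketch only, assuming the bundled Elasticsearch version supports the `_verify` snapshot endpoint; `verify_repo` and the `CURL` override are hypothetical names introduced here for illustration.

```shell
#!/bin/sh
# verify_repo: hypothetical helper. Asks Elasticsearch to verify that every
# node can access the named snapshot repository.
#   $1 = host:port of any node, $2 = repository name
# CURL can be overridden (for example CURL=echo) to preview the request
# instead of sending it.
verify_repo() {
    ${CURL:-curl} -s -XPOST "http://$1/_snapshot/$2/_verify?pretty"
}

# Example, using the repository name from the curator command above:
#   verify_repo localhost:9200 Backups
```

If a node cannot read or write the repository location, the response should contain an error pointing at that node rather than a plain list of verified nodes.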
jspink
Posts: 43
Joined: Wed Nov 25, 2015 3:27 pm

Re: Backup Jobs

Post by jspink »

If I'm missing something here, please let me know. I'm a Windows admin with limited but growing *nix skills.
I've used fstab to mount a share using an AD user/pass that has full permissions to the share.
I can access the mount from each node and, as shown below, I can create a folder (and delete folders created on other nodes) from each node.
See below for permissions.
Yet I'm still getting the notice "Failed to verify all nodes have repository access."

Code: Select all

[root@xxx-nag10 backups]# mkdir nag10
[root@xxx-nag10 backups]# ls -la
total 8
drwxr-xr-x   1 root root 4096 Aug  4 08:15 .
dr-xr-xr-x. 25 root root 4096 Aug  2 15:28 ..
drwxr-xr-x   0 root root    0 Aug  4 08:14 nag01
drwxr-xr-x   0 root root    0 Aug  4 08:13 nag02
drwxr-xr-x   0 root root    0 Aug  4 08:13 nag03
drwxr-xr-x   0 root root    0 Aug  4 08:13 nag04
drwxr-xr-x   0 root root    0 Aug  4 08:14 nag05
drwxr-xr-x   0 root root    0 Aug  4 08:14 nag06
drwxr-xr-x   0 root root    0 Aug  4 08:14 nag07
drwxr-xr-x   0 root root    0 Aug  4 08:14 nag08
drwxr-xr-x   0 root root    0 Aug  4 08:15 nag09
drwxr-xr-x   0 root root    0 Aug  4 08:15 nag10
[root@xxx-nag10 backups]#
jspink
Posts: 43
Joined: Wed Nov 25, 2015 3:27 pm

Re: Backup Jobs

Post by jspink »

I've reviewed permissions, and verified every way I know how - any further suggestions on troubleshooting?
rkennedy
Posts: 6579
Joined: Mon Oct 05, 2015 11:45 am

Re: Backup Jobs

Post by rkennedy »

It looks like your permissions are set to root:root, and other does NOT have write permissions. You'll want to change the ownership to nagios:nagios, since the backups run as the nagios user.
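Since the earlier tests were all run as root, it may help to repeat the write test as the nagios user. A minimal sketch, assuming a POSIX shell on each node; `can_write` is a hypothetical helper and `/path/to/backups` is a placeholder for your repository directory.

```shell
#!/bin/sh
# can_write: hypothetical helper. Returns 0 if the user running it can
# create and remove a file in directory $1, and 1 otherwise.
can_write() {
    probe="$1/.write_probe.$$"
    if touch "$probe" 2>/dev/null; then
        rm -f "$probe"
        return 0
    fi
    return 1
}

# Equivalent one-off check as the nagios user (run as root on each node):
#   sudo -u nagios sh -c 'touch /path/to/backups/.write_probe \
#       && rm /path/to/backups/.write_probe && echo writable'
```

If root can write but nagios cannot, that would match the drwxr-xr-x root root listing shown earlier.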
Former Nagios Employee
jspink
Posts: 43
Joined: Wed Nov 25, 2015 3:27 pm

Re: Backup Jobs

Post by jspink »

rkennedy wrote:It looks like your permissions are set to root:root, and other does NOT have write permissions. You'll want to adjust this to be nagios:nagios as the nagios user is whom the backups will run as.
Not sure where to do that.
rkennedy
Posts: 6579
Joined: Mon Oct 05, 2015 11:45 am

Re: Backup Jobs

Post by rkennedy »

jspink wrote:
rkennedy wrote:It looks like your permissions are set to root:root, and other does NOT have write permissions. You'll want to adjust this to be nagios:nagios as the nagios user is whom the backups will run as.
Not sure where to do that.
Going by this, where you list the permissions on your mounted folder:

Code: Select all

[root@xxx-nag10 backups]# mkdir nag10
[root@xxx-nag10 backups]# ls -la
total 8
drwxr-xr-x   1 root root 4096 Aug  4 08:15 .
What path are you using on all of your machines for backups? That's the folder that will need to be writable by nagios. This command should return it: curl -XGET "localhost:9200/_snapshot?pretty"
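To tie the two together, the repository location can be pulled out of that response and its ownership printed in one go. This is a sketch under assumptions: it expects the pretty-printed response to contain a "location" line per repository, it relies on GNU `stat`, and `repo_locations` and `show_owner` are hypothetical helper names.

```shell
#!/bin/sh
# repo_locations: hypothetical helper. Reads pretty-printed JSON from
# `GET /_snapshot` on stdin and prints each repository "location".
repo_locations() {
    sed -n 's/^[[:space:]]*"location"[[:space:]]*:[[:space:]]*"\(.*\)".*$/\1/p'
}

# show_owner: prints "owner:group mode path" for a directory (GNU stat).
show_owner() {
    stat -c '%U:%G %a %n' "$1"
}

# Example (run on each node):
#   curl -s -XGET 'localhost:9200/_snapshot?pretty' | repo_locations |
#       while read -r dir; do show_owner "$dir"; done
```

A line starting with `root:root 755` would mean only root can write there, so the nagios user would be refused.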
jspink
Posts: 43
Joined: Wed Nov 25, 2015 3:27 pm

Re: Backup Jobs

Post by jspink »

The path is /backups.

I have 10 nodes total that all mount it at the root.

The nag01 to nag10 folders inside /backups were created from each node to ensure I had write access.