
Backup error from a single node

Posted: Thu Oct 24, 2019 9:33 am
by rferebee
Good morning,

Since cutting over my servers to CentOS 7, I've noticed an error repeating for one of my nodes whenever it attempts to perform a backup.

Here's the error in the audit log:
System JOBS Wed, 23 Oct 2019 22:05:25 -0700 Error creating LS backup. Check permissions of backup directory /store/backups/nagioslogserver and disk space. 4deb767e-abbd-43f0-8839-049760687e98 Nagios Log Server
When I look at the permissions for that directory, they look okay to me. Also, I think there's enough drive space:

root@nagioslscc2:/root> cd /store/backups
root@nagioslscc2:/store/backups> ls -la
total 4
drwxr-xr-x. 3 nagios nagios   29 Sep 11 10:39 .
drwxr-xr-x. 3 nagios nagios   21 Sep 11 10:39 ..
drwxr-xr-x. 4 nagios nagios 4096 Oct 23 22:00 nagioslogserver
root@nagioslscc2:/store/backups> cd nagioslogserver
root@nagioslscc2:/store/backups/nagioslogserver> ls
1570770031  nagioslogserver.2019-10-18.1571462054.tar.gz  nagioslogserver.2019-10-20.1571634041.tar.gz  nagioslogserver.2019-10-22.1571754226.tar.gz
1571893216  nagioslogserver.2019-10-19.1571547641.tar.gz  nagioslogserver.2019-10-21.1571720441.tar.gz
root@nagioslscc2:/store/backups/nagioslogserver> ls -la
total 489368
drwxr-xr-x. 4 nagios nagios      4096 Oct 23 22:00 .
drwxr-xr-x. 3 nagios nagios        29 Sep 11 10:39 ..
drwxrwxrwx  2 nagios nagios        24 Oct 10 22:00 1570770031
drwxrwxrwx  2 nagios nagios        24 Oct 23 22:00 1571893216
-rw-r--r--  1 nagios nagios 122130464 Oct 18 22:16 nagioslogserver.2019-10-18.1571462054.tar.gz
-rw-r--r--  1 nagios nagios 122256289 Oct 19 22:07 nagioslogserver.2019-10-19.1571547641.tar.gz
-rw-r--r--  1 nagios nagios 122431160 Oct 20 22:16 nagioslogserver.2019-10-20.1571634041.tar.gz
-rw-r--r--  1 nagios nagios  11692743 Oct 21 22:06 nagioslogserver.2019-10-21.1571720441.tar.gz
-rw-r--r--  1 nagios nagios 122587299 Oct 22 07:30 nagioslogserver.2019-10-22.1571754226.tar.gz
root@nagioslscc2:/store/backups/nagioslogserver> df -h
Filesystem                               Size  Used Avail Use% Mounted on
devtmpfs                                  32G     0   32G   0% /dev
tmpfs                                     32G     0   32G   0% /dev/shm
tmpfs                                     32G  130M   32G   1% /run
tmpfs                                     32G     0   32G   0% /sys/fs/cgroup
/dev/mapper/centos_nagioslssc2temp-root   50G  6.0G   45G  12% /
/dev/sda1                               1014M  216M  799M  22% /boot
/dev/mapper/nagiosvg-nagioslog           6.8T  3.3T  3.2T  52% /usr/local/nagioslogserver
/dev/mapper/centos_nagioslssc2temp-home   39G   33M   39G   1% /home
//10.x.x.x/NLSREPCC                204T  108T   96T  54% /nlsrepcc
doanfs001:/admin                          79G   43G   37G  54% /admin
tmpfs                                    6.3G     0  6.3G   0% /run/user/6603
Any ideas why my system is throwing this error?
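One caveat on the listing above: it was run as root, which can write almost anywhere, so it doesn't prove the backup job itself can write to the directory. Here is a minimal sketch of a check that tests writability for whoever runs it and reports free space on the filesystem that actually holds the path. The function name `check_backup_dir` is made up for illustration, and the idea that the job runs as the `nagios` user is an assumption based on the directory ownership, not something confirmed in this thread.

```shell
#!/bin/sh
# Hypothetical helper (not part of Nagios Log Server): verify that a
# directory exists, is writable by the *current* user, and report how much
# space is free on the filesystem that holds it.
check_backup_dir() {
    dir=$1

    if [ ! -d "$dir" ]; then
        echo "missing: $dir"
        return 1
    fi

    if [ ! -w "$dir" ]; then
        echo "not writable: $dir"
        return 1
    fi

    # df -P prints one POSIX-format line per filesystem; -k forces 1K blocks.
    # Line 2, field 4 is the available space for the filesystem holding $dir.
    avail_kb=$(df -Pk "$dir" | awk 'NR==2 {print $4}')
    echo "ok: $dir (${avail_kb} KB available)"
}
```

Saved as, say, `check_backup_dir.sh` with a final line `check_backup_dir "$1"`, it could be run as `sudo -u nagios sh check_backup_dir.sh /store/backups/nagioslogserver` so the test reflects the assumed service account rather than root. Since `/store` doesn't appear as its own mount in the `df -h` output, passing the path to `df` also shows which filesystem the backups really land on.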

Re: Backup error from a single node

Posted: Thu Oct 24, 2019 2:24 pm
by mbellerue
That all looks good. Are you set up to only keep 5 backups? It may be worth deleting the oldest backup just to see if it's encountering some kind of strange condition where it can't back up because the maximum number of backups has already been reached.

Also, I thought I remembered something about your backups being on a network drive. Was that the case, or was that just the case for your log server data? Because right now it looks like backups are just stored on the local system.

Re: Backup error from a single node

Posted: Thu Oct 24, 2019 3:07 pm
by rferebee
Ok, I'll try that. Where would I find the setting to see how many backups we're keeping?

Our snapshots are written to a network drive. I'm not aware of the backups being written anywhere other than the default location (locally?).

Re: Backup error from a single node

Posted: Thu Oct 24, 2019 3:25 pm
by mbellerue
Actually, I am mistaken; it looks like it only keeps 5 backups. I thought it was configurable via System Jobs, but that only changes the schedule.

I would still recommend either deleting the oldest backup, or moving it out of that directory, and seeing if that kicks the backup job into gear.
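If you'd rather move the oldest archive aside than delete it, something along these lines should do it. This is only a sketch: the function names and the holding directory are made up for illustration, and it assumes the archives keep the `nagioslogserver.YYYY-MM-DD.<timestamp>.tar.gz` naming shown in the listing earlier, so that sorting by name is the same as sorting by age.

```shell
#!/bin/sh
# Hypothetical helpers (not part of Nagios Log Server).

# Print the oldest backup archive in a directory. The shell expands the glob
# in sorted order, and the names start with an ISO date, so the first match
# is the oldest archive.
oldest_backup() {
    for f in "$1"/nagioslogserver.*.tar.gz; do
        [ -e "$f" ] || continue   # the glob matched nothing
        printf '%s\n' "$f"
        return 0
    done
    return 1
}

# Move the oldest archive from src_dir into dest_dir instead of deleting it.
move_oldest_backup() {
    src_dir=$1
    dest_dir=$2

    oldest=$(oldest_backup "$src_dir") || {
        echo "no backups found in $src_dir"
        return 1
    }

    mkdir -p "$dest_dir" &&
        mv -- "$oldest" "$dest_dir"/ &&
        echo "moved $oldest to $dest_dir"
}
```

For example, `move_oldest_backup /store/backups/nagioslogserver /store/backups/held` (the second path is a placeholder) would leave four archives in place, and the next run of the backup job would show whether the count was the problem.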

Speaking of System Jobs, does the Last Run Status of the backups system job show success? Does it error if you run it manually?

Re: Backup error from a single node

Posted: Thu Oct 24, 2019 3:27 pm
by rferebee
This morning it showed FAILURE, but I'll try again now that I've deleted the oldest backup on that particular node.

Re: Backup error from a single node

Posted: Thu Oct 24, 2019 3:35 pm
by rferebee
Ok, the backups worked on all the nodes this time.

Might have been a fluke. I would say go ahead and lock this up.

Re: Backup error from a single node

Posted: Thu Oct 24, 2019 4:07 pm
by benjaminsmith
Thanks for the update @rferebee. We'll close this out.