Nagios XI 5.4.0 - Scheduled Backups

SteveBeauchemin
Posts: 524
Joined: Mon Oct 14, 2013 7:19 pm

Post by SteveBeauchemin »

My Nagios XI 5.4.0 Scheduled backups are configured for SSH.

The Test Connection and Test Upload both work. There is a ton of space on both systems.

The backup_xi.sh script works fine when I run it as the nagios user:

Code: Select all

sudo /usr/local/nagiosxi/scripts/backup_xi.sh
So it works manually.
When I run the scheduled backup, the system creates all of the smaller tar.gz files and starts to compress them into one big tar.gz.

Partway through, while the big file is still growing, the system deletes it.
Here is the relevant output from tail -f cmdsubsys.log components/scheduledbackups.log:

Code: Select all

Backing up logrotate config files...
Backing up Apache config files...
Compressing backup...
tail: cmdsubsys.log: file truncated
PROCESSED 0 COMMANDS
 COMMAND: CMD=1119, DATA=a:2:{i:0;s:19:"nagiosxi.1485211501";i:1;s:24:"/store/backups/nagiosxi/";}
CMDLINE=rm -rf /store/backups/nagiosxi/nagiosxi.1485211501.tar.gz
OUTPUT=
RETURNCODE=0
PROCESSED 1 COMMANDS
The rm in CMDLINE runs before the file has finished compressing.
I used the following to watch what happened and when. The gzip file was still growing when it vanished.

Code: Select all

cd /store/backups/nagiosxi
watch "ls -l nagi*;df -h .;ls -l"
Because of the early deletion, the file that ends up on the remote system is corrupt. The backup looks like it succeeded, but it actually failed.
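
One way to catch a truncated archive like this is to test the file after the transfer instead of trusting the job status. A minimal sketch, assuming gzip and tar are available on the host that holds the copy; the function name verify_backup is hypothetical and not part of Nagios XI:

```shell
#!/bin/bash
# Hypothetical helper (not part of Nagios XI): returns 0 only if the
# archive is a complete, readable gzip-compressed tarball.
verify_backup() {
    local archive="$1"
    # gzip -t reads the whole compressed stream and fails on a cut-off file
    gzip -t "$archive" 2>/dev/null || return 1
    # tar -tzf walks every member header without extracting anything
    tar -tzf "$archive" >/dev/null 2>&1 || return 1
    return 0
}
```

For example, verify_backup /store/backups/nagiosxi/nagiosxi.1485211501.tar.gz && echo intact would print nothing for an archive that was removed or cut off mid-compression.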

Please advise...

Thanks

Steve B
XI 5.7.3 / Core 4.4.6 / NagVis 1.9.8 / LiveStatus 1.5.0p11 / RRDCached 1.7.0 / Redis 3.2.8 /
SNMPTT / Gearman 0.33-7 / Mod_Gearman 3.0.7 / NLS 2.0.8 / NNA 2.3.1 /
NSClient 0.5.0 / NRPE Solaris 3.2.1 Linux 3.2.1 HPUX 3.2.1
tgriep
Madmin
Posts: 9190
Joined: Thu Oct 30, 2014 9:02 am

Re: Nagios XI 5.4.0 - Scheduled Backups

Post by tgriep »

Scheduled backups are subject to a timeout, and it needs to be increased on your server so that the backup can finish.
To increase it, edit the following file

Code: Select all

/usr/local/nagiosxi/html/config.inc.php
and add this line to the bottom

Code: Select all

$cfg['backup_timeout'] = 3600;
Save the file, then restart Nagios and Apache by running

Code: Select all

service nagios restart
service httpd restart
That raises the timeout to one hour (3600 seconds), which should give the backup enough time to finish.
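
Before restarting, it can be worth confirming that the line was saved as intended. A rough sketch that only reads the file as text and does not show how Nagios XI itself loads the value; the helper name get_backup_timeout is made up for illustration, and it assumes the setting sits on a single line in the $cfg['backup_timeout'] = N; form:

```shell
#!/bin/bash
# Hypothetical helper: print the backup_timeout value found in the config file.
# Prints nothing if the setting is absent.
get_backup_timeout() {
    local cfg="${1:-/usr/local/nagiosxi/html/config.inc.php}"
    # The "." on either side of backup_timeout matches a single or double quote.
    # tail -n 1 keeps the last assignment, which is the one PHP would end up using.
    sed -n 's/^[[:space:]]*\$cfg\[.backup_timeout.\][[:space:]]*=[[:space:]]*\([0-9][0-9]*\).*/\1/p' "$cfg" | tail -n 1
}
```

Running get_backup_timeout with no argument reads the path given in this post; an empty result suggests the line is missing or written differently.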
SteveBeauchemin
Posts: 524
Joined: Mon Oct 14, 2013 7:19 pm

Re: Nagios XI 5.4.0 - Scheduled Backups

Post by SteveBeauchemin »

I changed the file as requested, adding the line at the bottom.

The backup ran to completion this time. The file transferred over SSH to the other host was complete, and I was able to extract what I needed.

Everyone should test a restore from their backups every so often. We do it a couple of times a year. If someone has a large installation like mine, it is possible that they have broken backups and do not even know it. Test, test, test... rules to live by.
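
A restore test does not have to touch the live system. As a minimal sketch, a trial extraction into a scratch directory shows whether the archive unpacks cleanly and how many files come out; the function name trial_restore is hypothetical, and a large backup will need a matching amount of free space wherever mktemp creates the directory:

```shell
#!/bin/bash
# Hypothetical helper: unpack an archive into a throwaway directory,
# print the number of regular files extracted, then clean up.
# Returns non-zero if tar reports any error.
trial_restore() {
    local archive="$1" scratch count
    scratch=$(mktemp -d) || return 1
    if tar -xzf "$archive" -C "$scratch" 2>/dev/null; then
        count=$(find "$scratch" -type f | wc -l | tr -d ' ')
        rm -rf "$scratch"
        echo "$count"
        return 0
    fi
    rm -rf "$scratch"
    return 1
}
```

This is not a substitute for running restore_xi.sh on a test machine, but it is cheap enough to run after every backup.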

Thanks for the quick and accurate response.

Much appreciated...

Just FYI: I have a large installation, which means large perfdata and archives directories. When I need to pull something out of the nagios directory, such as host or service definitions under etc, it takes forever because those two large directories are included in the same tar.gz file. So I changed backup_xi.sh to make it easier to find a file to restore. This is my change, somewhere around line 93:

Code: Select all

vi /usr/local/nagiosxi/scripts/backup_xi.sh
echo "Backing up Nagios Core..."
echo "Excluding RRD and Archives..."
#cp -rp /usr/local/nagios $mydir
# SLB 2017-01-17
# == Split to 3 files - archives and perfdata are large - makes it easier to look inside nagios.tar.gz with those separated
echo "Backing up Nagios Core... perfdata files"
tar czfp $mydir/nagios-perfdata.tar.gz /usr/local/nagios/share/perfdata

# SLB 2017-01-17
echo "Backing up Nagios Core... archives files"
tar czfp $mydir/nagios-archives.tar.gz /usr/local/nagios/var/archives

# SLB 2017-01-17 # exclude them from the following
#tar czfp $mydir/nagios.tar.gz /usr/local/nagios
# Patterns are quoted so the shell passes them to tar unexpanded
tar czfp $mydir/nagios.tar.gz --exclude='/usr/local/nagios/share/perfdata/*' --exclude='/usr/local/nagios/var/archives/*' /usr/local/nagios
And of course, since I changed the backup script, I had to change the restore script too. I hope never to actually need it, but it must exist for completeness.
Somewhere near line 131:

Code: Select all

vi /usr/local/nagiosxi/scripts/restore_xi.sh
# Nagios Core
echo "Restoring Nagios Core..."
if [ "$arch" == "$backuparch" ] && [ "$ver" == "$backupver" ]; then
        rm -rf /usr/local/nagios
        echo "Restoring Nagios Core... nagios dir"
        cd $rootdir && tar xzf $backupdir/nagios.tar.gz
        # SLB 2017-01-17 - added archives and perfdata
        echo "Restoring Nagios Core... archives dir"
        cd $rootdir && tar xzf $backupdir/nagios-archives.tar.gz
        echo "Restoring Nagios Core... perfdata dir"
        cd $rootdir && tar xzf $backupdir/nagios-perfdata.tar.gz
else
Anyway... makes my life easier.
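
With perfdata and archives split out, pulling a single file from the smaller nagios.tar.gz can be sketched as below. This is an illustration under assumptions, not part of the Nagios XI scripts: restore_one is a hypothetical name, and it relies on tar having stripped the leading slash from member names when the archive was created from an absolute path.

```shell
#!/bin/bash
# Hypothetical helper: extract one member from an archive into a
# destination directory, leaving the live installation untouched.
restore_one() {
    local archive="$1" member="$2" dest="$3"
    mkdir -p "$dest" || return 1
    # Stored names have no leading "/", so drop it from the requested path
    tar -xzf "$archive" -C "$dest" "${member#/}"
}
```

A call such as restore_one nagios.tar.gz /usr/local/nagios/etc/hosts.cfg /tmp/restore would leave the file under /tmp/restore/usr/local/nagios/etc/; the hosts.cfg path is only an example.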

Thanks - feel free to close this - Success! :D
Steve B