SSH Scheduled Backups failing

psanchez
Posts: 23
Joined: Wed Oct 17, 2012 12:14 pm

SSH Scheduled Backups failing

Post by psanchez »

Only the first backup transfer was successful; now all of them are failing.
Now I'm just getting incomplete uploads.

The log /usr/local/nagiosxi/var/components/scheduledbackups.log shows this error:

ERROR: Scheduled SSH Backup Failed: File was not transferred successfully

On the backup server, the ssh logs show:

Sep 29 04:43:42 xxxxxxxxx sshd[28910]: Accepted password for nagiosbackup from 999.999.999.999 port 48509 ssh2
Sep 29 04:43:42 xxxxxxxxx sshd[28910]: pam_unix(sshd:session): session opened for user nagiosbackup by (uid=0)
Sep 29 04:43:42 xxxxxxxxx sshd[28915]: subsystem request for sftp
Sep 29 04:43:53 xxxxxxxxx sshd[28915]: Received disconnect from 999.999.999.999: 11: PECL/ssh2 (http://pecl.php.net/packages/ssh2)
Sep 29 04:43:53 xxxxxxxxx sshd[28910]: pam_unix(sshd:session): session closed for user nagiosbackup


Nagios XI version: 2014R1.4
OS: RHEL 6.5 x64


If I clear all backups and schedule them again, only the first run works; all later ones are incomplete.

Local backups are fine.

Is there any timeout setting that I need to increase?

Is there a way to get debug/verbose output from scheduled backup jobs?

How do I manually run a scheduled SSH backup?

Any help?
slansing
Posts: 7698
Joined: Mon Apr 23, 2012 4:28 pm
Location: Travelling through time and space...

Re: SSH Scheduled Backups failing

Post by slansing »

There is an option at the bottom of the SSH backups tab titled Backup Limit. Set it to 0 to keep an unlimited number of backups; any other integer limits the number of backups kept (and this may be the case here). If this is working as intended, it should roll over, deleting the oldest backup to make room for a new one. It is possible that it is not working properly. Do you have the limit set to "1" right now?
psanchez
Posts: 23
Joined: Wed Oct 17, 2012 12:14 pm

Re: SSH Scheduled Backups failing

Post by psanchez »

No, it's set to 7. I see the 7 backups and it does rotate them, deleting older ones and trying to upload new ones, but the uploads are never complete.
abrist
Red Shirt
Posts: 8334
Joined: Thu Nov 15, 2012 1:20 pm

Re: SSH Scheduled Backups failing

Post by abrist »

There is indeed a timeout setting - from the changelog in 1.4:

- Added the ability to specify backup creation timeout with cfg variable "backup_timeout" which defaults to 1200 secs (20 min) if not set -JO
Check to see if that is set in config.inc.php:

grep backup_timeout /usr/local/nagiosxi/html/config.inc.php
If not, you can set it to 30 minutes (1800 seconds) by adding the following to config.inc.php:

$cfg['backup_timeout']=1800;  
Former Nagios employee
"It is turtles. All. The. Way. Down. . . .and maybe an elephant or two."
VI VI VI - The editor of the Beast!
Come to the Dark Side.
psanchez
Posts: 23
Joined: Wed Oct 17, 2012 12:14 pm

Re: SSH Scheduled Backups failing

Post by psanchez »

Thanks!

'backup_timeout' is currently not defined in /usr/local/nagiosxi/html/config.inc.php

I'll give that a try...

I just saw that Nagios XI 2014R1.5 is now out.
If I update config.inc.php, will the change persist after an XI upgrade, or will I have to reapply it after each upgrade?
abrist
Red Shirt
Posts: 8334
Joined: Thu Nov 15, 2012 1:20 pm

Re: SSH Scheduled Backups failing

Post by abrist »

psanchez wrote: If I update config.inc.php, will the change persist after an XI upgrade, or will I have to reapply it after each upgrade?
It does not change after upgrades. It is indeed persistent.
psanchez
Posts: 23
Joined: Wed Oct 17, 2012 12:14 pm

Re: SSH Scheduled Backups failing

Post by psanchez »

The issue has improved, but it still continues...

So I upgraded to Nagios XI 2014R1.5, and the 'backup_timeout' setting remains.

Some sites did improve, others did not.
On each Nagios server, /usr/local/nagiosxi/var/components/scheduledbackups.log has shown no errors since 10/1/2014.

Local backups are fine; it's just the SCP backups.

I set the timeout to 45 minutes, which is more than enough time.

$cfg['backup_timeout']=2700;


So, for example, Site A did improve: both local and SCP copies are the same size.
But Site B has not...

I'm still getting the same ssh log entries (shown at the beginning of the thread) on the backup server.


I'd like to troubleshoot this more, but ....


Is there a way to enable verbose/debug logging of the scheduled backup job?

Is there a way to force a manual run of a scheduled backup job, so I can look at the output?


I could just have a script upload the local backups to the remote backup server, but since this feature exists in XI I would like to take advantage of it. It's easier for less technical people to adjust.
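
For anyone who does go the script route mentioned above, here is a minimal sketch of an upload that also verifies the copy. It is not how the XI component works. The host name, remote directory, and the function names sha256_of and upload_and_verify are all hypothetical, and it assumes key-based ssh login for the backup user plus sha256sum on both servers.

```shell
#!/bin/bash
# Hypothetical values: adjust to your environment.
REMOTE="nagiosbackup@backup.example.com"
REMOTE_DIR="/backups/nagiosxi"

# Print only the SHA-256 digest of a local file.
sha256_of() {
    sha256sum "$1" | awk '{print $1}'
}

# Copy one archive with scp, then compare the local and remote digests.
# Returns non-zero if the copy fails or the digests differ.
upload_and_verify() {
    local file="$1" name local_sum remote_sum
    name=$(basename "$file")
    scp -q "$file" "$REMOTE:$REMOTE_DIR/" || return 1
    local_sum=$(sha256_of "$file")
    remote_sum=$(ssh "$REMOTE" "sha256sum '$REMOTE_DIR/$name'" | awk '{print $1}')
    [ "$local_sum" = "$remote_sum" ]
}

# Example (not run here):
#   upload_and_verify /path/to/nagiosxi.1412561461.tar.gz || echo "upload failed or digest mismatch"
```

Comparing digests rather than sizes catches a truncated upload even when the listing rounds both files to the same number of megabytes.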




######################################################################
# Site A - Local backups
-rw-r--r-- 1 nagios nagios 211M Sep 29 19:02 nagiosxi.1412042461.tar.gz
-rw-r--r-- 1 nagios nagios 212M Sep 30 19:02 nagiosxi.1412128861.tar.gz
-rw-r--r-- 1 nagios nagios 290M Oct 1 19:02 nagiosxi.1412215261.tar.gz
-rw-r--r-- 1 nagios nagios 290M Oct 2 19:02 nagiosxi.1412301661.tar.gz
-rw-r--r-- 1 nagios nagios 291M Oct 3 19:02 nagiosxi.1412388061.tar.gz
-rw-r--r-- 1 nagios nagios 292M Oct 4 19:02 nagiosxi.1412474461.tar.gz
-rw-r--r-- 1 nagios nagios 292M Oct 5 19:02 nagiosxi.1412560861.tar.gz
# Site A - SCP backups
-rw-r--r-- 1 nagiosbackup nagiosbackup 200M Sep 30 07:23 nagiosxi.1412085661.tar.gz
-rw-r--r-- 1 nagiosbackup nagiosbackup 212M Oct 1 07:23 nagiosxi.1412172062.tar.gz
-rw-r--r-- 1 nagiosbackup nagiosbackup 290M Oct 2 07:51 nagiosxi.1412258462.tar.gz
-rw-r--r-- 1 nagiosbackup nagiosbackup 170M Oct 3 07:51 nagiosxi.1412344862.tar.gz
-rw-r--r-- 1 nagiosbackup nagiosbackup 291M Oct 4 07:51 nagiosxi.1412431261.tar.gz
-rw-r--r-- 1 nagiosbackup nagiosbackup 292M Oct 5 07:51 nagiosxi.1412517661.tar.gz
-rw-r--r-- 1 nagiosbackup nagiosbackup 292M Oct 6 07:51 nagiosxi.1412604062.tar.gz
#######################################
# Site B - Local backups
-rw-r--r-- 1 nagios nagios 418M Sep 29 19:15 nagiosxi.1412043061.tar.gz
-rw-r--r-- 1 nagios nagios 419M Sep 30 19:14 nagiosxi.1412129462.tar.gz
-rw-r--r-- 1 nagios nagios 498M Oct 1 19:15 nagiosxi.1412215861.tar.gz
-rw-r--r-- 1 nagios nagios 498M Oct 2 19:14 nagiosxi.1412302261.tar.gz
-rw-r--r-- 1 nagios nagios 498M Oct 3 19:14 nagiosxi.1412388661.tar.gz
-rw-r--r-- 1 nagios nagios 498M Oct 4 19:15 nagiosxi.1412475061.tar.gz
-rw-r--r-- 1 nagios nagios 498M Oct 5 19:14 nagiosxi.1412561461.tar.gz
# Site B - SCP backups
-rw-r--r-- 1 nagiosbackup nagiosbackup 374M Sep 30 07:33 nagiosxi.1412086261.tar.gz
-rw-r--r-- 1 nagiosbackup nagiosbackup 420M Oct 1 07:33 nagiosxi.1412172661.tar.gz
-rw-r--r-- 1 nagiosbackup nagiosbackup 44M Oct 2 08:01 nagiosxi.1412259061.tar.gz
-rw-r--r-- 1 nagiosbackup nagiosbackup 193M Oct 3 08:01 nagiosxi.1412345461.tar.gz
-rw-r--r-- 1 nagiosbackup nagiosbackup 498M Oct 4 08:01 nagiosxi.1412431862.tar.gz
-rw-r--r-- 1 nagiosbackup nagiosbackup 363M Oct 5 08:01 nagiosxi.1412518261.tar.gz
-rw-r--r-- 1 nagiosbackup nagiosbackup 50M Oct 6 08:01 nagiosxi.1412604661.tar.gz
#########################################################################
abrist
Red Shirt
Posts: 8334
Joined: Thu Nov 15, 2012 1:20 pm

Re: SSH Scheduled Backups failing

Post by abrist »

Odd size differences between the two methods. Do both tarballs extract? Is the smaller one corrupt?
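
One way to answer that without extracting anything is to test each archive in place. This is only a sketch using standard gzip and tar options; the function name check_archive is made up for illustration.

```shell
#!/bin/bash
# check_archive FILE: succeeds only if FILE is a complete, readable .tar.gz.
check_archive() {
    local archive="$1"
    # gzip -t verifies the compressed stream, including the trailing CRC.
    gzip -t "$archive" 2>/dev/null || return 1
    # tar -tzf walks every member, which fails on a truncated archive.
    tar -tzf "$archive" >/dev/null 2>&1 || return 1
    return 0
}

# Report the status of every archive passed on the command line.
for f in "$@"; do
    if check_archive "$f"; then
        echo "OK      $f"
    else
        echo "CORRUPT $f"
    fi
done
```

Saved as a script, it can be run against the whole backup directory, for example with nagiosxi.*.tar.gz as the argument list, on both the XI server and the backup server.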
psanchez
Posts: 23
Joined: Wed Oct 17, 2012 12:14 pm

Re: SSH Scheduled Backups failing

Post by psanchez »

No, most of them don't.

I have corrupt SCP backups.

Some will extract more files than others, but I always still get this:

gzip: stdin: unexpected end of file
tar: Unexpected EOF in archive
tar: Unexpected EOF in archive
tar: Error is not recoverable: exiting now


This is what I find weird: why do local backups work, while SCP keeps giving me corrupt files?
There is no space issue on the servers; they have plenty of space.

Does anyone know how to:


... enable verbose/debug logging of the scheduled backup job?

... force a manual run of a scheduled backup job, so I can look at the output?
lmiltchev
Bugs find me
Posts: 13589
Joined: Mon May 23, 2011 12:15 pm

Re: SSH Scheduled Backups failing

Post by lmiltchev »

I was not able to recreate the issue, but while testing I found a bug that can cause the ssh backups to fail on some systems. This is probably unrelated to your issue, but I wanted to give you a heads-up. The new version of the "Scheduled Backups" component will be included in the next release of Nagios XI.

Make sure your log is writable by group:

chmod g+w /usr/local/nagiosxi/var/components/scheduledbackups.log
Try scheduling another ssh backup, then check the "scheduledbackups.log" for errors.

tail /usr/local/nagiosxi/var/components/scheduledbackups.log
Is it possible that you have intermittent network issues between the Nagios XI and the remote server?