
Occasional scheduled backup failures.

Posted: Thu Jul 27, 2017 1:28 pm
by yo_marc
Hi all,

I have a couple of Nagios XI servers that experience intermittent / occasional backup failures.

The only thing I see in the scheduledbackups.log during one of these runs is:

07-27-2017 00:30:02 DEBUG: Running scheduled local backup ...
07-27-2017 00:30:02 INFO: Creating a local backup: nagiosxi.1501129802
07-27-2017 00:30:02 DEBUG: Sending create local backup command to CmdSubsystem

I have enabled additional logging in the actual backup_xi.sh script to help debug things. When this issue happens, it seems the backup_xi.sh script never gets called, so I presume there is an issue with the handoff from the scheduler to the CmdSubsystem.

CmdSubsystem logs seem to rotate out frequently, so I have not seen anything logged there...

Any ideas?

Thanks!

Nagios XI 5.4.3
Backups scheduled using the XI scheduler for 00:30 each night. (Admin -> System Backups -> Scheduled Backups)
Backups running under the "Local" tab, and writing to an NFS share mounted on the XI server.

Re: Occasional scheduled backup failures.

Posted: Thu Jul 27, 2017 2:04 pm
by lmiltchev
Start a running tail on the cmdsubsys.log

tail -f /usr/local/nagiosxi/var/cmdsubsys.log
then schedule a new local backup a few minutes in the future (Admin > Scheduled Backups > Local), and watch the log. When the backup process stops, stop the tail (Ctrl+C) and paste the output into the forum.
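Since the original post mentions that the CmdSubsystem log rotates out quickly, it may help to keep a timestamped copy of whatever the tail prints. The following is only a sketch: `stamp_lines` is a hypothetical helper name, and the capture file `/tmp/cmdsubsys-capture.log` is an assumed location, not something Nagios XI provides.

```shell
#!/bin/sh
# stamp_lines (hypothetical helper): prefix every line read from stdin
# with the current local time, so entries can be matched against the
# scheduled backup time even after the source log has been truncated.
stamp_lines() {
    # "|| [ -n "$line" ]" also emits a final line that lacks a newline.
    while IFS= read -r line || [ -n "$line" ]; do
        printf '%s %s\n' "$(date '+%Y-%m-%d %H:%M:%S')" "$line"
    done
}

# Usage (commented out so that sourcing this file has no side effects;
# the capture path is an assumption):
#   tail -F /usr/local/nagiosxi/var/cmdsubsys.log | stamp_lines | tee /tmp/cmdsubsys-capture.log
```

`tail -F` keeps following the file by name if it is replaced, and the copy written by `tee` survives later truncation of the original log.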

Re: Occasional scheduled backup failures.

Posted: Thu Jul 27, 2017 2:16 pm
by yo_marc
Thanks! Will try to capture the issue happening.

Re: Occasional scheduled backup failures.

Posted: Thu Jul 27, 2017 2:59 pm
by lmiltchev
Sure, let us know how it went. We will keep the thread open.

Re: Occasional scheduled backup failures.

Posted: Thu Jul 27, 2017 3:09 pm
by yo_marc
I was able to reproduce it after a couple of tries. I noticed a pattern, which could be coincidence: of the four backups I ran, the results alternated (good, failure, good, failure). I stopped at that point. Here is one of the failures, though there is not much to show:

* Backup scheduled for 3:45pm local time.


* Tail of cmdsubsys.log:

[root@my_xi_server var]# date ; tail -f cmdsubsys.log
Thu Jul 27 15:43:05 EDT 2017
....
PROCESSED 0 COMMANDS
tail: cmdsubsys.log: file truncated
............................................................tail: cmdsubsys.log: file truncated
.
PROCESSED 0 COMMANDS
tail: cmdsubsys.log: file truncated
............................................................tail: cmdsubsys.log: file truncated
.
PROCESSED 0 COMMANDS
^C
[root@my_xi_server var]# date
Thu Jul 27 15:47:47 EDT 2017


* Tail of scheduledbackups.log:

07-27-2017 15:45:01 DEBUG: Running scheduled local backup ...
07-27-2017 15:45:01 INFO: Creating a local backup: nagiosxi.1501184701
07-27-2017 15:45:01 DEBUG: Sending create local backup command to CmdSubsystem
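To check the alternating good/failed pattern across more runs, one option is to compare the backup names in scheduledbackups.log against the files that actually exist. This is a rough sketch under stated assumptions: `list_backup_status` is a hypothetical helper, and it assumes each backup ends up as `<name>.tar.gz` in the backup directory, which should be verified on your system.

```shell
#!/bin/sh
# list_backup_status LOGFILE BACKUPDIR (hypothetical helper)
# For every "Creating a local backup: nagiosxi.<timestamp>" line in
# LOGFILE, report whether BACKUPDIR/<name>.tar.gz exists.
# The ".tar.gz" suffix is an assumption about the archive naming.
list_backup_status() {
    log=$1
    dir=$2
    sed -n 's/.*INFO: Creating a local backup: \(nagiosxi\.[0-9][0-9]*\).*/\1/p' "$log" |
    while IFS= read -r name; do
        if [ -f "$dir/$name.tar.gz" ]; then
            echo "OK      $name"
        else
            echo "MISSING $name"
        fi
    done
}

# Usage (both paths are placeholders for your own log and NFS backup directory):
#   list_backup_status /path/to/scheduledbackups.log /path/to/backup/dir
```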

Re: Occasional scheduled backup failures.

Posted: Thu Jul 27, 2017 4:49 pm
by dwhitfield
There were some fixes to the inits related to this in 5.4.4. Are you able to upgrade? The current version is 5.4.7, so I'd suggest that.

Either after an upgrade (assuming you still have an issue), or if you can't upgrade, can you PM me your Profile? You can download it by going to Admin > System Config > System Profile and clicking the ***Download Profile*** button towards the top. If for whatever reason you *cannot* download the profile, please put the output of View System Info (5.3.4+, Show Profile if older) in the thread (that will at least get us some info). The profile gives us access to many of the logs we would otherwise ask for individually. If security is a concern, you can unzip the profile, take out what you like, and then zip it up again. We may end up needing something you remove, but we can ask for that specifically.

After you PM the profile, please update this thread. Updating this thread is the only way for it to show back up on our dashboard.

Re: Occasional scheduled backup failures.

Posted: Mon Jul 31, 2017 3:16 pm
by yo_marc
Thanks for the info regarding 5.4.4. I didn't realize there were updates that may fix things. I would like to upgrade the instances before proceeding, but it may take us a bit (i.e., weeks) before we can get that done. I am OK to close this out, and I can re-address it if there are still issues after upgrading.

Re: Occasional scheduled backup failures.

Posted: Tue Aug 01, 2017 11:30 am
by dwhitfield
We can leave it open for you. The init updates specifically will be of interest. Please be aware that any hardening may cause the inits to not be upgraded, so you may want to copy the nagios inits and then diff them after the upgrade to make sure it actually went through.
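The copy-then-diff check could be sketched roughly as follows. The helper names `snapshot_inits` and `compare_inits` are hypothetical, and the init script paths and snapshot directory in the usage lines are assumptions; substitute whatever init scripts your installation actually has.

```shell
#!/bin/sh
# snapshot_inits DEST FILE... (hypothetical helper)
# Copy each existing FILE into DEST, preserving mode and timestamps.
snapshot_inits() {
    dest=$1
    shift
    mkdir -p "$dest" || return 1
    for f in "$@"; do
        if [ -f "$f" ]; then
            cp -p "$f" "$dest/"
        fi
    done
}

# compare_inits DEST FILE... (hypothetical helper)
# For each FILE, compare it with the copy saved in DEST.
# UNCHANGED means the upgrade did not touch that file.
# A FILE that no longer exists is also reported as CHANGED.
compare_inits() {
    dest=$1
    shift
    for f in "$@"; do
        saved="$dest/$(basename "$f")"
        if [ ! -f "$saved" ]; then
            echo "NO SNAPSHOT: $f"
        elif diff -u "$saved" "$f" >/dev/null 2>&1; then
            echo "UNCHANGED: $f"
        else
            echo "CHANGED: $f"
            diff -u "$saved" "$f"
        fi
    done
}

# Usage (paths are assumptions, adjust to your system):
#   before the upgrade:
#     snapshot_inits /root/init-snapshot /etc/init.d/nagios /etc/init.d/npcd
#   after the upgrade:
#     compare_inits /root/init-snapshot /etc/init.d/nagios /etc/init.d/npcd
```

If an init script that the upgrade was expected to replace is reported as UNCHANGED, that would suggest the update did not go through for that file.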