Re: Snapshots stopped
Posted: Wed Jan 20, 2016 10:13 pm
by Fred Kroeger
Before you close - can you explain why the latest index isn't included in the snapshot ?
I have it set to run just after midnight so that the previous day's index is included - however, it's not listed.
The partial screenshot below shows that it was run on 21/1/16 but the last index listed is 19/1/16
Snapshot-3.PNG
BTW - when I click on the info symbol next to the date-time, it shows a timestamp for the previous day?
Snapshot-4.PNG
Also ..... why is the Snapshot date/time not shown as DD-MM-YYYY which is how I would expect to see it in Australia?
regards... Fred
Re: Snapshots stopped
Posted: Thu Jan 21, 2016 12:57 pm
by jolson
The time differences _could_ be explained if the snapshots relied on UTC or the time was not exactly in line with your expectations. Do you know the timezone/dates of your NLS nodes?
Also ..... why is the Snapshot date/time not shown as DD-MM-YYYY which is how I would expect to see it in Australia?
Currently the snapshot process is only built to display the timestamps in a single format - I can certainly put in a feature request on your behalf to get the date format changed if you'd like.
Re: Snapshots stopped
Posted: Thu Jan 21, 2016 7:42 pm
by Fred Kroeger
Both NLS nodes are configured the same and both use the same NTP source and are in sync.
Just saw that the backup_maintenance task's scheduled time seems to be changing. I originally had it set to run daily at 1 min after midnight.
Code:
backup_maintenance Waiting SUCCESS 01/22/2016 00:20:59 1 day 01/23/2016 00:20:59
The Snapshot time shows a different runtime?
Code:
# date
Fri Jan 22 08:26:38 AWST 2016
# ls -la /etc/localtime
lrwxrwxrwx 1 root root 35 Nov 27 12:16 /etc/localtime -> /usr/share/zoneinfo/Australia/Perth
# cat /etc/sysconfig/clock
ZONE="Australia/Perth"
UTC=False
# grep timezone /etc/php.ini
date.timezone = Australia/Perth
Re: Snapshots stopped
Posted: Fri Jan 22, 2016 12:11 pm
by hsmith
How long does it take your system to take a snapshot? Perhaps it's reporting the time that it finishes, instead of the time that it runs? My test NLS cluster doesn't have a lot of data, so my snap creation was almost instant.
Re: Snapshots stopped
Posted: Sun Jan 24, 2016 7:15 pm
by Fred Kroeger
Not yet - still not convinced that everything is being backed up ?
I've figured out the reason for the strange date/time values in the curator snapshot name - the file name is actually the UTC time, so I'm happy it's running at the correct time now. However, I would much prefer the filename to reflect the local time.
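For anyone wanting to read the UTC stamp in a snapshot name as local time, here is a minimal Python sketch. The name format `curator-YYYYMMDDHHMMSS` and the example name are assumptions for illustration only, not taken from this system, and the fixed +8 offset assumes Australia/Perth with no daylight saving.

```python
from datetime import datetime, timedelta, timezone

# Assumption: Perth is a fixed UTC+8 with no daylight saving.
AWST = timezone(timedelta(hours=8), "AWST")


def snapshot_name_to_local(name, prefix="curator-", fmt="%Y%m%d%H%M%S"):
    """Parse the UTC timestamp embedded in a snapshot name and return it in AWST.

    The prefix and timestamp format are hypothetical defaults.
    """
    if not name.startswith(prefix):
        raise ValueError("unexpected snapshot name: %r" % name)
    stamp = name[len(prefix):]
    utc_time = datetime.strptime(stamp, fmt).replace(tzinfo=timezone.utc)
    return utc_time.astimezone(AWST)


# Hypothetical snapshot name carrying 16:20:59 UTC on 24 Jan 2016.
local = snapshot_name_to_local("curator-20160124162059")
print(local.strftime("%d-%m-%Y %H:%M:%S %Z"))  # 25-01-2016 00:20:59 AWST
```

A name that looks like "yesterday afternoon" is therefore consistent with a run shortly after local midnight.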
However the List of indexes backed up are still a day behind. The snapshot ran after midnight on 25/1 - yet the latest logstash file shown is for the 23/1. Why isn't the logstash for 24/1 included in the list?
Snapshot-5.PNG
Also the scheduled backup time gets changed whenever it runs. It's actually rescheduling for 24hrs after the backup completes. So every day it is starting a little later. Is this expected behaviour ?
regards... Fred
Re: Snapshots stopped
Posted: Mon Jan 25, 2016 1:10 pm
by jolson
However the List of indexes backed up are still a day behind. The snapshot ran after midnight on 25/1 - yet the latest logstash file shown is for the 23/1. Why isn't the logstash for 24/1 included in the list?
This is very likely because _everything_ under the hood of Nagios Log Server uses UTC, including the index rotation time. The Web GUI time is simply adjusted to make it look like logs are collected in localtime. This being the case, there may be some skew (positive or negative) in terms of when your indices rotate.
In CT, we're currently at UTC-6, meaning that our indices rotate at 1800 CT, which is midnight UTC. Are you in a timezone with a positive offset from UTC? What timezone is the Web GUI set to?
My understanding is that the alert subsystem _should_ be using UTC/GMT; I can check on that while you confirm the above. This would explain the erroneous backup behavior.
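The skew can be worked through with a small Python sketch. It assumes daily indices named `logstash-YYYY.MM.DD` by UTC date, a fixed UTC+8 local offset, and that a snapshot only lists indices that have already rotated; the function names are made up for illustration.

```python
from datetime import datetime, timedelta, timezone

# Assumption: local time is a fixed UTC+8 (no daylight saving).
LOCAL = timezone(timedelta(hours=8))


def index_for(moment):
    """Name of the daily index receiving logs at `moment` (named by UTC date)."""
    return moment.astimezone(timezone.utc).strftime("logstash-%Y.%m.%d")


def latest_rotated_index(moment):
    """Most recent daily index that has already rotated (closed at midnight UTC)."""
    return index_for(moment - timedelta(days=1))


# A backup that fires shortly after local midnight on 25/1.
run = datetime(2016, 1, 25, 0, 20, 59, tzinfo=LOCAL)
print(index_for(run))             # logstash-2016.01.24 (still being written)
print(latest_rotated_index(run))  # logstash-2016.01.23
```

Under these assumptions the 24/1 index does not rotate until 08:00 local time, so a run just after local midnight would list 23/1 as its newest index.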
Also the scheduled backup time gets changed whenever it runs. It's actually rescheduling for 24hrs after the backup completes. So every day it is starting a little later. Is this expected behaviour ?
No it is not - I bet that the backup is rescheduled from the end-time instead of the beginning time. I'll talk to the devs about this and confirm whether or not it's a bug in the system after I do some testing of my own.
Jesse
Re: Snapshots stopped
Posted: Thu Jan 28, 2016 11:21 pm
by Fred Kroeger
Hi Jesse - We're on +8 UTC
Yes - I think you're right, it seems to be rescheduling for 24hrs at the end of the backup, so every night it's starting a bit later.
Consistent with this is that the "Last Run Time" value appears also to be the time it ended instead of started. So the "Next Run Time" value is 24hrs plus that.
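The drift can be illustrated with a short Python sketch comparing the two rescheduling rules. The five-minute backup duration and the start date are made-up values, and `simulate` is a hypothetical helper, not the product's scheduler.

```python
from datetime import datetime, timedelta

INTERVAL = timedelta(days=1)


def simulate(first_start, duration, runs, anchor):
    """Return the start times of `runs` consecutive backups.

    anchor="end":   next run = completion time + 24h (the drifting behaviour)
    anchor="start": next run = start time + 24h (the expected behaviour)
    """
    starts = []
    start = first_start
    for _ in range(runs):
        starts.append(start)
        end = start + duration
        start = (end if anchor == "end" else start) + INTERVAL
    return starts


first = datetime(2016, 1, 20, 0, 1, 0)  # scheduled for 1 min after midnight
took = timedelta(minutes=5)             # hypothetical backup duration

for anchor in ("end", "start"):
    times = simulate(first, took, 4, anchor)
    print(anchor, [t.strftime("%d/%m %H:%M") for t in times])
# end   ['20/01 00:01', '21/01 00:06', '22/01 00:11', '23/01 00:16']
# start ['20/01 00:01', '21/01 00:01', '22/01 00:01', '23/01 00:01']
```

With the end-anchored rule each run slips by the length of the previous backup, which matches the start time creeping later every night.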
regards.. Fred
Re: Snapshots stopped
Posted: Fri Jan 29, 2016 10:43 am
by hsmith
Is this behavior something you want changed, or just something you wanted clarification on?
Re: Snapshots stopped
Posted: Sun Jan 31, 2016 10:11 pm
by Fred Kroeger
Are you saying that this is expected behaviour?
If you schedule a backup to run at a certain time every 24hrs I would expect it to run at that time, not 24 hours after it finishes. This isn't consistent with any other Nagios scheduling?
Also I would expect that back up to contain the previous days activity relative to the local timezone and not relative to UTC. So currently my backup at midnight does not include the previous day as I've shown in the screen shots.
Re: Snapshots stopped
Posted: Mon Feb 01, 2016 11:30 am
by jolson
Are you saying that this is expected behaviour?
No, this is not expected behavior - the reschedule time is supposed to be at the time the subsystem process fires, not the time it completes.
Also I would expect that back up to contain the previous days activity relative to the local timezone and not relative to UTC. So currently my backup at midnight does not include the previous day as I've shown in the screen shots.
I agree with this - UTC can confuse things. If it must be in UTC, the Web GUI should mention that this is the case.
I will put together a bug report and we'll get this fixed as soon as we can!