
SLA report doesn't respect custom date range

Posted: Wed Apr 01, 2020 4:00 pm
by tmvision
Hi,

While investigating our issue with availability reports described in this topic, I noticed something strange with SLA reports.
I want to create an SLA report for a custom period. I choose a start and end date (say, 01/03/2020 00:00:00 and 01/03/2020 04:00:00) and I get a report stating that my service was up for 93.094% of the time in this period.
I then change the numbers a little and generate a report from 01/03/2020 00:00:00 to 01/03/2020 03:00:00 instead, and I get the exact same report. Still 93.094% uptime.
This seems highly unlikely, so I try an extreme example: a report from 01/03/2020 00:00:00 to 01/03/2020 00:00:10. Same result. I find it very hard to believe that we can have exactly 93.094% uptime during a period of 10 seconds.

My best guess is that the report-engine doesn't read the supplied dates at all. Can you verify this?
A bug regarding SLA reports was mentioned in the changelog for version 5.6.8, so I updated to the newest release (5.6.12; we were previously on 5.6.5). The problem is still present.

I hope you can help us find a fix for this issue.

Re: SLA report doesn't respect custom date range

Posted: Thu Apr 02, 2020 2:56 pm
by benjaminsmith
Hello @tmvision,

When you run a report for a custom time period, it looks at the archived log files in that time frame, selecting them by the timestamp of the file. As mentioned in the last post, availability reports are generated from /usr/local/nagios/var/nagios.log and the logs in /usr/local/nagios/var/archives.

So the first thing to check is that you are not missing any archives for 1/3/2020 in the log folder.
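
One way to run that check is a short script that lists which files in the archive folder have a timestamp inside the period of interest. This is only a rough sketch: it uses the file modification time as "the timestamp of the file", which is an assumption about how the report selects archives, and the helper name archives_in_window is made up for illustration. It also assumes 01/03/2020 means 1 March 2020; swap month and day if your dates are month-first.

```python
import os
from datetime import datetime

# Archive folder quoted in the post above.
ARCHIVE_DIR = "/usr/local/nagios/var/archives"


def archives_in_window(directory, start, end):
    """Return (name, mtime) pairs for files modified within [start, end].

    Hypothetical helper: it only inspects file modification times and
    does not read the log contents.
    """
    hits = []
    for entry in os.scandir(directory):
        if not entry.is_file():
            continue
        mtime = datetime.fromtimestamp(entry.stat().st_mtime)
        if start <= mtime <= end:
            hits.append((entry.name, mtime))
    # Oldest first, so gaps in the sequence are easy to spot.
    return sorted(hits, key=lambda pair: pair[1])


if __name__ == "__main__":
    # Assumption: 01/03/2020 is read as 1 March 2020 (day first).
    start = datetime(2020, 3, 1, 0, 0, 0)
    end = datetime(2020, 3, 2, 0, 0, 0)
    if os.path.isdir(ARCHIVE_DIR):
        for name, mtime in archives_in_window(ARCHIVE_DIR, start, end):
            print(mtime.isoformat(), name)
    else:
        print("archive directory not found:", ARCHIVE_DIR)
```

If nothing is printed for the period, the archives for those dates are probably missing. It may be worth widening the window by a day on each side, since the timestamp of a rotated file need not fall inside the period it covers.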

Secondly, when running a report over a very small period, it will assume the host is in its last known state or use the first assumed state option (see advanced settings), so if nothing has changed in that period the report will not change.
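
To illustrate the idea of an assumed starting state, here is a simplified model of an uptime calculation over a window. It is not the actual Nagios XI report engine: the function name uptime_percent, the 'UP'/'DOWN' labels, and the event list format are all assumptions made for this sketch.

```python
from datetime import datetime, timedelta


def uptime_percent(start, end, initial_state, changes):
    """Percentage of [start, end] spent in state 'UP'.

    initial_state: assumed state at `start` if no earlier change is known.
    changes: list of (timestamp, state) tuples sorted by timestamp,
             with state either 'UP' or 'DOWN'.
    Illustrative model only, not the Nagios XI algorithm.
    """
    total = (end - start).total_seconds()
    if total <= 0:
        raise ValueError("end must be after start")

    up_seconds = 0.0
    state, cursor = initial_state, start
    for ts, new_state in changes:
        if ts <= start:
            # A change before the window only sets the starting state.
            state = new_state
            continue
        if ts >= end:
            break
        if state == "UP":
            up_seconds += (ts - cursor).total_seconds()
        state, cursor = new_state, ts

    # Account for the stretch from the last change to the end of the window.
    if state == "UP":
        up_seconds += (end - cursor).total_seconds()
    return 100.0 * up_seconds / total


if __name__ == "__main__":
    start = datetime(2020, 3, 1, 0, 0, 0)
    # Ten-second window with no state changes: the assumed state decides everything.
    print(uptime_percent(start, start + timedelta(seconds=10), "UP", []))
    # Four-hour window where the host goes down after the first hour.
    down_at = start + timedelta(hours=1)
    print(uptime_percent(start, start + timedelta(hours=4), "UP", [(down_at, "DOWN")]))
```

In this simplified model, a window that contains no state changes comes out at either 0% or 100%, depending entirely on the assumed starting state.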

You can gain a better understanding of the host states by filtering the report on a specific host or hostgroup, as this adds a table of the host states to the report.

Explanations for the options in the Availability Report are in the following guide:

Generating Reports With Nagios XI

Re: SLA report doesn't respect custom date range

Posted: Wed Apr 08, 2020 8:01 am
by tmvision
As explained in the other thread, it seems that we had gotten into a bad habit of discarding old logs :oops: No wonder the reports from 1/3/2020 look odd!

Let us instead focus on reports based on more recent data, which is still available.
I have attached an image of two SLA reports generated with the same settings. The report on the left was created with "Period: Today", while the other was made with a custom period covering the same time range.
These should show the same results, right?
sla-report.png
This is certainly not the case, so I still believe something is fishy in reports made with a "Custom" period.
What do you think? Am I missing some difference between "Today"-reports and custom-reports?

Re: SLA report doesn't respect custom date range

Posted: Wed Apr 08, 2020 3:43 pm
by lmiltchev
I was able to recreate the issue in-house, and filed an internal bug report (task_id=15048). Thanks for reporting the problem!

Re: SLA report doesn't respect custom date range

Posted: Wed Apr 15, 2020 4:28 am
by tmvision
Thank you for looking into it. Is it possible to follow the progress of the bug or does task_id refer to a private bug tracker?

Re: SLA report doesn't respect custom date range

Posted: Wed Apr 15, 2020 9:14 am
by lmiltchev
There is no public bug tracker for commercial products, at least not yet. You can ask for the status of the bug-fix task (providing the task ID) on the support forum or via a PM. You can also review the Nagios XI changelog here:

https://www.nagios.com/downloads/nagios-xi/change-log/

We post bug fixes and added features in it.