Availability report shows large amount of Undetermined Time

tmvision
Posts: 32
Joined: Fri Dec 01, 2017 8:15 am

Post by tmvision »

Hi,

Today I tried to generate an availability report, but the numbers presented seemed off (100 % uptime? I know that's not the case). Also, some of the reports seemed to be highly sensitive to the "First Assumed Service State" option, e.g. going from 100 % "Ok" to 100 % "Warning".

I tried to generate a legacy availability report instead, which turned out to be quite interesting. Reports from the last few days look all right, but older reports show a lot of "Undetermined". A report covering the first three months of this year shows 97 % "Undetermined" for every single service.

I am not quite sure how to troubleshoot this issue. What do you think is the cause of this behaviour?
I suspect this could be some kind of database issue.
Could it be that Nagios hasn't been able to write availability information correctly for a long period of time? Where does Nagios expect to find this information? We can access state history without problems.
If this is indeed a case of availability data missing from the database, is it possible to regenerate this data based on the state history?

Our system is running Nagios XI 5.6.5 on CentOS 7, 64-bit, manual install.

Edit: While investigating this issue I upgraded XI to version 5.6.12. The behaviour described above remains unchanged.
cdienger
Support Tech
Posts: 5045
Joined: Tue Feb 07, 2017 11:26 am

Re: Availability report shows large amount of Undetermined T

Post by cdienger »

The availability reports are generated from /usr/local/nagios/var/nagios.log and the logs in /usr/local/nagios/var/archives. Can you share some screenshots highlighting an example of the odd results, and include the settings used for the report? I'd like to see this and the logs that make up the report (these should be PM'd to me).
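To see how far back those logs actually reach, a small script can read the leading `[epoch]` timestamp that Nagios Core writes on each log line. This is only a rough sketch: the function names are made up for illustration, and it assumes the archived files are uncompressed and end in `.log`.

```python
import glob
import os
import re
from datetime import datetime, timezone

# Nagios Core log lines begin with a Unix timestamp in square brackets.
TIMESTAMP = re.compile(r"^\[(\d+)\]")


def log_coverage(lines):
    """Return (earliest, latest) epoch seconds seen in lines, or None."""
    earliest = latest = None
    for line in lines:
        match = TIMESTAMP.match(line)
        if not match:
            continue  # skip lines without a leading timestamp
        stamp = int(match.group(1))
        earliest = stamp if earliest is None else min(earliest, stamp)
        latest = stamp if latest is None else max(latest, stamp)
    if earliest is None:
        return None
    return earliest, latest


def iter_log_lines(var_dir):
    """Yield lines from nagios.log and archives/*.log (hypothetical helper)."""
    paths = [os.path.join(var_dir, "nagios.log")]
    paths += sorted(glob.glob(os.path.join(var_dir, "archives", "*.log")))
    for path in paths:
        if os.path.isfile(path):
            with open(path, errors="replace") as handle:
                yield from handle


if __name__ == "__main__":
    span = log_coverage(iter_log_lines("/usr/local/nagios/var"))
    if span is None:
        print("No timestamped log lines found")
    else:
        for label, stamp in zip(("earliest", "latest"), span):
            when = datetime.fromtimestamp(stamp, timezone.utc)
            print(label, when.isoformat())
```

If the earliest timestamp is only a few days old, a report covering an older period has no log data to work from.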
As of May 25th, 2018, all communications with Nagios Enterprises and its employees are covered under our new Privacy Policy.
tmvision
Posts: 32
Joined: Fri Dec 01, 2017 8:15 am

Re: Availability report shows large amount of Undetermined T

Post by tmvision »

I have sent you a PM with nagios.log, the contents of the archives directory, as well as (parts of) availability reports covering April.

The archives directory contains logs from the last 14 days. Is this expected?
Still, the availability reports don't appear to read data older than the 6th.
Do they only read the logs that haven't been zipped yet?

Edit: I found a scheduled task in /etc/cron.daily which zips and removes old logs from the archives directory. My guess would be that this is not an official Nagios file. Probably a cost-cutting measure on our part, with unintended consequences.
That said, is there a way to restore old nagios.log files based on the contents of the XI database?
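If compressed copies of the archived logs are still around, decompressing them back into place might be enough for the reports to pick them up again. The sketch below assumes the cron job produced gzip files named `*.log.gz` inside the archives directory, which I haven't confirmed; `restore_gzipped_logs` is a made-up name, and the restored files would probably still need to be readable by the nagios user.

```python
import gzip
import os
import shutil


def restore_gzipped_logs(archive_dir):
    """Decompress each *.log.gz in archive_dir next to itself.

    Returns the list of newly written .log paths. Existing .log files
    are never overwritten, and the .gz originals are left in place.
    """
    restored = []
    for name in sorted(os.listdir(archive_dir)):
        if not name.endswith(".log.gz"):
            continue
        source = os.path.join(archive_dir, name)
        target = source[: -len(".gz")]  # strip only the .gz suffix
        if os.path.exists(target):
            continue  # assume the uncompressed copy is already there
        with gzip.open(source, "rb") as src, open(target, "wb") as dst:
            shutil.copyfileobj(src, dst)
        restored.append(target)
    return restored


if __name__ == "__main__":
    # Assumed location, taken from the paths mentioned in this thread.
    archive_dir = "/usr/local/nagios/var/archives"
    if os.path.isdir(archive_dir):
        for path in restore_gzipped_logs(archive_dir):
            print("restored", path)
```

It would be safer to run this against a copy of the directory first.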
ssax
Dreams In Code
Posts: 7682
Joined: Wed Feb 11, 2015 12:54 pm

Re: Availability report shows large amount of Undetermined T

Post by ssax »

Nope, not an official file.

There's currently no way to replay the DB back into the archives.

How far does your State History report go back?
tmvision
Posts: 32
Joined: Fri Dec 01, 2017 8:15 am

Re: Availability report shows large amount of Undetermined T

Post by tmvision »

Our state history goes back two years. We were thus quite surprised when we noticed that the availability reports only accessed data from the last couple of days. We had assumed that they would be based on data found in the event history.

We are primarily interested in generating reports on the last month or so of history, which should be possible in a week or two. It would of course be nice to have a tool that could restore the older files (primarily for archiving purposes), but it is not business-critical.

I hope you will consider expanding the description of nagios.log in the documentation on log files to comment on the fact that this log is used as the data source for certain reports. I would consider that a useful addition as it might prevent others from indiscriminately deleting these "archived" logs the way we did. :)
cdienger
Support Tech
Posts: 5045
Joined: Tue Feb 07, 2017 11:26 am

Re: Availability report shows large amount of Undetermined T

Post by cdienger »

That is a good suggestion. I'll ping the kb team to have it updated.