Problems with Availability Reports
Posted: Mon Apr 08, 2019 12:16 pm
by jorgeaaq
Hi:
We have a licensed Nagios XI Enterprise installation.
I have a problem with the Nagios Availability Reports, and after some digging I found something strange when I go to the advanced configuration and open the Nagios Core reports directly.
For example, on April 1 at 2:30 am the host shows a Critical hard failure check, and the table then reports 21 hours and 21 minutes of failure (the rest of the day) before it suddenly changes to OK on April 2.
suc141.JPG
However, in the Nagios log I see that the service recovers after a couple of minutes, and for the rest of the day only a few soft failures appear in the logs.
Servicio141.JPG
I suppose this is why my Availability Reports are wrong.
Am I understanding this right?
Is there an error in Nagios?
Why do several hosts not recover after a failure until the day changes?
Can you help me?
Jorge Arenas
CSA
Re: Problems with Availability Reports
Posted: Mon Apr 08, 2019 2:41 pm
by ssax
What version of XI are you using? You can grab it from the bottom left hand side of the web interface.
What version of Core are you running?
Please run that availability report again, but show us the options you selected on the previous page (this is very important). On the final page, click the [ View full log entries ] link so that we can see them all, and resend the screenshot.
Thank you!
Re: Problems with Availability Reports
Posted: Thu Apr 11, 2019 12:42 pm
by jorgeaaq
Version of Nagios XI 5.5.9
Nagios Core 4.4.3
This is the information used to create the availability report. We selected Report period: This Month and Backtracked archives: 11
1.JPG
Service log entries without full log entries:
2.JPG
Service log entries with full log entries selected (I am selecting the April part):
8.JPG
Re: Problems with Availability Reports
Posted: Thu Apr 11, 2019 3:25 pm
by scottwilkerson
Can you show what you are selecting in step 3 of running the report?
We need to see whether you are including soft states.
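For anyone who wants to check the raw log for HARD versus SOFT entries directly, here is a minimal Python sketch. It assumes the usual `SERVICE ALERT` line layout in nagios.log (`[epoch] SERVICE ALERT: host;service;state;state type;attempt;output`). The host name, service name and plugin output in the sample lines are made up for illustration and are not taken from the logs in this thread.

```python
import re
from datetime import datetime, timezone

# Matches: [epoch] SERVICE ALERT: host;service;STATE;STATETYPE;attempt;output
ALERT_RE = re.compile(
    r"^\[(\d+)\] SERVICE ALERT: ([^;]*);([^;]*);([^;]*);([^;]*);(\d+);(.*)$"
)

def parse_service_alert(line):
    """Return a dict for a SERVICE ALERT line, or None for any other line."""
    m = ALERT_RE.match(line.strip())
    if m is None:
        return None
    epoch, host, service, state, state_type, attempt, output = m.groups()
    return {
        "time": datetime.fromtimestamp(int(epoch), tz=timezone.utc),
        "host": host,
        "service": service,
        "state": state,
        "state_type": state_type,
        "attempt": int(attempt),
        "output": output,
    }

# Hypothetical sample lines (example-host / PING are placeholders).
SAMPLE = [
    "[1554085800] SERVICE ALERT: example-host;PING;CRITICAL;HARD;3;hypothetical output",
    "[1554085980] SERVICE ALERT: example-host;PING;OK;SOFT;1;hypothetical output",
    "[1554086000] SERVICE NOTIFICATION: admin;example-host;PING;CRITICAL;notify;text",
]

# Keep only the lines that parsed as service alerts.
alerts = [a for a in map(parse_service_alert, SAMPLE) if a is not None]
# HARD entries are the ones a report without soft states would look at.
hard_alerts = [a for a in alerts if a["state_type"] == "HARD"]

for a in alerts:
    print(a["time"].isoformat(), a["state"], a["state_type"])
```

Comparing `alerts` with `hard_alerts` for the affected service shows quickly whether the recovery was ever logged as a HARD OK.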
Re: Problems with Availability Reports
Posted: Mon Apr 15, 2019 12:30 pm
by jorgeaaq
Sorry for missing that.
When I select No for soft states:
Paso 3 no soft states.png
I get a report with 21h 30m in the Service Critical (HARD) state:
21 hrs and next day.png
But with Yes for soft states:
detalle con soft states dia 1 de abril.png
My questions are:
1.- Why 21 hours down? The logs in the first post show the service down for a few minutes and then everything OK, so why does the table show 21 hours?
2.- Why does the service change state as soon as the day changes? This is the other strange behavior: why does the service stay in the down state, and why does it return to OK at the end of the day?
3.- When I include soft states, the table better reflects what really happens.
So I do not know whether this behavior is expected. If it is, can you explain why? Or is this a bug in the report?
Thanks in advance
Jorge Arenas
Re: Problems with Availability Reports
Posted: Mon Apr 15, 2019 12:41 pm
by scottwilkerson
You didn't go back to HARD OK until 21 hours later.
However, there may be an explanation.
What version of Nagios Core is this?
There was a bug in early 4.4.x versions that could cause this to not go HARD when it is supposed to.
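To illustrate the point about HARD states, here is a rough Python sketch of how a report that ignores soft states can end up with about 21 hours of CRITICAL time. This is not the actual Nagios report code. The event list is hypothetical and only mimics the pattern described in this thread: one HARD CRITICAL at 02:30, recoveries that are only logged as SOFT, and the next HARD OK entry at midnight.

```python
from datetime import datetime, timedelta

# Hypothetical state entries (time, state, state type); not taken from the real log.
EVENTS = [
    (datetime(2019, 4, 1, 2, 30), "CRITICAL", "HARD"),
    (datetime(2019, 4, 1, 2, 33), "OK", "SOFT"),
    (datetime(2019, 4, 1, 9, 0), "CRITICAL", "SOFT"),
    (datetime(2019, 4, 1, 9, 1), "OK", "SOFT"),
    (datetime(2019, 4, 2, 0, 0), "OK", "HARD"),
]

def critical_time(events, include_soft, end):
    """Sum time spent in CRITICAL, holding each state until the next counted entry."""
    # Without soft states, only HARD entries can change the reported state.
    counted = [e for e in events if include_soft or e[2] == "HARD"]
    total = timedelta()
    # Pair each counted entry with the time of the one that follows it.
    next_times = [e[0] for e in counted[1:]] + [end]
    for (start, state, _), next_time in zip(counted, next_times):
        if state == "CRITICAL":
            total += next_time - start
    return total

END = datetime(2019, 4, 2, 0, 0)
hard_only = critical_time(EVENTS, include_soft=False, end=END)
with_soft = critical_time(EVENTS, include_soft=True, end=END)

print("hard states only:", hard_only)
print("with soft states:", with_soft)
```

With this sample data the hard-only total is 21 hours 30 minutes, while the total with soft states is 4 minutes, because the SOFT recoveries are skipped in the first case and the CRITICAL state is held until the next HARD entry.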
Re: Problems with Availability Reports
Posted: Mon Apr 22, 2019 6:37 pm
by jorgeaaq
Hi Scott:
My version is
Nagios Core 4.4.3
Is this version affected?
Which version of Nagios XI do I need to upgrade to in order to get a newer Nagios Core?
Or do I need to upgrade manually?
Thanks in advance
Jorge Arenas Quezada
Re: Problems with Availability Reports
Posted: Tue Apr 23, 2019 9:01 am
by scottwilkerson
This is the latest version.
How often is this host/service being checked?
Can you share the configuration for this host/service?