So far, I think the inaccurate availability report is most likely caused by a bug where a soft OK sometimes does not change to a hard OK when the service's host is also down. This issue still exists in Nagios Core 4.4.5.
Searching on Google, I found that someone else has reported the same kind of issue at the link below. As a result, the whole period between the soft OK and the later real OK is currently counted as unavailable time, even though the service was actually OK during that period.
https://github.com/NagiosEnterprises/na ... issues/730
Service Availability Report seems not accurate for my servic
-
scottwilkerson
- DevOps Engineer
- Posts: 19396
- Joined: Tue Nov 15, 2011 3:11 pm
- Location: Nagios Enterprises
- Contact:
Re: Service Availability Report seems not accurate for my se
What are you selecting on step 3 when you run the report?
Re: Service Availability Report seems not accurate for my se
Hi, Scottwilkerson
Here's how I generate the report:
First I select a service and click "View Availability Report For This Service" to get the report page. Then I select "last month" and click "update", which gives me the report for last month.
Here's a screenshot of the report for one of my services. This service went critical on 2020-02-09 12:14:50. At 20:00 the Nagios instance had a scheduled restart, and the critical status continued. At 2020-02-10 00:00:00 the status changed to OK. From the alert log, I can see that on 2020-02-09 12:44:20 this service got a soft OK, but no hard OK until the end of the day.
So in reality I think this service recovered on 2020-02-09 12:44:20, but the availability report treats it as recovered on 2020-02-10 00:00:00.
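To make the gap concrete, here is a small worked calculation using only the three timestamps quoted above. The variable names are illustrative, and this is just arithmetic on those timestamps, not how Nagios itself computes availability.

```python
from datetime import datetime

# Timestamps taken from the post above.
critical_start = datetime(2020, 2, 9, 12, 14, 50)   # service goes critical
soft_ok = datetime(2020, 2, 9, 12, 44, 20)          # soft OK in the alert log
reported_ok = datetime(2020, 2, 10, 0, 0, 0)        # recovery shown in the report

# Downtime if the soft OK counts as the recovery point.
downtime_with_soft = soft_ok - critical_start
# Downtime if only the later OK counts as the recovery point.
downtime_hard_only = reported_ok - critical_start

print(downtime_with_soft)   # 0:29:30
print(downtime_hard_only)   # 11:45:10
```

The difference, a little over 11 hours, is the period the report counts as unavailable even though the alert log already shows a soft OK.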
-
scottwilkerson
Re: Service Availability Report seems not accurate for my se
When you access the report this way, some assumptions are made; in particular, it sets includesoftstates=no.
To modify these you need to run the report from
Reports -> Availability -> Services
Select Service
Then you can choose all the possible settings for the report
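For anyone who wants to link straight to a report with soft states included, here is a minimal sketch that builds such a URL. Only the `includesoftstates` option is named in this thread; the base URL, the host and service names, and the other query parameter names (`host`, `service`, `timeperiod`) are assumptions for illustration and should be checked against the URL your own Nagios instance generates.

```python
from urllib.parse import urlencode

# Hypothetical location of the availability CGI; adjust for your install.
BASE_URL = "http://nagios.example.com/nagios/cgi-bin/avail.cgi"


def availability_url(host, service, include_soft_states):
    """Build a report URL; parameter names other than includesoftstates are assumed."""
    params = {
        "host": host,                # assumed parameter name
        "service": service,          # assumed parameter name
        "timeperiod": "lastmonth",   # assumed parameter name and value
        "includesoftstates": "yes" if include_soft_states else "no",
    }
    return f"{BASE_URL}?{urlencode(params)}"


# "web01" and "HTTP" are placeholder host and service names.
url = availability_url("web01", "HTTP", include_soft_states=True)
print(url)
```

Comparing this against the address bar after running the report from Reports -> Availability -> Services is a quick way to confirm which options actually differ.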
Re: Service Availability Report seems not accurate for my se
Hi, Scottwilkerson
With your "formal" method of generating the availability report, I now get the correct report. Thanks.
One thing still confuses me: is that soft OK lasting 8 or 9 hours a bug, or an expected state transition in your opinion?
-
scottwilkerson
Re: Service Availability Report seems not accurate for my se
source888 wrote: One confusion for me , so that soft ok which last 8 or 9 hours long, is it still a bug or an expected status transfer in your opinion?
I believe this is expected.
From the following you will see that a SOFT RECOVERY is a special case:
https://assets.nagios.com/downloads/nag ... types.html
Service experiences a SOFT recovery. Event handlers execute, but notification are not sent, as this wasn't a "real" problem. State type is set HARD and check # is reset to 1 immediately after this happens.
So the actual state type is set to HARD from the Nagios memory perspective, but it is logged as SOFT to reflect the fact that notifications are not sent in these cases.
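The behaviour described above can be sketched as a toy model. This is a simplified illustration based only on the quoted documentation text, not the actual Nagios Core implementation; the `Service` class and its fields are hypothetical names.

```python
from dataclasses import dataclass


@dataclass
class Service:
    """Toy model of SOFT/HARD state handling (not real Nagios code)."""
    max_check_attempts: int = 3
    state: str = "OK"
    state_type: str = "HARD"
    attempt: int = 1

    def process(self, result):
        """Apply one check result; return (state, state type as logged)."""
        was_ok = self.state == "OK"
        was_soft = self.state_type == "SOFT"
        if result != "OK":
            if was_ok:
                self.attempt = 1          # first failure after OK
            elif was_soft:
                self.attempt += 1         # still retrying
            reached_max = self.attempt >= self.max_check_attempts
            self.state_type = "HARD" if reached_max else "SOFT"
            logged_type = self.state_type
        else:
            # Recovery from a SOFT problem is logged as SOFT ...
            logged_type = "SOFT" if (not was_ok and was_soft) else "HARD"
            # ... but in memory the state type becomes HARD and the
            # check number is reset to 1 immediately afterwards.
            self.state_type = "HARD"
            self.attempt = 1
        self.state = result
        return (result, logged_type)


svc = Service()
first = svc.process("CRITICAL")   # one failed check, still SOFT
recovery = svc.process("OK")      # recovers before reaching max attempts
print(first, recovery, svc.state_type, svc.attempt)
```

In this model the recovery is logged as SOFT while the in-memory state type is already HARD, which matches the explanation that a soft OK in the log does not by itself mean the service is stuck in a soft state.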