Hi,
We have an issue in Nagios XI 5.7.3 where the service state history and the availability report show an unreasonable number of hours in the critical state.
This happens every time we restart the host machine. For instance, we restarted the server on Jan 17 for ~20 minutes, which was correctly reflected in the host state report (first image below), but it doesn't match the service state history, which shows 17h+ in a critical state (second image). Did the service check keep waiting for those 17 hours even though the host was up?
I have also attached the check settings of the service for which we're trying to generate the availability report for the last 31 days. The logs in /usr/local/nagios/var/archives also looked fine.
Regards,
State/availability report issue during host restarts
-
benjaminsmith
- Posts: 5324
- Joined: Wed Aug 22, 2018 4:39 pm
- Location: saint paul
Re: State/availability report issue during host restarts
Hi @melchi,
That trends report is pulled straight from the Nagios log files, so whatever is in the logs should be reflected in the report. Does this host have any other services? Do they show the same behavior? If not, we should take a closer look at this particular service; maybe it didn't return to an OK state after rebooting.
Can you pull a State History report for this service over that period, making sure to set the state type to "Both" and the state to "Any" in the report options, and then upload it to the thread or send it in a PM?
Benjamin
As of May 25th, 2018, all communications with Nagios Enterprises and its employees are covered under our new Privacy Policy.
Be sure to check out our Knowledgebase for helpful articles and solutions!
Re: State/availability report issue during host restarts
Hi Benjamin,
Sorry for the delayed reply.
Please refer to the service log entries below, taken from the availability report generated via [Legacy Reports > Availability > Service(s) > Select Service > Report Period Last 31 Days].
As you can see, the log only updates at midnight, which is why the next OK after the KO we had during the patching activity is reported 17h later, causing the false Critical status.
Is there any way we can change the update schedule of the log? It doesn't seem possible, even after restarting the Nagios Core service.
-Melchi
benjaminsmith
Re: State/availability report issue during host restarts
Hi melchi,
Looking over the last screenshot, I see a couple of timeouts, so it's likely that network issues caused the service to fail even though the host was reporting up.
This can be resolved by increasing the timeout on the service check. If you upload the system profile and the exact name of this service, I can review the check command and make recommendations for you.
To send us your system profile:
Log in to the Nagios XI GUI using a web browser.
Click the "Admin" > "System Profile" menu.
Click the "Download Profile" button.
--Benjamin
Re: State/availability report issue during host restarts
Hi Benjamin,
I sent you a PM with these details.
Thanks,
Melchi
benjaminsmith
Re: State/availability report issue during host restarts
Hi Melchi,
It looks like you are using the following plugin for this service:
https://exchange.nagios.org/directory/P ... II/details
Try adding a -t 60 option to the check command to raise the timeout and reduce the number of timeouts. The default is 30 seconds.
Regarding the other issue: to my knowledge, there is no outstanding bug for this report. I would need all of the nagios.log files for this time period to determine whether the service was actually in a critical state during that time.
Those files are located in the following directory:
/usr/local/nagios/var/archives
--Benjamin
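For anyone who wants to look at the raw entries for a single service before uploading anything, here is a minimal Python sketch that collects the matching lines from the archive directory. It assumes each log line starts with a bracketed epoch timestamp followed by an entry such as SERVICE ALERT: host;service;state;state type;..., and that the archived files end in .log. The function name service_entries and the host/service names in the usage comment are made up for illustration.

```python
import re
from pathlib import Path

# A log line is assumed to start with "[<epoch seconds>] ".
TS_RE = re.compile(r"^\[(\d+)\] ")


def service_entries(archive_dir, host, service):
    """Return log lines mentioning host;service, sorted by timestamp."""
    needle = f": {host};{service};"
    hits = []
    for path in Path(archive_dir).glob("*.log"):
        with path.open(encoding="utf-8", errors="replace") as fh:
            for line in fh:
                m = TS_RE.match(line)
                if m and needle in line:
                    hits.append((int(m.group(1)), line.rstrip("\n")))
    # Files are read in arbitrary order, so sort on the parsed timestamp.
    hits.sort(key=lambda pair: pair[0])
    return [line for _, line in hits]


# Hypothetical usage (host "web01" and service "HTTP" are placeholders):
# for entry in service_entries("/usr/local/nagios/var/archives", "web01", "HTTP"):
#     print(entry)
```

Matching on the literal ": host;service;" prefix keeps the sketch simple; it will also pick up other entry types for the same service, such as notifications, which may or may not be wanted.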
Re: State/availability report issue during host restarts
Hi Benjamin,
Sorry, we just found the issue. When we generated the report we used the default options, which do not include soft states, so we only got hard states in the report.
You can lock this ticket. Thank you for your support!
-Melchi
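To illustrate how the state-type option can change the totals, the sketch below sums the time a service spends in CRITICAL from log entries, once counting only HARD entries and once counting SOFT entries too. The log lines are invented sample data (host web01, service HTTP, small fake timestamps), not taken from this thread, and the function is a simplification for illustration, not the report's actual logic.

```python
import re

# "[<epoch>] <entry type>: host;service;state;state_type;..."
LINE_RE = re.compile(
    r"^\[(\d+)\] (SERVICE ALERT|CURRENT SERVICE STATE): "
    r"([^;]*);([^;]*);([^;]*);([^;]*);"
)

# Invented sample entries, for illustration only.
SAMPLE = [
    "[1000] CURRENT SERVICE STATE: web01;HTTP;OK;HARD;1;OK",
    "[2000] SERVICE ALERT: web01;HTTP;CRITICAL;SOFT;1;timeout",
    "[2060] SERVICE ALERT: web01;HTTP;CRITICAL;SOFT;2;timeout",
    "[2120] SERVICE ALERT: web01;HTTP;CRITICAL;HARD;3;timeout",
    "[3000] SERVICE ALERT: web01;HTTP;OK;HARD;3;OK",
    "[5000] CURRENT SERVICE STATE: web01;HTTP;OK;HARD;1;OK",
]


def critical_seconds(lines, host, service, end_ts, include_soft):
    """Sum seconds in CRITICAL between consecutive counted entries."""
    total = 0
    prev_ts = None
    prev_state = None
    for line in lines:
        m = LINE_RE.match(line)
        if not m:
            continue
        ts, _kind, h, s, state, state_type = m.groups()
        if (h, s) != (host, service):
            continue
        if state_type == "SOFT" and not include_soft:
            continue  # hard-only view: pretend this entry was never logged
        ts = int(ts)
        if prev_state == "CRITICAL":
            total += ts - prev_ts
        prev_ts, prev_state = ts, state
    # Close the last interval at the end of the reporting window.
    if prev_state == "CRITICAL":
        total += end_ts - prev_ts
    return total


print(critical_seconds(SAMPLE, "web01", "HTTP", 6000, include_soft=False))  # 880
print(critical_seconds(SAMPLE, "web01", "HTTP", 6000, include_soft=True))   # 1000
```

With soft entries skipped, a state only changes at the next counted entry, so whichever entries the report ignores directly shift how long each state appears to last.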
-
scottwilkerson
- DevOps Engineer
- Posts: 19396
- Joined: Tue Nov 15, 2011 3:11 pm
- Location: Nagios Enterprises
Re: State/availability report issue during host restarts
Locking thread