Inconsistent Nagios Report

fran.pastor
Posts: 24
Joined: Tue Nov 22, 2011 3:17 am

Inconsistent Nagios Report

Post by fran.pastor »

Hello, I've just seen some data that I think is wrong. See the attached screenshots.
Last night we had a brief network problem that caused the service check to fail, but it recovered on the very next check. The curious part is that when you generate a report (Trends or Availability, for example), the service is shown as down for 13 or 15 hours. Why?
Attachments
instantánea4.png
instantánea3.png
instantánea2.png
Last edited by fran.pastor on Thu Feb 21, 2013 10:40 am, edited 1 time in total.
abrist
Red Shirt
Posts: 8334
Joined: Thu Nov 15, 2012 1:20 pm

Re: Inconsistent Nagios Report

Post by abrist »

I wonder if this host was flapping for 13 hours ....
Former Nagios employee
"It is turtles. All. The. Way. Down. . . .and maybe an elephant or two."
VI VI VI - The editor of the Beast!
Come to the Dark Side.
fran.pastor
Posts: 24
Joined: Tue Nov 22, 2011 3:17 am

Re: Inconsistent Nagios Report

Post by fran.pastor »

No. If you look at the "instantánea2.png" screenshot, which is the "Alert History" for that service, there are no further events. Nothing changed during the remaining hours.
It is strange.
slansing
Posts: 7698
Joined: Mon Apr 23, 2012 4:28 pm
Location: Travelling through time and space...

Re: Inconsistent Nagios Report

Post by slansing »

I just want to verify: you do have flap detection enabled, correct?
fran.pastor
Posts: 24
Joined: Tue Nov 22, 2011 3:17 am

Re: Inconsistent Nagios Report

Post by fran.pastor »

slansing wrote:I just want to verify that you do have flapping detection enabled correct?
Yes slansing, flap detection is configured correctly. If you look at the events in the "Alert History" screenshot, any flapping would show up there.
Attachments
screenshot8.png
Last edited by fran.pastor on Thu Feb 21, 2013 11:57 am, edited 1 time in total.
abrist
Red Shirt
Posts: 8334
Joined: Thu Nov 15, 2012 1:20 pm

Re: Inconsistent Nagios Report

Post by abrist »

Could you post the host configuration and the main nagios.cfg file in code wrap?
fran.pastor
Posts: 24
Joined: Tue Nov 22, 2011 3:17 am

Re: Inconsistent Nagios Report

Post by fran.pastor »

I've thought about it and I see no logical explanation. Where can I report a bug?
Attachments
screenshot7.png
screenshot6.png
screenshot5.png
Last edited by fran.pastor on Thu Feb 21, 2013 11:56 am, edited 1 time in total.
abrist
Red Shirt
Posts: 8334
Joined: Thu Nov 15, 2012 1:20 pm

Re: Inconsistent Nagios Report

Post by abrist »

You are welcome to post a bug report to http://tracker.nagios.org but I am not convinced it is a bug yet. Posting the host config and the main nagios config will allow us to check over your configuration to help verify if it is indeed a bug. You are welcome to obfuscate any sensitive information from those files.
fran.pastor
Posts: 24
Joined: Tue Nov 22, 2011 3:17 am

Re: Inconsistent Nagios Report

Post by fran.pastor »

abrist wrote:You are welcome to post a bug report to http://tracker.nagios.org but I am not convinced it is a bug yet. Posting the host config and the main nagios config will allow us to check over your configuration to help verify if it is indeed a bug. You are welcome to obfuscate any sensitive information from those files.
Thanks for the support, abrist.
What I find suspicious is that, according to the Trends report, the service recovers at 00:00, yet the check ran every 5 minutes all day and every result was OK, apart from a single CRITICAL at around 00:20.

This is the config result from objects.cache:

define service {
    host_name                     Watchmouse
    service_description           Check Hotelopia
    check_period                  24x7
    check_command                 check_watchmouse!Check Hotelopia!
    contact_groups                datacenter-administrators-tic
    notification_period           24x7
    initial_state                 o
    check_interval                300.000000
    retry_interval                300.000000
    max_check_attempts            3
    is_volatile                   0
    parallelize_check             1
    active_checks_enabled         1
    passive_checks_enabled        1
    obsess_over_service           1
    event_handler_enabled         1
    low_flap_threshold            0.000000
    high_flap_threshold           0.000000
    flap_detection_enabled        1
    flap_detection_options        o,w,u,c
    freshness_threshold           0
    check_freshness               0
    notification_options          u,w,c,r,s
    notifications_enabled         1
    notification_interval         0.000000
    first_notification_delay      0.000000
    stalking_options              n
    process_perf_data             1
    failure_prediction_enabled    1
    retain_status_information     1
    retain_nonstatus_information  1
}
slansing
Posts: 7698
Joined: Mon Apr 23, 2012 4:28 pm
Location: Travelling through time and space...

Re: Inconsistent Nagios Report

Post by slansing »

One issue I noticed right away was this:

check_interval 300.000000
retry_interval 300.000000
With those values, check_interval and retry_interval are both 300 minutes, because Nagios reads them in time units of interval_length (60 seconds by default) rather than in seconds. Setting each to 5, for example, would mean the service is checked every 5 minutes, and after a state change it is rechecked every 5 minutes, up to three times (max_check_attempts), before an alert is generated.

So it is entirely possible that Nagios detected the state change but did not check again until 300 minutes later, and it would have had to do this three times before finding that the service was back up and switching to an OK state.
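
To put rough numbers on that, here is a minimal sketch in Python (not anything Nagios ships). It assumes the interval values are read in units of the interval_length directive from nagios.cfg, which defaults to 60 seconds. The thread does not show what interval_length is set to on this system, so both cases are computed. The function names are made up for illustration.

```python
def interval_seconds(interval_units: float, interval_length: int = 60) -> float:
    """Convert a check_interval/retry_interval value to seconds.

    interval_length is assumed to be the nagios.cfg directive of the same
    name (seconds per time unit, 60 by default).
    """
    return interval_units * interval_length


def hours(seconds: float) -> float:
    """Convert seconds to hours."""
    return seconds / 3600


# Values taken from the objects.cache dump posted above.
check_interval = 300.0
max_check_attempts = 3

# Case 1: default interval_length of 60 seconds per unit.
one_interval = interval_seconds(check_interval)
print(hours(one_interval))                       # 5.0 hours between checks
print(hours(one_interval * max_check_attempts))  # 15.0 hours for three intervals

# Case 2: hypothetical interval_length of 1 (values are plain seconds).
print(interval_seconds(check_interval, 1) / 60)  # 5.0 minutes between checks
```

Under the default, three 300-unit intervals add up to 15 hours, which is in the range the reports show. With interval_length set to 1, the same values would mean 5-minute checks, which is why the main nagios.cfg is worth seeing.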