service checks running too often

Posted: Wed Dec 12, 2012 2:19 pm
by grimm26
I'm running Nagios 3.4.1 on RHEL6. One of my service checks (a poller) is running too often and I am not sure why. I have set

Code:

service_check_timeout=180
because I had trouble with the poller running long. Relevant settings for the service check:

Code:

        check_period                    24x7
        max_check_attempts              1
        normal_check_interval           5
        retry_check_interval            5
I also added logging to the poller that records "timestamp PID started by PPID : Poll [Start|End] of poller":

Code:

2012-12-12_12:26:38 19448 started by 19442 : Poll Start of poller
2012-12-12_12:27:13 19448 started by 19442 : Poll End of poller
2012-12-12_12:28:14 19931 started by 19930 : Poll Start of poller
2012-12-12_12:30:14 19931 started by 19930 : Poll End of poller
2012-12-12_12:31:37 20467 started by 20460 : Poll Start of poller
2012-12-12_12:33:15 20949 started by 20946 : Poll Start of poller
2012-12-12_12:33:15 20467 started by 20460 : Poll End of poller
2012-12-12_12:33:41 20949 started by 20946 : Poll End of poller
2012-12-12_12:36:38 21483 started by 21478 : Poll Start of poller
2012-12-12_12:38:14 21971 started by 21964 : Poll Start of poller
2012-12-12_12:39:17 21483 started by 21478 : Poll End of poller
2012-12-12_12:39:18 21971 started by 21964 : Poll End of poller
2012-12-12_12:41:38 22500 started by 22492 : Poll Start of poller
2012-12-12_12:42:19 22500 started by 22492 : Poll End of poller
2012-12-12_12:43:14 23003 started by 22999 : Poll Start of poller
2012-12-12_12:45:20 23003 started by 22999 : Poll End of poller
2012-12-12_12:46:37 23540 started by 23535 : Poll Start of poller
2012-12-12_12:48:14 24025 started by 24024 : Poll Start of poller
2012-12-12_12:48:20 23540 started by 23535 : Poll End of poller
2012-12-12_12:48:41 24025 started by 24024 : Poll End of poller
2012-12-12_12:51:38 24558 started by 24554 : Poll Start of poller
2012-12-12_12:53:14 25044 started by 25041 : Poll Start of poller
2012-12-12_12:54:35 25044 started by 25041 : Poll End of poller
As you can see, I start to get overlapping pollers. I don't understand why this would happen. Any hints or clues?
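For anyone who wants to check a log like this without reading it by eye, here is a rough Python sketch. It assumes the exact line format shown above; the function names (`parse`, `analyze`) are made up for illustration, and the sample is just the first eight lines of the log.

```python
from datetime import datetime

# First eight lines of the log posted above.
SAMPLE = """\
2012-12-12_12:26:38 19448 started by 19442 : Poll Start of poller
2012-12-12_12:27:13 19448 started by 19442 : Poll End of poller
2012-12-12_12:28:14 19931 started by 19930 : Poll Start of poller
2012-12-12_12:30:14 19931 started by 19930 : Poll End of poller
2012-12-12_12:31:37 20467 started by 20460 : Poll Start of poller
2012-12-12_12:33:15 20949 started by 20946 : Poll Start of poller
2012-12-12_12:33:15 20467 started by 20460 : Poll End of poller
2012-12-12_12:33:41 20949 started by 20946 : Poll End of poller
"""


def parse(line):
    """Split one log line into (timestamp, pid, 'Start' or 'End')."""
    fields = line.split()
    when = datetime.strptime(fields[0], "%Y-%m-%d_%H:%M:%S")
    return when, int(fields[1]), fields[7]


def analyze(lines):
    """Return gaps between starts, gaps between every other start, and overlaps."""
    running = {}   # pid -> start time of pollers that have not ended yet
    starts = []
    overlaps = []  # (start time, new pid, pids still running at that moment)
    for line in lines:
        if not line.strip():
            continue
        when, pid, event = parse(line)
        if event == "Start":
            if running:
                overlaps.append((when, pid, sorted(running)))
            running[pid] = when
            starts.append(when)
        elif event == "End":
            running.pop(pid, None)
    gaps = [int((b - a).total_seconds()) for a, b in zip(starts, starts[1:])]
    every_other = [int((b - a).total_seconds()) for a, b in zip(starts, starts[2:])]
    return gaps, every_other, overlaps


gaps, every_other, overlaps = analyze(SAMPLE.splitlines())
print("seconds between consecutive starts:", gaps)
print("seconds between every other start:", every_other)
for when, pid, others in overlaps:
    print(f"{when} PID {pid} started while {others} still running")
```

On this sample the consecutive starts are well under five minutes apart, while every other start is about 300 seconds apart, which looks like two separate five-minute schedules interleaved rather than one check firing at random.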

Re: service checks running too often

Posted: Wed Dec 12, 2012 3:26 pm
by slansing
One possibility: are you running more than one instance of Nagios? Or is an additional one spawning for some reason?

Re: service checks running too often

Posted: Wed Dec 12, 2012 3:50 pm
by grimm26
There is only one instance of Nagios running, and all of the checks can be traced back to the same master Nagios daemon.

Re: service checks running too often

Posted: Fri Dec 14, 2012 11:56 am
by slansing
Can you send a sample service config, as well as any templates related to it such as time periods?

Also make sure to check your nagios.cfg for the following line:

Code:

interval_length=
The default is 60, which makes each interval unit 60 seconds, so a normal_check_interval of 5 means a check every five minutes. Be sure it's not set lower.
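To make the arithmetic concrete, here is a small Python sketch of how the two settings combine. The function name is invented for illustration; the only assumption is that the check interval is counted in units of interval_length seconds.

```python
def effective_interval_seconds(check_interval, interval_length=60):
    """Seconds between scheduled checks: interval units times seconds per unit."""
    return check_interval * interval_length


# Settings from the original post: normal_check_interval 5, default interval_length.
print(effective_interval_seconds(5))

# Hypothetical lower interval_length, to show how it would shorten the gap.
print(effective_interval_seconds(5, interval_length=20))
```

With the posted settings this gives 300 seconds, so a lower interval_length is one way checks could end up running more often than expected.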

Re: service checks running too often

Posted: Fri Dec 14, 2012 12:11 pm
by grimm26
I think I solved this on the nagios-users mailing list. When I did a Nagios reload, every service check that was running at that moment got a duplicate schedule entry. After that, even a full restart didn't help, because Nagios was retaining scheduling info across restarts. I had to disable use_retained_scheduling_info and then restart to get a clean slate of service check scheduling without duplicates.
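If you want to confirm what your own nagios.cfg says before restarting, something like this Python sketch can read the directive. The helper names and the sample config text are hypothetical, and it only handles plain key=value lines.

```python
def read_main_config(text):
    """Collect key=value directives, skipping blanks and '#' comments.

    Directives that repeat (such as cfg_file) keep only their last value,
    which is fine for the single-valued options looked up here.
    """
    settings = {}
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#"):
            continue
        key, sep, value = line.partition("=")
        if sep:
            settings[key.strip()] = value.strip()
    return settings


def scheduling_info_retained(settings):
    """True only when the directive is present and set to 1."""
    return settings.get("use_retained_scheduling_info") == "1"


# Hypothetical excerpt, not taken from the poster's actual configuration.
EXAMPLE_CFG = """\
# main config excerpt
interval_length=60
service_check_timeout=180
use_retained_scheduling_info=1
"""

settings = read_main_config(EXAMPLE_CFG)
print("use_retained_scheduling_info =", settings.get("use_retained_scheduling_info"))
print("scheduling info retained:", scheduling_info_retained(settings))
```

To run it against a real file, pass the file's contents to `read_main_config`; the path depends on your install.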

How do I get this bug tracked?

Re: service checks running too often

Posted: Fri Dec 14, 2012 12:14 pm
by slansing
Alrighty, thanks for the info. It would be great if you could post that information to our tracker at http://tracker.nagios.org.

Re: service checks running too often

Posted: Fri Dec 14, 2012 12:39 pm
by grimm26
Issue 409 created.

Re: service checks running too often

Posted: Fri Dec 14, 2012 1:09 pm
by slansing
Sounds good, thank you!