
About Service Check Scheduling

Posted: Wed Jan 09, 2013 8:01 pm
by cornea
The Retry Check Interval is 1m, but the second check ran only after 18m. Why?
Has anyone else run into this problem?

Re: About Service Check Scheduling

Posted: Thu Jan 10, 2013 11:49 am
by abrist
The "alert summary" report will only show the history of alerts and state changes. So this particular service took 18 minutes to move from a soft critical to a soft OK. Was the service check flapping?
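
To illustrate the point about state changes: the gap between two logged alert entries for a service is the time between state changes, which is not necessarily the time between checks. The Python sketch below measures those gaps. It assumes the bracketed, human-readable timestamp format seen in the log excerpt posted later in this thread, and an English locale for parsing day and month names. `service_alert_gaps` is a hypothetical helper, not part of Nagios, and the two sample lines are copied from that excerpt.

```python
import re
from datetime import datetime

# Assumed line shape: [Wed Jan 16 14:30:46 2013] SERVICE ALERT: host;service;state;type;attempt;output
ALERT_RE = re.compile(r"^\[(?P<ts>[^\]]+)\] SERVICE ALERT: (?P<body>.*)$")


def service_alert_gaps(lines):
    """Return (service, seconds since that service's previous alert entry) pairs."""
    last_seen = {}  # (host, service) -> time of the previous alert entry
    gaps = []
    for line in lines:
        m = ALERT_RE.match(line.strip())
        if not m:
            continue  # skip host alerts and anything else
        when = datetime.strptime(m.group("ts"), "%a %b %d %H:%M:%S %Y")
        host, service = m.group("body").split(";")[:2]
        key = (host, service)
        if key in last_seen:
            gaps.append((service, int((when - last_seen[key]).total_seconds())))
        last_seen[key] = when
    return gaps


sample = [
    "[Wed Jan 16 14:30:46 2013] SERVICE ALERT: ASNAY0S0004;s_switch_cisco_ping;CRITICAL;SOFT;1;CRITICAL - 10.196.255.9: rta nan, lost 100%",
    "[Wed Jan 16 14:31:45 2013] SERVICE ALERT: ASNAY0S0004;s_switch_cisco_ping;OK;SOFT;2;OK - 10.196.255.9: rta 16.348ms, lost 0%",
]
print(service_alert_gaps(sample))  # [('s_switch_cisco_ping', 59)]
```

A gap close to the retry interval, as in this sample, suggests the retry ran on time. A much longer gap only shows that the logged state did not change in between, so the full log is still needed to see when checks actually ran.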

Re: About Service Check Scheduling

Posted: Thu Jan 10, 2013 8:24 pm
by cornea
abrist wrote:The "alert summary" report will only show the history of alerts and state changes. So this particular service took 18 minutes to move from a soft critical to a soft OK. Was the service check flapping?
I didn't find any notification about flapping.

Re: About Service Check Scheduling

Posted: Thu Jan 10, 2013 9:12 pm
by cornea
cornea wrote:The Retry Check Interval is 1m, but the second check ran only after 18m. Why?
Has anyone else run into this problem?
Further information: the second check ran after about 4 hours.

Re: About Service Check Scheduling

Posted: Fri Jan 11, 2013 4:05 pm
by abrist
Post the output of:

cat /usr/local/nagios/var/nagios.log | grep [hostname of box] 

Re: About Service Check Scheduling

Posted: Sun Jan 13, 2013 8:53 pm
by cornea
abrist wrote:Post the output of:

cat /usr/local/nagios/var/nagios.log | grep [hostname of box] 
I checked the log. It shows the same thing as the graph.

But I found that some "nagios" processes have a start time from before I restarted them. Is it possible that some processes are using the old configuration while others are using the new one?

Re: About Service Check Scheduling

Posted: Mon Jan 14, 2013 11:08 am
by slansing
Can you post the output that abrist suggested so that we can see all of the most recent logged information?

Re: About Service Check Scheduling

Posted: Wed Jan 16, 2013 3:45 am
by cornea
[Wed Jan 16 14:15:28 2013] HOST ALERT: ASNAY0S0004;DOWN;SOFT;1;(Host Check Timed Out)
[Wed Jan 16 14:16:03 2013] SERVICE ALERT: ASNAY0S0004;s_switch_cisco_ram;OK;SOFT;2;Processor:60% : 60% : : OK
[Wed Jan 16 14:16:13 2013] HOST ALERT: ASNAY0S0004;DOWN;SOFT;2;(Host Check Timed Out)
[Wed Jan 16 14:17:20 2013] HOST ALERT: ASNAY0S0004;UP;SOFT;3;OK - 10.196.255.9: rta 16.178ms, lost 0%
[Wed Jan 16 14:24:45 2013] SERVICE ALERT: ASNAY0S0004;s_switch_cisco_ping;WARNING;SOFT;1;WARNING - 10.196.255.9: rta 16.093ms, lost 66%
[Wed Jan 16 14:25:46 2013] SERVICE ALERT: ASNAY0S0004;s_switch_cisco_ping;OK;SOFT;2;OK - 10.196.255.9: rta 16.134ms, lost 0%
[Wed Jan 16 14:30:46 2013] SERVICE ALERT: ASNAY0S0004;s_switch_cisco_ping;CRITICAL;SOFT;1;CRITICAL - 10.196.255.9: rta nan, lost 100%
[Wed Jan 16 14:31:25 2013] SERVICE ALERT: ASNAY0S0004;s_switch_cisco_cpu;UNKNOWN;SOFT;1;ERROR: Description table : No response from remote host "10.196.255.9".
[Wed Jan 16 14:31:45 2013] SERVICE ALERT: ASNAY0S0004;s_switch_cisco_ping;OK;SOFT;2;OK - 10.196.255.9: rta 16.348ms, lost 0%
[Wed Jan 16 14:32:20 2013] SERVICE ALERT: ASNAY0S0004;s_switch_cisco_cpu;OK;SOFT;2;CPU : 12 11 11 : OK
[Wed Jan 16 14:56:22 2013] SERVICE ALERT: ASNAY0S0004;s_switch_cisco_ram;UNKNOWN;SOFT;1;ERROR: Description table : No response from remote host "10.196.255.9".
[Wed Jan 16 14:56:33 2013] HOST ALERT: ASNAY0S0004;DOWN;SOFT;1;(Host Check Timed Out)
[Wed Jan 16 14:56:52 2013] SERVICE ALERT: ASNAY0S0004;s_switch_cisco_ping;CRITICAL;HARD;1;CRITICAL - 10.196.255.9: rta nan, lost 100%
[Wed Jan 16 14:57:17 2013] SERVICE ALERT: ASNAY0S0004;s_switch_cisco_ram;OK;SOFT;2;Processor:60% : 60% : : OK
[Wed Jan 16 14:57:32 2013] HOST ALERT: ASNAY0S0004;DOWN;SOFT;2;(Host Check Timed Out)


Please note the line that was highlighted in red (the 14:56:52 s_switch_cisco_ping entry). The state is HARD, but no notification was sent. Why?
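
One way to look at an entry like this is to line up each HARD service alert with the most recently logged state of its host. The Python sketch below does that for three lines copied from the excerpt above. It assumes the log format shown in this post; `hard_alerts_with_host_state` is a hypothetical helper, not a Nagios tool. For context, the Nagios Core documentation on state types lists a non-OK service result while the host is DOWN or UNREACHABLE as a case where the service enters a HARD state directly, so the host state at that moment may be relevant. The sketch only shows the ordering of the entries and does not confirm the cause.

```python
import re

# Assumed line shapes:
#   [timestamp] HOST ALERT: host;state;type;attempt;output
#   [timestamp] SERVICE ALERT: host;service;state;type;attempt;output
LINE_RE = re.compile(r"^\[(?P<ts>[^\]]+)\] (?P<kind>HOST|SERVICE) ALERT: (?P<body>.*)$")


def hard_alerts_with_host_state(lines):
    """For each HARD service alert, report the last host state logged before it."""
    host_state = {}  # host name -> (state, state type) from the latest HOST ALERT
    findings = []
    for line in lines:
        m = LINE_RE.match(line.strip())
        if not m:
            continue
        fields = m.group("body").split(";")
        if m.group("kind") == "HOST":
            host, state, state_type = fields[0], fields[1], fields[2]
            host_state[host] = (state, state_type)
        else:
            host, service, state, state_type, attempt = fields[:5]
            if state_type == "HARD":
                findings.append(
                    (m.group("ts"), service, state, int(attempt), host_state.get(host))
                )
    return findings


excerpt = [
    "[Wed Jan 16 14:17:20 2013] HOST ALERT: ASNAY0S0004;UP;SOFT;3;OK - 10.196.255.9: rta 16.178ms, lost 0%",
    "[Wed Jan 16 14:56:33 2013] HOST ALERT: ASNAY0S0004;DOWN;SOFT;1;(Host Check Timed Out)",
    "[Wed Jan 16 14:56:52 2013] SERVICE ALERT: ASNAY0S0004;s_switch_cisco_ping;CRITICAL;HARD;1;CRITICAL - 10.196.255.9: rta nan, lost 100%",
]
for finding in hard_alerts_with_host_state(excerpt):
    print(finding)
```

For this excerpt the helper pairs the 14:56:52 HARD alert with the host's DOWN (SOFT) entry logged 19 seconds earlier.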

Re: About Service Check Scheduling

Posted: Wed Jan 16, 2013 11:41 am
by abrist
What notification options have you set on the host? Could you post the contents of the respective host's cfg? Are you receiving email notifications for other hosts/services?

Re: About Service Check Scheduling

Posted: Wed Jan 16, 2013 7:56 pm
by cornea
abrist wrote:What notification options have you set on the host? Could you post the contents of the respective host's cfg? Are you receiving email notifications for other hosts/services?
Yes. When the host is *really* DOWN, I do receive notifications.
I defined a service template for this check, and many services use the template. Most of the time it works well, but sometimes it behaves abnormally.