About Service Check Scheduling
The Retry Check Interval is 1m, but the second check only ran after 18m. Why?
Has anyone else run into this problem?
Re: About Service Check Scheduling
The "alert summary" report will only show the history of alerts and state changes. So this particular service took 18 minutes to move from a soft critical to a soft OK. Was the service check flapping?
Former Nagios employee
"It is turtles. All. The. Way. Down. . . .and maybe an elephant or two."
VI VI VI - The editor of the Beast!
Come to the Dark Side.
Re: About Service Check Scheduling
abrist wrote: The "alert summary" report will only show the history of alerts and state changes. So this particular service took 18 minutes to move from a soft critical to a soft OK. Was the service check flapping?
I didn't find any notification about flapping.
Re: About Service Check Scheduling
cornea wrote: The Retry Check Interval is 1m, but the second check only ran after 18m. Why?
Has anyone else run into this problem?
Further information: in another case, the second check only ran after about 4 hours.
Re: About Service Check Scheduling
Post the output of:
Code:
cat /usr/local/nagios/var/nagios.log | grep [hostname of box]
Re: About Service Check Scheduling
abrist wrote: Post the output of:
Code:
cat /usr/local/nagios/var/nagios.log | grep [hostname of box]
I checked the log. It shows the same thing as the graph.
But I found that some "nagios" processes have a start time from before my restart. Is it possible that some processes are using the old configuration while others are using the new one?
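One way to check for leftover processes is to compare each process start time with the time of the restart. The following is only a minimal sketch in Python: it assumes Linux `ps` output in the `pid,lstart,args` format with English day and month names, and the sample output, the PIDs and the restart time are made up for illustration.

```python
from datetime import datetime

def parse_ps_lstart(ps_output):
    """Parse `ps -C nagios -o pid,lstart,args` style text into (pid, start, args) tuples."""
    rows = []
    for line in ps_output.splitlines():
        # pid, then five lstart tokens (weekday, month, day, time, year), then the command line
        parts = line.split(None, 6)
        if len(parts) < 7 or not parts[0].isdigit():
            continue  # skip the header line and anything malformed
        start = datetime.strptime(" ".join(parts[1:6]), "%a %b %d %H:%M:%S %Y")
        rows.append((int(parts[0]), start, parts[6]))
    return rows

def started_before(rows, restart_time):
    """Return the processes whose start time is earlier than the restart."""
    return [row for row in rows if row[1] < restart_time]

# Hypothetical sample output, not taken from the system discussed in this thread.
SAMPLE_PS = """\
  PID                  STARTED COMMAND
 2101 Tue Jan 15 09:12:44 2013 /usr/local/nagios/bin/nagios -d /usr/local/nagios/etc/nagios.cfg
 8830 Wed Jan 16 13:50:02 2013 /usr/local/nagios/bin/nagios -d /usr/local/nagios/etc/nagios.cfg
"""
RESTART = datetime(2013, 1, 16, 13, 50, 0)  # assumed restart time

for pid, start, args in started_before(parse_ps_lstart(SAMPLE_PS), RESTART):
    print(f"PID {pid} started {start}, before the restart: {args}")
```

A daemon process that predates the restart could indicate that an old instance is still running alongside the new one, which would be worth ruling out before looking at the scheduling settings.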
slansing
Re: About Service Check Scheduling
Can you post the output that abrist suggested so that we can see all of the most recent logged information?
Re: About Service Check Scheduling
[Wed Jan 16 14:15:28 2013] HOST ALERT: ASNAY0S0004;DOWN;SOFT;1;(Host Check Timed Out)
[Wed Jan 16 14:16:03 2013] SERVICE ALERT: ASNAY0S0004;s_switch_cisco_ram;OK;SOFT;2;Processor:60% : 60% : : OK
[Wed Jan 16 14:16:13 2013] HOST ALERT: ASNAY0S0004;DOWN;SOFT;2;(Host Check Timed Out)
[Wed Jan 16 14:17:20 2013] HOST ALERT: ASNAY0S0004;UP;SOFT;3;OK - 10.196.255.9: rta 16.178ms, lost 0%
[Wed Jan 16 14:24:45 2013] SERVICE ALERT: ASNAY0S0004;s_switch_cisco_ping;WARNING;SOFT;1;WARNING - 10.196.255.9: rta 16.093ms, lost 66%
[Wed Jan 16 14:25:46 2013] SERVICE ALERT: ASNAY0S0004;s_switch_cisco_ping;OK;SOFT;2;OK - 10.196.255.9: rta 16.134ms, lost 0%
[Wed Jan 16 14:30:46 2013] SERVICE ALERT: ASNAY0S0004;s_switch_cisco_ping;CRITICAL;SOFT;1;CRITICAL - 10.196.255.9: rta nan, lost 100%
[Wed Jan 16 14:31:25 2013] SERVICE ALERT: ASNAY0S0004;s_switch_cisco_cpu;UNKNOWN;SOFT;1;ERROR: Description table : No response from remote host "10.196.255.9".
[Wed Jan 16 14:31:45 2013] SERVICE ALERT: ASNAY0S0004;s_switch_cisco_ping;OK;SOFT;2;OK - 10.196.255.9: rta 16.348ms, lost 0%
[Wed Jan 16 14:32:20 2013] SERVICE ALERT: ASNAY0S0004;s_switch_cisco_cpu;OK;SOFT;2;CPU : 12 11 11 : OK
[Wed Jan 16 14:56:22 2013] SERVICE ALERT: ASNAY0S0004;s_switch_cisco_ram;UNKNOWN;SOFT;1;ERROR: Description table : No response from remote host "10.196.255.9".
[Wed Jan 16 14:56:33 2013] HOST ALERT: ASNAY0S0004;DOWN;SOFT;1;(Host Check Timed Out)
[Wed Jan 16 14:56:52 2013] SERVICE ALERT: ASNAY0S0004;s_switch_cisco_ping;CRITICAL;HARD;1;CRITICAL - 10.196.255.9: rta nan, lost 100%
[Wed Jan 16 14:57:17 2013] SERVICE ALERT: ASNAY0S0004;s_switch_cisco_ram;OK;SOFT;2;Processor:60% : 60% : : OK
[Wed Jan 16 14:57:32 2013] HOST ALERT: ASNAY0S0004;DOWN;SOFT;2;(Host Check Timed Out)
Please notice the line marked in red, the s_switch_cisco_ping entry at 14:56:52 (CRITICAL;HARD;1). The state is HARD, but no notification was sent. Why?
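To see how far apart the logged alerts for each service really are, the SERVICE ALERT lines can be parsed and the gaps computed. This is a rough sketch in Python, assuming the log line format shown above; the function names are illustrative and the sample holds only the three s_switch_cisco_ping lines from the excerpt.

```python
import re
from collections import defaultdict
from datetime import datetime

# Matches lines such as:
# [Wed Jan 16 14:30:46 2013] SERVICE ALERT: host;service;STATE;SOFT|HARD;attempt;plugin output
SERVICE_ALERT = re.compile(
    r"^\[(?P<ts>[^\]]+)\] SERVICE ALERT: "
    r"(?P<host>[^;]+);(?P<service>[^;]+);(?P<state>[^;]+);"
    r"(?P<state_type>SOFT|HARD);(?P<attempt>\d+);(?P<output>.*)$"
)

def parse_service_alerts(lines):
    """Yield (time, host, service, state, state_type, attempt) for each SERVICE ALERT line."""
    for line in lines:
        m = SERVICE_ALERT.match(line.strip())
        if m:
            when = datetime.strptime(m["ts"], "%a %b %d %H:%M:%S %Y")
            yield (when, m["host"], m["service"], m["state"],
                   m["state_type"], int(m["attempt"]))

def gaps_per_service(lines):
    """Map (host, service) to a list of (seconds since previous alert, state, state_type, attempt)."""
    last_seen = {}
    gaps = defaultdict(list)
    for when, host, service, state, state_type, attempt in parse_service_alerts(lines):
        key = (host, service)
        if key in last_seen:
            delta = int((when - last_seen[key]).total_seconds())
            gaps[key].append((delta, state, state_type, attempt))
        last_seen[key] = when
    return dict(gaps)

sample = [
    "[Wed Jan 16 14:30:46 2013] SERVICE ALERT: ASNAY0S0004;s_switch_cisco_ping;CRITICAL;SOFT;1;CRITICAL - 10.196.255.9: rta nan, lost 100%",
    "[Wed Jan 16 14:31:45 2013] SERVICE ALERT: ASNAY0S0004;s_switch_cisco_ping;OK;SOFT;2;OK - 10.196.255.9: rta 16.348ms, lost 0%",
    "[Wed Jan 16 14:56:52 2013] SERVICE ALERT: ASNAY0S0004;s_switch_cisco_ping;CRITICAL;HARD;1;CRITICAL - 10.196.255.9: rta nan, lost 100%",
]

for (host, service), entries in gaps_per_service(sample).items():
    for delta, state, state_type, attempt in entries:
        print(f"{host} {service}: {delta}s later -> {state};{state_type};{attempt}")
```

For these three lines the gaps are 59 seconds and 1507 seconds. Keep in mind that the log records alerts, not every check, so a long gap between entries does not by itself show that no check ran in between.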
Re: About Service Check Scheduling
What notification options have you set on the host? Could you post the contents of the respective host's cfg file? Are you receiving email notifications for other hosts/services?
Re: About Service Check Scheduling
abrist wrote: What notification options have you set on the host? Could you post the contents of the respective host's cfg file? Are you receiving email notifications for other hosts/services?
Yes. When the host is *really* DOWN, I do receive notifications.
I defined a service template for this check, and many services use that template. Most of the time it works well, but sometimes it behaves abnormally.
Last edited by cornea on Wed Jan 16, 2013 8:01 pm, edited 1 time in total.