Nagios Support Forum

Posted: **Thu Mar 12, 2015 11:17 pm**

Hi,

In Configure Host, under the monitoring settings, I don't understand the meaning of "potential problem is first detected".

In normal server monitoring, we check the services every 2 minutes. This means that once a service reaches the threshold value, it triggers the alarm.

But how is a "potential problem" defined in the statement below?

When a potential problem is first detected ...
Re-check the host every 1 minutes up to 5 times before generating an alert.

Posted: **Fri Mar 13, 2015 9:34 am**

That setting lets you re-check a host/service at a shorter interval once a problem is detected, to avoid false positives. People usually check 5 times with a minute between checks to make sure the issue wasn't temporary. The interval in that setting determines how quickly the re-checks run. "Max Check Attempts" defines how many times to check before alerting, and you can set it to 1 to alert immediately.

Posted: **Sat Mar 14, 2015 1:07 am**

Hi tmcdonald,

I would like to confirm the setting with the example below.

I set the service to be checked every 3 minutes, with:

"When a potential problem is first detected ...
Re-check the host every 1 minutes up to 1 times before generating an alert."

Finally, if a problem is detected, will Nagios generate the alert after 3 minutes or after 4 minutes?

Posted: **Sun Mar 15, 2015 10:15 pm**

Here's how the interval and retry settings work in a scenario:

Check Interval: 2m
Retry Interval: 1m
Number of Retries: 5

1.01 Nagios checks service, service is OK, next check is 1.03, attempt 1/5
1.03 Nagios checks service, service is OK, next check is 1.05, attempt 1/5
1.03.30 service breaks somehow, Nagios does not know about it yet
1.05 Nagios checks service, detects thresholds have been triggered, SOFT state, next check 1.06, attempt 1/5
1.06 Nagios checks service, thresholds still triggered, SOFT state, next check 1.07, attempt 2/5
1.07 Nagios checks service, thresholds still triggered, SOFT state, next check 1.08, attempt 3/5
1.08 Nagios checks service, thresholds still triggered, SOFT state, next check 1.09, attempt 4/5
1.09 Nagios checks service, thresholds still triggered, HARD state, notifications sent, next check 1.11 (back to the normal 2m check interval), attempt 5/5

So the service enters a HARD state and starts sending notifications only once it has reached the configured number of check attempts.
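If it helps, the scenario above can be reproduced with a small Python sketch. This is not Nagios code, just a simplified model of the scheduling described in this post: the function name `simulate` and its parameters are made up for illustration, times are minutes after 1.00, and it assumes that checks return to the normal check interval once the HARD state is reached.

```python
def simulate(check_interval, retry_interval, max_check_attempts, results, start=0):
    """Simplified model of check scheduling (hypothetical helper, not Nagios itself).

    results: sequence of booleans, True = check OK, False = threshold triggered.
    Returns a list of (time_in_minutes, state, attempt) tuples, one per check.
    """
    timeline = []
    t = start
    attempt = 1
    for ok in results:
        if ok:
            # An OK result resets the attempt counter; schedule at the normal interval.
            attempt = 1
            timeline.append((t, "OK", attempt))
            t += check_interval
        else:
            hard = attempt >= max_check_attempts
            timeline.append((t, "HARD" if hard else "SOFT", attempt))
            if hard:
                # Assumption: once HARD, checks go back to the normal interval.
                t += check_interval
            else:
                # Still SOFT: re-check sooner, using the retry interval.
                t += retry_interval
                attempt += 1
    return timeline


# Check Interval 2m, Retry Interval 1m, Number of Retries 5, first check at 1.01
for minute, state, attempt in simulate(2, 1, 5, [True, True] + [False] * 5, start=1):
    print(f"1.{minute:02d}  {state:<4}  attempt {attempt}/5")
```

The first HARD line in the output is the point where notifications would go out; with the numbers above that is the check at 1.09.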

Posted: **Mon Mar 16, 2015 1:48 am**

So how can we disable the "retry interval"? We would like to get the alert as soon as the first error is triggered.
Some errors, e.g. system log messages, are triggered only once, or it takes a long time before the same error is generated a second time.

Posted: **Mon Mar 16, 2015 10:39 am**

From our documentation: http://nagios.sourceforge.net/docs/nagi ... tions.html

max_check_attempts: This directive is used to define the number of times that Nagios will retry the host check command if it returns any state other than an OK state. Setting this value to 1 will cause Nagios to generate an alert without retrying the host check. Note: If you do not want to check the status of the host, you must still set this to a minimum value of 1. To bypass the host check, just leave the check_command option blank.
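As a rough sanity check on that directive, the extra delay that re-checking adds can be estimated with a few lines of Python. This is only back-of-the-envelope arithmetic under the assumption that every re-check runs exactly one retry interval after the previous check; the helper name `minutes_until_alert` is hypothetical and not part of Nagios.

```python
def minutes_until_alert(retry_interval, max_check_attempts):
    """Minutes between the first failed check and the check that raises the alert.

    Assumes each re-check runs exactly retry_interval minutes after the last one.
    """
    if max_check_attempts < 1:
        raise ValueError("max_check_attempts must be at least 1")
    # The first failed check counts as attempt 1, so only the
    # remaining attempts add waiting time.
    return (max_check_attempts - 1) * retry_interval


print(minutes_until_alert(retry_interval=1, max_check_attempts=5))  # 4
print(minutes_until_alert(retry_interval=1, max_check_attempts=1))  # 0
```

With max_check_attempts set to 1 the result is 0, i.e. the alert is raised by the same check that first detects the problem, so the retry interval never comes into play.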

Posted: **Mon Mar 23, 2015 4:21 am**

Hi jolson,

Thanks for your answer.

Posted: **Mon Mar 23, 2015 9:32 am**

No problem - would it be alright if I locked this thread and marked it as resolved?

Posted: **Mon Mar 30, 2015 9:50 pm**

Sorry for the late reply. Sure, the case can be closed. Thank you for your help.

Meaning of "potential problem is first detected"
