Consolidating multiple check_nrpe tests for same host

mhhall3
Posts: 6
Joined: Tue Jun 05, 2018 1:22 pm

Post by mhhall3 »

I've been a long-time Nagios user in a mostly Linux environment.
I'm new to this forum, so pointers to any prior discussion of this matter would be appreciated.
In several recent cases, I've run into issues when running multiple "check_nrpe" tests against the same host.

In two of these cases, I'm seeing "Socket timeout" errors even after raising the timeout value (from 10s to 30s).

This frequently causes batches of error e-mails related to check_nrpe failures.
I'm wondering whether some kind of per-host "back-off" could be applied when a check_nrpe check fails (or times out).

Any ideas would be appreciated.

Mike
scottwilkerson
DevOps Engineer
Posts: 19396
Joined: Tue Nov 15, 2011 3:11 pm
Location: Nagios Enterprises

Re: Consolidating multiple check_nrpe tests for same host

Post by scottwilkerson »

There isn't a backoff; however, there are host/service dependencies, and if the host connection is going down then the services shouldn't notify.
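
As a rough sketch of the dependency idea, the fragment below makes one NRPE-based service depend on a basic agent check on the same host. The host name "linuxhost01" and the service descriptions "NRPE Agent" and "Disk Usage" are hypothetical placeholders, and it assumes both services are already defined elsewhere.

Code:

	# Hypothetical names; adjust to match your own host and service definitions.
	define servicedependency {
		# Master service: a simple check that the NRPE agent answers at all
		host_name			linuxhost01
		service_description		NRPE Agent
		# Dependent service: an NRPE-based check on the same host
		dependent_host_name		linuxhost01
		dependent_service_description	Disk Usage
		# Skip checks and notifications for the dependent service
		# while the master is WARNING, UNKNOWN or CRITICAL
		execution_failure_criteria	w,u,c
		notification_failure_criteria	w,u,c
	}

With something like this in place, a single alert for the agent check would be expected instead of one e-mail per check_nrpe service when the agent stops responding.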

Also, it is common to set up services with a configuration like this

Code:

	max_check_attempts		5
	check_interval			5
	retry_interval			1
which checks every 5 minutes; if there is a failure, it switches to checking every 1 minute until 5 consecutive checks have failed, and only then sends out an alert.

This will require 5 failures in a row before the notification goes out.
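
For context, here is a minimal sketch of where those three directives would sit in a full service definition. The host name, service description and the "check_disk" argument are hypothetical, and it assumes the "generic-service" template from the sample configuration and a "check_nrpe" command that takes the remote command name as $ARG1$.

Code:

	# Hypothetical example; names and the check_nrpe command definition are assumptions.
	define service {
		use				generic-service
		host_name			linuxhost01
		service_description		Disk Usage
		check_command			check_nrpe!check_disk
		# Number of consecutive failed checks required before notifying
		max_check_attempts		5
		# Minutes between checks while the service is OK
		check_interval			5
		# Minutes between re-checks after a failure
		retry_interval			1
	}

The interval values are in minutes only if interval_length is left at its default of 60 seconds.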
Former Nagios employee