Re: [Nagios-devel] Automatically acknowledge services of an

Post by Guest »

On 12/09/2010 03:01 AM, Mathieu Gagné wrote:
> On 12/8/10 5:08 PM, Julien Mathis wrote:
>> Moreover I think you should reconsider your plugins. Is it normal for a
>> plugin to return the CRITICAL status when it cannot connect? Wouldn't
>> it be more appropriate to use the UNKNOWN status?
> Which plugins are we talking about?
> For example, if I use "check_http" and the port isn't open for
> whatever reason (service has crashed, firewall, etc.), it is CRITICAL to
> me, not UNKNOWN. This is my business need. (but hey, to each his own)
[...]
> When the host is DOWN, service problems are silenced and NO
> notifications are sent, they are "muted". Why would you want to
> acknowledge a service problem if there aren't any notifications sent to
> contacts?

While I agree that automatic acknowledgments promise to create more
problems than they'll solve, I'd like to comment on *this* tangent.

Of course, any kind of service where reachability via the net is part
and parcel of "the service" *should* use the OK/WARNING/CRITICAL range
of states to report connectivity problems. However, there is also a
plethora of checks where the remote access only serves to centralize
the monitoring: CPU/RAM/disk usage, load, number of users, log scans,
hardware failures, you name it. In those cases, I *would* welcome
the possibility to map connectivity issues to UNKNOWN (or to some
service-level counterpart of the hosts' UNREACHABLE) instead.
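
To make that mapping concrete, here is a minimal sketch of a wrapper that
could sit between Nagios and a remote check. It is not how check_by_ssh
works; the names map_exit_status and run_remote_check are made up for
illustration, and it assumes the transport is plain ssh, which exits with
255 on its own errors. A remote command that itself exits with 255 would
be indistinguishable from a transport failure in this scheme.

```python
import subprocess

# Standard Nagios plugin return codes.
OK, WARNING, CRITICAL, UNKNOWN = 0, 1, 2, 3

# Assumption: the transport is plain ssh, which exits with 255 when the
# connection itself fails (as opposed to the remote command failing).
SSH_TRANSPORT_ERROR = 255


def map_exit_status(returncode, timed_out=False):
    """Translate a remote check's raw result into a Nagios state.

    Transport failures and timeouts become UNKNOWN; genuine plugin
    results (0, 1, 2) pass through; anything else is also UNKNOWN.
    """
    if timed_out or returncode == SSH_TRANSPORT_ERROR:
        return UNKNOWN
    if returncode in (OK, WARNING, CRITICAL):
        return returncode
    return UNKNOWN


def run_remote_check(argv, timeout_s=10):
    """Run a (hypothetical) remote check command; return (state, output)."""
    try:
        proc = subprocess.run(
            argv, capture_output=True, text=True, timeout=timeout_s
        )
    except subprocess.TimeoutExpired:
        return map_exit_status(0, timed_out=True), "remote check timed out"
    except OSError as exc:
        # The wrapper could not even start the command.
        return UNKNOWN, f"cannot run check: {exc}"
    return map_exit_status(proc.returncode), proc.stdout.strip()
```

A check whose whole point is reachability (check_http and friends) would
of course not go through such a wrapper and would keep reporting CRITICAL.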

My favorite remote connector is check_by_ssh / check_by_ssc (the latter
basically being a multihop "Matryoshka-of-tunnels" SSH). Some of the
hosts actually have check_ping as their host check, and for some of them
it would be outright *wrong* to change that to check_ssh (e.g., because
I'm also using check_http against that host). Of course I have a plain
"does SSH work" service defined on them and declare all services using
check_by_ssh as dependent on it. I even reduced the *_interval settings
and max_check_attempts of the SSH check to prioritize it. No dice, I
*still* get notifications for some of the dependent services before SSH
is declared CRITICAL.
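
The race can be illustrated with a deliberately simplified model. It
assumes a service reaches its hard state at the first failed check plus
(max_check_attempts - 1) retry intervals, and that the dependency only
suppresses notifications once the SSH service is in a hard CRITICAL state.
It ignores check latency and scheduler jitter, and all numbers are
invented for the example.

```python
def seconds_until_hard_state(first_failed_check_s, retry_interval_s,
                             max_check_attempts):
    """Seconds after the outage at which a service enters a hard state.

    Simplified model: the first failed check counts as attempt 1 and each
    further attempt happens one retry interval later.
    """
    return first_failed_check_s + (max_check_attempts - 1) * retry_interval_s


# Hypothetical timings, in seconds after SSH stops working.
# The SSH check has the tighter settings but is scheduled unluckily late.
ssh_hard_at = seconds_until_hard_state(
    first_failed_check_s=50, retry_interval_s=30, max_check_attempts=2)

# A dependent check_by_ssh service happens to be checked right after the outage.
dependent_hard_at = seconds_until_hard_state(
    first_failed_check_s=5, retry_interval_s=60, max_check_attempts=2)

# True means the dependent service goes hard, and may notify, first.
dependent_notifies_first = dependent_hard_at < ssh_hard_at
```

With these made-up values the dependent service goes hard at 65 seconds
and the SSH service only at 80, so shorter intervals on the SSH check
narrow the window but do not close it.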

Also, it's not *all* in the plugins. (It is in *most* cases, though,
and check_by_ssh falling back to the normal SSH_COMMAND, which isn't
aware of the needs of Nagios in the slightest, certainly doesn't help
*this* cause :-} ). In some cases, it's the Nagios core that times out
the check and reports a CRITICAL, e.g., the check_by_ssh timeout
doesn't apply to name resolution:

> # time ./check_by_ssh -H www.foobar.co.bj -C id -t 1
> check_by_ssh: Invalid hostname/address - www.foobar.co.bj
> real 0m5.132s
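
One way a check could keep name resolution under its own deadline is to
run the lookup in a worker thread and give up after a wall-clock limit.
This is only a sketch, not something check_by_ssh does; the function
check_resolution and its resolver parameter are hypothetical, and a stuck
lookup keeps running in the background until the resolver itself returns.

```python
import concurrent.futures
import socket

# Standard Nagios plugin return codes used here.
OK, UNKNOWN = 0, 3


def check_resolution(hostname, deadline_s, resolver=socket.getaddrinfo):
    """Resolve hostname within deadline_s seconds; return (state, message).

    Both a resolver error and a missed deadline are reported as UNKNOWN,
    since neither says anything about the service being monitored.
    """
    executor = concurrent.futures.ThreadPoolExecutor(max_workers=1)
    future = executor.submit(resolver, hostname, None)
    try:
        future.result(timeout=deadline_s)
    except concurrent.futures.TimeoutError:
        return UNKNOWN, f"name resolution for {hostname} exceeded {deadline_s}s"
    except socket.gaierror as exc:
        return UNKNOWN, f"cannot resolve {hostname}: {exc}"
    finally:
        # Do not block on a lookup that is still running.
        executor.shutdown(wait=False)
    return OK, f"{hostname} resolved"
```

The resolver argument exists only so that the deadline logic can be
exercised without touching real DNS.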

Kind regards,
J. Bern
-- 
Jochen Bern, Systemingenieur --- LINworks GmbH
Postfach 100121, 64201 Darmstadt | Robert-Koch-Str. 9, 64331 Weiterstadt
PGP (1024D/4096g) FP = D18B 41B1 16C0 11BA 7F8C DCF7 E1D5 FAF4 444E 1C27
Tel. +49 6151 9067-231, Zentr. -0, Fax -299 - Amtsg. Darmstadt HRB 85202
Unternehmenssitz Weiterstadt, Geschäftsführer Metin Dogan, Oliver Michel





This post was automatically imported from historical nagios-devel mailing list archives
Original poster: [email protected]