Re: [Nagios-devel] Automatically acknowledge services of an

Guest · Post by **Guest** » Thu Dec 09, 2010 9:53 am

On 12/09/2010 03:01 AM, Mathieu Gagné wrote:
> On 12/8/10 5:08 PM, Julien Mathis wrote:
>> Moreover I think you should reconsider your plugins. Is it normal for a
>> plugin to returns the CRITICAL status when it can not connect? Wouldn't
>> it be more appropriate with the UNKNOWN status?
> Which plugins are we talking about?
> For example, if I use "check_http" and the port isn't opened for
> whatever reason (service is crashed, firewall, etc.), it is CRITICAL to
> me, not UNKNOWN. This is my business need. (but hey, to each his own)
[...]
> When the host is DOWN, service problems are silenced and NO
> notifications are sent, they are "muted". Why would you want to
> acknowledge a service problem if there isn't any notifications sent to
> contacts?

While I agree that automatic acknowledgments promise to create more
problems than they'll solve, I'd like to comment on *this* tangent.

Of course, any kind of service where reachability via the net is part
and parcel of "the service" *should* use the OK/WARNING/CRITICAL range
of states to report connectivity problems. However, there also is a
plethora of checks where the remote access only satisfies the need of
centralizing the monitoring - CPU/RAM/disk usage, load, # of users, log
scans, hardware failures, you name it. In those cases, I *would* welcome
the possibility to map connectivity issues to UNKNOWN (or some
service-kin of hosts' UNREACHABLE) instead.
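To illustrate the kind of mapping I mean, here is a minimal sketch in Python, not an existing plugin. It assumes that "the transport is up" can be approximated by a plain TCP connect to the SSH port; the names `tcp_reachable` and `run_check` are made up for this example. If the probe fails, the wrapper reports UNKNOWN without running the check at all, otherwise it passes the plugin's own exit code and output through unchanged.

```python
import socket
import subprocess

OK, WARNING, CRITICAL, UNKNOWN = 0, 1, 2, 3  # standard Nagios plugin exit codes


def tcp_reachable(host, port, timeout=5.0):
    """Return True if a TCP connection to host:port opens within the timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:  # refused, timed out, unresolvable name, ...
        return False


def run_check(host, plugin_argv, port=22, timeout=5.0):
    """Return (exit_code, output); UNKNOWN if the transport is unreachable."""
    if not tcp_reachable(host, port, timeout):
        return UNKNOWN, "UNKNOWN - cannot reach %s:%d" % (host, port)
    # Transport looks fine: run the real plugin and pass its verdict through.
    proc = subprocess.run(plugin_argv, stdout=subprocess.PIPE,
                          stderr=subprocess.STDOUT, universal_newlines=True)
    return proc.returncode, proc.stdout.strip()
```

A real wrapper script would print the returned output and exit with the returned code. The probe only narrows the window, of course: the connection can still die between the probe and the actual check.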

My favorite remote connector is check_by_ssh / check_by_ssc (the latter
basically being a multihop "Matryoshka-of-tunnels" SSH). Some of the
hosts actually have check_ping as their host check, for some of them it
would be outright *wrong* to change that to check_ssh (e.g., because I'm
also using check_http against that host). Of course I have a plain "does
SSH work" service defined on them and declare all services using
check_by_ssh as dependent on it. I even reduced the *_intervals and
max_check_attempts of the SSH check to prioritize it. No dice, I *still*
get notifications for some of the dependent services before SSH is
declared CRITICAL.

Also, it's not *all* in the plugins. (It is in *most* cases, though -
and check_by_ssh falling back to the normal SSH_COMMAND, which isn't
aware of the needs of Nagios in the slightest, certainly doesn't help
*this* cause :-} ). In some cases, it's the Nagios core that times out
the check and provides a CRITICAL - e.g., the check_by_ssh timeout
doesn't apply to name resolution:

> # time ./check_by_ssh -H www.foobar.co.bj -C id -t 1
> check_by_ssh: Invalid hostname/address - www.foobar.co.bj
> real 0m5.132s

Kind regards,
J. Bern
-- 
Jochen Bern, Systemingenieur --- LINworks GmbH
Postfach 100121, 64201 Darmstadt | Robert-Koch-Str. 9, 64331 Weiterstadt
PGP (1024D/4096g) FP = D18B 41B1 16C0 11BA 7F8C DCF7 E1D5 FAF4 444E 1C27
Tel. +49 6151 9067-231, Zentr. -0, Fax -299 - Amtsg. Darmstadt HRB 85202
Unternehmenssitz Weiterstadt, Geschäftsführer Metin Dogan, Oliver Michel

This post was automatically imported from historical nagios-devel mailing list archives
Original poster: [email protected]