Re: [Nagios-devel] [PATCH]
Posted: Tue Jan 04, 2011 7:59 am
On 01/04/2011 04:43 PM, Thomas Guyot-Sionnest wrote:
> On 11-01-03 10:37 PM, Stephane LAPIE wrote:
>> Hello list,
>
>> I apologize in advance should this topic have already been raised in the
>> past.
>
>> We make fairly intensive use of Nagios at our company (around 1700
>> machines, for 26000 services), using a cluster of OpenBSD machines.
>
>> We do distribution using NSCA (a re-made Ruby implementation of the
>> server), and external handler programs to offload sending the packets
>> (which leaves to Nagios the sole task of writing results to a named pipe).
>
>> While tuning my configuration and creating several service groups
>> (simply for display purposes), I stumbled upon several problems :
>
>> 1) An actual bug : Beyond a certain number of members, Nagios simply
>> fumbles at handling service checks for affected services within its
>> child processes, and then reports the failure with a very misleading
>> error message : "Warning : Return code 127 was out of bounds. Make sure
>> the plugin you're trying to run actually exists". (when the EXACT same
>> configuration, minus service groups, works perfectly fine)
>
>> I haven't pinpointed the final cause for this one, and I think I have
>> simply found a triggering case, but this seems to hint at a deeper
>> problem in the check handling. (Additionally, the message associated
>> with code 127 should be made more accurate, as I spent several days
>> figuring if any combination of funny PATH environment variables and such
>> could prevent the execution of my scripts)
>
>> As a temporary fix for my setup, I removed the related servicegroups
>> entries, and I am running fine for now, but I am hoping this will be
>> fixed in a future version, as this is really more than just a small
>> annoyance.
> [...]
>> Further about the aforementioned bug :
>
>> I somehow have a value at which (and probably beyond which) the bug can
>> be reproduced (but it does not seem to be the direct cause). The
>> "symptoms" can be tracked down to MACRO_SERVICEGROUPMEMBERS generating a
>> 338084 bytes string (35 services, assigned to 294 machines via templates).
>
> I believe this bug might have to do with the actual command line length
> passed to popen. Is it possible somehow this macro ends up on the
> command line?
In my setup, this specific macro is never used by the affected command
objects (the ones Nagios fails to execute).
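
One way to check the command-line-length hypothesis quoted above is to compare the macro's size against the system's exec limit, and to confirm what the shell reports when it cannot run a command. The following is only a minimal sketch, not Nagios code: the helper names are hypothetical, and the 256 KiB limit in the usage note is an assumed value for illustration. It relies on two general POSIX facts: popen() runs its command through /bin/sh -c, and the exec limit (ARG_MAX) covers the arguments and the environment combined.

```python
import os
import subprocess


def fits_in_exec_limit(payload_bytes, arg_max=None):
    """Return True if a payload of this size is below the exec size limit.

    arg_max defaults to the running system's ARG_MAX; pass an explicit
    value to reason about another host (hypothetical helper).
    """
    if arg_max is None:
        arg_max = os.sysconf("SC_ARG_MAX")
    return payload_bytes < arg_max


def shell_exit_status(command):
    """Run a command through /bin/sh -c, as popen() does, and return its exit status."""
    result = subprocess.run(
        ["/bin/sh", "-c", command],
        stdout=subprocess.DEVNULL,
        stderr=subprocess.DEVNULL,
    )
    return result.returncode
```

For example, fits_in_exec_limit(338084, arg_max=262144) tells whether the 338084-byte string would fit under an assumed 256 KiB limit, and shell_exit_status("/nonexistent/plugin") shows the status the shell itself returns for a command it cannot find. This does not establish the cause of the bug; it only helps rule the size limit in or out, for instance if the macro reaches the child process through the environment rather than the command line.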
-- 
Stephane LAPIE, EPITA SRS, Promo 2005
"Even when they have digital readouts, I can't understand them."
--MegaTokyo
This post was automatically imported from historical nagios-devel mailing list archives
Original poster: [email protected]