Re: [Nagios-devel] [PATCH]
Posted: Tue Jan 04, 2011 10:25 am
On 01/04/2011 06:38 PM, Andreas Ericsson wrote:
> http://www.op5.org/community/plugin-inv ... cts/merlin
> http://git.op5.org/git/?p=nagios/merl ... TO;hb=master
> http://git.op5.org/git/?p=nagios/merl ... DME;hb=HEAD
>
> Make especially sure you read the first paragraph of the README.
Oh, I see.
I have been working with Nagios 3.2.0 since around Nov 2009 for our
monitoring setup, so I went and implemented my own thing using ssh keys,
a control shell script with specific commands, tuned configurations to
limit duplication of files, and enhancements to NSCA client/server to
make them workable, and such.
I never tried to touch the DB side of things with a vanilla Nagios base,
because it wouldn't be proper to handle that as a side-hack, and would
most likely kill any measure of performance.
I'll have to give this a look.
Thanks a lot.
> Disable environment macros instead. If you're not using that macro on
> the command-line, your checks will continue to work. It's not a bug in
> Nagios, as such, it's just that environment variables and command line
> shares memory space, and that space is limited. For your 300k+ list of
> servicegroup members, you exhaust that space very quickly, and check
> execution fails.
Oh, so THIS is why in most cases the script would not even be executed.
I would have expected the error to be more straightforward, or have a
hint pointing to it.
Anyhow, thanks for the explanation, it now makes perfect sense, I should
have realized environment space was not unlimited. I had never stumbled
upon a case where I used up all of the space provided for ENV before.
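For anyone who wants to see this failure mode in isolation, here is a minimal sketch, not Nagios code. It assumes a POSIX system with /bin/sh; the helper name exec_with_env_payload and the variable name NAGIOS_DEMO are made up for illustration. It forks a child that tries to exec a shell with one oversized environment variable and reports whether execve() refused with E2BIG, which would explain why the check script never gets to run. The exact limit depends on the operating system, so the sketch only contrasts a tiny payload with a clearly excessive one. As far as I know, the setting Andreas refers to is enable_environment_macros=0 in nagios.cfg.

```c
#include <assert.h>
#include <errno.h>
#include <stdlib.h>
#include <string.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical helper (not Nagios code): fork a child that tries to exec
 * /bin/sh with a single environment variable of about payload_len bytes.
 * Returns 0 if the exec succeeded, E2BIG if the kernel refused it because
 * arguments plus environment were too large, and -1 on any other failure. */
int exec_with_env_payload(size_t payload_len)
{
	static const char prefix[] = "NAGIOS_DEMO=";   /* made-up variable name */
	size_t prefix_len = sizeof(prefix) - 1;
	char *entry = malloc(prefix_len + payload_len + 1);

	if (entry == NULL)
		return -1;

	/* Build "NAGIOS_DEMO=xxxx...": stands in for a huge macro value. */
	memcpy(entry, prefix, prefix_len);
	memset(entry + prefix_len, 'x', payload_len);
	entry[prefix_len + payload_len] = '\0';
	assert(strlen(entry) == prefix_len + payload_len);

	pid_t pid = fork();
	if (pid < 0) {
		free(entry);
		return -1;
	}
	if (pid == 0) {
		char *argv[] = { "sh", "-c", "exit 0", NULL };
		char *envp[] = { entry, NULL };

		execve("/bin/sh", argv, envp);
		/* Only reached when execve() itself failed. */
		_exit(errno == E2BIG ? 100 : 101);
	}

	free(entry);

	int status = 0;
	if (waitpid(pid, &status, 0) < 0 || !WIFEXITED(status))
		return -1;

	switch (WEXITSTATUS(status)) {
	case 0:
		return 0;       /* the shell ran and exited normally */
	case 100:
		return E2BIG;   /* the environment was too large for exec */
	default:
		return -1;
	}
}
```

Calling exec_with_env_payload() with a few bytes should return 0, while a payload of several megabytes should return E2BIG on common systems. Note that the child never starts in the second case, so the script itself has no chance to log anything.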
>> 2) A performance problem : The MACRO_SERVICEGROUPMEMBERS code is
>> painfully slow and extremely costly in CPU performance. The attached
>> patch file is my attempt at fixing the most obvious issues :
>> - Repetitive malloc/realloc (I initially caught on this by ktrace-ing
>> the processes and realizing Nagios was mapping/unmapping a lot of memory).
>> - Repetitive string duplications and length calculations
>>
>> The above code has been tested for a few hours on a busy Nagios setup
>> and performs much faster, as expected. (Reduction of several thousands
>> of malloc/realloc calls to 1, by initially calculating the memory size to
>> be allocated, thus avoiding unneeded system calls and memory areas
>> duplication)
>>
>
> Nice patch. I'll apply it tomorrow when it's my Nagios day. Any chance
> you could whip up something similar for HOSTGROUPMEMBERS until then?
Sure, please check out the attached file. It works on the same principle
as my previous patch, which means that short of the sprintf() arguments,
it's nearly a copy/paste. I ran it through my configuration for a test
run for an hour or so, and it seems to be doing fine so far.
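To illustrate the principle outside of the Nagios code base, here is a minimal sketch of the "measure first, allocate once" approach. It is not the code from the attached patch: the function name join_members is hypothetical, it takes a plain array of strings rather than the hostsmember list, it assumes a comma as the separator, and it copies with memcpy() where the patch uses sprintf().

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper, not the code from the attached patch: join n strings
 * with commas using exactly one allocation. The caller frees the result. */
char *join_members(const char *const *names, size_t n)
{
	size_t total = 1;   /* room for the terminating NUL */
	size_t i;

	/* Pass 1: add up the final size, including one separator between items. */
	for (i = 0; i < n; i++)
		total += strlen(names[i]) + (i > 0 ? 1 : 0);

	char *out = malloc(total);
	if (out == NULL)
		return NULL;

	/* Pass 2: copy each name at a running offset. There is no realloc and
	 * no rescanning of the buffer built so far. */
	char *p = out;
	for (i = 0; i < n; i++) {
		size_t len = strlen(names[i]);

		if (i > 0)
			*p++ = ',';
		memcpy(p, names[i], len);
		p += len;
	}
	*p = '\0';

	/* Sanity check: we wrote exactly what pass 1 measured. */
	assert((size_t)(p - out) == total - 1);
	return out;
}
```

The list is walked twice, but that is cheap compared with growing the buffer on every member, which can copy the whole string again each time it is resized.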
Again, thanks a lot for your time.
-- 
Stephane LAPIE, EPITA SRS, Promo 2005
"Even when they have digital readouts, I can't understand them."
--MegaTokyo
Attachment: patch-common-macros.c
--- common/macros.c.orig 2010-09-22 00:05:31.000000000 +0900
+++ common/macros.c 2011-01-04 18:55:30.850377775 +0900
@@ -1874,6 +1874,8 @@
 int grab_standard_hostgroup_macro(int macro_type, hostgroup *temp_hostgroup, char **output){
 	hostsmember *temp_hostsmember=NULL;
 	char *temp_buffer=NULL;
+	unsigned int temp_len=0;
+	unsigned int init_len=0;
 
 	if(temp_hostgroup==NULL || output==NULL)
 		return ERROR;
@@ -1888,16 +1890,42 @@
 			*output=(char *)strdup(temp_hostgroup->alias);
 			break;
 		case MACRO_HOSTGROUPMEM
...[email truncated]...
This post was automatically imported from historical nagios-devel mailing list archives
Original poster: [email protected]