[Nagios-devel] Re: Nagios-devel digest, Vol 1 #807 - 8 msgs
Posted: Mon May 09, 2005 10:34 am
hey,
On Mon, May 09, 2005 at 08:47:42AM -0700, [email protected] wrote:
> From: Andreas Ericsson
> Zero overhead is just not going to happen. Nagios MUST be able to
> execute checks in parallel. It can't do that if it just enters a
> function instead without forking, threading or multiplexing (actually it
> can't do that without forking or threading, but popen() forks, so to
> multiplex the results from it would be a sort of mix of both worlds), as
> that would imply a serialized execution.
you have a point that there's going to need to be some kind of fork
or multi-threading capabilities. but calling a function in a forked
process or thread would still be much better performance-wise than the
multiple fork and exec calls in the current implementation.
> It would require a huge re-design of current arch. It would also require
> a huge re-design of most plugins, since they don't clean up after
> themselves as it is today. They also use very shoddy function-calls. Not
that wouldn't be as much of a "redesign" as it would be a code-cleanup,
which is never a bad thing to do anyway. plus, what i'm suggesting
isn't an all-or-nothing switchover, but a conditional switch. plugins
could be audited for poor memory management etc. and, as they are
approved, added to the list of plugins built as shared objects.
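One way to picture the conditional switch is a lookup table of audited plugins: anything listed is called in-process, anything else keeps going through the existing fork/exec path. This is only a sketch under assumptions. The names `plugin_function`, `approved_plugin`, `lookup_plugin` and `check_ok` are hypothetical and not part of Nagios, and a static table stands in here for whatever the shared object loading would actually look like.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical plugin entry point returning a plugin-style status code. */
typedef int (*plugin_function)(int argc, char **argv);

/* Hypothetical audited plugin, standing in for real check code. */
int check_ok(int argc, char **argv)
{
    (void)argc;
    (void)argv;
    return 0;
}

struct approved_plugin {
    const char *name;
    plugin_function fn;
};

/* Only plugins that have passed the audit are listed here. */
static const struct approved_plugin approved[] = {
    { "check_ok", check_ok },
};

/* Return the in-process entry point for an approved plugin, or NULL
 * when the caller should fall back to running the external command. */
plugin_function lookup_plugin(const char *name)
{
    assert(name != NULL);
    for (size_t i = 0; i < sizeof approved / sizeof approved[0]; i++)
        if (strcmp(approved[i].name, name) == 0)
            return approved[i].fn;
    return NULL;
}
```

A NULL result means "not audited yet", so unlisted plugins behave exactly as they do today.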
> to mention; plugins that crash would cause nagios to crash. This just
> isn't good enough.
even forked children?
> > systems cache frequently accessed pages in memory, but there's still
> > unavoidable overhead in creating a new process, as well as the
> > context switching between the various processes.
>
> This would still be unavoidable, so point is still moot (see above on
> parallelism).
well, somewhat moot. see below:
> Three fork()'s and two execve()'s, as nagios itself forks once prior to
> running popen(). execve() replaces the running process, so there's no
that's the count that i got:
- nagios forks
- nagios child calls popen
- popen forks
- popen child calls execve(/bin/sh)
- /bin/sh forks
- /bin/sh child calls execve(cmd)
- /bin/sh child (now cmd) exits with status
and i'm suggesting
- nagios forks
- nagios child calls plugin_function
- nagios child exits with the return status of plugin_function
note that if this were in a multi-threaded arch, or if the child
processes were pre-allocated, even this fork would have a negligible
effect.
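The three-step sequence suggested above can be sketched roughly as follows. This is a minimal illustration, not Nagios code: the `plugin_function` signature, `run_check_in_child`, `check_dummy` and `check_crash` are all assumed names. It also shows why a crash inside a forked child need not take the parent down, since the parent only sees an abnormal wait status.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical plugin entry point: returns a plugin-style status code
 * (0 = OK, 1 = WARNING, 2 = CRITICAL, 3 = UNKNOWN). */
typedef int (*plugin_function)(int argc, char **argv);

/* Stand-in plugin used only for illustration. */
int check_dummy(int argc, char **argv)
{
    (void)argv;
    return argc > 1 ? 2 : 0;
}

/* Stand-in for a badly behaved plugin that dies abnormally. */
int check_crash(int argc, char **argv)
{
    (void)argc;
    (void)argv;
    abort();
}

/* Fork once, call the plugin function in the child, and hand its return
 * value back to the parent as the exit status. Returns -1 if fork or
 * wait fails, or if the child did not exit normally (e.g. it crashed). */
int run_check_in_child(plugin_function fn, int argc, char **argv)
{
    assert(fn != NULL);

    pid_t pid = fork();
    if (pid < 0)
        return -1;
    if (pid == 0)
        _exit(fn(argc, argv));   /* child: no exec, just call and exit */

    int status;
    if (waitpid(pid, &status, 0) < 0)
        return -1;
    return WIFEXITED(status) ? WEXITSTATUS(status) : -1;
}
```

Calling `run_check_in_child(check_crash, 0, NULL)` returns -1 in the parent instead of killing it. A real scheduler would not block in `waitpid()` per check; that part is simplified here.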
> running popen(). execve() replaces the running process, so there's no
> context-switching. It would be possible to get rid of one of the
assuming that one fork() isn't avoidable, you still have three processes
between which you have to switch in the popen approach (nagios child,
popen child, /bin/sh child).
> Arguments can contain whitespace if escaped or enclosed in strings. Do
> you feel like writing a function that does that and that's fast enough
> to run as often as is required, while still being rock-solid safe? The
> functions that does this in glibc and bash are asm-enhanced and
> finetuned per architecture they're run at. You'd increase load
> drastically, not reduce it.
okay, so a little trickier than splitting on whitespace. however, i
don't see where your concerns about speed/efficiency are coming from.
why would we need to do this every time the command is executed? why
not parse the cmd into arguments when the command is first read in from
the conffile? plus, if we did that regardless of this dlopen suggestion,
we could also cut out the popen call and just do fork/exec/dup on the
actual command using the same argument list.
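As a rough sketch of that idea: split the command line into an argument vector once, when the config is read, then reuse that vector for a direct fork/exec with stdout redirected through a pipe. Everything here is an assumption for illustration. `split_command`, `run_command` and `MAX_ARGS` are made-up names, the quoting rules are a simplified subset of what a shell does (no variable expansion, globbing or redirection), and `execvp()` is used for brevity.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

#define MAX_ARGS 64   /* arbitrary limit for this sketch */

/* Split buf in place into argv, honouring backslash escapes and single
 * or double quotes. Returns argc, or -1 on an unterminated quote or too
 * many arguments. Meant to run once per command at config-read time. */
int split_command(char *buf, char **argv, int max_args)
{
    assert(buf != NULL && argv != NULL);
    int argc = 0;
    char *r = buf, *w = buf;   /* read and write cursors, w <= r */

    while (*r) {
        while (*r == ' ' || *r == '\t')
            r++;
        if (!*r)
            break;
        if (argc >= max_args - 1)
            return -1;
        argv[argc++] = w;
        char quote = 0;
        while (*r && (quote || (*r != ' ' && *r != '\t'))) {
            if (quote && *r == quote) {
                quote = 0;
                r++;
            } else if (!quote && (*r == '"' || *r == '\'')) {
                quote = *r++;
            } else if (*r == '\\' && quote != '\'' && r[1]) {
                r++;               /* drop the backslash, keep next char */
                *w++ = *r++;
            } else {
                *w++ = *r++;
            }
        }
        if (quote)
            return -1;
        if (*r)
            r++;                   /* skip the delimiter */
        *w++ = '\0';
    }
    argv[argc] = NULL;
    return argc;
}

/* Run argv[0] directly, with no /bin/sh in between, capturing up to
 * outlen - 1 bytes of its stdout. Returns the exit status or -1. */
int run_command(char *const argv[], char *out, size_t outlen)
{
    int fds[2];
    if (outlen == 0 || pipe(fds) < 0)
        return -1;

    pid_t pid = fork();
    if (pid < 0) {
        close(fds[0]);
        close(fds[1]);
        return -1;
    }
    if (pid == 0) {
        close(fds[0]);
        dup2(fds[1], STDOUT_FILENO);
        close(fds[1]);
        execvp(argv[0], argv);
        _exit(127);                /* only reached if exec failed */
    }

    close(fds[1]);
    size_t used = 0;
    ssize_t n;
    while (used < outlen - 1 &&
           (n = read(fds[0], out + used, outlen - 1 - used)) > 0)
        used += (size_t)n;
    out[used] = '\0';
    close(fds[0]);

    int status;
    if (waitpid(pid, &status, 0) < 0)
        return -1;
    return WIFEXITED(status) ? WEXITSTATUS(status) : -1;
}
```

With this shape the parsing cost is paid once per command definition rather than once per check, and each execution costs one fork and one exec instead of the popen chain listed earlier.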
> A way around this would be to rewrite the plugins more or less from
> scratch, and possibly make them simpler as well, while tagging them for
> nagios to KNOW which ones are expected to have modules installed. For
> instance, the check_command could look
...[email truncated]...
This post was automatically imported from historical nagios-devel mailing list archives
Original poster: [email protected]