Re: [Nagios-devel] [PATCH] nebmods: Stop with the move-nebmod

Support forum for Nagios Core, Nagios Plugins, NCPA, NRPE, NSCA, NDOUtils and more. Engage with the community of users including those using the open source solutions.
Locked
Guest

Re: [Nagios-devel] [PATCH] nebmods: Stop with the move-nebmod

Post by Guest »

Applied, with a cautionary note to module maintainers on how they can
change their makefiles to make test-from-sources work without pains.

Thanks. I've hated that code (or rather; It's effects on gdb) a long
time now but I was far too lazy to do anything about it.

On 2013-06-05 13:28, Robin Sonefors wrote:
> The explanation for why we did this was thus:
>
>> The problem with loaded modules is that if you overwrite the original
>> file (e.g. using 'mv'), you do not alter the inode of the original
>> file.
>
> Correct.
>
>> Since the original file/module is memory-mapped in some fashion,
>> Nagios will segfault the next time an event broker call is directed to
>> one of the module's callback functions.
>
> Of course it won't. Why not? The first quoted sentence explained it
> perfectly - the same inode still points to the same fil, so putting
> another inode in the original path won't make one bit of difference.
>
> What would cause a segfault, though, is truncating the module file and
> filling it with new content so the addresses of symbols change while the
> inode stays the same. You shouldn't do that.
>
>> This is extremely problematic when it comes to upgrading NEB modules
>> while Nagios is running.
>
> To sum up the above: no, it isn't.
>
> In addition to having a bogus reason for existing in the first place,
> the code itself is problematic: it tends to cause an accumulation of
> weird and nonsensical nebmod files in your var directory over time,
> while also breaking any sort of clever debuginfo scheme employed by most
> linux distros today, because the debuginfo can't be tracked down when we
> invent new, faked names literally only in order to confuse linkers.
>
> Signed-off-by: Robin Sonefors
> ---
> base/nebmods.c | 58 ++--------------------------------------------------------
> 1 file changed, 2 insertions(+), 56 deletions(-)
>
> diff --git a/base/nebmods.c b/base/nebmods.c
> index 3d84675..f6489f8 100644
> --- a/base/nebmods.c
> +++ b/base/nebmods.c
> @@ -163,14 +163,10 @@ int neb_load_all_modules(void) {
> }
>
>
> -#ifndef PATH_MAX
> -# define PATH_MAX 4096
> -#endif
> /* load a particular module */
> int neb_load_module(nebmodule *mod) {
> int (*initfunc)(int, char *, void *);
> int *module_version_ptr = NULL;
> - char output_file[PATH_MAX];
> int dest_fd, result = OK;
>
> if(mod == NULL)
> @@ -184,36 +180,8 @@ int neb_load_module(nebmodule *mod) {
> if(mod->should_be_loaded == FALSE || mod->filename == NULL)
> return ERROR;
>
> - /**********
> - Using dlopen() is great, but a real danger as-is. The problem with loaded modules is that if you overwrite the original file (e.g. using 'mv'),
> - you do not alter the inode of the original file. Since the original file/module is memory-mapped in some fashion, Nagios will segfault the next
> - time an event broker call is directed to one of the module's callback functions. This is extremely problematic when it comes to upgrading NEB
> - modules while Nagios is running. A workaround is to (1) 'mv' the original/loaded module file to another name (on the same filesystem)
> - and (2) copy the new module file to the location of the original one (using the original filename). In this scenario, dlopen() will keep referencing
> - the original file/inode for callbacks. This is not an ideal solution. A better one is to delete the module file once it is loaded by dlopen().
> - This prevents other processed from unintentially overwriting the original file, which would cause Nagios to crash. However, if we delete the file
> - before anyone else can muck with it, things should be good. 'lsof' shows that a deleted file is still referenced by the kernel and callback
> - functions continue to work once the module has been loaded. Long story, but this took quite a while to figure out, as there isn't much
> - of anything I could find on the subject other than some sketchy info on similar problems on HP-UX. Hopefully this will save future coders some time.
> - So... the trick is to (1

...[email truncated]...


This post was automatically imported from historical nagios-devel mailing list archives
Original poster: ae@op5.se
Locked