
Re: [Nagios-devel] Memory leak in Nagios head

Posted: Tue Nov 30, 2004 9:22 am
by Guest
Looks like the problem was actually in add_hostextinfo(), where a
dangling pointer was causing problems. The fix is in CVS now.
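
For anyone following along, here is a minimal sketch of the general kind of bug involved: a list that is freed on reload (SIGHUP) while the global head pointer still points at the freed nodes. This is not the actual Nagios code or the CVS fix. The extinfo struct, extinfo_list and the three functions below are hypothetical names made up for illustration.

```c
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-in for an extended-info record; NOT the Nagios struct. */
typedef struct extinfo {
    char *host_name;
    struct extinfo *next;
} extinfo;

/* Global list head. It outlives a config reload, so it must be reset. */
static extinfo *extinfo_list = NULL;

/* Prepend a record to the list; returns NULL if allocation fails. */
extinfo *add_extinfo(const char *host_name)
{
    extinfo *e = malloc(sizeof(*e));
    if (e == NULL)
        return NULL;

    e->host_name = malloc(strlen(host_name) + 1);
    if (e->host_name == NULL) {
        free(e);
        return NULL;
    }
    strcpy(e->host_name, host_name);

    e->next = extinfo_list;
    extinfo_list = e;
    return e;
}

/* Free every record, then reset the head pointer. */
void free_extinfo_list(void)
{
    extinfo *e = extinfo_list;
    while (e != NULL) {
        extinfo *next = e->next;   /* save before freeing the node */
        free(e->host_name);
        free(e);
        e = next;
    }
    /* Without this line the head dangles: the next reload would link new
     * records onto freed memory, and any later walk of the list (for
     * example a lookup by host_name) would read freed nodes. */
    extinfo_list = NULL;
}

/* Count the records currently in the list. */
int count_extinfo(void)
{
    int n = 0;
    extinfo *e;
    for (e = extinfo_list; e != NULL; e = e->next)
        n++;
    return n;
}
```

Calling add_extinfo() a few times, then free_extinfo_list(), then add_extinfo() again mimics a reload cycle. With the head reset in place, the list only ever contains records from the current cycle.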


On 30 Nov 2004 at 13:45, Andreas Ericsson wrote:

> The "repeated SIGHUP" crash occurs in
> find_host(temp_hostextinfo->host_name), called from
> pre_flight_check().
>
> The attached patch makes nagios at least survive the HUPs (even though
> the memory leak is still there, so it should crash eventually when it
> hits the memory limit). I haven't tested whether this affects the GUI
> or not.
>
> Note that it's only been tested together with Matthew's patch (which
> didn't fix the problem), so I don't know if it will work solo or if
> both of them have to be combined to do the trick.
>
> Andreas Ericsson wrote:
> > Matthew Kent wrote:
> >
> >> On Mon, 2004-11-29 at 15:34, Andreas Ericsson wrote:
> >>
> >>> Matthew Kent wrote:
> >>>
> >>>> Forwarding this on in case anyone else has seen this behaviour
> >>>> and has some suggestions. I'll give it a run through valgrind and
> >>>> see if I can spot anything this evening.
> >>>>
> >>>
> >>> Thanks, Matt.
> >>>
> >>> A small update;
> >>>
> >>> After having run the daemon for about 10 hours on a test system,
> >>> memory consumption has escalated from roughly 1MB to around 24MB.
> >>> Not very nice figures. It seems that sending a HUP makes memory
> >>> consumption jump slightly (usually by around 20K).
> >>
> >>
> >>
> >> Well, I may have trapped the HUP problem after some passes through
> >> valgrind. Seems reset_variables() was getting called twice: once
> >> right after receiving a SIGHUP and again immediately afterwards at
> >> the start of the main do() loop in nagios.c.
> >
> >
> > I'll get to testing right away.
> >
> >> I've removed the call to it from cleanup() as it's only called when
> >> erroring out anyway, and resetting the variables at this point is a
> >> bit of a lost cause ;)
> >>
> >> I also fixed a couple of other minor items reported by valgrind,
> >> although I couldn't figure out this last one:
> >>
> >> 64 bytes in 8 blocks are definitely lost in loss record 66 of 118
> >>    at 0x1B904EDD: malloc (vg_replace_malloc.c:131)
> >>    by 0x808F4D4: xodtemplate_add_host_to_hostlist (xodtemplate.c:10665)
> >>    by 0x808F456: xodtemplate_add_hostgroup_members_to_hostlist (xodtemplate.c:10640)
> >>    by 0x808EF0E: xodtemplate_expand_hostgroups (xodtemplate.c:10434)
> >>
> >
> > This shouldn't be the longstanding problem though, since NSCORE
> > doesn't use xodtemplate_expand_hostgroups() on a regular basis. I'm
> > leaning towards a very small and subtle in-struct leak in
> > base/checks.c or common/statusdata.c (and their underlying
> > functions, naturally). Particularly since the problem seems to
> > present itself more rapidly when hosts and services change status a
> > lot (or possibly just change their plugin output).
> >
>
> --
> Andreas Ericsson [email protected]
> OP5 AB www.op5.se
> Lead Developer
>



Ethan Galstad,
Nagios Developer
---
Email: [email protected]
Website: http://www.nagios.org






This post was automatically imported from historical nagios-devel mailing list archives
Original poster: [email protected]