RE: [Nagios-devel] Help! I have tons of orphans!


I followed your directions below, and I do seem to be accumulating
orphans, or at least processes that aren't doing anything. After ten
minutes, my numbers stay pretty much pegged at 196 processes. It would
dip to 7 processes, then 9, then 61, then 91 (while staying in the
hundreds the rest of the time, except at the beginning). Looking at the
start time with ps axfu, I see processes as old as 17 minutes, with the
rest somewhere in the range from then to now. Also right after that I saw
all those older processes go away, and nagios wrote them off in the log as
orphans.
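
For anyone who wants to repeat the age check without reading ps output by eye, here is a minimal sketch. It assumes a Linux box whose ps (procps-ng) supports the "etimes" output field, and that the processes show up with the command name "nagios"; the function name count_old_nagios and the 600-second threshold are only illustrative.

```shell
# Count processes whose command name is "nagios" and which have been
# running longer than a threshold given in seconds.
# Input on stdin: one process per line as "PID ELAPSED_SECONDS COMMAND".
count_old_nagios() {
    threshold=$1
    awk -v limit="$threshold" \
        '$3 == "nagios" && $2 > limit { n++ } END { print n + 0 }'
}

# Possible usage (assumption: this ps accepts the "etimes" field):
#   ps -eo pid=,etimes=,comm= | count_old_nagios 600
```

If the number printed keeps growing between runs, the old processes are probably not being reaped; if it drops back to zero, they are just slow checks.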

I also believe that this is an actual performance problem, as the "last
check" timestamp for my services looks similar to the age of the processes
themselves. Interestingly, after the older processes went away, the
timestamps were updated pretty soon afterwards. That makes it look more
like nagios isn't able to process their results during those periods than
that my box isn't fast enough.

If this does turn out to be related to speed, I would be kinda surprised.
I was running netsaint on a sparc II 400 (which typically stays at 5-7
load average), and the "last check" number stays within a couple of
minutes.

I don't know if this will shed light, but enclosed is my nagios.cfg.

Thanks,
Geoff

On Tuesday 11 March 2003 11:38, Carroll, Jim P [Contractor] wrote:
> Are you sure that the processes are being orphaned?
>
> You do realize that the main Nagios process spawns quite a few children,
> depending on the number of checks being done, right?
>
> Try this:
>
> Stop Nagios. Completely. Make sure it's a graceful shutdown. Check
> for any remaining nagios processes. Kill them without mercy.
>
> Start Nagios. Run this script:
>
> while :
> do
> ps -e | grep '[n]agios' | wc -l
> sleep 1
> done
>
> (This assumes you're running Nagios under the username 'nagios'.)
>
> Now, watch the count rise and fall. The count might start to climb, but
> it should always eventually return to 1, or at least close to 1 (taking
> into account race conditions).
>
> If, on the other hand, you've been watching it for 10 minutes and the
> count just seems to continue to climb, yes, you might be accumulating
> orphans. I suspect that's not the case, however. Yours is a concern
> many of us have experienced with Nagios at one time or another. It
> shouldn't be anything to fret about, unless you're genuinely
> experiencing performance problems.
>
> Let us know how that works out.
>
> jc
>
> > -----Original Message-----
> > From: Geoff Lovett [mailto:[email protected]]
> > Sent: Tuesday, March 11, 2003 11:24 AM
> > To: Jeremy T. Bouse
> > Cc: [email protected]
> > Subject: Re: [Nagios-devel] Help! I have tons of orphans!
> >
> >
> > I just turned it on. I should know in about an hour if it
> > works or not.
> >
> > Thanks,
> > Geoff
> >
> > On Tuesday 11 March 2003 11:12, Jeremy T. Bouse wrote:
> > > Have you tried turning on the obsess over services option to
> > > see if the problem persists? I've found somewhat better performance with
> > > heavy testing with this option enabled...
> > >
> > >     Jeremy
> > >
> > > On Tue, Mar 11, 2003 at 10:40:34AM -0600, Geoff Lovett wrote:
> > > > I am currently upgrading from Netsaint 0.0.7 to Nagios 1.0, and I
> > > > am finding that after Nagios runs for an hour or so, lots of
> > > > services start getting orphaned. The resources on this box aren't
> > > > exhausted at all (that I can tell). I tried lowering the
> > > > max_concurrent_checks from 306 to 200, which helped for a little
> > > > while. Now it's exhibiting the same behaviour.
> > > >
> > > > The specs for the box are Debian Linux, kernel 2.4.20, PIII 650, 250M.
> > > > I don't go into swap and the load generally stays between 1 and 5. I
> > > > am monitoring 620 services on 95 boxes.


...[email truncated]...


This post was automatically imported from historical nagios-devel mailing list archives
Original poster: [email protected]