RE: [Nagios-devel] Help! I have tons of orphans!
Posted: Tue Mar 11, 2003 9:38 am
Are you sure that the processes are being orphaned?
You do realize that the main Nagios process spawns quite a few children,
depending on the number of checks being done, right?
Try this:
Stop Nagios. Completely. Make sure it's a graceful shutdown. Check for
any remaining nagios processes. Kill them without mercy.
Start Nagios. Run this script:
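while :
do
    # [n]agios stops grep from matching its own process; the quotes stop the shell from globbing the pattern.
    ps -e | grep '[n]agios' | wc -l
    sleep 1
done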
(This assumes you're running Nagios under the username 'nagios'.)
Now, watch the count rise and fall. The count might start to climb, but it
should always eventually return to 1, or at least close to 1 (taking into
account race conditions).
If, on the other hand, you've been watching it for 10 minutes and the count
just seems to continue to climb, yes, you might be accumulating orphans. I
suspect that's not the case, however. Yours is a concern many of us have
experienced with Nagios at one time or another. It shouldn't be anything to
fret about, unless you're genuinely experiencing performance problems.
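If you'd rather not eyeball the numbers for 10 minutes, here is a rough sketch that collects the same counts and compares the lowest value seen in the first half of the samples with the lowest value in the second half. The helper names count_nagios and floor_trend are made up for illustration and are not part of Nagios, and a "floor rising" result is only a hint that the count isn't dropping back, not proof of orphans.

```shell
#!/bin/sh
# Sketch only: count_nagios and floor_trend are hypothetical helper names.

count_nagios() {
    # Same idea as the loop above; grep -c prints the number of matching lines.
    ps -e | grep -c '[n]agios'
}

# Reads one count per line on stdin, then compares the lowest count in the
# first half of the samples with the lowest count in the second half.
floor_trend() {
    awk '
        { v[NR] = $1 + 0 }
        END {
            if (NR < 2) { print "too few samples"; exit }
            half = int(NR / 2)
            lo1 = v[1]
            for (i = 1; i <= half; i++) if (v[i] < lo1) lo1 = v[i]
            lo2 = v[half + 1]
            for (i = half + 1; i <= NR; i++) if (v[i] < lo2) lo2 = v[i]
            if (lo2 > lo1) print "floor rising: " lo1 " -> " lo2
            else print "floor steady: " lo1 " -> " lo2
        }'
}

# Example run (assumed numbers): one sample per second for 10 minutes.
# i=0; while [ "$i" -lt 600 ]; do count_nagios; sleep 1; i=$((i + 1)); done | floor_trend
```

Comparing the floors rather than the averages is deliberate: the count is expected to spike while checks run, so what matters is whether it still falls back to roughly where it started.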
Let us know how that works out.
jc
> -----Original Message-----
> From: Geoff Lovett [mailto:[email protected]]
> Sent: Tuesday, March 11, 2003 11:24 AM
> To: Jeremy T. Bouse
> Cc: [email protected]
> Subject: Re: [Nagios-devel] Help! I have tons of orphans!
>
>
> I just turned it on. I should know in about an hour if it
> works or not.
>
> Thanks,
> Geoff
>
> On Tuesday 11 March 2003 11:12, Jeremy T. Bouse wrote:
> > Have you tried turning on the obsess over services option and
> > see if the problem persists? I've found some better performance with
> > heavy testing with this option enabled...
> >
> > Jeremy
> >
> > On Tue, Mar 11, 2003 at 10:40:34AM -0600, Geoff Lovett wrote:
> > > I am currently upgrading from Netsaint 0.0.7 to Nagios 1.0, and I
> > > am finding that after Nagios runs for an hour or so, lots of
> > > services start getting orphaned. The resources on this box aren't
> > > exhausted at all (that I can tell). I tried lowering the
> > > max_concurrent_checks from 306 to 200, which helped for a little
> > > while. Now it's exhibiting the same behaviour.
> > >
> > > The specs for the box are Debian Linux, kernel 2.4.20, PIII 650, 250M.
> > > I don't go into swap and the load generally stays between 1 and 5. I
> > > am monitoring 620 services on 95 boxes.
> > >
> > > While the services are getting orphaned, the load is pretty low
> > > (around 1.2). The check latency average shoots way up to 46, and the
> > > max to 111.
> > >
> > > Another thing I've noticed is that though the max_concurrent_checks is
> > > set to 200, the number of processes named "nagios" is greater than
> > > that during the periods when it reports orphans.
> > >
> > > Has anyone else experienced this? I searched the FAQ and this mailing
> > > list, and didn't find anything.
> > >
> > > Please copy my address in replies as I am not currently subscribed.
> > >
> > > Thanks,
> > > Geoff
> > >
> > >
> > >
> > > -------------------------------------------------------
> > > This SF.net email is sponsored by:Crypto Challenge is now open!
> > > Get cracking and register here for some mind boggling fun and
> > > the chance of winning an Apple iPod:
> > > http://ads.sourceforge.net/cgi-bin/redi ... thaw0031en
> > > _______________________________________________
> > > Nagios-devel mailing list
> > > [email protected]
> > > https://lists.sourceforge.net/lists/lis ... gios-devel
This post was automatically imported from historical nagios-devel mailing list archives
Original poster: [email protected]