Re[2]: [Nagios-devel] new plugin interface for Nagios



AE> Deomid Ryabkov wrote:
>> greetings, fellow Nagios users.
>>
>> well, basically I think it's just about time to add a new plugin interaction interface to Nagios.
>> pretty bold, ha? ;)
>> now let me explain. it has been almost a year since we turned to Nagios for our monitoring needs
>> (we were previously using BigBrother and oh my dear, was it awful! ;))
>> so we are being almost happy now. however, as configuration continues to grow, the response time
>> of the whole monitoring system increases.
>>
>> currently we have 248 hosts monitored with 755 active checks at a 60 seconds interval.
>> (interval_length=10, normal_check_interval 6)
>>
AE> I think this is where some of your problems start. Running ALL checks
AE> with a 60 second interval is hardly useful. You should look into
AE> implementing different templates for them (we have critical-service (1
AE> minute interval), default-service (5 minute interval),
AE> noncritical-service (30 minutes interval)). This allows for excellent
AE> scalability.
well, this is not the point of discussion. you can always lower check interval
or, just as you say, adjust check intervals by criticalness of related services.
all this i clearly understand, but... you know, it's always nice to be up-to-date on this.
i mean the health of the network. and of course when my LA will be somewhere near 10,
i guess i'll have to lower the check interval or whatever, BUT (and that's the point
I'm trying to make): there's room for improvement.
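
For a rough sense of scale, the figures quoted above can be turned into an average launch rate. This is only a back-of-the-envelope sketch in Python: it assumes the checks are spread evenly over the interval (the real scheduler may interleave them differently), and the function name is made up for illustration.

```python
def checks_per_second(active_checks, interval_length, normal_check_interval):
    """Average number of check launches per second.

    interval_length is the number of seconds per time unit and
    normal_check_interval the number of time units between checks,
    as in the figures quoted in the thread.
    """
    seconds_between_checks = interval_length * normal_check_interval
    return active_checks / seconds_between_checks


# Figures from the thread: 755 checks, interval_length=10, normal_check_interval 6.
rate = checks_per_second(755, 10, 6)
print(f"{rate:.1f} checks/s")  # about 12.6, each one a fork() + exec() today
```

Moving part of the checks to longer intervals lowers this number, but every remaining check still pays for a process launch.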

>> being in charge of the monitoring, by now i have done all i could to optimize plugins,
>> and in fact this has helped a lot to keep the system running at a decent pace.
>> (for example, i have integrated disk checks into one plugin that uses shared snmplib
>> instead of calling snmpget, effectively eliminating another fork)

AE> snmpget only loads heavily if it needs to parse the mibs. Use '-m: ' to
AE> load NO mibs with snmpget. This will make it a whole lot faster.

that is not the problem anymore: only a few of my checks use it.

>> so the biggest problem at this time seems to be Nagios's need to launch a process for every check.
>>
AE> That problem will still exist, unless you mean to make the code
AE> thread-safe, which would make nagios a memory-hog on large systems (a
AE> lot more hash buckets would be required for this to work). Besides, on
AE> linux-systems, fork() uses copy-on-write, so only the PTE needs to be created.

well, now it takes fork() + exec() to complete a check. and what i'm aiming to eliminate is that latter exec().
removing it doesn't make nagios threaded.
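
To make the fork()-without-exec() idea concrete, here is a minimal sketch in Python rather than Nagios's own C. It is not how Nagios works; run_check_in_child and check_dummy are hypothetical names, and the only thing borrowed from the plugin convention is the exit codes (0 OK, 1 WARNING, 2 CRITICAL, 3 UNKNOWN). The child runs the check as a plain function and reports back through a pipe, so no second program is started. POSIX only.

```python
import os


def run_check_in_child(check_fn):
    """Fork, run check_fn in the child (no exec), return (status, output)."""
    read_fd, write_fd = os.pipe()
    pid = os.fork()
    if pid == 0:
        # Child: run the check in-process and report through the pipe.
        os.close(read_fd)
        status = 3  # UNKNOWN unless the check completes
        try:
            status, output = check_fn()
            os.write(write_fd, output.encode())
        except Exception:
            pass
        finally:
            os.close(write_fd)
            os._exit(status)  # leave without running the parent's cleanup
    # Parent: collect the output, then the exit status.
    os.close(write_fd)
    chunks = []
    while True:
        chunk = os.read(read_fd, 4096)
        if not chunk:
            break
        chunks.append(chunk)
    os.close(read_fd)
    _, wait_status = os.waitpid(pid, 0)
    return os.WEXITSTATUS(wait_status), b"".join(chunks).decode()


def check_dummy():
    # Hypothetical check, stands in for what a plugin binary would do.
    return 0, "DUMMY OK"


status, output = run_check_in_child(check_dummy)
print(status, output)
```

The process is still forked once per check, so checks keep running in parallel and a crashing check cannot take the parent down; only the cost of loading a new program image goes away.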

>> so now i'm thinking of adding some kind of plugin invocation mechanism into Nagios
>> that wouldn't require starting up another program.
>> and what i am thinking of as my options are:
>>
>> 1) shared library mechanism, like Apache modules. should be the fastest of all, but has its shortcomings.
>> not very flexible.

AE> Not a bad idea, but nagios would still have to fork() or
AE> pthread_create() to actually RUN the different checks (unless you want
AE> it to serialize checks, which is just plain dumb).

basically, i don't mind nagios forking (yet), but instead of running an external plugin it should...
well, that is to be decided ;)

>> 2) some kind of IPC. this would involve, i think, some check daemon process that'd start with nagios
>> and respond to check requests from it. a pipe or message queue could be used for communication.

AE> Now we're talking. See comments below.

>> 3) just forget about it.
>>
AE> Not necessarily a bad thing.
of course.
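
Option 2 above (a check daemon started alongside nagios and talking over a pipe) could look roughly like the following Python sketch. Everything here is an assumption made for illustration: the function names, the newline-delimited request format, the JSON replies and the check table are all hypothetical, and nothing of the sort exists in Nagios. POSIX only.

```python
import json
import os


def check_daemon(request_fd, reply_fd, checks):
    """Serve newline-delimited check names until the request pipe closes."""
    with os.fdopen(request_fd, "r") as requests, os.fdopen(reply_fd, "w") as replies:
        for line in requests:
            name = line.strip()
            check = checks.get(name)
            status, output = check() if check else (3, f"unknown check {name}")
            reply = {"check": name, "status": status, "output": output}
            replies.write(json.dumps(reply) + "\n")
            replies.flush()


def start_daemon(checks):
    """Fork the daemon once; return (pid, request writer, reply reader)."""
    req_r, req_w = os.pipe()
    rep_r, rep_w = os.pipe()
    pid = os.fork()
    if pid == 0:
        # Child: keep only the ends the daemon needs, then serve requests.
        os.close(req_w)
        os.close(rep_r)
        try:
            check_daemon(req_r, rep_w, checks)
        finally:
            os._exit(0)
    os.close(req_r)
    os.close(rep_w)
    return pid, os.fdopen(req_w, "w"), os.fdopen(rep_r, "r")


def request_check(requests, replies, name):
    """Send one check request and wait for its reply."""
    requests.write(name + "\n")
    requests.flush()
    return json.loads(replies.readline())


checks = {"disk": lambda: (0, "DISK OK")}  # hypothetical check table
pid, requests, replies = start_daemon(checks)
print(request_check(requests, replies, "disk"))
requests.close()  # EOF on the request pipe tells the daemon to exit
os.waitpid(pid, 0)
replies.close()
```

Note that this single daemon answers requests one at a time, which is exactly the serialization objected to earlier in the thread; a real design would need several workers or non-blocking checks.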

>> i think i'll do that one way or another. but i want to make it The Right Way (r) and this is
>> where i turn to you and ask if you have any ideas/opinions/suggestions and in general, if it's worth
>> implementing at all...
>>
AE> I'd say that nagios should be split i

...[email truncated]...


This post was automatically imported from historical nagios-devel mailing list archives
Original poster: [email protected]