Hi,
Very interesting approach.
Maybe we can talk off-list about your goals and about possibly
joining forces with Icinga. How about that?
Kind regards,
Michael
nap wrote:
Hi list,
I would like your feedback on an (unfinished) reimplementation of
Nagios named "Shinken" that I wrote in Python. It is faster and more
modular than the current Nagios implementation in C (yes, faster, you
read that correctly; I was the first to be surprised by it).
== Shinken's history ==
A few months ago, I started working on a proof of concept for Nagios
focused on distributed environments and performance. The main goal was
to find a distributed, high-availability architecture. I also thought
that Nagios' performance was quite good, but that we could get more.
For quick testing and development, I used Python. I thought a process
pool could make Nagios quicker than forking a new process for each
check only to kill it a few seconds later. I also bypass the way Nagios
reaps results: reading flat files is just too slow. Instead, each
result is a structure that is sent directly to the scheduler. No
files, more performance. To be on par with Nagios, I added the same
monitoring logic to the scheduler: HARD/SOFT states, dependencies
(parents, servicedep, hostdep, etc.) and database export (Merlin).
Shinken uses the standard Nagios configuration files.
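As a rough illustration of the pool idea above (not Shinken's actual code): a fixed set of long-lived workers launches the check commands and hands each result back as an in-memory structure, with no result files in between. To keep the sketch short, a thread pool stands in for the process pool described in the message, and the `CheckResult` fields and function names are hypothetical.

```python
import subprocess
import sys
from collections import namedtuple
from concurrent.futures import ThreadPoolExecutor

# Hypothetical result structure; the field names are illustrative only.
CheckResult = namedtuple("CheckResult", "name exit_status output")


def run_check(name, command):
    """Run one check command and return its result as a structure."""
    proc = subprocess.run(command, capture_output=True, text=True, timeout=30)
    return CheckResult(name, proc.returncode, proc.stdout.strip())


def run_checks(checks, workers=4):
    """Run all checks on a reusable worker pool and return results in memory."""
    with ThreadPoolExecutor(max_workers=workers) as pool:
        futures = [pool.submit(run_check, name, cmd) for name, cmd in checks]
        # Results come back in submission order, one structure per check.
        return [future.result() for future in futures]


# Two tiny checks: print a line and exit with a Nagios-style status code.
checks = [
    ("web", [sys.executable, "-c", "print('OK'); raise SystemExit(0)"]),
    ("db", [sys.executable, "-c", "print('CRITICAL'); raise SystemExit(2)"]),
]
results = run_checks(checks)
```

Here `results` is a list of `CheckResult` tuples that a scheduler could consume directly, which is the part that replaces reading flat files.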
And the performance is quite good: with Nagios 3, a small check (an
echo plus an exit) and a mid-range server, I run 10,000 checks in
5 minutes (latency near 1 s), and 30K with full tweaks. With my tool,
I run 150K!!
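To put these figures side by side as rates, here is a small calculation. It assumes the 30K and 150K figures refer to the same 5-minute window as the 10,000 figure, which the message implies but does not state.

```python
WINDOW_SECONDS = 5 * 60  # the 5-minute window quoted for Nagios 3

# Assumption: all three figures are checks per 5-minute window.
checks_per_window = {
    "Nagios 3": 10_000,
    "Nagios 3, full tweaks": 30_000,
    "Shinken": 150_000,
}

# Convert each figure to checks per second.
rates = {name: count / WINDOW_SECONDS for name, count in checks_per_window.items()}

# Ratio between the Shinken figure and the tuned Nagios 3 figure.
speedup_vs_tweaked = (
    checks_per_window["Shinken"] / checks_per_window["Nagios 3, full tweaks"]
)

for name, rate in rates.items():
    print(f"{name}: {rate:.1f} checks/s")
```

Under that assumption, the figures correspond to roughly 33, 100 and 500 checks per second, so the Shinken number is 5 times the tuned Nagios 3 number.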
== The global architecture ==
For the architecture, I think we should follow the Unix way of doing
things: one tool per task. For now, Nagios does nearly everything: it
reads the configuration, schedules, launches checks and raises
notifications. I am trying an architecture where the administrator can
have any hosts/services he wants, and the daemons are just resources
to manage them. The architecture I propose is the following:
*Arbiter: a daemon that reads the configuration and automatically cuts
it into N confs (keeping relations such as parents in the same conf),
where N is the number of schedulers we have. It dispatches the confs,
and it also reads the orders in nagios.cmd and dispatches them to the
schedulers.
*Schedulers: do the scheduling by looking at the states of
hosts/services. They only maintain queues of checks/notifications/event
handlers for the other daemons. Same thing for event broker
information: it's just a queue.
*Pollers: use a process pool, get the checks to launch from the
schedulers and return the results to the schedulers.
*Reactionners: same as pollers, but for notifications and event handlers.
*Brokers: get event broker information from the schedulers and "do
things" with it (like creating the service-perfdata file or filling
databases).
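One way the Arbiter's cutting step could work is sketched below. This is an assumption about the approach, not Shinken's actual algorithm: hosts linked by parent relations are merged into groups with a union-find structure, and the groups are then spread over the N schedulers, largest first. The function name and the input format are hypothetical.

```python
from collections import defaultdict


def split_configuration(parents, n_schedulers):
    """Cut hosts into n_schedulers parts, keeping parent-linked hosts together.

    `parents` maps each host name to the list of its parent host names.
    """
    # Every host and every parent starts as its own group representative.
    root = {}
    for host, host_parents in parents.items():
        root.setdefault(host, host)
        for parent in host_parents:
            root.setdefault(parent, parent)

    def find(host):
        # Follow links up to the representative of the host's group.
        while root[host] != host:
            root[host] = root[root[host]]  # shorten the path as we go
            host = root[host]
        return host

    # Merge each host with its parents so related hosts share one group.
    for host, host_parents in parents.items():
        for parent in host_parents:
            root[find(host)] = find(parent)

    groups = defaultdict(list)
    for host in root:
        groups[find(host)].append(host)

    # Largest groups first, each one going to the currently smallest part.
    parts = [[] for _ in range(n_schedulers)]
    for group in sorted(groups.values(), key=len, reverse=True):
        min(parts, key=len).extend(sorted(group))
    return parts


example_parents = {
    "router1": [],
    "web1": ["router1"],
    "web2": ["router1"],
    "router2": [],
    "db1": ["router2"],
    "mail1": [],
}
parts = split_configuration(example_parents, 2)
```

With this example input, `web1` and `web2` always end up in the same part as `router1`, and `db1` in the same part as `router2`; only unrelated groups are used for balancing.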
The poller approach is like DNX, nothing new here. The reactionners
allow the administrator to have a single daemon that sends all the
notifications of all his schedulers (useful for SMTP authorizations or
for filling a single RSS file with all notifications). The schedulers
do not launch checks, so they do not suffer latency when notifications
or event handlers are launched.
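The split of roles described above can be pictured with a minimal sketch in which the scheduler only holds queues and other daemons pull work from them. The class and method names are hypothetical. Queuing a notification on every non-zero status is a deliberate simplification; the HARD/SOFT state logic mentioned earlier is left out.

```python
from collections import deque


class Scheduler:
    """Holds work in queues; it never launches anything itself."""

    def __init__(self):
        self.checks = deque()   # pulled by pollers
        self.actions = deque()  # notifications/event handlers, pulled by reactionners
        self.results = []

    def get_checks(self, limit):
        """Hand up to `limit` pending checks to a poller."""
        count = min(limit, len(self.checks))
        return [self.checks.popleft() for _ in range(count)]

    def get_actions(self, limit):
        """Hand up to `limit` pending actions to a reactionner."""
        count = min(limit, len(self.actions))
        return [self.actions.popleft() for _ in range(count)]

    def put_result(self, check, exit_status):
        """Record a poller's result; a non-zero status queues a notification."""
        self.results.append((check, exit_status))
        if exit_status != 0:
            self.actions.append(f"notify: {check} returned {exit_status}")


sched = Scheduler()
sched.checks.extend(["check_ping host1", "check_http host2"])

batch = sched.get_checks(10)    # what a poller would fetch
sched.put_result(batch[0], 0)   # ...and report back
sched.put_result(batch[1], 2)

pending = sched.get_actions(10)  # what a reactionner would fetch
```

Because the scheduler only appends to and pops from queues, a slow notification command blocks a reactionner, not the scheduling loop.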
The load balancing is automatic: the arbiter cuts the conf and
dispatches the parts. For high availability, there can be spare
daemons: if a daemon dies, another takes over its configuration (the
Arbiter "pings" the daemons, and if a daemon fails, it just sends the
configuration to a spare). The daemons are reached over the network,
so all daemons can be on different servers (and it's better for high
availability not to put all daemons on the same server
...[email truncated]...
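The spare-daemon failover described in the last paragraph before the truncation could look roughly like the sketch below. It is only an illustration: the function name, the daemon names and the `is_alive` callable (standing in for the Arbiter's "ping") are all hypothetical.

```python
def dispatch_with_spares(assignments, spares, is_alive):
    """Move the configuration of each dead daemon to a live spare.

    assignments: dict mapping daemon name -> configuration id
    spares: list of idle daemon names
    is_alive: callable(daemon name) -> bool, standing in for the "ping"
    Returns (new assignments, configurations left without a daemon).
    """
    available = [spare for spare in spares if is_alive(spare)]
    new_assignments = {}
    orphaned = []
    for daemon, conf in assignments.items():
        if is_alive(daemon):
            new_assignments[daemon] = conf            # nothing to do
        elif available:
            new_assignments[available.pop(0)] = conf  # a spare takes over
        else:
            orphaned.append(conf)                     # no spare left
    return new_assignments, orphaned


alive = {"scheduler-1", "scheduler-spare"}
new_assignments, orphaned = dispatch_with_spares(
    {"scheduler-1": "conf-0", "scheduler-2": "conf-1"},
    ["scheduler-spare"],
    lambda name: name in alive,
)
```

In this example `scheduler-2` does not answer, so `conf-1` is handed to `scheduler-spare` while `scheduler-1` keeps `conf-0`.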
This post was automatically imported from historical nagios-devel mailing list archives
Original poster: [email protected]