Re: [Nagios-devel] Nagios 2.0 Event Broker and DB Support
Posted: Thu Jul 31, 2003 10:51 pm
i like the idea. the future-compatibility problem should be solvable by
a clean module-api with some checks at module-load-time. how often is
the internal data-structure changed anyway? shouldn't it be quite mature
by now?
chris
Ethan Galstad wrote:
> Sorry for the crosspost, but the nagios-devel list is usually pretty
> quiet when I request comments about new features I'm implementing.
> This one is bigger than most, so I wanted to reach more people. This
> is a bit long, so bear with me...
>
> I am almost complete with coding for 2.0. Two big things remain: the
> event broker and DB support (which is currently broken).
>
> My original intent was to develop the event broker as a separate
> application, tying it to Nagios with a unix domain socket. Nagios
> would send the event broker information about everything that was
> going on (service checks, downtime, flapping, log entries, etc.).
> The event broker would be able to load user-developed modules (object
> files) at runtime and pass various types of Nagios data to them for
> processing. This is all fine and good. I have a working prototype
> of the event broker that does just this and seems to work okay. I
> got to thinking that it was rather stupid to develop a separate
> application for this when I could simply have Nagios load user-
> developed modules itself. Doing this would give the modules the
> benefit of having access to internal Nagios structures and functions
> (which is good and bad - see below).
>
> Here's an overview of how it would work:
>
> - Nagios would load user-specified modules (object files) at startup
> using the dlopen() function.
>
> - Nagios would call the module's initialization function (the name of
> which would be standardized).
>
> - The module's init function would register for various types of
> Nagios event data (service checks, host checks, log entries, event
> handlers, etc.) using callback functions.
>
> - When Nagios encounters an event for which a module has registered a
> callback function, Nagios would call that module's function and pass
> it data relevant to the event. The module is then free to do
> whatever it wants to that event data. An example might be to log
> service checks, performance data and log entries to MySQL, etc.
>
> - Before shutting down, Nagios calls the module's de-init function.
> This allows the module to clean up any resources it may be using.
>
>
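A minimal sketch of the register-and-dispatch part of the flow above. Everything named here is hypothetical (the `event_type` values, `register_callback()`, `dispatch_event()`, and the `mymod_*` functions); none of it is the actual Nagios API. Loading the object file with dlopen() and looking up the init function are left out, so the "module" is simply compiled into the same file.

```c
#include <stddef.h>

/* Hypothetical event types; illustrative names, not Nagios identifiers. */
enum event_type {
    EVENT_SERVICE_CHECK,
    EVENT_HOST_CHECK,
    EVENT_LOG_ENTRY,
    EVENT_TYPE_COUNT
};

/* A callback receives the event type and an opaque pointer to event data. */
typedef int (*event_callback)(enum event_type type, void *data);

#define MAX_CALLBACKS 8

event_callback callbacks[EVENT_TYPE_COUNT][MAX_CALLBACKS];
size_t callback_count[EVENT_TYPE_COUNT];

/* Called from a module's init function. Returns 0 on success, -1 on error. */
int register_callback(enum event_type type, event_callback cb)
{
    if ((unsigned)type >= (unsigned)EVENT_TYPE_COUNT || cb == NULL)
        return -1;
    if (callback_count[type] >= MAX_CALLBACKS)
        return -1;
    callbacks[type][callback_count[type]++] = cb;
    return 0;
}

/* Called by the core when an event occurs. Returns how many callbacks ran. */
int dispatch_event(enum event_type type, void *data)
{
    int invoked = 0;

    if ((unsigned)type >= (unsigned)EVENT_TYPE_COUNT)
        return 0;
    for (size_t i = 0; i < callback_count[type]; i++) {
        callbacks[type][i](type, data);
        invoked++;
    }
    return invoked;
}

/* --- hypothetical module --- */

int mymod_checks_seen = 0;

/* Counts service check events; a real module might write them to a DB. */
int mymod_on_service_check(enum event_type type, void *data)
{
    (void)type;
    (void)data;
    mymod_checks_seen++;
    return 0;
}

/* Standardized init entry point: register for the events we care about. */
int mymod_init(void)
{
    return register_callback(EVENT_SERVICE_CHECK, mymod_on_service_check);
}

/* Standardized de-init entry point: release whatever the module holds. */
void mymod_deinit(void)
{
    mymod_checks_seen = 0;
}
```

In this sketch the core would call `mymod_init()` once after loading, `dispatch_event()` whenever something happens, and `mymod_deinit()` before shutdown.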
> Seems good in theory. Heck, might even work okay. However, there's
> a big problem I have with it. If I implement things this way, the
> user-developed modules would have access to internal Nagios data
> structures and functions. This is not necessarily bad, as ill-
> behaved modules would not be adopted by too many people. However,
> modules that might be compiled and working fine for Nagios 2.0 might
> segfault under future versions if the internal data structures
> change. Here's an example of what I mean:
>
> User module registers for Nagios service check data using its
> mymod_handle_servicecheck() function, which has a prototype of:
>
> int mymod_handle_servicecheck(service *);
>
> The service struct is an internal Nagios structure definition which
> changes between Nagios versions. If the user module is compiled for
> use with Nagios 2.0 and its definition of the service struct, it
> will have problems if it is not recompiled for future versions of
> Nagios.
>
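To make the struct problem concrete, here is a small sketch with two invented layouts. `service_v1` and `service_v2` are hypothetical stand-ins, not the real Nagios `service` struct. A module compiled against the first layout has the old field offsets baked in, so it would read `current_state` from the wrong place in an object laid out the second way.

```c
#include <stddef.h>

/* Hypothetical layout the module was compiled against. */
struct service_v1 {
    char *host_name;
    int   current_state;
};

/* Hypothetical later layout: a field was inserted before current_state. */
struct service_v2 {
    char *host_name;
    char *description;
    int   current_state;
};

/* Returns 1 if current_state sits at a different offset in the two layouts,
 * i.e. if a module built against v1 would misread a v2 object. */
int state_offset_moved(void)
{
    return offsetof(struct service_v1, current_state)
        != offsetof(struct service_v2, current_state);
}
```

The compiler resolves `svc->current_state` to a fixed byte offset at module build time, which is why only a recompile (or a stable accessor API) fixes it.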
> Off the top of my head, I could overcome this by requiring that the
> user modules indicate (by calling a function) what version of Nagios
> they are compiled for. If they report anything but the current
> version (or do not report at all), unload them so they can do no
> harm.
>
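One way the load-time check could look, as a sketch only. `CORE_API_VERSION`, `struct module_info`, and `module_is_compatible()` are assumptions made for illustration, and comparing the size of `struct service` is an extra guard that the original message does not propose. A NULL pointer stands for a module that did not report at all.

```c
#include <stddef.h>

/* Hypothetical version number, bumped whenever an exported struct changes. */
#define CORE_API_VERSION 2

/* Stand-in for the core's current definition of the service struct. */
struct service {
    char *host_name;
    char *description;
    int   current_state;
};

/* What a module would report, filled in when the module is compiled. */
struct module_info {
    int    api_version;   /* CORE_API_VERSION seen at module build time */
    size_t service_size;  /* sizeof(struct service) seen at build time  */
};

/* Returns 1 if the module may stay loaded, 0 if it should be unloaded. */
int module_is_compatible(const struct module_info *info)
{
    if (info == NULL)                                   /* did not report */
        return 0;
    if (info->api_version != CORE_API_VERSION)          /* wrong version  */
        return 0;
    if (info->service_size != sizeof(struct service))   /* layout differs */
        return 0;
    return 1;
}
```

A size comparison only catches changes that alter the struct's size; reordering fields of equal size would slip through, so the version number remains the primary check.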
> I'm afraid I'm a bit over my head on how to handle this one. Some of
> you developers out there must have experience with this type of
> thing. If so, how did you handle it? What would you recommend?
>
...[email truncated]...
This post was automatically imported from historical nagios-devel mailing list archives
Original poster: [email protected]