RE: [Nagios-devel] Defining services at runtime
Posted: Thu May 11, 2006 6:08 am
See below...
> Hi all,
>
> I've been researching this a little more, and I've come up with the
> following thoughts.
>
> I want the discovery process to be triggered by a configured service check
> on a switch - this is because I still want the standard Nagios scheduling
> mechanism to apply. When the check runs it will walk the switch and gather
> the results, which it will then submit to my event broker module via a
> socket. The event broker module will check to see if a service exists for
> each of the switchports and if not it will create a passive service check
> for it. The event module will then submit the results for each of the
> switchports. If a switchport doesn't have any results submitted for 3
> consecutive checks then the service will be removed.
>
> So, the event broker needs to be able to accomplish the following:
>
> * Create for existing services
>
> I notice that Nagios-db gets its configuration information from the
> following callback:
Which callback were you thinking of?
> * Create a new service check
>
> nebstructs.h defines a struct "nebstruct_service_check_struct". However,
> this seems to be the only place this struct is referred to in the header
> files. How do I pass a completed struct to Nagios?
>
> It looks relatively straightforward to work out how to fill out this struct,
> but "char *host_name;" could be a problem. The plugin is only going to know
> the host address, so I'll need a way to get a hostname from an address.
You can't add a service to Nagios using the
nebstruct_service_check_struct; in fact, you can't add anything to Nagios
using any of the nebstruct_* structures. They are one-way and informational
only: they are passed down to your module, and when your module returns,
Nagios never examines the structure it passed to you for changes.
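
To illustrate what "one-way" means here, the following is a minimal, self-contained sketch. It does not use the real Nagios headers: demo_check_data, demo_core_state, demo_core_dispatch() and demo_callback() are hypothetical stand-ins for a nebstruct, the core's own record, the core's dispatch code and a module callback. It only models the idea that the core hands the module an informational copy and never reads it back.

```c
#include <string.h>

/* Hypothetical stand-in for a nebstruct_* payload (not the real layout). */
typedef struct {
    char host_name[64];
    int  return_code;
} demo_check_data;

/* Hypothetical stand-in for the core's own record of the check. */
typedef struct {
    char host_name[64];
    int  return_code;
} demo_core_state;

typedef int (*demo_callback_fn)(int event_type, void *data);

/* Module side: tries to push a change back through the struct. */
static int demo_callback(int event_type, void *data)
{
    demo_check_data *info = (demo_check_data *)data;
    (void)event_type;
    info->return_code = 2;               /* the core never looks at this */
    strcpy(info->host_name, "changed");
    return 0;
}

/* Core side: fills an informational copy, calls the module, discards it. */
static void demo_core_dispatch(const demo_core_state *state,
                               demo_callback_fn cb)
{
    demo_check_data info;
    memcpy(info.host_name, state->host_name, sizeof info.host_name);
    info.return_code = state->return_code;
    cb(1, &info);
    /* info goes out of scope here; state was never modified. */
}
```

Whatever demo_callback() writes into the structure is lost as soon as the dispatch function returns, which is why a change has to go through a function call into the core instead.
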
The way to add a service from within a NEB module would be to call the
internal "add_service()" function. It's the same one that Nagios uses to
add services during configuration load.
The API for the add_service() function is:
service *add_service(char *host_name, char *description, char
*check_period, int max_attempts, int parallelize, int
accept_passive_checks, int check_interval, int retry_interval, int
notification_interval, char *notification_period, int notify_recovery, int
notify_unknown, int notify_warning, int notify_critical, int
notify_flapping, int notifications_enabled, int is_volatile, char
*event_handler, int event_handler_enabled, char *check_command, int
checks_enabled, int flap_detection_enabled, double low_flap_threshold,
double high_flap_threshold, int stalk_ok, int stalk_warning, int
stalk_unknown, int stalk_critical, int process_perfdata, int
failure_prediction_enabled, char *failure_prediction_options, int
check_freshness, int freshness_threshold, int retain_status_information,
int retain_nonstatus_information, int obsess_over_service);
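
As a rough sketch of how a call with that many arguments might be wrapped for a passive switchport service, consider the following. Several things here are assumptions and not taken from Nagios itself: the service type and the add_service() body are stubs so the example compiles on its own (a real module would include the Nagios headers and link against the core instead), add_passive_port_service() is a hypothetical helper, and the timeperiod name "24x7", the command name "check_dummy" and all numeric values are placeholders.

```c
#include <stdio.h>
#include <stddef.h>

/* Stub standing in for Nagios' service type; the real one is much larger. */
typedef struct service_struct {
    char host_name[64];
    char description[64];
    int  accept_passive_checks;
    int  checks_enabled;
} service;

/* Stub with the signature quoted above, only so this sketch is
 * self-contained. Do not define this in a real module. */
service *add_service(char *host_name, char *description, char *check_period,
    int max_attempts, int parallelize, int accept_passive_checks,
    int check_interval, int retry_interval, int notification_interval,
    char *notification_period, int notify_recovery, int notify_unknown,
    int notify_warning, int notify_critical, int notify_flapping,
    int notifications_enabled, int is_volatile, char *event_handler,
    int event_handler_enabled, char *check_command, int checks_enabled,
    int flap_detection_enabled, double low_flap_threshold,
    double high_flap_threshold, int stalk_ok, int stalk_warning,
    int stalk_unknown, int stalk_critical, int process_perfdata,
    int failure_prediction_enabled, char *failure_prediction_options,
    int check_freshness, int freshness_threshold,
    int retain_status_information, int retain_nonstatus_information,
    int obsess_over_service)
{
    static service stub;
    if (host_name == NULL || description == NULL)
        return NULL;
    snprintf(stub.host_name, sizeof stub.host_name, "%s", host_name);
    snprintf(stub.description, sizeof stub.description, "%s", description);
    stub.accept_passive_checks = accept_passive_checks;
    stub.checks_enabled = checks_enabled;
    return &stub;
}

/* Hypothetical helper: passive-only service with placeholder settings. */
service *add_passive_port_service(char *host_name, char *description)
{
    static char period[]  = "24x7";         /* assumed timeperiod name */
    static char command[] = "check_dummy";  /* assumed command name */

    return add_service(host_name, description, period,
        1,          /* max_attempts */
        1,          /* parallelize */
        1,          /* accept_passive_checks */
        5, 1, 60,   /* check, retry, notification intervals */
        period,     /* notification_period */
        1, 1, 1, 1, 1,  /* notify recovery/unknown/warning/critical/flapping */
        1,          /* notifications_enabled */
        0,          /* is_volatile */
        NULL, 0,    /* event_handler, event_handler_enabled */
        command,
        0,          /* checks_enabled: no active checks */
        0, 0.0, 0.0,    /* flap detection off, thresholds unused */
        0, 0, 0, 0, /* stalk ok/warning/unknown/critical */
        1,          /* process_perfdata */
        0, NULL,    /* failure prediction */
        0, 0,       /* check_freshness, freshness_threshold */
        1, 1,       /* retain status / nonstatus information */
        0);         /* obsess_over_service */
}
```

A call such as add_passive_port_service(host, "Port 1") then returns the new service pointer, or NULL on failure in this stub. Whether the real core needs further steps after add_service() is not covered by this sketch.
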
However, there is a significant problem with the strategy you describe above.
NEB modules consist (mostly) of callback routines, which means they are only
activated when Nagios reports an event for which a callback is registered
(for example, a service check, host check, or external command that is
starting or has just completed).
If your callback routine gets invoked and then sits on a socket listening
for events, instead of returning control to Nagios immediately, Nagios will
grind to a halt: the scheduler waits for your callback to return before it
can go back to scheduling and executing other events.
So, you have a timing issue:
- The scheduler gets ready to run your service check.
- Nagios invokes your service_check callback routine just *before* it runs
your service check. Your callback routine "processes" this event and
*must* return control back to Nagios immediately.
- Your service check is executed, checks the switch, and writes the
results to a socket for the NEB module.
- However, your NEB module isn't "alive" yet because it hasn't yet been
invoked by Nagios via the service_check callback mechanism since your
...[email truncated]...
This post was automatically imported from historical nagios-devel mailing list archives
Original poster: [email protected]