Broker API

bipsen
Posts: 22
Joined: Thu Dec 10, 2015 8:42 am
Location: Denmark

Re: Broker API

Post by bipsen »

I think I have traced the error message to this code:

Code:

        schedule_new_event(EVENT_USER_FUNCTION,TRUE, current_time + atoi(perfdata_file_processing_interval), TRUE,
                        atoi(perfdata_file_processing_interval), NULL, TRUE,
                        (void *) npcdmod_file_roller, "", 0);

        /* register to be notified of certain events... */
        neb_register_callback(NEBCALLBACK_HOST_CHECK_DATA, npcdmod_module_handle,
                        0, npcdmod_handle_data);
        neb_register_callback(NEBCALLBACK_SERVICE_CHECK_DATA, npcdmod_module_handle,
                        0, npcdmod_handle_data);

I just need to figure out what goes wrong. After adding extra logging, the failure seems to come from this call:

Code:

        schedule_new_event(EVENT_USER_FUNCTION,TRUE, current_time + atoi(perfdata_file_processing_interval), TRUE,
 
/Brian
bipsen

Re: Broker API

Post by bipsen »

Debug gives me:

Code:

[1481222429.460588] [064.1] [pid=30847] Making callbacks (type 2)...
[1481222429.460614] [064.1] [pid=30847] Making callbacks (type 2)...
[1481222429.460624] [064.1] [pid=30847] Making callbacks (type 2)...
[1481222429.460631] [064.1] [pid=30847] Making callbacks (type 2)...
[1481222429.460638] [064.1] [pid=30847] Making callbacks (type 2)...
[1481222429.460644] [064.1] [pid=30847] Making callbacks (type 2)...
[1481222429.460659] [064.1] [pid=30847] Making callbacks (type 2)...
[1481222429.461391] [064.1] [pid=30847] Making callbacks (type 2)...
[1481222429.461728] [064.1] [pid=30847] Making callbacks (type 2)...
[1481222429.461920] [064.1] [pid=30847] Making callbacks (type 2)...
[1481222429.462271] [064.1] [pid=30847] Making callbacks (type 2)...
[1481222429.462440] [064.1] [pid=30847] Making callbacks (type 2)...
[1481222429.462460] [064.1] [pid=30847] Making callbacks (type 2)...
[1481222429.462506] [064.1] [pid=30847] Making callbacks (type 2)...
[1481222429.462526] [064.1] [pid=30847] Making callbacks (type 2)...
[1481222429.462550] [064.1] [pid=30847] Making callbacks (type 2)...
[1481222429.462557] [001.0] [pid=30847] schedule_new_event()
[1481222429.462566] [008.0] [pid=30847] New Event Details:
[1481222429.462577] [008.0] [pid=30847]  Type:                       EVENT_USER_FUNCTION
[1481222429.462582] [008.0] [pid=30847]  High Priority:              Yes
[1481222429.462585] [008.0] [pid=30847]  Run Time:                   2016-12-08 19:40:44
[1481222429.462588] [008.0] [pid=30847]  Recurring:                  Yes
[1481222429.462591] [008.0] [pid=30847]  Event Interval:             15
[1481222429.462594] [008.0] [pid=30847]  Compensate for Time Change: Yes
[1481222429.462596] [008.0] [pid=30847]  Event Options:              0
[1481222429.462601] [008.0] [pid=30847]  Event ID:                   0x8def60
[1481222429.462604] [001.0] [pid=30847] add_event()
[1481222429.462622] [064.1] [pid=30847] Making callbacks (type 2)...
[1481222429.462633] [064.1] [pid=30847] Making callbacks (type 2)...
[1481222429.462639] [064.1] [pid=30847] Making callbacks (type 2)...
[1481222429.462645] [064.1] [pid=30847] Making callbacks (type 2)...
[1481222429.462652] [064.1] [pid=30847] Making callbacks (type 2)...
[1481222429.462657] [064.0] [pid=30847] Module '/usr/lib64/nagios/brokers/npcdmod.o' loaded with return code of '0'
[1481222429.462661] [064.0] [pid=30847] nebmodule_deinit() found
[1481222429.462664] [064.1] [pid=30847] Making callbacks (type 0)...
[1481222429.463689] [064.1] [pid=30847] Making callbacks (type 2)...
[1481222429.463724] [064.1] [pid=30847] Making callbacks (type 2)...
[1481222429.463741] [064.1] [pid=30847] Making callbacks (type 2)...
[1481222429.463763] [064.1] [pid=30847] Making callbacks (type 2)...
[1481222429.463781] [064.1] [pid=30847] Making callbacks (type 2)...
[1481222429.463884] [064.1] [pid=30847] Making callbacks (type 2)...
[1481222429.463914] [064.1] [pid=30847] Making callbacks (type 2)...
[1481222429.466510] [064.1] [pid=30847] Making callbacks (type 2)...
[1481222429.466576] [064.1] [pid=30847] Making callbacks (type 2)...
[1481222429.467380] [064.1] [pid=30847] Making callbacks (type 0)...
[1481222429.467397] [001.0] [pid=30847] initialize_downtime_data()
[1481222429.467412] [064.1] [pid=30847] Making callbacks (type 19)...
[1481222429.467417] [001.0] [pid=30847] xrddefault_read_state_information() start
[1481222429.467502] [001.0] [pid=30847] check_for_host_flapping()

Any hints on what the issue could be?
dwhitfield
Former Nagios Staff
Posts: 4583
Joined: Wed Sep 21, 2016 10:29 am
Location: NoLo, Minneapolis, MN

Re: Broker API

Post by dwhitfield »

From which logs are those errors coming?

If not /usr/local/nagios/var/npcd.log, could you post some output from there? I know you said you turned on debugging, but none of that looks close to the format of my npcd.log.
bipsen

Re: Broker API

Post by bipsen »

The errors are from nagios.log (and nagios debug log).

I have not started the npcd daemon yet, as I want to see the broker module load without failure first.

I built the module from the 0.6.25 source code on CentOS.
Last edited by bipsen on Thu Dec 08, 2016 3:17 pm, edited 1 time in total.
dwhitfield

Re: Broker API

Post by dwhitfield »

My apologies, that's an XI log, but apparently not a Core log.

Can you post your npcd.cfg? I made sure I checked that that was a Core thing first this time. :)

If you can't find a npcd.cfg, then you probably need to talk to the pnp4nagios team, but of course we can leave this open for community input.
bipsen

Re: Broker API

Post by bipsen »

I wonder if this subject would be better placed in the "Nagios Plugin Development" forum, even though this may not be a "plugin" in the sense of something that can be used for a service check or similar.

But if anyone could contribute info or hints on why Nagios logs "Error: Failed to add event to squeue '(nil)' with prio 1: Success" when initializing the broker module, I would really appreciate it.

I have tried to dig into events.c in the Nagios base source code (also guided by the debug output from Nagios), but it did not help me much.

Code:

[1481222429.462557] [001.0] [pid=30847] schedule_new_event()
[1481222429.462566] [008.0] [pid=30847] New Event Details:
[1481222429.462577] [008.0] [pid=30847]  Type:                       EVENT_USER_FUNCTION
[1481222429.462582] [008.0] [pid=30847]  High Priority:              Yes
[1481222429.462585] [008.0] [pid=30847]  Run Time:                   2016-12-08 19:40:44
[1481222429.462588] [008.0] [pid=30847]  Recurring:                  Yes
[1481222429.462591] [008.0] [pid=30847]  Event Interval:             15
[1481222429.462594] [008.0] [pid=30847]  Compensate for Time Change: Yes
[1481222429.462596] [008.0] [pid=30847]  Event Options:              0
[1481222429.462601] [008.0] [pid=30847]  Event ID:                   0x8def60
[1481222429.462604] [001.0] [pid=30847] add_event()

Everything looks OK so far. What seems to fail is the call to add_event() in the function schedule_new_event() in base/events.c.

The call is made like this:

Code:

        /* add the event to the event list */
        add_event(nagios_squeue, new_event);

nagios_squeue seems to be a variable that is set by Nagios itself. Digging into add_event(), I see:

Code:

        if(event->priority) {
                event->sq_event = squeue_add_usec(sq, event->run_time, event->priority - 1, event);
                }
        else {
                event->sq_event = squeue_add(sq, event->run_time, event);
                }

This comes just before the following check:

Code:

        if(!event->sq_event) {
                logit(NSLOG_RUNTIME_ERROR, TRUE, "Error: Failed to add event to squeue '%p' with prio %u: %s\n",
                          sq, event->priority, strerror(errno));
                }

The main question now is why the call to squeue_add()/squeue_add_usec() does not store a valid pointer in event->sq_event.
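
The logged message itself gives a hint: the '%p' in the format string is filled from sq, and the "Success" text is what strerror() typically returns on glibc when errno is 0. The following is a minimal sketch of that situation, not the Nagios code. The toy_* types and functions are hypothetical stand-ins, and the behaviour of the stub (returning NULL for a NULL queue without touching errno) is an assumption about what squeue_add() does, not something confirmed in this thread.

```c
#include <assert.h>
#include <errno.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical stand-ins for the Nagios queue and event types. */
typedef struct { int size; } toy_squeue;
typedef struct { unsigned int priority; void *sq_event; } toy_event;

/* Assumed behaviour: a NULL queue yields NULL and errno is left untouched. */
static void *toy_squeue_add(toy_squeue *sq, toy_event *ev)
{
    if (sq == NULL)
        return NULL;
    sq->size++;
    return ev; /* any non-NULL handle stands for "queued" */
}

/* Same shape as the check quoted above. Writes the error text into buf
 * and returns -1 on failure, 0 on success. */
static int toy_add_event(toy_squeue *sq, toy_event *ev, char *buf, size_t len)
{
    assert(ev != NULL);
    errno = 0; /* only to keep the sketch deterministic; the quoted code does not reset errno */
    ev->sq_event = toy_squeue_add(sq, ev);
    if (!ev->sq_event) {
        snprintf(buf, len,
                 "Error: Failed to add event to squeue '%p' with prio %u: %s",
                 (void *)sq, ev->priority, strerror(errno));
        return -1;
    }
    return 0;
}
```

Calling toy_add_event(NULL, &ev, buf, sizeof buf) with ev.priority set to 1 fails and produces a message of the same shape as the one in nagios.log. The exact rendering of the NULL pointer and of errno 0 depends on the C library, so the sketch only suggests that the queue pointer was NULL when the module scheduled its event.
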
bipsen

Re: Broker API

Post by bipsen »

A follow-up: I started Nagios manually with timing points enabled:

Code:

[0.0000 (+0.0000)] Variables reset
[0.0007 (+0.0007)] Main config file read
Nagios 4.2.3 starting... (PID=13246)
Local time is Thu Dec 08 22:09:22 CET 2016
[0.0011 (+0.0004)] NEB module API initialized
[0.0012 (+0.0001)] Query handler initialized
nerd: Channel hostchecks registered successfully
nerd: Channel servicechecks registered successfully
nerd: Channel opathchecks registered successfully
nerd: Fully initialized and ready to rock!
[0.0012 (+0.0001)] NERD initialized
wproc: Successfully registered manager as @wproc with query handler
[0.0015 (+0.0003)] 4 workers spawned
wproc: Registry request: name=Core Worker 13250;pid=13250
wproc: Registry request: name=Core Worker 13248;pid=13248
wproc: Registry request: name=Core Worker 13247;pid=13247
wproc: Registry request: name=Core Worker 13249;pid=13249
[0.0031 (+0.0015)] 4 workers connected
Error: Failed to add event to squeue '(nil)' with prio 1: Success
Event broker module '/usr/lib64/nagios/brokers/npcdmod.o' initialized successfully.
[0.0034 (+0.0003)] Modules loaded
[0.0034 (+0.0000)] First callback made
[0.0034 (+0.0000)] Reading config data from '/etc/nagios/nagios.cfg'
Warning: failure_prediction_enabled is obsoleted and no longer has any effect in host type objects (config file '/etc/nagiosql/hosttemplates.cfg', starting at line 14)
Warning: failure_prediction_enabled is obsoleted and no longer has any effect in host type objects (config file '/etc/nagiosql/hosttemplates.cfg', starting at line 23)
Warning: failure_prediction_enabled is obsoleted and no longer has any effect in host type objects (config file '/etc/nagiosql/hosttemplates.cfg', starting at line 40)
Warning: failure_prediction_enabled is obsoleted and no longer has any effect in host type objects (config file '/etc/nagiosql/hosttemplates.cfg', starting at line 50)
Warning: failure_prediction_enabled is obsoleted and no longer has any effect in host type objects (config file '/etc/nagiosql/hosttemplates.cfg', starting at line 66)
Warning: failure_prediction_enabled is obsoleted and no longer has any effect in service type objects (config file '/etc/nagiosql/servicetemplates.cfg', starting at line 58)
Warning: failure_prediction_enabled is obsoleted and no longer has any effect in service type objects (config file '/etc/nagiosql/servicetemplates.cfg', starting at line 81)
[0.0068 (+0.0034)] Done parsing config files
[0.0069 (+0.0000)] Done resolving objects
.....
[0.0085 (+0.0000)] Event queue initialized
[0.0086 (+0.0000)] Status data initialized
[0.0086 (+0.0000)] Downtime data initialized
[0.0086 (+0.0000)] Retention data initialized
[0.0110 (+0.0024)] Initial state information read

So it seems the broker module is loaded before the event queue is initialized. Could this be the cause of the issue?
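
If the ordering really is the cause, one way around it would be for the module to defer its scheduling until the queue exists instead of scheduling from its init function. The sketch below only models that idea with hypothetical toy_* names; it does not use the NEB API. In a real module the natural hook would probably be a NEBCALLBACK_PROCESS_DATA callback that waits for NEBTYPE_PROCESS_EVENTLOOPSTART, but that is an assumption and has not been tested in this thread.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical model of the core's event queue. */
typedef struct { int pending; } toy_queue;

static toy_queue toy_queue_storage;
static toy_queue *toy_event_queue = NULL; /* NULL until toy_init_event_queue() runs */
static int toy_deferred = 0;              /* set when the module asked too early */

/* Returns 0 on success, -1 if the queue does not exist yet. */
static int toy_schedule(void)
{
    if (toy_event_queue == NULL)
        return -1;
    toy_event_queue->pending++;
    return 0;
}

/* Module init runs at load time, before the queue exists.
 * A failed attempt is remembered instead of being treated as fatal. */
static int toy_module_init(void)
{
    if (toy_schedule() != 0)
        toy_deferred = 1;
    return 0;
}

/* Stands for the core creating its event queue later during startup. */
static void toy_init_event_queue(void)
{
    toy_queue_storage.pending = 0;
    toy_event_queue = &toy_queue_storage;
}

/* Stands for the "event loop is about to start" notification:
 * retry the request that was deferred during module init. */
static int toy_on_event_loop_start(void)
{
    assert(toy_event_queue != NULL);
    if (!toy_deferred)
        return 0;
    toy_deferred = 0;
    return toy_schedule();
}
```

Run in the startup order shown by the timing points (toy_module_init(), then toy_init_event_queue(), then toy_on_event_loop_start()), the event ends up queued exactly once.
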
ssax
Dreams In Code
Posts: 7682
Joined: Wed Feb 11, 2015 12:54 pm

Re: Broker API

Post by ssax »

Looks like that is from Core when running add_event:

https://github.com/NagiosEnterprises/na ... nts.c#L868
bipsen

Re: Broker API

Post by bipsen »

ssax wrote:Looks like that is from Core when running add_event:

https://github.com/NagiosEnterprises/na ... nts.c#L868

That is the one. The question is then whether the call at https://github.com/NagiosEnterprises/na ... ios.c#L740 comes too late, as the queue apparently isn't initialized yet when the npcdmod broker module is loaded and initialized.
dwhitfield

Re: Broker API

Post by dwhitfield »

bipsen wrote: Ok, with a bit of effort I succeeded in compiling the module for v4....

Did you make changes to Core? If so, the *best* thing to do is to create a pull request on GitHub.

Alternatively, could you post the changes you made? It's possible we'll merge the changes in. Thanks!