gearmand failed to load module

Post by emartine »

I updated our test Red Hat 6 server and noticed that gearmand was updated:

gearmand.x86_64 1:0.33-2 @labs_consol_stable
mod_gearman.x86_64 1.5.1-1.el6 @labs_consol_stable

logs show:

Jan 29 17:14:59 nagiosxi-tst nagios: Error: Could not load module '/usr/lib64/mod_gearman/mod_gearman.o' -> /usr/lib64/mod_gearman/mod_gearman.o: undefined symbol: check_result_list
Jan 29 17:14:59 nagiosxi-tst nagios: Error: Failed to load module '/usr/lib64/mod_gearman/mod_gearman.o'.
Jan 29 17:14:59 nagiosxi-tst nagios: ndomod: NDOMOD 2.0.0 (02-28-2014) Copyright (c) 2009 Nagios Core Development Team and Community Contributors
Jan 29 17:14:59 nagiosxi-tst nagios: ndomod: Could not open data sink! I'll keep trying, but some output may get lost...
Jan 29 17:14:59 nagiosxi-tst nagios: Event broker module '/usr/local/nagios/bin/ndomod.o' initialized successfully.
Jan 29 17:14:59 nagiosxi-tst nagios: Error: Module loading failed. Aborting.

So Nagios Core is not starting and of course gearmand services are not working. I am unable to submit commands from the XI interface and attempting to get into core results in an error.

So where do I start...? Compatibility issue with this version of gearmand? Where should I get the latest compatible version?
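
An `undefined symbol` error at load time means the module references a symbol that the running Nagios Core binary does not export. As a rough first check, the symbol name can be pulled out of the log line and looked up in the module with `nm`. This is only a sketch: `undefined_symbol` is a made-up helper name, and the `nm` step assumes binutils is installed and that the module sits at the path shown in the log.

```shell
#!/bin/sh
# Hypothetical helper: print the symbol name from an "undefined symbol" log line on stdin.
undefined_symbol() {
    sed -n 's/.*undefined symbol: \([A-Za-z_][A-Za-z0-9_]*\).*/\1/p'
}

# Log line copied from the post above.
line="Jan 29 17:14:59 nagiosxi-tst nagios: Error: Could not load module '/usr/lib64/mod_gearman/mod_gearman.o' -> /usr/lib64/mod_gearman/mod_gearman.o: undefined symbol: check_result_list"
sym=$(printf '%s\n' "$line" | undefined_symbol)
echo "missing symbol: $sym"

# If the module exists on this host, list its unresolved references to that symbol.
module=/usr/lib64/mod_gearman/mod_gearman.o
if [ -f "$module" ] && command -v nm >/dev/null 2>&1; then
    nm -D --undefined-only "$module" | grep -w "$sym" || true
fi
```

If the symbol shows up as undefined in the module, the module and the Nagios Core binary do not match, and one of them needs to be brought to a compatible version.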

Post by Box293 »

Mod Gearman will need updating.

Follow this guide:

http://assets.nagios.com/downloads/nagi ... ios_XI.pdf
As of May 25th, 2018, all communications with Nagios Enterprises and its employees are covered under our new Privacy Policy.

Post by emartine »

Nagios Core still wouldn't work. I rebooted the server to ensure that everything came up ok.

I then found these errors in the gearmand log:
tail -f /var/log/gearmand.log
ERROR 2015-01-30 00:55:21.000000 [ main ] socket()(Address family not supported by protocol) -> libgearman-server/gearmand.cc:468
ERROR 2015-01-30 00:55:21.000000 [ main ] gearmand_sockfd_close() called with an invalid socket -> libgearman-server/io.cc:933
ERROR 2015-01-30 00:55:25.000000 [ main ] socket()(Address family not supported by protocol) -> libgearman-server/gearmand.cc:468
ERROR 2015-01-30 00:55:25.000000 [ main ] gearmand_sockfd_close() called with an invalid socket -> libgearman-server/io.cc:933
ERROR 2015-01-30 00:57:55.000000 [ main ] socket()(Address family not supported by protocol) -> libgearman-server/gearmand.cc:468
ERROR 2015-01-30 00:57:55.000000 [ main ] gearmand_sockfd_close() called with an invalid socket -> libgearman-server/io.cc:933
ERROR 2015-01-30 01:04:42.000000 [ main ] socket()(Address family not supported by protocol) -> libgearman-server/gearmand.cc:468
ERROR 2015-01-30 01:04:42.000000 [ main ] gearmand_sockfd_close() called with an invalid socket -> libgearman-server/io.cc:933
ERROR 2015-01-30 01:05:28.000000 [ main ] socket()(Address family not supported by protocol) -> libgearman-server/gearmand.cc:468
ERROR 2015-01-30 01:05:28.000000 [ main ] gearmand_sockfd_close() called with an invalid socket -> libgearman-server/io.cc:933

I modified:
/etc/sysconfig/gearmand
OPTIONS="-L myipaddress"
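
`Address family not supported by protocol` is the message for the `EAFNOSUPPORT` errno, which `socket()` returns when asked for an address family the kernel does not provide. One common case on Linux is a daemon trying to open an IPv6 socket while the IPv6 stack is not loaded. Whether that is what happened here is an assumption, since the thread does not confirm it. A quick sketch for checking the IPv6 side, where `ipv6_state` is a made-up helper name:

```shell
#!/bin/sh
# Hypothetical helper: report whether the running Linux kernel has the IPv6 stack loaded.
# /proc/net/if_inet6 exists only when the IPv6 stack is loaded.
ipv6_state() {
    if [ -e /proc/net/if_inet6 ]; then
        echo "enabled"
    else
        echo "disabled"
    fi
}

echo "IPv6 is $(ipv6_state)"
```

If this reports `disabled`, binding gearmand to an explicit IPv4 address with `-L`, as in the OPTIONS line above, is a reasonable thing to try.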

Gearmand still wouldn't start. I logged into Nagios XI via the web and noticed that the monitoring engine process wasn't running. I started it through the web GUI, then went back to the command line and restarted gearmand. This time gearmand started, and my workers show as connected.

gearman_top -H myipaddress:4730


Queue Name | Worker Available | Jobs Waiting | Jobs Running
--------------------------------------------------------------------------------------
eventhandler | 21 | 0 | 0
host | 21 | 0 | 0
service | 21 | 0 | 0
worker_nagiosxi-tst | 1 | 0 | 0
worker_gearmanworker-tst | 1 | 0 | 0
--------------------------------------------------------------------------------------

Great so far.
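
For a quick scripted read of that table, the rows can be filtered for queues that have no workers or that have jobs piling up. The sketch below assumes the pipe-separated layout shown above (queue, workers available, jobs waiting, jobs running). `flag_queues` is a hypothetical name, and the sample input is the output from this post.

```shell
#!/bin/sh
# Hypothetical helper: read gearman_top-style rows on stdin and print queues
# that have no workers available or that have jobs waiting.
flag_queues() {
    awk -F'|' '
        NF == 4 && $2 ~ /^[ ]*[0-9]+[ ]*$/ {
            name = $1; gsub(/^[ ]+|[ ]+$/, "", name)
            workers = $2 + 0; waiting = $3 + 0
            if (workers == 0 || waiting > 0)
                print name ": workers=" workers " waiting=" waiting
        }'
}

sample='Queue Name | Worker Available | Jobs Waiting | Jobs Running
eventhandler | 21 | 0 | 0
host | 21 | 0 | 0
service | 21 | 0 | 0
worker_nagiosxi-tst | 1 | 0 | 0
worker_gearmanworker-tst | 1 | 0 | 0'

flagged=$(printf '%s\n' "$sample" | flag_queues)
echo "${flagged:-all queues have workers and nothing is waiting}"
```

On a live host the function would be fed from gearman_top itself. That assumes your gearman_top has a one-shot batch mode, which I believe is the `-b` flag but have not confirmed for this version.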


I then attempted to schedule a forced immediate check on a service through the web interface and received:

Please Wait
Your request was not processed in a timely manner. It may still execute, as the server may be temporarily busy.

I'm only monitoring 3 test hosts, so I know something else is wrong. Any ideas? The system status shows everything as good, but the last checks ran earlier in the day and no new active checks are happening.

Post by Box293 »

Can you please post your nagios.cfg file and your /etc/mod_gearman/mod_gearman_neb.conf file?

Also, do you have workers on other machines, or do you just use a local worker?

Post by emartine »

nagios.cfg

# MODIFIED
admin_email=root@localhost
admin_pager=root@localhost
translate_passive_host_checks=1
log_event_handlers=0
use_large_installation_tweaks=1
enable_environment_macros=0


# NDOUtils module
broker_module=/usr/local/nagios/bin/ndomod.o config_file=/usr/local/nagios/etc/ndomod.cfg
#Gearman
broker_module=/usr/lib64/mod_gearman/mod_gearman.o config=/etc/mod_gearman/mod_gearman_neb.conf eventhandler=no

# PNP settings - bulk mode with NCPD
process_performance_data=1
# service performance data
service_perfdata_file=/usr/local/nagios/var/service-perfdata
service_perfdata_file_template=DATATYPE::SERVICEPERFDATA\tTIMET::$TIMET$\tHOSTNAME::$HOSTNAME$\tSERVICEDESC::$SERVICEDESC$\tSERVICEPERFDATA::$SERVICEPERFDATA$\tSERVICECHECKCOMMAND::$SERVICECHECKCOMMAND$\tHOSTSTATE::$HOSTSTATE$\tHOSTSTATETYPE::$HOSTSTATETYPE$\tSERVICESTATE::$SERVICESTATE$\tSERVICESTATETYPE::$SERVICESTATETYPE$\tSERVICEOUTPUT::$SERVICEOUTPUT$
service_perfdata_file_mode=a
service_perfdata_file_processing_interval=15
service_perfdata_file_processing_command=process-service-perfdata-file-bulk
# host performance data
host_perfdata_file=/usr/local/nagios/var/host-perfdata
host_perfdata_file_template=DATATYPE::HOSTPERFDATA\tTIMET::$TIMET$\tHOSTNAME::$HOSTNAME$\tHOSTPERFDATA::$HOSTPERFDATA$\tHOSTCHECKCOMMAND::$HOSTCHECKCOMMAND$\tHOSTSTATE::$HOSTSTATE$\tHOSTSTATETYPE::$HOSTSTATETYPE$\tHOSTOUTPUT::$HOSTOUTPUT$
host_perfdata_file_mode=a
host_perfdata_file_processing_interval=15
host_perfdata_file_processing_command=process-host-perfdata-file-bulk


# OBJECTS - UNMODIFIED
#cfg_file=/usr/local/nagios/etc/objects/commands.cfg
#cfg_file=/usr/local/nagios/etc/objects/contacts.cfg
#cfg_file=/usr/local/nagios/etc/objects/localhost.cfg
#cfg_file=/usr/local/nagios/etc/objects/templates.cfg
#cfg_file=/usr/local/nagios/etc/objects/timeperiods.cfg


# STATIC OBJECT DEFINITIONS (THESE DON'T GET EXPORTED/IMPORTED BY NAGIOSQL)
cfg_dir=/usr/local/nagios/etc/static

# OBJECTS EXPORTED FROM NAGIOSQL
cfg_file=/usr/local/nagios/etc/contacttemplates.cfg
cfg_file=/usr/local/nagios/etc/contactgroups.cfg
cfg_file=/usr/local/nagios/etc/contacts.cfg
cfg_file=/usr/local/nagios/etc/timeperiods.cfg
cfg_file=/usr/local/nagios/etc/commands.cfg
cfg_file=/usr/local/nagios/etc/hostgroups.cfg
cfg_file=/usr/local/nagios/etc/servicegroups.cfg
cfg_file=/usr/local/nagios/etc/hosttemplates.cfg
cfg_file=/usr/local/nagios/etc/servicetemplates.cfg
cfg_file=/usr/local/nagios/etc/servicedependencies.cfg
cfg_file=/usr/local/nagios/etc/serviceescalations.cfg
cfg_file=/usr/local/nagios/etc/hostdependencies.cfg
cfg_file=/usr/local/nagios/etc/hostescalations.cfg
cfg_file=/usr/local/nagios/etc/hostextinfo.cfg
cfg_file=/usr/local/nagios/etc/serviceextinfo.cfg
cfg_dir=/usr/local/nagios/etc/hosts
cfg_dir=/usr/local/nagios/etc/services

# GLOBAL EVENT HANDLERS
global_host_event_handler=xi_host_event_handler
global_service_event_handler=xi_service_event_handler

# UNMODIFIED
accept_passive_host_checks=1
accept_passive_service_checks=1
additional_freshness_latency=15
auto_reschedule_checks=1
auto_rescheduling_interval=30
auto_rescheduling_window=45
bare_update_check=0
cached_host_check_horizon=15
cached_service_check_horizon=15
check_external_commands=1
check_for_orphaned_hosts=1
check_for_orphaned_services=1
check_for_updates=1
check_host_freshness=0
check_result_path=/usr/local/nagios/var/spool/checkresults
check_result_reaper_frequency=10
check_service_freshness=1
command_file=/usr/local/nagios/var/rw/nagios.cmd
daemon_dumps_core=0
date_format=us
debug_file=/usr/local/nagios/var/nagios.debug
debug_level=0
debug_verbosity=1
enable_event_handlers=1
enable_flap_detection=1
enable_notifications=1
enable_predictive_host_dependency_checks=1
enable_predictive_service_dependency_checks=1
event_broker_options=-1
event_handler_timeout=30
execute_host_checks=1
execute_service_checks=1
high_host_flap_threshold=20.0
high_service_flap_threshold=20.0
host_check_timeout=30
host_freshness_check_interval=60
host_inter_check_delay_method=s
illegal_macro_output_chars=`~$&|'"<>
illegal_object_name_chars=`~!$%^&*|'"<>?,()=
interval_length=60
lock_file=/usr/local/nagios/var/nagios.lock
log_archive_path=/usr/local/nagios/var/archives
log_external_commands=0
log_file=/usr/local/nagios/var/nagios.log
log_host_retries=1
log_initial_states=0
log_notifications=1
log_passive_checks=0
log_rotation_method=d
log_service_retries=1
low_host_flap_threshold=5.0
low_service_flap_threshold=5.0
max_check_result_file_age=3600
max_check_result_reaper_time=30
max_concurrent_checks=0
max_debug_file_size=1000000
max_host_check_spread=30
max_service_check_spread=30
nagios_group=nagios
nagios_user=nagios
notification_timeout=30
object_cache_file=/usr/local/nagios/var/objects.cache
obsess_over_hosts=0
obsess_over_services=0
ocsp_timeout=5
passive_host_checks_are_soft=0
perfdata_timeout=5
precached_object_file=/usr/local/nagios/var/objects.precache
resource_file=/usr/local/nagios/etc/resource.cfg
retained_contact_host_attribute_mask=0
retained_contact_service_attribute_mask=0
retained_host_attribute_mask=0
retained_process_host_attribute_mask=0
retained_process_service_attribute_mask=0
retained_service_attribute_mask=0
retain_state_information=1
retention_update_interval=60
service_check_timeout=60
service_freshness_check_interval=60
service_inter_check_delay_method=s
service_interleave_factor=s
soft_state_dependencies=0
state_retention_file=/usr/local/nagios/var/retention.dat
status_file=/usr/local/nagios/var/status.dat
status_update_interval=10
temp_file=/usr/local/nagios/var/nagios.tmp
temp_path=/tmp
use_aggressive_host_checking=0
use_regexp_matching=0
use_retained_program_state=1
use_retained_scheduling_info=1
use_syslog=1
use_true_regexp_matching=0

Post by emartine »

mod_gearman_neb.conf

debug=0
logfile=/var/log/mod_gearman/mod_gearman_neb.log

server=localhost:serverip:4730
eventhandler=yes
services=yes
hosts=yes
do_hostchecks=yes
route_eventhandler_like_checks=no
encryption=yes
key=this is set
use_uniq_jobs=on
localhostgroups=
localservicegroups=
result_workers=1
perfdata=no
perfdata_mode=1
orphan_host_checks=yes
orphan_service_checks=yes
accept_clear_results=no

Post by emartine »

I have the local worker and one other server that also acts as a worker. The password is the same in all 3 configuration files.

Post by Box293 »

Thanks for those files.

Did you update the remote worker as well?

In mod_gearman_neb.conf AND the local mod_gearman_worker.conf, can you please change this:

server=localhost:serverip:4730

To this:

server=127.0.0.1:4730

Then restart the services in this order:

service nagios stop
service mod_gearman_worker stop
service gearmand restart
service mod_gearman_worker start
service nagios start

Does this help?

On your XI host, can you run this command and post the output:

iptables --list

Post by emartine »

I've disabled iptables and SELinux for this test.

After I changed the address to localhost, the gearman worker on the local host no longer seems to be checking in. This is what gearman_top shows.

I do see this in the logs:
mod_gearman_neb.log
[2015-01-30 09:11:34][30971][ERROR] sending job to gearmand failed: flush(GEARMAN_COULD_NOT_CONNECT) localhost:127 -> libgearman/connection.cc:745

mod_gearman_worker.log
[2015-01-30 09:13:39][31587][INFO ] mod_gearman worker daemon started with pid 31587
[2015-01-30 09:13:40][31587][INFO ] no checks in 2minutes, restarting all workers
[2015-01-30 09:15:41][31587][INFO ] no checks in 2minutes, restarting all workers
[2015-01-30 09:17:42][31587][INFO ] no checks in 2minutes, restarting all workers
[2015-01-30 09:19:43][31587][INFO ] no checks in 2minutes, restarting all workers
[2015-01-30 09:21:44][31587][INFO ] no checks in 2minutes, restarting all workers
[2015-01-30 09:23:45][31587][INFO ] no checks in 2minutes, restarting all workers
[2015-01-30 09:25:46][31587][INFO ] no checks in 2minutes, restarting all workers
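
The `localhost:127` in the neb log suggests that the client took `localhost` as the host and then tried to read the port from what followed. That would happen if the `server=` line still had the `localhost:` prefix in front of `127.0.0.1:4730`. This is a guess from the log text, not something confirmed in the thread. A small sketch for sanity-checking a single `host:port` value (the `valid_server_entry` name is hypothetical, and IPv6 literals are deliberately not handled):

```shell
#!/bin/sh
# Hypothetical helper: succeed only if $1 looks like host:port, with exactly
# one colon, a non-empty host, and an all-digit port.
valid_server_entry() {
    case "$1" in
        *:*:*) return 1 ;;   # more than one colon, e.g. localhost:serverip:4730
        ?*:?*) ;;            # non-empty host and non-empty port
        *)     return 1 ;;
    esac
    case "${1##*:}" in
        *[!0-9]*) return 1 ;;  # port contains a non-digit
    esac
    return 0
}

for entry in "127.0.0.1:4730" "localhost:serverip:4730" "localhost:127.0.0.1:4730"; do
    if valid_server_entry "$entry"; then
        echo "ok:  $entry"
    else
        echo "bad: $entry"
    fi
done
```

Running the same check against the `server=` value in both mod_gearman_neb.conf and mod_gearman_worker.conf would show quickly whether either file still carries the extra prefix.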

Post by tgriep »

Could you check the status of the Gearman worker? Run this:

service mod_gearman_worker status

If it isn't running, could you start it?

service mod_gearman_worker start

Also, could you post your worker config file and the log file?

/etc/mod_gearman/mod_gearman_worker.conf
/var/log/mod_gearman/mod_gearman_worker.log