error writing to data sink

kendallchenoweth
Posts: 195
Joined: Fri Sep 13, 2013 10:43 am

Post by kendallchenoweth »

How do I resolve this error?

[1418245190] ndomod registered for state change data'
[1418245190] ndomod registered for contact status data'
[1418245190] ndomod registered for adaptive contact data'
[1418245190] Event broker module '/usr/local/nagios/bin/ndomod.o' initialized successfully.
[1418245190] ndomod: Error writing to data sink! Some output may get lost...
[1418245190] ndomod: Please check remote ndo2db log, database connection or SSL Parameters
[1418245190] Successfully launched command file worker with pid 6590
[1418245198] ndomod: Successfully reconnected to data sink! 0 items lost, 3390 queued items to flush.
[1418245198] ndomod: Error writing to data sink! Some output may get lost. 3220 queued items to flush.
[1418245206] ndomod: Successfully reconnected to data sink! 0 items lost, 1880 queued items to flush.
[1418245206] ndomod: Error writing to data sink! Some output may get lost. 1626 queued items to flush.

Nagios XI 2014R2.0

Thanks!

-Kendall Chenoweth
lmiltchev
Former Nagios Staff
Posts: 13589
Joined: Mon May 23, 2011 12:15 pm

Re: error writing to data sink

Post by lmiltchev »

Do you have MySQL offloaded to a remote server? Can you connect to MySQL from the CLI?

Code: Select all

mysql -pnagiosxi
Can you post the nagios.cfg, ndo2db.cfg, and ndomod.cfg files?
kendallchenoweth
Posts: 195
Joined: Fri Sep 13, 2013 10:43 am

Re: error writing to data sink

Post by kendallchenoweth »

I can connect successfully from the CLI. The database is remote (an Amazon RDS instance); I can connect to MySQL remotely, but no SSH connection to the database server is possible.

Code: Select all

[nagios@<host> ~]$ mysql -u ndoutils -h <host> -p'<password>'
Welcome to the MySQL monitor.  Commands end with ; or \g.
Your MySQL connection id is 969363
Server version: 5.6.21-log MySQL Community Server (GPL)

Copyright (c) 2000, 2013, Oracle and/or its affiliates. All rights reserved.

Oracle is a registered trademark of Oracle Corporation and/or its
affiliates. Other names may be trademarks of their respective
owners.

Type 'help;' or '\h' for help. Type '\c' to clear the current input statement.

mysql>

Code: Select all

[nagios@<host> ~]$ mysql -u nagiosql -h <host> -p'<password>'
Welcome to the MySQL monitor.  Commands end with ; or \g.
Your MySQL connection id is 969388
Server version: 5.6.21-log MySQL Community Server (GPL)

Copyright (c) 2000, 2013, Oracle and/or its affiliates. All rights reserved.

Oracle is a registered trademark of Oracle Corporation and/or its
affiliates. Other names may be trademarks of their respective
owners.

Type 'help;' or '\h' for help. Type '\c' to clear the current input statement.

mysql>

Code: Select all

[nagios@<host> etc]$ more nagios.cfg
# MODIFIED
admin_email=root@localhost
admin_pager=root@localhost
translate_passive_host_checks=1
log_event_handlers=0
use_large_installation_tweaks=1
enable_environment_macros=0


# NDOUtils module
broker_module=/usr/local/nagios/bin/ndomod.o config_file=/usr/local/nagios/etc/ndomod.cfg


# PNP settings - bulk mode with NCPD
process_performance_data=1
# service performance data
service_perfdata_file=/usr/local/nagios/var/service-perfdata
service_perfdata_file_template=DATATYPE::SERVICEPERFDATA\tTIMET::$TIMET$\tHOSTNAME::$HOSTNAME$\tSERVICEDESC::$SERVICEDESC$\tSERVICEPERFDATA::$SERVICEPERFDATA$\tSERVICECHECKCOMMAND::$SERVICECHECKCOMMAND$\tHOSTSTATE::$HOSTSTATE$\tHOSTSTATETYPE::$HOSTSTATETYPE$\tSERVICESTATE::$SERVICESTATE$\tSERVICESTATETYPE::$SERVICESTATETYPE$\tSERVICEOUTPUT::$SERVICEOUTPUT$
service_perfdata_file_mode=a
service_perfdata_file_processing_interval=15
service_perfdata_file_processing_command=process-service-perfdata-file-bulk
# host performance data
host_perfdata_file=/usr/local/nagios/var/host-perfdata
host_perfdata_file_template=DATATYPE::HOSTPERFDATA\tTIMET::$TIMET$\tHOSTNAME::$HOSTNAME$\tHOSTPERFDATA::$HOSTPERFDATA$\tHOSTCHECKCOMMAND::$HOSTCHECKCOMMAND$\tHOSTSTATE::$HOSTSTATE$\tHOSTSTATETYPE::$HOSTSTATETYPE$\tHOSTOUTPUT::$HOSTOUTPUT$
host_perfdata_file_mode=a
host_perfdata_file_processing_interval=15
host_perfdata_file_processing_command=process-host-perfdata-file-bulk


# OBJECTS - UNMODIFIED
#cfg_file=/usr/local/nagios/etc/objects/commands.cfg
#cfg_file=/usr/local/nagios/etc/objects/contacts.cfg
#cfg_file=/usr/local/nagios/etc/objects/localhost.cfg
#cfg_file=/usr/local/nagios/etc/objects/templates.cfg
#cfg_file=/usr/local/nagios/etc/objects/timeperiods.cfg


# STATIC OBJECT DEFINITIONS (THESE DON'T GET EXPORTED/IMPORTED BY NAGIOSQL)
cfg_dir=/usr/local/nagios/etc/static

# OBJECTS EXPORTED FROM NAGIOSQL
cfg_file=/usr/local/nagios/etc/contacttemplates.cfg
cfg_file=/usr/local/nagios/etc/contactgroups.cfg
cfg_file=/usr/local/nagios/etc/contacts.cfg
cfg_file=/usr/local/nagios/etc/timeperiods.cfg
cfg_file=/usr/local/nagios/etc/commands.cfg
cfg_file=/usr/local/nagios/etc/hostgroups.cfg
cfg_file=/usr/local/nagios/etc/servicegroups.cfg
cfg_file=/usr/local/nagios/etc/hosttemplates.cfg
cfg_file=/usr/local/nagios/etc/servicetemplates.cfg
cfg_file=/usr/local/nagios/etc/servicedependencies.cfg
cfg_file=/usr/local/nagios/etc/serviceescalations.cfg
cfg_file=/usr/local/nagios/etc/hostdependencies.cfg
cfg_file=/usr/local/nagios/etc/hostescalations.cfg
cfg_file=/usr/local/nagios/etc/hostextinfo.cfg
cfg_file=/usr/local/nagios/etc/serviceextinfo.cfg
cfg_dir=/usr/local/nagios/etc/hosts
cfg_dir=/usr/local/nagios/etc/services

# GLOBAL EVENT HANDLERS
global_host_event_handler=xi_host_event_handler
global_service_event_handler=xi_service_event_handler



# UNMODIFIED
accept_passive_host_checks=1
accept_passive_service_checks=1
additional_freshness_latency=15
auto_reschedule_checks=1
auto_rescheduling_interval=30
auto_rescheduling_window=45
bare_update_check=0
cached_host_check_horizon=15
cached_service_check_horizon=15
check_external_commands=1
check_for_orphaned_hosts=1
check_for_orphaned_services=1
check_for_updates=1
check_host_freshness=0
check_result_path=/usr/local/nagios/var/spool/checkresults
check_result_reaper_frequency=10
check_service_freshness=1
command_file=/usr/local/nagios/var/rw/nagios.cmd
daemon_dumps_core=0
date_format=us
debug_file=/usr/local/nagios/var/nagios.debug
debug_level=0
debug_verbosity=1
enable_event_handlers=1
enable_flap_detection=1
enable_notifications=1
enable_predictive_host_dependency_checks=1
enable_predictive_service_dependency_checks=1
event_broker_options=-1
event_handler_timeout=30
execute_host_checks=1
execute_service_checks=1
high_host_flap_threshold=20.0
high_service_flap_threshold=20.0
host_check_timeout=30
host_freshness_check_interval=60
host_inter_check_delay_method=s
illegal_macro_output_chars=`~$&|'"<>
illegal_object_name_chars=`~!$%^&*|'"<>?,()=
interval_length=60
lock_file=/usr/local/nagios/var/nagios.lock
log_archive_path=/usr/local/nagios/var/archives
log_external_commands=0
log_file=/usr/local/nagios/var/nagios.log
log_host_retries=1
log_initial_states=0
log_notifications=1
log_passive_checks=0
log_rotation_method=d
log_service_retries=1
low_host_flap_threshold=5.0
low_service_flap_threshold=5.0
max_check_result_file_age=3600
max_check_result_reaper_time=30
max_concurrent_checks=0
max_debug_file_size=1000000
max_host_check_spread=30
max_service_check_spread=30
nagios_group=nagios
nagios_user=nagios
notification_timeout=30
object_cache_file=/usr/local/nagios/var/objects.cache
obsess_over_hosts=0
obsess_over_services=0
ocsp_timeout=5
passive_host_checks_are_soft=0
perfdata_timeout=5
precached_object_file=/usr/local/nagios/var/objects.precache
resource_file=/usr/local/nagios/etc/resource.cfg
retained_contact_host_attribute_mask=0
retained_contact_service_attribute_mask=0
retained_host_attribute_mask=0
retained_process_host_attribute_mask=0
retained_process_service_attribute_mask=0
retained_service_attribute_mask=0
retain_state_information=1
retention_update_interval=60
service_check_timeout=60
service_freshness_check_interval=60
service_inter_check_delay_method=s
service_interleave_factor=s
soft_state_dependencies=0
state_retention_file=/usr/local/nagios/var/retention.dat
status_file=/usr/local/nagios/var/status.dat
status_update_interval=10
temp_file=/usr/local/nagios/var/nagios.tmp
temp_path=/tmp
use_aggressive_host_checking=0
use_regexp_matching=0
use_retained_program_state=1
use_retained_scheduling_info=1
use_syslog=1
use_true_regexp_matching=0

Code: Select all

[nagios@<host> etc]$ more ndo2db.cfg
#####################################################################
# NDO2DB DAEMON CONFIG FILE
#####################################################################


lock_file=/usr/local/nagios/var/ndo2db.lock

ndo2db_user=nagios
ndo2db_group=nagios

socket_type=unix

socket_name=/usr/local/nagios/var/ndo.sock

tcp_port=5668


db_servertype=mysql
db_host=<host>
db_port=3306

db_name=nagios
db_prefix=nagios_

db_user=ndoutils
db_pass=<password>



## TABLE TRIMMING OPTIONS
# Several database tables containing Nagios event data can become quite large
# over time.  Most admins will want to trim these tables and keep only a
# certain amount of data in them.  The options below are used to specify the
# age (in MINUTES) that data should be allowed to remain in various tables
# before it is deleted.  Using a value of zero (0) for any value means that
# that particular table should NOT be automatically trimmed.

# Keep timed events for 24 hours
max_timedevents_age=1440

# Keep system commands for 1 week
max_systemcommands_age=10080

# Keep service checks for 1 week
max_servicechecks_age=10080

# Keep host checks for 1 week
max_hostchecks_age=10080

# Keep event handlers for 31 days
max_eventhandlers_age=44640

# DEBUG LEVEL
# This option determines how much (if any) debugging information will
# be written to the debug file.  OR values together to log multiple
# types of information.
# Values: -1 = Everything
#          0 = Nothing
#          1 = Process info
#          2 = SQL queries

debug_level=0



# DEBUG VERBOSITY
# This option determines how verbose the debug log out will be.
# Values: 0 = Brief output
#         1 = More detailed
#         2 = Very detailed

debug_verbosity=1



# DEBUG FILE
# This option determines where the daemon should write debugging information.

debug_file=/usr/local/nagios/var/ndo2db.debug



# MAX DEBUG FILE SIZE
# This option determines the maximum size (in bytes) of the debug file.  If
# the file grows larger than this size, it will be renamed with a .old
# extension.  If a file already exists with a .old extension it will
# automatically be deleted.  This helps ensure your disk space usage doesn't
# get out of control when debugging.

max_debug_file_size=1000000

Code: Select all

[nagios@<host> etc]$ more ndomod.cfg
#####################################################################
# NDOMOD CONFIG FILE
#####################################################################


# INSTANCE NAME
# This option identifies the "name" associated with this particular
# instance of Nagios and is used to separate data coming from multiple
# instances.  Defaults to 'default' (without quotes).

instance_name=localhost



# OUTPUT TYPE
# This option determines what type of output sink the NDO NEB module
# should use for data output.  Valid options include:
#   file       = standard text file
#   tcpsocket  = TCP socket
#   unixsocket = UNIX domain socket (default)

#output_type=file
#output_type=tcpsocket
output_type=unixsocket



# OUTPUT
# This option determines the name and path of the file or UNIX domain
# socket to which output will be sent if the output type option specified
# above is "file" or "unixsocket", respectively.  If the output type
# option is "tcpsocket", this option is used to specify the IP address
# or fully qualified domain name of the host that the module should
# connect to for sending output.

#output=/usr/local/nagios/var/ndo.dat
#output=127.0.0.1
output=/usr/local/nagios/var/ndo.sock



# TCP PORT
# This option determines what port the module will connect to in
# order to send output.  This option is only valid if the output type
# option specified above is "tcpsocket".

tcp_port=5668



# OUTPUT BUFFER
# This option determines the size of the output buffer, which will help
# prevent data from getting lost if there is a temporary disconnect from
# the data sink.  The number of items specified here is the number of
# lines (each of variable size) of output that will be buffered.

output_buffer_items=5000



# BUFFER FILE
# This option is used to specify a file which will be used to store the
# contents of buffered data which could not be sent to the NDO2DB daemon
# before Nagios shuts down.  Prior to shutting down, the NDO NEB module
# will write all buffered data to this file for later processing.  When
# Nagios (re)starts, the NDO NEB module will read the contents of this
# file and send it to the NDO2DB daemon for processing.

buffer_file=/usr/local/nagios/var/ndomod.tmp



# FILE ROTATION INTERVAL
# This option determines how often (in seconds) the output file is
# rotated by Nagios.  File rotation is handled by Nagios by executing
# the command defined by the file_rotation_command option.  This
# option has no effect if the output_type option is a socket.

file_rotation_interval=14400



# FILE ROTATION COMMAND
# This option specified the command (as defined in Nagios) that is
# used to rotate the output file at the interval specified by the
# file_rotation_interval option.  This option has no effect if the
# output_type option is a socket.
#
# See the file 'misccommands.cfg' for an example command definition
# that you can use to rotate the log file.

#file_rotation_command=rotate_ndo_log



# FILE ROTATION TIMEOUT
# This option specified the maximum number of seconds that the file
# rotation command should be allowed to run before being prematurely
# terminated.

file_rotation_timeout=60



# RECONNECT INTERVAL
# This option determines how often (in seconds) that the NDO NEB
# module will attempt to re-connect to the output file or socket if
# a connection to it is lost.

reconnect_interval=15



# RECONNECT WARNING INTERVAL
# This option determines how often (in seconds) a warning message will
# be logged to the Nagios log file if a connection to the output file
# or socket cannot be re-established.

#reconnect_warning_interval=15
reconnect_warning_interval=900



# DATA PROCESSING OPTION
# This option determines what data the NDO NEB module will process.
# Do not mess with this option unless you know what you're doing!!!!
# Read the source code (include/ndbxtmod.h) to determine what values
# to use here.  Values from source code should be OR'ed to get the
# value to use here.  A value of -1 will cause all data to be processed.
# Read the source code (include/ndomod.h) and look for "NDOMOD_PROCESS_"
# to determine what values to use here.  Values from source code should
# be OR'ed to get the value to use here.  A value of -1 will cause all
# data to be processed.

# Process everything
#data_processing_options=-1

#no timed event, no host check, no service check
data_processing_options=67108669


# CONFIG OUTPUT OPTION
# This option determines what types of configuration data the NDO
# NEB module will dump from Nagios.  Values can be OR'ed together.
# Values:
#         0 = Don't dump any configuration information
#         1 = Dump only original config (from config files)
#         2 = Dump config only after retained information has been restored
#         3 = Dump both original and retained configuration

config_output_options=2

Code: Select all

[root@<host> ~]# sysctl -p
net.ipv4.ip_forward = 0
net.ipv4.conf.default.rp_filter = 1
net.ipv4.conf.default.accept_source_route = 0
kernel.sysrq = 0
kernel.core_uses_pid = 1
net.ipv4.tcp_syncookies = 1
error: "net.bridge.bridge-nf-call-ip6tables" is an unknown key
error: "net.bridge.bridge-nf-call-iptables" is an unknown key
error: "net.bridge.bridge-nf-call-arptables" is an unknown key
kernel.msgmnb = 393740288
kernel.msgmax = 131072000
kernel.shmmax = 4294967295
kernel.shmall = 268435456
kernel.randomize_va_space = 1
kernel.exec-shield = 1
emislivec
Posts: 52
Joined: Tue Feb 25, 2014 10:06 am

Re: error writing to data sink

Post by emislivec »

Since ndomod is able to connect to ndo2db and write some data, ndo2db may be dying after handling some data.

This isn't a networking problem, since ndomod is talking to ndo2db over a UNIX socket (/usr/local/nagios/var/ndo.sock). The DB configuration and connectivity seem fine, and ndo2db would handle no data at all if it were unable to connect to the DB.

Do you see any messages from ndo2db in your syslog? Is the DB being updated with new data?
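For the syslog question above, here is a minimal sketch of one way to pull out only the ndo2db lines. The path /var/log/messages is an assumption (typical on RHEL/CentOS; other systems may log to /var/log/syslog), and ndo2db_messages is a hypothetical helper name, not something shipped with NDOUtils.

```shell
#!/bin/sh
# Hypothetical helper: show the most recent syslog lines written by ndo2db.
#   $1 = syslog file (assumed default: /var/log/messages)
#   $2 = how many matching lines to show (default: 20)
# The pattern accepts both "ndo2db:" and "ndo2db[PID]:" tags.
ndo2db_messages() {
    grep -E ' ndo2db(\[[0-9]+\])?: ' "${1:-/var/log/messages}" | tail -n "${2:-20}"
}

# Example (reading syslog usually requires root):
#   ndo2db_messages /var/log/messages 5
```

If nothing is printed, either ndo2db is not logging errors or the daemon logs somewhere else on that system.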
kendallchenoweth
Posts: 195
Joined: Fri Sep 13, 2013 10:43 am

Re: error writing to data sink

Post by kendallchenoweth »

Dec 15 12:21:20 ip-10-154-25-117 nagios: ndomod: Please check remote ndo2db log, database connection or SSL Parameters
Dec 15 12:21:36 ip-10-154-25-117 nagios: ndomod: Successfully reconnected to data sink! 0 items lost, 97 queued items to flush.
Dec 15 12:21:36 ip-10-154-25-117 nagios: ndomod: Successfully flushed 97 queued items to data sink.
Dec 15 12:21:36 ip-10-154-25-117 ndo2db: Error: Could not connect to MySQL database: Unknown MySQL server host 'nagiosxi-qa01.cyomeuveb6ni.us-east-1.rds.amazonaws.com ' (1)
Dec 15 12:21:36 ip-10-154-25-117 ndo2db: Error: Could not connect to MySQL database: Unknown MySQL server host 'nagiosxi-qa01.cyomeuveb6ni.us-east-1.rds.amazonaws.com ' (1)
Dec 15 12:21:36 ip-10-154-25-117 ndo2db: Error: Could not connect to MySQL database: Unknown MySQL server host 'nagiosxi-qa01.cyomeuveb6ni.us-east-1.rds.amazonaws.com ' (1)
Dec 15 12:21:36 ip-10-154-25-117 ndo2db: Error: Could not connect to MySQL database: Unknown MySQL server host 'nagiosxi-qa01.cyomeuveb6ni.us-east-1.rds.amazonaws.com ' (1)
Dec 15 12:21:36 ip-10-154-25-117 ndo2db: Error: Could not connect to MySQL database: Unknown MySQL server host 'nagiosxi-qa01.cyomeuveb6ni.us-east-1.rds.amazonaws.com ' (1)
Dec 15 12:21:37 ip-10-154-25-117 nagios: ndomod: Error writing to data sink! Some output may get lost...
Dec 15 12:21:37 ip-10-154-25-117 nagios: ndomod: Please check remote ndo2db log, database connection or SSL Parameters
Dec 15 12:21:53 ip-10-154-25-117 nagios: ndomod: Successfully reconnected to data sink! 0 items lost, 53 queued items to flush.
Dec 15 12:21:53 ip-10-154-25-117 nagios: ndomod: Successfully flushed 53 queued items to data sink.
Dec 15 12:21:53 ip-10-154-25-117 ndo2db: Error: Could not connect to MySQL database: Unknown MySQL server host 'nagiosxi-qa01.cyomeuveb6ni.us-east-1.rds.amazonaws.com ' (1)
Dec 15 12:21:53 ip-10-154-25-117 ndo2db: Error: Could not connect to MySQL database: Unknown MySQL server host 'nagiosxi-qa01.cyomeuveb6ni.us-east-1.rds.amazonaws.com ' (1)
Dec 15 12:21:53 ip-10-154-25-117 ndo2db: Error: Could not connect to MySQL database: Unknown MySQL server host 'nagiosxi-qa01.cyomeuveb6ni.us-east-1.rds.amazonaws.com ' (1)
Dec 15 12:21:54 ip-10-154-25-117 nagios: ndomod: Error writing to data sink! Some output may get lost...
Dec 15 12:21:54 ip-10-154-25-117 nagios: ndomod: Please check remote ndo2db log, database connection or SSL Parameters
kendallchenoweth
Posts: 195
Joined: Fri Sep 13, 2013 10:43 am

Re: error writing to data sink

Post by kendallchenoweth »

Nagios Log

[1418664179] ndomod: Error writing to data sink! Some output may get lost...
[1418664179] ndomod: Please check remote ndo2db log, database connection or SSL Parameters
[1418664196] ndomod: Successfully reconnected to data sink! 0 items lost, 54 queued items to flush.
[1418664196] ndomod: Successfully flushed 54 queued items to data sink.
[1418664196] ndomod: Error writing to data sink! Some output may get lost...
[1418664196] ndomod: Please check remote ndo2db log, database connection or SSL Parameters
[1418664212] ndomod: Successfully reconnected to data sink! 0 items lost, 88 queued items to flush.
[1418664212] ndomod: Successfully flushed 88 queued items to data sink.
[1418664212] ndomod: Error writing to data sink! Some output may get lost...
[1418664212] ndomod: Please check remote ndo2db log, database connection or SSL Parameters
[1418664228] ndomod: Successfully reconnected to data sink! 0 items lost, 54 queued items to flush.
[1418664228] ndomod: Successfully flushed 54 queued items to data sink.
[1418664229] ndomod: Error writing to data sink! Some output may get lost...
[1418664229] ndomod: Please check remote ndo2db log, database connection or SSL Parameters
emislivec
Posts: 52
Joined: Tue Feb 25, 2014 10:06 am

Re: error writing to data sink

Post by emislivec »

kendallchenoweth wrote: Dec 15 12:21:36 ip-10-154-25-117 ndo2db: Error: Could not connect to MySQL database: Unknown MySQL server host 'nagiosxi-qa01.cyomeuveb6ni.us-east-1.rds.amazonaws.com ' (1)
So it looks like there is an issue with your DB configuration. (I thought that would result in ndo2db not reading any input from ndomod, but that seems not to be the case.)

Code: Select all

'nagiosxi-qa01.cyomeuveb6ni.us-east-1.rds.amazonaws.com '
That space at the end is odd. ndo2db is picky about its config input. Is there a space at the end of the db_host line in your ndo2db.cfg? It looks like a small detail to us, but it could make the difference for ndo2db.
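A trailing space like this is easy to miss in an editor. It also would not show up in a mysql CLI test, because the shell drops unquoted trailing whitespace before mysql ever sees the hostname. Here is a minimal sketch for spotting and removing it; find_trailing_ws and strip_trailing_ws are hypothetical helper names, and the default path is the ndo2db.cfg location implied by the config paths in this thread.

```shell
#!/bin/sh
# Hypothetical helper: print "line-number:content" for every line that
# ends in whitespace (space, tab, or a stray carriage return).
#   $1 = config file (assumed default: /usr/local/nagios/etc/ndo2db.cfg)
find_trailing_ws() {
    grep -n '[[:space:]]$' "${1:-/usr/local/nagios/etc/ndo2db.cfg}"
}

# Hypothetical helper: strip trailing whitespace from every line of $1,
# keeping the original as $1.bak so the change can be reverted.
strip_trailing_ws() {
    cp -p "$1" "$1.bak" && sed 's/[[:space:]]*$//' "$1.bak" > "$1"
}

# Example:
#   find_trailing_ws                       # list offending lines, if any
#   strip_trailing_ws /usr/local/nagios/etc/ndo2db.cfg
```

ndo2db presumably only reads its config file at startup, so the daemon would need a restart after the file is cleaned.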
kendallchenoweth
Posts: 195
Joined: Fri Sep 13, 2013 10:43 am

Re: error writing to data sink

Post by kendallchenoweth »

I'm going to do some more checking, but I think that probably solved the problem. Thanks!
sreinhardt
-fno-stack-protector
Posts: 4366
Joined: Mon Nov 19, 2012 12:10 pm

Re: error writing to data sink

Post by sreinhardt »

One other thing to note is that AWS restricts the number of RDS connections based on the size of your instance. I don't recall exact numbers, but even a small Nagios XI system needs a medium or larger RDS instance to handle all the connections. I have limited experience working with RDS instances specifically, but the few times we have used them, it has generally been cheaper, and our suggestion, to host MySQL on another regular instance instead.
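To see how close a server is to its connection ceiling, MySQL exposes the limit as the max_connections variable and the current count as the Threads_connected status counter. Here is a minimal sketch for reading both from the Nagios host; value_of is a hypothetical helper, and the mysql invocations in the comments reuse the placeholder host and the ndoutils user from earlier in this thread.

```shell
#!/bin/sh
# Hypothetical helper: read "name<TAB>value" rows on stdin (the shape of
# `mysql -N -B` output) and print the value for the requested name.
value_of() {
    awk -F '\t' -v name="$1" '$1 == name { print $2 }'
}

# Example, substituting your own host (each command prompts for the password):
#   mysql -u ndoutils -h <host> -p -N -B \
#         -e "SHOW VARIABLES LIKE 'max_connections'" | value_of max_connections
#   mysql -u ndoutils -h <host> -p -N -B \
#         -e "SHOW GLOBAL STATUS LIKE 'Threads_connected'" | value_of Threads_connected
```

If the second number sits near the first while Nagios XI is running, the instance size may be the limiting factor.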
emislivec
Posts: 52
Joined: Tue Feb 25, 2014 10:06 am

Re: error writing to data sink

Post by emislivec »

kendallchenoweth wrote: I'm going to do some more checking, but I think that probably solved the problem. Thanks!
Was the extra space the problem? I'll fix this for the next release of ndoutils.