Re: XI web interface goes into a bad state
Posted: Fri Mar 14, 2014 4:46 pm
by uidaho
Okay, so your hosts/services are all getting greyed out and tell you that they are pending?
The hosts are gray, and perpetually in a "Check is pending" state. The services are in various states, possibly the last state they were in when the Nagios server went down?
After a restart of nagios?
A restart of Nagios after the Nagios server has rebooted fixes the problem. Specifically, under Monitoring Process in XI, we stop and then start the process state.
Do they eventually go back to their states once checks are run, or are checks not even being scheduled when you look at their details pages?
In XI, they remain in this state for a considerable amount of time (longer than our scheduled check intervals). It may be possible that they eventually go into a correct state, but this has not been observed. If we look in Nagios Core, the states are accurate and being updated.
Do you recall disabling state retention options on your hosts/services or in the nagios.cfg?
I don't recall disabling state retention. Here are some possibly related entries from our nagios.cfg:
Code:
retained_contact_host_attribute_mask=0
retained_contact_service_attribute_mask=0
retained_host_attribute_mask=0
retained_process_host_attribute_mask=0
retained_process_service_attribute_mask=0
retained_service_attribute_mask=0
retain_state_information=1
retention_update_interval=60
service_check_timeout=60
service_freshness_check_interval=60
service_inter_check_delay_method=s
service_interleave_factor=s
sleep_time=0.25
soft_state_dependencies=1
state_retention_file=/usr/local/nagios/var/retention.dat
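For reference, here is a minimal sketch of how the retention-related entries above could be checked programmatically. The `parse_cfg` and `retention_enabled` helpers are hypothetical names used only for illustration and are not part of Nagios. The parser assumes plain `key=value` lines with `#` comments, and the sample text is a subset of the entries listed above.

```python
def parse_cfg(text):
    """Parse simple key=value lines, skipping blank lines and # comments.

    Assumption: each key appears once. If a key repeats, the last value wins.
    """
    settings = {}
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, value = line.split("=", 1)
        settings[key.strip()] = value.strip()
    return settings


def retention_enabled(settings):
    """True when state retention is switched on and a retention file is set."""
    return (settings.get("retain_state_information") == "1"
            and bool(settings.get("state_retention_file")))


# Subset of the nagios.cfg entries quoted in this post.
sample = """\
retain_state_information=1
retention_update_interval=60
state_retention_file=/usr/local/nagios/var/retention.dat
"""

if __name__ == "__main__":
    print(retention_enabled(parse_cfg(sample)))
```

With the entries shown above, this check passes, which is consistent with state retention not having been disabled in nagios.cfg. It does not look at per-host or per-service retention options.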
Before I increase the debug level - is this a benign state to leave Nagios running in, or should this be only for short amounts of time?
Thank you!
Re: XI web interface goes into a bad state
Posted: Mon Mar 17, 2014 10:40 am
by lmiltchev
Before I increase the debug level - is this a benign state to leave Nagios running in, or should this be only for short amounts of time?
Usually, you want to switch to debug mode temporarily (while troubleshooting the issue). It is not a good idea to use it permanently, since your log files can grow out of control.
Re: XI web interface goes into a bad state
Posted: Wed Mar 26, 2014 11:49 am
by uidaho
I've modified the debug settings for ndo2db. After a restart of the service, ndo2db.debug remains empty. Upon a reboot of the host, the XI interface is in the strange state, and ndo2db.debug remains empty. To put the interface in a good state, I stop the monitoring process through the XI interface, and then start it. ndo2db.debug remains empty.
Here are the log entries for when the host was rebooted:
Code:
Process Information    2014-03-26 09:31:36    Successfully shutdown... (PID=8774)
Process Information    2014-03-26 09:31:36    Caught SIGTERM, shutting down...
Information            2014-03-26 09:31:36    ndomod: Please check remote ndo2db log, database connection or SSL Parameters
Information            2014-03-26 09:31:36    ndomod: Error writing to data sink! Some output may get lost...
I've reverted the debug settings for ndo2db. Please let me know what other tests I can run and what information I can gather.
Re: XI web interface goes into a bad state
Posted: Wed Mar 26, 2014 4:38 pm
by scottwilkerson
It sounds like you have some kind of a race condition, where ndo2db could be taking longer than expected to start.
Could you post your ndo2db.cfg configuration, obfuscating any DB passwords?
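One way such a race could be worked around, sketched here under stated assumptions, is to wait for the ndo2db socket to appear before starting Nagios. The function names are hypothetical, and the socket path is an assumption based on a common default location; it should be replaced with the `socket_name` value from the local ndo2db.cfg.

```python
import os
import stat
import time


def is_unix_socket(path):
    """Return True if path exists and is a Unix domain socket."""
    try:
        return stat.S_ISSOCK(os.stat(path).st_mode)
    except OSError:
        return False


def wait_for_socket(path, timeout=30.0, interval=0.5, is_ready=is_unix_socket):
    """Poll until is_ready(path) is true or the timeout (seconds) elapses."""
    deadline = time.monotonic() + timeout
    while True:
        if is_ready(path):
            return True
        if time.monotonic() >= deadline:
            return False
        time.sleep(interval)


if __name__ == "__main__":
    # Assumed path; use the socket_name value from your own ndo2db.cfg.
    ready = wait_for_socket("/usr/local/nagios/var/ndo.sock", timeout=2.0)
    print("ndo2db socket ready" if ready else "ndo2db socket not found")
```

This only checks that the socket file exists. A socket file left behind by an unclean shutdown would also pass, so it is a rough readiness signal rather than proof that ndo2db is accepting connections.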
Re: XI web interface goes into a bad state
Posted: Wed Mar 26, 2014 6:24 pm
by uidaho
Here is our config file.
Code:
#####################################################################
# NDO2DB DAEMON CONFIG FILE
#####################################################################
lock_file=/usr/local/nagios/var/ndo2db.lock
ndo2db_user=nagios
ndo2db_group=nagios
socket_type=unix
socket_name=/usr/local/nagios/var/ndo.sock
tcp_port=5668
db_servertype=mysql
db_host=localhost
db_port=3306
db_name=nagios
db_prefix=nagios_
db_user=#########
db_pass=#########
## TABLE TRIMMING OPTIONS
# Several database tables containing Nagios event data can become quite large
# over time. Most admins will want to trim these tables and keep only a
# certain amount of data in them. The options below are used to specify the
# age (in MINUTES) that data should be allowed to remain in various tables
# before it is deleted. Using a value of zero (0) for any value means that
# that particular table should NOT be automatically trimmed.
# Keep timed events for 24 hours
max_timedevents_age=1440
# Keep system commands for 1 week
max_systemcommands_age=10080
# Keep service checks for 1 week
max_servicechecks_age=10080
# Keep host checks for 1 week
max_hostchecks_age=10080
# Keep event handlers for 31 days
max_eventhandlers_age=44640
# DEBUG LEVEL
# This option determines how much (if any) debugging information will
# be written to the debug file. OR values together to log multiple
# types of information.
# Values: -1 = Everything
#          0 = Nothing
#          1 = Process info
#          2 = SQL queries
debug_level=0
#debug_level=1
# DEBUG VERBOSITY
# This option determines how verbose the debug log output will be.
# Values: 0 = Brief output
#         1 = More detailed
#         2 = Very detailed
debug_verbosity=1
#debug_verbosity=2
# DEBUG FILE
# This option determines where the daemon should write debugging information.
debug_file=/usr/local/nagios/var/ndo2db.debug
# MAX DEBUG FILE SIZE
# This option determines the maximum size (in bytes) of the debug file. If
# the file grows larger than this size, it will be renamed with a .old
# extension. If a file already exists with a .old extension it will
# automatically be deleted. This helps ensure your disk space usage doesn't
# get out of control when debugging.
max_debug_file_size=1000000
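As a quick sanity check on the table trimming values above, which are given in minutes, the following sketch converts each one back to days and hours so it can be compared with its comment. The `describe_age` helper is a hypothetical name used only for this illustration.

```python
MINUTES_PER_HOUR = 60
MINUTES_PER_DAY = 24 * MINUTES_PER_HOUR

# Values copied from the ndo2db.cfg above (all in minutes).
max_ages = {
    "max_timedevents_age": 1440,
    "max_systemcommands_age": 10080,
    "max_servicechecks_age": 10080,
    "max_hostchecks_age": 10080,
    "max_eventhandlers_age": 44640,
}


def describe_age(minutes):
    """Render a trimming age in minutes as days, hours, and minutes."""
    if minutes == 0:
        # Per the config comments, zero means the table is not trimmed.
        return "never trimmed"
    days, rem = divmod(minutes, MINUTES_PER_DAY)
    hours, mins = divmod(rem, MINUTES_PER_HOUR)
    parts = []
    if days:
        parts.append(f"{days} d")
    if hours:
        parts.append(f"{hours} h")
    if mins:
        parts.append(f"{mins} min")
    return " ".join(parts)


if __name__ == "__main__":
    for name, minutes in max_ages.items():
        print(f"{name}: {describe_age(minutes)}")
```

The values work out to 1 day (24 hours), 7 days, and 31 days, which matches the comments in the file.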
Re: XI web interface goes into a bad state
Posted: Thu Mar 27, 2014 1:58 pm
by lmiltchev
Can you run the following commands and show the output?
Code:
grep dbserver /usr/local/nagiosxi/html/config.inc.php
grep server /var/www/html/nagiosql/config/settings.php
Re: XI web interface goes into a bad state
Posted: Thu Mar 27, 2014 5:09 pm
by uidaho
Here is what I ran and the output:
Code:
$ grep dbserver /usr/local/nagiosxi/html/config.inc.php
$cfg['dbserver']='localhost'; // this setting is no longer used - use settings below
"dbserver" => 'localhost',
"dbserver" => 'localhost',
"dbserver" => 'localhost',
$ grep server /var/www/html/nagiosql/config/settings.php
server = localhost
Re: XI web interface goes into a bad state
Posted: Fri Mar 28, 2014 10:50 am
by lmiltchev
This looks correct. I was hoping to see something else.
At this point, I would recommend that you open an email support ticket in our system by emailing us at
[email protected]. We will probably have to schedule a remote session to further troubleshoot your problem.