Database Backend Is Not Running

This support forum board is for support questions relating to Nagios XI, our flagship commercial network monitoring solution.
mcwhorts
Posts: 60
Joined: Fri Oct 07, 2011 11:59 am

Re: Database Backend Is Not Running

Post by mcwhorts »

I'm starting to see this now. [Attachment not available]
jolson
Attack Rabbit
Posts: 2560
Joined: Thu Feb 12, 2015 12:40 pm

Re: Database Backend Is Not Running

Post by jolson »

In your ndo2db configuration file:

Code:

lock_file=/usr/local/nagiosxi/var/subsys/ndo2db.lock

Let's change the debug level to -1:

Code:

debug_level=-1

And restart it:

Code:

service ndo2db restart

Is any log generated at /usr/local/nagios/var/ndo2db.debug? If so, what are the contents?

Code:

cat /usr/local/nagios/var/ndo2db.debug
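
Before changing anything, it can help to confirm what the config file currently sets. The following is a minimal shell sketch and not part of the original reply: `cfg_value` is a hypothetical helper name, and it assumes the file uses plain `key=value` lines like the snippets above.

```shell
#!/bin/sh
# Print the value of a key=value directive from a config file.
# Lines starting with '#' do not match, so commented-out settings are skipped.
# If the key appears more than once, the last occurrence is printed
# (an assumption about which one matters).
cfg_value() {
    key=$1
    file=$2
    sed -n "s/^[[:space:]]*${key}[[:space:]]*=[[:space:]]*//p" "$file" | tail -n 1
}

# Example calls (the config path is the one mentioned later in this thread):
#   cfg_value lock_file   /usr/local/nagios/etc/ndo2db.cfg
#   cfg_value debug_level /usr/local/nagios/etc/ndo2db.cfg
```

Running both example calls after the edit and the restart would show whether the file really contains the values you expect.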
mcwhorts
Posts: 60
Joined: Fri Oct 07, 2011 11:59 am

Re: Database Backend Is Not Running

Post by mcwhorts »

I made those changes to the ndo2db config and restarted ndo2db. So far, nothing has been logged to /usr/local/nagios/var/ndo2db.debug.
mcwhorts
Posts: 60
Joined: Fri Oct 07, 2011 11:59 am

Re: Database Backend Is Not Running

Post by mcwhorts »

I'm seeing this in the logs. I don't know if it's useful.

[1436994795.390529] [002.0] [pid=9722] INSERT INTO nagios_hoststatus SET instance_id='1', host_object_id='5604', status_update_time=FROM_UNIXTIME(1436994795), output='PING OK - Packet loss = 0%, RTA = 0\.85 ms', long_output='', perfdata='rta=0\.853000ms;3000\.000000;5000\.000000;0\.000000 pl=0%;80;100;0', current_state='0', has_been_checked='1', should_be_scheduled='1', current_check_attempt='1', max_check_attempts='3', last_check=FROM_UNIXTIME(1436994790), next_check=FROM_UNIXTIME(1436994975), check_type='0', last_state_change=FROM_UNIXTIME(1435803964), last_hard_state_change=FROM_UNIXTIME(1434561932), last_hard_state='0', last_time_up=FROM_UNIXTIME(1436994795), last_time_down=FROM_UNIXTIME(1430592573), last_time_unreachable=FROM_UNIXTIME(1435803922), state_type='1', last_notification=FROM_UNIXTIME(0), next_notification=FROM_UNIXTIME(0), no_more_notifications='0', notifications_enabled='1', problem_has_been_acknowledged='0', acknowledgement_type='0', current_notification_number='0', passive_checks_enabled='1', active_checks_enabled='1', event_handler_enabled='1', flap_detection_enabled='1', is_flapping='0', percent_state_change='0.000000', latency='0.000000', execution_time='4.502390', scheduled_downtime_depth='0', failure_prediction_enabled='0', process_performance_data='1', obsess_over_host='1', modified_host_attributes='0', event_handler='', check_command='check_nrpe_jpnmon!check_alive_nsin!!!!!!!', normal_check_interval='3.000000', retry_check_interval='1.000000', check_timeperiod_object_id='2' ON DUPLICATE KEY UPDATE instance_id='1', host_object_id='5604', status_update_time=FROM_UNIXTIME(1436994795), output='PING OK - Packet loss = 0%, RTA = 0\.85 ms', long_output='', perfdata='rta=0\.853000ms;3000\.000000;5000\.000000;0\.000000 pl=0%;80;100;0', current_state='0', has_been_checked='1', should_be_scheduled='1', current_check_attempt='1', max_check_attempts='3', last_check=FROM_UNIXTIME(1436994790), next_check=FROM_UNIXTIME(1436994975), check_type='0', 
last_state_change=FROM_UNIXTIME(1435803964), last_hard_state_change=FROM_UNIXTIME(1434561932), last_hard_state='0', last_time_up=FROM_UNIXTIME(1436994795), last_time_down=FROM_UNIXTIME(1430592573), last_time_unreachable=FROM_UNIXTIME(1435803922), state_type='1', last_notification=FROM_UNIXTIME(0), next_notification=FROM_UNIXTIME(0), no_more_notifications='0', notifications_enabled='1', problem_has_been_acknowledged='0', acknowledgement_type='0', current_notification_number='0', passive_checks_enabled='1', active_checks_enabled='1', event_handler_enabled='1', flap_detection_enabled='1', is_flapping='0', percent_state_change='0.000000', latency='0.000000', execution_time='4.502390', scheduled_downtime_depth='0', failure_prediction_enabled='0', process_performance_data='1', obsess_over_host='1', modified_host_attributes='0', event_handler='', check_command='check_nrpe_jpnmon!check_alive_nsin!!!!!!!', normal_check_interval='3.000000', retry_check_interval='1.000000', check_timeperiod_object_id='2'
[1436994795.391147] [002.0] [pid=9722] INSERT INTO nagios_hoststatus SET instance_id='1', host_object_id='5604', status_update_time=FROM_UNIXTIME(1436994795), output='PING OK - Packet loss = 0%, RTA = 0\.85 ms', long_output='', perfdata='rta=0\.853000ms;3000\.000000;5000\.000000;0\.000000 pl=0%;80;100;0', current_state='0', has_been_checked='1', should_be_scheduled='1', current_check_attempt='1', max_check_attempts='3', last_check=FROM_UNIXTIME(1436994790), next_check=FROM_UNIXTIME(1436994975), check_type='0', last_state_change=FROM_UNIXTIME(1435803964), last_hard_state_change=FROM_UNIXTIME(1434561932), last_hard_state='0', last_time_up=FROM_UNIXTIME(1436994795), last_time_down=FROM_UNIXTIME(1430592573), last_time_unreachable=FROM_UNIXTIME(1435803922), state_type='1', last_notification=FROM_UNIXTIME(0), next_notification=FROM_UNIXTIME(0), no_more_notifications='0', notifications_enabled='1', problem_has_been_acknowledged='0', acknowledgement_type='0', current_notification_number='0', passive_checks_enabled='1', active_checks_enabled='1', event_handler_enabled='1', flap_detection_enabled='1', is_flapping='0', percent_state_change='0.000000', latency='0.000000', execution_time='4.502390', scheduled_downtime_depth='0', failure_prediction_enabled='0', process_performance_data='1', obsess_over_host='1', modified_host_attributes='0', event_handler='', check_command='check_nrpe_jpnmon!check_alive_nsin!!!!!!!', normal_check_interval='3.000000', retry_check_interval='1.000000', check_timeperiod_object_id='2' ON DUPLICATE KEY UPDATE instance_id='1', host_object_id='5604', status_update_time=FROM_UNIXTIME(1436994795), output='PING OK - Packet loss = 0%, RTA = 0\.85 ms', long_output='', perfdata='rta=0\.853000ms;3000\.000000;5000\.000000;0\.000000 pl=0%;80;100;0', current_state='0', has_been_checked='1', should_be_scheduled='1', current_check_attempt='1', max_check_attempts='3', last_check=FROM_UNIXTIME(1436994790), next_check=FROM_UNIXTIME(1436994975), check_type='0', 
last_state_change=FROM_UNIXTIME(1435803964), last_hard_state_change=FROM_UNIXTIME(1434561932), last_hard_state='0', last_time_up=FROM_UNIXTIME(1436994795), last_time_down=FROM_UNIXTIME(1430592573), last_time_unreachable=FROM_UNIXTIME(1435803922), state_type='1', last_notification=FROM_UNIXTIME(0), next_notification=FROM_UNIXTIME(0), no_more_notifications='0', notifications_enabled='1', problem_has_been_acknowledged='0', acknowledgement_type='0', current_notification_number='0', passive_checks_enabled='1', active_checks_enabled='1', event_handler_enabled='1', flap_detection_enabled='1', is_flapping='0', percent_state_change='0.000000', latency='0.000000', execution_time='4.502390', scheduled_downtime_depth='0', failure_prediction_enabled='0', process_performance_data='1', obsess_over_host='1', modified_host_attributes='0', event_handler='', check_command='check_nrpe_jpnmon!check_alive_nsin!!!!!!!', normal_check_interval='3.000000', retry_check_interval='1.000000', check_timeperiod_object_id='2'


Is there something specific that I need to look for?
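
Entries like the ones above follow the pattern `[timestamp] [level] [pid=N] SQL statement`. Rather than reading each statement, one rough way to summarize such a log is to count the INSERT statements per table, which would at least show that ndo2db is generating queries. This is only a sketch: `insert_counts` is a hypothetical helper, and it assumes every entry starts with the three bracketed fields seen in the sample.

```shell
#!/bin/sh
# Count "INSERT INTO <table>" entries per table in an ndo2db debug log.
# Lines that do not start with the bracketed prefix (for example, wrapped
# continuation lines) are ignored.
insert_counts() {
    awk '
        /^\[[0-9.]+\] \[[0-9.]+\] \[pid=[0-9]+\] INSERT INTO / { n[$6]++ }
        END { for (t in n) print t, n[t] }
    ' "$1" | sort
}

# Example call (log path taken from earlier in this thread):
#   insert_counts /usr/local/nagios/var/ndo2db.debug
```

Each output line is a table name followed by the number of matching entries.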
tgriep
Madmin
Posts: 9190
Joined: Thu Oct 30, 2014 9:02 am

Re: Database Backend Is Not Running

Post by tgriep »

The ndo2db lock file is being created in the wrong folder, and that is causing the Database Backend status to be reported incorrectly.

Stop the ndo2db process:

Code:

service ndo2db stop

Edit /usr/local/nagios/etc/ndo2db.cfg and change

Code:

lock_file=/usr/local/nagiosxi/var/subsys/ndo2db.lock

to

Code:

lock_file=/usr/local/nagios/var/ndo2db.lock

Now delete the old files:

Code:

rm /usr/local/nagiosxi/var/subsys/ndo2db*
rm /usr/local/nagios/var/ndo2db.lock

Start ndo2db:

Code:

service ndo2db start

Run this too and post back the output:

Code:

tail -50 /var/log/cron
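
The lock_file change described in the steps above could also be scripted instead of edited by hand. The sketch below is an illustration only: `set_lock_file` is a hypothetical helper, it keeps a `.bak` copy of the config next to the original, and it assumes the new path contains no `|` or `&` characters (both are special in the sed expression). ndo2db should still be stopped first, as in the steps above.

```shell
#!/bin/sh
# Rewrite the lock_file directive in an ndo2db config file.
# A backup is written to <config>.bak before the file is changed.
# If the file has no lock_file line, its contents are left as they were.
set_lock_file() {
    cfg=$1
    new_path=$2
    cp -p "$cfg" "$cfg.bak" || return 1
    sed "s|^[[:space:]]*lock_file[[:space:]]*=.*|lock_file=${new_path}|" "$cfg.bak" > "$cfg"
}

# Example call, using the paths from the steps above:
#   set_lock_file /usr/local/nagios/etc/ndo2db.cfg /usr/local/nagios/var/ndo2db.lock
```

Writing back through a redirect keeps the original file's owner and permissions, which matters if the config is readable only by the nagios user.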
mcwhorts
Posts: 60
Joined: Fri Oct 07, 2011 11:59 am

Re: Database Backend Is Not Running

Post by mcwhorts »

Jul 15 12:47:01 niteowl CROND[366]: (root) CMD (ps -ef|grep -v grep |grep vmtoolsd > /dev/null || echo " To restart: vmtoolsd -b /var/run/vmtoolsd.pid" | mailer.sh -s "VMware tools down" -system root)
Jul 15 12:47:01 niteowl CROND[368]: (nagios) CMD (/usr/bin/php -q /usr/local/nagiosxi/cron/reportengine.php > /usr/local/nagiosxi/var/reportengine.log 2>&1)
Jul 15 12:47:01 niteowl CROND[369]: (nagios) CMD (/usr/bin/php -q /usr/local/nagiosxi/cron/feedproc.php > /usr/local/nagiosxi/var/feedproc.log 2>&1)
Jul 15 12:47:01 niteowl CROND[370]: (nagios) CMD (/usr/bin/php -q /usr/local/nagiosxi/cron/perfdataproc.php > /usr/local/nagiosxi/var/perfdataproc.log 2>&1)
Jul 15 12:47:01 niteowl CROND[371]: (nagios) CMD (/usr/bin/php -q /usr/local/nagiosxi/cron/nom.php > /usr/local/nagiosxi/var/nom.log 2>&1)
Jul 15 12:47:01 niteowl CROND[372]: (nagios) CMD (/usr/bin/php -q /usr/local/nagiosxi/cron/cmdsubsys.php > /usr/local/nagiosxi/var/cmdsubsys.log 2>&1)
Jul 15 12:47:01 niteowl CROND[380]: (nagios) CMD (/usr/bin/php -q /usr/local/nagiosxi/cron/sysstat.php > /usr/local/nagiosxi/var/sysstat.log 2>&1)
Jul 15 12:48:01 niteowl CROND[4148]: (nagios) CMD (/usr/bin/php -q /usr/local/nagiosxi/cron/nom.php > /usr/local/nagiosxi/var/nom.log 2>&1)
Jul 15 12:48:01 niteowl CROND[4149]: (nagios) CMD (/usr/bin/php -q /usr/local/nagiosxi/cron/reportengine.php > /usr/local/nagiosxi/var/reportengine.log 2>&1)
Jul 15 12:48:01 niteowl CROND[4150]: (nagios) CMD (/usr/bin/php -q /usr/local/nagiosxi/cron/feedproc.php > /usr/local/nagiosxi/var/feedproc.log 2>&1)
Jul 15 12:48:01 niteowl CROND[4151]: (nagios) CMD (/usr/bin/php -q /usr/local/nagiosxi/cron/eventman.php > /usr/local/nagiosxi/var/eventman.log 2>&1)
Jul 15 12:48:01 niteowl CROND[4152]: (nagios) CMD (/usr/bin/php -q /usr/local/nagiosxi/cron/perfdataproc.php > /usr/local/nagiosxi/var/perfdataproc.log 2>&1)
Jul 15 12:48:01 niteowl CROND[4153]: (nagios) CMD (/usr/bin/php -q /usr/local/nagiosxi/cron/cmdsubsys.php > /usr/local/nagiosxi/var/cmdsubsys.log 2>&1)
Jul 15 12:48:01 niteowl CROND[4155]: (nagios) CMD (/usr/bin/php -q /usr/local/nagiosxi/cron/cleaner.php > /usr/local/nagiosxi/var/cleaner.log 2>&1)
Jul 15 12:48:01 niteowl CROND[4154]: (root) CMD (/etc/webmin/sysstats/sysstats.pl)
Jul 15 12:48:01 niteowl CROND[4163]: (nagios) CMD (/usr/bin/php -q /usr/local/nagiosxi/cron/sysstat.php > /usr/local/nagiosxi/var/sysstat.log 2>&1)
Jul 15 12:49:01 niteowl CROND[8070]: (nagios) CMD (/usr/bin/php -q /usr/local/nagiosxi/cron/eventman.php > /usr/local/nagiosxi/var/eventman.log 2>&1)
Jul 15 12:49:01 niteowl CROND[8071]: (root) CMD (/etc/webmin/sysstats/sysstats.pl)
Jul 15 12:49:01 niteowl CROND[8072]: (nagios) CMD (/usr/bin/php -q /usr/local/nagiosxi/cron/cleaner.php > /usr/local/nagiosxi/var/cleaner.log 2>&1)
Jul 15 12:49:01 niteowl CROND[8073]: (nagios) CMD (/usr/bin/php -q /usr/local/nagiosxi/cron/reportengine.php > /usr/local/nagiosxi/var/reportengine.log 2>&1)
Jul 15 12:49:01 niteowl CROND[8075]: (nagios) CMD (/usr/bin/php -q /usr/local/nagiosxi/cron/cmdsubsys.php > /usr/local/nagiosxi/var/cmdsubsys.log 2>&1)
Jul 15 12:49:01 niteowl CROND[8078]: (nagios) CMD (/usr/bin/php -q /usr/local/nagiosxi/cron/perfdataproc.php > /usr/local/nagiosxi/var/perfdataproc.log 2>&1)
Jul 15 12:49:01 niteowl CROND[8079]: (nagios) CMD (/usr/bin/php -q /usr/local/nagiosxi/cron/feedproc.php > /usr/local/nagiosxi/var/feedproc.log 2>&1)
Jul 15 12:49:01 niteowl CROND[8080]: (nagios) CMD (/usr/bin/php -q /usr/local/nagiosxi/cron/nom.php > /usr/local/nagiosxi/var/nom.log 2>&1)
Jul 15 12:49:01 niteowl CROND[8082]: (root) CMD (differ.sh -a -f /var/adm/alert 2>&1 | mailer.sh -s "alert" -system [email protected],root`[ -f /.text ] && ( echo -n ,; cat /.text; )`)
Jul 15 12:49:01 niteowl CROND[8081]: (nagios) CMD (/usr/bin/php -q /usr/local/nagiosxi/cron/sysstat.php > /usr/local/nagiosxi/var/sysstat.log 2>&1)
Jul 15 12:50:01 niteowl CROND[11902]: (root) CMD (/etc/webmin/sysstats/sysstats.pl)
Jul 15 12:50:01 niteowl CROND[11903]: (nagios) CMD (/usr/bin/php -q /usr/local/nagiosxi/cron/cleaner.php > /usr/local/nagiosxi/var/cleaner.log 2>&1)
Jul 15 12:50:01 niteowl CROND[11904]: (nagios) CMD (/usr/bin/php -q /usr/local/nagiosxi/cron/dbmaint.php > /usr/local/nagiosxi/var/dbmaint.log 2>&1)
Jul 15 12:50:01 niteowl CROND[11905]: (nagios) CMD (/usr/bin/php -q /usr/local/nagiosxi/cron/reportengine.php > /usr/local/nagiosxi/var/reportengine.log 2>&1)
Jul 15 12:50:01 niteowl CROND[11906]: (nagios) CMD (/usr/bin/php -q /usr/local/nagiosxi/cron/nom.php > /usr/local/nagiosxi/var/nom.log 2>&1)
Jul 15 12:50:01 niteowl CROND[11909]: (nagios) CMD (/usr/bin/php -q /usr/local/nagiosxi/cron/deadpool.php > /usr/local/nagiosxi/var/deadpool.log 2>&1)
Jul 15 12:50:01 niteowl CROND[11912]: (nagios) CMD (/usr/bin/php -q /usr/local/nagiosxi/cron/perfdataproc.php > /usr/local/nagiosxi/var/perfdataproc.log 2>&1)
Jul 15 12:50:01 niteowl CROND[11914]: (root) CMD (LANG=C LC_ALL=C /usr/bin/mrtg /etc/mrtg/mrtg.cfg --lock-file /var/lock/mrtg/mrtg_l --confcache-file /var/lib/mrtg/mrtg.ok)
Jul 15 12:50:01 niteowl CROND[11911]: (root) CMD (/usr/lib64/sa/sa1 1 1)
Jul 15 12:50:01 niteowl CROND[11910]: (root) CMD (topper.sh)
Jul 15 12:50:01 niteowl CROND[11915]: (root) CMD (access-mon.sh -rebuild 30 | egrep -v 'itchy|10.33.34.95|dbaud|toptrack' >> /var/log/access.log 2>>/tmp/access-mon-err)
Jul 15 12:50:01 niteowl CROND[11921]: (nagios) CMD (/usr/bin/php -q /usr/local/nagiosxi/cron/cmdsubsys.php > /usr/local/nagiosxi/var/cmdsubsys.log 2>&1)
Jul 15 12:50:01 niteowl CROND[11919]: (nagios) CMD (/usr/bin/php -q /usr/local/nagiosxi/cron/feedproc.php > /usr/local/nagiosxi/var/feedproc.log 2>&1)
Jul 15 12:50:01 niteowl CROND[11916]: (nagios) CMD (/usr/bin/php -q /usr/local/nagiosxi/cron/eventman.php > /usr/local/nagiosxi/var/eventman.log 2>&1)
Jul 15 12:50:01 niteowl CROND[11926]: (nagios) CMD (/usr/bin/php -q /usr/local/nagiosxi/cron/sysstat.php > /usr/local/nagiosxi/var/sysstat.log 2>&1)
Jul 15 12:51:01 niteowl CROND[13619]: (nagios) CMD (/usr/bin/php -q /usr/local/nagiosxi/cron/cleaner.php > /usr/local/nagiosxi/var/cleaner.log 2>&1)
Jul 15 12:51:01 niteowl CROND[13620]: (nagios) CMD (/usr/bin/php -q /usr/local/nagiosxi/cron/sysstat.php > /usr/local/nagiosxi/var/sysstat.log 2>&1)
Jul 15 12:51:01 niteowl CROND[13621]: (root) CMD (/etc/webmin/sysstats/sysstats.pl)
Jul 15 12:51:01 niteowl CROND[13622]: (nagios) CMD (/usr/bin/php -q /usr/local/nagiosxi/cron/cmdsubsys.php > /usr/local/nagiosxi/var/cmdsubsys.log 2>&1)
Jul 15 12:51:01 niteowl CROND[13623]: (nagios) CMD (/usr/bin/php -q /usr/local/nagiosxi/cron/reportengine.php > /usr/local/nagiosxi/var/reportengine.log 2>&1)
Jul 15 12:51:01 niteowl CROND[13625]: (nagios) CMD (/usr/bin/php -q /usr/local/nagiosxi/cron/perfdataproc.php > /usr/local/nagiosxi/var/perfdataproc.log 2>&1)
Jul 15 12:51:01 niteowl CROND[13626]: (nagios) CMD (/usr/bin/php -q /usr/local/nagiosxi/cron/feedproc.php > /usr/local/nagiosxi/var/feedproc.log 2>&1)
Jul 15 12:51:01 niteowl CROND[13628]: (nagios) CMD (/usr/bin/php -q /usr/local/nagiosxi/cron/nom.php > /usr/local/nagiosxi/var/nom.log 2>&1)
Jul 15 12:51:01 niteowl CROND[13629]: (nagios) CMD (/usr/bin/php -q /usr/local/nagiosxi/cron/eventman.php > /usr/local/nagiosxi/var/eventman.log 2>&1)
mcwhorts
Posts: 60
Joined: Fri Oct 07, 2011 11:59 am

Re: Database Backend Is Not Running

Post by mcwhorts »

After running these commands again:
service nagios stop
killall -9 nagios
service ndo2db stop
service mysqld stop
service crond stop
service mysqld start
service ndo2db start
service nagios start
service crond start

It seems as though ndo2db finally came to life. Everything appears to be running as it should.
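
The restart order used above can be wrapped in a small function so the steps always run in the same sequence. This is a sketch, not a tested procedure: `restart_stack` and the `RUN` variable are hypothetical names, and the service names are the ones from this post, which may differ on other systems.

```shell
#!/bin/sh
# Run the stop/start sequence from this post in order.
# Set RUN=echo to print the commands instead of executing them.
restart_stack() {
    run=${RUN:-}
    $run service nagios stop
    $run killall -9 nagios      # force-kill any nagios processes still running
    $run service ndo2db stop
    $run service mysqld stop
    $run service crond stop
    $run service mysqld start   # database first, so ndo2db can connect
    $run service ndo2db start
    $run service nagios start
    $run service crond start
}

# Dry run:   RUN=echo restart_stack
# Real run:  restart_stack   (as root)
```

The dry run prints the nine commands in order, which is a cheap way to check the sequence before running it for real.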

Thanks for all your help guys!