ndo2db is not running ( NAGIOSXI )

Posted: Mon May 25, 2015 12:26 am
by Ashish Sood
Hi,

I have been trying to find a solution to my problem, but so far without success. I am using CentOS 7 and have installed Nagios XI on my system. On the Nagios home page there is an alert at the top right saying that ndo2db is not running. I have tried restarting it several times, but the problem persists.

OUTPUT
[root@localhost subsys]# service ndo2db status
ndo2db is not running but subsystem locked
NDO2db configuration file
[root@localhost subsys]# cat /usr/local/nagios/etc/ndo2db.cfg
#####################################################################
# NDO2DB DAEMON CONFIG FILE
#####################################################################


lock_file=/usr/local/nagios/var/ndo2db.lock

ndo2db_user=nagios
ndo2db_group=nagios

socket_type=unix

socket_name=/usr/local/nagios/var/ndo.sock

tcp_port=5668


db_servertype=mysql
db_host=localhost
db_port=3306

db_name=nagios
db_prefix=nagios_

db_user=ndoutils
db_pass=n@gweb



## TABLE TRIMMING OPTIONS
# Several database tables containing Nagios event data can become quite large
# over time. Most admins will want to trim these tables and keep only a
# certain amount of data in them. The options below are used to specify the
# age (in MINUTES) that data should be allowd to remain in various tables
# before it is deleted. Using a value of zero (0) for any value means that
# that particular table should NOT be automatically trimmed.

# Keep timed events for 24 hours
max_timedevents_age=1440

# Keep system commands for 1 week
max_systemcommands_age=10080

# Keep service checks for 1 week
max_servicechecks_age=10080

# Keep host checks for 1 week
max_hostchecks_age=10080

# Keep event handlers for 31 days
max_eventhandlers_age=44640





# DEBUG LEVEL
# This option determines how much (if any) debugging information will
# be written to the debug file. OR values together to log multiple
# types of information.
# Values: -1 = Everything
# 0 = Nothing
# 1 = Process info
# 2 = SQL queries

debug_level=0



# DEBUG VERBOSITY
# This option determines how verbose the debug log out will be.
# Values: 0 = Brief output
# 1 = More detailed
# 2 = Very detailed

debug_verbosity=1



# DEBUG FILE
# This option determines where the daemon should write debugging information.

debug_file=/usr/local/nagios/var/ndo2db.debug



# MAX DEBUG FILE SIZE
# This option determines the maximum size (in bytes) of the debug file. If
# the file grows larger than this size, it will be renamed with a .old
# extension. If a file already exists with a .old extension it will
# automatically be deleted. This helps ensure your disk space usage doesn't
# get out of control when debugging.

max_debug_file_size=1000000
Let me know what other information I can provide to help you find the root cause of the problem.

Thanks in advance,
Ashish

Re: ndo2db is not running ( NAGIOSXI )

Posted: Tue May 26, 2015 9:34 am
by tgriep
Remove the lock file, /usr/local/nagios/var/ndo2db.lock, and then restart the ndo2db process by running the following.

Code:

service ndo2db restart

Re: ndo2db is not running ( NAGIOSXI )

Posted: Wed May 27, 2015 4:22 am
by Ashish Sood
@tgriep

I am still having the same problem:
[root@localhost ~]# rm /usr/local/nagios/var/ndo2db.lock
rm: remove regular empty file ‘/usr/local/nagios/var/ndo2db.lock’? y
[root@localhost ~]# service ndo2db restart
Restarting ndo2db (via systemctl): [ OK ]
[root@localhost ~]# service ndo2db status
ndo2db is not running but subsystem locked
[root@localhost ~]#
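
One thing worth ruling out at this point: the "subsystem locked" wording may not refer to the lock_file from ndo2db.cfg at all. Red Hat-style init scripts conventionally keep a separate marker file under /var/lock/subsys, and whether the Nagios XI init script does this for ndo2db is an assumption, not something confirmed in this thread. Below is a minimal sketch that reports on whichever candidate lock paths are passed in. The function name is made up for illustration, and the second path in the example comment is a guess.

```shell
#!/bin/sh
# report_locks FILE...
# Hypothetical helper: prints one line per candidate lock file.
# Output is "<path>: present (<N> bytes)" or "<path>: missing".
# Anything that is not a regular file is reported as missing.
report_locks() {
    for f in "$@"; do
        if [ -f "$f" ]; then
            # Reading from stdin makes wc print only the byte count
            size=$(wc -c < "$f" | tr -d ' ')
            echo "$f: present ($size bytes)"
        else
            echo "$f: missing"
        fi
    done
}

# Example (the /var/lock/subsys path is an assumption):
#   report_locks /usr/local/nagios/var/ndo2db.lock /var/lock/subsys/ndo2db
```

If a marker under /var/lock/subsys does turn up while no ndo2db process is running, it would need removing as well before the status message can change.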

Re: ndo2db is not running ( NAGIOSXI )

Posted: Wed May 27, 2015 11:17 am
by abrist
Let's turn on ndo2db debugging, try to restart the process, and then, when it fails, post the debug output:
Edit:

Code:

/usr/local/nagios/etc/ndo2db.cfg
Change:

Code:

debug_level=0
To:

Code:

debug_level=-1
And change:

Code:

debug_verbosity=1
To:

Code:

debug_verbosity=2
Save the file.
Remove the lock file if it exists:

Code:

rm -f /usr/local/nagios/var/ndo2db.lock
And then restart ndo2db:

Code:

service ndo2db start
Once it fails, get a tail of the following files and post the output:

Code:

tail -25 /usr/local/nagios/var/ndo2db.debug
tail -25 /var/log/messages
tail -25 /usr/local/nagios/var/nagios.log
Once you have the debug output, do not forget to decrease the debug level in ndo2db.cfg back to defaults.
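
If you would rather script the two config changes than edit by hand, here is a minimal sketch. The helper name is hypothetical and the approach (sed on an existing KEY=VALUE line) is an assumption about what is convenient, not an official Nagios tool. It keeps a one-time backup next to the file and leaves commented lines alone.

```shell
#!/bin/sh
# set_cfg_option FILE KEY VALUE
# Hypothetical helper: rewrites an existing uncommented "KEY=..." line in FILE.
# The first call saves the untouched original as FILE.bak.
# VALUE must not contain '|' or '&', which are special in the sed expression.
set_cfg_option() {
    file=$1
    key=$2
    value=$3
    # Keep the original only once, so repeated calls do not overwrite it
    [ -e "$file.bak" ] || cp -p "$file" "$file.bak" || return 1
    tmp=$(mktemp) || return 1
    if sed "s|^${key}=.*|${key}=${value}|" "$file" > "$tmp"; then
        # cat into the existing file so its owner and mode are preserved
        cat "$tmp" > "$file"
        status=$?
    else
        status=1
    fi
    rm -f "$tmp"
    return $status
}

# Example, using the path and values from the steps above:
#   set_cfg_option /usr/local/nagios/etc/ndo2db.cfg debug_level -1
#   set_cfg_option /usr/local/nagios/etc/ndo2db.cfg debug_verbosity 2
```

Running the same helper with debug_level 0 and debug_verbosity 1 afterwards puts the settings back to the values shown in the posted config.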

Was this system restored from a backup onto CentOS/RHEL 7, by chance? I ask because you may have issues with the old MySQL libraries not existing, since CentOS/RHEL 7 moved to MariaDB. If that is the case, let us know and I will get you the steps to rebuild ndo from the nagiosxi tarball.
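
A quick way to check for the library problem is to ask the dynamic linker which shared libraries the daemon cannot resolve. This is only a sketch: the helper name is made up, and the binary path in the example comment (/usr/local/nagios/bin/ndo2db) is the usual source-install location, assumed here rather than confirmed in this thread.

```shell
#!/bin/sh
# missing_libs BINARY
# Hypothetical helper: prints the shared libraries that ldd cannot resolve
# for BINARY, one per line. Prints nothing when every library resolves
# (or when ldd cannot read the file at all).
missing_libs() {
    ldd "$1" 2>/dev/null | grep 'not found' || true
}

# Example (binary path is an assumption):
#   missing_libs /usr/local/nagios/bin/ndo2db
```

Any output line that names a MySQL client library next to "not found" would point towards the rebuild mentioned above.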

Re: ndo2db is not running ( NAGIOSXI )

Posted: Wed May 27, 2015 7:27 pm
by Ashish Sood
There is no such file at the specified location:
tail -25 /usr/local/nagios/var/ndo2db.debug

tail -25 /var/log/messages
[root@localhost ~]# tail -25 /var/log/messages
May 28 05:55:02 localhost systemd: Started Session 4053 of user nagios.
May 28 05:55:02 localhost systemd: Starting Session 4052 of user nagios.
May 28 05:55:02 localhost systemd: Started Session 4052 of user nagios.
May 28 05:55:02 localhost systemd: Starting Session 4049 of user nagios.
May 28 05:55:02 localhost systemd: Started Session 4049 of user nagios.
May 28 05:55:02 localhost systemd: Starting Session 4047 of user nagios.
May 28 05:55:02 localhost systemd: Started Session 4047 of user nagios.
May 28 05:55:02 localhost systemd: Starting Session 4048 of user nagios.
May 28 05:55:02 localhost systemd: Started Session 4048 of user nagios.
May 28 05:56:01 localhost systemd: Starting Session 4063 of user nagios.
May 28 05:56:01 localhost systemd: Started Session 4063 of user nagios.
May 28 05:56:01 localhost systemd: Starting Session 4062 of user nagios.
May 28 05:56:01 localhost systemd: Started Session 4062 of user nagios.
May 28 05:56:01 localhost systemd: Starting Session 4064 of user nagios.
May 28 05:56:01 localhost systemd: Started Session 4064 of user nagios.
May 28 05:56:01 localhost systemd: Starting Session 4061 of user nagios.
May 28 05:56:01 localhost systemd: Started Session 4061 of user nagios.
May 28 05:56:01 localhost systemd: Starting Session 4058 of user nagios.
May 28 05:56:01 localhost systemd: Started Session 4058 of user nagios.
May 28 05:56:01 localhost systemd: Starting Session 4060 of user nagios.
May 28 05:56:01 localhost systemd: Started Session 4060 of user nagios.
May 28 05:56:01 localhost systemd: Starting Session 4057 of user nagios.
May 28 05:56:01 localhost systemd: Started Session 4057 of user nagios.
May 28 05:56:01 localhost systemd: Starting Session 4059 of user nagios.
May 28 05:56:01 localhost systemd: Started Session 4059 of user nagios.

tail -25 /usr/local/nagios/var/nagios.log
[root@localhost ~]# tail -25 /usr/local/nagios/var/nagios.log
[1432765507] HOST ALERT: 160.110.246.130;DOWN;HARD;5;CRITICAL - 160.110.246.130: Host unreachable @ 160.110.246.81. rta nan, lost 100%
[1432765507] HOST NOTIFICATION: nagiosadmin;160.110.246.130;DOWN;xi_host_notification_handler;CRITICAL - 160.110.246.130: Host unreachable @ 160.110.246.81. rta nan, lost 100%
[1432765539] SERVICE ALERT: 160.110.246.130;Ping;CRITICAL;SOFT;3;CRITICAL - 160.110.246.130: Host unreachable @ 160.110.246.81. rta nan, lost 100%
[1432765600] SERVICE ALERT: 160.110.246.130;Ping;CRITICAL;SOFT;4;CRITICAL - 160.110.246.130: Host unreachable @ 160.110.246.81. rta nan, lost 100%
[1432765659] SERVICE ALERT: 160.110.246.130;Ping;CRITICAL;HARD;5;CRITICAL - 160.110.246.130: Host unreachable @ 160.110.246.81. rta nan, lost 100%
[1432765741] SERVICE NOTIFICATION: nagiosadmin;192.168.10.2;CPU Usage;CRITICAL;xi_service_notification_handler;connect to address 192.168.10.2 and port 12489: Connection refused
[1432765845] Auto-save of retention data completed successfully.
[1432766104] ndomod: Still unable to connect to data sink. 16445 items lost, 5000 queued items to flush.
[1432767022] ndomod: Still unable to connect to data sink. 17221 items lost, 5000 queued items to flush.
[1432767924] ndomod: Still unable to connect to data sink. 17980 items lost, 5000 queued items to flush.
[1432768052] SERVICE NOTIFICATION: nagiosadmin;192.168.10.2;Memory Usage;CRITICAL;xi_service_notification_handler;connect to address 192.168.10.2 and port 12489: Connection refused
[1432768314] SERVICE NOTIFICATION: nagiosadmin;192.168.10.2;Drive C: Disk Usage;CRITICAL;xi_service_notification_handler;connect to address 192.168.10.2 and port 12489: Connection refused
[1432768840] ndomod: Still unable to connect to data sink. 18768 items lost, 5000 queued items to flush.
[1432768998] SERVICE NOTIFICATION: nagiosadmin;192.168.10.2;Uptime;CRITICAL;xi_service_notification_handler;connect to address 192.168.10.2 and port 12489: Connection refused
[1432769191] HOST NOTIFICATION: nagiosadmin;160.110.246.130;DOWN;xi_host_notification_handler;CRITICAL - 160.110.246.130: Host unreachable @ 160.110.246.81. rta nan, lost 100%
[1432769445] Auto-save of retention data completed successfully.
[1432769639] SERVICE NOTIFICATION: nagiosadmin;192.168.10.2;CPU Usage;CRITICAL;xi_service_notification_handler;connect to address 192.168.10.2 and port 12489: Connection refused
[1432769756] ndomod: Still unable to connect to data sink. 19564 items lost, 5000 queued items to flush.
[1432770658] ndomod: Still unable to connect to data sink. 20324 items lost, 5000 queued items to flush.
[1432771575] ndomod: Still unable to connect to data sink. 21091 items lost, 5000 queued items to flush.
[1432771914] SERVICE NOTIFICATION: nagiosadmin;192.168.10.2;Drive C: Disk Usage;CRITICAL;xi_service_notification_handler;connect to address 192.168.10.2 and port 12489: Connection refused
[1432771951] SERVICE NOTIFICATION: nagiosadmin;192.168.10.2;Memory Usage;CRITICAL;xi_service_notification_handler;connect to address 192.168.10.2 and port 12489: Connection refused
[1432772488] ndomod: Still unable to connect to data sink. 21876 items lost, 5000 queued items to flush.
[1432772598] SERVICE NOTIFICATION: nagiosadmin;192.168.10.2;Uptime;CRITICAL;xi_service_notification_handler;connect to address 192.168.10.2 and port 12489: Connection refused
[1432772791] HOST NOTIFICATION: nagiosadmin;160.110.246.130;DOWN;xi_host_notification_handler;CRITICAL - 160.110.246.130: Host unreachable @ 160.110.246.81. rta nan, lost 100%
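
The repeating ndomod lines in nagios.log are the useful part of this output: the broker module cannot reach ndo2db and is dropping data. Here is a small sketch for pulling out the latest "items lost" counter and for checking whether the socket exists. The helper names are hypothetical, and the idea that ndomod is pointed at the socket_name from the posted ndo2db.cfg is an assumption.

```shell
#!/bin/sh
# latest_items_lost LOGFILE
# Hypothetical helper: prints the "items lost" number from the most recent
# ndomod data-sink warning in LOGFILE, or nothing if there is no such line.
latest_items_lost() {
    grep 'ndomod: Still unable to connect to data sink' "$1" \
        | tail -n 1 \
        | sed -E 's/.* ([0-9]+) items lost.*/\1/'
}

# sink_socket_present PATH
# Succeeds only if PATH exists and is a Unix domain socket.
sink_socket_present() {
    [ -S "$1" ]
}

# Example (socket path taken from the posted ndo2db.cfg):
#   latest_items_lost /usr/local/nagios/var/nagios.log
#   sink_socket_present /usr/local/nagios/var/ndo.sock || echo "no socket"
```

A counter that keeps climbing together with a missing socket is consistent with ndo2db never getting as far as listening.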

Re: ndo2db is not running ( NAGIOSXI )

Posted: Wed May 27, 2015 10:57 pm
by Box293
What is the output of these commands?

Code:

ps -aef | grep ndo2db

df -h

df -i

cd /
du

ls -al /usr/local/nagios/var

Re: ndo2db is not running ( NAGIOSXI )

Posted: Thu May 28, 2015 7:13 pm
by Ashish Sood
Here is the output:
[root@localhost ~]# ps -aef | grep ndo2db
root 11901 11802 0 05:41 pts/0 00:00:00 grep --color=auto ndo2db
[root@localhost ~]# df -h
Filesystem               Size  Used Avail Use% Mounted on
/dev/mapper/centos-root   18G  5.4G   13G  31% /
devtmpfs                 908M     0  908M   0% /dev
tmpfs                    917M   92K  917M   1% /dev/shm
tmpfs                    917M  9.1M  908M   1% /run
tmpfs                    917M     0  917M   0% /sys/fs/cgroup
/dev/sda1                497M  124M  373M  25% /boot
[root@localhost ~]# df -i
Filesystem                Inodes  IUsed    IFree IUse% Mounted on
/dev/mapper/centos-root 18317312 161812 18155500    1% /
devtmpfs                  232213    406   231807    1% /dev
tmpfs                     234707      8   234699    1% /dev/shm
tmpfs                     234707    601   234106    1% /run
tmpfs                     234707     13   234694    1% /sys/fs/cgroup
/dev/sda1                 512000    330   511670    1% /boot
[root@localhost /]# ls -al /usr/local/nagios/var/
total 276
drwxrwxr-x. 6 nagios nagios 4096 May 29 05:43 .
drwxr-xrwx. 9 root root 87 May 21 01:19 ..
drwxrwxr-x. 2 nagios nagios 4096 May 27 23:59 archives
-rw-r--r--. 1 nagios nagios 0 May 29 05:43 host-perfdata
-rw-r--r--. 1 nagios nagios 34 May 29 05:19 nagios.configtest
-rw-r--r--. 1 nagios nagios 5 May 29 05:19 nagios.lock
-rw-r--r--. 1 nagios nagios 27545 May 29 05:34 nagios.log
-rw-rw-r--. 1 nagios nagios 13957 May 21 17:57 nagios.tmpZzXyjw
-rw-r--r--. 1 nagios nagios 0 May 29 05:19 ndo2db.lock
-rw-r--r--. 1 nagios nagios 59996 May 29 05:19 npcd.log
-rw-r--r--. 1 nagios nagios 38952 May 29 05:19 objects.cache
-rw-r--r--. 1 nagios nagios 38952 May 29 05:19 objects.precache
-rw-rw-r--. 1 nagios nagios 3958 May 28 12:23 perfdata.log
-rw-------. 1 nagios nagios 31036 May 29 05:19 retention.dat
drwxrwsr-x. 2 nagios nagcmd 39 May 29 05:19 rw
-rw-r--r--. 1 nagios nagios 378 May 29 05:43 service-perfdata
drwxr-xr-x. 5 nagios nagios 52 May 21 01:18 spool
drwxr-xr-x. 2 nagios nagios 21 May 29 05:43 stats
-rw-rw-r--. 1 nagios nagios 31491 May 29 05:43 status.dat

Thanks,
Ashish
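
One detail in the listing above stands out: nagios.lock is 5 bytes, while ndo2db.lock is 0 bytes. If ndo2db writes its PID into its lock file the way Nagios does with nagios.lock (this is an assumption, not confirmed in the thread), an empty lock file would mean the daemon exited before it got that far. Here is a minimal sketch for classifying a lock file; the helper name is made up for illustration.

```shell
#!/bin/sh
# lock_state LOCKFILE
# Hypothetical helper: prints one of
#   missing | empty | unreadable | live:<pid> | dead:<pid>
lock_state() {
    [ -f "$1" ] || { echo missing; return 0; }
    [ -s "$1" ] || { echo empty; return 0; }
    # Take the digits from the first line as the PID
    pid=$(head -n 1 "$1" | tr -cd '0-9')
    [ -n "$pid" ] || { echo unreadable; return 0; }
    # kill -0 sends no signal, it only tests whether the PID can be signalled.
    # Run this as root: without permission, a process owned by another user
    # would be reported as dead.
    if kill -0 "$pid" 2>/dev/null; then
        echo "live:$pid"
    else
        echo "dead:$pid"
    fi
}

# Example:
#   lock_state /usr/local/nagios/var/ndo2db.lock
```

Both "empty" and "dead:<pid>" describe a stale lock, but they hint at different failures: the first suggests the daemon never finished starting, the second that it started and later died.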

Re: ndo2db is not running ( NAGIOSXI )

Posted: Thu May 28, 2015 7:31 pm
by Box293
Thanks for that.

Can you please post the ndo2db.cfg here. If you could upload it as an attachment it will allow us to view it exactly as it exists on your system.

/usr/local/nagios/etc/ndo2db.cfg

Re: ndo2db is not running ( NAGIOSXI )

Posted: Sat May 30, 2015 8:05 pm
by Ashish Sood
Sorry for the late reply. Please find the file attached.

Re: ndo2db is not running ( NAGIOSXI )

Posted: Sun May 31, 2015 5:39 pm
by Box293
OK, so here's why we might not be getting the debug log created:

Code:

debug_level=1
abrist asked for it to be set to negative one (-1):

Code:

debug_level=-1
Can you follow all the steps in his post http://support.nagios.com/forum/posting ... 0#pr139863 and let us know whether it creates the debug log this time?
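
Before restarting, it may help to confirm what the file on disk really contains, since a value left over from an earlier edit is easy to miss. Here is a small sketch with a hypothetical helper name. Treating the last uncommented line as the effective one is an assumption about how ndo2db reads duplicate keys.

```shell
#!/bin/sh
# cfg_value FILE KEY
# Hypothetical helper: prints the value of the last uncommented "KEY=..."
# line in FILE, or nothing if the key is not set.
cfg_value() {
    grep "^$2=" "$1" | tail -n 1 | cut -d= -f2-
}

# Example, using the path from this thread:
#   cfg_value /usr/local/nagios/etc/ndo2db.cfg debug_level
#   cfg_value /usr/local/nagios/etc/ndo2db.cfg debug_verbosity
#   cfg_value /usr/local/nagios/etc/ndo2db.cfg debug_file
```

For the steps in abrist's post, the three values to expect are -1, 2 and /usr/local/nagios/var/ndo2db.debug.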