
Database Error

Posted: Tue Sep 26, 2017 4:42 am
by paltel
Dear all, can you help me?
This message first appeared three days ago. I ran the repair command, but nothing changed and the same error remained. I then restarted the server and the web interface came back up.
Today the same problem appeared again, and the admin page shows that database maintenance has not run for 1 day. This time I didn't receive any notification from Nagios.

BR

Re: Database Error

Posted: Tue Sep 26, 2017 9:26 am
by scottwilkerson
Can you run the following and post the results?

Code:

df -h
tail -f /var/log/mysqld.log

You may need to perform a more advanced repair or you could be out of disk space.

Some additional info can be found here
https://support.nagios.com/kb/article/n ... ables.html
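
If the raw output is hard to read, a rough sketch like the one below can boil it down to the two things that matter here: how full the root filesystem is and which tables mysqld reports as crashed. The function names usage_percent and crashed_tables are made up for this example, and it assumes the log lines look like the usual "Table '...' is marked as crashed" messages. df -P is used because it prints one line per filesystem, even for long device names.

```shell
# Print the Use% figure (digits only) from `df -P` output read on stdin.
# Only the first filesystem line (line 2 of the output) is looked at.
usage_percent() {
    awk 'NR == 2 { gsub(/%/, "", $5); print $5 }'
}

# List each table that mysqld reports as crashed, once, from log text on stdin.
crashed_tables() {
    grep -o "Table '[^']*' is marked as crashed" | cut -d"'" -f2 | sort -u
}

# Typical use on the server (log path as used in this thread):
#   df -P / | usage_percent
#   crashed_tables < /var/log/mysqld.log
```

A value at or near 100 from the first command points to disk space rather than the database itself as the root cause.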

Re: Database Error

Posted: Wed Sep 27, 2017 12:21 am
by paltel
[root@nagiosxi ~]# df -h
Filesystem            Size  Used Avail Use% Mounted on
/dev/mapper/vg_nagiosxi-lv_root
                      100G   95G     0 100% /
tmpfs                 7.8G   12K  7.8G   1% /dev/shm
/dev/sda1             477M   94M  359M  21% /boot
/dev/mapper/vg_nagiosxi-lv_home
                       20G   16G  2.8G  85% /home
[root@nagiosxi ~]#


[root@nagiosxi ~]# tail -f /var/log/mysqld.log
170927 7:28:59 [ERROR] /usr/libexec/mysqld: Table './nagios/nagios_contactnotificationmethods' is marked as crashed and last (automatic?) repair failed
170927 7:28:59 [ERROR] /usr/libexec/mysqld: Table './nagios/nagios_contactnotificationmethods' is marked as crashed and last (automatic?) repair failed
170927 7:28:59 [ERROR] /usr/libexec/mysqld: Table './nagios/nagios_contactnotificationmethods' is marked as crashed and last (automatic?) repair failed
170927 7:29:00 [ERROR] /usr/libexec/mysqld: Table './nagios/nagios_contactnotificationmethods' is marked as crashed and last (automatic?) repair failed
170927 8:05:01 [Warning] Disk is full writing '/tmp/STZjNvxx' (Errcode: 28). Waiting for someone to free space... (Expect up to 60 secs delay for server to continue after freeing disk space)
170927 8:05:01 [Warning] Retry in 60 secs. Message reprinted in 600 secs
170927 8:06:09 [Warning] Disk is full writing '/tmp/STZjNvxx' (Errcode: 28). Waiting for someone to free space... (Expect up to 60 secs delay for server to continue after freeing disk space)
170927 8:06:09 [Warning] Retry in 60 secs. Message reprinted in 600 secs
170927 8:11:10 [Warning] Disk is full writing '/tmp/STZjNvxx' (Errcode: 28). Waiting for someone to free space... (Expect up to 60 secs delay for server to continue after freeing disk space)
170927 8:11:10 [Warning] Retry in 60 secs. Message reprinted in 600 secs
170927 8:21:10 [Warning] Disk is full writing '/tmp/STZjNvxx' (Errcode: 28). Waiting for someone to free space... (Expect up to 60 secs delay for server to continue after freeing disk space)
170927 8:21:10 [Warning] Retry in 60 secs. Message reprinted in 600 secs



Why is the disk full every day?
This started when I upgraded Nagios 10 days ago.
Can you help me solve this problem?
Every day the disk fills up. I delete some files, but the next day it is full again.

BR

Re: Database Error

Posted: Wed Sep 27, 2017 11:36 am
by scottwilkerson
Hard to say why the disk is full without running some more commands.

I'd start by seeing which top-level folder is the fullest with something like this:

Code:

du -hs /*

Then investigate the sub-directories of a large folder:

Code:

du -hs /store/*

Repeat until you find where all the space is being consumed. Maybe you have daily local backups and they are filling the drive?

Once you have freed up disk space, you will likely need to run the database repair script:

Code:

/usr/local/nagiosxi/scripts/repair_databases.sh
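
To make the hunt for the big directories a bit quicker, here is a minimal sketch that sorts the du output so the largest entries come first. It assumes GNU du (for --max-depth), and largest_dirs is just a name picked for this example. The -x option keeps du on a single filesystem, so when it is pointed at / it skips /proc, /sys and separately mounted volumes instead of printing errors for them.

```shell
# Print the largest immediate children of a directory, biggest first.
# Sizes are in KiB; the first line is the total for the directory itself.
# Usage: largest_dirs [directory] [number of lines]
largest_dirs() {
    local dir="${1:-/}" count="${2:-10}"
    du -xk --max-depth=1 "$dir" 2>/dev/null | sort -rn | head -n "$count"
}

# Typical use on the server, drilling down one level at a time:
#   largest_dirs /
#   largest_dirs /var
```

Running it again on whichever directory tops the list narrows things down one level per call.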

Re: Database Error

Posted: Thu Sep 28, 2017 12:54 am
by paltel
[root@nagiosxi ~]# du -hs /*
7.8M /bin
91M /boot
4.0K /cgroup
184K /dev
48M /etc
8.0G /home
270M /lib
27M /lib64
16K /lost+found
4.0K /media
0 /misc
4.0K /mnt
0 /net
28M /opt
du: cannot access `/proc/4300/task/9540/fdinfo/188': No such file or directory
du: cannot access `/proc/4300/task/59360/fd/188': No such file or directory
du: cannot access `/proc/4300/task/60381/fdinfo/188': No such file or directory
du: cannot access `/proc/14377/task/14377/fd/4': No such file or directory
du: cannot access `/proc/14377/task/14377/fdinfo/4': No such file or directory
du: cannot access `/proc/14377/fd/4': No such file or directory
du: cannot access `/proc/14377/fdinfo/4': No such file or directory
du: cannot access `/proc/15292': No such file or directory
du: cannot access `/proc/15293': No such file or directory
du: cannot access `/proc/15294': No such file or directory
du: cannot access `/proc/15295': No such file or directory
du: cannot access `/proc/15299': No such file or directory
du: cannot access `/proc/15302': No such file or directory
du: cannot access `/proc/15303': No such file or directory
du: cannot access `/proc/15304': No such file or directory
du: cannot access `/proc/15308': No such file or directory
du: cannot access `/proc/15310': No such file or directory
du: cannot access `/proc/15313': No such file or directory
du: cannot access `/proc/15315': No such file or directory
du: cannot access `/proc/15517': No such file or directory
du: cannot access `/proc/15562': No such file or directory
du: cannot access `/proc/15563': No such file or directory
du: cannot access `/proc/15564': No such file or directory
0 /proc
24G /root
15M /sbin
4.0K /selinux
4.0K /srv
15G /store
0 /sys
11M /tmp
22G /usr
45G /var
You have new mail in /var/spool/mail/root



[root@nagiosxi ~]# du -hs /store/*
16G /store/backups

Re: Database Error

Posted: Thu Sep 28, 2017 3:57 am
by paltel
===============
REPAIR COMPLETE
===============
No log handling enabled - turning on stderr logging
Undefined OBJECT-GROUP (snmpBasicNotificationsGroup): At line 693 in /usr/share/snmp/mibs/v2-mib.my
Stopping ndo2db: done.
Starting ndo2db: done.
Running configuration check...
Stopping nagios:. done.
Starting nagios: done.

=======================
nagios database repair succeeded
nagiosql database repair succeeded

You have new mail in /var/spool/mail/root


I extended the drive and deleted some daily backups, then ran the database repair command. The result is above, but the error in Database Maintenance is still not solved.

Re: Database Error

Posted: Thu Sep 28, 2017 8:07 am
by paltel
1.1G /var/lib/pgsql/data/base/16385/16424
1.1G /var/lib/pgsql/data/base/16385/16424.1
1.1G /var/lib/pgsql/data/base/16385/16424.10
1.1G /var/lib/pgsql/data/base/16385/16424.11
1.1G /var/lib/pgsql/data/base/16385/16424.12
1.1G /var/lib/pgsql/data/base/16385/16424.13
1.1G /var/lib/pgsql/data/base/16385/16424.14
1.1G /var/lib/pgsql/data/base/16385/16424.15
1.1G /var/lib/pgsql/data/base/16385/16424.16
1.1G /var/lib/pgsql/data/base/16385/16424.17
1.1G /var/lib/pgsql/data/base/16385/16424.18
1.1G /var/lib/pgsql/data/base/16385/16424.19
1.1G /var/lib/pgsql/data/base/16385/16424.2
1.1G /var/lib/pgsql/data/base/16385/16424.20
1.1G /var/lib/pgsql/data/base/16385/16424.21
1.1G /var/lib/pgsql/data/base/16385/16424.22
1.1G /var/lib/pgsql/data/base/16385/16424.23
1.1G /var/lib/pgsql/data/base/16385/16424.24
1.1G /var/lib/pgsql/data/base/16385/16424.25
1.1G /var/lib/pgsql/data/base/16385/16424.26
1.1G /var/lib/pgsql/data/base/16385/16424.27
1.1G /var/lib/pgsql/data/base/16385/16424.28
1.1G /var/lib/pgsql/data/base/16385/16424.29
1.1G /var/lib/pgsql/data/base/16385/16424.3
1.1G /var/lib/pgsql/data/base/16385/16424.30
1.1G /var/lib/pgsql/data/base/16385/16424.31
1.1G /var/lib/pgsql/data/base/16385/16424.32
1.1G /var/lib/pgsql/data/base/16385/16424.33
1.1G /var/lib/pgsql/data/base/16385/16424.34
1.1G /var/lib/pgsql/data/base/16385/16424.35
1.1G /var/lib/pgsql/data/base/16385/16424.36
1.1G /var/lib/pgsql/data/base/16385/16424.37
1.1G /var/lib/pgsql/data/base/16385/16424.38
1.1G /var/lib/pgsql/data/base/16385/16424.39
1.1G /var/lib/pgsql/data/base/16385/16424.4
1.1G /var/lib/pgsql/data/base/16385/16424.40
1.1G /var/lib/pgsql/data/base/16385/16424.41
1.1G /var/lib/pgsql/data/base/16385/16424.42
1.1G /var/lib/pgsql/data/base/16385/16424.43
1.1G /var/lib/pgsql/data/base/16385/16424.44
1.1G /var/lib/pgsql/data/base/16385/16424.45
1.1G /var/lib/pgsql/data/base/16385/16424.46
278M /var/lib/pgsql/data/base/16385/16424.47
1.1G /var/lib/pgsql/data/base/16385/16424.5
1.1G /var/lib/pgsql/data/base/16385/16424.6
1.1G /var/lib/pgsql/data/base/16385/16424.7
1.1G /var/lib/pgsql/data/base/16385/16424.8
1.1G

What is this?

Re: Database Error

Posted: Thu Sep 28, 2017 10:21 am
by npolovenko
Hi @paltel, it seems you upgraded your Nagios XI recently. Sometimes database tables don't get properly updated during an upgrade, which causes them to bloat. First, run this to truncate the Postgres database tables (you might need to manually free up a little space before you start, because Nagios will copy some database files into the temp folder during the process):

Code:

service nagios stop
service ndo2db stop
service crond stop
service postgresql restart
pkill -9 -u nagios
echo "truncate table xi_events; truncate table xi_meta; truncate table xi_eventqueue;" | psql nagiosxi nagiosxi
service crond start
service ndo2db start
service nagios start
service npcd restart

After that's done, check your Postgres configuration file /var/lib/pgsql/data/postgresql.conf and make sure that autovacuum is turned on.

Code:

autovacuum = on

*Sometimes there are a few extra options that need to be enabled as well to let the vacuum run automatically.
Also, let us know what version of Postgres you are running. You can find it in the conf file as well, or run the command psql -V
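
For checking the setting without opening the file in an editor, something along these lines may help. It is only a sketch: conf_value is a helper name invented for this example, and it only understands lines written as "key = value". Commented-out lines are skipped, and if a setting appears more than once the last one is reported, which matches how Postgres reads the file. One of the extra options worth checking is track_counts, since autovacuum needs it enabled.

```shell
# Print the value assigned to a setting in a postgresql.conf-style file.
# Lines starting with '#' do not match; the last assignment wins.
# Usage: conf_value <setting> <file>
conf_value() {
    local key="$1" file="$2"
    sed -n "s/^[[:space:]]*${key}[[:space:]]*=[[:space:]]*\([^#[:space:]]*\).*/\1/p" "$file" | tail -n 1
}

# Typical use on the server (paths and names as used in this thread):
#   conf_value autovacuum /var/lib/pgsql/data/postgresql.conf
#   conf_value track_counts /var/lib/pgsql/data/postgresql.conf
# What the running server is actually using:
#   psql nagiosxi nagiosxi -c "SHOW autovacuum;"
#   psql nagiosxi nagiosxi -c "SHOW track_counts;"
```

An empty result from conf_value means the line is commented out or missing, in which case the SHOW commands are the better source of truth.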

I would highly recommend manually running the vacuum as well: https://support.nagios.com/kb/article/n ... rface.html Scroll down to the section "The postgresql service is not running or the database is not accepting commands" and follow the steps. Please note that the steps depend on the version of Postgres you're running.

Finally, follow these steps to fix the database tables, assuming you still have the Nagios XI installation directory in your tmp folder (if you deleted it, please download it again):

Code:

service nagios stop
psql nagiosxi nagiosxi -f /tmp/nagiosxi/nagiosxi/nagiosxi-db/mods/pgsql/schema_01.sql
service postgresql restart
service nagios start

Let us know how it goes!
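
As for the 1.1G files you listed: Postgres stores each table or index in files named after its relfilenode under base/<database OID>/, and splits anything larger than 1 GB into numbered segments (16424, 16424.1, 16424.2 and so on). So all of those files belong to one relation. If you want to confirm which one, here is a small sketch that builds the catalog lookup from a file path. The function names relfilenode_of and relation_query are made up for this example. The relkind column in the result tells you whether it is an ordinary table (r), an index (i) or a TOAST table (t).

```shell
# Extract the relfilenode from a data file path by dropping the
# directories and any ".N" segment suffix, e.g. .../16424.47 -> 16424
relfilenode_of() {
    local name="${1##*/}"
    printf '%s\n' "${name%%.*}"
}

# Build the pg_class query that names the relation stored in that file.
relation_query() {
    printf 'SELECT relname, relkind FROM pg_class WHERE relfilenode = %s;\n' "$(relfilenode_of "$1")"
}

# Typical use on the server (database and user names as used in this thread):
#   relation_query /var/lib/pgsql/data/base/16385/16424.47 | psql nagiosxi nagiosxi
```

The query has to be run against the database that owns the directory, which is assumed here to be nagiosxi.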

Re: Database Error

Posted: Sun Oct 01, 2017 12:39 am
by paltel
[root@nagiosxi /]# service nagios stop
Stopping nagios:. done.
[root@nagiosxi /]# service ndo2db stop
Stopping ndo2db: done.
[root@nagiosxi /]# service crond stop
Stopping crond: [ OK ]
[root@nagiosxi /]# service postgresql restart
Stopping postgresql service: [ OK ]
Starting postgresql service: [ OK ]
[root@nagiosxi /]# pkill -9 -u nagios
[root@nagiosxi /]# echo "truncate table xi_events; truncate table xi_meta; truncate table xi_eventqueue;" | psql nagiosxi nagiosxi
TRUNCATE TABLE
TRUNCATE TABLE
TRUNCATE TABLE
[root@nagiosxi /]# service crond start
Starting crond: [ OK ]
[root@nagiosxi /]# service ndo2db start
Starting ndo2db: done.
[root@nagiosxi /]# service nagios start
Starting nagios: done.
[root@nagiosxi /]# service npcd restart
NPCD was not running.
NPCD started.
[root@nagiosxi /]# psql -V
psql (PostgreSQL) 8.4.20
contains support for command-line editing
[root@nagiosxi /]# vi var/lib/pgsql/data/postgresql.conf
You have mail in /var/spool/mail/root
[root@nagiosxi /]# tail var/lib/pgsql/data/postgresql.conf
# - Other Platforms and Clients -

#transform_null_equals = off


#------------------------------------------------------------------------------
# CUSTOMIZED OPTIONS
#------------------------------------------------------------------------------

#custom_variable_classes = '' # list of custom variable class names
You have mail in /var/spool/mail/root
[root@nagiosxi /]# vi /var/lib/pgsql/data/postgresql.conf
[root@nagiosxi /]# service nagios stop
Stopping nagios:. done.
[root@nagiosxi /]# psql nagiosxi nagiosxi -f /tmp/nagiosxi/nagiosxi/nagiosxi-db/mods/pgsql/schema_01.sql
/tmp/nagiosxi/nagiosxi/nagiosxi-db/mods/pgsql/schema_01.sql: No such file or directory
[root@nagiosxi /]# cd tmp/
[root@nagiosxi tmp]# rm -rf nagiosxi xi*.tar.gz
[root@nagiosxi tmp]# wget https://assets.nagios.com/downloads/nag ... est.tar.gz
--2017-10-01 08:36:53-- https://assets.nagios.com/downloads/nag ... est.tar.gz
Resolving assets.nagios.com... 72.14.181.71, 2600:3c00::f03c:91ff:fedf:b821
Connecting to assets.nagios.com|72.14.181.71|:443... connected.
HTTP request sent, awaiting response... 200 OK
Length: 55323422 (53M) [application/x-gzip]
Saving to: “xi-latest.tar.gz”

100%[==============================================================================================================================>] 55,323,422 14.1M/s in 5.8s

2017-10-01 08:37:00 (9.04 MB/s) - “xi-latest.tar.gz” saved [55323422/55323422]

[root@nagiosxi tmp]# tar xzf xi-latest.tar.gz
[root@nagiosxi tmp]# cd /
[root@nagiosxi /]# service nagios stop
Stopping nagios:No lock file found in /usr/local/nagios/var/nagios.lock
[root@nagiosxi /]# psql nagiosxi nagiosxi -f /tmp/nagiosxi/nagiosxi/nagiosxi-db/mods/pgsql/schema_01.sql
psql:/tmp/nagiosxi/nagiosxi/nagiosxi-db/mods/pgsql/schema_01.sql:2: ERROR: column "api_key" of relation "xi_users" already exists
psql:/tmp/nagiosxi/nagiosxi/nagiosxi-db/mods/pgsql/schema_01.sql:3: ERROR: column "api_enabled" of relation "xi_users" already exists
UPDATE 32
psql:/tmp/nagiosxi/nagiosxi/nagiosxi-db/mods/pgsql/schema_01.sql:8: ERROR: column "login_attempts" of relation "xi_users" already exists
psql:/tmp/nagiosxi/nagiosxi/nagiosxi-db/mods/pgsql/schema_01.sql:9: ERROR: column "last_attempt" of relation "xi_users" already exists
psql:/tmp/nagiosxi/nagiosxi/nagiosxi-db/mods/pgsql/schema_01.sql:10: ERROR: column "last_password_change" of relation "xi_users" already exists
psql:/tmp/nagiosxi/nagiosxi/nagiosxi-db/mods/pgsql/schema_01.sql:13: ERROR: column "last_login" of relation "xi_users" already exists
psql:/tmp/nagiosxi/nagiosxi/nagiosxi-db/mods/pgsql/schema_01.sql:14: ERROR: column "last_edited" of relation "xi_users" already exists
psql:/tmp/nagiosxi/nagiosxi/nagiosxi-db/mods/pgsql/schema_01.sql:15: ERROR: column "last_edited_by" of relation "xi_users" already exists
psql:/tmp/nagiosxi/nagiosxi/nagiosxi-db/mods/pgsql/schema_01.sql:16: ERROR: column "created_by" of relation "xi_users" already exists
psql:/tmp/nagiosxi/nagiosxi/nagiosxi-db/mods/pgsql/schema_01.sql:17: ERROR: column "created_time" of relation "xi_users" already exists
psql:/tmp/nagiosxi/nagiosxi/nagiosxi-db/mods/pgsql/schema_01.sql:24: ERROR: relation "xi_eventqueue_eventqueue_id_seq" already exists
psql:/tmp/nagiosxi/nagiosxi/nagiosxi-db/mods/pgsql/schema_01.sql:33: ERROR: relation "xi_eventqueue" already exists
You have mail in /var/spool/mail/root
[root@nagiosxi /]# service postgresql restart
Stopping postgresql service: [ OK ]
Starting postgresql service: [ OK ]
[root@nagiosxi /]# service nagios start
Starting nagios: done.
[root@nagiosxi /]#

Re: Database Error

Posted: Sun Oct 01, 2017 12:45 am
by paltel
Hi,
I ran all the commands, but Database Maintenance still has not run for 6 days.