MySQL Errors and Ghosting
Posted: Sat Sep 10, 2016 6:44 am
I am just finishing up a project and am now getting database errors and ghost behavior. Several hundred passive hosts, with 3000 passive service checks, were disabled on this server. The current hosts and services look fine, but about 5 minutes later all of the disabled passive services show up as Critical in the Home Page Service Status Summary. They have been disabled, so they should not show up at all. Clicking the Critical link lists the services. These services, and the hosts they are on, are disabled in the CCM. The changes have been applied and the output reported success. The server is a physical box with 16 GB of RAM, 32 CPUs, and RAID 5.
Some problems I am seeing:
Tactical Overview shows 100 Hosts and 800 Services
Hostgroup Grid shows 100 Hosts and 800 Services
Host Detail shows 100 Hosts and 800 Services
but
Home Page Service Status Summary shows 100 Hosts and 2200 Services (the extra services are attached to the hosts that were removed)
Service Detail shows 2200 Services
So the data summaries on the server itself are inconsistent with each other. There are no text files representing these ghost services in the services directory.
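To put a number on the config side of that comparison, here is a minimal sketch that counts service definitions in the config files. The function name count_service_defs is made up for illustration, the default path is the services directory mentioned in this post, and the count includes service templates, so treat it as a rough figure to hold against the 800 and 2200 above.

```shell
# Count "define service" blocks in every *.cfg file under a directory.
# Assumption: object definitions start a line with "define service {"
# (with or without a space before the brace). Commented-out lines are skipped.
count_service_defs() {
    cfg_dir="${1:-/usr/local/nagios/etc/services}"
    find "$cfg_dir" -type f -name '*.cfg' -exec cat {} + 2>/dev/null \
        | grep -Ec '^[[:space:]]*define[[:space:]]+service[[:space:]]*\{' \
        || true
}
```

Usage would be something like: count_service_defs /usr/local/nagios/etc/services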
I repaired the database using the repair_database.sh script, truncated nagios_logentries and nagios_notifications, and then ran the repair script again. Initially the repair seemed to work, but the issue returned within a few minutes. It looks like the ghosting issues of the past, so I checked: the hosts and services in question are not listed in /usr/local/nagios/etc/services or /usr/local/nagios/etc/hosts, and no text files define them. The errors shown in the log entries further down are from /var/log/messages; the MySQL log file is clean. Any ideas?
These are the commands I used to truncate the two tables:
Code:
mysql -u ndoutils -pn@gweb nagios -e 'TRUNCATE TABLE nagios_logentries'
mysql -u ndoutils -pn@gweb nagios -e 'TRUNCATE TABLE nagios_notifications'
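As a cross-check on the database side, the sketch below counts the service objects that are still flagged active. It assumes the stock NDOUtils schema, where nagios_objects.objecttype_id = 2 means a service and is_active = 1 means the object is part of the running config; I have not confirmed that the XI summary pages read exactly this flag. The function name is made up for illustration, and the credentials are the same ones used in the truncate commands above.

```shell
# Count service objects that NDOUtils still marks as active.
# Assumptions: stock NDOUtils schema (objecttype_id = 2 is "service",
# is_active = 1 is "in the running config") and the ndoutils login
# from the truncate commands in this post.
count_active_service_objects() {
    mysql -N -B -u ndoutils -p'n@gweb' nagios \
        -e 'SELECT COUNT(*) FROM nagios_objects WHERE objecttype_id = 2 AND is_active = 1'
}
```

If this number tracks the 2200 rather than the 800, the stale rows are probably in the NDOUtils tables and not in the config files.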
Here are recent log entries:
Code:
Sep 10 10:58:06 nagios-xi ndo2db: Error: mysql_query() failed for 'INSERT INTO nagios_systemcommands SET instance_id='1', start_time=FROM_UNIXTIME(1473505086), start_time_usec='904627', end_time=FROM_UNIXTIME(0), end_time_usec='0', command_line='/bin/mv /usr/local/nagios/var/host-perfdata /usr/local/nagios/var/spool/xidpe/1473505086\.perfdata\.host', timeout='5', early_timeout='0', execution_time='0.000000', return_code='0', output='', long_output='' ON DUPLICATE KEY UPDATE instance_id='1', start_time=FROM_UNIXTIME(1473505086), start_time_usec='904627', end_time=FROM_UNIXTIME(0), end_time_usec='0', command_line='/bin/mv /usr/local/nagios/var/host-perfdata /usr/local/nagios/var/spool/xidpe/1473505086\.perfdata\.host', timeout='5', early_timeout='0', execution_time='0.000000', return_code='0', output='', long_output='''
Sep 10 10:58:06 nagios-xi ndo2db: mysql_error: 'Table 'nagios.nagios_systemcommands' doesn't exist'
Sep 10 10:58:06 nagios-xi ndo2db: Error: mysql_query() failed for 'INSERT INTO nagios_systemcommands SET instance_id='1', start_time=FROM_UNIXTIME(1473505086), start_time_usec='904627', end_time=FROM_UNIXTIME(1473505086), end_time_usec='975513', command_line='/bin/mv /usr/local/nagios/var/host-perfdata /usr/local/nagios/var/spool/xidpe/1473505086\.perfdata\.host', timeout='5', early_timeout='0', execution_time='0.070000', return_code='0', output='', long_output='' ON DUPLICATE KEY UPDATE instance_id='1', start_time=FROM_UNIXTIME(1473505086), start_time_usec='904627', end_time=FROM_UNIXTIME(1473505086), end_time_usec='975513', command_line='/bin/mv /usr/local/nagios/var/host-perfdata /usr/local/nagios/var/spool/xidpe/1473505086\.perfdata\.host', timeout='5', early_timeout='0', execution_time='0.070000', return_code='0', output='', long_output='''
Sep 10 10:58:06 nagios-xi ndo2db: mysql_error: 'Table 'nagios.nagios_systemcommands' doesn't exist'
Sep 10 10:58:06 nagios-xi ndo2db: Error: mysql_query() failed for 'INSERT INTO nagios_systemcommands SET instance_id='1', start_time=FROM_UNIXTIME(1473505086), start_time_usec='975838', end_time=FROM_UNIXTIME(0), end_time_usec='0', command_line='/usr/local/nagios/libexec/check_beams\.sh /usr/local/nagios/var/service-perfdata /usr/local/nagios/var/spool/xidpe/1473505086\.perfdata\.service', timeout='5', early_timeout='0', execution_time='0.000000', return_code='0', output='', long_output='' ON DUPLICATE KEY UPDATE instance_id='1', start_time=FROM_UNIXTIME(1473505086), start_time_usec='975838', end_time=FROM_UNIXTIME(0), end_time_usec='0', command_line='/usr/local/nagios/libexec/check_beams\.sh /usr/local/nagios/var/service-perfdata /usr/local/nagios/var/spool/xidpe/1473505086\.perfdata\.service', timeout='5', early_timeout='0', execution_time='0.000000', return_code='0', output='', long_output='''
Sep 10 10:58:06 nagios-xi ndo2db: mysql_error: 'Table 'nagios.nagios_systemcommands' doesn't exist'
Sep 10 10:58:07 nagios-xi ndo2db: Error: mysql_query() failed for 'INSERT INTO nagios_systemcommands SET instance_id='1', start_time=FROM_UNIXTIME(1473505086), start_time_usec='975838', end_time=FROM_UNIXTIME(1473505087), end_time_usec='135004', command_line='/usr/local/nagios/libexec/check_beams\.sh /usr/local/nagios/var/service-perfdata /usr/local/nagios/var/spool/xidpe/1473505086\.perfdata\.service', timeout='5', early_timeout='0', execution_time='0.160000', return_code='0', output='', long_output='' ON DUPLICATE KEY UPDATE instance_id='1', start_time=FROM_UNIXTIME(1473505086), start_time_usec='975838', end_time=FROM_UNIXTIME(1473505087), end_time_usec='135004', command_line='/usr/local/nagios/libexec/check_beams\.sh /usr/local/nagios/var/service-perfdata /usr/local/nagios/var/spool/xidpe/1473505086\.perfdata\.service', timeout='5', early_timeout='0', execution_time='0.160000', return_code='0', output='', long_output='''
Sep 10 10:58:07 nagios-xi ndo2db: mysql_error: 'Table 'nagios.nagios_systemcommands' doesn't exist'
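Since the same error repeats many times, here is a small sketch that summarizes which tables ndo2db reports as missing. It reads syslog lines on stdin and prints one line per table with a count. The function name missing_ndo_tables is hypothetical, and the pattern only assumes the message format shown in the entries above.

```shell
# Summarize tables that ndo2db reports as missing.
# Reads syslog lines on stdin; prints "<count> <table>" per distinct table,
# most frequent first. Only lines tagged "ndo2db: mysql_error:" are considered.
missing_ndo_tables() {
    sed -n "/ndo2db: mysql_error:/s/.*Table '\([^']*\)' doesn't exist.*/\1/p" \
        | sort | uniq -c | sort -rn \
        | awk '{print $1, $2}'
}
```

Usage would be something like: missing_ndo_tables < /var/log/messages
In the entries above, the only table named is nagios.nagios_systemcommands.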
Nagios XI Installation Profile
Code:
System:
Nagios XI Version : 5.2.7
nagios-xi 2.6.32-504.el6.x86_64 x86_64
CentOS release 6.6 (Final)
Gnome is not installed
Apache Information
PHP Version: 5.3.3
Agent: Mozilla/5.0 (Macintosh; Intel Mac OS X 10.11; rv:48.0) Gecko/20100101 Firefox/48.0
Server Name: x.x.x.x
Server Address: x.x.x.x
Server Port: 443
Date/Time
PHP Timezone: UTC
PHP Time: Sat, 10 Sep 2016 10:59:31 +0000
System Time: Sat, 10 Sep 2016 10:59:31 +0000
Nagios XI Data
License ends in: x.x.x.x
nagios (pid 41556) is running...
NPCD running (pid 4003).
ndo2db (pid 19205) is running...
CPU Load 15: 2.20
Total Hosts: 334
Total Services: 793
Function 'get_base_uri' returns: https://x.x.x.x/nagiosxi/
Function 'get_base_url' returns: https://x.x.x.x/nagiosxi/
Function 'get_backend_url(internal_call=false)' returns: https://x.x.x.x/nagiosxi/includes/components/profile/profile.php
Function 'get_backend_url(internal_call=true)' returns: https://localhost/nagiosxi/backend/
Ping Test localhost
Running:
/bin/ping -c 3 localhost 2>&1
PING localhost (127.0.0.1) 56(84) bytes of data.
64 bytes from localhost (127.0.0.1): icmp_seq=1 ttl=64 time=0.046 ms
64 bytes from localhost (127.0.0.1): icmp_seq=2 ttl=64 time=0.030 ms
64 bytes from localhost (127.0.0.1): icmp_seq=3 ttl=64 time=0.035 ms
--- localhost ping statistics ---
3 packets transmitted, 3 received, 0% packet loss, time 2000ms
rtt min/avg/max/mdev = 0.030/0.037/0.046/0.006 ms
Test wget To localhost
WGET From URL: https://localhost/nagiosxi/includes/components/ccm/
Running:
/usr/bin/wget https://localhost/nagiosxi/includes/components/ccm/
--2016-09-10 10:59:33-- https://localhost/nagiosxi/includes/components/ccm/
Resolving localhost... 127.0.0.1, 127.0.0.1
Connecting to localhost|127.0.0.1|:443... connected.
ERROR: cannot verify localhost's certificate, issued by "/C=--/ST=SomeState/L=SomeCity/O=SomeOrganization/OU=SomeOrganizationalUnit/CN=nagios-xi-tor-a/emailAddress=root@nagios-xi":
Unable to locally verify the issuer's authority.
ERROR: certificate common name "nagios-xi" doesn't match requested host name "localhost".
To connect to localhost insecurely, use '--no-check-certificate'.