MySQL Errors and Ghosting

This support forum board is for support questions relating to Nagios XI, our flagship commercial network monitoring solution.
Locked
User avatar
mikew
Posts: 243
Joined: Sun Feb 05, 2012 7:05 pm

MySQL Errors and Ghosting

Post by mikew »

I am just finishing up a project and now getting some database errors and ghost behavior. There were several hundred passive hosts with 3000 passive services checks disabled on this server. One problem is that the current hosts and services are fine and then 5 minutes later all of the passive services show up in the Home Page Service Status Summary as Critical...they have been disabled they should not show up. When you click on the Critical link you see the services listed. These services are disabled and the hosts they are on are disabled in the CCM. The changes have been applied and the output was successful. Server has 16 GB of RAM and 32 CPU and is a physical box with RAID 5.

Some Problems I am seeing:
Tactical Overview shows 100 Hosts and 800 Services
Hostgroup Grid shows 100 Hosts and 800 Services
Host Detail shows 100 Hosts and 800 Services
but
Home Page Service Status Summary shows 100 Hosts and 2200 Services(attached to all the hosts that were removed)
Service Detail shows 2200 services.
So the data summaries on the server itself are incorrect. Again, there are no text files to represent these ghosting services in the services directory.



I have repaired the database using the repair_database.sh script. I have truncated nagios_logentries and nagios_notifications with these commands:

Code: Select all

mysql -u ndoutils -pn@gweb nagios -e 'TRUNCATE TABLE nagios_logentries'
mysql -u ndoutils -pn@gweb nagios -e 'TRUNCATE TABLE nagios_notifications'
Then the database repair script was run again. Initially the repair seemed to work but then the issue returned in a few minutes. It seems like a ghosting issue of the past so I looked and the hosts and services in question are not listed in /usr/local/nagios/etc/services or hosts. There are no text files listing them. These are errors that have occurred in the /var/log/messages. The mysql log file is good. Any ideas?


Here are recent log entries:

Code: Select all

Sep 10 10:58:06 nagios-xi ndo2db: Error: mysql_query() failed for 'INSERT INTO nagios_systemcommands SET instance_id='1', start_time=FROM_UNIXTIME(1473505086), start_time_usec='904627', end_time=FROM_UNIXTIME(0), end_time_usec='0', command_line='/bin/mv /usr/local/nagios/var/host-perfdata /usr/local/nagios/var/spool/xidpe/1473505086\.perfdata\.host', timeout='5', early_timeout='0', execution_time='0.000000', return_code='0', output='', long_output='' ON DUPLICATE KEY UPDATE instance_id='1', start_time=FROM_UNIXTIME(1473505086), start_time_usec='904627', end_time=FROM_UNIXTIME(0), end_time_usec='0', command_line='/bin/mv /usr/local/nagios/var/host-perfdata /usr/local/nagios/var/spool/xidpe/1473505086\.perfdata\.host', timeout='5', early_timeout='0', execution_time='0.000000', return_code='0', output='', long_output='''

Sep 10 10:58:06 nagios-xi ndo2db: mysql_error: 'Table 'nagios.nagios_systemcommands' doesn't exist'

Sep 10 10:58:06 nagios-xi ndo2db: Error: mysql_query() failed for 'INSERT INTO nagios_systemcommands SET instance_id='1', start_time=FROM_UNIXTIME(1473505086), start_time_usec='904627', end_time=FROM_UNIXTIME(1473505086), end_time_usec='975513', command_line='/bin/mv /usr/local/nagios/var/host-perfdata /usr/local/nagios/var/spool/xidpe/1473505086\.perfdata\.host', timeout='5', early_timeout='0', execution_time='0.070000', return_code='0', output='', long_output='' ON DUPLICATE KEY UPDATE instance_id='1', start_time=FROM_UNIXTIME(1473505086), start_time_usec='904627', end_time=FROM_UNIXTIME(1473505086), end_time_usec='975513', command_line='/bin/mv /usr/local/nagios/var/host-perfdata /usr/local/nagios/var/spool/xidpe/1473505086\.perfdata\.host', timeout='5', early_timeout='0', execution_time='0.070000', return_code='0', output='', long_output='''

Sep 10 10:58:06 nagios-xi ndo2db: mysql_error: 'Table 'nagios.nagios_systemcommands' doesn't exist'

Sep 10 10:58:06 nagios-xi ndo2db: Error: mysql_query() failed for 'INSERT INTO nagios_systemcommands SET instance_id='1', start_time=FROM_UNIXTIME(1473505086), start_time_usec='975838', end_time=FROM_UNIXTIME(0), end_time_usec='0', command_line='/usr/local/nagios/libexec/check_beams\.sh /usr/local/nagios/var/service-perfdata /usr/local/nagios/var/spool/xidpe/1473505086\.perfdata\.service', timeout='5', early_timeout='0', execution_time='0.000000', return_code='0', output='', long_output='' ON DUPLICATE KEY UPDATE instance_id='1', start_time=FROM_UNIXTIME(1473505086), start_time_usec='975838', end_time=FROM_UNIXTIME(0), end_time_usec='0', command_line='/usr/local/nagios/libexec/check_beams\.sh /usr/local/nagios/var/service-perfdata /usr/local/nagios/var/spool/xidpe/1473505086\.perfdata\.service', timeout='5', early_timeout='0', execution_time='0.000000', return_code='0', output='', long_output='''

Sep 10 10:58:06 nagios-xi ndo2db: mysql_error: 'Table 'nagios.nagios_systemcommands' doesn't exist'

Sep 10 10:58:07 nagios-xi ndo2db: Error: mysql_query() failed for 'INSERT INTO nagios_systemcommands SET instance_id='1', start_time=FROM_UNIXTIME(1473505086), start_time_usec='975838', end_time=FROM_UNIXTIME(1473505087), end_time_usec='135004', command_line='/usr/local/nagios/libexec/check_beams\.sh /usr/local/nagios/var/service-perfdata /usr/local/nagios/var/spool/xidpe/1473505086\.perfdata\.service', timeout='5', early_timeout='0', execution_time='0.160000', return_code='0', output='', long_output='' ON DUPLICATE KEY UPDATE instance_id='1', start_time=FROM_UNIXTIME(1473505086), start_time_usec='975838', end_time=FROM_UNIXTIME(1473505087), end_time_usec='135004', command_line='/usr/local/nagios/libexec/check_beams\.sh /usr/local/nagios/var/service-perfdata /usr/local/nagios/var/spool/xidpe/1473505086\.perfdata\.service', timeout='5', early_timeout='0', execution_time='0.160000', return_code='0', output='', long_output='''

Sep 10 10:58:07 nagios-xi ndo2db: mysql_error: 'Table 'nagios.nagios_systemcommands' doesn't exist'


Nagios XI Installation Profile

Code: Select all

System:
Nagios XI Version : 5.2.7
nagios-xi 2.6.32-504.el6.x86_64 x86_64
CentOS release 6.6 (Final)
Gnome is not installed
Apache Information
PHP Version: 5.3.3
Agent: Mozilla/5.0 (Macintosh; Intel Mac OS X 10.11; rv:48.0) Gecko/20100101 Firefox/48.0
Server Name: x.x.x.x
Server Address: x.x.x.x
Server Port: 443
Date/Time
PHP Timezone: UTC
PHP Time: Sat, 10 Sep 2016 10:59:31 +0000
System Time: Sat, 10 Sep 2016 10:59:31 +0000
Nagios XI Data
License ends in: x.x.x.x

nagios (pid 41556) is running...
NPCD running (pid 4003).
ndo2db (pid 19205) is running...
CPU Load 15: 2.20
Total Hosts: 334
Total Services: 793
Function 'get_base_uri' returns: https://x.x.x.x/nagiosxi/
Function 'get_base_url' returns: https://x.x.x.x/nagiosxi/
Function 'get_backend_url(internal_call=false)' returns: https://x.x.x.x/nagiosxi/includes/components/profile/profile.php
Function 'get_backend_url(internal_call=true)' returns: https://localhost/nagiosxi/backend/
Ping Test localhost
Running:

/bin/ping -c 3 localhost 2>&1 

PING localhost (127.0.0.1) 56(84) bytes of data.
64 bytes from localhost (127.0.0.1): icmp_seq=1 ttl=64 time=0.046 ms
64 bytes from localhost (127.0.0.1): icmp_seq=2 ttl=64 time=0.030 ms
64 bytes from localhost (127.0.0.1): icmp_seq=3 ttl=64 time=0.035 ms

--- localhost ping statistics ---
3 packets transmitted, 3 received, 0% packet loss, time 2000ms
rtt min/avg/max/mdev = 0.030/0.037/0.046/0.006 ms
Test wget To localhost
WGET From URL: https://localhost/nagiosxi/includes/components/ccm/
Running:

/usr/bin/wget https://localhost/nagiosxi/includes/components/ccm/ 

--2016-09-10 10:59:33-- https://localhost/nagiosxi/includes/components/ccm/
Resolving localhost... 127.0.0.1, 127.0.0.1
Connecting to localhost|127.0.0.1|:443... connected.
ERROR: cannot verify localhost's certificate, issued by "/C=--/ST=SomeState/L=SomeCity/O=SomeOrganization/OU=SomeOrganizationalUnit/CN=nagios-xi-tor-a/emailAddress=root@nagios-xi":
Unable to locally verify the issuer's authority.
ERROR: certificate common name "nagios-xi" doesn't match requested host name "localhost".
To connect to localhost insecurely, use '--no-check-certificate'.
Mike Weber

Nagios Training/Consulting
User avatar
tgriep
Madmin
Posts: 9181
Joined: Thu Oct 30, 2014 9:02 am

Re: MySQL Errors and Ghosting

Post by tgriep »

Can you login to the XI server as root and post the output of the following commands?

Code: Select all

echo 'show tables;' | mysql  -t -pnagiosxi nagios
echo 'desc nagios_systemcommands;' | mysql  -t -pnagiosxi nagios
echo 'select * from nagios_systemcommands;' | mysql  -t -pnagiosxi nagios
ps -ef --cols=300
I want to verify that the nagios_systemcommands table is missing of that there is another issue.
Be sure to check out our Knowledgebase for helpful articles and solutions!
User avatar
mikew
Posts: 243
Joined: Sun Feb 05, 2012 7:05 pm

Re: MySQL Errors and Ghosting

Post by mikew »

Here is the output:

Code: Select all

show tables;
+----------------------------------------+
| Tables_in_nagios                       |
+----------------------------------------+
| nagios_acknowledgements                |
| nagios_commands                        |
| nagios_commenthistory                  |
| nagios_comments                        |
| nagios_configfiles                     |
| nagios_configfilevariables             |
| nagios_conninfo                        |
| nagios_contact_addresses               |
| nagios_contact_notificationcommands    |
| nagios_contactgroup_members            |
| nagios_contactgroups                   |
| nagios_contactnotificationmethods      |
| nagios_contactnotifications            |
| nagios_contacts                        |
| nagios_contactstatus                   |
| nagios_customvariables                 |
| nagios_customvariablestatus            |
| nagios_dbversion                       |
| nagios_downtimehistory                 |
| nagios_eventhandlers                   |
| nagios_externalcommands                |
| nagios_flappinghistory                 |
| nagios_host_contactgroups              |
| nagios_host_contacts                   |
| nagios_host_parenthosts                |
| nagios_hostchecks                      |
| nagios_hostdependencies                |
| nagios_hostescalation_contactgroups    |
| nagios_hostescalation_contacts         |
| nagios_hostescalations                 |
| nagios_hostgroup_members               |
| nagios_hostgroups                      |
| nagios_hosts                           |
| nagios_hoststatus                      |
| nagios_instances                       |
| nagios_logentries                      |
| nagios_notifications                   |
| nagios_objects                         |
| nagios_processevents                   |
| nagios_programstatus                   |
| nagios_runtimevariables                |
| nagios_scheduleddowntime               |
| nagios_service_contactgroups           |
| nagios_service_contacts                |
| nagios_service_parentservices          |
| nagios_servicechecks                   |
| nagios_servicedependencies             |
| nagios_serviceescalation_contactgroups |
| nagios_serviceescalation_contacts      |
| nagios_serviceescalations              |
| nagios_servicegroup_members            |
| nagios_servicegroups                   |
| nagios_services                        |
| nagios_servicestatus                   |
+----------------------------------------+
54 rows in set (0.00 sec)

Code: Select all

desc nagios_systemcommands;
ERROR 1146 (42S02): Table 'nagios.nagios_systemcommands' doesn't exist

Code: Select all

select * from nagios_systemcommands;
ERROR 1146 (42S02): Table 'nagios.nagios_systemcommands' doesn't exist
Mike Weber

Nagios Training/Consulting
User avatar
tgriep
Madmin
Posts: 9181
Joined: Thu Oct 30, 2014 9:02 am

Re: MySQL Errors and Ghosting

Post by tgriep »

Do you have a recent backup of the XI server that you can use to restore it to a known good state?
It looks like you are missing more that that one table. Here is a list of the ones that are missing from your server.

Code: Select all

| nagios_statehistory                    |
| nagios_systemcommands                  |
| nagios_timedeventqueue                 |
| nagios_timedevents                     |
| nagios_timeperiod_timeranges           |
| nagios_timeperiods                     |
Be sure to check out our Knowledgebase for helpful articles and solutions!
User avatar
mikew
Posts: 243
Joined: Sun Feb 05, 2012 7:05 pm

Re: MySQL Errors and Ghosting

Post by mikew »

There are backups, however a tremendous amount of time has recently been spent on the server, as the project is ready to close. I suspect this has been an issue since early on. So I am suspicious of backups and anything over a week old will require many hours to get back the work that has been done. Is it possible to insert the tables that are missing?
Mike Weber

Nagios Training/Consulting
User avatar
tgriep
Madmin
Posts: 9181
Joined: Thu Oct 30, 2014 9:02 am

Re: MySQL Errors and Ghosting

Post by tgriep »

I have some instructions for recreating those missing tables but there is no guarantee that this will fix it as there could be other corruption that we don't know about.
Make a full backup of the server before proceeding.
Create a file called /tmp/mysql.sql and put the following in it.

Code: Select all

--
-- Table structure for table `nagios_statehistory`
--
  
CREATE TABLE IF NOT EXISTS `nagios_statehistory` (
  `statehistory_id` int(11) NOT NULL auto_increment,
  `instance_id` smallint(6) NOT NULL default '0',
  `state_time` datetime NOT NULL default '0000-00-00 00:00:00',
  `state_time_usec` int(11) NOT NULL default '0',
  `object_id` int(11) NOT NULL default '0',
  `state_change` smallint(6) NOT NULL default '0',
  `state` smallint(6) NOT NULL default '0',
  `state_type` smallint(6) NOT NULL default '0',
  `current_check_attempt` smallint(6) NOT NULL default '0',
  `max_check_attempts` smallint(6) NOT NULL default '0',
  `last_state` smallint(6) NOT NULL default '-1',
  `last_hard_state` smallint(6) NOT NULL default '-1',
  `output` varchar(255) character set latin1 NOT NULL default '',
  `long_output` TEXT NOT NULL default '',
  PRIMARY KEY  (`statehistory_id`)
) ENGINE=MyISAM COMMENT='Historical host and service state changes';
  
-- --------------------------------------------------------

--
-- Table structure for table `nagios_systemcommands`
--
  
CREATE TABLE IF NOT EXISTS `nagios_systemcommands` (
  `systemcommand_id` int(11) NOT NULL auto_increment,
  `instance_id` smallint(6) NOT NULL default '0',
  `start_time` datetime NOT NULL default '0000-00-00 00:00:00',
  `start_time_usec` int(11) NOT NULL default '0',
  `end_time` datetime NOT NULL default '0000-00-00 00:00:00',
  `end_time_usec` int(11) NOT NULL default '0',
  `command_line` varchar(255) character set latin1 NOT NULL default '',
  `timeout` smallint(6) NOT NULL default '0',
  `early_timeout` smallint(6) NOT NULL default '0',
  `execution_time` double NOT NULL default '0',
  `return_code` smallint(6) NOT NULL default '0',
  `output` varchar(255) character set latin1 NOT NULL default '',
  `long_output` TEXT NOT NULL default '',   
  PRIMARY KEY  (`systemcommand_id`),
  KEY `instance_id` (`instance_id`),
  KEY `start_time` (`start_time`)
) ENGINE=MyISAM  COMMENT='Historical system commands that are executed';

-- --------------------------------------------------------

--
-- Table structure for table `nagios_timedeventqueue`
-- 
  
CREATE TABLE IF NOT EXISTS `nagios_timedeventqueue` (
  `timedeventqueue_id` int(11) NOT NULL auto_increment,
  `instance_id` smallint(6) NOT NULL default '0',
  `event_type` smallint(6) NOT NULL default '0', 
  `queued_time` datetime NOT NULL default '0000-00-00 00:00:00',
  `queued_time_usec` int(11) NOT NULL default '0',
  `scheduled_time` datetime NOT NULL default '0000-00-00 00:00:00',
  `recurring_event` smallint(6) NOT NULL default '0',
  `object_id` int(11) NOT NULL default '0',
  PRIMARY KEY  (`timedeventqueue_id`),
  KEY `instance_id` (`instance_id`),
  KEY `event_type` (`event_type`),
  KEY `scheduled_time` (`scheduled_time`),
  KEY `object_id` (`object_id`)
) ENGINE=MyISAM  COMMENT='Current Nagios event queue';
  
-- --------------------------------------------------------

--
-- Table structure for table `nagios_timedevents`
-- 

CREATE TABLE IF NOT EXISTS `nagios_timedevents` (
  `timedevent_id` int(11) NOT NULL auto_increment,   
  `instance_id` smallint(6) NOT NULL default '0',
  `event_type` smallint(6) NOT NULL default '0',
  `queued_time` datetime NOT NULL default '0000-00-00 00:00:00',
  `queued_time_usec` int(11) NOT NULL default '0',
  `event_time` datetime NOT NULL default '0000-00-00 00:00:00',
  `event_time_usec` int(11) NOT NULL default '0',
  `scheduled_time` datetime NOT NULL default '0000-00-00 00:00:00',
  `recurring_event` smallint(6) NOT NULL default '0',
  `object_id` int(11) NOT NULL default '0',
  `deletion_time` datetime NOT NULL default '0000-00-00 00:00:00',
  `deletion_time_usec` int(11) NOT NULL default '0',
  PRIMARY KEY  (`timedevent_id`),
  KEY `instance_id` (`instance_id`),
  KEY `event_type` (`event_type`),
  KEY `scheduled_time` (`scheduled_time`),
  KEY `object_id` (`object_id`)
) ENGINE=MyISAM  COMMENT='Historical events from the Nagios event queue';
  
-- --------------------------------------------------------

--
-- Table structure for table `nagios_timeperiods`
-- 

CREATE TABLE IF NOT EXISTS `nagios_timeperiods` (
  `timeperiod_id` int(11) NOT NULL auto_increment,   
  `instance_id` smallint(6) NOT NULL default '0',
  `config_type` smallint(6) NOT NULL default '0',
  `timeperiod_object_id` int(11) NOT NULL default '0',
  `alias` varchar(255) character set latin1 NOT NULL default '',
  PRIMARY KEY  (`timeperiod_id`),
  UNIQUE KEY `instance_id` (`instance_id`,`config_type`,`timeperiod_object_id`)
) ENGINE=MyISAM  COMMENT='Timeperiod definitions';
  
-- --------------------------------------------------------
  
--
-- Table structure for table `nagios_timeperiod_timeranges`
--
  
CREATE TABLE IF NOT EXISTS `nagios_timeperiod_timeranges` (
  `timeperiod_timerange_id` int(11) NOT NULL auto_increment,
  `instance_id` smallint(6) NOT NULL default '0',
  `timeperiod_id` int(11) NOT NULL default '0',
  `day` smallint(6) NOT NULL default '0',
  `start_sec` int(11) NOT NULL default '0',
  `end_sec` int(11) NOT NULL default '0',
  PRIMARY KEY  (`timeperiod_timerange_id`),
  UNIQUE KEY `instance_id` (`timeperiod_id`,`day`,`start_sec`,`end_sec`)
) ENGINE=MyISAM  COMMENT='Timeperiod definitions';
Save it and then run this to recreate the the missing tables.

Code: Select all

mysql -u root -pnagiosxi nagios < /tmp/mysql.sql
Then I would restart the nagios and ndo2bd process just in case.

Let us know if this helps out in the issue you are having.
Be sure to check out our Knowledgebase for helpful articles and solutions!
Locked