Re: Can't Acknowledge or Schedule Down Time (Not Authorized)

Posted: Wed Oct 26, 2011 8:55 am
by espint
I ran that code and there were no errors, but...

The backup produced this:

Code:

[root@monsrv003 nagios]# /usr/local/nagiosxi/scripts/backup_xi.sh
Backing up Core Config Manager (NagiosQL)...
tar: Removing leading `/' from member names
tar: Removing leading `/' from member names
Backing up Nagios Core...
tar: Removing leading `/' from member names
tar: /usr/local/nagios/var/ndo.sock: socket ignored
tar: /usr/local/nagios/var/spool/checkresults/ctd7vZY: Cannot stat: No such file or directory
tar: /usr/local/nagios/var/spool/checkresults/clebyCo.ok: Cannot stat: No such file or directory
tar: /usr/local/nagios/var/spool/checkresults/cMEF0mY: Cannot stat: No such file or directory
tar: /usr/local/nagios/var/spool/checkresults/cD1zP4z.ok: Cannot stat: No such file or directory
tar: /usr/local/nagios/var/spool/checkresults/c1LJrEz: Cannot stat: No such file or directory
tar: /usr/local/nagios/var/spool/checkresults/cmT9gsZ.ok: Cannot stat: No such file or directory
tar: /usr/local/nagios/var/spool/checkresults/cbhpFEY.ok: Cannot stat: No such file or directory
tar: /usr/local/nagios/var/spool/checkresults/cmT9gsZ: Cannot stat: No such file or directory
tar: /usr/local/nagios/var/spool/checkresults/cpZxdyA: Cannot stat: No such file or directory
tar: /usr/local/nagios/var/spool/checkresults/c1Y8wfY.ok: Cannot stat: No such file or directory
tar: /usr/local/nagios/var/spool/checkresults/cD1zP4z: Cannot stat: No such file or directory
tar: /usr/local/nagios/var/spool/checkresults/cPHeX9a: Cannot stat: No such file or directory
tar: /usr/local/nagios/var/spool/checkresults/cjfQYln: Cannot stat: No such file or directory
tar: /usr/local/nagios/var/spool/checkresults/c3VAkvz.ok: Cannot stat: No such file or directory
tar: /usr/local/nagios/var/spool/checkresults/cnwCFqM: Cannot stat: No such file or directory
tar: /usr/local/nagios/var/spool/checkresults/cPrrV0L.ok: Cannot stat: No such file or directory
tar: /usr/local/nagios/var/spool/checkresults/cnwCFqM.ok: Cannot stat: No such file or directory
tar: /usr/local/nagios/var/spool/checkresults/coHV3CM: Cannot stat: No such file or directory
tar: /usr/local/nagios/var/spool/checkresults/cbhpFEY: Cannot stat: No such file or directory
tar: /usr/local/nagios/var/spool/checkresults/cpZxdyA.ok: Cannot stat: No such file or directory
tar: /usr/local/nagios/var/spool/checkresults/coHV3CM.ok: Cannot stat: No such file or directory
tar: /usr/local/nagios/var/spool/checkresults/cxoPsxb.ok: Cannot stat: No such file or directory
tar: /usr/local/nagios/var/spool/checkresults/chVxv7m: Cannot stat: No such file or directory
tar: /usr/local/nagios/var/spool/checkresults/ctyg0HT: Cannot stat: No such file or directory
tar: /usr/local/nagios/var/spool/checkresults/cZADNoh: Cannot stat: No such file or directory
tar: /usr/local/nagios/var/spool/checkresults/cPIIkLa.ok: Cannot stat: No such file or directory
tar: /usr/local/nagios/var/spool/checkresults/c1Y8wfY: Cannot stat: No such file or directory
tar: /usr/local/nagios/var/spool/checkresults/clebyCo: Cannot stat: No such file or directory
tar: /usr/local/nagios/var/spool/checkresults/c1LJrEz.ok: Cannot stat: No such file or directory
tar: /usr/local/nagios/var/spool/checkresults/cCUGebA.ok: Cannot stat: No such file or directory
tar: /usr/local/nagios/var/spool/checkresults/cMEF0mY.ok: Cannot stat: No such file or directory
tar: /usr/local/nagios/var/spool/checkresults/chVxv7m.ok: Cannot stat: No such file or directory
tar: /usr/local/nagios/var/spool/checkresults/cPIIkLa: Cannot stat: No such file or directory
tar: /usr/local/nagios/var/spool/checkresults/cPHeX9a.ok: Cannot stat: No such file or directory
tar: /usr/local/nagios/var/spool/checkresults/cfZaW6E: Cannot stat: No such file or directory
tar: /usr/local/nagios/var/spool/checkresults/cK9hBGA.ok: Cannot stat: No such file or directory
tar: /usr/local/nagios/var/spool/checkresults/cyUB0Tn.ok: Cannot stat: No such file or directory
tar: /usr/local/nagios/var/spool/checkresults/c3VAkvz: Cannot stat: No such file or directory
tar: /usr/local/nagios/var/spool/checkresults/cjfQYln.ok: Cannot stat: No such file or directory
tar: /usr/local/nagios/var/spool/checkresults/cXK1JRc: Cannot stat: No such file or directory
tar: /usr/local/nagios/var/spool/checkresults/cXK1JRc.ok: Cannot stat: No such file or directory
tar: /usr/local/nagios/var/spool/checkresults/cGIjTSM.ok: Cannot stat: No such file or directory
tar: /usr/local/nagios/var/spool/checkresults/cRZwkLn.ok: Cannot stat: No such file or directory
tar: /usr/local/nagios/var/spool/checkresults/ctyg0HT.ok: Cannot stat: No such file or directory
tar: /usr/local/nagios/var/spool/checkresults/cZADNoh.ok: Cannot stat: No such file or directory
tar: /usr/local/nagios/var/spool/checkresults/cxoPsxb: Cannot stat: No such file or directory
tar: /usr/local/nagios/var/spool/checkresults/clvJ19a: Cannot stat: No such file or directory
tar: /usr/local/nagios/var/spool/checkresults/cRZwkLn: Cannot stat: No such file or directory
tar: /usr/local/nagios/var/spool/checkresults/cK9hBGA: Cannot stat: No such file or directory
tar: /usr/local/nagios/var/spool/checkresults/cPrrV0L: Cannot stat: No such file or directory
tar: /usr/local/nagios/var/spool/checkresults/clvJ19a.ok: Cannot stat: No such file or directory
tar: /usr/local/nagios/var/spool/checkresults/cCUGebA: Cannot stat: No such file or directory
tar: /usr/local/nagios/var/spool/checkresults/cyUB0Tn: Cannot stat: No such file or directory
tar: /usr/local/nagios/var/spool/checkresults/cfZaW6E.ok: Cannot stat: No such file or directory
tar: /usr/local/nagios/var/spool/checkresults/ctd7vZY.ok: Cannot stat: No such file or directory
tar: /usr/local/nagios/var/spool/checkresults/cGIjTSM: Cannot stat: No such file or directory
tar: Error exit delayed from previous errors
Backing up Nagios XI...
tar: Removing leading `/' from member names
Backing up MySQL databases...
mysqldump: Got error: 144: Table './nagios/nagios_conninfo' is marked as crashed and last (automatic?) repair failed when using LOCK TABLES
Error backing up MySQL database 'nagios' - check the password in this script!
You have new mail in /var/spool/mail/root
[root@monsrv003 nagios]# 
Thanks again

Re: Can't Acknowledge or Schedule Down Time (Not Authorized)

Posted: Wed Oct 26, 2011 10:23 am
by mguthrie
Hmm, we need to try and get that table repaired...

Code:

service nagios stop
service mysqld stop
cd /var/lib/mysql/nagios
myisamchk -r -f nagios_nagios_conninfo.MYI
service mysqld start
service nagios start
If we can't get the table repaired, we'll have to drop it and rebuild it, but I'd rather we try the repair run a few times before we move to that. Let us know how it goes.
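
Before the repair run, it may also be worth confirming what the index file is actually called, since myisamchk needs the exact .MYI file name. A minimal sketch, assuming the data directory is /var/lib/mysql/nagios as in the steps above; the find_myi helper name is made up for illustration:

```shell
# Hypothetical helper: list the MyISAM index files (*.MYI) in a directory
# whose names contain the given fragment.
find_myi() {
    dir="$1"
    fragment="$2"
    for f in "$dir"/*"$fragment"*.MYI; do
        # If nothing matches, the glob is left unexpanded, so skip
        # anything that is not a real file.
        if [ -e "$f" ]; then
            basename "$f"
        fi
    done
    return 0
}

# Example (with mysqld stopped, as in the steps above):
#   find_myi /var/lib/mysql/nagios conninfo
```

Whatever name that prints is the one to pass to myisamchk -r -f.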

Re: Can't Acknowledge or Schedule Down Time (Not Authorized)

Posted: Wed Oct 26, 2011 4:18 pm
by espint
Tried the repair three times with no luck. The command myisamchk -r -f nagios_nagios_conninfo.MYI produced the following errors all three times, and the backup command failed the same way as before.

[root@monsrv003 nagios]# myisamchk -r -f nagios_nagios_conninfo.MYI
Warning: option 'key_buffer_size': unsigned value 18446744073709551615 adjusted to 4294963200
Warning: option 'read_buffer_size': unsigned value 18446744073709551615 adjusted to 4294967295
Warning: option 'write_buffer_size': unsigned value 18446744073709551615 adjusted to 4294967295
Warning: option 'sort_buffer_size': unsigned value 18446744073709551615 adjusted to 4294967295
myisamchk: error: File 'nagios_nagios_conninfo.MYI' doesn't exist
[root@monsrv003 nagios]# service mysqld start

Re: Can't Acknowledge or Schedule Down Time (Not Authorized)

Posted: Thu Oct 27, 2011 10:28 am
by mguthrie
Wow, that table is massive. That's a limitation of that particular version of MySQL. There's an option to increase that buffer size, but there was a bug in older versions of MySQL where it didn't work. If you're running RHEL 6, CentOS 5.7, or CentOS 6, you might be able to add the --max-buffer-size=18446744073709551615 flag to the repair run, but I'm guessing that if your table is that large, it's probably an older OS that's been running for a while.

To save the rest of your historical data for the backup, we'll have to drop and recreate this table. It only stores connection information about NDOUtils. I'm guessing it got huge after the table got corrupted.

Code:

mysql -u root -p'nagiosxi' nagios
DROP TABLE nagios_conninfo;
Then run this query to rebuild the table.

Code:

CREATE TABLE IF NOT EXISTS `nagios_conninfo` (
  `conninfo_id` int(11) NOT NULL auto_increment,
  `instance_id` smallint(6) NOT NULL default '0',
  `agent_name` varchar(32) character set latin1 NOT NULL default '',
  `agent_version` varchar(8) character set latin1 NOT NULL default '',
  `disposition` varchar(16) character set latin1 NOT NULL default '',
  `connect_source` varchar(16) character set latin1 NOT NULL default '',
  `connect_type` varchar(16) character set latin1 NOT NULL default '',
  `connect_time` datetime NOT NULL default '0000-00-00 00:00:00',
  `disconnect_time` datetime NOT NULL default '0000-00-00 00:00:00',
  `last_checkin_time` datetime NOT NULL default '0000-00-00 00:00:00',
  `data_start_time` datetime NOT NULL default '0000-00-00 00:00:00',
  `data_end_time` datetime NOT NULL default '0000-00-00 00:00:00',
  `bytes_processed` int(11) NOT NULL default '0',
  `lines_processed` int(11) NOT NULL default '0',
  `entries_processed` int(11) NOT NULL default '0',
  PRIMARY KEY  (`conninfo_id`)
) ENGINE=MyISAM  COMMENT='NDO2DB daemon connection information';

Re: Can't Acknowledge or Schedule Down Time (Not Authorized)

Posted: Thu Nov 03, 2011 7:25 am
by espint
I lost all host and service names. I gave up this fight and restored from a backup from 6 months ago. That worked and the errors were resolved. I added the new hosts and brought the configuration up to date. Same old story: I wish I had a more recent backup. If I had a wish list now, automating the backup process would be on it. Thanks for all your effort.

Re: Can't Acknowledge or Schedule Down Time (Not Authorized)

Posted: Thu Nov 03, 2011 10:51 am
by mguthrie
Thanks for the update. Sorry we weren't able to preserve the current DB info. A simple, automated backup procedure is also on our TODO list.
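
Until that lands, a cron job around the existing script is one way to get there. A minimal sketch, assuming the backup_xi.sh path shown earlier in this thread and assuming the script returns a non-zero exit status on failure (worth verifying first). The run_backup function name, the log path, the wrapper script name and the 2 a.m. schedule are all made-up examples, not something XI ships:

```shell
# Hypothetical wrapper: run a backup command, append its output to a log,
# and add a timestamped OK/FAILED line. Returns the command's exit status.
run_backup() {
    cmd="$1"
    log="$2"
    if "$cmd" >>"$log" 2>&1; then
        echo "$(date '+%F %T') backup OK" >>"$log"
        return 0
    else
        status=$?
        echo "$(date '+%F %T') backup FAILED (exit $status)" >>"$log"
        return "$status"
    fi
}

# Example call from a script run by root's crontab (paths are assumptions):
#   run_backup /usr/local/nagiosxi/scripts/backup_xi.sh /var/log/xi_backup.log
# with a crontab line such as:
#   0 2 * * * /root/xi_backup_wrapper.sh
```

Checking the log for FAILED lines now and then would catch a case like the crashed table above long before the newest good backup is six months old.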

Re: Can't Acknowledge or Schedule Down Time (Not Authorized)

Posted: Fri Nov 04, 2011 7:35 am
by espint
I was a little premature in saying all was well after the restore. All the new hosts and their services are working fine, but the existing hosts have a status of Unknown, and Status Information says: "Server port must be an integer". I have tried removing the existing services and then the host, but that does not fix the problem. Any ideas?