Storage crash and database errors

Posted: Mon Oct 17, 2011 6:03 pm
by jsmurphy
We had a major storage crash last week (good times, let me assure you), and it appears to have corrupted the Nagios XI database. The application still appears to be running OK, but /var/log/messages is filled with these fun messages:

Oct 18 09:52:07 hostname ndo2db: Error: mysql_query() failed for 'INSERT INTO nagios_logentries SET instance_id='1', logentry_time=FROM_UNIXTIME(1318891926), entry_time=FROM_UNIXTIME(1318891926), entry_time_usec='908347', logentry_type='64', logentry_data='Finished daemonizing\.\.\. \(New PID=11235\)', realtime_data='1', inferred_data_extracted='1''
Oct 18 09:52:07 hostname ndo2db: mysql_error: 'Table './nagios/nagios_logentries' is marked as crashed and last (automatic?) repair failed'
Oct 18 09:53:07 hostname ndo2db: Error: mysql_query() failed for 'DELETE FROM nagios_eventhandlers WHERE instance_id='1' AND start_time<FROM_UNIXTIME(1316213587)'
Oct 18 09:53:07 hostname ndo2db: mysql_error: 'Table './nagios/nagios_eventhandlers' is marked as crashed and last (automatic?) repair failed'
Oct 18 09:54:08 hostname ndo2db: Error: mysql_query() failed for 'DELETE FROM nagios_eventhandlers WHERE instance_id='1' AND start_time<FROM_UNIXTIME(1316213648)'
Oct 18 09:54:08 hostname ndo2db: mysql_error: 'Table './nagios/nagios_eventhandlers' is marked as crashed and last (automatic?) repair failed'

I'm using the standard 32-bit CentOS VMware image. Is there a database health script hidden somewhere that might help restore order?
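
For anyone triaging the same symptoms, here is a rough sketch for pulling the list of affected tables out of the log. It assumes the mysql_error lines look exactly like the ones quoted above; list_crashed_tables is a made-up helper name, not something shipped with Nagios XI.

```shell
# Hypothetical helper (not part of Nagios XI).
# Reads syslog lines on stdin and prints each distinct "database/table"
# that a mysql_error line reports as "marked as crashed".
list_crashed_tables() {
    sed -n "s|.*Table '\./\([^']*\)' is marked as crashed.*|\1|p" | sort -u
}

# Usage, with the log path from the post above:
# list_crashed_tables < /var/log/messages
```

Run against the excerpt above, this should list nagios/nagios_eventhandlers and nagios/nagios_logentries once each, however many times the errors repeat.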

Re: Storage crash and database errors

Posted: Mon Oct 17, 2011 6:37 pm
by jsmurphy
Never mind, I found it! For anyone else who comes across the same problem, the script is /usr/local/nagiosxi/scripts/repairmysql.sh

To repair the nagios_logentries table, I had to edit the script and change line 44 from "$cmd -r -q $t" to "$cmd -r $t". This turns the quick-fix flag (-q) off.
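
A minimal sketch of what that change amounts to for a single table. This is not the contents of repairmysql.sh. It assumes $cmd in the script is myisamchk, that the tables are MyISAM files under /var/lib/mysql/<database>/, and that mysqld is stopped while the repair runs. If $cmd really is myisamchk, -q repairs only the index file, which may be why a plain -r worked where the quick repair did not. repair_command and DATADIR are names made up for illustration.

```shell
# Hypothetical helper, not taken from repairmysql.sh.
# Assumption: MySQL data directory; override DATADIR if yours differs.
DATADIR="${DATADIR:-/var/lib/mysql}"

# Print (do not run) a full-recovery command for one table.
# -r without -q matches the edit described above.
repair_command() {
    db="$1"
    table="$2"
    echo "myisamchk -r $DATADIR/$db/$table.MYI"
}

# Review the printed command, then run it with the database stopped, e.g.:
# service mysqld stop
# $(repair_command nagios nagios_logentries)
# service mysqld start
```

Printing the command first makes it easy to check the path before touching the table files.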

Re: Storage crash and database errors

Posted: Tue Oct 18, 2011 10:27 am
by mguthrie
Thanks for the update. We actually tweaked that script for the 1.8 release to use the "-r -f" flags so that repair runs succeed more often.