Nagios XI 2014 2.6 mysql DB errors...
Posted: Fri Dec 04, 2015 11:14 am
by highness
When we installed our instance of Nagios, we separated out the MySQL servers to external servers for failover / redundancy. It's been working well for quite a while, but in the past few weeks, we've noticed several issues that we're trying to deal with:
1. When we ACK an alert, it stays ACK'd for a short period of time before it comes back.
2. We haven't been collecting any information in our event logs (Home --> Monitoring Process --> Event Log).
When I look at the logs, I see a bunch of entries like this:
Code:
Dec 4 08:07:22 fe1 ndo2db: Error: mysql_query() failed for 'INSERT INTO nagios_logentries SET instance_id='1', logentry_time=FROM_UNIXTIME(1449245242), entry_time=FROM_UNIXTIME(1449245242), entry_time_usec='597398', logentry_type='65536', logentry_data='SERVICE ALERT: our_router\.ourcompany\.com;Ping;CRITICAL;SOFT;3;CRITICAL - 10\.75\.24\.211: rta nan, lost 100%', realtime_data='1', inferred_data_extracted='1''
Dec 4 08:07:22 fe1 ndo2db: mysql_error: 'Table './nagios/nagios_logentries' is marked as crashed and last (automatic?) repair failed'
I made sure that MySQL wasn't running on the Nagios box (only on the remote MySQL box).
I copied the repairmysql.sh script over to the MySQL box and ran the following there:
Code:
service mysqld stop
repairmysql.sh nagios
service mysqld start
It repaired everything without a hitch. Didn't see anything weird or curious.
I went back to the Nagios box and restarted ndo2db and Nagios, but I'm still seeing those entries in the logs on the Nagios box.
Re: Nagios XI 2014 2.6 mysql DB errors...
Posted: Fri Dec 04, 2015 11:24 am
by tgriep
Let's try this process to repair the database.
Log in to your XI system as root and run the following command twice to see if it repairs the database. Replace xxx.xxx.xxx.xxx with the IP address of your MySQL server.
Code:
mysqlcheck -f -r -u ndoutils -pn@gweb --databases nagios -h xxx.xxx.xxx.xxx
If you get errors on the second run of this command, post them here.
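The "run it twice, then look at the second run" check can be scripted. This is only a minimal sketch, assuming mysqlcheck prints one "table OK" line per healthy table (as in the output posted later in this thread). The function names `non_ok_lines` and `repair_twice` and the variables `DB_HOST`, `DB_USER` and `DB_PASS` are hypothetical and not part of Nagios XI.

```shell
#!/bin/sh
# Hypothetical helper: print every line of mysqlcheck output that does not
# end in "OK", so any remaining problems stand out.
non_ok_lines() {
    # grep exits 1 when every line ended in OK; treat that as success.
    grep -v 'OK[[:space:]]*$' || true
}

# Hypothetical wrapper: run the repair command from the post above twice and
# show only what the second pass still reports.
# DB_HOST, DB_USER and DB_PASS are assumptions; set them for your environment.
repair_twice() {
    mysqlcheck -f -r -u "$DB_USER" -p"$DB_PASS" --databases nagios -h "$DB_HOST" >/dev/null 2>&1
    mysqlcheck -f -r -u "$DB_USER" -p"$DB_PASS" --databases nagios -h "$DB_HOST" 2>&1 | non_ok_lines
}
```

If `repair_twice` prints nothing, the second pass reported every table as OK. Anything it does print is what to post back here.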
Re: Nagios XI 2014 2.6 mysql DB errors...
Posted: Fri Dec 04, 2015 12:36 pm
by highness
We don't have an ndoutils user. I'm assuming that you mean nagios?
Re: Nagios XI 2014 2.6 mysql DB errors...
Posted: Fri Dec 04, 2015 12:56 pm
by hsmith
What's the output of this command?
Code:
echo "select User from user;" | mysql -u root -pnagiosxi mysql
Re: Nagios XI 2014 2.6 mysql DB errors...
Posted: Fri Dec 04, 2015 1:01 pm
by highness
echo "select User from user;" | mysql -u root -pOURSECRETPASSWORD mysql
User
nagiosql
replication
nagios
nagiosql
nagios
nagiosql
root
nagios
nagiosql
root
root
root
replication
root
root
Re: Nagios XI 2014 2.6 mysql DB errors...
Posted: Fri Dec 04, 2015 2:39 pm
by tgriep
You can use the root username and password in the mysqlcheck command. My example was using the default ones.
Re: Nagios XI 2014 2.6 mysql DB errors...
Posted: Fri Dec 04, 2015 3:21 pm
by highness
When I try that command, I get this far:
Code:
nagios.nagios_acknowledgements OK
nagios.nagios_commands OK
nagios.nagios_commenthistory OK
nagios.nagios_comments OK
nagios.nagios_configfiles OK
nagios.nagios_configfilevariables OK
nagios.nagios_conninfo OK
nagios.nagios_contact_addresses OK
nagios.nagios_contact_notificationcommands OK
nagios.nagios_contactgroup_members OK
nagios.nagios_contactgroups OK
nagios.nagios_contactnotificationmethods OK
nagios.nagios_contactnotifications OK
nagios.nagios_contacts OK
nagios.nagios_contactstatus OK
nagios.nagios_customvariables OK
nagios.nagios_customvariablestatus OK
nagios.nagios_dbversion OK
nagios.nagios_downtimehistory OK
nagios.nagios_eventhandlers OK
nagios.nagios_externalcommands OK
nagios.nagios_flappinghistory OK
nagios.nagios_host_contactgroups OK
nagios.nagios_host_contacts OK
nagios.nagios_host_parenthosts OK
nagios.nagios_hostchecks OK
nagios.nagios_hostdependencies OK
nagios.nagios_hostescalation_contactgroups OK
nagios.nagios_hostescalation_contacts OK
nagios.nagios_hostescalations OK
nagios.nagios_hostgroup_members OK
nagios.nagios_hostgroups OK
nagios.nagios_hosts OK
nagios.nagios_hoststatus OK
nagios.nagios_instances OK
At which point, the /tmp directory on the database server fills up and it hangs. The /tmp directory is 1G in size.
I'm still seeing in the logs that the Nagios box thinks that the nagios_logentries table is crashed:
Code:
Dec 4 12:16:39 fe1 ndo2db: Error: mysql_query() failed for 'INSERT INTO nagios_logentries SET instance_id='1', logentry_time=FROM_UNIXTIME(1449260199), entry_time=FROM_UNIXTIME(1449260199), entry_time_usec='743762', logentry_type='514', logentry_data='External command error: Command failed', realtime_data='1', inferred_data_extracted='1''
Dec 4 12:16:39 fe1 ndo2db: mysql_error: 'Table './nagios/nagios_logentries' is marked as crashed and last (automatic?) repair failed'
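Since the repair stalled once /tmp filled up, it may help to compare the size of the crashed table's files with the free space in a candidate directory before retrying. This is a rough sketch only. The helper names are hypothetical, and "free space at least as large as the table file" is a rule of thumb assumed here, not a documented MySQL requirement.

```shell
#!/bin/sh
# Free space, in KiB, on the filesystem that holds directory $1.
free_kb() {
    df -Pk "$1" | awk 'NR == 2 { print $4 }'
}

# Disk usage, in KiB, of file $1.
size_kb() {
    du -k "$1" | cut -f1
}

# Succeeds when directory $2 has at least as much free space as file $1 uses.
# Heuristic only (an assumption): a repair may need more or less than this.
has_room() {
    [ "$(free_kb "$2")" -ge "$(size_kb "$1")" ]
}

# Example call; the data file path is an assumption (default datadir, MyISAM table):
#   has_room /var/lib/mysql/nagios/nagios_logentries.MYD /tmp && echo "probably enough room"
```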
Re: Nagios XI 2014 2.6 mysql DB errors...
Posted: Fri Dec 04, 2015 3:41 pm
by tgriep
MySQL uses the /tmp folder for temporary files when a repair is run.
You can either increase the space for /tmp or point the tmpdir for MySQL to some place with enough space.
Edit the my.cnf file on your MySQL server.
Add this line:
tmpdir = /whatever/you/want
Save the file and restart MySQL.
Make sure the MySQL service account can read and write to that folder.
Then try running the mysqlcheck again.
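The steps above can be sketched as a small script. It assumes the `tmpdir` line belongs under the `[mysqld]` section and that the file has no existing `tmpdir` entry. The helper name `set_mysql_tmpdir`, the directory `/data/mysql-tmp`, the config path `/etc/my.cnf` and the `mysql` user/group are all assumptions for illustration; adjust them to your server.

```shell
#!/bin/sh
# Hypothetical helper: add "tmpdir = <dir>" right after the [mysqld] header of
# a my.cnf-style file. $1 = config file, $2 = new temporary directory.
set_mysql_tmpdir() {
    cnf=$1
    dir=$2
    cp "$cnf" "$cnf.bak" || return 1   # keep a backup of the original file
    awk -v d="$dir" '
        { print }                               # copy every line through
        /^\[mysqld\]/ { print "tmpdir = " d }   # add the setting under [mysqld]
    ' "$cnf.bak" > "$cnf"
}

# Example run as root on the MySQL server (all paths and names are assumptions):
#   mkdir -p /data/mysql-tmp
#   chown mysql:mysql /data/mysql-tmp
#   chmod 750 /data/mysql-tmp
#   set_mysql_tmpdir /etc/my.cnf /data/mysql-tmp
#   service mysqld restart
```

The backup left in `my.cnf.bak` makes it easy to revert if MySQL does not start with the new setting.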
Re: Nagios XI 2014 2.6 mysql DB errors...
Posted: Mon Dec 07, 2015 10:58 am
by highness
That seemed to work.
Ran it twice and the event log is back.
Thanks!
Re: Nagios XI 2014 2.6 mysql DB errors...
Posted: Mon Dec 07, 2015 12:29 pm
by lmiltchev
I am glad that your issue has been resolved, highness! Is it all right if we lock this topic?