DB backups - General q..
Posted: Thu Nov 11, 2010 5:58 pm
by lance
Hi,
This isn't really an issue as such, more a request for some admin advice.
We're running a distributed setup with NagiosXI as the central host & a combination of XI & Core distributed hosts, which is running very well. (All ESX VMs, RHEL5/CentOS.)
On the central host we've mounted:
/var/lib/mysql
/store
on their own volumes using LVM, as I've seen them grow fairly quickly (particularly the /store partition). Across the distributed setup there are only about 230 hosts and 900 services, soon to be about 500 hosts and 2300 services. We've had this setup since June this year.
Currently, the MySQL Nagios DB has hit 2G in size. I don't imagine this is excessive, but I want to be mindful of DB maintenance & disk use.
So to help in managing the size of the database (& associated backups) & not being a SQL aficionado..:
- Is data aggregated after a period of time?
- Do the DB Maintenance tasks perform any data purging/cleansing to help maintain the DB health?
- I guess with some SQL wizardry, I'd be able to export the performance data (We have access to a Data Warehouse). Would that assist in maintenance & keeping the DB size down? (ie export raw data - leave aggregated perf data..?)
From a backup perspective, what are the implications if:
- the DB backup schedule is disabled
- we instead run the manual backup, say, once a week (understanding there'd be loss of data)
One perceived benefit I can see is a simpler recovery process in case of a host failure.
As it is, the nightly backup of MySQL takes about 20 minutes.
So at the moment, to manage the space, we pretty much just increase the volume sizes via LVM as needed.
Appreciate any advice
Thanks
Lincoln
Re: DB backups - General q..
Posted: Fri Nov 12, 2010 10:38 am
by mguthrie
I'm going to have to pass this along to someone with more system admin knowledge than what I have, but I'll have one of our guys take a look at this.
I might also suggest posting this question to the Nagios users list through sourceforge. There are some definite power users for Nagios on that list and a lot of them are extremely knowledgeable.
nagios-users@lists.sourceforge.net
Re: DB backups - General q..
Posted: Sun Nov 14, 2010 12:51 am
by lance
Sure thing.
I'll head over to SF
Thanks for the reply
L.
Re: DB backups - General q..
Posted: Sun Nov 14, 2010 7:55 pm
by lance
Yet to get to SF, but been doing a bit of digging.
Have found that there are a couple of MySQL files in /var/lib/mysql/nagios that are pretty big:
nagios_logentries.MYD (11G)
nagios_logentries.MYI (660M)
Which are the largest.
nagios_externalcommands.MYD (5G)
nagios_externalcommands.MYI (320M)
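The same size ranking can also be read from MySQL itself rather than from the files on disk. A minimal sketch, assuming the schema is named nagios (as the directory above suggests) and that credentials come from ~/.my.cnf; the function name table_size_sql and the RUN switch are made up for this example:

```shell
#!/bin/bash
# Sketch: list the largest tables in the NDOUtils schema by size.
# Assumption: the schema is named "nagios", matching /var/lib/mysql/nagios.

# Print the SQL that reports data and index size per table, in MB.
table_size_sql() {
    local db="$1"
    cat <<EOF
SELECT table_name,
       ROUND(data_length / 1024 / 1024) AS data_mb,
       ROUND(index_length / 1024 / 1024) AS index_mb
FROM information_schema.tables
WHERE table_schema = '${db}'
ORDER BY data_length + index_length DESC
LIMIT 10;
EOF
}

# Only talk to the server when explicitly asked to (RUN=1);
# otherwise the script just defines the function.
if [ "${RUN:-0}" = "1" ]; then
    table_size_sql nagios | mysql --table
fi
```

For MyISAM tables, data_length and index_length should roughly track the .MYD and .MYI file sizes listed above.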
Reading through the Nagios Core doc in relation to the nagios.cfg file, it suggests setting:
log_passive_checks=0
in a distributed setup (on the central host), as there could be a large amount of data. So I have made this setting & am waiting to see if it makes a difference (in DB size).
Yet to figure out whether this disables population of the database via ndo..
Also, there are Table Trimming options in /usr/local/nagios/etc/ndo2db.cfg,
**********************************************************
## TABLE TRIMMING OPTIONS
# Several database tables containing Nagios event data can become quite large
# over time. Most admins will want to trim these tables and keep only a
# certain amount of data in them. The options below are used to specify the
# age (in MINUTES) that data should be allowed to remain in various tables
# before it is deleted. Using a value of zero (0) for any value means that
# that particular table should NOT be automatically trimmed.
# Keep timed events for 24 hours
max_timedevents_age=1440
# Keep system commands for 1 week
max_systemcommands_age=10080
# Keep service checks for 1 week
max_servicechecks_age=10080
# Keep host checks for 1 week
max_hostchecks_age=10080
# Keep event handlers for 31 days
max_eventhandlers_age=44640
*********************************************************
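Since these options are given in minutes, it is easy to slip when choosing a new retention period. A small sketch of the conversion; the helper name days_to_minutes is made up for this example:

```shell
# Convert a retention period in days to the MINUTES value ndo2db.cfg expects.
days_to_minutes() {
    echo $(( $1 * 24 * 60 ))
}

days_to_minutes 1    # prints 1440, matching max_timedevents_age above
days_to_minutes 7    # prints 10080, matching the one-week values above
days_to_minutes 31   # prints 44640, matching max_eventhandlers_age above
```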
Not sure if these actually act on the tables listed above or not.
The nagios_servicechecks files (MYD, MYI etc) seem to be comparatively small.
Is there a way to either:
- Ensure logging is turned off for data that makes it into the nagios_logentries table
or
- include the nagios_logentries table in the table trimming options?
thanks
L.
Re: DB backups - General q..
Posted: Mon Nov 15, 2010 9:37 am
by admin
Hi Lance -
Based on your post, we'll be adding a new option to the dbmaint.php cron job for XI that allows you to have the job trim the logentries and externalcommands tables to a specific time frame.
There are a few existing DB trim jobs that get processed by the job right now. You can see the variables that determine how long data is saved in the following file:
/usr/local/nagiosxi/html/config.inc.php
Re: DB backups - General q..
Posted: Mon Nov 15, 2010 5:49 pm
by lance
Ok great.
thanks for that.
In the mean time any implications if I just run a job to get rid of log entries/external commands older than say a week?
I have had a look at the relationship diagram in the ndoutils doc & there don't seem to be any foreign key constraints (that I can see at least..). I have also tried clearing out the tables on a test system (truncating them), without breaking it..
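Before deleting anything, it may help to count how many rows a purge would remove. A rough sketch, assuming the schema is named nagios and that both tables carry an entry_time column (the column the cleanup script later in this thread filters on); count_old_rows_sql is a name invented here:

```shell
# Sketch: build a query that counts rows older than N days, without deleting.
# Assumptions: schema "nagios"; the table has an entry_time column.
count_old_rows_sql() {
    local table="$1" days="$2"
    echo "SELECT COUNT(*) FROM nagios.${table} WHERE entry_time < DATE_SUB(NOW(), INTERVAL ${days} DAY);"
}

# Print the queries; pipe each one into the mysql client to actually run it,
# e.g.  count_old_rows_sql nagios_logentries 7 | mysql
count_old_rows_sql nagios_logentries 7
count_old_rows_sql nagios_externalcommands 7
```

Comparing the count against the table's total row count gives a feel for how much a one-week retention would actually free up.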
Appreciate the feedback
L.
Re: DB backups - General q..
Posted: Tue Nov 16, 2010 3:32 pm
by mguthrie
As before, I'm going to pass intelligent answers to this up a level, but I thought it might be helpful to know that we appreciate threads like this, as we're not able to recreate testing environments for the large setups that some of our clients use. For situations like this, we're somewhat reliant on "customer testing" so to speak, so if you find anything that might help other users, we'd love for you to post it. We'll do what we can to help.
Re: DB backups - General q..
Posted: Tue Nov 16, 2010 6:32 pm
by lance
Sure no problem.
I've worked with our DB team (who are mainly responsible for the commercial databases - Oracle, MSSQL, DB2 etc) & they suggested it wouldn't be too difficult to script something that would basically get rid of any rows older than a particular date (ie how long you want to retain the logging data).
So I came up with the following (basic) script, which is currently running in our test environment. The 7 days retention was just a starting point, as we'll need to review how long we want to retain the logging for. We created a separate user for this function.
**************************************************************************************************
#!/bin/bash
# This script deletes log entries and external commands older than a week.
# Placeholders are quoted so the shell does not treat < and > as redirections.
DB_USER='<user with delete role>'
DB_PASS='<password>'
DB_HOST=localhost
DB=nagios
# Cutoff date, calculated automatically (7 days before today)
tbDATE=$(date -d '7 days ago' '+%Y-%m-%d')
# cleanup nagios_logentries (retain a week's worth)
echo "cleaning up nagios_logentries table older than $tbDATE 23:59:59"
mysql --user="$DB_USER" --password="$DB_PASS" --host="$DB_HOST" --database="$DB" -B -e \
"delete from $DB.nagios_logentries where entry_time < '$tbDATE 23:59:59'"
sleep 5
# cleanup nagios_externalcommands (retain a week's worth)
echo "cleaning up nagios_externalcommands table older than $tbDATE 23:59:59"
mysql --user="$DB_USER" --password="$DB_PASS" --host="$DB_HOST" --database="$DB" -B -e \
"delete from $DB.nagios_externalcommands where entry_time < '$tbDATE 23:59:59'"
sleep 5
echo "Done"
*****************************************************************************************************
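One thing to keep in mind: the .MYD/.MYI files indicate MyISAM tables, and on MyISAM a DELETE generally leaves the freed space inside the data file for reuse rather than returning it to the filesystem. If the files do not shrink after the cleanup, OPTIMIZE TABLE is the usual way to reclaim the space. A sketch only, assuming the schema is named nagios; optimize_sql is a name invented here, and note that the table is locked while the statement runs:

```shell
# Sketch: build OPTIMIZE TABLE statements for the tables that were purged.
# Assumption: schema "nagios". OPTIMIZE TABLE locks each table while it runs,
# so this is best scheduled for a quiet period.
optimize_sql() {
    local table
    for table in "$@"; do
        echo "OPTIMIZE TABLE nagios.${table};"
    done
}

# Print the statements; pipe them into the mysql client to actually run them,
# e.g.  optimize_sql nagios_logentries nagios_externalcommands | mysql
optimize_sql nagios_logentries nagios_externalcommands
```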
In the mean time, I've manually deleted rows out of the tables, retaining the previous 30 or so days of logging.
This has:
- reduced the table sizes significantly
- reduced the Backup sizes significantly
& seems to have had no impact on the app!
Just from an administrative perspective as well (on the central host), in the main core nagios cfg file I've set:
use_syslog=0
log_passive_checks=0
Will probably need to apply the same logic to the distributed hosts I suppose. I haven't seen the rapid increase in table space/disk use on those hosts at the moment.
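To check what each host currently has for these two directives, something like the following could be run on every box. A minimal sketch; cfg_value is a helper invented for this example, and the nagios.cfg path in the usage comment is an assumption (the usual source-install location), so adjust it to your layout:

```shell
# Print the value(s) of a key=value directive in a Nagios config file,
# ignoring commented-out lines.
cfg_value() {
    local key="$1" file="$2"
    sed -n "s/^[[:space:]]*${key}[[:space:]]*=[[:space:]]*//p" "$file"
}

# Usage (path is an assumption, adjust as needed):
#   cfg_value use_syslog /usr/local/nagios/etc/nagios.cfg
#   cfg_value log_passive_checks /usr/local/nagios/etc/nagios.cfg
```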
I guess the main focus was on sizing. I haven't looked at warehousing the data.
regards
L.
Re: DB backups - General q..
Posted: Wed Nov 17, 2010 5:50 pm
by lance
hey nice work with the latest release!
Just looking at the changes, can I confirm:
Am I right that, to apply the new table trimming options, I need to replace:
/usr/local/nagiosxi/html/config.inc.php
with
/usr/local/nagiosxi/html/config.inc.dist
?
Also in the file, do I change the
"max_logentries_age" => 365, // max time (in DAYS) to keep log entries
to the number of days that I want to retain the data for?
ie:
"max_logentries_age" => 7, // max time (in DAYS) to keep log entries
Thanks
L.