Databases Performance settings

This support forum board is for support questions relating to Nagios XI, our flagship commercial network monitoring solution.
alabzhinov
Posts: 25
Joined: Mon Aug 30, 2010 10:56 pm

Databases Performance settings

Post by alabzhinov »

Hi,

1. CentOs 5 32-bit
2. VMWare virtual machine
3. NagiosXI 2011 R1.7
4. no additional features

I configured the database performance settings as:
Max External Commands Age: 7
Max Log Entries Age: 90
Max Notifications Age: 90
Max State History Age: 180
Other settings are defaults.

The problem is Nagios doesn't delete old information that is more than 180 days for state history or more than 90 days for logs. So i have all that data from the moment i installed Nagios XI (November 2010) and it's consuming a lot of space. My understanding that Nagios should automatically delete all information that is older than specified in performance settings, is it correct? If so how do i troubleshoot this issue?
mguthrie
Posts: 4380
Joined: Mon Jun 14, 2010 10:21 am

Re: Databases Performance settings

Post by mguthrie »

Can you run the maintenance script manually and post the output from it?

Code: Select all

/usr/local/nagiosxi/cron/dbmaint.php
I'm wondering if there might be an issue with the lock file for that process. Also, just to be safe can you also verify that your system time is up to date?
alabzhinov
Posts: 25
Joined: Mon Aug 30, 2010 10:56 pm

Re: Databases Performance settings

Post by alabzhinov »

[root@sehc-nagiosxi ~]# clock --show
Wed 05 Oct 2011 01:12:13 PM EDT -0.794337 seconds

[root@sehc-nagiosxi ~]# /usr/local/nagiosxi/cron/dbmaint.php
CREATING: /usr/local/nagiosxi/var/dbmaint.lock
CLEANING ndoutils TABLE 'commenthistory'...
SQL: DELETE FROM nagios_commenthistory WHERE entry_time < FROM_UNIXTIME(1286298778)
CLEANING ndoutils TABLE 'processevents'...
SQL: DELETE FROM nagios_processevents WHERE event_time < FROM_UNIXTIME(1286298778)
CLEANING ndoutils TABLE 'externalcommands'...
SQL: DELETE FROM nagios_externalcommands WHERE entry_time < FROM_UNIXTIME(1317229978)
CLEANING ndoutils TABLE 'logentries'...
SQL: DELETE FROM nagios_logentries WHERE logentry_time < FROM_UNIXTIME(1310058778)
CLEANING ndoutils TABLE 'notifications'...
SQL: DELETE FROM nagios_notifications WHERE start_time < FROM_UNIXTIME(1310058778)
CLEANING ndoutils TABLE 'contactnotifications'...
SQL: DELETE FROM nagios_contactnotifications WHERE start_time < FROM_UNIXTIME(1310058778)
CLEANING ndoutils TABLE 'contactnotificationmethods'...
SQL: DELETE FROM nagios_contactnotificationmethods WHERE start_time < FROM_UNIXTIME(1310058778)
CLEANING ndoutils TABLE 'statehistory'...
SQL: DELETE FROM nagios_statehistory WHERE state_time < FROM_UNIXTIME(1302282778)
CLEANING ndoutils TABLE 'timedevents'...
SQL: DELETE FROM nagios_timedevents WHERE event_time < FROM_UNIXTIME(1317834478)
CLEANING ndoutils TABLE 'systemcommands'...
SQL: DELETE FROM nagios_systemcommands WHERE start_time < FROM_UNIXTIME(1317834478)
CLEANING ndoutils TABLE 'servicechecks'...
SQL: DELETE FROM nagios_servicechecks WHERE start_time < FROM_UNIXTIME(1317834478)
CLEANING ndoutils TABLE 'hostchecks'...
SQL: DELETE FROM nagios_hostchecks WHERE start_time < FROM_UNIXTIME(1317834478)
CLEANING ndoutils TABLE 'eventhandlers'...
SQL: DELETE FROM nagios_eventhandlers WHERE start_time < FROM_UNIXTIME(1317834478)
LASTOPT: 1317833701
INTERVAL: 60
NOW: 1317834778
OPTTIME: 1317837301
CLEANING nagiosxi TABLE 'commands'...
SQL: DELETE FROM xi_commands WHERE processing_time < 1317805978::abstime::timestamp without time zone
CLEANING nagiosxi TABLE 'events'...
SQL: DELETE FROM xi_events WHERE processing_time < 1317805978::abstime::timestamp without time zone
SQL1: SELECT xi_meta.meta_id FROM xi_meta LEFT JOIN xi_events ON xi_meta.metaobj_id=xi_events.event_id WHERE metatype_id='1' AND event_id IS NULL
SQL2: DELETE FROM xi_meta WHERE meta_id IN (SELECT xi_meta.meta_id FROM xi_meta LEFT JOIN xi_events ON xi_meta.metaobj_id=xi_events.event_id WHERE metatype_id='1' AND event_id IS NULL)
CLEANING nagiosql TABLE 'logbook'...
SQL: DELETE FROM tbl_logbook WHERE time < FROM_UNIXTIME(1317805978)
Repair Complete: Removing Lock File
mguthrie
Posts: 4380
Joined: Mon Jun 14, 2010 10:21 am

Re: Databases Performance settings

Post by mguthrie »

Ok, I'm wondering if you need to set your timezone for PHP. See the below wiki to set your PHP timezone, I'm wondering if that will take care of it:
http://support.nagios.com/wiki/index.ph ... e.22_Error
alabzhinov
Posts: 25
Joined: Mon Aug 30, 2010 10:56 pm

Re: Databases Performance settings

Post by alabzhinov »

I set the timezone in php.ini (date.timezone = America/Toronto), restarted httpd :

[root@sehc-nagiosxi ~]# service httpd restart
Stopping httpd: [ OK ]
Starting httpd: httpd: apr_sockaddr_info_get() failed for sehc-nagiosxi
httpd: Could not reliably determine the server's fully qualified domain name, using 127.0.0.1 for ServerName
[ OK ]

then ran the db maintenance script, but the issue is still there.

By the way we are using NTP server to provide time. Nagios server is pointed to the corporate NTP server which in turn synchronizes from the Internet.
alabzhinov
Posts: 25
Joined: Mon Aug 30, 2010 10:56 pm

Re: Databases Performance settings

Post by alabzhinov »

here is mine dbmaint.php file. As you can see state history parameter is 730 days, although it is 180 days when you look in GUI ! There are also "comment history" and "process events" that i cannot change in GUI.

Code: Select all

#!/usr/bin/php -q
<?php
//
// Copyright (c) 2008-2009 Nagios Enterprises, LLC.  All rights reserved.
//  
// $Id: dbmaint.php 658 2011-06-01 21:49:38Z mguthrie $

define("SUBSYSTEM",1);
//define("BACKEND",1);

require_once(dirname(__FILE__).'/../html/config.inc.php');
require_once(dirname(__FILE__).'/../html/includes/utils.inc.php');

$dbmaint_lockfile="/usr/local/nagiosxi/var/dbmaint.lock";

init_dbmaint();
do_dbmaint_jobs();



function init_dbmaint(){
	global $dbmaint_lockfile;

	// check lock file
	if(@file_exists($dbmaint_lockfile)){
		$ft=filemtime($dbmaint_lockfile);
		$now=time();
		if(($now-$ft)>1800){
			echo "LOCKFILE '".$dbmaint_lockfile."' IS OLD - REMOVING\n";
			unlink($dbmaint_lockfile);
			}
		else{
			echo "LOCKFILE '".$dbmaint_lockfile."' EXISTS - EXITING!\n";
			exit();
			}
		}
	
	// create lock file
	echo "CREATING: $dbmaint_lockfile\n";
	file_put_contents($dbmaint_lockfile,"");
	//touch($dbmaint_lockfile);

	// make database connections
	$dbok=db_connect_all();
	if($dbok==false){
		echo "ERROR CONNECTING TO DATABASES!\n";
		exit();
		}

	return;
	}

function do_dbmaint_jobs(){
	global $dbmaint_lockfile;
	global $db_tables;
	global $cfg;
	global $max_time;
	global $sleep_time;

	$now=time();

	
	/////////////////////////////////////////////////////////////
	// TRIM NDOUTILS TABLES
	/////////////////////////////////////////////////////////////
	$dbminfo=$cfg['db_info']['ndoutils']['dbmaint'];
	
	// comment history (DAYS)
	$age=get_database_interval("ndoutils","max_commenthistory_age",365);
	$cutoff=$now-(intval($age)*60*60*24);
	clean_db_table(DB_NDOUTILS,"commenthistory","entry_time",$cutoff);
	
	// process events (DAYS)
	$age=get_database_interval("ndoutils","max_processevents_age",365);
	$cutoff=$now-(intval($age)*60*60*24);
	clean_db_table(DB_NDOUTILS,"processevents","event_time",$cutoff);
	
	// external commands (DAYS)
	$age=get_database_interval("ndoutils","max_externalcommands_age",7);
	$cutoff=$now-(intval($age)*60*60*24);
	clean_db_table(DB_NDOUTILS,"externalcommands","entry_time",$cutoff);
	
	// log entries (DAYS)
	$age=get_database_interval("ndoutils","max_logentries_age",90);
	$cutoff=$now-(intval($age)*60*60*24);
	clean_db_table(DB_NDOUTILS,"logentries","logentry_time",$cutoff);
	
	// notifications (DAYS)
	$age=get_database_interval("ndoutils","max_notifications_age",90);
	$cutoff=$now-(intval($age)*60*60*24);
	clean_db_table(DB_NDOUTILS,"notifications","start_time",$cutoff);
	clean_db_table(DB_NDOUTILS,"contactnotifications","start_time",$cutoff);
	clean_db_table(DB_NDOUTILS,"contactnotificationmethods","start_time",$cutoff);
	
	// state history (DAYS)
	$age=get_database_interval("ndoutils","max_statehistory_age",730);
	$cutoff=$now-(intval($age)*60*60*24);
	clean_db_table(DB_NDOUTILS,"statehistory","state_time",$cutoff);
	
	// timed events
	$age=get_database_interval("ndoutils","max_timedevents_age",5);
	$cutoff=$now-(intval($age)*60);
	clean_db_table(DB_NDOUTILS,"timedevents","event_time",$cutoff);
	
	// system commands
	$age=get_database_interval("ndoutils","max_systemcommands_age",5);
	$cutoff=$now-(intval($age)*60);
	clean_db_table(DB_NDOUTILS,"systemcommands","start_time",$cutoff);
	
	// service checks
	$age=get_database_interval("ndoutils","max_servicechecks_age",5);
	$cutoff=$now-(intval($age)*60);
	clean_db_table(DB_NDOUTILS,"servicechecks","start_time",$cutoff);
	
	// host checks
	$age=get_database_interval("ndoutils","max_hostchecks_age",5);
	$cutoff=$now-(intval($age)*60);
	clean_db_table(DB_NDOUTILS,"hostchecks","start_time",$cutoff);

	// event handlers
	$age=get_database_interval("ndoutils","max_eventhandlers_age",5);
	$cutoff=$now-(intval($age)*60);
	clean_db_table(DB_NDOUTILS,"eventhandlers","start_time",$cutoff);
	
		
	/////////////////////////////////////////////////////////////
	// OPTIMIZE NDOUTILS TABLES
	/////////////////////////////////////////////////////////////
	
	$optimize_interval=get_database_interval("ndoutils","optimize_interval",60);
	
	$optimize=false;
	$lastopt=get_meta(METATYPE_NONE,0,"last_ndoutils_optimization");
	if($lastopt==null){
		$optimize=true;
		echo "NEVER OPTIMIZED\n";
		}
	else{
		$opt_time=($lastopt + ($optimize_interval*60));
		if($now > $opt_time){
			$optimize=true;
			echo "TIME TO OPTIMIZE\n";
			}
		echo "LASTOPT:  $lastopt\n";
		echo "INTERVAL: $optimize_interval\n";
		echo "NOW:      $now\n";
		echo "OPTTIME:  $opt_time\n";
		}
	if($optimize_interval==0){
		echo "OPTIMIZE INTERVAL=0\n";
		$optimize=false;
		}
	if($optimize==true){
		foreach($db_tables[DB_NDOUTILS] as $table){
			echo "OPTIMIZING NDOUTILS TABLE: $table\n";
			optimize_table(DB_NDOUTILS,$table);
			}
		set_meta(METATYPE_NONE,0,"last_ndoutils_optimization",$now);
		}

	/////////////////////////////////////////////////////////////
	// REPAIR NDOUTILS TABLES
	/////////////////////////////////////////////////////////////
/*	
	$repair_interval=get_database_interval("ndoutils","repair_interval",0);
	
	$repair=false;
	$lastopt=get_meta(METATYPE_NONE,0,"last_ndoutils_repair");
	if($lastopt==null)
		$repair=true;
	else{
		if($now > ($lastopt + ($repair_interval*60)))
			$repair=true;
		}
	if($repair_interval==0)
		$repair=false;
	if($repair==true){
		foreach($db_tables[DB_NDOUTILS] as $table){
			echo "REPAIRING NDOUTILS TABLE: $table\n";
			repair_table(DB_NDOUTILS,$table);
			}
		set_meta(METATYPE_NONE,0,"last_ndoutils_repair",$now);
		}
*/ // -- this corrupts tables and can crash mysql on large tables 


	/////////////////////////////////////////////////////////////
	// TRIM NAGIOSXI TABLES
	/////////////////////////////////////////////////////////////
	$dbminfo=$cfg['db_info']['nagiosxi']['dbmaint'];
	
	// commands
	$cutoff=$now-(intval(get_database_interval("nagiosxi","max_commands_age",480))*60);
	clean_db_table(DB_NAGIOSXI,"commands","processing_time",$cutoff);
	
	// events
	$cutoff=$now-(intval(get_database_interval("nagiosxi","max_events_age",480))*60);
	clean_db_table(DB_NAGIOSXI,"events","processing_time",$cutoff);
	// event meta....
	// first find meta records with no matching event record...
	$sql="SELECT ".$db_tables[DB_NAGIOSXI]["meta"].".meta_id FROM ".$db_tables[DB_NAGIOSXI]["meta"]." LEFT JOIN ".$db_tables[DB_NAGIOSXI]["events"]." ON ".$db_tables[DB_NAGIOSXI]["meta"].".metaobj_id=".$db_tables[DB_NAGIOSXI]["events"].".event_id WHERE metatype_id='1' AND event_id IS NULL";
	echo "SQL1: $sql\n";
	// now delete the meta records
	//$sql2="DELETE FROM ".$db_tables[DB_NAGIOSXI]["meta"]." WHERE metatype_id IN (".$mids.")";
	$sql2="DELETE FROM ".$db_tables[DB_NAGIOSXI]["meta"]." WHERE meta_id IN (".$sql.")";
	echo "SQL2: $sql2\n";
	$rs=exec_sql_query(DB_NAGIOSXI,$sql2);
	
	
	/////////////////////////////////////////////////////////////
	// OPTIMIZE NAGIOSXI TABLES
	/////////////////////////////////////////////////////////////
	
	$optimize_interval=get_database_interval("nagiosxi","optimize_interval",60);
	
	$optimize=false;
	$lastopt=get_meta(METATYPE_NONE,0,"last_db_optimization");
	if($lastopt==null)
		$optimize=true;
	else{
		if($now > ($lastopt + ($optimize_interval*60)))
			$optimize=true;
		}
	if($optimize_interval==0)
		$optimize=false;
	if($optimize==true){
		foreach($db_tables[DB_NAGIOSXI] as $table){
			echo "OPTIMIZING NAGIOSXI TABLE: $table\n";
			optimize_table(DB_NAGIOSXI,$table);
			}
		set_meta(METATYPE_NONE,0,"last_db_optimization",$now);
		}

	/////////////////////////////////////////////////////////////
	// REPAIR NAGIOSXI TABLES
	/////////////////////////////////////////////////////////////
	/*
	$repair_interval=get_database_interval("nagiosxi","repair_interval",0);
	
	$optimize=false;
	$lastopt=get_meta(METATYPE_NONE,0,"last_db_repair");
	if($lastopt==null)
		$repair=true;
	else{
		if($now > ($lastopt + ($repair_interval*60)))
			$repair=true;
		}
	if(intval($repair_interval)==0)
		$repair=false;
	if($repair==true){
		foreach($db_tables[DB_NAGIOSXI] as $table){
			echo "REPAIRING NAGIOSXI TABLE: $table\n";
			repair_table(DB_NAGIOSXI,$table);
			}
		set_meta(METATYPE_NONE,0,"last_db_repair",$now);
		}
	*/ 
		
	/////////////////////////////////////////////////////////////
	// TRIM NAGIOSQL TABLES
	/////////////////////////////////////////////////////////////
	$dbminfo=$cfg['db_info']['nagiosql']['dbmaint'];
	
	// FIRST WE MUST CONNECT!
	//db_connect_nagiosql();
	
	// log book records
	$cutoff=$now-(intval(get_database_interval("nagiosql","max_logbook_age",480))*60);
	clean_db_table(DB_NAGIOSQL,"logbook","time",$cutoff);
	
	
	/////////////////////////////////////////////////////////////
	// OPTIMIZE NAGIOSQL TABLES
	/////////////////////////////////////////////////////////////
	
	$optimize_interval=get_database_interval("nagiosql","optimize_interval",60);
	
	$optimize=false;
	$lastopt=get_meta(METATYPE_NONE,0,"last_nagiosql_optimization");
	if($lastopt==null)
		$optimize=true;
	else{
		if($now > ($lastopt + ($optimize_interval*60)))
			$optimize=true;
		}
	if($optimize_interval==0)
		$optimize=false;
	if($optimize==true){
		foreach($db_tables[DB_NAGIOSQL] as $table){
			echo "OPTIMIZING NAGIOSQL TABLE: $table\n";
			optimize_table(DB_NAGIOSQL,$table);
			}
		set_meta(METATYPE_NONE,0,"last_nagiosql_optimization",$now);
		}

	/////////////////////////////////////////////////////////////
	// REPAIR NAGIOSQL TABLES
	/////////////////////////////////////////////////////////////
/*	
	$repair_interval=get_database_interval("nagiosql","repair_interval",0);
	
	$repair=false;
	$lastopt=get_meta(METATYPE_NONE,0,"last_nagiosql_repair");
	if($lastopt==null)
		$repair=true;
	else{
		if($now > ($lastopt + ($repair_interval*60)))
			$repair=true;
		}
	if($repair_interval==0)
		$repair=false;
	if($repair==true){
		foreach($db_tables[DB_NAGIOSQL] as $table){
			echo "REPAIRING NAGIOSQL TABLE: $table\n";
			repair_table(DB_NAGIOSQL,$table);
			}
		set_meta(METATYPE_NONE,0,"last_nagiosql_repair",$now);
		}
*/  //-- this crashes and corrupts tables


	update_sysstat();
	
	// delete lock file
	if(unlink($dbmaint_lockfile)) echo "Repair Complete: Removing Lock File\n";
	else echo "Repair Complete: FAILED TO REMOVE LOCK FILE\n"; 
	}
	
function clean_db_table($db,$table,$field,$ts){
	global $db_tables;
	
	echo "CLEANING $db TABLE '$table'...\n";
	
	$sql="DELETE FROM ".$db_tables[$db][$table]." WHERE ".$field." < ".sql_time_from_timestamp($ts,$db)."";
	echo "SQL: $sql\n";
	$rs=exec_sql_query($db,$sql);
	}
	
	
function optimize_table($db,$table){
	global $cfg;
	global $db_tables;
	
	$dbtype=$cfg['db_info'][$db]["dbtype"];
	
	// postgres
	if($dbtype=='pgsql'){
		$sql="VACUUM ANALYZE ".$table.";";
		}
	// mysql
	else{
		$sql="OPTIMIZE TABLE ".$table."";
		}
		
	echo "SQL: $sql\n";
	$rs=exec_sql_query($db,$sql);
	}
	
function repair_table($db,$table){
	global $db_tables;
	global $cfg;
	
	$dbtype=$cfg['db_info'][$db]["dbtype"];

	// only works with mysql
	if($dbtype=='mysql'){
		$sql="REPAIR TABLE ".$table."";
		echo "SQL: $sql\n";
		$rs=exec_sql_query($db,$sql);
		}
	}
	
function update_sysstat(){
	// record our run in sysstat table
	$arr=array(
		"last_check" => time(),
		);
	$sdata=serialize($arr);
	update_systat_value("dbmaint",$sdata);
	}
	
	

?>
mguthrie
Posts: 4380
Joined: Mon Jun 14, 2010 10:21 am

Re: Databases Performance settings

Post by mguthrie »

What version of XI are you currently running and what distribution are you using? The hard-coded numbers that you're seeing are default values, so if the settings have been specified in the interface, they will override the defaults.
alabzhinov
Posts: 25
Joined: Mon Aug 30, 2010 10:56 pm

Re: Databases Performance settings

Post by alabzhinov »

1. CentOs 5 32-bit
2. VMWare virtual machine
3. NagiosXI 2011 R1.7
4. no additional features
mguthrie
Posts: 4380
Joined: Mon Jun 14, 2010 10:21 am

Re: Databases Performance settings

Post by mguthrie »

[root@sehc-nagiosxi ~]# service httpd restart
Stopping httpd: [ OK ]
Starting httpd: httpd: apr_sockaddr_info_get() failed for sehc-nagiosxi
httpd: Could not reliably determine the server's fully qualified domain name, using 127.0.0.1 for ServerName
[ OK ]
I hadn't seen this particular error before, so I googled it, and came across this quick howto. I'm not sure if it's the source of the issue, but lets give it a try and see what happens. If there's a hostname issue then there would potentially be other problems on the system as well.

http://tgrove.com/2007/12/02/httpd-apr_ ... -hostname/
alabzhinov
Posts: 25
Joined: Mon Aug 30, 2010 10:56 pm

Re: Databases Performance settings

Post by alabzhinov »

I fixed name resolution issue, now i don't have the error when i restart httpd.
I ran maintenance script, but it again have not deleted old information.
Locked