Re: High CPU
Posted: Fri May 06, 2011 1:11 pm
by mguthrie
Hmm, let's have you try a few more things...
And then can you run the following two commands and send the output?
Code:
rm -f /usr/local/nagiosxi/var/dbmaint.lock
/usr/local/nagiosxi/cron/dbmaint.php
Re: High CPU
Posted: Fri May 06, 2011 2:30 pm
by r.jaynes
Code:
[root@monitor init.d]# time service postgresql restart
Stopping postgresql service: [ OK ]
Starting postgresql service: [ OK ]
real 0m4.266s
user 0m0.156s
sys 0m0.517s
Code:
[root@monitor init.d]# rm -f /usr/local/nagiosxi/var/dbmaint.lock
[root@monitor init.d]# /usr/local/nagiosxi/cron/dbmaint.php
No log handling enabled - turning on stderr logging
add_mibdir: strings scanned in from /usr/share/snmp/mibs/.index are too large. count = 143
CREATING: /usr/local/nagiosxi/var/dbmaint.lock
CLEANING ndoutils TABLE 'commenthistory'...
SQL: DELETE FROM nagios_commenthistory WHERE entry_time < FROM_UNIXTIME(1273174148)
CLEANING ndoutils TABLE 'processevents'...
SQL: DELETE FROM nagios_processevents WHERE event_time < FROM_UNIXTIME(1273174148)
CLEANING ndoutils TABLE 'externalcommands'...
SQL: DELETE FROM nagios_externalcommands WHERE entry_time < FROM_UNIXTIME(1304105348)
CLEANING ndoutils TABLE 'logentries'...
SQL: DELETE FROM nagios_logentries WHERE logentry_time < FROM_UNIXTIME(1296934148)
CLEANING ndoutils TABLE 'notifications'...
SQL: DELETE FROM nagios_notifications WHERE start_time < FROM_UNIXTIME(1296934148)
CLEANING ndoutils TABLE 'contactnotifications'...
SQL: DELETE FROM nagios_contactnotifications WHERE start_time < FROM_UNIXTIME(1296934148)
CLEANING ndoutils TABLE 'contactnotificationmethods'...
SQL: DELETE FROM nagios_contactnotificationmethods WHERE start_time < FROM_UNIXTIME(1296934148)
CLEANING ndoutils TABLE 'statehistory'...
SQL: DELETE FROM nagios_statehistory WHERE state_time < FROM_UNIXTIME(1241638148)
CLEANING ndoutils TABLE 'timedevents'...
SQL: DELETE FROM nagios_timedevents WHERE event_time < FROM_UNIXTIME(1304709848)
CLEANING ndoutils TABLE 'systemcommands'...
SQL: DELETE FROM nagios_systemcommands WHERE start_time < FROM_UNIXTIME(1304709848)
CLEANING ndoutils TABLE 'servicechecks'...
SQL: DELETE FROM nagios_servicechecks WHERE start_time < FROM_UNIXTIME(1304709848)
CLEANING ndoutils TABLE 'hostchecks'...
SQL: DELETE FROM nagios_hostchecks WHERE start_time < FROM_UNIXTIME(1304709848)
CLEANING ndoutils TABLE 'eventhandlers'...
SQL: DELETE FROM nagios_eventhandlers WHERE start_time < FROM_UNIXTIME(1304709848)
LASTOPT: 1304706613
INTERVAL: 60
NOW: 1304710148
OPTTIME: 1304710213
CLEANING nagiosxi TABLE 'commands'...
SQL: DELETE FROM xi_commands WHERE processing_time < 1304681348::abstime::timestamp without time zone
CLEANING nagiosxi TABLE 'events'...
SQL: DELETE FROM xi_events WHERE processing_time < 1304681348::abstime::timestamp without time zone
SQL1: SELECT xi_meta.meta_id FROM xi_meta LEFT JOIN xi_events ON xi_meta.metaobj_id=xi_events.event_id WHERE metatype_id='1' AND event_id IS NULL
SQL2: DELETE FROM xi_meta WHERE meta_id IN (SELECT xi_meta.meta_id FROM xi_meta LEFT JOIN xi_events ON xi_meta.metaobj_id=xi_events.event_id WHERE metatype_id='1' AND event_id IS NULL)
CLEANING nagiosql TABLE 'logbook'...
SQL: DELETE FROM tbl_logbook WHERE time < FROM_UNIXTIME(1304681348)
Repair Complete: FAILED TO REMOVE LOCK FILE
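The FROM_UNIXTIME cutoffs in the output above are easier to read as ages relative to the NOW value that the script prints. A minimal sketch, assuming a POSIX shell; cutoff_age is a hypothetical helper written for this note and is not part of Nagios XI:

```shell
#!/bin/sh
# Sketch: express each DELETE cutoff from the dbmaint.php output as an age
# relative to the run's NOW value, so the retention windows are readable.
# cutoff_age is a hypothetical helper, not something shipped with Nagios XI.

# cutoff_age NOW CUTOFF -> prints the gap in days, hours, or minutes
cutoff_age() {
    secs=$(( $1 - $2 ))
    if [ "$secs" -ge 86400 ]; then
        echo "$(( secs / 86400 )) days"
    elif [ "$secs" -ge 3600 ]; then
        echo "$(( secs / 3600 )) hours"
    else
        echo "$(( secs / 60 )) minutes"
    fi
}

NOW=1304710148   # the "NOW:" line from the output

cutoff_age "$NOW" 1273174148   # commenthistory, processevents -> 365 days
cutoff_age "$NOW" 1304105348   # externalcommands              -> 7 days
cutoff_age "$NOW" 1296934148   # logentries, notifications     -> 90 days
cutoff_age "$NOW" 1241638148   # statehistory                  -> 730 days
cutoff_age "$NOW" 1304709848   # timedevents, servicechecks    -> 5 minutes
cutoff_age "$NOW" 1304681348   # xi_commands, xi_events        -> 8 hours
```

Read this way, the check tables (servicechecks, hostchecks, and so on) are trimmed to the last 5 minutes, while statehistory keeps two years.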
Re: High CPU
Posted: Fri May 06, 2011 4:31 pm
by r.jaynes
I think we may be getting closer to the solution. I've closed the browser session to the NagiosXI page, and I've been monitoring top. Currently the averages are "load average: 2.12, 2.51, 2.47". I opened a browser, logged in, and am just sitting here, and it's still reporting roughly the same so far.
Re: High CPU
Posted: Mon May 09, 2011 10:10 am
by mguthrie
Ok, go ahead and keep us updated if the issue persists. I've got it on our TODO list to make sure postgresql vacuuming occurs without it timing out and stalling. The XI interface utilizes postgresql for user account info, dashboards, and interface settings.
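Until that is addressed in the product, one way to keep a manual vacuum from stalling indefinitely is to wrap it in a time limit. This is only a sketch and not the fix the Nagios team has planned: the database name and role (both "nagiosxi") and the 300-second limit are assumptions, so adjust them to your install. It uses the standard PostgreSQL vacuumdb client and the coreutils timeout command, and it only prints the command unless DRY_RUN=0 is set.

```shell
#!/bin/sh
# Sketch: run VACUUM ANALYZE under a hard time limit.
# ASSUMPTIONS (not from the thread): database "nagiosxi", role "nagiosxi",
# and a 300-second limit. Override them through the environment.

DB_NAME=${DB_NAME:-nagiosxi}
DB_USER=${DB_USER:-nagiosxi}
VACUUM_TIMEOUT=${VACUUM_TIMEOUT:-300}

vacuum_with_timeout() {
    # Default to a dry run so nothing touches the database by accident.
    if [ "${DRY_RUN:-1}" = "1" ]; then
        echo "timeout $VACUUM_TIMEOUT vacuumdb --analyze --verbose --username=$DB_USER --dbname=$DB_NAME"
        return 0
    fi

    timeout "$VACUUM_TIMEOUT" vacuumdb --analyze --verbose \
        --username="$DB_USER" --dbname="$DB_NAME"
    status=$?

    # GNU timeout exits with 124 when it had to kill the command.
    if [ "$status" -eq 124 ]; then
        echo "vacuum did not finish within ${VACUUM_TIMEOUT}s" >&2
    fi
    return "$status"
}

vacuum_with_timeout
```

Run it once as-is to check the printed command, then again with DRY_RUN=0 as a user that is allowed to connect to the database.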
Re: High CPU
Posted: Mon May 09, 2011 10:26 am
by r.jaynes
Right now current load is 1-min 2.04, 5-min 2.07, 15-min 2.23, CPU idle 78.31%. I'm pretty sure all of this happened around the 2011 upgrade, or the upgrade I did right before going to 2011. I'm sorry I don't remember the exact release.
Re: High CPU
Posted: Tue May 10, 2011 3:15 pm
by mguthrie
What does running "top" show as the top couple of items?
If you run:
How does your CPU load look?
Re: High CPU
Posted: Tue May 10, 2011 4:18 pm
by r.jaynes
I changed the update rate from 3 seconds to 1 to get a better picture, and typically the highest process (when sorted by %CPU) is httpd, then php. After stopping postgresql and watching top for a few minutes, the 5-minute load dropped from ~2.6 to ~1. As I'm typing this post it's still going down. Currently at 0.71 and dropping.
Re: High CPU
Posted: Tue May 10, 2011 4:21 pm
by r.jaynes
After restarting postgresql and letting top smooth out for a couple of minutes, the 5-minute average is at 2.20-2.60. I will sometimes see it dip into the 1.5 to 1.7 range, but then it's right back up to ~2.4.
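For a before/after comparison like this, it can help to log the load averages at fixed intervals instead of reading them off top. A rough sketch, assuming a Linux host where /proc/loadavg is available; format_loadavg and sample_load are hypothetical helper names, and the sample line reuses the 1-, 5-, and 15-minute figures posted earlier in the thread with placeholder values in the last two fields:

```shell
#!/bin/sh
# Sketch: print timestamped load averages so runs with postgresql stopped
# and started can be compared side by side.
# format_loadavg and sample_load are hypothetical helpers for this note.

# format_loadavg: reads one /proc/loadavg-style line on stdin and labels
# the first three fields (1-, 5-, and 15-minute averages).
format_loadavg() {
    awk '{ printf "1-min %s, 5-min %s, 15-min %s\n", $1, $2, $3 }'
}

# sample_load COUNT INTERVAL: print COUNT samples, INTERVAL seconds apart.
sample_load() {
    count=$1
    interval=$2
    i=0
    while [ "$i" -lt "$count" ]; do
        printf '%s  ' "$(date '+%H:%M:%S')"
        format_loadavg < /proc/loadavg
        i=$(( i + 1 ))
        if [ "$i" -lt "$count" ]; then
            sleep "$interval"
        fi
    done
}

# Formatting check on a sample line (the last two fields are placeholders).
echo "2.04 2.07 2.23 1/345 12345" | format_loadavg

# Example: ten samples, one minute apart (uncomment to run).
# sample_load 10 60
```

Running sample_load once with postgresql stopped and once after restarting it gives two logs that can be compared directly.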
Re: High CPU
Posted: Wed May 11, 2011 1:47 pm
by mguthrie
That's not horrible, but it's still higher than it should be. Any chance you can send me a Configuration Snapshot tarball by PM or email?
Re: High CPU
Posted: Thu May 12, 2011 8:23 am
by r.jaynes
Certainly! How would you go about doing that, please?
