Re: High CPU

Posted: Fri May 06, 2011 1:11 pm
by mguthrie
Hmm, let's have you try a few more things...

Code: Select all

service postgresql restart
Then please run the following two commands and send the output:

Code: Select all

rm -f /usr/local/nagiosxi/var/dbmaint.lock
/usr/local/nagiosxi/cron/dbmaint.php
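
If it helps with posting the output, here is a minimal sketch for keeping a copy of it while it scrolls by. The `run_and_capture` helper and the `/tmp/dbmaint.out` path are illustrative assumptions, not part of Nagios XI.

```shell
#!/bin/sh
# Hypothetical helper: run a command, show its output, and keep a copy.
# stderr is merged into stdout so that error messages are captured too.
run_and_capture() {
    logfile="$1"
    shift
    "$@" 2>&1 | tee "$logfile"
}

# Usage on the XI server (paths taken from the commands above):
#   rm -f /usr/local/nagiosxi/var/dbmaint.lock
#   run_and_capture /tmp/dbmaint.out /usr/local/nagiosxi/cron/dbmaint.php
```

The saved file can then be pasted or attached as-is.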

Re: High CPU

Posted: Fri May 06, 2011 2:30 pm
by r.jaynes

Code: Select all

[root@monitor init.d]# time service postgresql restart
Stopping postgresql service:                               [  OK  ]
Starting postgresql service:                               [  OK  ]

real	0m4.266s
user	0m0.156s
sys	0m0.517s

Code: Select all

[root@monitor init.d]# rm -f /usr/local/nagiosxi/var/dbmaint.lock
[root@monitor init.d]# /usr/local/nagiosxi/cron/dbmaint.php 
No log handling enabled - turning on stderr logging
add_mibdir: strings scanned in from /usr/share/snmp/mibs/.index are too large.  count = 143
 CREATING: /usr/local/nagiosxi/var/dbmaint.lock
CLEANING ndoutils TABLE 'commenthistory'...
SQL: DELETE FROM nagios_commenthistory WHERE entry_time < FROM_UNIXTIME(1273174148)
CLEANING ndoutils TABLE 'processevents'...
SQL: DELETE FROM nagios_processevents WHERE event_time < FROM_UNIXTIME(1273174148)
CLEANING ndoutils TABLE 'externalcommands'...
SQL: DELETE FROM nagios_externalcommands WHERE entry_time < FROM_UNIXTIME(1304105348)
CLEANING ndoutils TABLE 'logentries'...
SQL: DELETE FROM nagios_logentries WHERE logentry_time < FROM_UNIXTIME(1296934148)
CLEANING ndoutils TABLE 'notifications'...
SQL: DELETE FROM nagios_notifications WHERE start_time < FROM_UNIXTIME(1296934148)
CLEANING ndoutils TABLE 'contactnotifications'...
SQL: DELETE FROM nagios_contactnotifications WHERE start_time < FROM_UNIXTIME(1296934148)
CLEANING ndoutils TABLE 'contactnotificationmethods'...
SQL: DELETE FROM nagios_contactnotificationmethods WHERE start_time < FROM_UNIXTIME(1296934148)
CLEANING ndoutils TABLE 'statehistory'...
SQL: DELETE FROM nagios_statehistory WHERE state_time < FROM_UNIXTIME(1241638148)
CLEANING ndoutils TABLE 'timedevents'...
SQL: DELETE FROM nagios_timedevents WHERE event_time < FROM_UNIXTIME(1304709848)
CLEANING ndoutils TABLE 'systemcommands'...
SQL: DELETE FROM nagios_systemcommands WHERE start_time < FROM_UNIXTIME(1304709848)
CLEANING ndoutils TABLE 'servicechecks'...
SQL: DELETE FROM nagios_servicechecks WHERE start_time < FROM_UNIXTIME(1304709848)
CLEANING ndoutils TABLE 'hostchecks'...
SQL: DELETE FROM nagios_hostchecks WHERE start_time < FROM_UNIXTIME(1304709848)
CLEANING ndoutils TABLE 'eventhandlers'...
SQL: DELETE FROM nagios_eventhandlers WHERE start_time < FROM_UNIXTIME(1304709848)
LASTOPT:  1304706613
INTERVAL: 60
NOW:      1304710148
OPTTIME:  1304710213
CLEANING nagiosxi TABLE 'commands'...
SQL: DELETE FROM xi_commands WHERE processing_time < 1304681348::abstime::timestamp without time zone
CLEANING nagiosxi TABLE 'events'...
SQL: DELETE FROM xi_events WHERE processing_time < 1304681348::abstime::timestamp without time zone
SQL1: SELECT xi_meta.meta_id FROM xi_meta LEFT JOIN xi_events ON xi_meta.metaobj_id=xi_events.event_id WHERE metatype_id='1' AND event_id IS NULL
SQL2: DELETE FROM xi_meta WHERE meta_id IN (SELECT xi_meta.meta_id FROM xi_meta LEFT JOIN xi_events ON xi_meta.metaobj_id=xi_events.event_id WHERE metatype_id='1' AND event_id IS NULL)
CLEANING nagiosql TABLE 'logbook'...
SQL: DELETE FROM tbl_logbook WHERE time < FROM_UNIXTIME(1304681348)
Repair Complete: FAILED TO REMOVE LOCK FILE
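
The cutoff timestamps in the output above can be sanity-checked against the NOW value with a little arithmetic. This is only a sketch, assuming each cutoff is simply NOW minus a retention window (the script's internals are not shown in this thread); the `age` helper is hypothetical.

```shell
#!/bin/sh
now=1304710148   # NOW value from the dbmaint output

# Hypothetical helper: print how far a cutoff lies behind NOW, in seconds.
age() {
    echo $(( now - $1 ))
}

echo "commenthistory:   $(( $(age 1273174148) / 86400 )) days"
echo "externalcommands: $(( $(age 1304105348) / 86400 )) days"
echo "logentries:       $(( $(age 1296934148) / 86400 )) days"
echo "statehistory:     $(( $(age 1241638148) / 86400 )) days"
echo "servicechecks:    $(( $(age 1304709848) / 60 )) minutes"
echo "xi_commands:      $(( $(age 1304681348) / 3600 )) hours"

# LASTOPT + INTERVAL * 60 equals OPTTIME, which suggests INTERVAL is in minutes.
echo "next optimize:    $(( 1304706613 + 60 * 60 ))"
```

On these numbers the windows come out to 365, 7, 90, and 730 days, 5 minutes, and 8 hours, and NOW is still 65 seconds short of OPTTIME.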

Re: High CPU

Posted: Fri May 06, 2011 4:31 pm
by r.jaynes
I think we may be getting closer to the solution. I closed the browser session to the Nagios XI page and have been monitoring top. Currently the averages are "load average: 2.12, 2.51, 2.47". I then opened a browser, logged in, and left the page idle, and it's still reporting roughly the same so far.

Re: High CPU

Posted: Mon May 09, 2011 10:10 am
by mguthrie
OK, go ahead and keep us updated if the issue persists. It's on our TODO list to make sure PostgreSQL vacuuming completes without timing out and stalling. The XI interface uses PostgreSQL for user account info, dashboards, and interface settings.
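
In the meantime, a manual vacuum with an upper time bound might look roughly like the sketch below. It assumes the XI database and role are both named `nagiosxi` (not confirmed in this thread) and that GNU `timeout` is installed, which may not be the case on older systems; `vacuum_xi` and `PSQL_BIN` are hypothetical names.

```shell
#!/bin/sh
# Assumption: psql is on the PATH; override PSQL_BIN if it lives elsewhere.
PSQL_BIN="${PSQL_BIN:-psql}"

# Hypothetical helper: run VACUUM ANALYZE, giving up after N seconds
# (default 300) so a stalled vacuum cannot hang indefinitely.
vacuum_xi() {
    seconds="${1:-300}"
    # Database and role names below are assumptions; adjust to your install.
    timeout "$seconds" "$PSQL_BIN" -U nagiosxi -d nagiosxi -c 'VACUUM ANALYZE;'
}

# Usage: vacuum_xi 600
```

GNU `timeout` exits with status 124 when the limit is hit, so a wrapper script can tell a stall apart from a normal failure.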

Re: High CPU

Posted: Mon May 09, 2011 10:26 am
by r.jaynes
Right now the load is 1-min 2.04, 5-min 2.07, 15-min 2.23, with CPU idle at 78.31%. I'm pretty sure all of this started around the 2011 upgrade, or the upgrade I did right before going to 2011. I'm sorry I don't remember the exact release.
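
For reference, the same three figures can be read without top. A minimal sketch, assuming a Linux host where /proc/loadavg is available; `parse_loadavg` is a hypothetical helper, and the last two fields of the sample line are made up for illustration.

```shell
#!/bin/sh
# Hypothetical helper: fields 1-3 of a /proc/loadavg line are the
# 1-, 5-, and 15-minute load averages.
parse_loadavg() {
    echo "$1" | awk '{ printf "1-min %s, 5-min %s, 15-min %s\n", $1, $2, $3 }'
}

# On a live system: parse_loadavg "$(cat /proc/loadavg)"
parse_loadavg "2.04 2.07 2.23 1/180 12345"
```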

Re: High CPU

Posted: Tue May 10, 2011 3:15 pm
by mguthrie
What does running "top" show as the top couple of items?
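
If a pasteable snapshot is easier than watching top, something like the following sketch may help. It assumes a procps-style `ps`; the `busiest` helper is hypothetical.

```shell
#!/bin/sh
# Hypothetical helper: read "%CPU command" pairs on stdin and print the
# N busiest (default 5), highest CPU first.
busiest() {
    sort -k1,1 -nr | head -n "${1:-5}"
}

# On the XI server:
#   ps -e -o pcpu= -o comm= | busiest 5
```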

If you run:

Code: Select all

service postgresql stop
How does your CPU load look?

Re: High CPU

Posted: Tue May 10, 2011 4:18 pm
by r.jaynes
I changed the update rate from 3 seconds to 1 to get a better picture, and typically the highest process (when sorted by %CPU) is httpd, then php. After stopping postgresql and watching top for a few minutes, the 5-minute load dropped from ~2.6 to ~1. As I'm typing this post it's still going down. Currently at 0.71 and dropping.

Re: High CPU

Posted: Tue May 10, 2011 4:21 pm
by r.jaynes
After restarting postgresql and letting top settle for a couple of minutes, the 5-minute average is at 2.20-2.60. I will sometimes see it dip into the 1.5 to 1.7 range, but then it's right back up to ~2.4.

Re: High CPU

Posted: Wed May 11, 2011 1:47 pm
by mguthrie
That's not horrible, but it's still higher than it should be. Any chance you can send me a Configuration Snapshot tarball by PM or email?

Re: High CPU

Posted: Thu May 12, 2011 8:23 am
by r.jaynes
Certainly! How would you go about doing that, please? :)