XI Appliance 100% CPU

Posted: Thu Jan 02, 2014 1:01 pm
by StefanGu
Hello,

We are evaluating the XI product and have run into a problem that may preclude XI completely.
We use the latest version of the downloadable CentOS appliance.

The problem is that the CPU(s) peg at 100% after approximately 2-3 days of uptime, and sooner if anyone uses the web interface or tries to add or change hosts, services, etc.
Memory seems to stay stable (50% or 1MB for 20 devices), but the number of processes increases by 50% at these events, and CPU goes to 100%.
The added processes seem to be cron jobs, as we see duplicates of these.
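
For anyone wanting to confirm this kind of duplication, one option is to tally running processes by full command line and list the ones that appear more than once. This is only a minimal sketch: the function names are made up for illustration and are not part of XI, and it assumes a Linux `ps` that accepts `-eo args=`.

```python
from collections import Counter
import subprocess


def duplicated_commands(ps_output, threshold=2):
    """Return {command line: count} for entries seen at least `threshold` times."""
    counts = Counter(
        line.strip() for line in ps_output.splitlines() if line.strip()
    )
    return {cmd: n for cmd, n in counts.items() if n >= threshold}


def current_ps_output():
    """Snapshot of running command lines (assumes Linux procps `ps`)."""
    result = subprocess.run(
        ["ps", "-eo", "args="], capture_output=True, text=True, check=True
    )
    return result.stdout
```

Calling `duplicated_commands(current_ps_output(), threshold=5)` once while the box is healthy and again while the CPU is pegged should show which commands are piling up.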

The internal Perl engine has already been turned off, but with no change in behavior. A reboot corrects the issue until the next time.

We cannot take this system to production at this point, and we have started looking into competing products.

Please advise!

Re: XI Appliance 100% CPU

Posted: Thu Jan 02, 2014 1:13 pm
by abrist
Let's try to hunt down the load issues. Could you PM me your profile.zip? (admin --> system profile --> download)

Re: XI Appliance 100% CPU

Posted: Thu Jan 02, 2014 1:23 pm
by StefanGu
Don't seem to be able to PM... Attaching file here.

Re: XI Appliance 100% CPU

Posted: Thu Jan 02, 2014 1:26 pm
by tmcdonald
I downloaded your profile and deleted it from the forum. It has potentially sensitive info in there you don't want to post publicly. You should be able to PM after two posts on the forum.

Re: XI Appliance 100% CPU

Posted: Thu Jan 02, 2014 1:33 pm
by abrist
I got the attachment, though it looks like everything is running fine currently. Say, have you made use of our free quickstart? We offer a one-time, one-hour remote session with a tech support rep to sort out any evaluation/configuration issues you are experiencing. This might be the best way to hunt down the problem. Otherwise, once the server starts to have load issues, download a new profile and get it to us.

Re: XI Appliance 100% CPU

Posted: Thu Jan 02, 2014 1:39 pm
by StefanGu
I'll download a new log once the issue re-appears.

We have not used the quickstart, perhaps we should, but everything else except for this issue is working fine. One would expect the appliance to be pre-packaged in such a way it works out of the box for at least a small number of hosts and services, like what we are doing for a PoC.

Re: XI Appliance 100% CPU

Posted: Thu Jan 02, 2014 1:44 pm
by abrist
StefanGu wrote: One would expect the appliance to be pre-packaged in such a way it works out of the box for at least a small number of hosts and services, like what we are doing for a PoC.
And we do. There are many people running this appliance with reasonably large installs without issue. How many checks are you running every 5 minutes?

Re: XI Appliance 100% CPU

Posted: Thu Jan 02, 2014 1:52 pm
by StefanGu
All 115 services are checked every 5 minutes. Most are very simple: ping and NRPE data retrieval per host. Five are simple HTTP checks that complete very quickly.
A few Oracle SQL server monitors are in there, but only for 2 servers.

Re: XI Appliance 100% CPU

Posted: Thu Jan 02, 2014 1:57 pm
by abrist
What is the average latency on the Oracle checks?
With fewer than 1,000 checks, the appliance should have no problem whatsoever. You may have checks that are timing out and staying open for the duration of their timeout.
If you wish to leverage a quickstart, fill out the form at:
http://www.nagios.com/downloadxi
And check the training and consultation boxes before submitting.
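
To put rough numbers on the timeout point: the average number of check processes open at once is approximately the number of checks, times how long each stays open, divided by the check interval. The sketch below is back-of-the-envelope arithmetic only. The 115 checks and 5-minute interval come from this thread; the 1-second and 60-second durations are assumed values for illustration, not measurements from this system.

```python
def avg_open_checks(n_checks, interval_s, duration_s):
    """Average number of checks running at once, assuming they are
    spread evenly over the interval and each stays open for duration_s."""
    return n_checks * duration_s / interval_s


# 115 checks every 5 minutes, each finishing in about 1 second (assumed).
fast = avg_open_checks(115, 300, 1)

# The same checks if every one hung until a 60-second timeout (assumed).
hung = avg_open_checks(115, 300, 60)

print(f"fast checks: {fast:.2f} open on average")
print(f"hung checks: {hung:.2f} open on average")
```

Under these assumptions, a handful of checks that hang until their timeout would account for far more open processes than the rest of the configuration combined, which is why the latency of the slowest checks is worth measuring first.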

Re: XI Appliance 100% CPU

Posted: Thu Jan 02, 2014 2:14 pm
by slansing
What hardware did you provision this VM with? Can you copy a list and send it our way?