Re: XI Appliance 100% CPU
Posted: Thu Jan 02, 2014 2:24 pm
by StefanGu
Not sure I follow your question properly.
The VM is provisioned on a Dell 2950 gen III; I gave it 2 vCPUs with 2 GB RAM. Normally it only consumes 600-700 MHz of CPU (of 4.6 GHz available) unless this issue appears, with some spikes due to the checking, but still low. Active guest memory is 700 MB (I removed ConsoleKit from CentOS, as it is a virtual memory hog). 10 GB of disk space is left out of 42 GB.
Or are you looking for info on the monitored hosts?
Re: XI Appliance 100% CPU
Posted: Thu Jan 02, 2014 2:59 pm
by abrist
Hold on. Only 10GB left of the original 42GB? This would imply that XI has used 30GB of space during your trial . . . For the number of checks you are performing, that is waaaay too high. Let's find what is using all that space:
Code: Select all
cd /
du -hsx * | sort -rh | head -10
find . -type f -print0 | xargs -0 du | sort -n | tail -10 | cut -f2 | xargs -I{} du -sh {}
find . -type d -print0 | xargs -0 du | sort -n | tail -10 | cut -f2 | xargs -I{} du -sh {}
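Since / also contains /proc, where entries come and go while find and du are running, the commands above can print a fair amount of "No such file or directory" noise. A rough sketch of a quieter variant follows, assuming GNU find, xargs and du. The function name `largest_files` and its arguments are made up for illustration and are not part of XI or any standard tool. It reports sizes in KB rather than human-readable units.

```shell
# Hypothetical helper: print the N largest files under a directory.
# -xdev keeps find on one filesystem, so /proc is never traversed when
# starting from /. Arguments (both optional): start directory, result count.
largest_files() {
    dir="${1:-/}"
    count="${2:-10}"
    # du -k prints "<size in KB><TAB><path>"; sort numerically and keep
    # the last (largest) entries. xargs -r skips du when find finds nothing.
    find "$dir" -xdev -type f -print0 2>/dev/null \
        | xargs -0 -r du -k 2>/dev/null \
        | sort -n \
        | tail -n "$count"
}

# Usage (as root, to be able to read everything):
#   largest_files / 10
```

Directories can be listed the same way by swapping `-type f` for `-type d` and `du -k` for `du -sk`.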
Re: XI Appliance 100% CPU
Posted: Thu Jan 02, 2014 3:17 pm
by StefanGu
My apologies... The VMware Snapshot Manager tripped me up... I have been taking snapshots as we tried different things, and I missed that in the vSphere Resource page.
Code: Select all
[root@nagios ~]# cd /
[root@nagios /]# du -hsx * | sort -rh | head -10
du: cannot access `proc/12689/task/12689/fd/4': No such file or directory
du: cannot access `proc/12689/task/12689/fdinfo/4': No such file or directory
du: cannot access `proc/12689/fd/4': No such file or directory
du: cannot access `proc/12689/fdinfo/4': No such file or directory
1.8G usr
382M var
335M tmp
218M lib
41M root
40M boot
27M etc
21M lib64
11M sbin
6.8M bin
[root@nagios /]# find . -type f -print0 | xargs -0 du | sort -n | tail -10 | cut -f2 | xargs -I{} du -sh {}
find: `./proc/12733/task/12733/fd/5': No such file or directory
find: `./proc/12733/task/12733/fdinfo/5': No such file or directory
find: `./proc/12733/fd/5': No such file or directory
find: `./proc/12733/fdinfo/5': No such file or directory
find: `./proc/12766': No such file or directory
23M ./var/tmp/oracle-instantclient11.2-basiclite-11.2.0.4.0-1.x86_64.rpm
25M ./var/cache/yum/x86_64/6/epel/0b444eb501e46d0e4a6212862e504fbccebe552dd3071721eae98561778d7252-primary.sqlite
28M ./var/lib/rpm/Packages
37M ./var/log/httpd/access_log-20131216
53M ./usr/lib/oracle/12.1/client64/lib/libclntsh.so.12.1
60M ./tmp/VMwareTools-9.4.0-1280544.tar.gz
61M ./var/log/httpd/access_log-20131230
67M ./tmp/oracle-instantclient12.1-basic-12.1.0.1.0-1.x86_64.rpm
95M ./usr/lib/locale/locale-archive
130M ./usr/lib/oracle/12.1/client64/lib/libociei.so
[root@nagios /]# find . -type d -print0 | xargs -0 du | sort -n | tail -10 | cut -f2 | xargs -I{} du -sh {}
du: cannot access `./proc/13064/task/13064/fd/4': No such file or directory
du: cannot access `./proc/13064/task/13064/fdinfo/4': No such file or directory
du: cannot access `./proc/13064/fd/4': No such file or directory
du: cannot access `./proc/13064/fdinfo/4': No such file or directory
find: `./proc/13058/task/13058/fd/5': No such file or directory
find: `./proc/13058/task/13058/fdinfo/5': No such file or directory
find: `./proc/13058/fd/5': No such file or directory
find: `./proc/13058/fdinfo/5': No such file or directory
find: `./proc/13068/task/13068/fd/3': No such file or directory
find: `./proc/13068/task/13068/fd/6': No such file or directory
find: `./proc/13068/task/13068/fd/7': No such file or directory
find: `./proc/13068/task/13068/fdinfo/7': No such file or directory
find: `./proc/13068/fd/6': No such file or directory
find: `./proc/13068/fdinfo/7': No such file or directory
find: `./proc/13071': No such file or directory
find: `./proc/13072': No such file or directory
du: cannot access `./proc/13073/task/13073/fd/4': No such file or directory
du: cannot access `./proc/13073/task/13073/fdinfo/4': No such file or directory
du: cannot access `./proc/13073/fd/4': No such file or directory
du: cannot access `./proc/13073/fdinfo/4': No such file or directory
du: cannot access `./proc/13058': No such file or directory
du: cannot access `./proc/13058/task': No such file or directory
du: cannot access `./proc/13058/task/13058': No such file or directory
du: cannot access `./proc/13058/task/13058/fd': No such file or directory
du: cannot access `./proc/13058/task/13058/fdinfo': No such file or directory
du: cannot access `./proc/13058/task/13058/attr': No such file or directory
du: cannot access `./proc/13058/fd': No such file or directory
du: cannot access `./proc/13058/fdinfo': No such file or directory
du: cannot access `./proc/13058/net': No such file or directory
du: cannot access `./proc/13058/net/dev_snmp6': No such file or directory
du: cannot access `./proc/13058/net/netfilter': No such file or directory
du: cannot access `./proc/13058/net/stat': No such file or directory
du: cannot access `./proc/13058/attr': No such file or directory
du: cannot access `./proc/13068': No such file or directory
du: cannot access `./proc/13068/task': No such file or directory
du: cannot access `./proc/13068/task/13068': No such file or directory
du: cannot access `./proc/13068/task/13068/fd': No such file or directory
du: cannot access `./proc/13068/task/13068/fdinfo': No such file or directory
du: cannot access `./proc/13068/task/13068/attr': No such file or directory
du: cannot access `./proc/13068/fd': No such file or directory
du: cannot access `./proc/13068/fdinfo': No such file or directory
du: cannot access `./proc/13068/net': No such file or directory
du: cannot access `./proc/13068/net/dev_snmp6': No such file or directory
du: cannot access `./proc/13068/net/netfilter': No such file or directory
du: cannot access `./proc/13068/net/stat': No such file or directory
du: cannot access `./proc/13068/attr': No such file or directory
382M ./var
503M ./usr/lib
503M ./usr/lib
503M ./usr/lib
625M ./usr/share
625M ./usr/share
625M ./usr/share
1.8G ./usr
1.8G ./usr
du: cannot access `./proc/13104/task/13104/fd/4': No such file or directory
du: cannot access `./proc/13104/task/13104/fdinfo/4': No such file or directory
du: cannot access `./proc/13104/fd/4': No such file or directory
du: cannot access `./proc/13104/fdinfo/4': No such file or directory
2.8G .
Re: XI Appliance 100% CPU
Posted: Thu Jan 02, 2014 3:46 pm
by abrist
Judging from that output, it looks like you may have used only 10GB of the 40GB. Does that sound right?
Re: XI Appliance 100% CPU
Posted: Thu Jan 02, 2014 3:52 pm
by StefanGu
10GB thin provisioned, could be expanded if needed.
Code: Select all
[root@nagios ~]# df -h
Filesystem                    Size  Used Avail Use% Mounted on
/dev/mapper/VolGroup-lv_root
                              7.5G  2.9G  4.2G  41% /
tmpfs                         939M     0  939M   0% /dev/shm
/dev/sda1                     485M   50M  410M  11% /boot
Re: XI Appliance 100% CPU
Posted: Thu Jan 02, 2014 6:09 pm
by StefanGu
Still no issue, even after heavy use with reconfigurations and scheduling of immediate checks.
Re: XI Appliance 100% CPU
Posted: Fri Jan 03, 2014 10:35 am
by slansing
Okay, just keep us apprised.
Re: XI Appliance 100% CPU
Posted: Fri Jan 03, 2014 6:17 pm
by StefanGu
Profile taken at the end of the graphing period. Unfortunately, we missed the 100% period this time.
Nagios 100 CPU episode 2014-01-02.png
Example of a 100% CPU episode; this time it recovered on its own.
Re: XI Appliance 100% CPU
Posted: Mon Jan 06, 2014 10:44 am
by slansing
Have you verified without a doubt that no other systems on your vSphere server are having this issue as well? Even if that profile was meant to cover the 100% CPU incident, I can't see anything regarding it in the logs it contains, since they were taken about 2 hours after the incident. How long did it last, just the amount of time shown?
Re: XI Appliance 100% CPU
Posted: Mon Jan 06, 2014 1:06 pm
by StefanGu
No other VM is exhibiting this issue, including a competing Nagios product.
It did recover, as indicated in the CPU graph, but this morning it is out of memory instead. This means that I cannot log in and take any log file snapshots.
It appears that PHP jobs never end in some cases (cron?).
About to give up on XI now.
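For anyone wanting to check the suspicion about PHP jobs that never exit, something along these lines might help once the box is reachable again. It is only a sketch: `long_running_php` is a made-up helper name, and the one-hour threshold is an arbitrary assumption, not anything XI defines. It relies on the `etime` column of `ps`, whose format is `[[dd-]hh:]mm:ss`, so any value with an hours or days field means the process has been running for at least an hour.

```shell
# Hypothetical helper: read `ps -eo pid,etime,args` output on stdin and
# print the lines for PHP processes that have run for one hour or longer.
# etime is formatted [[dd-]hh:]mm:ss, so "hh:mm:ss" or "dd-hh:mm:ss"
# (two colons) indicates at least one hour of elapsed time.
long_running_php() {
    awk '/php/ && $2 ~ /^([0-9]+-)?[0-9]+:[0-9]+:[0-9]+$/ { print }'
}

# Usage:
#   ps -eo pid,etime,args | long_running_php
```

If the same cron-launched scripts keep showing up with ever-growing elapsed times, that would point at jobs piling up rather than finishing.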