Re: Nagios 2014R2.3 on VM HIGH LOAD SPIKES
Posted: Thu Feb 05, 2015 10:13 am
by carlos.atos
Hello scottwilkerson,
Sorry, I did that yesterday as well. Sorry again.
Here is the output from both VMs, taken at the moment of yesterday's spikes.
4 vCPU VM:
Code: Select all
[root@localhost ~]# ps -ef|grep php
nagios 24771 24769 0 15:23 ? 00:00:00 /bin/sh -c /usr/bin/php -q /usr/local/nagiosxi/cron/cmdsubsys.php > /usr/local/nagiosxi/var/cmdsubsys.log 2>&1
nagios 24774 24768 0 15:23 ? 00:00:00 /bin/sh -c /usr/bin/php -q /usr/local/nagiosxi/cron/eventman.php > /usr/local/nagiosxi/var/eventman.log 2>&1
nagios 24775 24771 5 15:23 ? 00:00:02 /usr/bin/php -q /usr/local/nagiosxi/cron/cmdsubsys.php
nagios 24777 24766 0 15:23 ? 00:00:00 /bin/sh -c /usr/bin/php -q /usr/local/nagiosxi/cron/perfdataproc.php > /usr/local/nagiosxi/var/perfdataproc.log 2>&1
nagios 24779 24777 9 15:23 ? 00:00:03 /usr/bin/php -q /usr/local/nagiosxi/cron/perfdataproc.php
nagios 24781 24774 7 15:23 ? 00:00:03 /usr/bin/php -q /usr/local/nagiosxi/cron/eventman.php
nagios 24784 24770 0 15:23 ? 00:00:00 /bin/sh -c /usr/bin/php -q /usr/local/nagiosxi/cron/sysstat.php > /usr/local/nagiosxi/var/sysstat.log 2>&1
nagios 24786 24784 13 15:23 ? 00:00:05 /usr/bin/php -q /usr/local/nagiosxi/cron/sysstat.php
root 25011 15036 0 15:23 pts/0 00:00:00 grep php
[root@localhost ~]#
2 vCPU VM:
Code: Select all
[root@localhost ~]# ps -ef|grep php
nagios 4937 4934 0 14:48 ? 00:00:00 /bin/sh -c /usr/bin/php -q /usr/local/nagiosxi/cron/eventman.php > /usr/local/nagiosxi/var/eventman.log 2>&1
nagios 4939 4936 0 14:48 ? 00:00:00 /bin/sh -c /usr/bin/php -q /usr/local/nagiosxi/cron/sysstat.php > /usr/local/nagiosxi/var/sysstat.log 2>&1
nagios 4940 4932 0 14:48 ? 00:00:00 /bin/sh -c /usr/bin/php -q /usr/local/nagiosxi/cron/perfdataproc.php > /usr/local/nagiosxi/var/perfdataproc.log 2>&1
nagios 4941 4937 3 14:48 ? 00:00:01 /usr/bin/php -q /usr/local/nagiosxi/cron/eventman.php
nagios 4943 4939 8 14:48 ? 00:00:03 /usr/bin/php -q /usr/local/nagiosxi/cron/sysstat.php
nagios 4944 4940 3 14:48 ? 00:00:01 /usr/bin/php -q /usr/local/nagiosxi/cron/perfdataproc.php
nagios 4949 4935 0 14:48 ? 00:00:00 /bin/sh -c /usr/bin/php -q /usr/local/nagiosxi/cron/cmdsubsys.php > /usr/local/nagiosxi/var/cmdsubsys.log 2>&1
nagios 4952 4949 3 14:48 ? 00:00:01 /usr/bin/php -q /usr/local/nagiosxi/cron/cmdsubsys.php
root 5171 4797 0 14:48 pts/0 00:00:00 grep php
What do you see?
I will wait for the next spike and catch the processes again with ps -ef|grep php|grep -v /bin/sh .
Cheers
Re: Nagios 2014R2.3 on VM HIGH LOAD SPIKES
Posted: Thu Feb 05, 2015 10:46 am
by scottwilkerson
I was wondering whether there was going to be a bunch of these processes, but it doesn't look like it.
I guess if you see a spike again a better command would be just
One thing to note, though: in a VM environment, if another VM starts to eat all of the disk I/O, you would see a spike in load even if your CPUs had free cycles, and Nagios itself has a large appetite for disk I/O.
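As a rough illustration of the disk I/O point: on Linux the load average counts processes in uninterruptible sleep (state D, usually waiting on I/O) as well as runnable ones, so listing those during a spike can hint at whether the load is I/O-bound rather than CPU-bound. A minimal sketch; the function name `blocked_procs` is made up for this example.

```shell
# Print only the processes in uninterruptible sleep (STAT starts with "D").
# Expects `ps -eo stat,pid,user,comm` style output on stdin and skips the header line.
blocked_procs() {
    awk 'NR > 1 && $1 ~ /^D/'
}

# Typical use while the load is high:
#   ps -eo stat,pid,user,comm | blocked_procs
```

If many processes show up here while the CPUs are mostly idle, storage contention becomes a more likely suspect.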
Re: Nagios 2014R2.3 on VM HIGH LOAD SPIKES
Posted: Mon Feb 09, 2015 8:05 am
by carlos.atos
Hi scottwilkerson,
I had a big load spike yesterday, but I wasn't in the office to notice it and catch the processes. Both Nagios VMs on the host went crazy, up to 300 in load.
4vCPUs-FEB-09-current_load.png
I connected to the console on the 4 vCPU VM and there was a notice saying that I have mail in /var/spool/mail/root,
and this is what I got from there.
Code: Select all
###### WARNING ######
Errors reported during AutoMySQLBackup execution.. Backup failed
Error log below..
-- Warning: Skipping the data of table mysql.event. Specify the --events option explicitly.
From [email protected] Tue Dec 23 08:00:09 2014
Return-Path: <[email protected]>
X-Original-To: root@localhost
Delivered-To: [email protected]
Received: by localhost.localdomain (Postfix, from userid 0)
id 8A50565E; Tue, 23 Dec 2014 08:00:09 +0000 (GMT)
Date: Tue, 23 Dec 2014 08:00:09 +0000
To: [email protected]
Subject: PostgreSQL Backup Log for localhost - 2014-12-23
User-Agent: Heirloom mailx 12.4 7/29/08
MIME-Version: 1.0
2015-01-26 01:50:40: ERROR: Target[193.138.100.2_49][_OUT_] ' $target->[50]{$mode} ' did not eval into defined data
2015-01-26 01:50:40: ERROR: Target[193.138.100.2_50][_IN_] ' $target->[51]{$mode} ' did not eval into defined data
2015-01-26 01:50:40: ERROR: Target[193.138.100.2_50][_OUT_] ' $target->[51]{$mode} ' did not eval into defined data
2015-01-26 01:50:40: ERROR: Target[193.138.100.2_51][_IN_] ' $target->[52]{$mode} ' did not eval into defined data
2015-01-26 01:50:40: ERROR: Target[193.138.100.2_51][_OUT_] ' $target->[52]{$mode} ' did not eval into defined data
2015-01-26 01:50:40: ERROR: Target[193.138.100.2_66][_IN_] ' $target->[53]{$mode} ' did not eval into defined data
2015-01-26 01:50:40: ERROR: Target[193.138.100.2_66][_OUT_] ' $target->[53]{$mode} ' did not eval into defined data
From [email protected] Sun Feb 8 00:32:02 2015
Return-Path: <[email protected]>
X-Original-To: root
Delivered-To: [email protected]
Received: by localhost.localdomain (Postfix, from userid 0)
id 4C96E70B; Sun, 8 Feb 2015 00:21:54 +0000 (GMT)
From: [email protected] (Cron Daemon)
To: [email protected]
Subject: Cron <root@localhost> LANG=C LC_ALL=C /usr/bin/mrtg /etc/mrtg/mrtg.cfg --lock-file /var/lock/mrtg/mrtg_l --confcache-file /var/lib/mrtg/mrtg.ok
Content-Type: text/plain; charset=UTF-8
Auto-Submitted: auto-generated
X-Cron-Env: <LANG=en_US.UTF-8>
X-Cron-Env: <SHELL=/bin/sh>
X-Cron-Env: <HOME=/root>
X-Cron-Env: <PATH=/usr/bin:/bin>
X-Cron-Env: <LOGNAME=root>
X-Cron-Env: <USER=root>
Message-Id: <[email protected]>
Date: Sun, 8 Feb 2015 00:21:29 +0000 (GMT)
2015-02-08 00:16:54: ERROR: I guess another mrtg is running. A lockfile (/var/lock/mrtg/mrtg_l) aged
223 seconds is hanging around. If you are sure that no other mrtg
is running you can remove the lockfile
From [email protected] Sun Feb 8 12:09:30 2015
Return-Path: <[email protected]>
X-Original-To: root
Delivered-To: [email protected]
Received: by localhost.localdomain (Postfix, from userid 0)
id 6FED9940; Sun, 8 Feb 2015 08:25:24 +0000 (GMT)
From: [email protected] (Cron Daemon)
To: [email protected]
Subject: Cron <root@localhost> /root/scripts/autopostgresqlbackup
Content-Type: text/plain; charset=UTF-8
Auto-Submitted: auto-generated
X-Cron-Env: <LANG=en_US.UTF-8>
X-Cron-Env: <SHELL=/bin/sh>
X-Cron-Env: <HOME=/root>
X-Cron-Env: <PATH=/usr/bin:/bin>
X-Cron-Env: <LOGNAME=root>
X-Cron-Env: <USER=root>
Message-Id: <[email protected]>
Date: Sun, 8 Feb 2015 08:15:48 +0000 (GMT)
psql: FATAL: sorry, too many clients already
I ran ps aux (though not at the moment of the spike) and this is what I got.
Code: Select all
[root@localhost ~]# ps aux
USER PID %CPU %MEM VSZ RSS TTY STAT START TIME COMMAND
root 1 0.0 0.0 19360 1020 ? Ss Feb03 1:42 /sbin/init
root 2 0.0 0.0 0 0 ? S Feb03 0:00 [kthreadd]
root 3 0.0 0.0 0 0 ? S Feb03 0:49 [migration/0]
root 4 0.0 0.0 0 0 ? S Feb03 0:06 [ksoftirqd/0]
root 5 0.0 0.0 0 0 ? S Feb03 0:00 [stopper/0]
root 6 0.0 0.0 0 0 ? S Feb03 0:06 [watchdog/0]
root 7 0.0 0.0 0 0 ? S Feb03 0:21 [migration/1]
root 8 0.0 0.0 0 0 ? S Feb03 0:00 [stopper/1]
root 9 0.0 0.0 0 0 ? S Feb03 0:13 [ksoftirqd/1]
root 10 0.0 0.0 0 0 ? S Feb03 0:06 [watchdog/1]
root 11 0.0 0.0 0 0 ? S Feb03 0:56 [migration/2]
root 12 0.0 0.0 0 0 ? S Feb03 0:00 [stopper/2]
root 13 0.0 0.0 0 0 ? S Feb03 0:09 [ksoftirqd/2]
root 14 0.0 0.0 0 0 ? S Feb03 0:05 [watchdog/2]
root 15 0.0 0.0 0 0 ? S Feb03 0:24 [migration/3]
root 16 0.0 0.0 0 0 ? S Feb03 0:00 [stopper/3]
root 17 0.0 0.0 0 0 ? S Feb03 0:11 [ksoftirqd/3]
root 18 0.0 0.0 0 0 ? S Feb03 0:06 [watchdog/3]
root 19 0.0 0.0 0 0 ? S Feb03 2:13 [events/0]
root 20 0.0 0.0 0 0 ? S Feb03 3:55 [events/1]
root 21 0.0 0.0 0 0 ? S Feb03 2:20 [events/2]
root 22 0.0 0.0 0 0 ? S Feb03 2:58 [events/3]
root 23 0.0 0.0 0 0 ? S Feb03 0:00 [cgroup]
root 24 0.0 0.0 0 0 ? S Feb03 0:00 [khelper]
root 25 0.0 0.0 0 0 ? S Feb03 0:00 [netns]
root 26 0.0 0.0 0 0 ? S Feb03 0:00 [async/mgr]
root 27 0.0 0.0 0 0 ? S Feb03 0:00 [pm]
root 28 0.0 0.0 0 0 ? S Feb03 0:11 [sync_supers]
root 29 0.0 0.0 0 0 ? S Feb03 0:09 [bdi-default]
root 30 0.0 0.0 0 0 ? S Feb03 0:00 [kintegrityd/0]
root 31 0.0 0.0 0 0 ? S Feb03 0:00 [kintegrityd/1]
root 32 0.0 0.0 0 0 ? S Feb03 0:00 [kintegrityd/2]
root 33 0.0 0.0 0 0 ? S Feb03 0:00 [kintegrityd/3]
root 34 0.0 0.0 0 0 ? S Feb03 1:34 [kblockd/0]
root 35 0.0 0.0 0 0 ? S Feb03 0:53 [kblockd/1]
root 36 0.0 0.0 0 0 ? S Feb03 1:23 [kblockd/2]
root 37 0.0 0.0 0 0 ? S Feb03 0:50 [kblockd/3]
root 38 0.0 0.0 0 0 ? S Feb03 0:00 [kacpid]
root 39 0.0 0.0 0 0 ? S Feb03 0:00 [kacpi_notify]
root 40 0.0 0.0 0 0 ? S Feb03 0:00 [kacpi_hotplug]
root 41 0.0 0.0 0 0 ? S Feb03 0:00 [ata_aux]
root 42 0.0 0.0 0 0 ? S Feb03 0:00 [ata_sff/0]
root 43 0.0 0.0 0 0 ? S Feb03 0:00 [ata_sff/1]
root 44 0.0 0.0 0 0 ? S Feb03 0:00 [ata_sff/2]
root 45 0.0 0.0 0 0 ? S Feb03 0:00 [ata_sff/3]
root 46 0.0 0.0 0 0 ? S Feb03 0:00 [ksuspend_usbd]
root 47 0.0 0.0 0 0 ? S Feb03 0:00 [khubd]
root 48 0.0 0.0 0 0 ? S Feb03 0:00 [kseriod]
root 49 0.0 0.0 0 0 ? S Feb03 0:00 [md/0]
root 50 0.0 0.0 0 0 ? S Feb03 0:00 [md/1]
root 51 0.0 0.0 0 0 ? S Feb03 0:00 [md/2]
root 52 0.0 0.0 0 0 ? S Feb03 0:00 [md/3]
root 53 0.0 0.0 0 0 ? S Feb03 0:00 [md_misc/0]
root 54 0.0 0.0 0 0 ? S Feb03 0:00 [md_misc/1]
root 55 0.0 0.0 0 0 ? S Feb03 0:00 [md_misc/2]
root 56 0.0 0.0 0 0 ? S Feb03 0:00 [md_misc/3]
root 57 0.0 0.0 0 0 ? S Feb03 0:00 [linkwatch]
root 58 0.0 0.0 0 0 ? S Feb03 0:08 [khungtaskd]
root 59 0.1 0.0 0 0 ? S Feb03 10:30 [kswapd0]
root 60 0.0 0.0 0 0 ? SN Feb03 0:00 [ksmd]
root 61 0.0 0.0 0 0 ? SN Feb03 2:01 [khugepaged]
root 62 0.0 0.0 0 0 ? S Feb03 0:00 [aio/0]
root 63 0.0 0.0 0 0 ? S Feb03 0:00 [aio/1]
root 64 0.0 0.0 0 0 ? S Feb03 0:00 [aio/2]
root 65 0.0 0.0 0 0 ? S Feb03 0:00 [aio/3]
root 66 0.0 0.0 0 0 ? S Feb03 0:00 [crypto/0]
root 67 0.0 0.0 0 0 ? S Feb03 0:00 [crypto/1]
root 68 0.0 0.0 0 0 ? S Feb03 0:00 [crypto/2]
root 69 0.0 0.0 0 0 ? S Feb03 0:00 [crypto/3]
root 77 0.0 0.0 0 0 ? S Feb03 0:00 [kthrotld/0]
root 78 0.0 0.0 0 0 ? S Feb03 0:00 [kthrotld/1]
root 79 0.0 0.0 0 0 ? S Feb03 0:00 [kthrotld/2]
root 80 0.0 0.0 0 0 ? S Feb03 0:00 [kthrotld/3]
root 81 0.0 0.0 0 0 ? S Feb03 0:00 [pciehpd]
root 83 0.0 0.0 0 0 ? S Feb03 0:00 [kpsmoused]
root 84 0.0 0.0 0 0 ? S Feb03 0:00 [usbhid_resumer]
root 85 0.0 0.0 0 0 ? S Feb03 0:00 [deferwq]
root 116 0.0 0.0 0 0 ? S Feb03 0:00 [kdmremove]
root 117 0.0 0.0 0 0 ? S Feb03 0:00 [kstriped]
root 283 0.0 0.0 0 0 ? S Feb03 0:46 [mpt_poll_0]
root 284 0.0 0.0 0 0 ? S Feb03 0:00 [mpt/0]
root 309 0.0 0.0 0 0 ? S Feb03 0:00 [scsi_eh_0]
root 318 0.0 0.0 0 0 ? S Feb03 0:00 [scsi_eh_1]
root 319 0.0 0.0 0 0 ? S Feb03 0:00 [scsi_eh_2]
root 422 0.0 0.0 0 0 ? S Feb03 0:00 [kdmflush]
root 424 0.0 0.0 0 0 ? S Feb03 0:00 [kdmflush]
root 442 0.0 0.0 0 0 ? S Feb03 3:16 [jbd2/dm-0-8]
root 443 0.0 0.0 0 0 ? S Feb03 0:00 [ext4-dio-unwrit]
root 516 0.0 0.0 11028 260 ? S<s Feb03 0:00 /sbin/udevd -d
root 712 0.0 0.0 0 0 ? S Feb03 0:35 [vmmemctl]
root 720 0.0 0.0 0 0 ? S Feb03 1:45 [flush-253:0]
root 858 0.0 0.0 11028 244 ? S< Feb03 0:00 /sbin/udevd -d
root 865 0.0 0.0 10636 248 ? S< Feb03 0:00 /sbin/udevd -d
root 889 0.0 0.0 0 0 ? S Feb03 0:00 [jbd2/sda1-8]
root 890 0.0 0.0 0 0 ? S Feb03 0:00 [ext4-dio-unwrit]
root 928 0.0 0.0 0 0 ? S Feb03 2:10 [kauditd]
root 1175 0.1 0.0 93144 728 ? S<sl Feb03 11:05 auditd
root 1199 0.1 0.1 249476 6860 ? Sl Feb03 13:09 /sbin/rsyslogd -i /var/run/syslogd.pid -c 5
dbus 1212 0.0 0.0 21532 448 ? Ss Feb03 0:00 dbus-daemon --system
root 1401 0.0 0.0 66688 592 ? Ss Feb03 0:45 /usr/sbin/sshd
root 1410 0.0 0.0 22188 620 ? Ss Feb03 0:00 xinetd -stayalive -pidfile /var/run/xinetd.pid
root 1420 0.3 0.0 189580 2296 ? Sl Feb03 26:12 /usr/sbin/vmtoolsd
ntp 1425 0.0 0.0 30732 1464 ? Ss Feb03 1:14 ntpd -u ntp:ntp -p /var/run/ntpd.pid -g
root 1475 0.0 0.0 108168 1244 ? S Feb03 0:00 /bin/sh /usr/bin/mysqld_safe --datadir=/var/lib/mysql --socket=/var/lib/my
mysql 1592 0.6 1.0 2256124 40968 ? Sl Feb03 57:01 /usr/libexec/mysqld --basedir=/usr --datadir=/var/lib/mysql --user=mysql -
postgres 1631 0.1 0.1 216356 4160 ? S Feb03 16:49 /usr/bin/postmaster -p 5432 -D /var/lib/pgsql/data
postgres 1702 0.0 0.0 179368 684 ? Ss Feb03 2:02 postgres: logger process
postgres 1713 0.0 0.0 216472 2952 ? Ss Feb03 7:09 postgres: writer process
postgres 1714 0.0 0.0 216356 968 ? Ss Feb03 5:18 postgres: wal writer process
postgres 1715 0.0 0.0 216644 1180 ? Ss Feb03 4:35 postgres: autovacuum launcher process
postgres 1716 0.0 0.0 179636 868 ? Ss Feb03 8:29 postgres: stats collector process
root 1717 0.0 0.0 81328 2820 ? Ss Feb03 2:41 /usr/libexec/postfix/master
postfix 1727 0.0 0.0 83144 3016 ? S Feb03 1:54 qmgr -l -t fifo -u
root 1752 0.0 0.3 336788 15316 ? Ss Feb03 2:31 /usr/sbin/httpd
root 1762 0.1 0.0 117336 740 ? Ss Feb03 9:08 crond
nagios 1772 0.1 0.0 368888 936 ? S Feb03 15:30 /usr/local/nagios/bin/npcd -d -f /usr/local/nagios/etc/pnp/npcd.cfg
ajaxterm 1780 0.1 0.0 170340 1616 ? Sl Feb03 10:25 python /usr/share/ajaxterm/ajaxterm.py --daemon --port=8022 --uid=ajaxterm
nagios 1852 0.0 0.0 50296 236 ? Ss Feb03 0:00 /usr/local/nagios/bin/ndo2db -c /usr/local/nagios/etc/ndo2db.cfg
root 1861 0.0 0.0 67552 1360 ? Ss Feb03 0:00 login -- root
root 1863 0.0 0.0 4064 488 tty2 Ss+ Feb03 0:00 /sbin/mingetty /dev/tty2
root 1865 0.0 0.0 4064 488 tty3 Ss+ Feb03 0:00 /sbin/mingetty /dev/tty3
root 1867 0.0 0.0 4064 488 tty4 Ss+ Feb03 0:00 /sbin/mingetty /dev/tty4
root 1869 0.0 0.0 4064 488 tty5 Ss+ Feb03 0:00 /sbin/mingetty /dev/tty5
root 1871 0.0 0.0 4064 488 tty6 Ss+ Feb03 0:00 /sbin/mingetty /dev/tty6
root 2068 0.0 0.0 2085024 1212 ? Sl Feb03 0:00 /usr/sbin/console-kit-daemon --no-daemon
root 2138 0.0 0.0 108304 1340 tty1 Ss+ Feb03 0:00 -bash
apache 5872 0.8 0.9 464048 37492 ? S Feb08 12:12 /usr/sbin/httpd
apache 5875 0.8 0.9 464304 37544 ? S Feb08 11:54 /usr/sbin/httpd
apache 5877 0.8 0.9 464820 38324 ? S Feb08 12:34 /usr/sbin/httpd
apache 5881 0.8 0.9 464048 37488 ? S Feb08 12:22 /usr/sbin/httpd
apache 5883 0.8 0.9 464816 38264 ? S Feb08 11:53 /usr/sbin/httpd
apache 5886 0.8 0.9 464304 37528 ? S Feb08 12:08 /usr/sbin/httpd
apache 5888 0.9 0.9 465356 38772 ? S Feb08 13:44 /usr/sbin/httpd
apache 5890 0.9 0.9 465248 38720 ? S Feb08 13:44 /usr/sbin/httpd
apache 5905 0.9 0.9 465500 38828 ? S Feb08 13:57 /usr/sbin/httpd
apache 5921 0.9 0.9 465500 38960 ? S Feb08 13:52 /usr/sbin/httpd
postgres 5923 0.0 0.1 217688 5356 ? Ss Feb08 0:57 postgres: nagiosxi nagiosxi [local] idle
postgres 5924 0.0 0.1 217688 5356 ? Ss Feb08 0:59 postgres: nagiosxi nagiosxi [local] idle
apache 5928 0.9 0.9 465184 38624 ? S Feb08 13:33 /usr/sbin/httpd
postgres 5929 0.0 0.1 217688 5340 ? Ss Feb08 1:02 postgres: nagiosxi nagiosxi [local] idle
postgres 5934 0.0 0.1 217688 5372 ? Ss Feb08 0:59 postgres: nagiosxi nagiosxi [local] idle
postgres 5940 0.0 0.1 217688 5348 ? Ss Feb08 1:00 postgres: nagiosxi nagiosxi [local] idle
postgres 5946 0.0 0.1 217688 5360 ? Ss Feb08 0:56 postgres: nagiosxi nagiosxi [local] idle
apache 5953 0.8 0.9 464560 37780 ? S Feb08 12:42 /usr/sbin/httpd
apache 5958 0.8 0.9 464724 38176 ? S Feb08 12:16 /usr/sbin/httpd
postgres 5961 0.0 0.1 217688 5348 ? Ss Feb08 1:01 postgres: nagiosxi nagiosxi [local] idle
apache 5962 0.9 0.9 464948 38372 ? S Feb08 13:19 /usr/sbin/httpd
apache 5966 0.8 0.9 464560 38000 ? S Feb08 12:11 /usr/sbin/httpd
apache 6041 0.8 0.9 464048 37484 ? S Feb08 12:38 /usr/sbin/httpd
postgres 6061 0.0 0.1 217760 6020 ? Ss Feb08 1:05 postgres: nagiosxi nagiosxi [local] idle
postgres 6070 0.0 0.1 217688 5356 ? Ss Feb08 1:00 postgres: nagiosxi nagiosxi [local] idle
postgres 6071 0.0 0.1 217688 5348 ? Ss Feb08 0:58 postgres: nagiosxi nagiosxi [local] idle
postgres 6079 0.0 0.1 217688 5356 ? Ss Feb08 0:55 postgres: nagiosxi nagiosxi [local] idle
postgres 6087 0.0 0.1 217688 5356 ? Ss Feb08 1:03 postgres: nagiosxi nagiosxi [local] idle
apache 6100 0.9 0.9 465224 38676 ? S Feb08 13:17 /usr/sbin/httpd
postgres 6112 0.0 0.1 217688 5344 ? Ss Feb08 0:51 postgres: nagiosxi nagiosxi [local] idle
postgres 6122 0.0 0.1 217688 5360 ? Ss Feb08 1:07 postgres: nagiosxi nagiosxi [local] idle
postgres 6123 0.0 0.1 217688 5312 ? Ss Feb08 0:57 postgres: nagiosxi nagiosxi [local] idle
postgres 6124 0.0 0.1 217688 5352 ? Ss Feb08 0:59 postgres: nagiosxi nagiosxi [local] idle
postgres 6141 0.0 0.1 217688 5352 ? Ss Feb08 1:04 postgres: nagiosxi nagiosxi [local] idle
postfix 20498 0.0 0.1 81612 4040 ? S 12:49 0:00 smtp -t unix -u
postfix 20499 0.0 0.1 81612 4040 ? S 12:49 0:00 smtp -t unix -u
postfix 20500 0.0 0.1 81612 4040 ? S 12:49 0:00 smtp -t unix -u
postfix 20501 0.1 0.1 81612 4036 ? S 12:49 0:00 smtp -t unix -u
postfix 20502 0.0 0.1 81612 4044 ? S 12:49 0:00 smtp -t unix -u
root 20605 0.4 0.1 100448 4400 ? Ss 12:50 0:00 sshd: root@pts/0
root 20614 0.0 0.0 140224 1336 ? S 12:50 0:00 CROND
root 20616 0.0 0.0 140224 1336 ? S 12:50 0:00 CROND
root 20617 0.0 0.0 140224 1332 ? S 12:50 0:00 CROND
root 20618 0.0 0.0 140224 1336 ? S 12:50 0:00 CROND
nagios 20624 0.0 0.0 106060 1264 ? Ss 12:50 0:00 /bin/sh -c /usr/bin/php -q /usr/local/nagiosxi/cron/eventman.php > /usr/lo
nagios 20625 0.0 0.0 106060 1268 ? Ss 12:50 0:00 /bin/sh -c /usr/bin/php -q /usr/local/nagiosxi/cron/sysstat.php > /usr/loc
nagios 20626 0.0 0.0 106060 1264 ? Ss 12:50 0:00 /bin/sh -c /usr/bin/php -q /usr/local/nagiosxi/cron/cmdsubsys.php > /usr/l
nagios 20628 0.0 0.0 106060 1264 ? Ss 12:50 0:00 /bin/sh -c /usr/bin/php -q /usr/local/nagiosxi/cron/perfdataproc.php > /us
nagios 20635 2.7 0.6 329156 25556 ? S 12:50 0:00 /usr/bin/php -q /usr/local/nagiosxi/cron/sysstat.php
nagios 20636 1.7 0.5 319644 22704 ? S 12:50 0:00 /usr/bin/php -q /usr/local/nagiosxi/cron/cmdsubsys.php
nagios 20639 1.8 0.5 319936 23028 ? S 12:50 0:00 /usr/bin/php -q /usr/local/nagiosxi/cron/perfdataproc.php
nagios 20640 1.7 0.7 327476 30744 ? S 12:50 0:00 /usr/bin/php -q /usr/local/nagiosxi/cron/eventman.php
postgres 20647 0.4 0.1 217688 4992 ? Ss 12:50 0:00 postgres: nagiosxi nagiosxi [local] idle
postgres 20648 0.1 0.1 217744 5472 ? Ss 12:50 0:00 postgres: nagiosxi nagiosxi [local] idle
postgres 20651 0.0 0.1 217688 4980 ? Ss 12:50 0:00 postgres: nagiosxi nagiosxi [local] idle
postgres 20660 0.5 0.1 217788 5312 ? Ss 12:50 0:00 postgres: nagiosxi nagiosxi [local] idle
root 20735 0.1 0.0 108300 1844 pts/0 Ss 12:50 0:00 -bash
root 20804 0.0 0.0 110228 1152 pts/0 R+ 12:50 0:00 ps aux
nagios 24965 0.2 0.0 27168 1796 ? Ss Feb03 21:58 /usr/local/nagios/bin/nagios -d /usr/local/nagios/etc/nagios.cfg
nagios 24967 0.0 0.0 10016 780 ? S Feb03 3:59 /usr/local/nagios/bin/nagios --worker /usr/local/nagios/var/rw/nagios.qh
nagios 24968 0.0 0.0 10016 780 ? S Feb03 4:20 /usr/local/nagios/bin/nagios --worker /usr/local/nagios/var/rw/nagios.qh
nagios 24969 0.0 0.0 10016 780 ? S Feb03 5:12 /usr/local/nagios/bin/nagios --worker /usr/local/nagios/var/rw/nagios.qh
nagios 24970 0.0 0.0 10016 780 ? S Feb03 5:44 /usr/local/nagios/bin/nagios --worker /usr/local/nagios/var/rw/nagios.qh
nagios 24971 0.0 0.0 10016 784 ? S Feb03 5:21 /usr/local/nagios/bin/nagios --worker /usr/local/nagios/var/rw/nagios.qh
nagios 24972 0.0 0.0 10016 776 ? S Feb03 4:09 /usr/local/nagios/bin/nagios --worker /usr/local/nagios/var/rw/nagios.qh
nagios 24975 0.0 0.0 50296 844 ? S Feb03 3:42 /usr/local/nagios/bin/ndo2db -c /usr/local/nagios/etc/ndo2db.cfg
nagios 24976 0.0 0.0 50432 1048 ? S Feb03 7:39 /usr/local/nagios/bin/ndo2db -c /usr/local/nagios/etc/ndo2db.cfg
nagios 24986 0.0 0.0 22336 284 ? S Feb03 2:28 /usr/local/nagios/bin/nagios -d /usr/local/nagios/etc/nagios.cfg
postfix 28212 0.0 0.0 81408 3804 ? S 11:28 0:00 pickup -l -t fifo -u
The other VM (2 vCPUs) was totally frozen, so I shut it down and moved it to another datastore (in case I have disk I/O issues). I think that with the VMs on different datastores we could rule out an I/O issue, right?
So let's see what you can suggest from this.
Cheers,
Re: Nagios 2014R2.3 on VM HIGH LOAD SPIKES
Posted: Mon Feb 09, 2015 4:51 pm
by lmiltchev
Can you run the following command and report any errors?
Code: Select all
LANG=C LC_ALL=C /usr/bin/mrtg /etc/mrtg/mrtg.cfg --lock-file /var/lock/mrtg/mrtg_l --confcache-file /var/lib/mrtg/mrtg.ok
Also, I noticed the following message:
Code: Select all
psql: FATAL: sorry, too many clients already
You can try to increase the max_connections number in the "/var/lib/pgsql/data/postgresql.conf" file. Are you using the "default" setting of 100? What is the output of the following command?
Code: Select all
echo 'show max_connections;' | psql nagiosxi nagiosxi
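To see how close the server gets to that limit, one option is to count the client backends open at a given moment. A hedged sketch: the helper name `count_pg_backends` is made up for this example, and it assumes the backends show up in `ps aux` as `postgres: nagiosxi nagiosxi ...`, as they do in the output posted earlier in this thread.

```shell
# Count PostgreSQL client backends for a database user in `ps aux` output read from stdin.
# The user defaults to nagiosxi; pass another name as the first argument.
count_pg_backends() {
    # grep -c exits non-zero when the count is 0, so mask that status.
    grep -c "postgres: ${1:-nagiosxi} " || true
}

# Typical use:
#   ps aux | count_pg_backends
```

From inside the database, `select count(*) from pg_stat_activity;` piped into the same psql invocation should give roughly the same figure.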
Re: Nagios 2014R2.3 on VM HIGH LOAD SPIKES
Posted: Tue Feb 10, 2015 6:26 am
by carlos.atos
Hello lmiltchev,
The server went crazy again outside office hours, so I couldn't capture top at that moment. The 2 VMs are on different datastores, but both of them got a high spike. I will finally shut down the 2 vCPU VM to rule out any interference between them, although I don't think that is the cause. Could it be bothering the other VM?
Load 4vCPUs FEB-10.PNG
There are no new entries in /var/spool/mail/root.
About the commands that you asked me to run:
Code: Select all
[root@localhost ~]# LANG=C LC_ALL=C /usr/bin/mrtg /etc/mrtg/mrtg.cfg --lock-file /var/lock/mrtg/mrtg_l --confcache-file /var/lib/mrtg/mrtg.ok
[root@localhost ~]# echo 'show max_connections;' | psql nagiosxi nagiosxi
max_connections
-----------------
100
(1 row)
[root@localhost ~]#
I'm using the default values in postgresql.conf. To which value do you want me to increase max_connections?
Cheers
Re: Nagios 2014R2.3 on VM HIGH LOAD SPIKES
Posted: Tue Feb 10, 2015 5:41 pm
by abrist
carlos.atos wrote:I'm using the default values in postgresql.conf. To which value do you want me to increase max_connections?
Double it for good measure. Remember to restart postgresql afterwards; max_connections is only read when the PostgreSQL server starts.
I think Ludmil was curious about issues with mrtg, so let's time the script to see if it is taking longer than expected:
Code: Select all
LANG=C LC_ALL=C time /usr/bin/mrtg /etc/mrtg/mrtg.cfg --lock-file /var/lock/mrtg/mrtg_l --confcache-file /var/lib/mrtg/mrtg.ok
As of late, how consistent is the spike? Is it on a predictable interval or at a specific time?
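Since the spikes so far have happened outside office hours, one way to answer this without watching the console is to let the machine record a snapshot whenever the load crosses a threshold. A minimal sketch, not a tested monitoring tool: the function name `load_exceeds`, the threshold of 10, and the log path are all placeholders chosen for this example.

```shell
# Succeed when the 1-minute load average (first field) is above the given threshold.
# Reads a /proc/loadavg style line on stdin so it can be tried with sample data.
load_exceeds() {
    awk -v limit="$1" '($1 + 0) > (limit + 0) { found = 1 } END { exit !found }'
}

# Hypothetical capture loop (threshold and log path are placeholders):
#   while sleep 60; do
#       if load_exceeds 10 < /proc/loadavg; then
#           { date; cat /proc/loadavg; ps aux; } >> /tmp/load_spike.log
#       fi
#   done
```

The timestamps in the resulting log would show whether the spikes line up with a fixed schedule, and the `ps aux` snapshots would show what was running at the time.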
Re: Nagios 2014R2.3 on VM HIGH LOAD SPIKES
Posted: Wed Feb 11, 2015 6:02 am
by carlos.atos
Hello abrist,
max_connections has been set to 200 in /var/lib/pgsql/data/postgresql.conf.
The mysqld service was restarted after this.
About the other script, this is what I got:
Code: Select all
[root@localhost ~]# LANG=C LC_ALL=C time /usr/bin/mrtg /etc/mrtg/mrtg.cfg --lock-file /var/lock/mrtg/mrtg_l --confcache-file /var/lib/mrtg/mrtg.ok
-bash: time: command not found
[root@localhost ~]#
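The `time: command not found` error above is expected in bash: `time` is a shell keyword only when it is the first word of the command, so after the `LANG=C LC_ALL=C` assignments bash looks for an external `time` program, which does not appear to be installed here. Moving the keyword to the front, as in `time LANG=C LC_ALL=C /usr/bin/mrtg ...`, should work. Alternatively, a small wrapper can do the timing itself; `run_timed` below is a made-up helper name, and it only measures whole seconds.

```shell
# Run a command under the C locale and report how long it took (whole seconds).
# The report goes to stderr so the command's own stdout is left untouched.
run_timed() {
    rt_start=$(date +%s)
    rt_status=0
    LANG=C LC_ALL=C "$@" || rt_status=$?
    rt_end=$(date +%s)
    echo "elapsed: $((rt_end - rt_start))s, exit status: $rt_status" >&2
    return "$rt_status"
}

# Example, using the same mrtg invocation as in this thread:
#   run_timed /usr/bin/mrtg /etc/mrtg/mrtg.cfg --lock-file /var/lock/mrtg/mrtg_l --confcache-file /var/lib/mrtg/mrtg.ok
```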
abrist wrote:As of late, how consistent is the spike? Is it on a predictable interval or at a specific time?
I would say the load went up to 300 and then I lost the graphs for some hours (I think because of the high load). It has occurred since this Saturday the 8th, from 11 pm until 2 am, and then again on Monday at similar hours. We could dare to say that it may occur again tonight; I'll try to monitor it and check for the spike.
This is the graph for this week.
localhost-current_load week5-11 FEB.jpg
Cheers,
Re: Nagios 2014R2.3 on VM HIGH LOAD SPIKES
Posted: Wed Feb 11, 2015 11:08 am
by scottwilkerson
Are these VMs running on a server that has other VMs that could have scheduled jobs or something else monopolizing the resources on the machine?
Re: Nagios 2014R2.3 on VM HIGH LOAD SPIKES
Posted: Thu Feb 12, 2015 8:13 am
by carlos.atos
Hi guys,
No good news today: there was a peak of nearly 100 in load from 4 am to 8 am. Unfortunately I wasn't at the office to connect and check top or ps aux. The load was at 2.46 when I took the following, so you can see what you can analyze from it:
Code: Select all
top - 09:20:42 up 8 days, 21:56, 2 users, load average: 2.46, 1.64, 1.77
Tasks: 197 total, 2 running, 195 sleeping, 0 stopped, 0 zombie
Cpu(s): 8.4%us, 0.7%sy, 0.0%ni, 90.8%id, 0.1%wa, 0.0%hi, 0.0%si, 0.0%st
Mem: 3924212k total, 1624644k used, 2299568k free, 43228k buffers
Swap: 2064380k total, 43532k used, 2020848k free, 136916k cached
PID USER PR NI VIRT RES SHR S %CPU %MEM TIME+ COMMAND
11357 apache 20 0 454m 34m 8308 S 8.3 0.9 9:28.06 httpd
11620 postgres 20 0 212m 5252 3720 S 8.0 0.1 1:30.63 postmaster
24327 postgres 20 0 212m 5240 3708 R 7.0 0.1 0:57.22 postmaster
24282 apache 20 0 454m 34m 8280 S 5.3 0.9 4:36.85 httpd
5074 apache 20 0 454m 34m 8316 S 4.0 0.9 7:12.62 httpd
14247 apache 20 0 454m 34m 8284 S 3.6 0.9 7:40.70 httpd
22 root 20 0 0 0 0 S 0.3 0.0 5:10.66 events/3
9365 mysql 20 0 2203m 41m 4948 S 0.3 1.1 13:52.40 mysqld
13928 nagios 20 0 312m 22m 8064 S 0.3 0.6 0:01.77 php
13988 postgres 20 0 212m 5324 3752 S 0.3 0.1 0:00.37 postmaster
13992 postgres 20 0 212m 5520 3944 S 0.3 0.1 0:00.82 postmaster
14152 root 20 0 15128 1360 964 R 0.3 0.0 0:00.20 top
27755 apache 20 0 454m 34m 8252 S 0.3 0.9 5:12.75 httpd
1 root 20 0 19360 1020 856 S 0.0 0.0 3:19.37 init
2 root 20 0 0 0 0 S 0.0 0.0 0:00.01 kthreadd
3 root RT 0 0 0 0 S 0.0 0.0 1:26.19 migration/0
4 root 20 0 0 0 0 S 0.0 0.0 0:13.11 ksoftirqd/0
5 root RT 0 0 0 0 S 0.0 0.0 0:00.00 stopper/0
6 root RT 0 0 0 0 S 0.0 0.0 0:08.68 watchdog/0
7 root RT 0 0 0 0 S 0.0 0.0 0:31.46 migration/1
8 root RT 0 0 0 0 S 0.0 0.0 0:00.00 stopper/1
9 root 20 0 0 0 0 S 0.0 0.0 0:20.16 ksoftirqd/1
10 root RT 0 0 0 0 S 0.0 0.0 0:07.96 watchdog/1
11 root RT 0 0 0 0 S 0.0 0.0 1:32.61 migration/2
12 root RT 0 0 0 0 S 0.0 0.0 0:00.00 stopper/2
13 root 20 0 0 0 0 S 0.0 0.0 0:14.34 ksoftirqd/2
14 root RT 0 0 0 0 S 0.0 0.0 0:07.34 watchdog/2
15 root RT 0 0 0 0 S 0.0 0.0 0:34.85 migration/3
16 root RT 0 0 0 0 S 0.0 0.0 0:00.00 stopper/3
17 root 20 0 0 0 0 S 0.0 0.0 0:17.85 ksoftirqd/3
18 root RT 0 0 0 0 S 0.0 0.0 0:08.24 watchdog/3
19 root 20 0 0 0 0 S 0.0 0.0 3:39.24 events/0
20 root 20 0 0 0 0 S 0.0 0.0 6:24.68 events/1
21 root 20 0 0 0 0 S 0.0 0.0 4:16.54 events/2
23 root 20 0 0 0 0 S 0.0 0.0 0:00.00 cgroup
24 root 20 0 0 0 0 S 0.0 0.0 0:00.17 khelper
25 root 20 0 0 0 0 S 0.0 0.0 0:00.00 netns
26 root 20 0 0 0 0 S 0.0 0.0 0:00.00 async/mgr
27 root 20 0 0 0 0 S 0.0 0.0 0:00.00 pm
28 root 20 0 0 0 0 S 0.0 0.0 0:15.63 sync_supers
29 root 20 0 0 0 0 S 0.0 0.0 0:13.72 bdi-default
30 root 20 0 0 0 0 S 0.0 0.0 0:00.00 kintegrityd/0
31 root 20 0 0 0 0 S 0.0 0.0 0:00.00 kintegrityd/1
32 root 20 0 0 0 0 S 0.0 0.0 0:00.00 kintegrityd/2
33 root 20 0 0 0 0 S 0.0 0.0 0:00.00 kintegrityd/3
34 root 20 0 0 0 0 S 0.0 0.0 2:11.62 kblockd/0
ps aux
Code: Select all
USER PID %CPU %MEM VSZ RSS TTY STAT START TIME COMMAND
root 1 0.0 0.0 19360 1020 ? Ss Feb03 3:19 /sbin/init
root 2 0.0 0.0 0 0 ? S Feb03 0:00 [kthreadd]
root 3 0.0 0.0 0 0 ? S Feb03 1:26 [migration/0]
root 4 0.0 0.0 0 0 ? S Feb03 0:13 [ksoftirqd/0]
root 5 0.0 0.0 0 0 ? S Feb03 0:00 [stopper/0]
root 6 0.0 0.0 0 0 ? S Feb03 0:08 [watchdog/0]
root 7 0.0 0.0 0 0 ? S Feb03 0:31 [migration/1]
root 8 0.0 0.0 0 0 ? S Feb03 0:00 [stopper/1]
root 9 0.0 0.0 0 0 ? S Feb03 0:20 [ksoftirqd/1]
root 10 0.0 0.0 0 0 ? S Feb03 0:07 [watchdog/1]
root 11 0.0 0.0 0 0 ? S Feb03 1:32 [migration/2]
root 12 0.0 0.0 0 0 ? S Feb03 0:00 [stopper/2]
root 13 0.0 0.0 0 0 ? S Feb03 0:14 [ksoftirqd/2]
root 14 0.0 0.0 0 0 ? S Feb03 0:07 [watchdog/2]
root 15 0.0 0.0 0 0 ? S Feb03 0:34 [migration/3]
root 16 0.0 0.0 0 0 ? S Feb03 0:00 [stopper/3]
root 17 0.0 0.0 0 0 ? S Feb03 0:17 [ksoftirqd/3]
root 18 0.0 0.0 0 0 ? S Feb03 0:08 [watchdog/3]
root 19 0.0 0.0 0 0 ? S Feb03 3:39 [events/0]
root 20 0.0 0.0 0 0 ? S Feb03 6:24 [events/1]
root 21 0.0 0.0 0 0 ? S Feb03 4:16 [events/2]
root 22 0.0 0.0 0 0 ? S Feb03 5:10 [events/3]
root 23 0.0 0.0 0 0 ? S Feb03 0:00 [cgroup]
root 24 0.0 0.0 0 0 ? S Feb03 0:00 [khelper]
root 25 0.0 0.0 0 0 ? S Feb03 0:00 [netns]
root 26 0.0 0.0 0 0 ? S Feb03 0:00 [async/mgr]
root 27 0.0 0.0 0 0 ? S Feb03 0:00 [pm]
root 28 0.0 0.0 0 0 ? S Feb03 0:15 [sync_supers]
root 29 0.0 0.0 0 0 ? S Feb03 0:13 [bdi-default]
root 30 0.0 0.0 0 0 ? S Feb03 0:00 [kintegrityd/0]
root 31 0.0 0.0 0 0 ? S Feb03 0:00 [kintegrityd/1]
root 32 0.0 0.0 0 0 ? S Feb03 0:00 [kintegrityd/2]
root 33 0.0 0.0 0 0 ? S Feb03 0:00 [kintegrityd/3]
root 34 0.0 0.0 0 0 ? S Feb03 2:11 [kblockd/0]
root 35 0.0 0.0 0 0 ? S Feb03 1:20 [kblockd/1]
root 36 0.0 0.0 0 0 ? S Feb03 2:03 [kblockd/2]
root 37 0.0 0.0 0 0 ? S Feb03 1:17 [kblockd/3]
root 38 0.0 0.0 0 0 ? S Feb03 0:00 [kacpid]
root 39 0.0 0.0 0 0 ? S Feb03 0:00 [kacpi_notify]
root 40 0.0 0.0 0 0 ? S Feb03 0:00 [kacpi_hotplug]
root 41 0.0 0.0 0 0 ? S Feb03 0:00 [ata_aux]
root 42 0.0 0.0 0 0 ? S Feb03 0:00 [ata_sff/0]
root 43 0.0 0.0 0 0 ? S Feb03 0:00 [ata_sff/1]
root 44 0.0 0.0 0 0 ? S Feb03 0:00 [ata_sff/2]
root 45 0.0 0.0 0 0 ? S Feb03 0:00 [ata_sff/3]
root 46 0.0 0.0 0 0 ? S Feb03 0:00 [ksuspend_usbd]
root 47 0.0 0.0 0 0 ? S Feb03 0:00 [khubd]
root 48 0.0 0.0 0 0 ? S Feb03 0:00 [kseriod]
root 49 0.0 0.0 0 0 ? S Feb03 0:00 [md/0]
root 50 0.0 0.0 0 0 ? S Feb03 0:00 [md/1]
root 51 0.0 0.0 0 0 ? S Feb03 0:00 [md/2]
root 52 0.0 0.0 0 0 ? S Feb03 0:00 [md/3]
root 53 0.0 0.0 0 0 ? S Feb03 0:00 [md_misc/0]
root 54 0.0 0.0 0 0 ? S Feb03 0:00 [md_misc/1]
root 55 0.0 0.0 0 0 ? S Feb03 0:00 [md_misc/2]
root 56 0.0 0.0 0 0 ? S Feb03 0:00 [md_misc/3]
root 57 0.0 0.0 0 0 ? S Feb03 0:00 [linkwatch]
root 58 0.0 0.0 0 0 ? S Feb03 0:24 [khungtaskd]
root 59 0.1 0.0 0 0 ? S Feb03 16:37 [kswapd0]
root 60 0.0 0.0 0 0 ? SN Feb03 0:00 [ksmd]
root 61 0.0 0.0 0 0 ? SN Feb03 2:58 [khugepaged]
root 62 0.0 0.0 0 0 ? S Feb03 0:00 [aio/0]
root 63 0.0 0.0 0 0 ? S Feb03 0:00 [aio/1]
root 64 0.0 0.0 0 0 ? S Feb03 0:00 [aio/2]
root 65 0.0 0.0 0 0 ? S Feb03 0:00 [aio/3]
root 66 0.0 0.0 0 0 ? S Feb03 0:00 [crypto/0]
root 67 0.0 0.0 0 0 ? S Feb03 0:00 [crypto/1]
root 68 0.0 0.0 0 0 ? S Feb03 0:00 [crypto/2]
root 69 0.0 0.0 0 0 ? S Feb03 0:00 [crypto/3]
root 77 0.0 0.0 0 0 ? S Feb03 0:00 [kthrotld/0]
root 78 0.0 0.0 0 0 ? S Feb03 0:00 [kthrotld/1]
root 79 0.0 0.0 0 0 ? S Feb03 0:00 [kthrotld/2]
root 80 0.0 0.0 0 0 ? S Feb03 0:00 [kthrotld/3]
root 81 0.0 0.0 0 0 ? S Feb03 0:00 [pciehpd]
root 83 0.0 0.0 0 0 ? S Feb03 0:00 [kpsmoused]
root 84 0.0 0.0 0 0 ? S Feb03 0:00 [usbhid_resumer]
root 85 0.0 0.0 0 0 ? S Feb03 0:00 [deferwq]
root 116 0.0 0.0 0 0 ? S Feb03 0:00 [kdmremove]
root 117 0.0 0.0 0 0 ? S Feb03 0:00 [kstriped]
root 283 0.0 0.0 0 0 ? S Feb03 1:02 [mpt_poll_0]
root 284 0.0 0.0 0 0 ? S Feb03 0:00 [mpt/0]
root 309 0.0 0.0 0 0 ? S Feb03 0:00 [scsi_eh_0]
root 318 0.0 0.0 0 0 ? S Feb03 0:00 [scsi_eh_1]
root 319 0.0 0.0 0 0 ? S Feb03 0:00 [scsi_eh_2]
root 422 0.0 0.0 0 0 ? S Feb03 0:00 [kdmflush]
root 424 0.0 0.0 0 0 ? S Feb03 0:00 [kdmflush]
root 442 0.0 0.0 0 0 ? R Feb03 5:21 [jbd2/dm-0-8]
root 443 0.0 0.0 0 0 ? S Feb03 0:00 [ext4-dio-unwrit]
root 516 0.0 0.0 11028 260 ? S<s Feb03 0:00 /sbin/udevd -d
root 712 0.0 0.0 0 0 ? S Feb03 0:48 [vmmemctl]
root 720 0.0 0.0 0 0 ? S Feb03 2:53 [flush-253:0]
root 858 0.0 0.0 11028 244 ? S< Feb03 0:00 /sbin/udevd -d
root 865 0.0 0.0 10636 248 ? S< Feb03 0:00 /sbin/udevd -d
root 889 0.0 0.0 0 0 ? S Feb03 0:00 [jbd2/sda1-8]
root 890 0.0 0.0 0 0 ? S Feb03 0:00 [ext4-dio-unwrit]
root 928 0.0 0.0 0 0 ? S Feb03 2:58 [kauditd]
root 1175 0.1 0.0 93144 724 ? S<sl Feb03 17:39 auditd
root 1199 0.1 0.1 249476 7016 ? Sl Feb03 21:51 /sbin/rsyslogd -i /var/run/syslogd.pid -c 5
dbus 1212 0.0 0.0 21532 448 ? Ss Feb03 0:00 dbus-daemon --system
root 1401 0.0 0.0 66688 560 ? Ss Feb03 1:14 /usr/sbin/sshd
root 1410 0.0 0.0 22188 620 ? Ss Feb03 0:00 xinetd -stayalive -pidfile /var/run/xinetd.pid
root 1420 0.3 0.0 189580 2224 ? Sl Feb03 39:08 /usr/sbin/vmtoolsd
ntp 1425 0.0 0.0 30732 1448 ? Ss Feb03 2:01 ntpd -u ntp:ntp -p /var/run/ntpd.pid -g
postgres 1631 0.1 0.1 216356 4156 ? S Feb03 25:37 /usr/bin/postmaster -p 5432 -D /var/lib/pgsql/data
postgres 1702 0.0 0.0 179368 668 ? Ss Feb03 2:59 postgres: logger process
postgres 1713 0.0 0.0 216472 2676 ? Ss Feb03 10:24 postgres: writer process
postgres 1714 0.0 0.0 216356 996 ? Ss Feb03 7:32 postgres: wal writer process
postgres 1715 0.0 0.0 216644 1164 ? Ss Feb03 7:53 postgres: autovacuum launcher process
postgres 1716 0.1 0.0 179636 868 ? Ss Feb03 13:29 postgres: stats collector process
root 1717 0.0 0.0 81328 2784 ? Ss Feb03 5:06 /usr/libexec/postfix/master
postfix 1727 0.0 0.0 83144 3052 ? S Feb03 3:40 qmgr -l -t fifo -u
root 1762 0.1 0.0 117336 740 ? Ss Feb03 14:24 crond
nagios 1772 0.1 0.0 368888 1020 ? S Feb03 24:24 /usr/local/nagios/bin/npcd -d -f /usr/local/nagios/etc/pnp/npcd.cfg
ajaxterm 1780 0.1 0.0 170340 1612 ? Sl Feb03 16:08 python /usr/share/ajaxterm/ajaxterm.py --daemon --port=8022 --uid=ajaxterm
nagios 1852 0.0 0.0 50296 268 ? Ss Feb03 0:00 /usr/local/nagios/bin/ndo2db -c /usr/local/nagios/etc/ndo2db.cfg
root 1861 0.0 0.0 67552 1360 ? Ss Feb03 0:00 login -- root
root 1863 0.0 0.0 4064 488 tty2 Ss+ Feb03 0:00 /sbin/mingetty /dev/tty2
root 1865 0.0 0.0 4064 488 tty3 Ss+ Feb03 0:00 /sbin/mingetty /dev/tty3
root 1867 0.0 0.0 4064 488 tty4 Ss+ Feb03 0:00 /sbin/mingetty /dev/tty4
root 1869 0.0 0.0 4064 488 tty5 Ss+ Feb03 0:00 /sbin/mingetty /dev/tty5
root 1871 0.0 0.0 4064 488 tty6 Ss+ Feb03 0:00 /sbin/mingetty /dev/tty6
root 2068 0.0 0.0 2085024 1212 ? Sl Feb03 0:00 /usr/sbin/console-kit-daemon --no-daemon
root 2138 0.0 0.0 108304 1340 tty1 Ss+ Feb03 0:00 -bash
root 9257 0.0 0.0 108168 1296 ? S Feb11 0:00 /bin/sh /usr/bin/mysqld_safe --datadir=/var/lib/mysql --socket=/var/lib/my
mysql 9365 1.0 1.0 2256396 42284 ? Sl Feb11 13:54 /usr/libexec/mysqld --basedir=/usr --datadir=/var/lib/mysql --user=mysql -
root 13940 0.4 0.1 100448 4412 ? Ss 09:20 0:01 sshd: root@pts/0
root 14042 0.2 0.0 108300 1876 pts/0 Ss 09:20 0:00 -bash
postfix 14711 0.2 0.1 81612 4076 ? S 09:22 0:00 smtp -t unix -u
postfix 14716 0.2 0.1 81612 4080 ? S 09:22 0:00 smtp -t unix -u
postfix 14717 0.1 0.1 81612 4076 ? S 09:22 0:00 smtp -t unix -u
postfix 14718 0.1 0.1 81612 4080 ? S 09:22 0:00 smtp -t unix -u
nagios 15186 0.7 0.0 22852 1820 ? Ss 09:23 0:00 /usr/local/nagios/bin/nagios -d /usr/local/nagios/etc/nagios.cfg
nagios 15188 0.1 0.0 10016 920 ? S 09:23 0:00 /usr/local/nagios/bin/nagios --worker /usr/local/nagios/var/rw/nagios.qh
nagios 15189 0.1 0.0 10016 920 ? S 09:23 0:00 /usr/local/nagios/bin/nagios --worker /usr/local/nagios/var/rw/nagios.qh
nagios 15190 0.1 0.0 10016 920 ? S 09:23 0:00 /usr/local/nagios/bin/nagios --worker /usr/local/nagios/var/rw/nagios.qh
nagios 15191 0.1 0.0 10016 912 ? S 09:23 0:00 /usr/local/nagios/bin/nagios --worker /usr/local/nagios/var/rw/nagios.qh
nagios 15192 0.2 0.0 10016 924 ? S 09:23 0:00 /usr/local/nagios/bin/nagios --worker /usr/local/nagios/var/rw/nagios.qh
nagios 15193 0.0 0.0 10016 912 ? S 09:23 0:00 /usr/local/nagios/bin/nagios --worker /usr/local/nagios/var/rw/nagios.qh
nagios 15194 0.5 0.0 50296 1208 ? S 09:23 0:00 /usr/local/nagios/bin/ndo2db -c /usr/local/nagios/etc/ndo2db.cfg
nagios 15195 0.8 0.0 50432 1384 ? S 09:23 0:00 /usr/local/nagios/bin/ndo2db -c /usr/local/nagios/etc/ndo2db.cfg
nagios 15199 0.0 0.0 22336 828 ? S 09:23 0:00 /usr/local/nagios/bin/nagios -d /usr/local/nagios/etc/nagios.cfg
root 15282 0.1 0.0 140224 1336 ? S 09:24 0:00 CROND
root 15284 0.2 0.0 140224 1336 ? S 09:24 0:00 CROND
root 15285 0.1 0.0 140224 1332 ? S 09:24 0:00 CROND
root 15286 0.1 0.0 140224 1336 ? S 09:24 0:00 CROND
nagios 15289 0.2 0.0 106060 1264 ? Ss 09:24 0:00 /bin/sh -c /usr/bin/php -q /usr/local/nagiosxi/cron/perfdataproc.php > /us
nagios 15290 0.0 0.0 106060 1264 ? Ss 09:24 0:00 /bin/sh -c /usr/bin/php -q /usr/local/nagiosxi/cron/sysstat.php > /usr/loc
nagios 15292 0.0 0.0 106060 1268 ? Ss 09:24 0:00 /bin/sh -c /usr/bin/php -q /usr/local/nagiosxi/cron/cmdsubsys.php > /usr/l
nagios 15293 5.2 0.5 319672 23120 ? S 09:24 0:01 /usr/bin/php -q /usr/local/nagiosxi/cron/sysstat.php
nagios 15294 1.8 0.5 319644 22820 ? S 09:24 0:00 /usr/bin/php -q /usr/local/nagiosxi/cron/cmdsubsys.php
nagios 15297 0.1 0.0 106060 1268 ? Ss 09:24 0:00 /bin/sh -c /usr/bin/php -q /usr/local/nagiosxi/cron/eventman.php > /usr/lo
nagios 15298 4.5 0.5 319936 23024 ? S 09:24 0:01 /usr/bin/php -q /usr/local/nagiosxi/cron/perfdataproc.php
nagios 15301 2.1 0.7 327476 30736 ? S 09:24 0:00 /usr/bin/php -q /usr/local/nagiosxi/cron/eventman.php
postgres 15308 1.7 0.1 217724 4992 ? Ss 09:24 0:00 postgres: nagiosxi nagiosxi [local] idle
postgres 15309 1.0 0.1 217788 5316 ? Ss 09:24 0:00 postgres: nagiosxi nagiosxi [local] idle
postgres 15318 0.9 0.1 217740 5492 ? Ss 09:24 0:00 postgres: nagiosxi nagiosxi [local] idle
postgres 15336 1.2 0.1 217724 4984 ? Ss 09:24 0:00 postgres: nagiosxi nagiosxi [local] idle
root 15524 0.0 0.0 110232 1156 pts/0 R+ 09:24 0:00 ps aux
postfix 21829 0.0 0.0 81408 3840 ? S 08:00 0:00 pickup -l -t fifo -u
There were no new messages in /var/spool/mail/root other than the last ones.
About this question:
scottwilkerson wrote:Are these VM's running on server that has other VM's that could have scheduled jobs or something monopolizing the resources on the machine?
There was another Nagios VM (an identical copy using 2 vCPUs and another HDD), but I shut it down days ago. Both showed the same load behaviour, which is why I shut it down; I was thinking the two might be interfering with each other.
What else do you think?
Cheers,
Re: Nagios 2014R2.3 on VM HIGH LOAD SPIKES
Posted: Thu Feb 12, 2015 8:23 am
by WillemDH
Maybe, as a test, you could disable the services and hosts making use of mrtg and temporarily stop mrtg. If you still get load spikes, at least you know it's not caused by mrtg. Maybe wait for Nagios support to react to see if they think it's a good idea.
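To compare spikes with and without mrtg running, it may help to record which processes are busiest at the moment the load goes up. Below is a minimal sketch, not an official Nagios tool: the function names `top_cpu` and `snapshot_if_high` and the threshold value are hypothetical, and it assumes the standard `ps aux` column layout shown in the output above.

```python
import os
import subprocess


def top_cpu(ps_aux_text, n=5):
    """Return the n busiest processes from `ps aux` text.

    Each result is a (cpu_percent, pid, user, command) tuple.
    Assumed column order: USER PID %CPU %MEM VSZ RSS TTY STAT START TIME COMMAND
    """
    rows = []
    for line in ps_aux_text.splitlines():
        parts = line.split(None, 10)  # keep the full command as one field
        if len(parts) < 11:
            continue
        try:
            pid = int(parts[1])
            cpu = float(parts[2])
        except ValueError:
            continue  # header line or anything that is not a process row
        rows.append((cpu, pid, parts[0], parts[10]))
    rows.sort(key=lambda r: r[0], reverse=True)
    return rows[:n]


def snapshot_if_high(threshold=4.0, n=5):
    """If the 1-minute load average is at or above threshold (an arbitrary
    example value), return the busiest processes; otherwise return []."""
    one_min = os.getloadavg()[0]
    if one_min < threshold:
        return []
    out = subprocess.run(["ps", "aux"], capture_output=True, text=True).stdout
    return top_cpu(out, n)
```

Calling `snapshot_if_high()` every few seconds and appending the result to a file would give one record per spike. If the same php cron jobs top the list while mrtg is stopped, mrtg is probably not the cause.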