Hi Guys,
We are getting a lot of these warnings:
Notification Type: PROBLEM
Service: Total Processes
Host: localhost
Address: 127.0.0.1
State: WARNING
Info:
PROCS WARNING: 269 processes with STATE = RSZDT
Date/Time: 07/03/2013 08:51:07
The load average on the box is around 1.5 to 2, so it is not really under stress. This seems to have started happening recently. Nothing looks odd in the output of top.
So I am wondering what's going on.
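For anyone reading along, here is roughly how a number like "269 processes with STATE = RSZDT" turns into a WARNING. This is only a sketch in Python, not the plugin's actual code: `count_matching` and `classify` are made-up helper names, the 250/400 thresholds are an assumption (check the service definition on your own server), and the "strictly greater than" comparison is also an assumption about how plain thresholds are evaluated.

```python
def count_matching(states, wanted="RSZDT"):
    """Count processes whose primary state letter is in `wanted`.

    `states` is a list of ps STAT strings such as "Ss" or "R+";
    only the first character is treated as the process state.
    """
    return sum(1 for s in states if s and s[0] in wanted)


def classify(count, warn=250, crit=400):
    """Map a process count to a Nagios-style status string.

    Assumptions: 250/400 are illustrative thresholds, and an alert
    fires only when the count is strictly greater than the threshold.
    """
    if count > crit:
        return "CRITICAL"
    if count > warn:
        return "WARNING"
    return "OK"


if __name__ == "__main__":
    # "X" (dead) is not one of R, S, Z, D, T, so it is not counted.
    sample_states = ["Ss", "S<", "R+", "Z", "D", "X"]
    print(count_matching(sample_states))
    print(classify(269))
```

Since nearly every process on a typical box is in one of the R, S, Z, D or T states, the check is effectively counting all processes, which is why the number can climb without the load average moving much.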
Getting a lot of total process warning from XI Server
slansing
Re: Getting a lot of total process warning from XI Server
Hello arnab,
Has your team made any major system changes to the XI server recently, such as hosting another major piece of software on it?
This could also be caused by an increase in the number of checks run by the Nagios process, since it forks a child process for each check as that check comes due.
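To put a number on the forking, one option is to count how many processes descend from the main Nagios daemon at a given moment. The Python below is a rough sketch under some assumptions: the input is in `ps -ef` column order (UID PID PPID C STIME TTY TIME CMD), `parse_ps_ef` and `descendants` are hypothetical helper names, and the sample rows are taken from the `ps -aef` listing posted further down this thread.

```python
from collections import defaultdict

SAMPLE = """\
UID PID PPID C STIME TTY TIME CMD
nagios 14553 1 1 Mar14 ? 01:56:01 /usr/local/nagios/bin/nagios -d
nagios 12133 14553 0 17:18 ? 00:00:00 /usr/local/nagios/bin/nagios -d
root 12134 12133 0 17:18 ? 00:00:00 sudo /usr/local/nagios/libexec/c
root 12135 12134 0 17:18 ? 00:00:00 /usr/local/nagios/libexec/check_
root 12136 12135 0 17:18 ? 00:00:00 /usr/bin/ssh 2.0.1.164 /root/RMS
root 4382 1 0 Mar07 ? 00:00:18 crond
"""


def parse_ps_ef(text):
    """Parse `ps -ef` style rows into (uid, pid, ppid, cmd) tuples."""
    rows = []
    for line in text.strip().splitlines():
        parts = line.split(None, 7)  # UID PID PPID C STIME TTY TIME CMD
        if len(parts) < 8 or not parts[1].isdigit():
            continue  # skip the header row and anything malformed
        rows.append((parts[0], int(parts[1]), int(parts[2]), parts[7]))
    return rows


def descendants(rows, root_pid):
    """Return the PIDs of every child, grandchild, etc. of `root_pid`."""
    children = defaultdict(list)
    for _uid, pid, ppid, _cmd in rows:
        children[ppid].append(pid)
    found, stack = [], [root_pid]
    while stack:
        for child in children.get(stack.pop(), []):
            found.append(child)
            stack.append(child)
    return found


if __name__ == "__main__":
    rows = parse_ps_ef(SAMPLE)
    # 14553 is the main nagios daemon in the sample rows above.
    print(len(descendants(rows, 14553)))
```

In the sample rows a single check accounts for four processes (the forked nagios child, sudo, the check script and ssh), so a burst of checks coming due together can raise the total quickly.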
Re: Getting a lot of total process warning from XI Server
Hi Slan,
No, not at all. We are seeing a lot of these on both of our XI servers.
Re: Getting a lot of total process warning from XI Server
Could you run the following command (when the total process count is high) and post the output in code wraps?
Code:
ps -aef
Former Nagios employee
"It is turtles. All. The. Way. Down. . . .and maybe an elephant or two."
VI VI VI - The editor of the Beast!
Come to the Dark Side.
Re: Getting a lot of total process warning from XI Server
Sorry guys, this got left behind as I got busy with some other stuff.
Code:
UID PID PPID C STIME TTY TIME CMD
root 1 0 0 Mar07 ? 00:00:07 init [3]
root 2 1 0 Mar07 ? 00:00:47 [migration/0]
root 3 1 0 Mar07 ? 00:00:20 [ksoftirqd/0]
root 4 1 0 Mar07 ? 00:00:37 [migration/1]
root 5 1 0 Mar07 ? 00:00:59 [ksoftirqd/1]
root 6 1 0 Mar07 ? 00:00:38 [migration/2]
root 7 1 0 Mar07 ? 00:00:01 [ksoftirqd/2]
root 8 1 0 Mar07 ? 00:00:39 [migration/3]
root 9 1 0 Mar07 ? 00:00:00 [ksoftirqd/3]
root 10 1 0 Mar07 ? 00:10:02 [events/0]
root 11 1 0 Mar07 ? 00:00:08 [events/1]
root 12 1 0 Mar07 ? 00:00:01 [events/2]
root 13 1 0 Mar07 ? 00:00:01 [events/3]
root 14 1 0 Mar07 ? 00:00:02 [khelper]
root 87 1 0 Mar07 ? 00:00:00 [kthread]
root 94 87 0 Mar07 ? 00:00:44 [kblockd/0]
root 95 87 0 Mar07 ? 00:00:56 [kblockd/1]
root 96 87 0 Mar07 ? 00:00:30 [kblockd/2]
root 97 87 0 Mar07 ? 00:00:29 [kblockd/3]
root 98 87 0 Mar07 ? 00:00:00 [kacpid]
root 261 87 0 Mar07 ? 00:00:00 [cqueue/0]
root 262 87 0 Mar07 ? 00:00:00 [cqueue/1]
root 263 87 0 Mar07 ? 00:00:00 [cqueue/2]
root 264 87 0 Mar07 ? 00:00:00 [cqueue/3]
root 267 87 0 Mar07 ? 00:00:00 [khubd]
root 269 87 0 Mar07 ? 00:00:00 [kseriod]
root 363 87 0 Mar07 ? 00:00:00 [khungtaskd]
root 364 87 0 Mar07 ? 00:00:00 [pdflush]
root 365 87 0 Mar07 ? 00:37:06 [pdflush]
root 366 87 0 Mar07 ? 00:00:27 [kswapd0]
root 367 87 0 Mar07 ? 00:00:00 [aio/0]
root 368 87 0 Mar07 ? 00:00:00 [aio/1]
root 369 87 0 Mar07 ? 00:00:00 [aio/2]
root 370 87 0 Mar07 ? 00:00:00 [aio/3]
root 576 87 0 Mar07 ? 00:00:00 [kpsmoused]
root 645 87 0 Mar07 ? 00:00:08 [mpt_poll_0]
root 646 87 0 Mar07 ? 00:00:00 [mpt/0]
root 647 87 0 Mar07 ? 00:00:00 [scsi_eh_0]
root 653 87 0 Mar07 ? 00:00:00 [ata/0]
root 654 87 0 Mar07 ? 00:00:00 [ata/1]
root 655 87 0 Mar07 ? 00:00:00 [ata/2]
root 656 87 0 Mar07 ? 00:00:00 [ata/3]
root 657 87 0 Mar07 ? 00:00:00 [ata_aux]
root 668 87 0 Mar07 ? 00:00:00 [kstriped]
root 689 87 0 Mar07 ? 00:00:00 [ksnapd]
root 712 87 0 Mar07 ? 01:39:40 [kjournald]
root 737 87 0 Mar07 ? 00:00:32 [kauditd]
root 770 1 0 Mar07 ? 00:00:00 /sbin/udevd -d
apache 2051 4369 1 14:05 ? 00:01:59 /usr/sbin/httpd
postgres 2231 4297 0 14:05 ? 00:00:02 postgres: nagiosxi nagiosxi 127.
root 2274 87 0 Mar07 ? 00:00:00 [kmpathd/0]
root 2275 87 0 Mar07 ? 00:00:00 [kmpathd/1]
root 2276 87 0 Mar07 ? 00:00:00 [kmpathd/2]
root 2277 87 0 Mar07 ? 00:00:00 [kmpathd/3]
root 2278 87 0 Mar07 ? 00:00:00 [kmpath_handlerd]
root 2345 87 0 Mar07 ? 00:00:00 [kjournald]
root 2827 87 0 Mar07 ? 00:43:30 [vmmemctl]
root 2963 1 0 Mar07 ? 00:01:15 /usr/sbin/vmtoolsd
root 3046 87 0 Mar07 ? 00:00:00 [iscsi_eh]
root 3108 87 0 Mar07 ? 00:00:00 [cnic_wq]
root 3113 87 0 Mar07 ? 00:00:00 [bnx2i_thread/0]
root 3114 87 0 Mar07 ? 00:00:00 [bnx2i_thread/1]
root 3116 87 0 Mar07 ? 00:00:00 [bnx2i_thread/2]
root 3117 87 0 Mar07 ? 00:00:00 [bnx2i_thread/3]
root 3134 87 0 Mar07 ? 00:00:00 [ib_addr]
root 3149 87 0 Mar07 ? 00:00:00 [ib_mcast]
root 3150 87 0 Mar07 ? 00:00:00 [ib_inform]
root 3151 87 0 Mar07 ? 00:00:00 [local_sa]
root 3157 87 0 Mar07 ? 00:00:00 [iw_cm_wq]
root 3163 87 0 Mar07 ? 00:00:00 [ib_cm/0]
root 3165 87 0 Mar07 ? 00:00:00 [ib_cm/1]
root 3166 87 0 Mar07 ? 00:00:00 [ib_cm/2]
root 3167 87 0 Mar07 ? 00:00:00 [ib_cm/3]
root 3173 87 0 Mar07 ? 00:00:00 [rdma_cm]
root 3194 1 0 Mar07 ? 00:00:00 iscsiuio
root 3201 1 0 Mar07 ? 00:00:00 iscsid
root 3202 1 0 Mar07 ? 00:00:00 iscsid
root 3527 1 0 Mar07 ? 00:06:23 auditd
root 3529 3527 0 Mar07 ? 00:01:21 /sbin/audispd
root 3559 1 0 Mar07 ? 00:01:57 syslogd -m 0
root 3562 1 0 Mar07 ? 00:00:00 klogd -x
root 3666 1 0 Mar07 ? 00:00:45 irqbalance
rpc 3697 1 0 Mar07 ? 00:00:00 portmap
root 3734 87 0 Mar07 ? 00:00:00 [rpciod/0]
root 3735 87 0 Mar07 ? 00:00:00 [rpciod/1]
root 3736 87 0 Mar07 ? 00:00:00 [rpciod/2]
root 3737 87 0 Mar07 ? 00:00:00 [rpciod/3]
rpcuser 3746 1 0 Mar07 ? 00:00:00 rpc.statd
root 3783 1 0 Mar07 ? 00:00:00 rpc.idmapd
dbus 3813 1 0 Mar07 ? 00:00:00 dbus-daemon --system
root 3856 1 0 Mar07 ? 00:00:00 pcscd
root 3870 1 0 Mar07 ? 00:00:00 /usr/sbin/acpid
68 3883 1 0 Mar07 ? 00:00:41 hald
root 3884 3883 0 Mar07 ? 00:00:00 hald-runner
68 3892 3884 0 Mar07 ? 00:00:00 hald-addon-acpi: listening on ac
68 3898 3884 0 Mar07 ? 00:00:00 hald-addon-keyboard: listening o
root 3907 3884 0 Mar07 ? 00:01:43 hald-addon-storage: polling /dev
root 3945 1 0 Mar07 ? 00:00:00 /usr/bin/hidd --server
root 3993 1 0 Mar07 ? 00:00:01 automount --pid-file /var/run/au
root 4014 1 0 Mar07 ? 00:12:55 /usr/sbin/snmptrapd -Lsd -On -p
root 4032 1 0 Mar07 ? 00:00:03 /usr/sbin/sshd
root 4050 1 0 Mar07 ? 00:00:50 xinetd -stayalive -pidfile /var/
ntp 4066 1 0 Mar07 ? 00:00:01 ntpd -u ntp:ntp -p /var/run/ntpd
root 4084 1 0 Mar07 ? 00:00:00 /usr/sbin/vsftpd /etc/vsftpd/vsf
root 4125 1 0 Mar07 ? 00:00:00 /bin/sh /usr/bin/mysqld_safe --d
mysql 4207 4125 2 Mar07 ? 06:27:20 /usr/libexec/mysqld --basedir=/u
postgres 4297 1 0 Mar07 ? 00:01:26 /usr/bin/postmaster -p 5432 -D /
root 4326 1 0 Mar07 ? 00:00:02 sendmail: accepting connections
smmsp 4336 1 0 Mar07 ? 00:00:00 sendmail: Queue runner@01:00:00
root 4350 1 0 Mar07 ? 00:00:00 gpm -m /dev/input/mice -t exps2
postgres 4364 4297 0 Mar07 ? 00:00:00 postgres: logger process
postgres 4366 4297 0 Mar07 ? 00:00:12 postgres: writer process
postgres 4367 4297 0 Mar07 ? 00:00:12 postgres: stats buffer process
postgres 4368 4367 0 Mar07 ? 00:00:08 postgres: stats collector proces
root 4369 1 0 Mar07 ? 00:00:03 /usr/sbin/httpd
root 4382 1 0 Mar07 ? 00:00:18 crond
xfs 4405 1 0 Mar07 ? 00:00:00 xfs -droppriv -daemon
nagios 4413 1 0 Mar07 ? 00:00:59 /usr/local/nagios/bin/npcd -d -f
root 4437 1 0 Mar07 ? 00:00:00 /usr/sbin/atd
avahi 4463 1 0 Mar07 ? 00:00:01 avahi-daemon: running [karma.loc
avahi 4464 4463 0 Mar07 ? 00:00:00 avahi-daemon: chroot helper
ajaxterm 4481 1 0 Mar07 ? 00:00:02 python /usr/share/ajaxterm/ajaxt
nagios 4564 1 0 Mar07 ? 00:00:00 /usr/local/nagios/bin/ndo2db -c
root 4591 1 0 Mar07 ? 00:00:00 /usr/sbin/smartd -q never
root 4602 1 0 Mar07 tty1 00:00:00 /sbin/mingetty tty1
root 4606 1 0 Mar07 tty2 00:00:00 /sbin/mingetty tty2
root 4608 1 0 Mar07 tty3 00:00:00 /sbin/mingetty tty3
root 4610 1 0 Mar07 tty4 00:00:00 /sbin/mingetty tty4
root 4611 1 0 Mar07 tty5 00:00:00 /sbin/mingetty tty5
root 4614 1 0 Mar07 tty6 00:00:00 /sbin/mingetty tty6
root 4627 1 0 Mar07 ? 00:00:01 /usr/bin/python -tt /usr/sbin/yu
root 4629 1 0 Mar07 ? 00:00:01 /usr/libexec/gam_server
apache 8483 4369 1 11:22 ? 00:03:44 /usr/sbin/httpd
root 9792 1 0 17:18 ? 00:00:00 sudo /usr/local/nagios/libexec/c
root 9799 9792 0 17:18 ? 00:00:00 /usr/local/nagios/libexec/check_
root 9800 9799 0 17:18 ? 00:00:00 /usr/bin/ssh 2.0.1.163 /root/RMS
apache 10552 4369 1 11:55 ? 00:03:21 /usr/sbin/httpd
apache 10675 4369 1 11:55 ? 00:03:30 /usr/sbin/httpd
postgres 10824 4297 0 11:55 ? 00:00:04 postgres: nagiosxi nagiosxi 127.
postgres 11024 4297 0 11:55 ? 00:00:04 postgres: nagiosxi nagiosxi 127.
postgres 11347 4297 0 11:23 ? 00:00:04 postgres: nagiosxi nagiosxi 127.
nagios 12133 14553 0 17:18 ? 00:00:00 /usr/local/nagios/bin/nagios -d
root 12134 12133 0 17:18 ? 00:00:00 sudo /usr/local/nagios/libexec/c
root 12135 12134 0 17:18 ? 00:00:00 /usr/local/nagios/libexec/check_
root 12136 12135 0 17:18 ? 00:00:00 /usr/bin/ssh 2.0.1.164 /root/RMS
nagios 13607 14553 0 17:18 ? 00:00:00 /usr/local/nagios/bin/nagios -d
root 13608 13607 0 17:18 ? 00:00:00 sudo /usr/local/nagios/libexec/c
nagios 13609 14553 0 17:18 ? 00:00:00 /usr/local/nagios/bin/nagios -d
root 13610 13609 0 17:18 ? 00:00:00 sudo /usr/local/nagios/libexec/c
nagios 13611 14553 0 17:18 ? 00:00:00 /usr/local/nagios/bin/nagios -d
root 13612 13611 0 17:18 ? 00:00:00 sudo /usr/local/nagios/libexec/c
root 13613 13608 0 17:18 ? 00:00:00 /usr/local/nagios/libexec/check_
root 13614 13610 0 17:18 ? 00:00:00 /usr/local/nagios/libexec/check_
root 13615 13612 0 17:18 ? 00:00:00 /usr/local/nagios/libexec/check_
root 13616 13613 0 17:18 ? 00:00:00 /usr/bin/ssh 2.0.1.163 /root/RMS
root 13617 13615 0 17:18 ? 00:00:00 /usr/bin/ssh 2.0.1.164 /root/RMS
root 13618 13614 0 17:18 ? 00:00:00 /usr/bin/ssh 2.0.1.164 /root/RMS
nagios 14549 4564 0 Mar14 ? 00:02:59 /usr/local/nagios/bin/ndo2db -c
nagios 14550 14549 0 Mar14 ? 00:30:44 /usr/local/nagios/bin/ndo2db -c
nagios 14553 1 1 Mar14 ? 01:56:01 /usr/local/nagios/bin/nagios -d
apache 14631 4369 1 10:08 ? 00:04:56 /usr/sbin/httpd
postgres 14989 4297 0 10:08 ? 00:00:06 postgres: nagiosxi nagiosxi 127.
nagios 15205 4382 0 17:19 ? 00:00:00 crond
nagios 15206 4382 0 17:19 ? 00:00:00 crond
nagios 15209 4382 0 17:19 ? 00:00:00 crond
nagios 15210 4382 0 17:19 ? 00:00:00 crond
nagios 15211 4382 0 17:19 ? 00:00:00 crond
nagios 15215 15206 0 17:19 ? 00:00:00 /bin/sh -c /usr/bin/php -q /usr/
nagios 15217 15209 0 17:19 ? 00:00:00 /bin/sh -c /usr/bin/php -q /usr/
nagios 15218 15210 0 17:19 ? 00:00:00 /bin/sh -c /usr/bin/php -q /usr/
nagios 15219 15211 0 17:19 ? 00:00:00 /bin/sh -c /usr/bin/php -q /usr/
nagios 15220 15205 0 17:19 ? 00:00:00 /bin/sh -c /usr/bin/php -q /usr/
nagios 15221 15220 0 17:19 ? 00:00:00 /usr/bin/php -q /usr/local/nagio
nagios 15223 15215 0 17:19 ? 00:00:00 /usr/bin/php -q /usr/local/nagio
nagios 15224 15218 0 17:19 ? 00:00:00 /usr/bin/php -q /usr/local/nagio
nagios 15234 15217 1 17:19 ? 00:00:00 /usr/bin/php -q /usr/local/nagio
nagios 15235 15219 0 17:19 ? 00:00:00 /usr/bin/php -q /usr/local/nagio
postgres 15237 4297 0 17:19 ? 00:00:00 postgres: nagiosxi nagiosxi 127.
postgres 15257 4297 0 17:19 ? 00:00:00 postgres: nagiosxi nagiosxi 127.
postgres 15272 4297 0 17:19 ? 00:00:00 postgres: nagiosxi nagiosxi 127.
postgres 15301 4297 1 17:19 ? 00:00:00 postgres: nagiosxi nagiosxi 127.
nagios 15680 14553 0 17:19 ? 00:00:00 /usr/local/nagios/bin/nagios -d
root 15681 15680 0 17:19 ? 00:00:00 sudo /usr/local/nagios/libexec/c
root 15686 15681 0 17:19 ? 00:00:00 /usr/local/nagios/libexec/check_
root 15687 15686 0 17:19 ? 00:00:00 /usr/bin/ssh 2.0.1.163 /root/RMS
apache 15715 4369 1 16:05 ? 00:00:56 /usr/sbin/httpd
root 15887 4032 0 17:19 ? 00:00:00 sshd: root@pts/0
postgres 15979 4297 0 16:05 ? 00:00:01 postgres: nagiosxi nagiosxi 127.
postgres 16428 4297 0 17:19 ? 00:00:00 postgres: nagiosxi nagiosxi 127.
apache 16678 4369 1 15:36 ? 00:01:11 /usr/sbin/httpd
root 16855 15887 0 17:19 pts/0 00:00:00 -bash
nagios 16931 14553 0 17:19 ? 00:00:00 /usr/local/nagios/bin/nagios -d
nagios 16932 16931 0 17:19 ? 00:00:00 /usr/bin/perl /usr/local/nagios/
apache 17098 4369 1 10:08 ? 00:05:01 /usr/sbin/httpd
postgres 17521 4297 0 10:08 ? 00:00:06 postgres: nagiosxi nagiosxi 127.
postgres 17564 4297 0 15:36 ? 00:00:01 postgres: nagiosxi nagiosxi 127.
apache 17744 4369 1 14:01 ? 00:02:05 /usr/sbin/httpd
apache 17816 4369 1 10:50 ? 00:04:15 /usr/sbin/httpd
postgres 18000 4297 0 10:50 ? 00:00:05 postgres: nagiosxi nagiosxi 127.
postgres 18132 4297 0 14:01 ? 00:00:02 postgres: nagiosxi nagiosxi 127.
nagios 19113 14553 0 17:19 ? 00:00:00 /usr/local/nagios/bin/nagios -d
nagios 19114 19113 0 17:19 ? 00:00:00 /usr/local/nagios/libexec/check_
nagios 19186 14553 0 17:19 ? 00:00:00 /usr/local/nagios/bin/nagios -d
root 19187 19186 0 17:19 ? 00:00:00 sudo /usr/local/nagios/libexec/c
root 19199 19187 0 17:19 ? 00:00:00 /usr/local/nagios/libexec/check_
root 19200 19199 0 17:19 ? 00:00:00 /usr/bin/ssh 2.0.1.164 /root/RMS
nagios 19549 16932 0 17:19 ? 00:00:00 sh -c snmpwalk -c 35k1m05 10.1.1
nagios 19550 19549 5 17:19 ? 00:00:00 snmpwalk -c 10.1.10.67 -
nagios 19551 19549 0 17:19 ? 00:00:00 head -1
nagios 19812 15235 0 17:19 ? 00:00:00 sh -c /usr/bin/iostat -c 5 2 | t
nagios 19813 19812 0 17:19 ? 00:00:00 /usr/bin/iostat -c 5 2
nagios 19814 19812 0 17:19 ? 00:00:00 tail --lines=2
nagios 19815 19812 0 17:19 ? 00:00:00 head --lines=1
nagios 19816 19812 0 17:19 ? 00:00:00 awk { print $1,$2,$3,$4,$5,$6 }
nagios 19832 4050 0 17:19 ? 00:00:00 nsca -c /usr/local/nagios/etc/ns
nagios 19849 14553 0 17:19 ? 00:00:00 /usr/local/nagios/bin/nagios -d
nagios 19850 19849 0 17:19 ? 00:00:00 /usr/bin/php /usr/local/nagiosxi
root 19851 16855 0 17:19 pts/0 00:00:00 ps -aef
apache 25380 4369 1 10:27 ? 00:04:32 /usr/sbin/httpd
postgres 25663 4297 0 10:27 ? 00:00:05 postgres: nagiosxi nagiosxi 127.
apache 27992 4369 1 09:22 ? 00:05:35 /usr/sbin/httpd
postgres 28052 4297 0 09:22 ? 00:00:06 postgres: nagiosxi nagiosxi 127.
apache 28616 4369 1 09:05 ? 00:05:41 /usr/sbin/httpd
postgres 28668 4297 0 09:05 ? 00:00:07 postgres: nagiosxi nagiosxi 127.
apache 28675 4369 1 09:05 ? 00:05:47 /usr/sbin/httpd
postgres 28684 4297 0 09:05 ? 00:00:07 postgres: nagiosxi nagiosxi 127.
Re: Getting a lot of total process warning from XI Server
None of these processes looks problematic, though the total is on the high side. Did you add any new checks or hosts, or decrease the interval on any checks recently? What was the average number of processes before this issue arose?
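To compare against an earlier baseline, it can help to tally a saved `ps -aef` listing by user and by command name. The Python below is a minimal sketch, not an official tool: `tally` is a hypothetical helper, it assumes the `ps -ef` column order (UID PID PPID C STIME TTY TIME CMD), and the sample rows are copied from the output posted above.

```python
import os
from collections import Counter

SAMPLE = """\
UID PID PPID C STIME TTY TIME CMD
root 2 1 0 Mar07 ? 00:00:47 [migration/0]
apache 2051 4369 1 14:05 ? 00:01:59 /usr/sbin/httpd
postgres 2231 4297 0 14:05 ? 00:00:02 postgres: nagiosxi nagiosxi 127.
apache 8483 4369 1 11:22 ? 00:03:44 /usr/sbin/httpd
nagios 13607 14553 0 17:18 ? 00:00:00 /usr/local/nagios/bin/nagios -d
root 13608 13607 0 17:18 ? 00:00:00 sudo /usr/local/nagios/libexec/c
"""


def tally(text):
    """Count processes per user and per command name in `ps -ef` output."""
    by_user, by_cmd = Counter(), Counter()
    for line in text.strip().splitlines():
        parts = line.split(None, 7)  # UID PID PPID C STIME TTY TIME CMD
        if len(parts) < 8 or not parts[1].isdigit():
            continue  # skip the header row and anything malformed
        by_user[parts[0]] += 1
        exe = parts[7].split()[0]
        # Kernel threads look like "[migration/0]"; keep them verbatim.
        # Otherwise drop the directory part and any trailing colon.
        name = exe if exe.startswith("[") else os.path.basename(exe).rstrip(":")
        by_cmd[name] += 1
    return by_user, by_cmd


if __name__ == "__main__":
    users, cmds = tally(SAMPLE)
    for name, count in cmds.most_common():
        print(count, name)
```

Running the same tally on a listing taken while the count is normal and on one taken during a warning should show which user or command accounts for the difference.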