Nagios performance woes continue
Posted: Mon Jan 07, 2013 9:13 am
Sorry to beat a dead horse, but our XI instance continues to head downhill. The web GUI is sluggish, and system load still goes high with many blocked processes. Now barely any services show up, and postgresql has scores of processes running, so many that last night we went over the total processes threshold. I've followed all the steps in the orphaned service FAQ and rebooted several times, to no avail. This is still 2011R3.3 on CentOS 5.7 64-bit. I thought I had this licked when I stopped syslog from writing to local files, as it seemed to be the culprit, but things soon became bad again. Here's a sample of the postgresql processes running this morning.
There were scores of sets of processes like the ones in the ps listing below.
Code:
[root@psm-itmon ~]# service postgresql status
postmaster (pid 32238 31930 31422 31127 30616 29757 29509 29254 29101 28969 28637 28462 28003 26681 26220 26181 25735 25592 25205 25150 24771 24685 24263 24222 24218 23778 23637 23216 22172 21531 21101 20768 20736 20271 19374 19031 18803 17711 17596 17368 16969 16945 16328 16067 14643 14202 14028 13624 13560 13193 13183 12647 12601 12280 12085 12056 11778 11542 11532 11339 10005 9591 9425 9154 8810 8662 7804 7723 7381 7302 7101 6912 6907 6726 6470 6059 5653 5629 5479 5476 5473 5428 4518 4515 4512 4509 4506 4494 4201 4200 4199 4178 4176 4047 3626 3553 3126 2586 2145 2020 1658 1512 1099) is running...
[root@psm-itmon ~]# vmstat 5 10
procs -----------memory---------- ---swap-- -----io---- --system-- -----cpu------
 r  b   swpd    free   buff  cache   si   so    bi    bo   in   cs us sy id wa st
 1  0      0 4856212 201328 743328    0    0     2    15    5   24 14  0 85  0  0
 1  1      0 4855668 201332 743344    0    0     0  1706 1749  251 15  0 76  8  0
[root@psm-itmon ~]# service nagios status
nagios is not running
[root@psm-itmon ~]# service nagios start
Starting nagios: done.
[root@psm-itmon ~]# ps -deaf | grep nagios
nagios 1024 4255 0 Jan06 ? 00:00:00 crond
nagios 1038 1024 0 Jan06 ? 00:00:00 /bin/sh -c /usr/bin/php -q /usr/local/nagiosxi/cron/sysstat.php > /usr/local/nagiosxi/var/sysstat.log 2>&1
nagios 1042 1038 0 Jan06 ? 00:00:00 /usr/bin/php -q /usr/local/nagiosxi/cron/sysstat.php
postgres 1099 4176 0 Jan06 ? 00:00:00 postgres: nagiosxi nagiosxi 127.0.0.1(40332) idle
nagios 1445 4255 0 Jan06 ? 00:00:00 crond
nagios 1460 1445 0 Jan06 ? 00:00:00 /bin/sh -c /usr/bin/php -q /usr/local/nagiosxi/cron/sysstat.php > /usr/local/nagiosxi/var/sysstat.log 2>&1
nagios 1467 1460 0 Jan06 ? 00:00:00 /usr/bin/php -q /usr/local/nagiosxi/cron/sysstat.php
postgres 1512 4176 0 Jan06 ? 00:00:00 postgres: nagiosxi nagiosxi 127.0.0.1(58160) idle
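
The ps listing repeats the same trio (crond, the sysstat.php cron job, and an idle postgres backend) over and over, so a tally per command may make the pile-up easier to see than scrolling through the raw output. The following is only a rough Python sketch, not part of Nagios XI. The function name `tally_processes` and the `<client>` placeholder are made up for illustration, and it assumes the eight-column `ps -deaf` layout shown above.

```python
import re
from collections import Counter

# A few lines copied from the ps output above, used as sample input.
SAMPLE = """\
nagios 1024 4255 0 Jan06 ? 00:00:00 crond
nagios 1038 1024 0 Jan06 ? 00:00:00 /bin/sh -c /usr/bin/php -q /usr/local/nagiosxi/cron/sysstat.php > /usr/local/nagiosxi/var/sysstat.log 2>&1
nagios 1042 1038 0 Jan06 ? 00:00:00 /usr/bin/php -q /usr/local/nagiosxi/cron/sysstat.php
postgres 1099 4176 0 Jan06 ? 00:00:00 postgres: nagiosxi nagiosxi 127.0.0.1(40332) idle
nagios 1445 4255 0 Jan06 ? 00:00:00 crond
nagios 1460 1445 0 Jan06 ? 00:00:00 /bin/sh -c /usr/bin/php -q /usr/local/nagiosxi/cron/sysstat.php > /usr/local/nagiosxi/var/sysstat.log 2>&1
nagios 1467 1460 0 Jan06 ? 00:00:00 /usr/bin/php -q /usr/local/nagiosxi/cron/sysstat.php
postgres 1512 4176 0 Jan06 ? 00:00:00 postgres: nagiosxi nagiosxi 127.0.0.1(58160) idle
"""

# Matches the per-connection client address, e.g. "127.0.0.1(40332)".
CLIENT_RE = re.compile(r"\d+\.\d+\.\d+\.\d+\(\d+\)")


def tally_processes(ps_output: str) -> Counter:
    """Count processes per (user, command) in `ps -deaf` style text."""
    counts = Counter()
    for line in ps_output.splitlines():
        # Assumed columns: UID PID PPID C STIME TTY TIME CMD (CMD may contain spaces).
        fields = line.split(None, 7)
        if len(fields) < 8 or fields[0] == "UID":
            continue  # skip the header row and anything malformed
        user, cmd = fields[0], fields[7]
        # Collapse the client address so all idle backends share one key.
        cmd = CLIENT_RE.sub("<client>", cmd)
        counts[(user, cmd)] += 1
    return counts


# Print the most frequent commands first.
for (user, cmd), n in tally_processes(SAMPLE).most_common():
    print(n, user, cmd)
```

To run it against the live system, the text passed to `tally_processes` would be the full output of `ps -deaf` rather than the short sample.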