Nagios server memory use

This support forum board is for support questions relating to Nagios XI, our flagship commercial network monitoring solution.
gormank
Posts: 1114
Joined: Tue Dec 02, 2014 12:00 pm

Re: Nagios server memory use

Post by gormank »

mcollectived shows it's been running since yesterday on both servers. One server was restarted 12/23 and memory usage is pretty low right now, but climbing.
tmcdonald
Posts: 9117
Joined: Mon Sep 23, 2013 8:40 am

Re: Nagios server memory use

Post by tmcdonald »

I guess one thing we should step back and look at is whether you're seeing any adverse effects from the high RAM. It's entirely possible that your actual usage is just fine, and we're interpreting the results incorrectly:

http://www.linuxatemyram.com/

The 0.0% IO wait you have might be a result of your climbing RAM usage, since Linux uses otherwise idle RAM as disk cache. Are you getting any errors in /var/log/messages about OOM (out of memory) or any random application crashes?
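For the OOM check, a minimal sketch might look like the following. It assumes kernel messages land in /var/log/messages and that OOM-killer activity shows up as lines containing "Out of memory" or "oom-killer"; the function name count_oom_events is made up for this example.

```shell
#!/bin/bash
# Count kernel OOM-killer lines in a syslog file.
# The default path is an assumption; pass another file as the first argument.
count_oom_events() {
    local logfile=${1:-/var/log/messages}
    # -c prints the number of matching lines, -i ignores case, -E enables alternation.
    grep -ciE 'out of memory|oom-killer' "$logfile"
}
```

Running `count_oom_events` with no argument scans /var/log/messages. A result of 0 suggests the OOM killer has not fired since the log was last rotated.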
Former Nagios employee
gormank
Posts: 1114
Joined: Tue Dec 02, 2014 12:00 pm

Re: Nagios server memory use

Post by gormank »

rkennedy wrote: It's not the relationship between NRPE and Nagios that takes resources, but rather the actual plugins that fuel checks. The compiled Nagios plugins that come with XI use far fewer resources than other plugins.

See this slideshow for a bit more information about how the plugins use resources (starts on page #27) -
https://assets.nagios.com/presentations ... ins.pdf#27

As check_nrpe is compiled, I don't foresee that being the issue. Are you running any additional custom plugins?

The script in the slideshow has an error at line 6. I'm not sure what it's trying to do with the awk file that doesn't exist. I commented out the line, copied it, and removed the "| awk -f awk" near the end.

Code: Select all

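#!/bin/bash
# Sample nagios-related processes once a second for 300 seconds.
i=0
until [ "$i" -eq 300 ]
do
        ps axo pid,ppid,pcpu,size,etime,priority,cmd | grep -v awk | grep -v bash | awk '/nagios/ {print}' >> nagios.txt
        # Line 6 of the slideshow script, disabled because the awk program file "awk" does not exist:
#       ps axo pid,ppid,pcpu,size,etime,cmd | grep nagios | grep -v grep | grep -v nagios_total.sh | awk -f awk >> nagios.txt
        # Same pipeline without the awk step. Note that this appends raw ps lines a second time,
        # so most nagios processes end up in nagios.txt twice per sample.
        ps axo pid,ppid,pcpu,size,etime,cmd | grep nagios | grep -v grep | grep -v nagios_total.sh >> nagios.txt
        echo "# # # # # # # # #" >> nagios.txt
        sleep 1
        i=$(( i + 1 ))
done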
echo "##### Nagios Process Report #####" >> nagios_summary.txt
echo "#################################" >> nagios_summary.txt
echo "##### Total Memory Summary  #####" >> nagios_summary.txt
grep "Total Memory" nagios.txt | sort | uniq -c >> nagios_summary.txt
echo "##### Average Memory Use MB   #####" >> nagios_summary.txt
awk '{ SUM += $4} END { print SUM/300/1024 }' nagios.txt >> nagios_summary.txt
echo                                     >> nagios_summary.txt
echo "#################################" >> nagios_summary.txt
echo "#####  Total CPU Summary %  #####" >> nagios_summary.txt
grep "Total CPU" nagios.txt | sort -n | uniq -c >> nagios_summary.txt
grep "Total CPU" nagios.txt | awk '{printf "%3.2f\n", $3}' | sort -n > /tmp/nagios_summary
echo "#####    Average CPU Use    #####" >> nagios_summary.txt
awk '{ SUM += $1} END { print SUM/300 }' /tmp/nagios_summary >> nagios_summary.txt
echo "#####    Maximum CPU Use    #####" >> nagios_summary.txt
sort -n /tmp/nagios_summary | tail -n1 >> nagios_summary.txt
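
The summary section greps nagios.txt for "Total Memory" and "Total CPU" lines, which presumably only the missing awk program would have written, so those sections come out empty or zero with that step removed. What the slideshow's awk file actually contained is unknown. The sketch below is only a guess at a stand-in: it sums the pcpu and size columns of one sample and prints lines in the shape the summary greps expect. The function name ps_totals is hypothetical.

```shell
#!/bin/bash
# Hypothetical stand-in for the missing awk step (a guess, not the slideshow's code).
# Reads "pid ppid pcpu size etime cmd" lines on stdin and prints one total
# line for memory (sum of field 4) and one for CPU (sum of field 3).
ps_totals() {
    awk '{ cpu += $3; mem += $4 }
         END { printf "Total Memory %d\nTotal CPU %.2f\n", mem, cpu }'
}

# Usage inside the sampling loop, with the same filters as the script above:
#   ps axo pid,ppid,pcpu,size,etime,cmd | grep nagios | grep -v grep | grep -v nagios_total.sh | ps_totals >> nagios.txt
```

With lines of that form in nagios.txt, the third field of each "Total CPU" line is the value the summary's printf step reads.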
Last edited by gormank on Mon Dec 28, 2015 1:37 pm, edited 1 time in total.
gormank
Posts: 1114
Joined: Tue Dec 02, 2014 12:00 pm

Re: Nagios server memory use

Post by gormank »

Memory usage is monitored so we get alerts that people have to respond to. It's more the alerts than the usage, but usage never seems to level off; it creeps up every day.
I don't see anything in the syslog indicating a problem from RAM usage increasing.
rkennedy
Posts: 6579
Joined: Mon Oct 05, 2015 11:45 am

Re: Nagios server memory use

Post by rkennedy »

After the machine is rebooted, how much RAM is in use an hour later?

I'm just trying to get a grasp on how much memory is used initially versus ~2 weeks later.
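
One way to get that comparison is to log memory use at intervals from boot onward. A rough sketch, assuming a Linux /proc/meminfo with MemTotal, MemFree, Buffers and Cached fields; the function name mem_sample and the idea of running it from cron are illustrative, not something already set up in this thread. It prints a Unix timestamp followed by the memory in use excluding buffers and page cache, in kB.

```shell
#!/bin/bash
# Print "<unix timestamp> <kB in use excluding buffers and page cache>".
# Reads /proc/meminfo by default; another meminfo-format file can be passed in.
mem_sample() {
    local meminfo=${1:-/proc/meminfo}
    awk -v ts="$(date +%s)" '
        $1 == "MemTotal:" { total = $2 }
        $1 == "MemFree:"  { free = $2 }
        $1 == "Buffers:"  { buffers = $2 }
        $1 == "Cached:"   { cached = $2 }
        END { print ts, total - free - buffers - cached }' "$meminfo"
}
```

Appending the output to a file every few minutes gives a series that can be compared an hour after reboot and again two weeks later.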
Former Nagios Employee
gormank
Posts: 1114
Joined: Tue Dec 02, 2014 12:00 pm

Re: Nagios server memory use

Post by gormank »

All 3 examples have the same config, but the failover normally has checks disabled.

A primary server:

Code: Select all

# uptime
 21:24:09 up 14 days,  4:45,  3 users,  load average: 0.14, 0.23, 0.22

# top -n 1
top - 21:24:32 up 14 days,  4:46,  3 users,  load average: 0.21, 0.24, 0.23
Tasks: 246 total,   1 running, 245 sleeping,   0 stopped,   0 zombie
Cpu(s):  6.7%us,  1.3%sy,  0.0%ni, 91.8%id,  0.1%wa,  0.0%hi,  0.1%si,  0.0%st
Mem:  32880880k total, 32375376k used,   505504k free,   308428k buffers
Swap: 16777212k total,      344k used, 16776868k free,  6283764k cached
Another primary server recently restarted:

Code: Select all

# uptime
 21:24:57 up 3 days, 10:56,  1 user,  load average: 0.21, 0.28, 0.24

# top -n 1
top - 21:25:04 up 3 days, 10:56,  1 user,  load average: 0.20, 0.27, 0.24
Tasks: 250 total,   1 running, 249 sleeping,   0 stopped,   0 zombie
Cpu(s):  9.6%us,  2.0%sy,  0.0%ni, 88.1%id,  0.1%wa,  0.0%hi,  0.2%si,  0.0%st
Mem:  32880880k total, 16180552k used, 16700328k free,   323668k buffers
Swap: 16777212k total,        0k used, 16777212k free,  3053256k cached
A failover server I enabled active and passive checks on ~3 hours ago. The failover has notifications disabled.
RAM use has been slowly increasing since checks were enabled, from <2G used to >3G.

Code: Select all

# uptime
 21:25:21 up 14 days,  5:13,  1 user,  load average: 0.08, 0.17, 0.33

# top -n 1
top - 21:25:27 up 14 days,  5:13,  1 user,  load average: 0.07, 0.16, 0.33
Tasks: 244 total,   1 running, 243 sleeping,   0 stopped,   0 zombie
Cpu(s):  1.7%us,  0.5%sy,  0.0%ni, 97.8%id,  0.0%wa,  0.0%hi,  0.0%si,  0.0%st
Mem:  32880880k total,  9899788k used, 22981092k free,   333880k buffers
Swap: 16777212k total,        0k used, 16777212k free,  6319872k cached
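
To separate application memory from cache in the three top outputs above, subtract buffers and cached from used. A small sketch of that arithmetic using the posted figures (top reports them in KiB; the function name app_used_kb is made up here):

```shell
#!/bin/bash
# Memory held by applications = used - buffers - cached (all values in KiB).
app_used_kb() {
    local used=$1 buffers=$2 cached=$3
    echo $(( used - buffers - cached ))
}

# Figures copied from the three top outputs in this post.
echo "primary, up 14 days:     $(app_used_kb 32375376 308428 6283764) KiB"
echo "primary, up 3 days:      $(app_used_kb 16180552 323668 3053256) KiB"
echo "failover, checks ~3 hrs: $(app_used_kb 9899788 333880 6319872) KiB"
```

That works out to 25783184, 12803628 and 3246036 KiB, or roughly 24.6, 12.2 and 3.1 GiB. By this measure, most of the used memory on the long-running primary is not buffers or page cache.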
ssax
Dreams In Code
Posts: 7682
Joined: Wed Feb 11, 2015 12:54 pm

Re: Nagios server memory use

Post by ssax »

I've seen this type of behavior before. In that case, a custom plugin wasn't exiting properly, so it kept spawning new processes until it maxed out the memory.

What is the output of this command?

Code: Select all

ps aux | grep nagios
If you don't see anything there that's filling up memory, run 'ps aux' by itself and post the output.
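
If the full listing is long, totalling resident memory per user can make a leak stand out. A sketch that reads `ps aux` output on stdin and sums the RSS column (field 6, in KiB) per user; rss_by_user is a name invented for this example.

```shell
#!/bin/bash
# Sum the RSS column of `ps aux` output per user, largest total first.
# Expects the ps header on the first line, which is skipped.
rss_by_user() {
    awk 'NR > 1 { rss[$1] += $6 }
         END { for (u in rss) printf "%s %d\n", u, rss[u] }' | sort -k2,2nr
}

# Usage: ps aux | rss_by_user
```

For the largest individual processes, `ps aux --sort=-rss | head` gives a quick view on systems with procps.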
gormank
Posts: 1114
Joined: Tue Dec 02, 2014 12:00 pm

Re: Nagios server memory use

Post by gormank »

I let checks run for a bit less than 24 hours overnight, then stopped them. RAM usage went from <2G to >7G in that time. RAM usage is still >7G several hours later, so the issue seems to go deeper than runaway scripts.

ps aux output below. I didn't see much to indicate a problem, but hopefully someone else will...

Code: Select all

# ps aux
USER       PID %CPU %MEM    VSZ   RSS TTY      STAT START   TIME COMMAND
root         1  0.0  0.0  19364  1452 ?        Ss   Dec14   0:00 /sbin/init
root         2  0.0  0.0      0     0 ?        S    Dec14   0:00 [kthreadd]
root         3  0.0  0.0      0     0 ?        S    Dec14   4:23 [migration/0]
root         4  0.0  0.0      0     0 ?        S    Dec14   5:33 [ksoftirqd/0]
root         5  0.0  0.0      0     0 ?        S    Dec14   0:00 [stopper/0]
root         6  0.0  0.0      0     0 ?        S    Dec14   0:21 [watchdog/0]
root         7  0.0  0.0      0     0 ?        S    Dec14   0:29 [migration/1]
root         8  0.0  0.0      0     0 ?        S    Dec14   0:00 [stopper/1]
root         9  0.0  0.0      0     0 ?        S    Dec14   0:10 [ksoftirqd/1]
root        10  0.0  0.0      0     0 ?        S    Dec14   0:01 [watchdog/1]
root        11  0.0  0.0      0     0 ?        S    Dec14   1:04 [migration/2]
root        12  0.0  0.0      0     0 ?        S    Dec14   0:00 [stopper/2]
root        13  0.0  0.0      0     0 ?        S    Dec14   0:59 [ksoftirqd/2]
root        14  0.0  0.0      0     0 ?        S    Dec14   0:06 [watchdog/2]
root        15  0.0  0.0      0     0 ?        S    Dec14   1:55 [migration/3]
root        16  0.0  0.0      0     0 ?        S    Dec14   0:00 [stopper/3]
root        17  0.0  0.0      0     0 ?        S    Dec14   2:11 [ksoftirqd/3]
root        18  0.0  0.0      0     0 ?        S    Dec14   0:14 [watchdog/3]
root        19  0.0  0.0      0     0 ?        S    Dec14   3:56 [events/0]
root        20  0.0  0.0      0     0 ?        S    Dec14   0:52 [events/1]
root        21  0.0  0.0      0     0 ?        S    Dec14   1:20 [events/2]
root        22  0.0  0.0      0     0 ?        S    Dec14   2:40 [events/3]
root        23  0.0  0.0      0     0 ?        S    Dec14   0:00 [cgroup]
root        24  0.0  0.0      0     0 ?        S    Dec14   0:00 [khelper]
root        25  0.0  0.0      0     0 ?        S    Dec14   0:00 [netns]
root        26  0.0  0.0      0     0 ?        S    Dec14   0:00 [async/mgr]
root        27  0.0  0.0      0     0 ?        S    Dec14   0:00 [pm]
root        28  0.0  0.0      0     0 ?        S    Dec14   0:03 [sync_supers]
root        29  0.0  0.0      0     0 ?        S    Dec14   0:08 [bdi-default]
root        30  0.0  0.0      0     0 ?        S    Dec14   0:00 [kintegrityd/0]
root        31  0.0  0.0      0     0 ?        S    Dec14   0:00 [kintegrityd/1]
root        32  0.0  0.0      0     0 ?        S    Dec14   0:00 [kintegrityd/2]
root        33  0.0  0.0      0     0 ?        S    Dec14   0:00 [kintegrityd/3]
root        34  0.0  0.0      0     0 ?        S    Dec14  12:25 [kblockd/0]
root        35  0.0  0.0      0     0 ?        S    Dec14   2:19 [kblockd/1]
root        36  0.0  0.0      0     0 ?        S    Dec14   7:43 [kblockd/2]
root        37  0.0  0.0      0     0 ?        S    Dec14  10:54 [kblockd/3]
root        38  0.0  0.0      0     0 ?        S    Dec14   0:00 [kacpid]
root        39  0.0  0.0      0     0 ?        S    Dec14   0:00 [kacpi_notify]
root        40  0.0  0.0      0     0 ?        S    Dec14   0:00 [kacpi_hotplug]
root        41  0.0  0.0      0     0 ?        S    Dec14   0:00 [ata_aux]
root        42  0.0  0.0      0     0 ?        S    Dec14   0:00 [ata_sff/0]
root        43  0.0  0.0      0     0 ?        S    Dec14   0:00 [ata_sff/1]
root        44  0.0  0.0      0     0 ?        S    Dec14   0:00 [ata_sff/2]
root        45  0.0  0.0      0     0 ?        S    Dec14   0:00 [ata_sff/3]
root        46  0.0  0.0      0     0 ?        S    Dec14   0:00 [ksuspend_usbd]
root        47  0.0  0.0      0     0 ?        S    Dec14   0:00 [khubd]
root        48  0.0  0.0      0     0 ?        S    Dec14   0:00 [kseriod]
root        49  0.0  0.0      0     0 ?        S    Dec14   0:00 [md/0]
root        50  0.0  0.0      0     0 ?        S    Dec14   0:00 [md/1]
root        51  0.0  0.0      0     0 ?        S    Dec14   0:00 [md/2]
root        52  0.0  0.0      0     0 ?        S    Dec14   0:00 [md/3]
root        53  0.0  0.0      0     0 ?        S    Dec14   0:00 [md_misc/0]
root        54  0.0  0.0      0     0 ?        S    Dec14   0:00 [md_misc/1]
root        55  0.0  0.0      0     0 ?        S    Dec14   0:00 [md_misc/2]
root        56  0.0  0.0      0     0 ?        S    Dec14   0:00 [md_misc/3]
root        57  0.0  0.0      0     0 ?        S    Dec14   0:00 [linkwatch]
root        59  0.0  0.0      0     0 ?        S    Dec14   0:00 [khungtaskd]
root        60  0.0  0.0      0     0 ?        S    Dec14   0:17 [kswapd0]
root        61  0.0  0.0      0     0 ?        SN   Dec14   0:00 [ksmd]
root        62  0.0  0.0      0     0 ?        SN   Dec14   1:03 [khugepaged]
root        63  0.0  0.0      0     0 ?        S    Dec14   0:00 [aio/0]
root        64  0.0  0.0      0     0 ?        S    Dec14   0:00 [aio/1]
root        65  0.0  0.0      0     0 ?        S    Dec14   0:00 [aio/2]
root        66  0.0  0.0      0     0 ?        S    Dec14   0:00 [aio/3]
root        67  0.0  0.0      0     0 ?        S    Dec14   0:00 [crypto/0]
root        68  0.0  0.0      0     0 ?        S    Dec14   0:00 [crypto/1]
root        69  0.0  0.0      0     0 ?        S    Dec14   0:00 [crypto/2]
root        70  0.0  0.0      0     0 ?        S    Dec14   0:00 [crypto/3]
root        78  0.0  0.0      0     0 ?        S    Dec14   0:00 [kthrotld/0]
root        79  0.0  0.0      0     0 ?        S    Dec14   0:00 [kthrotld/1]
root        80  0.0  0.0      0     0 ?        S    Dec14   0:00 [kthrotld/2]
root        81  0.0  0.0      0     0 ?        S    Dec14   0:00 [kthrotld/3]
root        82  0.0  0.0      0     0 ?        S    Dec14   0:00 [pciehpd]
root        84  0.0  0.0      0     0 ?        S    Dec14   0:00 [kpsmoused]
root        85  0.0  0.0      0     0 ?        S    Dec14   0:00 [usbhid_resumer]
root        86  0.0  0.0      0     0 ?        S    Dec14   0:00 [deferwq]
root       117  0.0  0.0      0     0 ?        S    Dec14   0:00 [kdmremove]
root       118  0.0  0.0      0     0 ?        S    Dec14   0:00 [kstriped]
root       201  0.0  0.0      0     0 ?        S    Dec14   0:00 [scsi_eh_0]
root       202  0.0  0.0      0     0 ?        S    Dec14   0:00 [scsi_eh_1]
root       212  0.0  0.0      0     0 ?        S    Dec14   0:00 [scsi_eh_2]
root       213  0.0  0.0      0     0 ?        S    Dec14   0:00 [vmw_pvscsi_wq_2]
root       331  0.0  0.0      0     0 ?        S    Dec14   9:03 [jbd2/sda4-8]
root       332  0.0  0.0      0     0 ?        S    Dec14   0:00 [ext4-dio-unwrit]
root       424  0.0  0.0  10652   416 ?        S<s  Dec14   0:00 /sbin/udevd -d
root       535  0.0  0.0      0     0 ?        S    Dec14   0:23 [vmmemctl]
root       781  0.0  0.0      0     0 ?        S    Dec14   0:00 [jbd2/sda1-8]
root       782  0.0  0.0      0     0 ?        S    Dec14   0:00 [ext4-dio-unwrit]
root       783  0.0  0.0      0     0 ?        S    Dec14   8:15 [jbd2/sda2-8]
root       784  0.0  0.0      0     0 ?        S    Dec14   0:00 [ext4-dio-unwrit]
root       836  0.0  0.0      0     0 ?        S    Dec14   0:24 [kauditd]
apache     862  0.4  0.1 458208 39216 ?        S    Dec28   7:08 /usr/sbin/httpd
postgres   869  0.0  0.0 217400  6320 ?        Ss   Dec28   0:10 postgres: nagiosxi nagiosxi [local] idle
root       874  1.0  0.0      0     0 ?        S    Dec14 223:38 [flush-8:0]
root      1145  0.0  0.0  93144   748 ?        S<sl Dec14   1:28 auditd
root      1175  0.0  0.0 251964  5696 ?        Sl   Dec14   0:18 /sbin/rsyslogd -i /var/run/syslogd.pid -c 5
root      1205  0.0  0.0  10960   572 ?        Ss   Dec14   4:25 irqbalance --pid=/var/run/irqbalance.pid
rpc       1221  0.0  0.0  18976   740 ?        Ss   Dec14   0:01 rpcbind
rpcuser   1242  0.0  0.0  23348  1232 ?        Ss   Dec14   0:00 rpc.statd
dbus      1270  0.0  0.0  21432   924 ?        Ss   Dec14   0:00 dbus-daemon --system
root      1310  0.0  0.0 188928  3336 ?        Ss   Dec14   0:00 cupsd -C /etc/cups/cupsd.conf
root      1339  0.0  0.0   4080   636 ?        Ss   Dec14   0:00 /usr/sbin/acpid
68        1349  0.0  0.0  38120  3856 ?        Ssl  Dec14   0:05 hald
root      1350  0.0  0.0  20400  1156 ?        S    Dec14   0:00 hald-runner
root      1382  0.0  0.0  22520  1084 ?        S    Dec14   0:00 hald-addon-input: Listening on /dev/input/event2 /dev/input/event0
68        1394  0.0  0.0  18008  1036 ?        S    Dec14   0:00 hald-addon-acpi: listening on acpid socket /var/run/acpid.socket
root      1414  0.0  0.0 386132  1900 ?        Ssl  Dec14   0:18 automount --pid-file /var/run/autofs.pid
root      1520  0.0  0.0   6260   292 ?        Ss   Dec14   0:00 /usr/sbin/mcelog --daemon
root      1555  0.0  0.0  66216  1220 ?        Ss   Dec14   0:00 /usr/sbin/sshd
ntp       1573  0.0  0.0  26508  1944 ?        Ss   Dec14   0:01 ntpd -u ntp:ntp -p /var/run/ntpd.pid -g
root      1789  0.0  0.0  88836  2688 ?        Ss   Dec14   0:28 sendmail: accepting connections
smmsp     1798  0.0  0.0  78212  2096 ?        Ss   Dec14   0:00 sendmail: Queue runner@01:00:00 for /var/spool/clientmqueue
root      1822  0.0  0.0 114644  1076 ?        Ss   Dec14   0:00 /usr/sbin/abrtd
root      1919  0.0  0.0 116892  1280 ?        Ss   Dec14   0:36 crond
root      1946  0.0  0.0  21104   484 ?        Ss   Dec14   0:00 /usr/sbin/atd
root      1960  0.0  0.0 108340   612 ?        Ss   Dec14   0:00 /usr/bin/rhsmcertd
nagios    2025  0.0  0.0  51552  2232 ?        S    19:32   0:00 /usr/local/bin/mod_gearman2_worker -d --config=/etc/mod_gearman/mod_gearman_worker.conf --pidfile=/
ajaxterm  2055  0.0  0.0 170340  7628 ?        Sl   Dec14   4:35 python /usr/share/ajaxterm/ajaxterm.py --daemon --port=8022 --uid=ajaxterm
root      2070  0.0  0.0  64232  1208 ?        Ss   Dec14   0:00 /usr/sbin/certmonger -S -p /var/run/certmonger.pid
apache    2434  0.5  0.1 461988 42808 ?        S    15:15   1:44 /usr/sbin/httpd
root      2523  0.0  0.0   4064   552 tty1     Ss+  Dec14   0:00 /sbin/mingetty /dev/tty1
root      2525  0.0  0.0   4064   548 tty2     Ss+  Dec14   0:00 /sbin/mingetty /dev/tty2
root      2527  0.0  0.0   4064   548 tty3     Ss+  Dec14   0:00 /sbin/mingetty /dev/tty3
root      2529  0.0  0.0   4064   548 tty4     Ss+  Dec14   0:00 /sbin/mingetty /dev/tty4
root      2532  0.0  0.0   4064   552 tty5     Ss+  Dec14   0:00 /sbin/mingetty /dev/tty5
root      2533  0.0  0.0  10648   368 ?        S<   Dec14   0:00 /sbin/udevd -d
root      2534  0.0  0.0  10648   408 ?        S<   Dec14   0:00 /sbin/udevd -d
root      2537  0.0  0.0   4064   548 tty6     Ss+  Dec14   0:00 /sbin/mingetty /dev/tty6
nagios    2678  0.0  0.0  51020  2892 ?        S    18:06   0:01 /usr/local/bin/mod_gearman2_worker -d --config=/etc/mod_gearman/mod_gearman_worker.conf --pidfile=/
postgres  2929  0.0  0.0 217448  5896 ?        Ss   15:16   0:02 postgres: nagiosxi nagiosxi [local] idle
apache    3250  0.5  0.1 458024 38612 ?        R    15:57   1:29 /usr/sbin/httpd
postgres  3259  0.0  0.0 217448  5948 ?        Ss   15:57   0:02 postgres: nagiosxi nagiosxi [local] idle
nagios    3442  0.0  0.0  50660  2132 ?        S    19:34   0:00 /usr/local/bin/mod_gearman2_worker -d --config=/etc/mod_gearman/mod_gearman_worker.conf --pidfile=/
apache    3649  0.4  0.1 463500 44504 ?        S    Dec28   7:15 /usr/sbin/httpd
postgres  3729  0.0  0.0 217448  6352 ?        Ss   Dec28   0:10 postgres: nagiosxi nagiosxi [local] idle
nagios    3733  0.0  0.0  51536  2856 ?        S    18:07   0:01 /usr/local/bin/mod_gearman2_worker -d --config=/etc/mod_gearman/mod_gearman_worker.conf --pidfile=/
apache    4724  0.4  0.1 463492 44404 ?        S    Dec28   6:31 /usr/sbin/httpd
postgres  4864  0.0  0.0 217400  6376 ?        Ss   Dec28   0:09 postgres: nagiosxi nagiosxi [local] idle
nagios    5672  0.0  0.0  50508  1908 ?        S    20:21   0:00 /usr/local/bin/mod_gearman2_worker -d --config=/etc/mod_gearman/mod_gearman_worker.conf --pidfile=/
root      6049  0.0  0.0 139780  1760 ?        S    20:22   0:00 CROND
root      6050  0.0  0.0 139780  1760 ?        S    20:22   0:00 CROND
root      6051  0.0  0.0 139780  1760 ?        S    20:22   0:00 CROND
root      6052  0.0  0.0 139780  1760 ?        S    20:22   0:00 CROND
root      6053  0.0  0.0 139780  1760 ?        S    20:22   0:00 CROND
nagios    6056  0.0  0.0 106096  1132 ?        Ss   20:22   0:00 /bin/sh -c /usr/bin/php -q /usr/local/nagiosxi/cron/feedproc.php > /usr/local/nagiosxi/var/feedproc
nagios    6058  0.0  0.0 106096  1136 ?        Ss   20:22   0:00 /bin/sh -c /usr/bin/php -q /usr/local/nagiosxi/cron/eventman.php > /usr/local/nagiosxi/var/eventman
nagios    6060  0.3  0.0 321140 23380 ?        S    20:22   0:00 /usr/bin/php -q /usr/local/nagiosxi/cron/feedproc.php
nagios    6061  0.0  0.0 106096  1136 ?        Ss   20:22   0:00 /bin/sh -c /usr/bin/php -q /usr/local/nagiosxi/cron/sysstat.php > /usr/local/nagiosxi/var/sysstat.l
nagios    6062  0.4  0.0 329372 32052 ?        S    20:22   0:00 /usr/bin/php -q /usr/local/nagiosxi/cron/eventman.php
nagios    6063  0.0  0.0 106096  1136 ?        Ss   20:22   0:00 /bin/sh -c /usr/bin/php -q /usr/local/nagiosxi/cron/perfdataproc.php > /usr/local/nagiosxi/var/perf
nagios    6065  0.0  0.0 106096  1136 ?        Ss   20:22   0:00 /bin/sh -c /usr/bin/php -q /usr/local/nagiosxi/cron/cmdsubsys.php > /usr/local/nagiosxi/var/cmdsubs
nagios    6066  0.3  0.0 321072 24012 ?        S    20:22   0:00 /usr/bin/php -q /usr/local/nagiosxi/cron/sysstat.php
nagios    6067  0.3  0.0 321312 23888 ?        S    20:22   0:00 /usr/bin/php -q /usr/local/nagiosxi/cron/perfdataproc.php
nagios    6068  0.4  0.0 320908 23564 ?        S    20:22   0:00 /usr/bin/php -q /usr/local/nagiosxi/cron/cmdsubsys.php
postgres  6074  0.0  0.0 217296  5272 ?        Ss   20:22   0:00 postgres: nagiosxi nagiosxi [local] idle
postgres  6076  0.0  0.0 217296  5288 ?        Ss   20:22   0:00 postgres: nagiosxi nagiosxi [local] idle
postgres  6081  0.0  0.0 217296  5260 ?        Ss   20:22   0:00 postgres: nagiosxi nagiosxi [local] idle
postgres  6083  0.0  0.0 217372  5812 ?        Ss   20:22   0:00 postgres: nagiosxi nagiosxi [local] idle
postgres  6090  0.0  0.0 217372  5592 ?        Ss   20:22   0:00 postgres: nagiosxi nagiosxi [local] idle
nagios    6530  0.0  0.0  49852  1624 ?        S    20:22   0:00 /usr/local/bin/mod_gearman2_worker -d --config=/etc/mod_gearman/mod_gearman_worker.conf --pidfile=/
nagios    6601 18.0  0.0 192412 21596 ?        R    20:22   0:00 /usr/bin/perl -w /usr/local/nagios/libexec/check_esx3.pl -H 10.133.134.71 -f /usr/local/nagiosxi/et
nagios    6602  0.0  0.0  37244  2308 ?        S    20:22   0:00 /usr/local/nagios/libexec/check_nrpe -H 10.133.134.21 -u -t 30 -c winfsio
nagios    6603  0.0  0.0  37244  2304 ?        S    20:22   0:00 /usr/local/nagios/libexec/check_nrpe -H 10.133.134.93 -u -t 30 -c check_vastool
nagios    6604  0.0  0.0  37244  2312 ?        S    20:22   0:00 /usr/local/nagios/libexec/check_nrpe -H 10.133.134.96 -u -t 30 -c check_cpu_queue -a 4 6
nagios    6605  0.0  0.0  37244  2312 ?        S    20:22   0:00 /usr/local/nagios/libexec/check_nrpe -H 10.133.134.95 -u -t 30 -c check_cpu_queue -a 4 6
nagios    6606  0.0  0.0  37244  2312 ?        S    20:22   0:00 /usr/local/nagios/libexec/check_nrpe -H 10.133.134.47 -u -t 30 -c check_cpu_queue -a 4 6
nagios    6607  0.0  0.0  37244  2308 ?        S    20:22   0:00 /usr/local/nagios/libexec/check_nrpe -H 10.133.134.34 -u -t 30 -c check_cpu_queue -a 4 6
nagios    6608 11.0  0.0 145016 10980 ?        S    20:22   0:00 /usr/bin/perl /usr/local/nagios/libexec/check_hp -H 10.133.133.57 -C sp1der -x cpqFcaHostCntlrStatu
nagios    6609 21.0  0.0 196452 25916 ?        R    20:22   0:00 /usr/bin/perl -w /usr/local/nagios/libexec/check_esx3.pl -H 10.133.134.72 -f /usr/local/nagiosxi/et
nagios    6610 23.0  0.0 200152 29720 ?        R    20:22   0:00 /usr/bin/perl -w /usr/local/nagios/libexec/check_esx3.pl -H 10.133.134.71 -f /usr/local/nagiosxi/et
nagios    6611 12.0  0.0 145016 10980 ?        S    20:22   0:00 /usr/bin/perl /usr/local/nagios/libexec/check_hp -H 10.133.133.12 -C sp1der -x cpqFcaHostCntlrStatu
nagios    6612 22.0  0.0 199888 29388 ?        R    20:22   0:00 /usr/bin/perl -w /usr/local/nagios/libexec/check_esx3.pl -H 10.133.134.71 -f /usr/local/nagiosxi/et
nagios    6613 22.0  0.0 197672 27140 ?        R    20:22   0:00 /usr/bin/perl -w /usr/local/nagios/libexec/check_esx3.pl -H 10.133.134.72 -f /usr/local/nagiosxi/et
nagios    6614 20.0  0.0 192952 22208 ?        R    20:22   0:00 /usr/bin/perl -w /usr/local/nagios/libexec/check_esx3.pl -H 10.133.134.71 -f /usr/local/nagiosxi/et
nagios    6615 12.0  0.0 145016 10976 ?        R    20:22   0:00 /usr/bin/perl /usr/local/nagios/libexec/check_hp -H 10.133.133.51 -C sp1der -x cpqFcaHostCntlrStatu
nagios    6616 12.0  0.0 145016 10980 ?        S    20:22   0:00 /usr/bin/perl /usr/local/nagios/libexec/check_hp -H 10.133.133.28 -C sp1der -x cpqFcaHostCntlrStatu
nagios    6617 11.0  0.0 145016 10980 ?        S    20:22   0:00 /usr/bin/perl /usr/local/nagios/libexec/check_hp -H 10.133.133.60 -C sp1der -x cpqFcaHostCntlrStatu
nagios    6620  0.0  0.0 106228  1332 ?        S    20:22   0:00 /bin/bash /usr/local/nagios/libexec/check_3par_perf 10.133.133.79 nagios check_port_fc
nagios    6621  0.0  0.0 106228  1332 ?        S    20:22   0:00 /bin/bash /usr/local/nagios/libexec/check_3par_perf 10.133.133.92 nagios check_ld
root      6668  0.0  0.0 110240  1120 pts/0    R+   20:22   0:00 ps aux
nagios    6994  0.0  0.0  51044  2196 ?        S    19:39   0:00 /usr/local/bin/mod_gearman2_worker -d --config=/etc/mod_gearman/mod_gearman_worker.conf --pidfile=/
nagios    8102  0.0  0.0  51544  2152 ?        S    19:41   0:00 /usr/local/bin/mod_gearman2_worker -d --config=/etc/mod_gearman/mod_gearman_worker.conf --pidfile=/
nagios    8717  0.0  0.0  51540  2516 ?        S    18:58   0:00 /usr/local/bin/mod_gearman2_worker -d --config=/etc/mod_gearman/mod_gearman_worker.conf --pidfile=/
nagios    9404  0.0  0.0  51036  2516 ?        S    18:59   0:00 /usr/local/bin/mod_gearman2_worker -d --config=/etc/mod_gearman/mod_gearman_worker.conf --pidfile=/
apache   10302  0.5  0.1 453140 33912 ?        S    16:05   1:29 /usr/sbin/httpd
postgres 10369  0.0  0.0 217400  5892 ?        Ss   16:06   0:02 postgres: nagiosxi nagiosxi [local] idle
nagios   10466  0.0  0.0  51564  2132 ?        S    19:44   0:00 /usr/local/bin/mod_gearman2_worker -d --config=/etc/mod_gearman/mod_gearman_worker.conf --pidfile=/
root     11740  0.0  0.1 237024 38304 ?        Ssl  Dec23   0:10 /opt/puppet/bin/ruby /opt/puppet/bin/puppet agent
root     14337  0.0  0.0 102500  4804 ?        Ss   16:11   0:00 sshd: root@pts/0
root     14403  0.0  0.0 108432  1884 pts/0    Ss   16:11   0:00 -bash
root     14500  0.0  0.0 102500  4804 ?        Ss   16:11   0:00 sshd: root@pts/1
root     14678  0.0  0.0 108432  1908 pts/1    Ss+  16:12   0:00 -bash
nagios   15777  0.0  0.0  51028  2076 ?        S    19:51   0:00 /usr/local/bin/mod_gearman2_worker -d --config=/etc/mod_gearman/mod_gearman_worker.conf --pidfile=/
apache   15842  0.5  0.1 455396 36180 ?        S    16:57   1:08 /usr/sbin/httpd
postgres 15850  0.0  0.0 217448  5872 ?        Ss   16:57   0:01 postgres: nagiosxi nagiosxi [local] idle
postgres 16031  0.0  0.0 215972  5076 ?        S    Dec23   0:49 /usr/bin/postmaster -p 5432 -D /var/lib/pgsql/data
postgres 16084  0.0  0.0 178984  1132 ?        Ss   Dec23   0:16 postgres: logger process
postgres 16086  0.0  0.0 216088  4824 ?        Ss   Dec23   2:46 postgres: writer process
postgres 16087  0.0  0.0 215972  1400 ?        Ss   Dec23   1:11 postgres: wal writer process
postgres 16088  0.0  0.0 216256  1588 ?        Ss   Dec23   0:29 postgres: autovacuum launcher process
postgres 16089  0.0  0.0 179248  1304 ?        Ss   Dec23   0:59 postgres: stats collector process
gearmand 16457  0.3  0.0 475044  5320 ?        Ssl  Dec23  30:39 /usr/local/sbin/gearmand -d --log-file /var/log/gearman/gearmand.log
root     16725  0.0  0.0 108204  1456 ?        S    Dec23   0:00 /bin/sh /usr/bin/mysqld_safe --datadir=/var/lib/mysql --socket=/var/lib/mysql/mysql.sock --pid-file
mysql    16827  0.6  0.1 2229500 57064 ?       Sl   Dec23  62:11 /usr/libexec/mysqld --basedir=/usr --datadir=/var/lib/mysql --user=mysql --log-error=/var/log/mysql
nagios   16988  0.0  0.0 368888   964 ?        S    Dec23   1:09 /usr/local/nagios/bin/npcd -d -f /usr/local/nagios/etc/pnp/npcd.cfg
nagios   17122  0.6  3.6 1290632 1212748 ?     Ssl  Dec23  57:50 /usr/local/nagios/bin/nagios -d /usr/local/nagios/etc/nagios.cfg
nagios   17124  0.0  0.0  10020   924 ?        S    Dec23   0:00 /usr/local/nagios/bin/nagios --worker /usr/local/nagios/var/rw/nagios.qh
nagios   17125  0.0  0.0  10016   932 ?        S    Dec23   0:00 /usr/local/nagios/bin/nagios --worker /usr/local/nagios/var/rw/nagios.qh
nagios   17126  0.0  0.0  10016   928 ?        S    Dec23   0:00 /usr/local/nagios/bin/nagios --worker /usr/local/nagios/var/rw/nagios.qh
nagios   17127  0.0  0.0  10016   932 ?        S    Dec23   0:00 /usr/local/nagios/bin/nagios --worker /usr/local/nagios/var/rw/nagios.qh
nagios   17128  0.0  0.0  10016   928 ?        S    Dec23   0:00 /usr/local/nagios/bin/nagios --worker /usr/local/nagios/var/rw/nagios.qh
nagios   17129  0.0  0.0  10016   928 ?        S    Dec23   0:00 /usr/local/nagios/bin/nagios --worker /usr/local/nagios/var/rw/nagios.qh
apache   17137  0.5  0.1 454976 35804 ?        S    15:34   1:39 /usr/sbin/httpd
nagios   17140  0.0  0.0  66124  4020 ?        S    Dec23   0:33 /usr/local/nagios/bin/nagios -d /usr/local/nagios/etc/nagios.cfg
nagios   17146  0.0  0.0  47748  1144 ?        Ss   Dec23   0:37 /usr/local/bin/mod_gearman2_worker -d --config=/etc/mod_gearman/mod_gearman_worker.conf --pidfile=/
postgres 17378  0.0  0.0 217448  5892 ?        Ss   15:34   0:02 postgres: nagiosxi nagiosxi [local] idle
root     17695  0.0  0.1 461320 39792 ?        SNl  Dec27   1:31 /opt/puppet/bin/ruby /opt/puppet/sbin/mcollectived --pid=/var/run/pe-mcollective.pid --config=/etc/
nagios   18231  0.0  0.0  51536  2420 ?        S    19:11   0:00 /usr/local/bin/mod_gearman2_worker -d --config=/etc/mod_gearman/mod_gearman_worker.conf --pidfile=/
apache   18446  0.5  0.1 457008 37536 ?        S    14:55   1:45 /usr/sbin/httpd
postgres 19153  0.0  0.0 217448  5904 ?        Ss   14:56   0:02 postgres: nagiosxi nagiosxi [local] idle
nagios   19189  0.0  0.0  51024  2388 ?        S    19:12   0:00 /usr/local/bin/mod_gearman2_worker -d --config=/etc/mod_gearman/mod_gearman_worker.conf --pidfile=/
nagios   19325  0.0  0.0  51536  2024 ?        S    19:56   0:00 /usr/local/bin/mod_gearman2_worker -d --config=/etc/mod_gearman/mod_gearman_worker.conf --pidfile=/
nagios   19597  0.0  0.0  51620  2404 ?        S    19:13   0:00 /usr/local/bin/mod_gearman2_worker -d --config=/etc/mod_gearman/mod_gearman_worker.conf --pidfile=/
nagios   20559  0.0  0.0  51092  2392 ?        S    19:14   0:00 /usr/local/bin/mod_gearman2_worker -d --config=/etc/mod_gearman/mod_gearman_worker.conf --pidfile=/
nagios   20667  0.0  0.0  51540  2368 ?        S    19:14   0:00 /usr/local/bin/mod_gearman2_worker -d --config=/etc/mod_gearman/mod_gearman_worker.conf --pidfile=/
nagios   21746  0.0  0.0  51536  2336 ?        S    19:15   0:00 /usr/local/bin/mod_gearman2_worker -d --config=/etc/mod_gearman/mod_gearman_worker.conf --pidfile=/
nagios   22051  0.0  0.0  50520  2336 ?        S    19:15   0:00 /usr/local/bin/mod_gearman2_worker -d --config=/etc/mod_gearman/mod_gearman_worker.conf --pidfile=/
nagios   22456  0.0  0.0  51540  2348 ?        S    19:16   0:00 /usr/local/bin/mod_gearman2_worker -d --config=/etc/mod_gearman/mod_gearman_worker.conf --pidfile=/
nagios   23083  0.0  0.0  51024  2324 ?        S    19:17   0:00 /usr/local/bin/mod_gearman2_worker -d --config=/etc/mod_gearman/mod_gearman_worker.conf --pidfile=/
nagios   23280  0.0  0.0  50508  1980 ?        S    20:01   0:00 /usr/local/bin/mod_gearman2_worker -d --config=/etc/mod_gearman/mod_gearman_worker.conf --pidfile=/
nagios   23497  0.0  0.0  51540  2356 ?        S    19:17   0:00 /usr/local/bin/mod_gearman2_worker -d --config=/etc/mod_gearman/mod_gearman_worker.conf --pidfile=/
nagios   24412  0.0  0.0  51536  2880 ?        S    17:51   0:01 /usr/local/bin/mod_gearman2_worker -d --config=/etc/mod_gearman/mod_gearman_worker.conf --pidfile=/
apache   24590  0.5  0.1 453332 34352 ?        S    15:03   1:46 /usr/sbin/httpd
root     24689  0.0  0.0 339788 20172 ?        Ss   Dec23   0:20 /usr/sbin/httpd
nagios   24905  0.0  0.0  51536  2664 ?        S    18:36   0:00 /usr/local/bin/mod_gearman2_worker -d --config=/etc/mod_gearman/mod_gearman_worker.conf --pidfile=/
postgres 24985  0.0  0.0 217964  6580 ?        Ss   15:03   0:02 postgres: nagiosxi nagiosxi [local] idle
nagios   28218  0.0  0.0  51044  2904 ?        S    17:57   0:01 /usr/local/bin/mod_gearman2_worker -d --config=/etc/mod_gearman/mod_gearman_worker.conf --pidfile=/
nagios   28383  0.0  0.0  51548  2336 ?        S    19:24   0:00 /usr/local/bin/mod_gearman2_worker -d --config=/etc/mod_gearman/mod_gearman_worker.conf --pidfile=/
root     28399  0.0  0.0  21716  1004 ?        Ss   Dec23   0:07 xinetd -stayalive -pidfile /var/run/xinetd.pid
nagios   29504  0.0  0.0  51032  2320 ?        S    19:26   0:00 /usr/local/bin/mod_gearman2_worker -d --config=/etc/mod_gearman/mod_gearman_worker.conf --pidfile=/
nagios   29841  0.0  0.0  49824   620 ?        Ss   Dec23   0:00 /usr/local/nagios/bin/ndo2db -c /usr/local/nagios/etc/ndo2db.cfg
nagios   30044  0.0  0.0  51028  2888 ?        S    17:59   0:01 /usr/local/bin/mod_gearman2_worker -d --config=/etc/mod_gearman/mod_gearman_worker.conf --pidfile=/
nagios   30055  0.0  0.0  49824  1400 ?        S    Dec23   1:39 /usr/local/nagios/bin/ndo2db -c /usr/local/nagios/etc/ndo2db.cfg
nagios   30056  0.1  0.0  50804  2488 ?        S    Dec23   9:30 /usr/local/nagios/bin/ndo2db -c /usr/local/nagios/etc/ndo2db.cfg
nagios   30367  0.0  0.0  49992  1964 ?        S    20:11   0:00 /usr/local/bin/mod_gearman2_worker -d --config=/etc/mod_gearman/mod_gearman_worker.conf --pidfile=/
nagios   30556  0.0  0.0  51032  2248 ?        S    19:27   0:00 /usr/local/bin/mod_gearman2_worker -d --config=/etc/mod_gearman/mod_gearman_worker.conf --pidfile=/
nagios   30779  0.0  0.0  51032  2900 ?        S    18:00   0:01 /usr/local/bin/mod_gearman2_worker -d --config=/etc/mod_gearman/mod_gearman_worker.conf --pidfile=/
apache   31027  0.5  0.1 457016 37744 ?        S    18:00   0:44 /usr/sbin/httpd
postgres 31101  0.0  0.0 217400  5888 ?        Ss   18:00   0:01 postgres: nagiosxi nagiosxi [local] idle
apache   31113  0.3  0.1 456904 37788 ?        S    Dec28   5:57 /usr/sbin/httpd
root     31130  0.0  0.0 197872  6392 ?        Ss   Dec23   0:10 /usr/sbin/snmptrapd -Lsd -On -p /var/run/snmptrapd.pid
postgres 31193  0.0  0.0 217916  6888 ?        Ss   Dec28   0:08 postgres: nagiosxi nagiosxi [local] idle
root     31362  0.0  0.0 167744 15008 ?        Ss   Dec23   0:07 /usr/bin/perl /usr/local/sbin/snmptt --daemon
snmptt   31363  0.0  0.0 172020 16264 ?        Ss   Dec23   0:42 /usr/bin/perl /usr/local/sbin/snmptt --daemon
nagios   31969  0.0  0.0  51544  2220 ?        S    19:29   0:00 /usr/local/bin/mod_gearman2_worker -d --config=/etc/mod_gearman/mod_gearman_worker.conf --pidfile=/
nagios   32663  0.0  0.0  51028  2532 ?        S    18:46   0:00 /usr/local/bin/mod_gearman2_worker -d --config=/etc/mod_gearman/mod_gearman_worker.conf --pidfile=/
ssax
Dreams In Code
Posts: 7682
Joined: Wed Feb 11, 2015 12:54 pm

Re: Nagios server memory use

Post by ssax »

You have 36 gearman worker processes running. Is that how many you have specified in your gearman config?
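
For reference, a count like that can be taken straight from the process list. A small sketch, with count_workers as a made-up helper name; the [m] bracket keeps the grep process itself from matching when the input comes from a live ps.

```shell
#!/bin/bash
# Count mod_gearman2_worker lines in ps output read from stdin.
count_workers() {
    grep -c '[m]od_gearman2_worker'
}

# Usage: ps aux | count_workers
```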
gormank
Posts: 1114
Joined: Tue Dec 02, 2014 12:00 pm

Re: Nagios server memory use

Post by gormank »

Looks like the min/max are 30 and 50.
Too many?

# grep -v ^# /etc/mod_gearman/mod_gearman_worker.conf | sort -u
debug=0
enable_embedded_perl=on
encryption=yes
eventhandler=yes
fork_on_exec=no
hosts=yes
idle-timeout=30
job_timeout=60
key=gbJJBpWfBC8ZVTAO4i
load_limit1=0
load_limit15=0
load_limit5=0
logfile=/var/log/mod_gearman_worker/mod_gearman_worker.log
max-jobs=1000
max-worker=50
min-worker=30
p1_file=/usr/local/share/mod_gearman2/mod_gearman_p1.pl
server=localhost:4730
services=yes
show_error_output=yes
spawn-rate=1
use_embedded_perl_implicitly=off
use_perl_cache=on
workaround_rc_25=off
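
With min-worker=30 and max-worker=50, the 36 workers counted earlier sit inside the configured range. A sketch that reads the two limits from the config and checks a count against them; it assumes plain key=value lines with no spaces around the equals sign, as in the listing above, and both function names are hypothetical.

```shell
#!/bin/bash
# Print "min max" worker limits from a mod_gearman worker config file.
# Assumes key=value lines; commented-out lines do not match because of the leading "#".
worker_limits() {
    local conf=${1:-/etc/mod_gearman/mod_gearman_worker.conf}
    awk -F= '$1 == "min-worker" { min = $2 }
             $1 == "max-worker" { max = $2 }
             END { print min, max }' "$conf"
}

# Report whether a running worker count falls inside the given limits.
check_worker_count() {
    local count=$1 min=$2 max=$3
    if [ "$count" -ge "$min" ] && [ "$count" -le "$max" ]; then
        echo "within limits"
    else
        echo "outside limits"
    fi
}
```

For example, `check_worker_count 36 $(worker_limits)` compares the count from the ps listing with whatever the config on that server specifies.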