Re: Nagios server memory use

Posted: Wed Dec 30, 2015 5:36 pm
by gormank
I thought the workers, other than the parent, were more short-lived, but maybe not.

Code:

# date
Wed Dec 30 22:32:49 UTC 2015

# ps -ef | grep 2_worker | grep -v grep | awk '{printf "%-6s %-6s %-6s\n", $5, $2, $3}' | sort
20:09  27750  17146
20:19  3037   17146
20:21  4114   17146
20:22  5182   17146
20:27  8688   17146
20:31  11515  17146
20:32  12587  17146
20:36  15103  17146
21:03  2922   17146
21:04  3584   17146
21:09  7147   17146
21:14  10994  17146
21:16  12583  17146
21:21  16557  17146
21:22  16828  17146
21:22  17266  17146
21:24  18300  17146
21:24  18666  17146
21:25  19074  17146
21:26  19792  17146
21:27  20821  17146
21:29  22212  17146
21:31  23669  17146
21:32  24729  17146
21:41  30764  17146
21:46  2538   17146
21:51  6057   17146
22:01  13556  17146
22:11  20644  17146
22:16  24898  17146
22:25  31628  17146
22:31  3666   17146
Dec23  17146  1
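
To make the age spread in a listing like this easier to read, the lines can be grouped by start hour. This is only a sketch: `workers_by_start_hour` is a hypothetical helper name, and it assumes the same three-column "STIME PID PPID" format that the awk command above prints.

```shell
# Sketch: count processes per start hour from "STIME PID PPID" lines on stdin.
# workers_by_start_hour is a hypothetical helper, not part of mod_gearman.
workers_by_start_hour() {
  awk 'NF < 3 { next }             # skip blank or malformed lines
  {
    if (index($1, ":")) {          # started today: STIME looks like HH:MM
      split($1, t, ":")
      key = t[1] ":xx"
    } else {                       # started on an earlier day: STIME looks like Dec23
      key = $1
    }
    count[key]++
  }
  END { for (k in count) printf "%s %d\n", k, count[k] }' | LC_ALL=C sort
}
```

It can be appended to the pipeline in place of the final `sort`, for example `ps -ef | grep 2_worker | grep -v grep | awk '{print $5, $2, $3}' | workers_by_start_hour`. Many workers in hours well before the current one would suggest they are not being recycled quickly.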

Re: Nagios server memory use

Posted: Mon Jan 04, 2016 4:23 pm
by ssax
The default for min-worker is 5. Can you try dropping it down to 5 and see whether you really need the count that high? Max is still 50, so it will use up to 50 when needed, but if the extra workers are not required I wouldn't start them, as they will just consume resources.
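
Before changing anything, it may help to confirm what the worker configuration currently sets. The helper below is a minimal sketch: `get_conf_value` is a hypothetical name, and the config path shown in the usage note is an assumption that should be adjusted to wherever your worker config actually lives.

```shell
# Sketch: print the value of a key=value option from a mod_gearman-style config file.
# get_conf_value is a hypothetical helper; it returns the first uncommented match.
get_conf_value() {
  # $1 = config file, $2 = option name (for example min-worker)
  awk -F= -v key="$2" '
    /^[ \t]*#/ { next }                       # ignore commented-out lines
    { k = $1; gsub(/[ \t]/, "", k) }          # option name without whitespace
    k == key {
      v = $2
      gsub(/^[ \t]+|[ \t]+$/, "", v)          # trim the value
      print v
      exit
    }
  ' "$1"
}
```

For example, `get_conf_value /etc/mod_gearman/mod_gearman_worker.conf min-worker` (path assumed) would print the configured minimum, and the same call with `max-worker` the maximum.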

Re: Nagios server memory use

Posted: Tue Jan 05, 2016 2:29 pm
by gormank
Ok,
I've started the ball rolling on reducing the minimum to 5 and adding RAM. It'll take until next week to get that done.
The live systems seem to be at ~25G and holding, but more monitoring will be added.
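
As one possible piece of that extra monitoring, the resident memory of the worker processes can be totalled from `ps` output. This is a sketch under assumptions: `sum_rss_mib` is a hypothetical helper, it expects "RSS_KIB COMMAND" lines on stdin, and RSS counts shared pages once per process, so the total overstates real usage somewhat.

```shell
# Sketch: total resident memory (MiB) of processes whose command line contains a substring.
# sum_rss_mib is a hypothetical helper; stdin is "RSS_KIB COMMAND..." lines.
sum_rss_mib() {
  # $1 = substring to look for, for example 2_worker
  awk -v pat="$1" '
    index($0, pat) { total += $1 }            # RSS is reported in KiB
    END { printf "%d\n", total / 1024 }
  '
}
```

A typical call would be `ps -e -o rss= -o args= | sum_rss_mib 2_worker`, which prints a single number that can be logged or graphed.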

Re: Nagios server memory use

Posted: Wed Jan 06, 2016 10:21 am
by ssax
Are you running gearman2? What is the output of this command:

Code:

rpm -qa | grep gearman

Re: Nagios server memory use

Posted: Wed Jan 06, 2016 10:50 am
by ssax
In addition to my post above, are you seeing anything in your nagios.log around that time?

Also, what is the output of this:

Code:

grep "Core Worker" /usr/local/nagios/var/nagios.log

Re: Nagios server memory use

Posted: Thu Jan 07, 2016 3:06 pm
by gormank
Both commands below produce no result, but there are 36 gearman2 processes running.

grep "Core Worker" /usr/local/nagios/var/nagios.log
rpm -qa | grep -i gearman
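
When `rpm -qa` shows nothing but the processes are clearly running, one way to find out where the binary came from is to resolve the executable behind a PID. The sketch below assumes a Linux `/proc` filesystem; `exe_of_pid` is a hypothetical helper, and reading another user's `/proc/PID/exe` usually needs root.

```shell
# Sketch: print the path of the executable behind a PID.
# exe_of_pid is a hypothetical helper; the second argument exists only so the
# proc root can be overridden (it defaults to /proc).
exe_of_pid() {
  # $1 = PID, $2 = optional proc filesystem root
  readlink "${2:-/proc}/$1/exe"
}
```

For example, `rpm -qf "$(exe_of_pid 17146)"` would report the owning package, or say that the file is not owned by any package, which would point to a source or tarball install.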

What time is "that time"?

Re: Nagios server memory use

Posted: Fri Jan 08, 2016 10:26 am
by ssax
I meant around the time memory usage peaks, or shortly before that.
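
To pull just that window out of the log, the entries can be filtered by their timestamp, since each nagios.log line starts with an epoch time in square brackets. This is a sketch: `log_between` is a hypothetical helper, and the dates in the usage note are placeholders.

```shell
# Sketch: print nagios.log lines whose leading [epoch] falls inside a time window.
# log_between is a hypothetical helper; stdin is the log content.
log_between() {
  # $1 = start epoch, $2 = end epoch (both inclusive)
  awk -v s="$1" -v e="$2" '{
    ts = substr($1, 2) + 0                    # "[1451514769]" becomes 1451514769
    if (ts >= s + 0 && ts <= e + 0) print
  }'
}
```

With GNU date, something like `log_between "$(date -d '2015-12-30 20:00' +%s)" "$(date -d '2015-12-30 23:00' +%s)" < /usr/local/nagios/var/nagios.log` would show only the entries from that evening.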

We have not tested XI with gearman2, and I'm not sure whether these problems are related. Did you upgrade gearman recently? Has this run fine before with gearman2?

We recommend using the version from our guide:

Code:

https://assets.nagios.com/downloads/nagiosxi/docs/Integrating_Mod_Gearman_with_Nagios_XI.pdf

Re: Nagios server memory use

Posted: Fri Jan 08, 2016 11:09 pm
by gormank
I don't see anything of interest in the log.
No updates have been done.

Re: Nagios server memory use

Posted: Mon Jan 11, 2016 2:44 pm
by bheden
Recently, we've identified some memory leak issues in the mod_gearman Nagios event broker module, along with a possible solution. I'd like to see what happens if you upgrade to a newer version using the following instructions. Please make sure you have a working backup of your server in case of failure.

Code:

cd /tmp
# Remove the existing gearman packages
yum remove libgearman-devel libgearman gearmand mod_gearman
# Download the v2.1.1 RPMs into a scratch directory
mkdir gearman_install
cd gearman_install/
wget http://mod-gearman.org/download/v2.1.1/rhel6/x86_64/gearmand-0.33-2.rhel6.x86_64.rpm
wget http://mod-gearman.org/download/v2.1.1/rhel6/x86_64/gearmand-devel-0.33-2.rhel6.x86_64.rpm
wget http://mod-gearman.org/download/v2.1.1/rhel6/x86_64/gearmand-server-0.33-2.rhel6.x86_64.rpm
wget http://mod-gearman.org/download/v2.1.1/rhel6/x86_64/mod_gearman2-2.1.1-1.rhel6.x86_64.rpm
# Install only the downloaded RPMs
yum --nogpgcheck localinstall *.rpm
# Comment out any existing mod_gearman broker module line, then add the mod_gearman2 one
sed -i 's/\(^broker_module=.*mod_gearman.*\)/#\1/' /usr/local/nagios/etc/nagios.cfg
echo "broker_module=/usr/lib64/mod_gearman2/mod_gearman2.o config=/etc/mod_gearman/mod_gearman_neb.conf eventhandler=no" >> /usr/local/nagios/etc/nagios.cfg
# Stop everything, then start gearmand, the workers, and Nagios in that order
service nagios stop
service mod_gearman_worker stop
service gearmand stop
service gearmand start
service mod_gearman_worker start
service nagios start

Please inform us if this resolves your issue. Thank you.
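
Since the steps above edit nagios.cfg in place, a quick check of which broker modules are still active afterwards may be worthwhile. This is a minimal sketch; `active_broker_modules` is a hypothetical helper name.

```shell
# Sketch: list the broker_module lines in nagios.cfg that are not commented out.
# active_broker_modules is a hypothetical helper.
active_broker_modules() {
  # $1 = path to nagios.cfg
  grep '^broker_module=' "$1"
}
```

Running `active_broker_modules /usr/local/nagios/etc/nagios.cfg` after the sed and echo steps should show a single mod_gearman2 entry, alongside any unrelated modules you already load. Taking a copy first, for example `cp -p /usr/local/nagios/etc/nagios.cfg /usr/local/nagios/etc/nagios.cfg.bak`, makes it easy to roll back.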

Re: Nagios server memory use

Posted: Wed Jan 20, 2016 12:58 pm
by gormank
I can't just update stuff on these things. They're in a datacenter w/ strict rules.
I reduced the minimum gearman workers and added RAM, so they're at 64G. Memory use is steadily increasing, but hopefully it'll stop at 25G as it has in the past.
The number of workers is at 35, which is the same as before I reduced the minimum.

I'll keep an eye on it.
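
One way to tell "still climbing" from "levelling off" is to record memory samples over time and look at the growth rate. The sketch below assumes a hypothetical samples file with one "EPOCH MIB" pair per line, oldest first; `growth_mib_per_hour` is also a hypothetical name.

```shell
# Sketch: average memory growth in MiB per hour between the first and last sample.
# growth_mib_per_hour is a hypothetical helper; stdin is "EPOCH MIB" lines, oldest first.
growth_mib_per_hour() {
  awk 'NR == 1 { t0 = $1; m0 = $2 }           # remember the first sample
       { t1 = $1; m1 = $2 }                   # keep overwriting with the latest sample
       END {
         if (NR < 2 || t1 == t0) { print "n/a"; exit }
         printf "%.1f\n", (m1 - m0) * 3600 / (t1 - t0)
       }'
}
```

Fed with, say, the last day of samples, a value near zero would suggest usage has plateaued, while a steady positive value would suggest it is still growing.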