
high memory usage

Posted: Fri Jan 29, 2021 10:55 am
by kendallchenoweth
I have several instances of Nagios XI at version 2014R2.0 (yes, I know it's old), and we're seeing very high memory utilization.

The application responds slowly for users; a requested action often occurs only after a delay. When we increase the memory, the nagios process simply uses more of it. On one instance with 24G of RAM, almost all of it is in use, and the Nagios process consistently uses between 87% and 95% of memory. I don't see any problem with I/O wait, and the load average is fairly consistently around 2 with 420 tasks. There are no zombie processes, and the CPU is fairly consistently 75% idle.
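
For reference, the per-process memory figures can be checked directly. This is a minimal Python sketch, assuming a Linux host where `/proc/<pid>/status` exposes `Name` and `VmRSS`. The function names and the default process name `nagios` are illustrative assumptions, not anything shipped with Nagios XI.

```python
import re
from pathlib import Path


def parse_vmrss_kb(status_text):
    """Return VmRSS in kB from the text of /proc/<pid>/status, or None if absent."""
    match = re.search(r"^VmRSS:\s+(\d+)\s+kB", status_text, re.MULTILINE)
    return int(match.group(1)) if match else None


def parse_name(status_text):
    """Return the process name from the text of /proc/<pid>/status, or None."""
    match = re.search(r"^Name:\s+(.+)$", status_text, re.MULTILINE)
    return match.group(1).strip() if match else None


def rss_by_pid(process_name="nagios", proc_root="/proc"):
    """Map PID -> resident set size (kB) for processes with the given name.

    process_name="nagios" is an assumption; adjust it to the name shown by ps.
    """
    results = {}
    for entry in Path(proc_root).iterdir():
        if not entry.name.isdigit():
            continue  # not a process directory
        try:
            text = (entry / "status").read_text()
        except OSError:
            continue  # process exited or is not readable
        if parse_name(text) == process_name:
            rss = parse_vmrss_kb(text)
            if rss is not None:
                results[int(entry.name)] = rss
    return results
```

Calling `rss_by_pid()` and sorting the result by value would show whether one nagios process holds most of the memory or whether it is spread across many worker processes.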

Grepping objects.cache shows how much is being monitored on this host. Our check interval is typically 5 minutes.

7440 occurrences of "define service {"
516 occurrences of "define host {"
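
The same counts can be reproduced with a short script. This is a minimal Python sketch that tallies every `define <type> {` line. The default path `/usr/local/nagios/var/objects.cache` is an assumption based on the paths in the log lines, and the function names are illustrative.

```python
import re
from collections import Counter

# Matches lines such as "define service {" and captures the object type.
DEFINE_RE = re.compile(r"^\s*define\s+(\w+)\s*\{")


def count_definitions(lines):
    """Count object definitions per type in an iterable of text lines."""
    counts = Counter()
    for line in lines:
        match = DEFINE_RE.match(line)
        if match:
            counts[match.group(1)] += 1
    return counts


def count_definitions_in_file(path="/usr/local/nagios/var/objects.cache"):
    """Count definitions in a file; the default path is an assumption."""
    with open(path, encoding="utf-8", errors="replace") as handle:
        return count_definitions(handle)
```

Dividing the service count by the check interval gives a rough scheduling rate: 7440 services every 5 minutes is about 25 checks per second.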

In nagios.log, I'm getting a lot of these errors.

[1611069120] Warning: fork() in my_system_r() failed for command "/bin/mv /usr/local/nagios/var/service-perfdata /usr/local/nagios/var/spool/xidpe/1611069120.perfdata.service"
[1611936650] wproc: early_timeout=1; exited_ok=0; wait_status=0; error_code=62;
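
The bracketed number at the start of each line is a Unix epoch timestamp. To see when the fork() failures cluster, the lines can be filtered and converted to readable times. This is a minimal Python sketch; the function names are illustrative, and the match string is taken from the warning above.

```python
import re
from datetime import datetime, timezone

# A nagios.log line looks like "[<epoch seconds>] <message>".
LOG_LINE_RE = re.compile(r"^\[(\d+)\]\s+(.*)$")


def parse_line(line):
    """Return (UTC datetime, message) for a log line, or None if it does not match."""
    match = LOG_LINE_RE.match(line.strip())
    if not match:
        return None
    stamp = datetime.fromtimestamp(int(match.group(1)), tz=timezone.utc)
    return stamp, match.group(2)


def fork_failures(lines):
    """Return parsed entries whose message reports a failed fork() in my_system_r()."""
    hits = []
    for line in lines:
        parsed = parse_line(line)
        if parsed and "fork() in my_system_r() failed" in parsed[1]:
            hits.append(parsed)
    return hits
```

Comparing these timestamps with the periods of highest memory use may show whether the fork() failures coincide with the host running out of memory.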

Here is my php.ini (without comments)
[PHP]
engine = On
short_open_tag = Off
asp_tags = Off
precision = 14
y2k_compliance = On
output_buffering = 4096
zlib.output_compression = Off
implicit_flush = Off
unserialize_callback_func =
serialize_precision = 100
allow_call_time_pass_reference = Off
safe_mode = Off
safe_mode_gid = Off
safe_mode_include_dir =
safe_mode_exec_dir =
safe_mode_allowed_env_vars = PHP_
safe_mode_protected_env_vars = LD_LIBRARY_PATH
disable_functions =
disable_classes =
expose_php = Off
max_execution_time = 30
max_input_time = 60
memory_limit = 128M
error_reporting = E_ALL & ~E_DEPRECATED
display_errors = Off
display_startup_errors = Off
log_errors = On
log_errors_max_len = 1024
ignore_repeated_errors = Off
ignore_repeated_source = Off
report_memleaks = On
track_errors = Off
html_errors = Off
variables_order = "GPCS"
request_order = "GP"
register_globals = Off
register_long_arrays = Off
register_argc_argv = Off
auto_globals_jit = On
post_max_size = 8M
magic_quotes_gpc = Off
magic_quotes_runtime = Off
magic_quotes_sybase = Off
auto_prepend_file =
auto_append_file =
default_mimetype = "text/html"
doc_root =
user_dir =
enable_dl = Off
file_uploads = On
upload_max_filesize = 2M
allow_url_fopen = On
allow_url_include = Off
default_socket_timeout = 60

[Date]
date.timezone = America/New_York

[filter]

[iconv]

[intl]

[sqlite]

[sqlite3]

[Pcre]

[Pdo]

[Phar]

[Syslog]
define_syslog_variables = Off

[mail function]
SMTP = localhost
smtp_port = 25
sendmail_path = /usr/sbin/sendmail -t -i
mail.add_x_header = On

[SQL]
sql.safe_mode = Off

[ODBC]
odbc.allow_persistent = On
odbc.check_persistent = On
odbc.max_persistent = -1
odbc.max_links = -1
odbc.defaultlrl = 4096
odbc.defaultbinmode = 1

[MySQL]
mysql.allow_persistent = On
mysql.max_persistent = -1
mysql.max_links = -1
mysql.default_port =
mysql.default_socket =
mysql.default_host =
mysql.default_user =
mysql.default_password =
mysql.connect_timeout = 60
mysql.trace_mode = Off

[MySQLi]

mysqli.max_links = -1
mysqli.default_port = 3306
mysqli.default_socket =
mysqli.default_host =
mysqli.default_user =
mysqli.default_pw =
mysqli.reconnect = Off

[PostgresSQL]
pgsql.allow_persistent = On
pgsql.auto_reset_persistent = Off
pgsql.max_persistent = -1
pgsql.max_links = -1
pgsql.ignore_notice = 0
pgsql.log_notice = 0

[Sybase-CT]
sybct.allow_persistent = On
sybct.max_persistent = -1
sybct.max_links = -1
sybct.min_server_severity = 10
sybct.min_client_severity = 10

[bcmath]
bcmath.scale = 0

[browscap]

[Session]
session.save_handler = files
session.save_path = "/var/lib/php/session"
session.use_cookies = 1
session.use_only_cookies = 1
session.name = PHPSESSID
session.auto_start = 0
session.cookie_lifetime = 0
session.cookie_path = /
session.cookie_domain =
session.cookie_httponly =
session.serialize_handler = php
session.gc_probability = 1
session.gc_divisor = 1000
session.gc_maxlifetime = 1440
session.bug_compat_42 = Off
session.bug_compat_warn = Off
session.referer_check =
session.entropy_length = 0
session.entropy_file =
session.cache_limiter = nocache
session.cache_expire = 180
session.use_trans_sid = 0
session.hash_function = 0
session.hash_bits_per_character = 5
url_rewriter.tags = "a=href,area=href,frame=src,input=src,form=fakeentry"

[MSSQL]
mssql.allow_persistent = On
mssql.max_persistent = -1
mssql.max_links = -1
mssql.min_error_severity = 10
mssql.min_message_severity = 10
mssql.compatability_mode = Off
mssql.secure_connection = Off

[Assertion]

[COM]

[mbstring]

[gd]

[exif]

[Tidy]
tidy.clean_output = Off

[soap]
soap.wsdl_cache_enabled=1
soap.wsdl_cache_dir="/tmp"
soap.wsdl_cache_ttl=86400

[sysvshm]




Can you provide a scoping document suggesting how much memory this environment should require? Can you recommend any steps for investigating the root cause further?

Thanks!

Re: high memory usage

Posted: Mon Feb 01, 2021 2:01 pm
by benjaminsmith
Hi @kendallchenoweth,

Your Nagios XI license allows for 3 activations: production, test, and backup. I would recommend setting up a new test system and starting to work towards migrating to the current version, 5.8.1, as the version you are running is no longer supported.

https://assets.nagios.com/downloads/nag ... ios-XI.pdf

In the meantime, can you download the system profile so I can check the logs for you? Thanks, Benjamin

To send us your system profile:
1. Log in to the Nagios XI GUI using a web browser.
2. Click the "Admin" > "System Profile" menu.
3. Click the "Download Profile" button.

Re: high memory usage

Posted: Mon Feb 08, 2021 9:16 am
by kendallchenoweth
Here is the profile. Thanks!

I saw in the profile data that the nagios.nagios_logentries database table was in a crashed state and fixed that.

Moderator's Note: The profile has been shared with the support team but has been removed from the public forum.

Re: high memory usage

Posted: Mon Feb 08, 2021 6:12 pm
by benjaminsmith
Hi,
"I saw in the profile data that the nagios.nagios_logentries database table was in a crashed state and fixed that."
Did you notice any improvement? It should have helped. Almost all the errors in the system log are related to crashed database tables or database connection issues.

I'm also seeing these errors in nagios.log. Are checks and notifications working properly?
[alerta] curl_easy_perform() failed: Couldn't connect to server
[alerta] curl_easy_perform() failed: Couldn't connect to server
Can you post the tail output of both nagios.log and the database log since running the repair script? Thanks, Benjamin