process_check_result_file Memory Leak on Nagios 4

theghostman
Posts: 3
Joined: Mon May 11, 2015 5:21 am

process_check_result_file Memory Leak on Nagios 4

Post by theghostman »

Hi,

I have been investigating memory leaks in our monitoring system for the past few days. We are using Nagios 3.5.1, and we have developed our own web UI to create devices and view their status. Researching online led me to use valgrind to check for memory leaks (https://support.nagios.com/forum/viewto ... 34&t=20350), and I have found three issues so far:

1. log_service_event Memory Leak

Valgrind reports the following logs for log_service_event:

==21781== 95 bytes in 1 blocks are definitely lost in loss record 242 of 317
==21781== at 0x4C2CE8E: realloc (in /usr/lib/valgrind/vgpreload_memcheck-amd64-linux.so)
==21781== by 0x55D11FA: vasprintf (vasprintf.c:84)
==21781== by 0x55B3596: asprintf (asprintf.c:35)
==21781== by 0x43793F: log_service_event (logging.c:297)
==21781== by 0x419535: handle_async_service_check_result (checks.c:1246)
==21781== by 0x416F83: reap_check_results (checks.c:180)
==21781== by 0x4349E3: handle_timed_event (events.c:1361)
==21781== by 0x433E31: event_execution_loop (events.c:1071)
==21781== by 0x413435: main (nagios.c:858)

This is solved after applying the fix from: https://sourceforge.net/p/nagios/nagios ... 0e201575d/
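
The trace above ends in asprintf, which allocates the message buffer on the caller's behalf, so a "definitely lost" record under that frame usually means some code path returned without freeing the buffer. The following is a minimal C sketch of that ownership pattern, not the Nagios source and not the linked patch. The names format_alloc and log_event_sketch are hypothetical; format_alloc is a portable stand-in for asprintf so the example compiles without GNU extensions.

```c
#include <stdarg.h>
#include <stdio.h>
#include <stdlib.h>

/* Hypothetical, portable stand-in for asprintf(): formats into a freshly
 * malloc'd buffer. Returns the string length, or -1 on failure (in which
 * case *out is NULL). The caller owns *out and must free() it. */
int format_alloc(char **out, const char *fmt, ...)
{
    va_list ap, ap2;
    int len;

    *out = NULL;
    va_start(ap, fmt);
    va_copy(ap2, ap);
    len = vsnprintf(NULL, 0, fmt, ap);          /* first pass: measure */
    va_end(ap);
    if (len < 0) {
        va_end(ap2);
        return -1;
    }
    *out = malloc((size_t)len + 1);
    if (*out == NULL) {
        va_end(ap2);
        return -1;
    }
    vsnprintf(*out, (size_t)len + 1, fmt, ap2); /* second pass: write */
    va_end(ap2);
    return len;
}

/* Hypothetical logging helper: builds a message, writes it to a sink, and
 * releases the buffer before returning. Dropping the free() call is the
 * kind of mistake valgrind reports as "definitely lost" under an
 * asprintf frame. */
int log_event_sketch(FILE *sink, const char *host, const char *svc, int state)
{
    char *msg = NULL;
    int len = format_alloc(&msg, "SERVICE ALERT: %s;%s;%d", host, svc, state);

    if (len < 0)
        return -1;
    fprintf(sink, "%s\n", msg);
    free(msg);   /* the caller of the allocator owns the buffer */
    return len;
}
```

Running a program built around helpers like these under valgrind with full leak checking should show no loss records for the message buffer, since every successful allocation is paired with a free on the same path.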

2. MK Live Status Memory Leak

Valgrind reports the following for the MK Livestatus leak:

==5679== 36 bytes in 1 blocks are definitely lost in loss record 1,857 of 8,053
==5679== at 0x4C2AB80: malloc (in /usr/lib/valgrind/vgpreload_memcheck-amd64-linux.so)
==5679== by 0x55E7729: strdup (strdup.c:42)
==5679== by 0x65BB993: livestatus_parse_arguments (in /opt/nagios/main/var/nebmodsOeKUS)
==5679== by 0x65BC0B7: nebmodule_init (in /opt/nagios/main/var/nebmodsOeKUS)
==5679== by 0x415842: neb_load_module (in /opt/nagios/main/bin/nagios)
==5679== by 0x4154C7: neb_load_all_modules (in /opt/nagios/main/bin/nagios)
==5679== by 0x41307C: main (in /opt/nagios/main/bin/nagios)

I have not found a fix for this anywhere, so we decided to turn off LiveStatus by commenting out broker_module in the Nagios configuration file. We will read the status from the Nagios status file instead.

3. process_check_result_file Memory Leak

This is the issue I am not able to fix and I cannot find any fix on the internet.

Valgrind reports:

==24848== 11,960 (8,704 direct, 3,256 indirect) bytes in 64 blocks are definitely lost in loss record 306 of 321
==24848== at 0x4C2AB80: malloc (in /usr/lib/valgrind/vgpreload_memcheck-amd64-linux.so)
==24848== by 0x44DA25: process_check_result_file (utils.c:2652)
==24848== by 0x44D53D: find_executing_checks (utils.c:2502)
==24848== by 0x41327B: main (nagios.c:825)

I have tried modifying utils.c to free new_cr in various places of the source code but nothing seems to work. I am not sure if this has something to do with my setup because I cannot find anybody having the same issue on the internet.
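
The "direct" and "indirect" figures in the report above are consistent with a heap-allocated struct that itself owns further allocations: the struct is the direct loss, and whatever it points to is the indirect loss. The following C sketch shows that general pattern only. It is not the Nagios code; check_result_sketch, dup_string, cr_new, cr_free and process_batch_sketch are all hypothetical names, and the real check_result structure has more fields.

```c
#include <stdlib.h>
#include <string.h>

/* Hypothetical, simplified record that owns two heap strings. */
typedef struct check_result_sketch {
    char *host_name;
    char *output;
    int return_code;
} check_result_sketch;

/* Portable strdup replacement; returns NULL on allocation failure. */
char *dup_string(const char *s)
{
    size_t n = strlen(s) + 1;
    char *copy = malloc(n);
    if (copy != NULL)
        memcpy(copy, s, n);
    return copy;
}

/* Frees the owned strings first (the "indirect" bytes), then the struct
 * itself (the "direct" bytes). Safe to call with NULL. */
void cr_free(check_result_sketch *cr)
{
    if (cr == NULL)
        return;
    free(cr->host_name);
    free(cr->output);
    free(cr);
}

check_result_sketch *cr_new(const char *host, const char *output, int rc)
{
    check_result_sketch *cr = calloc(1, sizeof(*cr));
    if (cr == NULL)
        return NULL;
    cr->host_name = dup_string(host);
    cr->output = dup_string(output);
    cr->return_code = rc;
    if (cr->host_name == NULL || cr->output == NULL) {
        cr_free(cr);   /* partial construction must not leak either */
        return NULL;
    }
    return cr;
}

/* Walks a batch of results and skips entries with an empty host name.
 * The skip path frees the record too: an early `continue` without
 * cr_free() would leak one struct plus its strings per skipped entry. */
size_t process_batch_sketch(const char *const *hosts, size_t n)
{
    size_t handled = 0;
    for (size_t i = 0; i < n; i++) {
        check_result_sketch *cr = cr_new(hosts[i], "OK", 0);
        if (cr == NULL)
            continue;
        if (cr->host_name[0] == '\0') {
            cr_free(cr);
            continue;
        }
        handled++;
        cr_free(cr);
    }
    return handled;
}
```

When auditing real code against this pattern, the places worth checking are every early return, break, or continue between the allocation and the point where ownership is handed off or released.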

I hope somebody can suggest a fix for this. Any information you need please let me know.

Thank you.
tmcdonald
Posts: 9117
Joined: Mon Sep 23, 2013 8:40 am

Re: process_check_result_file Memory Leak on Nagios 3.5.1

Post by tmcdonald »

The latest version of Nagios Core is 4.2.4 - https://github.com/NagiosEnterprises/nagioscore

Quite a lot has changed since 3.5.1, which was released on August 30, 2013. We've addressed memory issues at least once in recent memory. If you are still seeing these issues in 4.2.4 we can address that, but we are not updating the 3.X branch anymore.
Former Nagios employee
theghostman
Posts: 3
Joined: Mon May 11, 2015 5:21 am

Re: process_check_result_file Memory Leak on Nagios 3.5.1

Post by theghostman »

Hi tmcdonald,

That's sad to know. We will try it on Nagios 4 and let you know if we encounter the same issue.

Thank you.

Best regards,

Mike
tmcdonald
Posts: 9117
Joined: Mon Sep 23, 2013 8:40 am

Re: process_check_result_file Memory Leak on Nagios 3.5.1

Post by tmcdonald »

We'll keep the post open.
Former Nagios employee
theghostman
Posts: 3
Joined: Mon May 11, 2015 5:21 am

Re: process_check_result_file Memory Leak on Nagios 3.5.1

Post by theghostman »

Hi again,

After upgrading my system to Nagios 4, the other two memory issues are no longer detected; only the LiveStatus leak remains.

Here is the valgrind log for LiveStatus memory leak:

570 (560 direct, 10 indirect) bytes in 10 blocks are definitely lost in loss record 9,984 of 10,008
==2954== at 0x4C2AB80: malloc (in /usr/lib/valgrind/vgpreload_memcheck-amd64-linux.so)
==2954== by 0x67BEFEC: operator new(unsigned long) (in /usr/local/lib/mk-livestatus/livestatus.o)
==2954== by 0x677CA3F: create_outputbuffer (in /usr/local/lib/mk-livestatus/livestatus.o)
==2954== by 0x67BB704: client_thread (in /usr/local/lib/mk-livestatus/livestatus.o)
==2954== by 0x6A12183: start_thread (pthread_create.c:312)
==2954== by 0x543B37C: clone (clone.S:111)

Sorry, I'm not very good at C, and I am not sure how to enable LiveStatus for valgrind.

Anyway, I left valgrind running overnight to check whether any further memory leaks would occur.

However this morning, valgrind reported the following:

==32287== 55 bytes in 1 blocks are definitely lost in loss record 244 of 354
==32287== at 0x4C2CE8E: realloc (in /usr/lib/valgrind/vgpreload_memcheck-amd64-linux.so)
==32287== by 0x544B44C: __vasprintf_chk (vasprintf_chk.c:88)
==32287== by 0x544B351: __asprintf_chk (asprintf_chk.c:32)
==32287== by 0x432F91: asprintf (stdio2.h:178)
==32287== by 0x432F91: rotate_log_file (logging.c:409)
==32287== by 0x4303AE: handle_timed_event (events.c:1189)
==32287== by 0x430D92: event_execution_loop (events.c:1110)
==32287== by 0x413C56: main (nagios.c:814)

I searched but have not found a fix for this yet.

According to the modification history (https://www.nagios.org/projects/nagios-core/history/4x/), there was a fix for log rotation in version 4.0.2, but there is no mention of related memory leaks.

Is this something to worry about?

Also, since this is now about Nagios 4, should I repost this as a separate topic?

Many thanks.
dwhitfield
Former Nagios Staff
Posts: 4583
Joined: Wed Sep 21, 2016 10:29 am
Location: NoLo, Minneapolis, MN

Re: process_check_result_file Memory Leak on Nagios 3.5.1

Post by dwhitfield »

theghostman wrote: After upgrading my system to Nagios 4, the other two memory issues are not anymore detected except for LiveStatus.
Livestatus is not our product, so we won't be able to fix the memory leak in it.

I did change the title of the thread, although in hindsight that seems unnecessary, at least for the memory leak issue. If you do want to start a new thread, you could start one on what it is you are trying to do with livestatus. Perhaps there is a replacement, or one is coming in Core 5.