NRPE installed on Docker connectivity issue

snosie
Posts: 6
Joined: Tue Feb 28, 2017 4:46 am

NRPE installed on Docker connectivity issue

Post by snosie »

Hi,

I haven't managed to find any previous thread on this topic, so I apologize if it has already been answered.

I have Nagios Core 4.1.1 installed in a Docker container, which is working fine and currently running ping tests against multiple machines and displaying the results.

My next step is to have NRPE running on these machines so I can get more info on them, but because of our topology, I can only install NRPE inside a dedicated Docker container.
I found Docker builds containing NRPE here: https://hub.docker.com/r/craigwillis/nagios-nrpe/ and here: https://github.com/totem/docker-nrpe

Both seem to work fine except for one big problem. The host keeps appearing and disappearing from the host list on the server every few seconds. One second it's there, after 4-5 seconds it disappears, and it returns again 5-6 seconds later.
I can't work out the cause of this behavior, but I suspect it's a configuration issue.

Any help will be highly appreciated!
snosie
rkennedy
Posts: 6579
Joined: Mon Oct 05, 2015 11:45 am

Re: NRPE installed on Docker connectivity issue

Post by rkennedy »

Are there multiple Nagios processes running? That's the only time I've seen the conflicting running configs happen.
Former Nagios Employee
snosie
Posts: 6
Joined: Tue Feb 28, 2017 4:46 am

Re: NRPE installed on Docker connectivity issue

Post by snosie »

rkennedy wrote:Are there multiple Nagios processes running? That's the only time I've seen the conflicting running configs happen.
Thanks for answering. No, there are not; I checked that as well.

Is there any cfg file that may be overwriting the hosts list every few seconds that I might be overlooking?
mcapra
Posts: 3739
Joined: Thu May 05, 2016 3:54 pm

Re: NRPE installed on Docker connectivity issue

Post by mcapra »

Anything NRPE is doing is unlikely to cause this sort of behavior. It's also going to be difficult to troubleshoot a pre-built Docker container that we didn't put together ourselves.

Can you post your nagios.cfg file so we can get a better idea of what this setup looks like? It's typically located in /usr/local/nagios/etc.
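
For anyone following along, one way to see which object files and directories a nagios.cfg actually loads is to list its cfg_file and cfg_dir directives. The snippet below is only a rough sketch: list_object_sources is a made-up helper name (not a Nagios tool), and the default path is the typical location mentioned above, which may differ inside a pre-built container.

```shell
# list_object_sources is a hypothetical helper, not part of Nagios.
# It prints every active cfg_file= / cfg_dir= directive in a nagios.cfg
# so that duplicated or unexpected object sources stand out.
# Commented-out directives (lines starting with #) are skipped.
list_object_sources() {
    # Assumed default path; pass a different file as the first argument.
    cfg="${1:-/usr/local/nagios/etc/nagios.cfg}"
    grep -E '^[[:space:]]*(cfg_file|cfg_dir)[[:space:]]*=' "$cfg"
}
```

Running list_object_sources /usr/local/nagios/etc/nagios.cfg and then /usr/local/nagios/bin/nagios -v /usr/local/nagios/etc/nagios.cfg (the built-in config verification) should show whether the same host is defined in more than one place.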
Former Nagios employee
https://www.mcapra.com/
snosie
Posts: 6
Joined: Tue Feb 28, 2017 4:46 am

Re: NRPE installed on Docker connectivity issue

Post by snosie »

mcapra wrote:Anything NRPE is doing is unlikely to cause this sort of behavior. It's also going to be difficult to troubleshoot a pre-built Docker container that we didn't put together ourselves.

Can you post your nagios.cfg file so we can get a better idea of what this setup looks like? It's typically located in /usr/local/nagios/etc.
@mcapra thanks a lot for replying

I have attached the nagios.cfg file.
nagios.cfg
(45.24 KiB) Downloaded 436 times
tmcdonald
Posts: 9117
Joined: Mon Sep 23, 2013 8:40 am

Re: NRPE installed on Docker connectivity issue

Post by tmcdonald »

One thing that confuses me a bit is the timing - the standard web interface does not refresh often enough for something to disappear and reappear that quickly unless you are manually refreshing. Can you send a screenshot of your Core interface? It almost sounds like you are running a third-party theme instead of our standard interface. If that is the case, it might be an issue with that theme.
Former Nagios employee
snosie
Posts: 6
Joined: Tue Feb 28, 2017 4:46 am

Re: NRPE installed on Docker connectivity issue

Post by snosie »

tmcdonald wrote:One thing that confuses me a bit is the timing - the standard web interface does not refresh often enough for something to disappear and reappear that quickly unless you are manually refreshing. Can you send a screenshot of your Core interface? It almost sounds like you are running a third-party theme instead of our standard interface. If that is the case, it might be an issue with that theme.
I am in fact manually refreshing (constantly clicking "Hosts" in the side panel).
I'm not using any third-party theme, but I can only provide screenshots when connected to the system (at work), so that will have to wait until next week.

Thank you
dwhitfield
Former Nagios Staff
Posts: 4583
Joined: Wed Sep 21, 2016 10:29 am
Location: NoLo, Minneapolis, MN

Re: NRPE installed on Docker connectivity issue

Post by dwhitfield »

Would it be possible to set up a test server that is not running in Docker and see if you still run into the issue? I know you said this won't be possible in production, but if we had a working system to compare, it would help us narrow down the issue.

The screenshots are probably not necessary since you are manually refreshing.

Also, could you run tar -zcvf /tmp/supporttar.tar.gz /usr/local/nagios/etc/objects and attach the resulting archive? Thanks!
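
If the object definitions live in more than one directory, the same tar invocation accepts several paths. As a rough sketch, the wrapper below (make_support_tar is a hypothetical name, not an existing tool) creates the archive and then lists its contents so you can confirm what was captured before attaching it.

```shell
# make_support_tar is a hypothetical wrapper around the tar command above.
# Usage: make_support_tar OUTPUT.tar.gz DIR [DIR...]
# It creates a gzip-compressed archive of the given directories and,
# if that succeeds, prints the list of archived files for a quick check.
make_support_tar() {
    out="$1"
    shift
    tar -zcf "$out" "$@" && tar -tzf "$out"
}
```

For example: make_support_tar /tmp/supporttar.tar.gz /usr/local/nagios/etc/objects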
snosie
Posts: 6
Joined: Tue Feb 28, 2017 4:46 am

Re: NRPE installed on Docker connectivity issue

Post by snosie »

dwhitfield wrote:Would it be possible to set up a test server that is not running in Docker and see if you still run into the issue? I know you said this won't be possible in production, but if we had a working system to compare, it would help us narrow down the issue.

The screenshots are probably not necessary since you are manually refreshing.

Also, could you run tar -zcvf /tmp/supporttar.tar.gz /usr/local/nagios/etc/objects and attach the resulting archive? Thanks!
Unfortunately, it won't be possible to run the server (or client) outside Docker; our system won't allow it.

We have another directory for objects called "fd_objects" (referred to in nagios.cfg) so I tar-ed both and attached them here.
supporttar.tar.gz
/usr/local/nagios/etc/objects
(7.25 KiB) Downloaded 378 times
supporttar_2.tar.gz
/usr/local/nagios/etc/fd_objects
(17.96 KiB) Downloaded 357 times
tmcdonald
Posts: 9117
Joined: Mon Sep 23, 2013 8:40 am

Re: NRPE installed on Docker connectivity issue

Post by tmcdonald »

I'm wondering if you have multiple nagios processes stepping on each other's toes. Please run:

ps -ef | grep bin/nagios

and post the output. A healthy system should look like:

[root@localhost ~]# ps -ef | grep bin/nagios
root      6358 17053  0 14:47 pts/0    00:00:00 grep bin/nagios
nagios   16321     1  0 Feb27 ?        00:02:42 /usr/local/nagios/bin/nagios -d /usr/local/nagios/etc/nagios.cfg
nagios   16325 16321  0 Feb27 ?        00:00:03 /usr/local/nagios/bin/nagios --worker /usr/local/nagios/var/rw/nagios.qh
nagios   16326 16321  0 Feb27 ?        00:00:04 /usr/local/nagios/bin/nagios --worker /usr/local/nagios/var/rw/nagios.qh
nagios   16327 16321  0 Feb27 ?        00:00:03 /usr/local/nagios/bin/nagios --worker /usr/local/nagios/var/rw/nagios.qh
nagios   16328 16321  0 Feb27 ?        00:00:03 /usr/local/nagios/bin/nagios --worker /usr/local/nagios/var/rw/nagios.qh
nagios   16337 16321  0 Feb27 ?        00:00:26 /usr/local/nagios/bin/nagios -d /usr/local/nagios/etc/nagios.cfg
If you have multiple parent nagios processes, you will need to run killall nagios then service nagios restart or otherwise kill all the existing processes and restart fresh.
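
The check above can also be scripted. This is only a sketch under a few assumptions: count_nagios_parents is a made-up helper name, it expects the `ps -ef` column layout shown above (PID in column 2, PPID in column 3), and it only recognizes daemons started with -d, which may not match how a given Docker image launches Nagios.

```shell
# count_nagios_parents is a hypothetical helper, not a Nagios tool.
# It reads `ps -ef` output on stdin and counts Nagios daemons
# (command lines containing "bin/nagios -d ") whose parent PID is not
# itself a Nagios daemon. One is expected on a healthy system;
# a higher number suggests duplicate instances.
count_nagios_parents() {
    awk '
        # Remember each daemon as PID -> PPID.
        /bin\/nagios -d / { daemon[$2] = $3 }
        END {
            n = 0
            for (pid in daemon)
                if (!(daemon[pid] in daemon)) n++
            print n
        }
    '
}
```

Usage: ps -ef | count_nagios_parents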
Former Nagios employee