NRPE installed on Docker connectivity issue
Posted: Tue Feb 28, 2017 7:33 am
by snosie
Hi,
I haven't managed to find any previous thread on this topic, so I apologize if it has already been answered.
I have Nagios Core 4.1.1 installed on a Docker, which is working fine and currently running ping tests to multiple machines and displaying results.
My next step is to have nrpe running on these machines so I can get more info on them, but because of our topology, I can only install nrpe inside a dedicated Docker container.
I found a docker build containing nrpe here:
https://hub.docker.com/r/craigwillis/nagios-nrpe/ and here:
https://github.com/totem/docker-nrpe
Both seem to work fine except for one big problem. The host keeps appearing and disappearing every few seconds from the Server host list. One second it's there, after 4-5 seconds it disappears, and it returns again 5-6 seconds later.
I can't understand the cause for this behavior but I'm sure it's probably a configuration issue.
Any help will be highly appreciated!
snosie
Re: NRPE installed on Docker connectivity issue
Posted: Tue Feb 28, 2017 4:29 pm
by rkennedy
Are there multiple Nagios processes running? That's the only time I've seen the conflicting running configs happen.
Re: NRPE installed on Docker connectivity issue
Posted: Wed Mar 01, 2017 2:29 am
by snosie
rkennedy wrote:Are there multiple Nagios processes running? That's the only time I've seen the conflicting running configs happen.
Thanks for answering. No, there are not; I checked that as well.
Is there any cfg file that may be overwriting the hosts list every few seconds that I might be overlooking?
Re: NRPE installed on Docker connectivity issue
Posted: Wed Mar 01, 2017 3:58 pm
by mcapra
Anything NRPE is doing is unlikely to cause this sort of behavior. It's also going to be difficult to troubleshoot a pre-built Docker container that we didn't put together ourselves.
Can you post your nagios.cfg file so we can get a better idea of what this setup looks like? It's typically located in /usr/local/nagios/etc.
Re: NRPE installed on Docker connectivity issue
Posted: Thu Mar 02, 2017 4:35 am
by snosie
mcapra wrote:Anything NRPE is doing is unlikely to cause this sort of behavior. It's also going to be difficult to troubleshoot a pre-built Docker container that we didn't put together ourselves.
Can you post your nagios.cfg file so we can get a better idea of what this setup looks like? It's typically located in /usr/local/nagios/etc.
@mcapra thanks a lot for replying.
I attached the nagios.cfg file.
Re: NRPE installed on Docker connectivity issue
Posted: Thu Mar 02, 2017 3:29 pm
by tmcdonald
One thing that confuses me a bit is the timing - the standard web interface does not refresh often enough for something to disappear and reappear that quickly unless you are manually refreshing. Can you send a screenshot of your Core interface? It almost sounds like you are running a third-party theme instead of our standard interface. If that is the case, it might be an issue with that theme.
Re: NRPE installed on Docker connectivity issue
Posted: Fri Mar 03, 2017 9:40 am
by snosie
tmcdonald wrote:One thing that confuses me a bit is the timing - the standard web interface does not refresh often enough for something to disappear and reappear that quickly unless you are manually refreshing. Can you send a screenshot of your Core interface? It almost sounds like you are running a third-party theme instead of our standard interface. If that is the case, it might be an issue with that theme.
I am in fact manually refreshing (constantly clicking the "Hosts" on the side panel)
I'm not using any 3rd party theme, but I can provide screenshots only when connected to the system (at work) so that will be only next week.
Thank you
Re: NRPE installed on Docker connectivity issue
Posted: Fri Mar 03, 2017 1:59 pm
by dwhitfield
Would it be possible to set up a test server that is not running in Docker and see if you still run into the issue? I know you said this won't be possible in production, but if we had a working system to compare, it would help us narrow down the issue.
The screenshots are probably not necessary since you are manually refreshing.
Also, could you run tar -zcvf /tmp/supporttar.tar.gz /usr/local/nagios/etc/objects and attach the zip? Thanks!
Re: NRPE installed on Docker connectivity issue
Posted: Sun Mar 05, 2017 3:24 am
by snosie
dwhitfield wrote:Would it be possible to set up a test server that is not running in Docker and see if you still run into the issue? I know you said this won't be possible in production, but if we had a working system to compare, it would help us narrow down the issue.
The screenshots are probably not necessary since you are manually refreshing.
Also, could you run tar -zcvf /tmp/supporttar.tar.gz /usr/local/nagios/etc/objects and attach the zip? Thanks!
Unfortunately it won't be possible to run the server (or client) outside a Docker, our system won't allow it.
We have another directory for objects called "fd_objects" (referred to in nagios.cfg) so I tar-ed both and attached them here.
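For anyone following along, here is a rough sketch of how the object sources referenced by nagios.cfg could be listed, so that extra directories like "fd_objects" are not missed when collecting files. The function name list_object_sources is made up for illustration, and the default path is simply the location mentioned earlier in the thread; adjust it for your container.

```shell
# Hypothetical helper: print the object config sources that nagios.cfg pulls in.
list_object_sources() {
    # $1: path to nagios.cfg (default is an assumption based on the usual install path)
    cfg="${1:-/usr/local/nagios/etc/nagios.cfg}"
    # Keep only uncommented cfg_file= and cfg_dir= directives,
    # tolerating leading whitespace and spaces around the equals sign.
    grep -E '^[[:space:]]*(cfg_file|cfg_dir)[[:space:]]*=' "$cfg"
}
```

Calling list_object_sources with no argument reads the default path. Every cfg_dir line it prints is a directory whose .cfg files Nagios will load, so each one is worth including in the tar archive.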
Re: NRPE installed on Docker connectivity issue
Posted: Mon Mar 06, 2017 3:48 pm
by tmcdonald
Wondering if you have multiple nagios processes stepping on each other's toes. Please run:
ps -ef | grep bin/nagios
and post the output. A healthy system should look like:
[root@localhost ~]# ps -ef | grep bin/nagios
root 6358 17053 0 14:47 pts/0 00:00:00 grep bin/nagios
nagios 16321 1 0 Feb27 ? 00:02:42 /usr/local/nagios/bin/nagios -d /usr/local/nagios/etc/nagios.cfg
nagios 16325 16321 0 Feb27 ? 00:00:03 /usr/local/nagios/bin/nagios --worker /usr/local/nagios/var/rw/nagios.qh
nagios 16326 16321 0 Feb27 ? 00:00:04 /usr/local/nagios/bin/nagios --worker /usr/local/nagios/var/rw/nagios.qh
nagios 16327 16321 0 Feb27 ? 00:00:03 /usr/local/nagios/bin/nagios --worker /usr/local/nagios/var/rw/nagios.qh
nagios 16328 16321 0 Feb27 ? 00:00:03 /usr/local/nagios/bin/nagios --worker /usr/local/nagios/var/rw/nagios.qh
nagios 16337 16321 0 Feb27 ? 00:00:26 /usr/local/nagios/bin/nagios -d /usr/local/nagios/etc/nagios.cfg
If you have multiple parent nagios processes, you will need to run killall nagios, then service nagios restart, or otherwise kill all the existing processes and restart fresh.
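If eyeballing the ps output is awkward, the check above can be sketched as a small filter. This is only an illustration: the function name count_nagios_parents is made up, and it assumes the ps -ef column layout shown in the healthy example (UID PID PPID C STIME TTY TIME CMD). It counts nagios -d processes whose parent is not itself a nagios -d process, so it does not depend on the parent's PPID being 1, which may not hold inside a container.

```shell
# Hypothetical helper: count top-level nagios daemons in `ps -ef` output read from stdin.
count_nagios_parents() {
    awk '
        # Daemon lines: command ends in bin/nagios and the first argument is -d.
        # Worker lines (--worker) and the grep line itself do not match.
        $8 ~ /bin\/nagios$/ && $9 == "-d" { ppid_of[$2] = $3 }
        END {
            n = 0
            # A top-level daemon is one whose parent is not another nagios daemon.
            for (p in ppid_of) if (!(ppid_of[p] in ppid_of)) n++
            print n
        }
    '
}
```

Running ps -ef | count_nagios_parents on a system like the healthy example should print 1; anything higher suggests more than one parent process is running.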