NRPE installed on Docker connectivity issue
Hi,
I haven't managed to find any previous thread on this topic, so I apologize if it has already been answered.
I have Nagios Core 4.1.1 installed in a Docker container, which is working fine and currently running ping tests to multiple machines and displaying results.
My next step is to have NRPE running on these machines so I can get more info on them, but because of our topology, I can only install NRPE inside a dedicated Docker container.
I found a docker build containing nrpe here: https://hub.docker.com/r/craigwillis/nagios-nrpe/ and here: https://github.com/totem/docker-nrpe
Both seem to work fine except for one big problem: the host keeps appearing and disappearing from the server's host list every few seconds. One second it's there, after 4-5 seconds it disappears, and it returns another 5-6 seconds later.
I can't work out the cause of this behavior, but I suspect it's a configuration issue.
Any help will be highly appreciated!
snosie
Re: NRPE installed on Docker connectivity issue
Are there multiple Nagios processes running? That's the only time I've seen the conflicting running configs happen.
Former Nagios Employee
Re: NRPE installed on Docker connectivity issue
rkennedy wrote:Are there multiple Nagios processes running? That's the only time I've seen the conflicting running configs happen.
Thanks for answering, no there are not. I checked that as well.
Is there any cfg file that may be overwriting the hosts list every few seconds that I might be overlooking?
Re: NRPE installed on Docker connectivity issue
Anything NRPE is doing is unlikely to cause this sort of behavior. It's also going to be difficult to troubleshoot a pre-built Docker container that we didn't put together ourselves.
Can you post your nagios.cfg file so we can get a better idea of what this setup looks like? It's typically located in /usr/local/nagios/etc.
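Before posting the file, it may help to list which object files and directories nagios.cfg actually pulls in, since the host list is built from those. The following is only a rough sketch: list_object_sources is a made-up helper name, and the path in the usage comment assumes the default source-install location mentioned above.

```shell
#!/bin/sh
# list_object_sources is a hypothetical helper, not something shipped with Nagios.
# It prints every cfg_file= / cfg_dir= directive found in the given nagios.cfg,
# skipping commented-out lines and normalising whitespace around "=".
list_object_sources() {
    cfg="$1"
    grep -E '^[[:space:]]*cfg_(file|dir)[[:space:]]*=' "$cfg" |
        sed -E 's/^[[:space:]]*//; s/[[:space:]]*=[[:space:]]*/=/'
}

# Example usage (path is an assumption; adjust to your container):
#   list_object_sources /usr/local/nagios/etc/nagios.cfg
```

If the same host is defined under more than one of the listed locations, or a directory appears that you did not expect, that is worth mentioning in your reply. Running /usr/local/nagios/bin/nagios -v /usr/local/nagios/etc/nagios.cfg will also report configuration errors.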
Former Nagios employee
https://www.mcapra.com/
Re: NRPE installed on Docker connectivity issue
mcapra wrote:Anything NRPE is doing is unlikely to cause this sort of behavior. It's also going to be difficult to troubleshoot a pre-built Docker container that we didn't put together ourselves.
Can you post your nagios.cfg file so we can get a better idea of what this setup looks like? It's typically located in /usr/local/nagios/etc.
@mcapra thanks a lot for replying.
I have attached the nagios.cfg file.
Re: NRPE installed on Docker connectivity issue
One thing that confuses me a bit is the timing - the standard web interface does not refresh often enough for something to disappear and reappear that quickly unless you are manually refreshing. Can you send a screenshot of your Core interface? It almost sounds like you are running a third-party theme instead of our standard interface. If that is the case, it might be an issue with that theme.
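To take the browser out of the picture, the status file that the CGIs read could be polled directly. This is only a sketch under assumptions: host_in_status is a hypothetical helper, the status file is assumed to be at the default /usr/local/nagios/var/status.dat (check the status_file setting in nagios.cfg), and myhost is a placeholder host name.

```shell
#!/bin/sh
# host_in_status is a hypothetical helper, not part of Nagios.
# Usage: host_in_status STATUS_FILE HOST_NAME
# Exits 0 if a "hoststatus {" block for HOST_NAME exists in the file, 1 otherwise.
host_in_status() {
    awk -v want="$2" '
        # Start of a host block; service blocks are ignored on purpose.
        /^hoststatus[ \t]*\{/ { inblock = 1; next }
        # A closing brace ends whatever block we were in.
        /^[ \t]*\}/           { inblock = 0 }
        inblock && /^[ \t]*host_name=/ {
            sub(/^[ \t]*host_name=/, "")
            if ($0 == want) found = 1
        }
        END { exit(found ? 0 : 1) }
    ' "$1"
}

# Example polling loop (path and host name are assumptions):
#   while sleep 1; do
#       if host_in_status /usr/local/nagios/var/status.dat myhost; then
#           echo "$(date +%T) present"
#       else
#           echo "$(date +%T) MISSING"
#       fi
#   done
```

If the host drops out of the status file on the same rhythm as it does in the web interface, the problem is on the Core side rather than in the browser or theme.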
Former Nagios employee
Re: NRPE installed on Docker connectivity issue
tmcdonald wrote:One thing that confuses me a bit is the timing - the standard web interface does not refresh often enough for something to disappear and reappear that quickly unless you are manually refreshing. Can you send a screenshot of your Core interface? It almost sounds like you are running a third-party theme instead of our standard interface. If that is the case, it might be an issue with that theme.
I am in fact manually refreshing (constantly clicking the "Hosts" on the side panel)
I'm not using any 3rd party theme, but I can provide screenshots only when connected to the system (at work) so that will be only next week.
Thank you
Re: NRPE installed on Docker connectivity issue
Would it be possible to set up a test server that is not running in Docker and see if you still run into the issue? I know you said this won't be possible in production, but if we had a working system to compare, it would help us narrow down the issue.
The screenshots are probably not necessary since you are manually refreshing.
Also, could you run tar -zcvf /tmp/supporttar.tar.gz /usr/local/nagios/etc/objects and attach the resulting archive? Thanks!
Re: NRPE installed on Docker connectivity issue
dwhitfield wrote:Would it be possible to set up a test server that is not running in Docker and see if you still run into the issue? I know you said this won't be possible in production, but if we had a working system to compare, it would help us narrow down the issue.
The screenshots are probably not necessary since you are manually refreshing.
Also, could you run tar -zcvf /tmp/supporttar.tar.gz /usr/local/nagios/etc/objects and attach the zip? Thanks!
Unfortunately it won't be possible to run the server (or client) outside Docker; our system won't allow it.
We have another directory for objects called "fd_objects" (referred to in nagios.cfg) so I tar-ed both and attached them here.
Re: NRPE installed on Docker connectivity issue
Wondering if you have multiple nagios processes stepping on each other's toes. Please run:
ps -ef | grep bin/nagios
and post the output. A healthy system should look like:
[root@localhost ~]# ps -ef | grep bin/nagios
root 6358 17053 0 14:47 pts/0 00:00:00 grep bin/nagios
nagios 16321 1 0 Feb27 ? 00:02:42 /usr/local/nagios/bin/nagios -d /usr/local/nagios/etc/nagios.cfg
nagios 16325 16321 0 Feb27 ? 00:00:03 /usr/local/nagios/bin/nagios --worker /usr/local/nagios/var/rw/nagios.qh
nagios 16326 16321 0 Feb27 ? 00:00:04 /usr/local/nagios/bin/nagios --worker /usr/local/nagios/var/rw/nagios.qh
nagios 16327 16321 0 Feb27 ? 00:00:03 /usr/local/nagios/bin/nagios --worker /usr/local/nagios/var/rw/nagios.qh
nagios 16328 16321 0 Feb27 ? 00:00:03 /usr/local/nagios/bin/nagios --worker /usr/local/nagios/var/rw/nagios.qh
nagios 16337 16321 0 Feb27 ? 00:00:26 /usr/local/nagios/bin/nagios -d /usr/local/nagios/etc/nagios.cfg
If you have multiple parent nagios processes, you will need to run killall nagios then service nagios restart or otherwise kill all the existing processes and restart fresh.
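If reading the listing by eye is awkward, the parent count can be scripted. This is a rough sketch, not an official tool: count_parent_daemons is a made-up helper that reads ps -ef output on stdin and counts the "nagios -d" processes whose parent is not itself a "nagios -d" process. It deliberately does not rely on the parent being PID 1, since that may differ inside a container.

```shell
#!/bin/sh
# count_parent_daemons is a hypothetical helper, not part of Nagios.
# Reads "ps -ef" output on stdin and prints the number of top-level
# nagios daemons (daemon-mode processes not spawned by another daemon).
count_parent_daemons() {
    awk '
        # Daemon-mode lines only; this skips the grep line and the workers.
        /bin\/nagios -d / { pid[$2] = 1; ppid[$2] = $3 }
        END {
            n = 0
            # A daemon is a parent if its PPID is not another daemon PID.
            for (p in ppid) if (!(ppid[p] in pid)) n++
            print n
        }
    '
}

# Example usage:
#   ps -ef | count_parent_daemons
```

For the healthy listing shown here the result would be 1; anything higher suggests more than one Nagios instance is running.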
Former Nagios employee