I've put together a guide to share on how I solved a challenge I couldn't find great documentation on (although some pages and posts got me part of the way). Please let me know of any errors you find.
PROXYING NAGIOS HOSTS BEHIND A FIREWALL
What do you do when you have several hosts behind a firewall that Nagios can't reach?
You could have a separate Nagios server on that side of the network, but then you have to manage and maintain two Nagios servers.
You could open all the hosts up to the Nagios server, but this means doing this for every machine.
Or you could open one machine and have it act as a proxy for the others, thus making all of your hosts available from the same Nagios server.
Guide
On your Nagios server create a hostgroup for your proxy hosts
Typical paths - /usr/local/nagios/etc/... OR /etc/nagios/...
# objects/hostgroups.cfg
define hostgroup {
hostgroup_name proxy_nodes
alias Proxy hosts
}
Why - the hostgroup records which hosts are reached through the proxy, so that services can later be assigned to all of them at once.
Add a command that reaches your hosts through the Proxy Gateway (replace PROXY with the gateway's address) -
Note: all of these definitions could go in one config file - the cfg_file and cfg_dir entries in nagios.cfg determine which files are loaded.
# objects/commands.cfg
define command{
command_name check_proxy
command_line $USER1$/check_nrpe -H PROXY -c $ARG1$ -a $HOSTADDRESS$
}
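To make the macro expansion concrete, here is a small shell sketch of the command line Nagios ends up running for one host. The helper name build_proxy_check and the sample addresses are made up for illustration, and $USER1$ is assumed to resolve to your plugin directory.

```shell
#!/bin/sh
# Sketch only: mimics how Nagios expands the check_proxy command for one host.
# build_proxy_check is a hypothetical helper, not part of Nagios or NRPE.
#   $1 = plugin directory (what $USER1$ resolves to)
#   $2 = Proxy Gateway address (the literal PROXY in command_line)
#   $3 = NRPE command to run on the gateway ($ARG1$)
#   $4 = address of the host behind the firewall ($HOSTADDRESS$)
build_proxy_check() {
    printf '%s/check_nrpe -H %s -c %s -a %s\n' "$1" "$2" "$3" "$4"
}

# Example with made-up addresses:
build_proxy_check /usr/lib64/nagios/plugins 192.0.2.10 check_proxy_ping 10.0.0.21
```

The point to notice is that -H always names the gateway, while the real target travels as an NRPE argument after -a.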
Add a servicegroup & service to check that the host is up -
# objects/servicegroups.cfg
define servicegroup {
servicegroup_name proxyload
alias Proxy load stats
}
# objects/services.cfg
define service{
use generic-service
hostgroup_name proxy_nodes
service_description Check_Proxy_Alive
servicegroups proxyload
check_command check_proxy!check_proxy_ping
}
Note: check_host_alive would work
Add a host template -
# objects/templates.cfg
define host {
name proxy-box
use generic-host
check_period 24x7
check_interval 5
retry_interval 1
max_check_attempts 10
check_command check_proxy!check_proxy_ping
notification_period 24x7
notification_interval 1440
notification_options d,r
contact_groups admins
register 0
}
Add your host -
# objects/hosts.cfg
define host {
use proxy-box
host_name proxy_host_01
check_command check_proxy!check_proxy_ping
hostgroups proxy_nodes
alias ProxyHost01
address 10.##.##.##
}
Reload the configs -
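systemctl reload nagios
Note: The service name may differ by distribution. The NRPE daemon ("nrpe" or "nagios-nrpe-server") is a separate service; restart that one on the Proxy Gateway after editing nrpe.cfg.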
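It can save a failed reload to verify the configuration first. The sketch below only locates the main config file; find_nagios_cfg is a hypothetical helper, and it assumes nagios.cfg sits directly inside one of the directories you pass in.

```shell
#!/bin/sh
# Sketch only: find the main config file so it can be verified before a reload.
# find_nagios_cfg is a hypothetical helper; pass it candidate directories.
# It prints the first nagios.cfg it finds and returns non-zero if none exists.
find_nagios_cfg() {
    for dir in "$@"; do
        if [ -f "$dir/nagios.cfg" ]; then
            printf '%s\n' "$dir/nagios.cfg"
            return 0
        fi
    done
    return 1
}

# Usage (run by hand; assumes the nagios binary is on PATH and the
# service is called "nagios" on your system):
#   cfg=$(find_nagios_cfg /usr/local/nagios/etc /etc/nagios) \
#       && nagios -v "$cfg" \
#       && systemctl reload nagios
```

nagios -v checks the object definitions and exits non-zero on errors, so the reload only happens when the config is valid.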
On the Proxy Gateway add the proxy command:
# nrpe.cfg
command[check_proxy_ping]=/usr/lib64/nagios/plugins/check_ping -H $ARG1$ -w 3000.0,80% -c 5000.0,100% -p 5
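Note: the path to the plugins may differ with your distribution, and if you are reusing existing commands you don't need to recreate them.
Because this command takes an argument ($ARG1$), NRPE on the Proxy Gateway must accept arguments: set dont_blame_nrpe=1 in nrpe.cfg (NRPE must also have been built with command-argument support).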
On each host behind the firewall, the only change needed is in nrpe.cfg: allowed_hosts should contain the IP of the Proxy Gateway. It does not need the IP of the Nagios Server.
allowed_hosts=10.##.##.##,127.0.0.1,::1
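A quick way to confirm the setting on a host is to look for the gateway's address in the file. This is a minimal sketch: ip_allowed is a hypothetical helper, the nrpe.cfg path in the usage comment is an assumption, and the match is a plain string comparison that does not handle network ranges or hostnames.

```shell
#!/bin/sh
# Sketch only: check whether an address is listed in allowed_hosts.
# ip_allowed is a hypothetical helper; it compares exact strings only.
#   $1 = path to nrpe.cfg
#   $2 = address to look for (e.g. the Proxy Gateway's IP)
# Returns 0 if the address is listed, non-zero otherwise.
ip_allowed() {
    grep -E '^[[:space:]]*allowed_hosts[[:space:]]*=' "$1" \
        | tail -n 1 \
        | cut -d= -f2- \
        | tr ',' '\n' \
        | sed 's/^[[:space:]]*//; s/[[:space:]]*$//' \
        | grep -Fxq -- "$2"
}

# Usage on a host (path and address are assumptions):
#   ip_allowed /etc/nagios/nrpe.cfg 10.0.0.5 && echo "gateway is allowed"
```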
Test on the Nagios Server -
./check_nrpe -H PROXY-GATEWAY-IP -c check_proxy_ping -a HOST-IP
This tells the Proxy Gateway to run check_proxy_ping (ping the host and return the results).
If it fails it should tell you why, but see Troubleshooting below.
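If you have more than a couple of hosts, the same manual test can be looped. This is a rough sketch: check_all_via_proxy and the CHECK_NRPE variable are hypothetical names, and the addresses in the usage comment are made up.

```shell
#!/bin/sh
# Sketch only: run the same proxied check against several hosts.
# check_all_via_proxy is a hypothetical helper.
#   CHECK_NRPE = path to the check_nrpe plugin (defaults to "check_nrpe" on PATH)
#   $1 = Proxy Gateway address; remaining arguments = host addresses
# Any non-zero plugin exit code (WARNING, CRITICAL, UNKNOWN) counts as a failure.
check_all_via_proxy() {
    gateway=$1
    shift
    failures=0
    for host in "$@"; do
        if "${CHECK_NRPE:-check_nrpe}" -H "$gateway" -c check_proxy_ping -a "$host"; then
            echo "OK   $host"
        else
            echo "FAIL $host"
            failures=$((failures + 1))
        fi
    done
    # Succeed only if every host passed.
    [ "$failures" -eq 0 ]
}

# Usage on the Nagios Server (made-up addresses):
#   CHECK_NRPE=/usr/lib64/nagios/plugins/check_nrpe \
#       check_all_via_proxy 192.0.2.10 10.0.0.21 10.0.0.22
```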
The hosts should shortly appear on the Nagios Server front end, with the results appearing shortly after.
Troubleshooting
Check that your Nagios Server and Proxy Gateway can communicate (the Proxy Gateway is already being monitored by Nagios Server).
Check that your Proxy Gateway can communicate with other hosts on that side of the network (on port 5666), and that they are all running NRPE.
Check that your Nagios configs are being loaded.
Check that the target commands are in nrpe.cfg on the Proxy Gateway and hosts.
(The hosts don't need check_proxy or check_proxy_ping etc., but would need check_host_alive)
Check your paths - the Nagios configs, objects and plugins may be in various different places depending on the distro you are running.
Timeouts - if commands are timing out (by default check_nrpe gives up after 10 seconds), consider adding -t 30 to the check_nrpe command line.
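For the connectivity checks above, a quick port probe narrows down whether the problem is the network or the NRPE configuration. This is a minimal sketch: port_open is a hypothetical helper, and it assumes bash (for its /dev/tcp pseudo-device) and the coreutils timeout command are installed.

```shell
#!/bin/sh
# Sketch only: test whether a TCP port accepts connections.
# port_open is a hypothetical helper; it relies on bash's /dev/tcp
# pseudo-device and on the coreutils "timeout" command.
#   $1 = host, $2 = port
# Returns 0 if the connection opens within 3 seconds, non-zero otherwise.
port_open() {
    timeout 3 bash -c 'exec 3<>"/dev/tcp/$1/$2"' _ "$1" "$2" 2>/dev/null
}

# Usage, run on the Proxy Gateway (address is made up):
#   port_open 10.0.0.21 5666 && echo "NRPE port reachable"
```

If the port is reachable but checks still fail, allowed_hosts on the target is the next thing to look at.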
To Do - I'd like to add instructions on using a reverse proxy for the proxy gateway.
& to put this up on some web page for those searching for the info.