I have a basic setup: one server and one remote host. Nagios and NRPE are installed on the server, and NRPE is installed on the remote.
Setting the web interface aside, I'm trying to run everything from the command line so that things are easier to understand. Local NRPE checks work on both machines, but any check from the server against the remote ends in a socket timeout.
On the Server (Nagios + NRPE)
[centos@nagios-server ~]$ /usr/local/nagios/libexec/check_nrpe -H 127.0.0.1 -c check_users
USERS OK - 1 users currently logged in |users=1;5;10;0
[centos@nagios-server ~]$ /usr/local/nagios/libexec/check_nrpe -H 127.0.0.1
NRPE v3.0.1
[centos@nagios-server ~]$ systemctl status nagios && systemctl status nrpe
nagios.service - LSB: Starts and stops the Nagios monitoring server
Loaded: loaded (/etc/rc.d/init.d/nagios)
Active: active (running) since Tue 2016-11-29 06:47:44 UTC; 8min ago
Process: 1116 ExecStart=/etc/rc.d/init.d/nagios start (code=exited, status=0/SUCCESS)
CGroup: /system.slice/nagios.service
├─1172 /usr/local/nagios/bin/nagios -d /usr/local/nagios/etc/nagios.cfg
├─1175 /usr/local/nagios/bin/nagios --worker /usr/local/nagios/var/rw/nagios.qh
├─1176 /usr/local/nagios/bin/nagios --worker /usr/local/nagios/var/rw/nagios.qh
├─1177 /usr/local/nagios/bin/nagios --worker /usr/local/nagios/var/rw/nagios.qh
├─1178 /usr/local/nagios/bin/nagios --worker /usr/local/nagios/var/rw/nagios.qh
└─1180 /usr/local/nagios/bin/nagios -d /usr/local/nagios/etc/nagios.cfg
nrpe.service - Nagios Remote Program Executor
Loaded: loaded (/usr/lib/systemd/system/nrpe.service; enabled)
Active: active (running) since Tue 2016-11-29 06:47:44 UTC; 8min ago
Docs: http://www.nagios.org/documentation
Main PID: 1114 (nrpe)
CGroup: /system.slice/nrpe.service
└─1114 /usr/local/nagios/bin/nrpe -c /usr/local/nagios/etc/nrpe.cfg -f
On the Remote (NRPE)
[centos@sinjihn-test ~]$ /usr/local/nagios/libexec/check_nrpe -H 127.0.0.1 -c check_users
USERS OK - 2 users currently logged in |users=2;5;10;0
[centos@sinjihn-test ~]$ /usr/local/nagios/libexec/check_nrpe -H 127.0.0.1
NRPE vnrpe-3.0
[centos@sinjihn-test ~]$ systemctl status nrp
nrp.service
Loaded: not-found (Reason: No such file or directory)
Active: inactive (dead)
[centos@sinjihn-test ~]$ systemctl status nrpe
nrpe.service - Nagios Remote Program Executor
Loaded: loaded (/usr/lib/systemd/system/nrpe.service; disabled)
Active: active (running) since Tue 2016-11-29 06:09:54 UTC; 47min ago
Docs: http://www.nagios.org/documentation
Process: 2347 ExecStopPost=/bin/rm -f /usr/local/nagios/var/nrpe.pid (code=exited, status=0/SUCCESS)
Main PID: 2350 (nrpe)
CGroup: /system.slice/nrpe.service
└─2350 /usr/local/nagios/bin/nrpe -c /usr/local/nagios/etc/nrpe.cfg -f
From the Server to the Remote
[centos@nagios-server ~]$ /usr/local/nagios/libexec/check_nrpe -H remote-ip -c check_users
CHECK_NRPE STATE CRITICAL: Socket timeout after 10 seconds.
[centos@nagios-server ~]$ /usr/local/nagios/libexec/check_nrpe -H remote-ip -t 60 -c check_users
CHECK_NRPE STATE CRITICAL: Socket timeout after 60 seconds.
[centos@nagios-server ~]$ sudo cat /usr/local/nagios/etc/servers/definitions.cfg
define host {
    use                     linux-server
    host_name               nagios-host-1
    hostgroups              nagios-hosts
    alias                   OpenStack-Nagios-H1
    address                 remote-ip-address
    max_check_attempts      5
    check_period            24x7
    check_interval          1
    notification_interval   30
    notification_period     24x7
    check_command           check-host-alive
}

define host {
    use                     linux-server
    host_name               nagios-host-2
    hostgroups              nagios-hosts
    alias                   OpenStack-Nagios-H2
    address                 remote-ip-address
    max_check_attempts      5
    check_period            24x7
    check_interval          1
    notification_interval   30
    notification_period     24x7
    check_command           check-host-alive
}

define hostgroup {
    hostgroup_name          nagios-hosts
    alias                   Nagios-Remote-Hosts
}

define service {
    hostgroup_name          nagios-hosts
    service_description     CHECK PING
    check_command           check_ping!100.0,20%!500.0,60%
    check_interval          1
    check_period            24x7
    retry_interval          1
    notification_interval   30
    notification_period     24x7
    max_check_attempts      5
    use                     generic-service
}

define service {
    hostgroup_name          nagios-hosts
    service_description     CHECK HTTP
    check_command           check_http
    check_interval          1
    check_period            24x7
    retry_interval          1
    notification_interval   30
    notification_period     24x7
    max_check_attempts      5
    use                     generic-service
}

define service {
    hostgroup_name          nagios-hosts
    service_description     CHECK NRPE TEST WITH HTTP
    check_command           check_nrpe!check_http
    check_interval          1
    check_period            24x7
    retry_interval          1
    notification_interval   30
    notification_period     24x7
    max_check_attempts      5
    use                     generic-service
}
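The last service definition calls check_nrpe!check_http, which depends on two things on the remote side: a command[check_http] entry in nrpe.cfg, and the server's address being listed in allowed_hosts. The following is only a rough sketch for inspecting those two settings. It assumes the config lives at /usr/local/nagios/etc/nrpe.cfg (the path shown in the systemctl output above); the NRPE_CFG variable and both function names are made up for illustration.

```shell
#!/usr/bin/env bash
# Sketch: inspect an nrpe.cfg for the settings a remote check depends on.
# NRPE_CFG is a hypothetical override; the default is the path from the
# systemctl output in this post.
NRPE_CFG="${NRPE_CFG:-/usr/local/nagios/etc/nrpe.cfg}"

# Print the value of a "key=value" directive (last matching line;
# commented-out lines do not match).
nrpe_cfg_value() {
    local file="$1" key="$2"
    grep -E "^[[:space:]]*${key}[[:space:]]*=" "$file" | tail -n 1 | cut -d= -f2-
}

# Succeed if the file contains a command[<name>]=... definition.
nrpe_has_command() {
    local file="$1" name="$2"
    grep -Eq "^[[:space:]]*command\[${name}\][[:space:]]*=" "$file"
}

if [ -r "$NRPE_CFG" ]; then
    echo "allowed_hosts: $(nrpe_cfg_value "$NRPE_CFG" allowed_hosts)"
    for cmd in check_users check_http; do
        if nrpe_has_command "$NRPE_CFG" "$cmd"; then
            echo "command[$cmd]: defined"
        else
            echo "command[$cmd]: missing"
        fi
    done
else
    echo "cannot read $NRPE_CFG (try running with sudo)" >&2
fi
```

Run on the remote, this would show at a glance whether check_http is defined there at all. It does not explain the timeout by itself, since a missing command or a rejected host normally produces an error message rather than a hang.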
For reference, these nodes are all virtual machines with firewalld and httpd installed. I opened ports 5666/tcp and 80/tcp on all of them and put SELinux into permissive mode.
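One way to double-check the firewall and SELinux state described above is to ask firewalld what it has open, both at runtime and in the permanent configuration. This is a minimal sketch, not a definitive check: the port_is_listed and report_ports helpers are hypothetical names, matching is exact (port ranges are not expanded), and only the default zone is queried.

```shell
#!/usr/bin/env bash
# Sketch: confirm the firewall and SELinux state on one node.

# Succeed if <port/proto> appears in a space-separated list such as
# the output of `firewall-cmd --list-ports` (exact match only).
port_is_listed() {
    local list="$1" want="$2" entry
    for entry in $list; do
        [ "$entry" = "$want" ] && return 0
    done
    return 1
}

# Report on the two ports mentioned in this post.
report_ports() {
    local label="$1" list="$2" port
    for port in 5666/tcp 80/tcp; do
        if port_is_listed "$list" "$port"; then
            echo "$label: $port open"
        else
            echo "$label: $port NOT listed"
        fi
    done
}

if command -v firewall-cmd >/dev/null 2>&1; then
    # Runtime rules are what is enforced now; permanent rules survive a reload.
    report_ports "runtime" "$(firewall-cmd --list-ports 2>/dev/null)"
    report_ports "permanent" "$(firewall-cmd --permanent --list-ports 2>/dev/null)"
fi

if command -v getenforce >/dev/null 2>&1; then
    echo "SELinux: $(getenforce)"
fi
```

A port opened through a service definition (for example http) appears under firewall-cmd --list-services rather than --list-ports, so a "NOT listed" line for 80/tcp is not necessarily a problem.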
nmap -v localhost
Starting Nmap 6.40 ( http://nmap.org ) at 2016-11-29 07:09 UTC
Initiating Ping Scan at 07:09
Scanning localhost (127.0.0.1) [2 ports]
Completed Ping Scan at 07:09, 0.00s elapsed (1 total hosts)
Initiating Connect Scan at 07:09
Scanning localhost (127.0.0.1) [1000 ports]
Discovered open port 25/tcp on 127.0.0.1
Discovered open port 111/tcp on 127.0.0.1
Discovered open port 80/tcp on 127.0.0.1
Discovered open port 22/tcp on 127.0.0.1
Discovered open port 5666/tcp on 127.0.0.1
Completed Connect Scan at 07:09, 0.01s elapsed (1000 total ports)
Nmap scan report for localhost (127.0.0.1)
Host is up (0.00026s latency).
Other addresses for localhost (not scanned): 127.0.0.1
Not shown: 995 closed ports
PORT STATE SERVICE
22/tcp open ssh
25/tcp open smtp
80/tcp open http
111/tcp open rpcbind
5666/tcp open nrpe
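The scan above targets localhost, so it confirms that NRPE is listening but says little about whether 5666/tcp is reachable from another machine, since loopback traffic generally bypasses the filtering applied to external interfaces. As a rough sketch, the port could be probed from the Nagios server instead. REMOTE_HOST is a placeholder variable and both function names are hypothetical; the probe relies on bash's /dev/tcp redirection and the coreutils timeout command.

```shell
#!/usr/bin/env bash
# Sketch: probe a TCP port on another host from the Nagios server.

# Map the exit status of the probe to a short verdict.
# 124 is the status `timeout` returns when the time limit is hit.
classify_probe_status() {
    case "$1" in
        0)   echo "open" ;;
        124) echo "timed out (packets probably dropped on the way)" ;;
        *)   echo "closed, refused, or unreachable" ;;
    esac
}

# Try to open a TCP connection to host:port, waiting at most <wait> seconds.
probe_tcp() {
    local host="$1" port="$2" wait="${3:-5}" status=0
    timeout "$wait" bash -c "exec 3<>/dev/tcp/$host/$port" 2>/dev/null || status=$?
    classify_probe_status "$status"
}

# REMOTE_HOST is a placeholder; nothing runs unless it is set.
if [ -n "${REMOTE_HOST:-}" ]; then
    echo "5666/tcp on $REMOTE_HOST: $(probe_tcp "$REMOTE_HOST" 5666)"
fi
```

For example, REMOTE_HOST=remote-ip bash probe.sh (the script name is arbitrary). A "timed out" verdict would be consistent with the check_nrpe socket timeouts shown earlier and would point toward something dropping packets between the two hosts rather than toward NRPE itself.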