This support forum board is for support questions relating to
Nagios XI , our flagship commercial network monitoring solution.
vuduops
Posts: 81 Joined: Wed Sep 07, 2016 1:34 pm
by vuduops » Mon Dec 05, 2016 7:29 pm
I am getting the error below when I try to start nrpe on one of the client servers. Can you please help me resolve this issue?
Code:
[[email protected] ~]# service nrpe status
nrpe dead but subsys locked
[[email protected] ~]# rm -f /var/lock/subsys/nrpe
[[email protected] ~]# ls -ltr /var/lock/subsys/nrpe
ls: cannot access /var/lock/subsys/nrpe: No such file or directory
[[email protected] ~]# ls -ltr /var/lock/subsys/
total 8
-rw-r--r-- 1 root root 0 Oct 26 22:16 lvm2-monitor
-rw-r--r-- 1 root root 0 Oct 26 22:17 network
-rw-r--r-- 1 root root 0 Oct 26 22:17 auditd
-rw------- 1 root root 0 Oct 26 22:17 rsyslog
-rw-r--r-- 1 root root 0 Oct 26 22:17 messagebus
-rw-r--r-- 1 root root 0 Oct 26 22:17 blk-availability
-rw-r--r-- 1 root root 0 Oct 26 22:17 netfs
-rw-r--r-- 1 root root 0 Oct 26 22:17 acpid
-rw-r--r-- 1 root root 0 Oct 26 22:17 sshd
-rw-r--r-- 1 root root 0 Oct 26 22:17 xinetd
-rw-r--r-- 1 root root 0 Oct 26 22:17 ntpdate
-rw-r--r-- 1 root root 0 Oct 26 22:17 ntpd
-rw-r--r-- 1 root root 0 Oct 26 22:17 haveged
-rw-r--r-- 1 root root 0 Oct 26 22:17 postfix
-rw-r--r-- 1 root root 0 Oct 26 22:17 crond
-rw-r--r-- 1 root root 0 Oct 26 22:18 atd
-rw-r--r-- 1 root root 0 Oct 26 22:18 local
-rw-r--r-- 1 root root 0 Oct 26 22:19 ossec-hids
-rw-r--r-- 1 root root 0 Dec 3 04:21 splunk
-rw-r--r-- 1 root root 0 Dec 5 19:24 httpd
-rw-r--r-- 1 root root 0 Dec 5 19:24 vuduServices
drwxrwxr-x. 5 root lock 4096 Dec 5 22:02 ../
drwxr-xr-x. 2 vvond vvond 4096 Dec 6 00:22 ./
[[email protected] ~]# service nrpe restart
Shutting down nrpe: [FAILED]
Starting nrpe: [ OK ]
[[email protected] ~]# service nrpe status
nrpe dead but subsys locked
[[email protected] ~]# tail -f /var/log/messages
Dec 6 00:25:01 bugzilla1 sshd[29928]: Accepted publickey for vvond from 10.230.48.209 port 41084 ssh2
Dec 6 00:25:01 bugzilla1 sshd[29940]: Received disconnect from 10.230.48.209: 11: disconnected by user
Dec 6 00:25:02 bugzilla1 sshd[29944]: Accepted publickey for vvond from 10.230.96.12 port 35084 ssh2
Dec 6 00:25:02 bugzilla1 sshd[29946]: Received disconnect from 10.230.96.12: 11: disconnected by user
Dec 6 00:25:05 bugzilla1 xinetd[1353]: START: nrpe pid=29950 from=::ffff:10.230.51.30
Dec 6 00:25:05 bugzilla1 xinetd[29950]: FAIL: nrpe address from=::ffff:10.230.51.30
Dec 6 00:25:05 bugzilla1 xinetd[1353]: EXIT: nrpe status=0 pid=29950 duration=0(sec)
Dec 6 00:25:20 bugzilla1 xinetd[1353]: START: nrpe pid=29976 from=::ffff:10.230.51.30
Dec 6 00:25:20 bugzilla1 xinetd[29976]: FAIL: nrpe address from=::ffff:10.230.51.30
Dec 6 00:25:20 bugzilla1 xinetd[1353]: EXIT: nrpe status=0 pid=29976 duration=0(sec)
Dec 6 00:27:48 bugzilla1 xinetd[1353]: START: nrpe pid=30091 from=::ffff:10.230.51.30
Dec 6 00:27:48 bugzilla1 xinetd[30091]: FAIL: nrpe address from=::ffff:10.230.51.30
Dec 6 00:27:48 bugzilla1 xinetd[1353]: EXIT: nrpe status=0 pid=30091 duration=0(sec)
lmiltchev
Bugs find me
Posts: 13589 Joined: Mon May 23, 2011 12:15 pm
by lmiltchev » Tue Dec 06, 2016 10:52 am
How did you install NRPE? Did you follow our "official" Linux agent installer?
https://assets.nagios.com/downloads/nag ... _Agent.pdf
Is 10.230.51.30 the IP address of your nagios server?
Run the following commands on the client machine and show the output:
Code:
uname -a
cat /etc/*release
ps axuw | grep nrpe
netstat -at | grep nrpe
find / -name nrpe
Be sure to check out our
Knowledgebase for helpful articles and solutions!
vuduops
Posts: 81 Joined: Wed Sep 07, 2016 1:34 pm
by vuduops » Tue Dec 06, 2016 11:37 am
I installed nrpe via rpm.
Code:
[[email protected] ~]# uname -a
Linux bugzilla1.devmlp.marquee.net 2.6.32-642.6.1.el6.x86_64 #1 SMP Wed Oct 5 00:36:12 UTC 2016 x86_64 x86_64 x86_64 GNU/Linux
[[email protected] ~]# cat /etc/*release
CentOS release 6.8 (Final)
CentOS release 6.8 (Final)
CentOS release 6.8 (Final)
[[email protected] ~]# ps axuw | grep nrpe
root 27450 0.0 0.0 103312 856 pts/1 S+ 16:35 0:00 grep nrpe
[[email protected] ~]# netstat -at | grep nrpe
tcp 0 0 *:nrpe *:* LISTEN
[[email protected] ~]# find / -name nrpe
/etc/sysconfig/nrpe
/etc/xinetd.d/nrpe
/etc/rc.d/init.d/nrpe
/tmp/linux-nrpe-agent/subcomponents/nrpe
/tmp/linux-nrpe-agent/subcomponents/nrpe/mods/cfg/nrpe
/var/lock/subsys/nrpe
/var/run/nrpe
/usr/local/nagios/etc/nrpe
/usr/local/nagios/bin/nrpe
/usr/sbin/nrpe
lmiltchev
Bugs find me
Posts: 13589 Joined: Mon May 23, 2011 12:15 pm
by lmiltchev » Tue Dec 06, 2016 12:41 pm
Having two nrpe binaries doesn't seem right...
Code:
/usr/local/nagios/bin/nrpe
/usr/sbin/nrpe
Is it possible that you tried to install NRPE using our "official Linux agent" installer AFTER you had NRPE installed from a repo (or vice versa)? The lines below seem to support this possibility.
Code:
/etc/xinetd.d/nrpe
/tmp/linux-nrpe-agent/subcomponents/nrpe
/tmp/linux-nrpe-agent/subcomponents/nrpe/mods/cfg/nrpe
What is the output of the following command?
I would recommend removing the nrpe packages (yum remove <name of the nrpe package>) and rerunning our official installer. You can back up the nrpe.cfg file before removing the nrpe packages, in case you have custom entries/commands defined in nrpe.cfg.
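The backup step suggested above could look roughly like the sketch below. The helper name backup_cfg and the /etc/nagios/nrpe.cfg path are assumptions for illustration (the thread does not say where the repo package put nrpe.cfg), and the commented rpm -qf line is just one way to find the package name to pass to yum remove.

```shell
#!/bin/sh
# Hypothetical helper: copy a config file next to itself with a .bak suffix
# so custom command definitions survive a package removal.
backup_cfg() {
    cfg=$1
    [ -f "$cfg" ] || { echo "no such file: $cfg" >&2; return 1; }
    cp -p "$cfg" "$cfg.bak" && echo "$cfg.bak"
}

# Assumed location for a repo-installed NRPE; adjust to where your nrpe.cfg lives.
# backup_cfg /etc/nagios/nrpe.cfg
# rpm -qf /usr/sbin/nrpe      # prints the package that owns the repo binary
# yum remove <name of the nrpe package>
```

The function prints the path of the backup it created, so you can confirm the copy exists before removing anything.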
vuduops
Posts: 81 Joined: Wed Sep 07, 2016 1:34 pm
by vuduops » Tue Dec 06, 2016 1:04 pm
Ah, OK. My bad, I forgot I had used this server to test the Linux agent. I removed the xinetd nrpe service. Thanks a lot for all the help.
I currently have my monitoring server running on HTTPS.
Is the communication between client and monitoring server encrypted when using nrpe client?
-Krishna
lmiltchev
Bugs find me
Posts: 13589 Joined: Mon May 23, 2011 12:15 pm
by lmiltchev » Tue Dec 06, 2016 2:01 pm
We compile NRPE with SSL, so the communication between the client and the server is encrypted, unless you use the "-n" option.
Options:
-n = Do not use SSL
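As a rough illustration of the difference, the sketch below only prints the two check_nrpe command lines, with and without "-n". The plugin path, the helper name nrpe_cmd and the address 192.0.2.10 are assumptions, not values from this thread.

```shell
#!/bin/sh
# Assumed plugin path; adjust to wherever check_nrpe lives on your Nagios server.
CHECK_NRPE=/usr/local/nagios/libexec/check_nrpe

# Hypothetical helper: print the check_nrpe command line for a host,
# adding -n only when SSL should be turned off.
nrpe_cmd() {
    host=$1
    mode=${2:-ssl}
    if [ "$mode" = "nossl" ]; then
        echo "$CHECK_NRPE -H $host -n"
    else
        echo "$CHECK_NRPE -H $host"
    fi
}

nrpe_cmd 192.0.2.10          # default: SSL is used
nrpe_cmd 192.0.2.10 nossl    # -n: no SSL, traffic is not encrypted
```

Running the first form by hand against a client is a quick way to confirm that the default, encrypted path works before touching any options.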
vuduops
Posts: 81 Joined: Wed Sep 07, 2016 1:34 pm
by vuduops » Wed Dec 07, 2016 4:33 pm
Thank you. Also, is there a way to specify different times for different checks in nrpe.cfg?
-Krishna
lmiltchev
Bugs find me
Posts: 13589 Joined: Mon May 23, 2011 12:15 pm
by lmiltchev » Wed Dec 07, 2016 5:39 pm
Can you elaborate on this? Are you talking about the check interval, retry interval, max check attempts, etc.?
vuduops
Posts: 81 Joined: Wed Sep 07, 2016 1:34 pm
by vuduops » Mon Dec 19, 2016 12:13 pm
I am trying to understand the output of the advanced status details. For example, the check uses generic-service as its template. I have attached a screenshot for reference.
I am mainly trying to understand the Duration, Current Check, Last Check and Next Check values in the screenshot. If a service goes down on the host, how fast can Nagios report the issue given these parameters?
Code:
# nagios.cfg definitions
service_check_timeout=60
service_freshness_check_interval=60
host_check_timeout=30
host_freshness_check_interval=60
# generic service template
define service {
name generic-service
is_volatile 0
max_check_attempts 3
check_interval 10
retry_interval 2
active_checks_enabled 1
passive_checks_enabled 1
check_period 24x7
parallelize_check 1
obsess_over_service 1
check_freshness 0
event_handler_enabled 1
flap_detection_enabled 1
process_perf_data 1
retain_status_information 1
retain_nonstatus_information 1
notification_interval 60
notification_period 24x7
notification_options w,u,c,r
notifications_enabled 1
contact_groups admins
register 0
}
avandemore
Posts: 1597 Joined: Tue Sep 27, 2016 4:57 pm
by avandemore » Mon Dec 19, 2016 12:44 pm
Duration = The time the object has been in its current state.
Current Check = The current check attempt number, out of the maximum set by max_check_attempts (Max Retries).
Last Check = The time of the last active check.
Next Check = The next scheduled check.
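On the "how fast" question, here is a rough back-of-the-envelope sketch using the values from the generic-service template in the post above. It assumes interval_length=60 (the Nagios default, not shown in the posted nagios.cfg), so one interval unit is one minute, and it ignores check execution time and scheduling latency. The variable names are only for illustration.

```shell
#!/bin/sh
# Values copied from the generic-service template posted above.
check_interval=10
retry_interval=2
max_check_attempts=3
# Assumption: interval_length=60, so the numbers above are minutes.

# After the first failed check, Nagios rechecks (max_check_attempts - 1) times,
# spaced retry_interval apart, before the problem becomes a HARD state.
retry_minutes=$(( (max_check_attempts - 1) * retry_interval ))

# Best case: the service fails just before a scheduled check.
best_case_minutes=$retry_minutes
# Worst case: the service fails just after a check that came back OK.
worst_case_minutes=$(( check_interval + retry_minutes ))

echo "HARD state after ${best_case_minutes} to ${worst_case_minutes} minutes"
```

Under those assumptions the problem is confirmed, and notifications can go out, somewhere between 4 and 14 minutes after the service actually fails. Lowering check_interval shrinks the worst case; lowering retry_interval or max_check_attempts shrinks both.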
Previous Nagios employee