
nrpe dead but subsys locked

Posted: Mon Dec 05, 2016 7:29 pm
by vuduops
I am getting the error below when I try to start nrpe on one of the client servers. Can you please help me resolve this issue?

Code:

[root@bugzilla1 ~]# service nrpe status
nrpe dead but subsys locked
[root@bugzilla1 ~]# rm -f /var/lock/subsys/nrpe
[root@bugzilla1 ~]# ls -ltr /var/lock/subsys/nrpe
ls: cannot access /var/lock/subsys/nrpe: No such file or directory
[root@bugzilla1 ~]# ls -ltr /var/lock/subsys/
total 8
-rw-r--r--  1 root  root     0 Oct 26 22:16 lvm2-monitor
-rw-r--r--  1 root  root     0 Oct 26 22:17 network
-rw-r--r--  1 root  root     0 Oct 26 22:17 auditd
-rw-------  1 root  root     0 Oct 26 22:17 rsyslog
-rw-r--r--  1 root  root     0 Oct 26 22:17 messagebus
-rw-r--r--  1 root  root     0 Oct 26 22:17 blk-availability
-rw-r--r--  1 root  root     0 Oct 26 22:17 netfs
-rw-r--r--  1 root  root     0 Oct 26 22:17 acpid
-rw-r--r--  1 root  root     0 Oct 26 22:17 sshd
-rw-r--r--  1 root  root     0 Oct 26 22:17 xinetd
-rw-r--r--  1 root  root     0 Oct 26 22:17 ntpdate
-rw-r--r--  1 root  root     0 Oct 26 22:17 ntpd
-rw-r--r--  1 root  root     0 Oct 26 22:17 haveged
-rw-r--r--  1 root  root     0 Oct 26 22:17 postfix
-rw-r--r--  1 root  root     0 Oct 26 22:17 crond
-rw-r--r--  1 root  root     0 Oct 26 22:18 atd
-rw-r--r--  1 root  root     0 Oct 26 22:18 local
-rw-r--r--  1 root  root     0 Oct 26 22:19 ossec-hids
-rw-r--r--  1 root  root     0 Dec  3 04:21 splunk
-rw-r--r--  1 root  root     0 Dec  5 19:24 httpd
-rw-r--r--  1 root  root     0 Dec  5 19:24 vuduServices
drwxrwxr-x. 5 root  lock  4096 Dec  5 22:02 ../
drwxr-xr-x. 2 vvond vvond 4096 Dec  6 00:22 ./
[root@bugzilla1 ~]# service nrpe restart
Shutting down nrpe:                                        [FAILED]
Starting nrpe:                                             [  OK  ]
[root@bugzilla1 ~]# service nrpe status
nrpe dead but subsys locked

[root@bugzilla1 ~]# tail -f /var/log/messages
Dec  6 00:25:01 bugzilla1 sshd[29928]: Accepted publickey for vvond from 10.230.48.209 port 41084 ssh2
Dec  6 00:25:01 bugzilla1 sshd[29940]: Received disconnect from 10.230.48.209: 11: disconnected by user
Dec  6 00:25:02 bugzilla1 sshd[29944]: Accepted publickey for vvond from 10.230.96.12 port 35084 ssh2
Dec  6 00:25:02 bugzilla1 sshd[29946]: Received disconnect from 10.230.96.12: 11: disconnected by user
Dec  6 00:25:05 bugzilla1 xinetd[1353]: START: nrpe pid=29950 from=::ffff:10.230.51.30
Dec  6 00:25:05 bugzilla1 xinetd[29950]: FAIL: nrpe address from=::ffff:10.230.51.30
Dec  6 00:25:05 bugzilla1 xinetd[1353]: EXIT: nrpe status=0 pid=29950 duration=0(sec)
Dec  6 00:25:20 bugzilla1 xinetd[1353]: START: nrpe pid=29976 from=::ffff:10.230.51.30
Dec  6 00:25:20 bugzilla1 xinetd[29976]: FAIL: nrpe address from=::ffff:10.230.51.30
Dec  6 00:25:20 bugzilla1 xinetd[1353]: EXIT: nrpe status=0 pid=29976 duration=0(sec)
Dec  6 00:27:48 bugzilla1 xinetd[1353]: START: nrpe pid=30091 from=::ffff:10.230.51.30
Dec  6 00:27:48 bugzilla1 xinetd[30091]: FAIL: nrpe address from=::ffff:10.230.51.30
Dec  6 00:27:48 bugzilla1 xinetd[1353]: EXIT: nrpe status=0 pid=30091 duration=0(sec)


Re: nrpe dead but subsys locked

Posted: Tue Dec 06, 2016 10:52 am
by lmiltchev
How did you install NRPE? Did you use our "official" Linux agent installer?

https://assets.nagios.com/downloads/nag ... _Agent.pdf

Is 10.230.51.30 the IP address of your nagios server?

Run the following commands on the client machine and show the output:

Code:

uname -a
cat /etc/*release
ps axuw | grep nrpe
netstat -at | grep nrpe
find / -name nrpe

Re: nrpe dead but subsys locked

Posted: Tue Dec 06, 2016 11:37 am
by vuduops
I installed nrpe via rpm.

Code:

[root@bugzilla1 ~]# uname -a
Linux bugzilla1.devmlp.marquee.net 2.6.32-642.6.1.el6.x86_64 #1 SMP Wed Oct 5 00:36:12 UTC 2016 x86_64 x86_64 x86_64 GNU/Linux
[root@bugzilla1 ~]# cat /etc/*release
CentOS release 6.8 (Final)
CentOS release 6.8 (Final)
CentOS release 6.8 (Final)
[root@bugzilla1 ~]# ps axuw | grep nrpe
root     27450  0.0  0.0 103312   856 pts/1    S+   16:35   0:00 grep nrpe
[root@bugzilla1 ~]# netstat -at | grep nrpe
tcp        0      0 *:nrpe                      *:*                         LISTEN      
[root@bugzilla1 ~]# find / -name nrpe
/etc/sysconfig/nrpe
/etc/xinetd.d/nrpe
/etc/rc.d/init.d/nrpe
/tmp/linux-nrpe-agent/subcomponents/nrpe
/tmp/linux-nrpe-agent/subcomponents/nrpe/mods/cfg/nrpe
/var/lock/subsys/nrpe
/var/run/nrpe
/usr/local/nagios/etc/nrpe
/usr/local/nagios/bin/nrpe
/usr/sbin/nrpe

Re: nrpe dead but subsys locked

Posted: Tue Dec 06, 2016 12:41 pm
by lmiltchev
Having two nrpe binaries doesn't seem right...

Code:

/usr/local/nagios/bin/nrpe
/usr/sbin/nrpe

Is it possible that you tried to install NRPE using our "official Linux agent" installer AFTER you had NRPE installed from a repo (or vice versa)? The lines below seem to support this possibility.

Code:

/etc/xinetd.d/nrpe
/tmp/linux-nrpe-agent/subcomponents/nrpe
/tmp/linux-nrpe-agent/subcomponents/nrpe/mods/cfg/nrpe

What is the output of the following command?

Code:

rpm -qa | grep -i nrpe

I would recommend removing the nrpe packages (yum remove <name of the nrpe package>) and rerunning our official installer. You can back up the nrpe.cfg file before removing the nrpe packages, in case you have custom entries or commands defined in it.
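
As a rough way to double-check the diagnosis above, the `find` output can be tested mechanically for the two warning signs: more than one nrpe daemon binary, and an xinetd service entry sitting next to the init script. This is only an illustrative sketch. The function names `count_nrpe_binaries` and `has_xinetd_entry` are made up for this example, and the path list is copied from the earlier post.

```shell
#!/bin/sh
# Paths reported by `find / -name nrpe` earlier in this thread.
found_paths='/etc/sysconfig/nrpe
/etc/xinetd.d/nrpe
/etc/rc.d/init.d/nrpe
/tmp/linux-nrpe-agent/subcomponents/nrpe
/tmp/linux-nrpe-agent/subcomponents/nrpe/mods/cfg/nrpe
/var/lock/subsys/nrpe
/var/run/nrpe
/usr/local/nagios/etc/nrpe
/usr/local/nagios/bin/nrpe
/usr/sbin/nrpe'

# Hypothetical helper: count paths on stdin that end in /bin/nrpe or /sbin/nrpe.
count_nrpe_binaries() {
    grep -c -E '/s?bin/nrpe$'
}

# Hypothetical helper: succeed if stdin contains the xinetd service file for nrpe.
has_xinetd_entry() {
    grep -qxF '/etc/xinetd.d/nrpe'
}

binaries=$(printf '%s\n' "$found_paths" | count_nrpe_binaries)
echo "nrpe binaries found: $binaries"

if printf '%s\n' "$found_paths" | has_xinetd_entry; then
    echo "xinetd is configured to serve nrpe"
fi
```

With the paths from this thread it reports two binaries and an xinetd entry. If xinetd already owns the nrpe port, a standalone daemon started from the init script most likely cannot bind to it and exits, which would match the "dead but subsys locked" status.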

Re: nrpe dead but subsys locked

Posted: Tue Dec 06, 2016 1:04 pm
by vuduops
Ah ok :-). My bad, I forgot I had used this server to test the Linux agent. I removed the xinetd nrpe service. Thanks a lot for all the help.

I currently have my monitoring server running on HTTPS.

Is the communication between client and monitoring server encrypted when using nrpe client?

-Krishna

Re: nrpe dead but subsys locked

Posted: Tue Dec 06, 2016 2:01 pm
by lmiltchev
We compile NRPE with SSL, so the communication between the client and the server is encrypted, unless you use the "-n" option.
Options:
-n = Do not use SSL
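
For illustration, this is roughly how the two modes look from the Nagios server side. It is a dry-run sketch: `nrpe_cmd` is a hypothetical helper that only prints the check_nrpe command line instead of running it, the plugin path is the usual default for source installs and may differ on your system, and 192.0.2.10 is a placeholder address.

```shell
#!/bin/sh
# Assumed plugin location (default for source installs); override if yours differs.
CHECK_NRPE=${CHECK_NRPE:-/usr/local/nagios/libexec/check_nrpe}

# Hypothetical helper: print the check_nrpe command for a host.
# Second argument "nossl" adds -n (do not use SSL); anything else keeps SSL on.
nrpe_cmd() {
    host=$1
    mode=${2:-ssl}
    if [ "$mode" = "nossl" ]; then
        printf '%s -n -H %s\n' "$CHECK_NRPE" "$host"
    else
        printf '%s -H %s\n' "$CHECK_NRPE" "$host"
    fi
}

nrpe_cmd 192.0.2.10          # encrypted (default)
nrpe_cmd 192.0.2.10 nossl    # plain text, only if the daemon also runs without SSL
```

As far as I know, both sides have to agree: if the daemon on the client uses SSL and the check is sent with -n, the handshake fails.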

Re: nrpe dead but subsys locked

Posted: Wed Dec 07, 2016 4:33 pm
by vuduops
Thank you. Also, is there a way to specify different times for different checks in nrpe.cfg?

-Krishna

Re: nrpe dead but subsys locked

Posted: Wed Dec 07, 2016 5:39 pm
by lmiltchev
Can you elaborate on this? Are you talking about the check interval, retry interval, max check attempts, etc.?

Re: nrpe dead but subsys locked

Posted: Mon Dec 19, 2016 12:13 pm
by vuduops
I am trying to understand the output of the advanced status details. For example, the check uses generic-service as its template. I have attached a screenshot for reference.

I am mainly trying to understand the Duration, Current Check, Last Check and Next Check values in the screenshot. If a service goes down on the host, how fast can Nagios report the issue, given the parameters below?

Code:

# nagios.cfg definitions

service_check_timeout=60
service_freshness_check_interval=60

host_check_timeout=30
host_freshness_check_interval=60

# generic service template

define service {
       name                                     generic-service
       is_volatile                              0
       max_check_attempts                       3
       check_interval                           10
       retry_interval                           2
       active_checks_enabled                    1
       passive_checks_enabled                   1
       check_period                             24x7
       parallelize_check                        1
       obsess_over_service                      1
       check_freshness                          0
       event_handler_enabled                    1
       flap_detection_enabled                   1
       process_perf_data                        1
       retain_status_information                1
       retain_nonstatus_information             1
       notification_interval                    60
       notification_period                      24x7
       notification_options                     w,u,c,r
       notifications_enabled                    1
       contact_groups                           admins
       register                                 0
}



Re: nrpe dead but subsys locked

Posted: Mon Dec 19, 2016 12:44 pm
by avandemore
Duration = The time the object has been in its current state.
Current Check = The number of the current check attempt, counted up to max_check_attempts.
Last Check = The time of the last active check.
Next Check = The time of the next scheduled check.
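
On the "how fast can Nagios report the issue" question, a back-of-the-envelope estimate can be worked out from the template values posted above (check_interval 10, retry_interval 2, max_check_attempts 3). The sketch below assumes interval_length is left at its default of 60, so the intervals are in minutes, and it ignores check execution time and scheduling latency. The function names are made up for this example.

```shell
#!/bin/sh
# Time from the first failed check until the service reaches a HARD state,
# which is when notifications are sent: the remaining attempts are retried
# at retry_interval.
soft_to_hard_minutes() {
    retry_interval=$1
    max_check_attempts=$2
    echo $(( (max_check_attempts - 1) * retry_interval ))
}

# Worst case: the service fails right after a successful check, so a full
# check_interval passes before the first failed check is even seen.
worst_case_minutes() {
    check_interval=$1
    soft_to_hard=$(soft_to_hard_minutes "$2" "$3")
    echo $(( check_interval + soft_to_hard ))
}

echo "first failure to hard state: $(soft_to_hard_minutes 2 3) min"   # 4 min
echo "worst case from outage:      $(worst_case_minutes 10 2 3) min"  # 14 min
```

So with this template a problem would be reported somewhere between roughly 4 and 14 minutes after the service goes down. Lowering check_interval shrinks the worst case; lowering retry_interval or max_check_attempts shrinks both.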