Problem in monitoring HP-UX servers using nrpe giving error

Posted: Tue Feb 07, 2012 6:32 am
by Satyam
I am monitoring some HP-UX servers on which we have installed nrpe. It was working fine until a few days ago, when it started giving an SSL handshake error. I have checked nrpe.cfg and it is perfectly fine. nrpe has been started from the command line on the HP-UX servers.

I have tried the following options from my NagiosXI server.

[root@nagxi libexec]# ./check_nrpe -H x.x.x.x -c check_disk -a '-w 10% -c 5% -p /home'
CHECK_NRPE: Socket timeout after 10 seconds.



[root@nagxi libexec]# ./check_nrpe -H x.x.x.x -t 90 -c check_disk -a '-w 10% -c 5% -p /home'
CHECK_NRPE: Error - Could not complete SSL handshake.


[root@nagxi libexec]# ./check_nrpe -H x.x.x.x -n -c check_disk -a '-w 10% -c 5% -p /home'
CHECK_NRPE: Socket timeout after 10 seconds.

[root@nagxi libexec]# ./check_nrpe -H x.x.x.x -n -t 20 -c check_disk -a '-w 10% -c 5% -p /home'
CHECK_NRPE: Socket timeout after 20 seconds.

Re: nrpe giving ssl handshake error

Posted: Tue Feb 07, 2012 11:36 am
by scottwilkerson
Have you made any modifications to either the XI server or the remote machine?

What is the dont_blame_nrpe in the nrpe.cfg on the HP-UX machine set to?

It should be:
dont_blame_nrpe=1

Re: nrpe giving ssl handshake error

Posted: Wed Feb 08, 2012 12:52 am
by Satyam
Hi Scott,

dont_blame_nrpe=1 is set in nrpe.cfg

The debugging steps I have done so far are below:

CASE A
ON THE HP-UX SERVER NRPE PROCESS STARTED AS:
/usr/local/nagios/bin/nrpe -c /usr/local/nagios/etc/nrpe.cfg -d

FROM NAGIOSXI SERVER:
[root@mmkndnagxi libexec]# ./check_nrpe -H 10.204.100.74 -n -p 5667 -c check_mem -a '-w 65 -c 75'
CHECK_NRPE: Socket timeout after 10 seconds.
[root@mmkndnagxi libexec]# ./check_nrpe -H 10.204.100.74 -n -t 90 -p 5667 -c check_mem -a '-w 65 -c 75'
CHECK_NRPE: Error receiving data from daemon.
[root@mmkndnagxi libexec]# ./check_nrpe -H 10.204.100.73 -t 60 -p 5667 -c check_mem -a '-w 65 -c 75'
CHECK_NRPE: Error - Could not complete SSL handshake.


CASE B
ON THE HP-UX SERVER NRPE PROCESS STARTED AS:
/usr/local/nagios/bin/nrpe -n -c /usr/local/nagios/etc/nrpe.cfg -d

FROM NAGIOSXI SERVER:
[root@mmkndnagxi libexec]# ./check_nrpe -H 10.204.100.73 -t 60 -p 5667 -c check_mem -a '-w 65 -c 75'
CHECK_NRPE: Error - Could not complete SSL handshake.
You have new mail in /var/spool/mail/root
[root@mmkndnagxi libexec]# ./check_nrpe -H 10.204.100.73 -n -p 5667 -c check_mem -a '-w 65 -c 75'
CHECK_NRPE: Socket timeout after 10 seconds.
[root@mmkndnagxi libexec]# ./check_nrpe -H 10.204.100.73 -n -t 60 -p 5667 -c check_mem -a '-w 65 -c 75'
CHECK_NRPE: Error receiving data from daemon.
[root@mmkndnagxi libexec]#

------------------------------Monitored Server Log-----------------------------
Feb 7 20:34:12 MCPIFDR nrpe[10434]: Daemon shutdown
Feb 7 20:35:22 MCPIFDR nrpe[11237]: Unknown option specified in config file '/usr/local/nagios/etc/nrpe.cfg' - Line 81
Feb 7 20:35:22 MCPIFDR nrpe[11237]: INFO: SSL/TLS NOT initialized. Network encryption DISABLED.
Feb 7 20:35:22 MCPIFDR nrpe[11238]: Starting up daemon
Feb 7 20:35:22 MCPIFDR nrpe[11238]: Warning: Could not set effective GID=111
Feb 7 20:35:22 MCPIFDR nrpe[11238]: Warning: Daemon is configured to accept command arguments from clients!
Feb 7 20:35:22 MCPIFDR nrpe[11238]: Listening for connections on port 5667
Feb 7 20:37:24 MCPIFDR nrpe[11298]: Could not read request from client, bailing out...


$ ps -ef|grep nrpe
nagios 11238 1 0 20:35:22 ? 0:00 /usr/local/nagios/bin/nrpe -n -c /usr/local/nagios/etc/nrpe.cfg -d
nagios 11332 6726 0 20:39:17 pts/1 0:00 grep nrpe

$ netstat -an|grep 5667
tcp 0 0 10.204.100.73.1527 10.142.0.143.56670 FIN_WAIT_2
tcp 0 0 10.204.100.73.5667 *.* LISTEN

In both cases I am able to telnet to port 5667 from the NagiosXI server.

-----------------------------------------------------------nrpe.cfg-----------------------------------
#############################################################################
# Sample NRPE Config File
# Written by: Ethan Galstad ([email protected])
#
# Last Modified: 11-23-2007
#
# NOTES:
# This is a sample configuration file for the NRPE daemon. It needs to be
# located on the remote host that is running the NRPE daemon, not the host
# from which the check_nrpe client is being executed.
#############################################################################


# LOG FACILITY
# The syslog facility that should be used for logging purposes.

log_facility=daemon



# PID FILE
# The name of the file in which the NRPE daemon should write it's process ID
# number. The file is only written if the NRPE daemon is started by the root
# user and is running in standalone mode.

pid_file=/usr/local/nagios/bin/nrpe.pid



# PORT NUMBER
# Port number we should wait for connections on.
# NOTE: This must be a non-priviledged port (i.e. > 1024).
# NOTE: This option is ignored if NRPE is running under either inetd or xinetd

server_port=5667



# SERVER ADDRESS
# Address that nrpe should bind to in case there are more than one interface
# and you do not want nrpe to bind on all interfaces.
# NOTE: This option is ignored if NRPE is running under either inetd or xinetd

server_address=10.204.100.73



# NRPE USER
# This determines the effective user that the NRPE daemon should run as.
# You can either supply a username or a UID.
#
# NOTE: This option is ignored if NRPE is running under either inetd or xinetd

nrpe_user=nagios



# NRPE GROUP
# This determines the effective group that the NRPE daemon should run as.
# You can either supply a group name or a GID.
#
# NOTE: This option is ignored if NRPE is running under either inetd or xinetd

nrpe_group=nagios



# ALLOWED HOST ADDRESSES
# This is an optional comma-delimited list of IP address or hostnames
# that are allowed to talk to the NRPE daemon.
#
# Note: The daemon only does rudimentary checking of the client's IP
# address. I would highly recommend adding entries in your /etc/hosts.allow
# file to allow only the specified host to connect to the port
# you are running this daemon on.
#
# NOTE: This option is ignored if NRPE is running under either inetd or xinetd

#allowed_hosts=127.0.0.1

only_from=127.0.0.1 10.2.202.221

# COMMAND ARGUMENT PROCESSING
# This option determines whether or not the NRPE daemon will allow clients
# to specify arguments to commands that are executed. This option only works
# if the daemon was configured with the --enable-command-args configure script
# option.
#
# *** ENABLING THIS OPTION IS A SECURITY RISK! ***
# Read the SECURITY file for information on some of the security implications
# of enabling this variable.
#
# Values: 0=do not allow arguments, 1=allow command arguments

dont_blame_nrpe=1



# COMMAND PREFIX
# This option allows you to prefix all commands with a user-defined string.
# A space is automatically added between the specified prefix string and the
# command line from the command definition.
#
# *** THIS EXAMPLE MAY POSE A POTENTIAL SECURITY RISK, SO USE WITH CAUTION! ***
# Usage scenario:
# Execute restricted commmands using sudo. For this to work, you need to add
# the nagios user to your /etc/sudoers. An example entry for alllowing
# execution of the plugins from might be:
#
# nagios ALL=(ALL) NOPASSWD: /usr/lib/nagios/plugins/
#
# This lets the nagios user run all commands in that directory (and only them)
# without asking for a password. If you do this, make sure you don't give
# random users write access to that directory or its contents!

# command_prefix=/usr/bin/sudo



# DEBUGGING OPTION
# This option determines whether or not debugging messages are logged to the
# syslog facility.
# Values: 0=debugging off, 1=debugging on

debug=1

# COMMAND TIMEOUT
# This specifies the maximum number of seconds that the NRPE daemon will
# allow plugins to finish executing before killing them off.

command_timeout=60



# CONNECTION TIMEOUT
# This specifies the maximum number of seconds that the NRPE daemon will
# wait for a connection to be established before exiting. This is sometimes
# seen where a network problem stops the SSL being established even though
# all network sessions are connected. This causes the nrpe daemons to
# accumulate, eating system resources. Do not set this too low.

connection_timeout=300



# WEEK RANDOM SEED OPTION
# This directive allows you to use SSL even if your system does not have
# a /dev/random or /dev/urandom (on purpose or because the necessary patches
# were not applied). The random number generator will be seeded from a file
# which is either a file pointed to by the environment valiable $RANDFILE
# or $HOME/.rnd. If neither exists, the pseudo random number generator will
# be initialized and a warning will be issued.
# Values: 0=only seed from /dev/random, 1=also seed from weak randomness

#allow_weak_random_seed=1



# INCLUDE CONFIG FILE
# This directive allows you to include definitions from an external config file.

#include=<somefile.cfg>



# INCLUDE CONFIG DIRECTORY
# This directive allows you to include definitions from config files (with a
# .cfg extension) in one or more directories (with recursion).

#include_dir=<somedirectory>
#include_dir=<someotherdirectory>



# COMMAND DEFINITIONS
# Command definitions that this daemon will run. Definitions
# are in the following format:
#
# command[<command_name>]=<command_line>
#
# When the daemon receives a request to return the results of <command_name>
# it will execute the command specified by the <command_line> argument.
#
# Unlike Nagios, the command line cannot contain macros - it must be
# typed exactly as it should be executed.
#
# Note: Any plugins that are used in the command lines must reside
# on the machine that this daemon is running on! The examples below
# assume that you have plugins installed in a /usr/local/nagios/libexec
# directory. Also note that you will have to modify the definitions below
# to match the argument format the plugins expect. Remember, these are
# examples only!


# The following examples use hardcoded command arguments...

command[check_users]=/usr/local/nagios/libexec/check_users -w 5 -c 10
command[check_load]=/usr/local/nagios/libexec/check_load -w 15,10,5 -c 30,25,20
#command[check_hda1]=/usr/local/nagios/libexec/check_disk -w 20% -c 10% -p /dev/hda1


##Disk Monitoring
command[check_root]=/usr/local/nagios/libexec/check_disk -w 15% -c 5% -p /
command[check_var]=/usr/local/nagios/libexec/check_disk -w 15% -c 5% -p /var
command[check_usr]=/usr/local/nagios/libexec/check_disk -w 10% -c 5% -p /usr
command[check_usr_sap]=/usr/local/nagios/libexec/check_disk -w 10% -c 5% -p /usr/sap
command[check_oracle]=/usr/local/nagios/libexec/check_disk -w 10% -c 5% -p /oracle
command[check_opt]=/usr/local/nagios/libexec/check_disk -w 10% -c 5% -p /opt
command[check_home]=/usr/local/nagios/libexec/check_disk -w 10% -c 5% -p /home
command[check_stand]=/usr/local/nagios/libexec/check_disk -w 10% -c 5% -p /stand
command[check_usr_sap_trans]=/usr/local/nagios/libexec/check_disk -w 10% -c 5% -p /usr/sap/trans
command[check_oracle_sapcheck]=/usr/local/nagios/libexec/check_disk -w 10% -c 5% -p /oracle/ED2/sapcheck

command[check_oracle_sapbackup]=/usr/local/nagios/libexec/check_disk -w 10% -c 5% -p /oracle/ED2/sapbackup
command[check_oracle_stage]=/usr/local/nagios/libexec/check_disk -w 10% -c 5% -p /oracle/stage
command[check_oracle_saptrace]=/usr/local/nagios/libexec/check_disk -w 10% -c 5% -p /oracle/ED2/saptrace
command[check_oracle_sapreorg]=/usr/local/nagios/libexec/check_disk -w 10% -c 5% -p /oracle/ED2/sapreorg
command[check_oracle_sapdata1]=/usr/local/nagios/libexec/check_disk -w 10% -c 5% -p /oracle/ED2/sapdata1
command[check_oracle_sapdata2]=/usr/local/nagios/libexec/check_disk -w 10% -c 5% -p /oracle/ED2/sapdata2
command[check_oracle_sapdata3]=/usr/local/nagios/libexec/check_disk -w 10% -c 5% -p /oracle/ED2/sapdata3
command[check_oracle_sapdata4]=/usr/local/nagios/libexec/check_disk -w 10% -c 5% -p /oracle/ED2/sapdata4
command[check_oracle_origlogA]=/usr/local/nagios/libexec/check_disk -w 10% -c 5% -p /oracle/ED2/origlogA
command[check_oracle_origlogB]=/usr/local/nagios/libexec/check_disk -w 10% -c 5% -p /oracle/ED2/origlogB
command[check_oracle_oraarch]=/usr/local/nagios/libexec/check_disk -w 10% -c 5% -p /oracle/ED2/oraarch
command[check_oracle_mirrlogA]=/usr/local/nagios/libexec/check_disk -w 10% -c 5% -p /oracle/ED2/mirrlogA
command[check_oracle_mirrlogB]=/usr/local/nagios/libexec/check_disk -w 10% -c 5% -p /oracle/ED2/mirrlogB
command[check_sapmnt]=/usr/local/nagios/libexec/check_disk -w 10% -c 5% -p /sapmnt
command[check_sapmnt_ed2]=/usr/local/nagios/libexec/check_disk -w 10% -c 5% -p /sapmnt/ED2

command[check_procs_cron]=/usr/local/nagios/libexec/check_procs -c 1:1 -C cron
command[check_procs_syslogd]=/usr/local/nagios/libexec/check_procs -c 1:1 -C syslogd

#command[check_mem_used]=/usr/local/nagios/libexec/check_memhpux -u -w 75 -c 90
command[check_zombie_procs]=/usr/local/nagios/libexec/check_procs -w 5 -c 10 -s Z


# The following examples allow user-supplied arguments and can
# only be used if the NRPE daemon was compiled with support for
# command arguments *AND* the dont_blame_nrpe directive in this
# config file is set to '1'. This poses a potential security risk, so
# make sure you read the SECURITY file before doing this.

#command[check_users]=/usr/local/nagios/libexec/check_users -w $ARG1$ -c $ARG2$
#command[check_load]=/usr/local/nagios/libexec/check_load -w $ARG1$ -c $ARG2$
#command[check_procs]=/usr/local/nagios/libexec/check_procs -w $ARG1$ -c $ARG2$ -s $ARG3$
#command[check_disk]=/usr/local/nagios/libexec/check_disk -w $ARG1$ -c $ARG2$ -p $ARG3$
command[check_load]=/usr/local/nagios/libexec/check_load $ARG1$
command[check_disk]=/usr/local/nagios/libexec/check_disk $ARG1$
command[check_procs]=/usr/local/nagios/libexec/check_procs $ARG1$

command[check_cpu]=/usr/local/nagios/libexec/check_cpu_stats.sh $ARG1$
command[check_log]=/usr/local/nagios/libexec/check_log -F /var/adm/syslog/syslog.log -O /usr/local/nagios/etc/syslog.log.old -q error

command[check_swap]=/usr/local/nagios/libexec/check_swap $ARG1$
command[check_uptime]=/usr/local/nagios/libexec/check_uptime.sh
command[check_mem_used]=/usr/local/nagios/libexec/check_memhpux -u $ARG1$
command[check_mem]=/usr/local/nagios/libexec/check_mem_hpux_custom $ARG1$
command[check_iostat]=/usr/local/nagios/libexec/check_hp_iostat $ARG1$


Thanks,
Manish Kumar
Open Source Tools Team, IMS, Mahindra Satyam
Bangalore, India

Re: nrpe giving ssl handshake error

Posted: Wed Feb 08, 2012 7:51 am
by scottwilkerson
Can you check the system log on the HP-UX machine to see if there are any NRPE errors there?

tail -f /var/log/messages

Re: nrpe giving ssl handshake error

Posted: Wed Feb 08, 2012 8:31 am
by Satyam
On HP-UX it's tail -f /var/adm/syslog/syslog.log

I have pasted output below:
-----------------------------Monitored Server Log-----------------------------
Feb 7 20:34:12 MCPIFDR nrpe[10434]: Daemon shutdown
Feb 7 20:35:22 MCPIFDR nrpe[11237]: Unknown option specified in config file '/usr/local/nagios/etc/nrpe.cfg' - Line 81
Feb 7 20:35:22 MCPIFDR nrpe[11237]: INFO: SSL/TLS NOT initialized. Network encryption DISABLED.
Feb 7 20:35:22 MCPIFDR nrpe[11238]: Starting up daemon
Feb 7 20:35:22 MCPIFDR nrpe[11238]: Warning: Could not set effective GID=111
Feb 7 20:35:22 MCPIFDR nrpe[11238]: Warning: Daemon is configured to accept command arguments from clients!
Feb 7 20:35:22 MCPIFDR nrpe[11238]: Listening for connections on port 5667
Feb 7 20:37:24 MCPIFDR nrpe[11298]: Could not read request from client, bailing out...
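
The "Unknown option specified in config file ... - Line 81" entry names an exact line, so printing that line from the file is a quick way to see what the daemon rejected. A minimal sketch, assuming a POSIX `sed`; the function name `show_config_line` is hypothetical, and the path and line number are the ones from the log above.

```shell
#!/bin/sh
# Hypothetical helper: print one numbered line of a config file.
show_config_line() {
    # $1 = config file path, $2 = line number reported in the log
    sed -n "${2}p" "$1"
}

# Path and line number taken from the syslog entry above.
cfg=/usr/local/nagios/etc/nrpe.cfg
if [ -r "$cfg" ]; then
    show_config_line "$cfg" 81
fi
```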

Re: nrpe giving ssl handshake error

Posted: Wed Feb 08, 2012 9:54 am
by scottwilkerson
I see the problem. In your nrpe.cfg on the HP-UX machine on about line 81 you have

only_from=127.0.0.1 10.2.202.221

This is incorrect syntax for this file (it would be correct in /etc/xinetd.d/nrpe if NRPE were running under xinetd).

Remove that line and replace with

allowed_hosts=127.0.0.1,10.2.202.221

Then restart nrpe and try again.
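
The difference between the two forms is the directive name and the separator: the xinetd-style line is space-separated, while the nrpe.cfg line is comma-separated. A minimal sketch of that rewrite, assuming a POSIX shell; the function name `convert_only_from` is hypothetical and is not part of NRPE or xinetd.

```shell
#!/bin/sh
# Hypothetical helper: rewrite an xinetd-style only_from line as an
# nrpe.cfg allowed_hosts line; any other line is printed unchanged.
convert_only_from() {
    # $1 = one config line, e.g. "only_from=127.0.0.1 10.2.202.221"
    case "$1" in
        only_from=*)
            hosts=${1#only_from=}
            # Turn runs of spaces into single commas
            printf 'allowed_hosts=%s\n' "$(printf '%s' "$hosts" | tr -s ' ' ',')"
            ;;
        *)
            printf '%s\n' "$1"
            ;;
    esac
}

convert_only_from 'only_from=127.0.0.1 10.2.202.221'
```

Since nrpe was started by hand in this thread, restarting it here would mean stopping the running daemon and launching it again with the same command line used earlier.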