[Nagios-devel] [PATCH] NRPE wait for client to close
Hi,
We have been investigating an issue with nrpe involving the reuse of
port numbers and client hiccups (check_nrpe).
It seems that the closing of the network sockets has a race condition
which can confuse firewalls, NAT-gateways etc. I have seen reports about
this issue in several forms:
- http://permalink.gmane.org/gmane.networ ... devel/3037 1)
- http://tracker.nagios.org/view.php?id=305
- http://sourceforge.net/mailarchive/mess ... d=29054957
With my IPv6 patch the bug shows up more often when used in
combination with -b (bind) on the client. I had to study Stevens' Unix
Network Programming seriously to find this one.
It appears that when a client is connected to a server, the cleanest
way to close the connection is for the client to take the initiative
and close() first, as most modern network protocols do. The server
also has an open socket, so it has to close() it at some point too.
Whichever side closes first sends a FIN to the other side, and its own
socket ultimately ends up in the TIME_WAIT state for a few minutes.
The nrpe server also did a close() at the end of its logic. The result
is a race between the FIN packets sent by the client and by the server,
so TIME_WAIT ends up sometimes on the client and sometimes on the
server side, depending on who won the race.
When the client makes its next connection, its connection table is
consulted to choose a new source port, skipping all ports in TIME_WAIT.
Using IPv6 in combination with bind() on the client, closing a
connection seems to take slightly longer; I don't know why. This makes
the server win the race more often, resulting in more TIME_WAIT states
on the server.
The client doesn't see these entries in its connection table, which
increases the chance that it reuses a source port that is still in
TIME_WAIT on the server. The server will refuse by sending a RST.
Strangely, the connection table entry changes to SYN_RCVD, and as soon
as the SYN table overflows it starts reporting SYN flooding.
Long story short: semantically the server should wait in a read() for
an end-of-file (0 bytes read) and then close() its side of the connection.
The end-of-file is caused by the FIN of the client (the so-called
active close). The close() of the server will then be a passive close, so
the FINs no longer race and the server socket never enters TIME_WAIT,
resulting in only LISTEN on the server and all TIME_WAITs on the client.
The patch introduces a read with a timeout of 10 seconds, just in case
you have to deal with high latency. In practice, however, the server
receives the FIN instantly, so this timeout only matters if the FIN gets lost.
I have also taken the liberty of removing some logic that forces
sending a FIN, introduced in 2006 in 1).
The patch also introduces -Wall in the Makefile to find and fix several
compile-time errors.
--
Leo Baltus, internetbeheerder /\
NPO ICT Internet Services /NPO/\
Sumatralaan 45, 1217 GP Hilversum, Filmcentrum, west \ /\/
[email protected], 035-6773555 \/
Attachment: nrpe-2.12-wait_for_client_close.patch
diff -ruN nrpe-2.12.fc8_64_ipv6/Makefile.in nrpe-2.12.fc8_64_wait_for_client_close/Makefile.in
--- nrpe-2.12.fc8_64_ipv6/Makefile.in 2007-03-14 16:30:05.000000000 +0100
+++ nrpe-2.12.fc8_64_wait_for_client_close/Makefile.in 2011-09-05 10:11:23.000000000 +0200
@@ -10,7 +10,7 @@
SRC_INCLUDE=./include/
CC=@CC@
-CFLAGS=@CFLAGS@ @DEFS@
+CFLAGS=@CFLAGS@ @DEFS@ -Wall -D_GNU_SOURCE
LDFLAGS=@LDFLAGS@ @LIBS@
prefix=@prefix@
diff -ruN nrpe-2.12.fc8_64_ipv6/include/nrpe.h nrpe-2.12.fc8_64_wait_for_client_close/include/nrpe.h
--- nrpe-2.12.fc8_64_ipv6/include/nrpe.h 2007-11-23 18:31:23.000000000 +0100
+++ nrpe-2.12.fc8_64_wait_for_client_close/include/nrpe.h 2011-09-05 10:11:23.000000000 +0200
@@ -54,6 +54,7 @@
int my_system(char *,int,int *,char *,int); /* executes a command via popen(), but also protects against timeouts */
void my_sys
...[email truncated]...
This post was automatically imported from historical nagios-devel mailing list archives
Original poster: [email protected]