Re: [Nagios-devel] Passive checks number of attempts error

Hi Stephen,

Thank you for your answer. However, I don't agree that I'm losing updates.
The alert times show that the collector sent a SOFT3 status at 01:05:21 and
the central server recorded it as HARD4 at 01:05:40. To me, this is a bug in
the passive check processing.
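
The jump in the attempt counter can be checked mechanically. Below is a minimal sketch in Python, not part of Nagios: the names `parse_alert` and `find_attempt_gaps` are hypothetical, and the sample lines are the central server HOST ALERT entries quoted further down in this thread. It only flags places where the state stays the same but the attempt number rises by more than one; it does not explain why that happens.

```python
import re
from datetime import datetime

# Matches "[timestamp] HOST ALERT: host;state;state_type;attempt;..."
ALERT_RE = re.compile(
    r"\[(?P<ts>[^\]]+)\] HOST ALERT: "
    r"(?P<host>[^;]+);(?P<state>[^;]+);(?P<type>SOFT|HARD);(?P<attempt>\d+);"
)

def parse_alert(line, ts_format):
    """Parse one HOST ALERT line into a dict, or return None if it does not match."""
    m = ALERT_RE.search(line)
    if m is None:
        return None
    return {
        "time": datetime.strptime(m.group("ts"), ts_format),
        "host": m.group("host"),
        "state": m.group("state"),
        "type": m.group("type"),
        "attempt": int(m.group("attempt")),
    }

def find_attempt_gaps(alerts):
    """Return (previous, current) pairs where the state is unchanged
    but the attempt number increased by more than one."""
    ordered = sorted(alerts, key=lambda a: a["time"])
    gaps = []
    for prev, cur in zip(ordered, ordered[1:]):
        if cur["state"] == prev["state"] and cur["attempt"] - prev["attempt"] > 1:
            gaps.append((prev, cur))
    return gaps

# Central server entries copied from the log quoted in this thread.
CENTRAL_LOG = [
    "[2012-05-05 01:06:48] HOST ALERT: node;UP;HARD;1;TCP OK - 0.005 second response time on port 135",
    "[2012-05-05 01:05:40] HOST ALERT: node;DOWN;HARD;4;CRITICAL - Socket timeout after 10 seconds",
    "[2012-05-05 01:04:16] HOST ALERT: node;DOWN;SOFT;2;CRITICAL - Socket timeout after 10 seconds",
    "[2012-05-05 01:02:55] HOST ALERT: node;DOWN;SOFT;1;CRITICAL - Socket timeout after 10 seconds",
]

central = [parse_alert(line, "%Y-%m-%d %H:%M:%S") for line in CENTRAL_LOG]
gaps = find_attempt_gaps(central)
for prev, cur in gaps:
    print(f"{cur['host']}: {prev['type']}{prev['attempt']} -> {cur['type']}{cur['attempt']} at {cur['time']}")
```

Run against the central server entries, this reports the single SOFT2 to HARD4 transition at 01:05:40.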

I'm using NSCA to replicate the status from the collectors to the central
server. As you said, the central server takes about 20 seconds to receive the
status sent by the collectors, which I think is normal behavior.
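
The delay can be read directly off the two logs. The following is a rough Python sketch, not a Nagios tool: `delays_in_seconds` is a hypothetical helper, the timestamps are copied from the logs quoted further down in this thread, and it assumes that both servers have synchronized clocks, that the collector writes dates as day-month-year, and that the entries pair up one to one in order.

```python
from datetime import datetime

# Timestamps copied from the collector and central server logs in this thread,
# listed oldest first.
COLLECTOR_TIMES = [
    "05-05-2012 01:02:41",
    "05-05-2012 01:04:01",
    "05-05-2012 01:05:21",
    "05-05-2012 01:06:31",
]
CENTRAL_TIMES = [
    "2012-05-05 01:02:55",
    "2012-05-05 01:04:16",
    "2012-05-05 01:05:40",
    "2012-05-05 01:06:48",
]

def delays_in_seconds(collector_times, central_times):
    """Pair entries in order and return the central-minus-collector delay in seconds.

    Assumption: the collector uses DD-MM-YYYY and the central server YYYY-MM-DD.
    """
    delays = []
    for sent, received in zip(collector_times, central_times):
        t_sent = datetime.strptime(sent, "%d-%m-%Y %H:%M:%S")
        t_recv = datetime.strptime(received, "%Y-%m-%d %H:%M:%S")
        delays.append(int((t_recv - t_sent).total_seconds()))
    return delays

delays = delays_in_seconds(COLLECTOR_TIMES, CENTRAL_TIMES)
print(delays)  # [14, 15, 19, 17]
```

Under those assumptions, each of the four collector entries has a matching central entry 14 to 19 seconds later.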

Thank you very much,
Rodney

On Tue, May 8, 2012 at 5:01 PM, Stephen Gran wrote:

> On Tue, May 08, 2012 at 12:07:05PM -0300, Rodney Ramos said:
> > Hi everybody,
>
> Hi,
>
> > I use Nagios, release 3.2.3, in a distributed environment, with a central
> > server and several collector servers.
> >
> > For a long time I've been seeing errors in the passive check mechanism on
> > the central server, as we can see below.
> >
> > Sometimes, on the central server, the states and number of attempts don't
> > follow the correct order, going from SOFT2 to HARD4, for example. However,
> > on the collector server everything is OK.
> >
> > Log from Central Server:
> > Host Up[2012-05-05 01:06:48] HOST ALERT: node;UP;HARD;1;TCP OK - 0.005 second response time on port 135
> > Host Down[2012-05-05 01:05:40] HOST ALERT: node;DOWN;HARD;4;CRITICAL - Socket timeout after 10 seconds
> > Host Down[2012-05-05 01:04:16] HOST ALERT: node;DOWN;SOFT;2;CRITICAL - Socket timeout after 10 seconds
> > Host Down[2012-05-05 01:02:55] HOST ALERT: node;DOWN;SOFT;1;CRITICAL - Socket timeout after 10 seconds
> >
> > Log from Collector Server:
> > Host Up[05-05-2012 01:06:31] HOST ALERT: node;UP;SOFT;4;TCP OK - 0.005 second response time on port 135
> > Host Down[05-05-2012 01:05:21] HOST ALERT: node;DOWN;SOFT;3;CRITICAL - Socket timeout after 10 seconds
> > Host Down[05-05-2012 01:04:01] HOST ALERT: node;DOWN;SOFT;2;CRITICAL - Socket timeout after 10 seconds
> > Host Down[05-05-2012 01:02:41] HOST ALERT: node;DOWN;SOFT;1;CRITICAL - Socket timeout after 10 seconds
>
> You're losing updates. Given that it seems to be taking 15 or 20
> seconds to get the update from your collector to your central server,
> that's not hugely surprising. You don't say what the replication
> mechanism is, but it either needs to get better at shovelling updates or
> grow a bigger buffer, at a guess.
>
> Cheers,
> --
> --------------------------------------------------------------------------
> | Stephen Gran                  | Never eat anything bigger than your  |
> | steve@lobefin.net             | head.                                |
> | http://www.lobefin.net/~steve |                                      |
> --------------------------------------------------------------------------

...[email truncated]...


This post was automatically imported from historical nagios-devel mailing list archives
Original poster: rodneyra@gmail.com