Sporadic 'Connection refused' errors in 4.2.4

kernow5000
Posts: 58
Joined: Mon Jan 09, 2017 9:06 am

Re: Sporadic 'Connection refused' errors in 4.2.4

Post by kernow5000 »

I've cheated and set service_check_timeout=65 just to test. Will report back.

Passive checks are an option, I don't currently have any configured.
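
For anyone following along, that setting goes in the main config file. A minimal sketch, assuming a source install where the file lives at /usr/local/nagios/etc/nagios.cfg (the path is an assumption; the value is the one mentioned above):

```
# /usr/local/nagios/etc/nagios.cfg  (path is an assumption, adjust for your install)
# Maximum number of seconds a service check may run before Nagios kills it.
# The sample config ships with 60.
service_check_timeout=65
```

Nagios needs a restart to pick up the change. Raising the timeout only gives slow checks more room; a check that is actively refused comes back immediately either way.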
rkennedy
Posts: 6579
Joined: Mon Oct 05, 2015 11:45 am

Re: Sporadic 'Connection refused' errors in 4.2.4

Post by rkennedy »

Sounds good - let us know how the testing goes.
Former Nagios Employee
kernow5000
Posts: 58
Joined: Mon Jan 09, 2017 9:06 am

Re: Sporadic 'Connection refused' errors in 4.2.4

Post by kernow5000 »

I'm not sure if it's relevant but I've changed all the http checks that were checking out OK (but also being redirected via 302 by various rewrite/config rules on the webservers) to specifically check a file.
Just to reduce the redirection and such.

Mainly because I noticed that these were the checks occasionally failing with 'connection refused'

I've also put in retry_check_interval values for some of the hosts, as well as max_check_attempts > 1 (previously I was alerting on the first failed check every time).

Didn't get as many SMS's from the same old hosts last night at least.

Hopefully this'll smooth a few things out, but we'll see.
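
Roughly what those two changes look like in the object config. This is only a sketch: the host name, service description, command name, and file path are made up for illustration, and generic-service is assumed to be the template from the sample configs.

```
# Hypothetical command: request a specific file so the check is not
# bounced through a 302 redirect first.
define command {
    command_name    check_http_file
    command_line    $USER1$/check_http -I $HOSTADDRESS$ -u $ARG1$
}

# Hypothetical service using it, with retries before alerting.
define service {
    use                     generic-service
    host_name               web01
    service_description     HTTP static file
    check_command           check_http_file!/nagios-check.txt
    check_interval          5    ; minutes between checks while OK (interval_length=60)
    retry_interval          1    ; minutes between rechecks while in a SOFT problem state
    max_check_attempts      3    ; consecutive failures before the state goes HARD
}
```

With max_check_attempts at 3, a single refused connection only produces a SOFT state and a recheck a minute later. Notifications go out only if the failure persists through all three attempts.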
rkennedy
Posts: 6579
Joined: Mon Oct 05, 2015 11:45 am

Re: Sporadic 'Connection refused' errors in 4.2.4

Post by rkennedy »

You could also increase the notification_interval so that repeat notifications for an ongoing problem go out less often.
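
A minimal sketch of where that directive goes, assuming the sample-config generic-service template and check_http command; the host name and service description are hypothetical. Note that notification_interval spaces out repeat notifications, while first_notification_delay is the directive that holds back the first one, so both are shown:

```
define service {
    use                         generic-service
    host_name                   web01
    service_description         HTTPS
    check_command               check_http!-S
    first_notification_delay    10     ; time units (minutes by default) to wait before the first problem notification
    notification_interval       120    ; time units between repeat notifications while the problem persists
}
```

The numbers are placeholders, not recommendations.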

Let us know if you have any further questions.
Former Nagios Employee
dwhitfield
Former Nagios Staff
Posts: 4583
Joined: Wed Sep 21, 2016 10:29 am
Location: NoLo, Minneapolis, MN

Re: Sporadic 'Connection refused' errors in 4.2.4

Post by dwhitfield »

kernow5000 wrote: I might just remove the check for that host ... ha
So, I notice a couple of things going back through this thread. One, would it be possible to spin up a Core server just for this one check? If you have a server doing no other checks, the other checks can't be getting in the way (well, unless they are sitting on the same physical host, or there is network latency, but realistically...)

You mention that these have come through at 4am multiple times, although I see they are not only at 4am. Have you spoken with your network team or the admins of the affected servers to see whether they do anything at 4am?

Also, one thing that may be less drastic than removing the check is scheduling downtime. I don't know how critical it is for that server to be running at night, but you could just schedule downtime for the most annoying hours. I know that's not ideal, but on the face of it that seems better than removing the check altogether.
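
Scheduled downtime itself is set through the web UI or an external command. A related config-only approach, which is not the same thing as downtime, is a notification_period that skips the noisy hour. A sketch, assuming a 03:30-04:30 window and hypothetical timeperiod, host, and service names:

```
# Hypothetical timeperiod: every day except 03:30-04:30
define timeperiod {
    timeperiod_name     skip-0330-0430
    alias               All week except 03:30-04:30
    sunday              00:00-03:30,04:30-24:00
    monday              00:00-03:30,04:30-24:00
    tuesday             00:00-03:30,04:30-24:00
    wednesday           00:00-03:30,04:30-24:00
    thursday            00:00-03:30,04:30-24:00
    friday              00:00-03:30,04:30-24:00
    saturday            00:00-03:30,04:30-24:00
}

define service {
    use                     generic-service
    host_name               web01
    service_description     HTTPS
    check_command           check_http!-S
    notification_period     skip-0330-0430
}
```

The checks still run and get logged during that hour, so you keep the history for troubleshooting; only the notifications are held back.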
kernow5000
Posts: 58
Joined: Mon Jan 09, 2017 9:06 am

Re: Sporadic 'Connection refused' errors in 4.2.4

Post by kernow5000 »

Thanks for your suggestions guys. Another Nagios server is certainly possible. That's one option for testing.

I'll look into notification_interval also.

Sadly the boxes are at different providers, but I could ask if they have any of these issues at certain times.

Actually, since tweaking retry_interval and max_check_attempts, it seems to have smoothed out.

Will keep you posted.
rkennedy
Posts: 6579
Joined: Mon Oct 05, 2015 11:45 am

Re: Sporadic 'Connection refused' errors in 4.2.4

Post by rkennedy »

Let us know if you have any further questions!
Former Nagios Employee
kernow5000
Posts: 58
Joined: Mon Jan 09, 2017 9:06 am

Re: Sporadic 'Connection refused' errors in 4.2.4

Post by kernow5000 »

Ugh, a good 60 'connection refused' SMS errors last night, on the same few hosts. I'm going to take these down to email only for now. But at this point I have to think about sacking this off and looking at other availability monitoring sadly :(
kernow5000
Posts: 58
Joined: Mon Jan 09, 2017 9:06 am

Re: Sporadic 'Connection refused' errors in 4.2.4

Post by kernow5000 »

...and pretty much all of those were port 443 / SSL connections.

Hmmmmmmm
kernow5000
Posts: 58
Joined: Mon Jan 09, 2017 9:06 am

Re: Sporadic 'Connection refused' errors in 4.2.4

Post by kernow5000 »

Haven't seen this one before.

CRITICAL - Plugin timed out while executing system call
On a DNS check this time.