Sporadic 'Connection refused' errors in 4.2.4
Posted: Mon Jan 09, 2017 9:12 am
Hi there,
First post so be gentle with me.
I have a Nagios 4.x install which I've been running for a few years. It is currently on 4.2.4, which I believe is the latest release.
Every so often I get false positives from the check_http plugin, usually reported as 'connection refused'. This seems to happen mainly with this plugin.
I'm also seeing the messages below in /var/log/messages. From what I've read, they were changed from errors to warnings in the current version, but they still go to /var/log/messages:
Jan 8 08:51:21 backupserver nagios: job 6328 (pid=30393): read() returned error 11
Jan 8 08:53:21 backupserver nagios: job 6333 (pid=30501): read() returned error 11
Jan 8 13:45:41 backupserver nagios: job 7103 (pid=18374): read() returned error 11
Jan 8 13:47:41 backupserver nagios: job 7108 (pid=19397): read() returned error 11
Jan 9 11:05:31 backupserver nagios: job 179 (pid=30032): read() returned error 11
Jan 9 11:07:31 backupserver nagios: job 184 (pid=31363): read() returned error 11
These errors match my alerts exactly.
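In case it helps anyone reproduce the comparison, here is a rough sketch of how the lines above could be pulled out of /var/log/messages and turned into timestamps to line up against alert times. It assumes the exact line layout shown above and a Linux host, where errno 11 is EAGAIN ("Resource temporarily unavailable"). The function name `parse_read_error` and the `year` parameter are made up for illustration (syslog lines carry no year), and nothing here is part of Nagios itself.

```python
import errno
import re
from datetime import datetime

# Matches lines like:
# Jan 8 08:51:21 backupserver nagios: job 6328 (pid=30393): read() returned error 11
LINE_RE = re.compile(
    r"^(?P<ts>\w{3}\s+\d{1,2}\s+\d{2}:\d{2}:\d{2})\s+(?P<host>\S+)\s+nagios:\s+"
    r"job\s+(?P<job>\d+)\s+\(pid=(?P<pid>\d+)\):\s+"
    r"read\(\) returned error\s+(?P<err>\d+)"
)


def parse_read_error(line, year=2017):
    """Return a dict describing one 'read() returned error' line, or None.

    `year` is an assumption supplied by the caller, because the syslog
    timestamp format shown above does not include one.
    """
    m = LINE_RE.match(line.strip())
    if m is None:
        return None
    when = datetime.strptime(f"{year} {m.group('ts')}", "%Y %b %d %H:%M:%S")
    code = int(m.group("err"))
    return {
        "time": when,
        "host": m.group("host"),
        "job": int(m.group("job")),
        "pid": int(m.group("pid")),
        "errno": code,
        # Symbolic name as known to the local platform (EAGAIN on Linux for 11).
        "errno_name": errno.errorcode.get(code, "unknown"),
    }


if __name__ == "__main__":
    sample = (
        "Jan 8 08:51:21 backupserver nagios: job 6328 (pid=30393): "
        "read() returned error 11"
    )
    print(parse_read_error(sample))
```

The resulting `time` values can then be compared with the alert timestamps from the Nagios log or notification emails to confirm they really coincide.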
Nothing changes on the servers that would cause these connection refused errors. The errors and the resulting alerts only show up in bursts of perhaps 10, a couple of times a week.
It's driving me nuts! Especially when I know it's on the Nagios side and not the boxes being monitored.
Please point me in the right direction!