SMTP checks all failing
Posted: Tue Apr 22, 2014 3:38 pm
by c.slagel
Nagios is monitoring four sendmail servers that we use. This morning they all went critical at the same time, which seems really odd.
Tried running the command manually with a high timeout and this was the result:
Code: Select all
[root@nagios libexec]# ./check_smtp -H 10.1.2.117 -S -D 60 -t 60
WARNING - TLS not supported by server
[root@nagios libexec]# ./check_smtp -H 10.1.2.118 -S -D 60 -t 60
WARNING - TLS not supported by server
I'm not sure whether the TLS warning is related, and I don't know why this would happen on all of them at the same time, but I thought I'd ask for some insight. The servers are actually sending mail, so port 25 is operational.
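For anyone wanting to cross-check the warning outside of check_smtp: as far as I understand it, the message points at STARTTLS not showing up in the server's EHLO reply. A minimal sketch with Python's standard smtplib is below. The function name starttls_advertised is made up for illustration, and the 60 second timeout only mirrors the -t 60 used above.

```python
import smtplib


def starttls_advertised(host, port=25, timeout=60.0):
    """Return True if the server lists STARTTLS in its EHLO reply.

    Hypothetical helper, not part of check_smtp or nagios-plugins.
    """
    # The constructor connects and reads the greeting banner.
    with smtplib.SMTP(host, port, timeout=timeout) as smtp:
        code, _ = smtp.ehlo()
        if not 200 <= code <= 299:
            raise RuntimeError(f"EHLO was rejected with code {code}")
        # smtplib stores the advertised extensions in lower case.
        return smtp.has_extn("starttls")
```

Calling starttls_advertised("10.1.2.117") would return False if the server answers EHLO without offering STARTTLS, and it raises an exception if the connection or the reply times out.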
Re: SMTP checks all failing
Posted: Tue Apr 22, 2014 3:40 pm
by abrist
Have you upgraded any packages on your Nagios server or the mail servers? Due to Heartbleed and the subsequent fallout, we have seen a few people run into a diverse set of issues after upgrading SSL...
Re: SMTP checks all failing
Posted: Tue Apr 22, 2014 3:42 pm
by c.slagel
Also, I can see the Nagios check hits in the maillog:
Code: Select all
Apr 22 13:09:52 sendmail2 sendmail[3805]: s3MK9q5E003805: [10.1.2.239] did not issue MAIL/EXPN/VRFY/ETRN during connection to MTA
Apr 22 13:12:02 sendmail2 sendmail[3814]: s3MKC2vr003814: [10.1.2.239] did not issue MAIL/EXPN/VRFY/ETRN during connection to MTA
Apr 22 13:17:36 sendmail2 sendmail[3824]: s3MKHamM003824: [10.1.2.239] did not issue MAIL/EXPN/VRFY/ETRN during connection to MTA
Apr 22 13:23:13 sendmail2 sendmail[3839]: s3MKNDxJ003839: [10.1.2.239] did not issue MAIL/EXPN/VRFY/ETRN during connection to MTA
Apr 22 13:28:46 sendmail2 sendmail[3849]: s3MKSkZZ003849: [10.1.2.239] did not issue MAIL/EXPN/VRFY/ETRN during connection to MTA
Apr 22 13:34:13 sendmail2 sendmail[3946]: s3MKYDRi003946: [10.1.2.239] did not issue MAIL/EXPN/VRFY/ETRN during connection to MTA
Apr 22 13:39:45 sendmail2 sendmail[3958]: s3MKdj5c003958: [10.1.2.239] did not issue MAIL/EXPN/VRFY/ETRN during connection to MTA
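To see how regularly the checks from 10.1.2.239 arrive, the gaps between those entries can be computed. This is only a rough sketch: the regular expression is written against the exact line shape shown above, the function name probe_intervals is made up, and the year is an assumption because syslog timestamps do not carry one.

```python
import re
from datetime import datetime

# Matches only the "did not issue MAIL/EXPN/VRFY/ETRN" lines shown above.
LINE_RE = re.compile(
    r"^(?P<stamp>\w{3}\s+\d{1,2} \d{2}:\d{2}:\d{2}) \S+ sendmail\[\d+\]: "
    r"\w+: \[(?P<ip>[\d.]+)\] did not issue MAIL/EXPN/VRFY/ETRN"
)


def probe_intervals(lines, client_ip, year=2014):
    """Return the seconds between consecutive matching entries from client_ip.

    Assumes the lines are in chronological order and within one year.
    """
    stamps = []
    for line in lines:
        match = LINE_RE.match(line)
        if match and match.group("ip") == client_ip:
            # Collapse the double space syslog uses for single-digit days.
            stamp = " ".join(match.group("stamp").split())
            stamps.append(
                datetime.strptime(f"{year} {stamp}", "%Y %b %d %H:%M:%S")
            )
    return [int((b - a).total_seconds()) for a, b in zip(stamps, stamps[1:])]
```

Fed the seven lines above with client_ip "10.1.2.239", this gives [130, 334, 337, 333, 327, 332], so apart from the first gap the checks land roughly five and a half minutes apart.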
Re: SMTP checks all failing
Posted: Tue Apr 22, 2014 3:45 pm
by c.slagel
No, no changes have been made on any of these.
Re: SMTP checks all failing
Posted: Tue Apr 22, 2014 3:53 pm
by abrist
What version of check_smtp are you running?
Code: Select all
/usr/local/nagios/libexec/check_smtp -V
EDIT: Can you connect without TLS?
Code: Select all
./check_smtp -H 10.1.2.117 -D 60 -t 60
Re: SMTP checks all failing
Posted: Tue Apr 22, 2014 4:11 pm
by c.slagel
abrist wrote:What version of check_smtp are you running?
Code: Select all
/usr/local/nagios/libexec/check_smtp -V
[root@nagios certs]# /usr/local/nagios/libexec/check_smtp -V
check_smtp v1991 (nagios-plugins 1.4.13)
abrist wrote:EDIT: Can you connect without TLS?
Code: Select all
./check_smtp -H 10.1.2.117 -D 60 -t 60
[root@nagios libexec]# ./check_smtp -H 10.1.2.117 -D 60 -t 60
SMTP OK - 36.021 sec. response time|time=36.021238s;;;0.000000
Yes, but 36 seconds? That seems very odd.
Also, I can telnet to port 25 of 10.1.2.117 instantly.
Re: SMTP checks all failing
Posted: Tue Apr 22, 2014 4:21 pm
by abrist
c.slagel wrote:[root@nagios libexec]# ./check_smtp -H 10.1.2.117 -D 60 -t 60
SMTP OK - 36.021 sec. response time|time=36.021238s;;;0.000000
Well, it looks like it responds. Maybe there is an issue with TLS on this server?
c.slagel wrote:Yes, but 36 seconds? That seems very odd.
Also, I can telnet to port 25 of 10.1.2.117 instantly.
Odd indeed. Did you issue any commands over telnet? The connection to port 25 may be quick, but the actual response from the SMTP daemon on that server could be slow.
Do you administer those email servers?
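To separate "the port opens quickly" from "the daemon answers quickly", the phases can be timed one by one. Here is a minimal sketch using plain sockets from the Python standard library. The function names and the EHLO name probe.invalid are placeholders of mine, not anything check_smtp does internally.

```python
import socket
import time


def _read_reply(reader):
    """Read one SMTP reply, which may span several lines."""
    lines = []
    while True:
        line = reader.readline()
        if not line:
            raise ConnectionError("server closed the connection")
        lines.append(line.rstrip(b"\r\n"))
        # Continuation lines look like "250-..."; the final line is "250 ...".
        if line[3:4] != b"-":
            return lines


def time_smtp_phases(host, port=25, timeout=60.0):
    """Time the TCP connect, the greeting banner, and the EHLO reply."""
    timings = {}
    start = time.monotonic()
    with socket.create_connection((host, port), timeout=timeout) as sock:
        timings["connect"] = time.monotonic() - start
        reader = sock.makefile("rb")

        mark = time.monotonic()
        _read_reply(reader)  # 220 greeting banner
        timings["banner"] = time.monotonic() - mark

        mark = time.monotonic()
        sock.sendall(b"EHLO probe.invalid\r\n")  # placeholder client name
        _read_reply(reader)
        timings["ehlo"] = time.monotonic() - mark

        sock.sendall(b"QUIT\r\n")
    return timings
```

If time_smtp_phases("10.1.2.117") shows a tiny "connect" value next to a large "banner" or "ehlo" value, the delay sits in the mail daemon rather than in the network path or in Nagios.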
Re: SMTP checks all failing
Posted: Tue Apr 22, 2014 4:32 pm
by c.slagel
You're right, the response is very slow once I issue a command...
I do administer the mail servers. Just the fact that they all started having this issue at once is very odd to me. So I guess it looks like it's not Nagios...
Re: SMTP checks all failing
Posted: Tue Apr 22, 2014 4:51 pm
by abrist
c.slagel wrote:I do administer the mail servers.
Well, at least it is in your power to fix. I have seen this behavior on my own mail servers before and a quick restart of the service usually righted them.
c.slagel wrote:So I guess it looks like it's not Nagios...
No problem. Best of luck!