Nagios email alerts.

knackglobal
Posts: 13
Joined: Sat Mar 10, 2018 6:29 am

Post by knackglobal »

Hi

I have configured Nagios on CentOS 7 and set up email notifications to my email ID. After about a month the emails stopped arriving, and one of your support members helped me get notifications working again after I changed ISP. Now, a few days later, the notifications have stopped again on the same ISP. Can you please help me with this?

Please find the maillog and nagios.log files below.


Nagios.Log file


000), delay=00:00:00, xdelay=00:00:00, mailer=relay, pri=30450, relay=[127.0.0.1] [127.0.0.1], dsn=2.0.0, stat=Sent (w2A0UlE9029976 Message accepted for delivery)
Mar 10 06:00:49 in-nagios sendmail[29978]: STARTTLS=client, relay=aspmx.l.google.com., version=TLSv1/SSLv3, verify=FAIL, cipher=AES128-GCM-SHA256, bits=128/128
Mar 10 06:00:50 in-nagios sendmail[29978]: w2A0UlE9029976: to=<support@knackglobal.com>, ctladdr=<nagios@in-nagios.knackbpo.com> (1000/1000), delay=00:00:03, xdelay=00:00:03, mailer=esmtp, pri=120715, relay=aspmx.l.google.com. [74.125.24.26], dsn=2.0.0, stat=Sent (OK 1520641850 e5si1484719pgr.444 - gsmtp)
Mar 10 06:01:24 in-nagios sendmail[30085]: w2A0VOeN030085: from=nagios, size=449, class=0, nrcpts=1, msgid=<201803100031.w2A0VOeN030085@in-nagios.knackbpo.com>, relay=nagios@localhost
Mar 10 06:01:24 in-nagios sendmail[30086]: w2A0VOGa030086: from=<nagios@in-nagios.knackbpo.com>, size=714, class=0, nrcpts=1, msgid=<201803100031.w2A0VOeN030085@in-nagios.knackbpo.com>, proto=ESMTP, daemon=MTA, relay=localhost [127.0.0.1]
Mar 10 06:01:24 in-nagios sendmail[30085]: w2A0VOeN030085: to=support@knackglobal.com, ctladdr=nagios (1000/1000), delay=00:00:00, xdelay=00:00:00, mailer=relay, pri=30449, relay=[127.0.0.1] [127.0.0.1], dsn=2.0.0, stat=Sent (w2A0VOGa030086 Message accepted for delivery)
Mar 10 06:01:25 in-nagios sendmail[30088]: STARTTLS=client, relay=aspmx.l.google.com., version=TLSv1/SSLv3, verify=FAIL, cipher=AES128-GCM-SHA256, bits=128/128
Mar 10 06:01:27 in-nagios sendmail[30088]: w2A0VOGa030086: to=<support@knackglobal.com>, ctladdr=<nagios@in-nagios.knackbpo.com> (1000/1000), delay=00:00:03, xdelay=00:00:03, mailer=esmtp, pri=120714, relay=aspmx.l.google.com. [74.125.24.26], dsn=2.0.0, stat=Sent (OK 1520641887 a80si1742631pfa.315 - gsmtp)

[root@in-nagios log]# which mail
/usr/bin/mail
Last edited by tmcdonald on Thu Mar 15, 2018 12:49 pm, edited 1 time in total.
Reason: Please use [code][/code] tags around long output
kyang

Re: Nagios email alerts.

Post by kyang »

Hello,

Looking at your logs, the message appears to have been sent (stat=Sent):

Mar 10 06:00:50 in-nagios sendmail[29978]: w2A0UlE9029976: to=<support@knackglobal.com>, ctladdr=<nagios@in-nagios.knackbpo.com> (1000/1000), delay=00:00:03, xdelay=00:00:03, mailer=esmtp, pri=120715, relay=aspmx.l.google.com. [74.125.24.26], dsn=2.0.0, stat=Sent (OK 1520641850 e5si1484719pgr.444 - gsmtp)
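
For reference, the fields in a delivery line like the one above can be pulled apart with a few lines of Python. This is only a sketch: `parse_delivery` and its regular expression are my own illustration (not part of sendmail or Nagios), and it assumes the comma-separated `key=value` layout seen in this maillog.

```python
import re

# Sample line copied from the maillog above.
LINE = (
    "Mar 10 06:00:50 in-nagios sendmail[29978]: w2A0UlE9029976: "
    "to=<support@knackglobal.com>, ctladdr=<nagios@in-nagios.knackbpo.com> (1000/1000), "
    "delay=00:00:03, xdelay=00:00:03, mailer=esmtp, pri=120715, "
    "relay=aspmx.l.google.com. [74.125.24.26], dsn=2.0.0, "
    "stat=Sent (OK 1520641850 e5si1484719pgr.444 - gsmtp)"
)


def parse_delivery(line):
    """Hypothetical helper: split one sendmail delivery line into key=value fields."""
    # Everything after "sendmail[pid]: <queue id>: " is a comma-separated field list.
    match = re.match(r".*sendmail\[\d+\]: (\w+): (.*)$", line)
    if match is None:
        return None  # not a per-message line (e.g. a STARTTLS line)
    queue_id, rest = match.groups()
    fields = {"queue_id": queue_id}
    for part in rest.split(", "):
        key, sep, value = part.partition("=")
        if sep:
            fields[key] = value
    return fields


info = parse_delivery(LINE)
print(info["to"], info["relay"], info["dsn"], info["stat"])
```

The `relay`, `dsn` and `stat` fields are the ones to watch: together they show which server took the message and what it answered.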

Mar 10 06:00:49 in-nagios sendmail[29978]: STARTTLS=client, relay=aspmx.l.google.com., version=TLSv1/SSLv3, verify=FAIL, cipher=AES128-GCM-SHA256, bits=128/128
After searching on Google, I found a few things relating this message to "the certificate TLS is using to secure the connection. It's a warning that the certificate chain cannot be verified with a known/trusted CA."
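
As a rough illustration, the STARTTLS line can be parsed the same way to see what was negotiated. The `parse_starttls` helper is hypothetical and assumes the field layout shown in this log. Note that a cipher was still negotiated and the delivery line that follows reports stat=Sent, so verify=FAIL on its own does not appear to mean the message was refused.

```python
import re

# Sample line copied from the maillog above.
LINE = (
    "Mar 10 06:00:49 in-nagios sendmail[29978]: STARTTLS=client, "
    "relay=aspmx.l.google.com., version=TLSv1/SSLv3, verify=FAIL, "
    "cipher=AES128-GCM-SHA256, bits=128/128"
)


def parse_starttls(line):
    """Hypothetical helper: return the fields of a sendmail STARTTLS line, or None."""
    if "STARTTLS=" not in line:
        return None
    # Values on this kind of line contain no spaces, so a key=value scan is enough.
    return dict(re.findall(r"(\w+)=([^,\s]+)", line))


tls = parse_starttls(LINE)
print("relay:", tls["relay"])
print("certificate verification:", tls["verify"])
print("cipher:", tls["cipher"], "bits:", tls["bits"])
```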

Also, is this from the nagios.log file or from the maillog? Please post both again just to be sure.

/var/log/maillog

/usr/local/nagios/var/nagios.log
knackglobal
Posts: 13
Joined: Sat Mar 10, 2018 6:29 am

Re: Nagios email alerts.

Post by knackglobal »

maillog file

Mar 14 19:33:02 in-nagios sendmail[26667]: STARTTLS=client, relay=aspmx.l.google.com., version=TLSv1/SSLv3, verify=FAIL, cipher=AES128-GCM-SHA256, bits=128/128
Mar 14 19:33:03 in-nagios sendmail[26667]: w2EE30Mq026665: to=<support@knackglobal.com>, ctladdr=<nagios@in-nagios.knackbpo.com> (1000/1000), delay=00:00:03, xdelay=00:00:02, mailer=esmtp, pri=120730, relay=aspmx.l.google.com. [74.125.24.27], dsn=2.0.0, stat=Sent (OK 1521036183 e91-v6si2021417plb.177 - gsmtp)
Mar 14 19:48:19 in-nagios sendmail[28595]: w2EEIIbw028595: from=nagios, size=465, class=0, nrcpts=1, msgid=<201803141418.w2EEIIbw028595@in-nagios.knackbpo.com>, relay=nagios@localhost
Mar 14 19:48:19 in-nagios sendmail[28596]: w2EEIJf0028596: from=<nagios@in-nagios.knackbpo.com>, size=730, class=0, nrcpts=1, msgid=<201803141418.w2EEIIbw028595@in-nagios.knackbpo.com>, proto=ESMTP, daemon=MTA, relay=localhost [127.0.0.1]
Mar 14 19:48:19 in-nagios sendmail[28595]: w2EEIIbw028595: to=support@knackglobal.com, ctladdr=nagios (1000/1000), delay=00:00:01, xdelay=00:00:00, mailer=relay, pri=30465, relay=[127.0.0.1] [127.0.0.1], dsn=2.0.0, stat=Sent (w2EEIJf0028596 Message accepted for delivery)
Mar 14 19:48:20 in-nagios sendmail[28598]: STARTTLS=client, relay=aspmx.l.google.com., version=TLSv1/SSLv3, verify=FAIL, cipher=AES128-GCM-SHA256, bits=128/128
Mar 14 19:48:21 in-nagios sendmail[28598]: w2EEIJf0028596: to=<support@knackglobal.com>, ctladdr=<nagios@in-nagios.knackbpo.com> (1000/1000), delay=00:00:02, xdelay=00:00:02, mailer=esmtp, pri=120730, relay=aspmx.l.google.com. [74.125.24.27], dsn=2.0.0, stat=Sent (OK 1521037101 k72si1948854pgc.334 - gsmtp)
Last edited by tmcdonald on Thu Mar 15, 2018 12:49 pm, edited 1 time in total.
Reason: Please use [code][/code] tags around long output
knackglobal
Posts: 13
Joined: Sat Mar 10, 2018 6:29 am

Re: Nagios email alerts.

Post by knackglobal »

nagios.log file

[1521045800] wproc: Core Worker 30627: job 1334 (pid=14433): Dormant child reaped
[1521045860] wproc: Core Worker 30630: job 1351 (pid=14635) timed out. Killing it
[1521045860] wproc: CHECK job 1351 from worker Core Worker 30630 timed out after 30.01s
[1521045860] wproc:   host=Voice IP GSIP; service=(null);
[1521045860] wproc:   early_timeout=1; exited_ok=0; wait_status=0; error_code=62;
[1521045860] Warning: Check of host 'Voice IP GSIP' timed out after 30.01 seconds
[1521045860] HOST ALERT: Voice IP GSIP;DOWN;HARD;2;(Host check timed out after 30.01 seconds)
[1521045860] wproc: Core Worker 30630: job 1351 (pid=14635): Dormant child reaped
[1521045902] HOST ALERT: Voice IP GSIP;UP;HARD;2;PING OK - Packet loss = 61%, RTA = 80.19 ms
[1521045962] wproc: Core Worker 30630: job 1361 (pid=14756) timed out. Killing it
[1521045962] wproc: CHECK job 1361 from worker Core Worker 30630 timed out after 30.01s
[1521045962] wproc:   host=Voice IP GSIP; service=(null);
[1521045962] wproc:   early_timeout=1; exited_ok=0; wait_status=0; error_code=62;
[1521045962] Warning: Check of host 'Voice IP GSIP' timed out after 30.01 seconds
[1521045962] HOST ALERT: Voice IP GSIP;DOWN;SOFT;1;(Host check timed out after 30.01 seconds)
[1521045962] wproc: Core Worker 30630: job 1361 (pid=14756): Dormant child reaped
[1521046011] HOST ALERT: Voice IP GSIP;UP;SOFT;2;PING OK - Packet loss = 75%, RTA = 80.07 ms
[1521046243] wproc: Core Worker 30628: job 1410 (pid=15356) timed out. Killing it
[1521046243] wproc: CHECK job 1410 from worker Core Worker 30628 timed out after 30.01s
[1521046243] wproc:   host=Voice IP GSIP; service=(null);
[1521046243] wproc:   early_timeout=1; exited_ok=0; wait_status=0; error_code=62;
[1521046243] wproc:   stdout line 01: PING WARNING - Packet loss = 90%, RTA = 80.03 ms|rta=80.028999ms;3000.000000;5000.000000;0.000000 pl=90%;80;100;0
[1521046243] Warning: Check of host 'Voice IP GSIP' timed out after 30.01 seconds
[1521046243] HOST ALERT: Voice IP GSIP;DOWN;SOFT;1;(Host check timed out after 30.01 seconds)
[1521046243] wproc: Core Worker 30628: job 1410 (pid=15356): Dormant child reaped
[1521046298] HOST ALERT: Voice IP GSIP;UP;SOFT;2;PING WARNING - Packet loss = 80%, RTA = 80.16 ms
[1521046444] wproc: Core Worker 30629: job 1443 (pid=15754) timed out. Killing it
[1521046444] wproc: CHECK job 1443 from worker Core Worker 30629 timed out after 30.01s
[1521046444] wproc:   host=Voice IP GSIP; service=(null);
[1521046444] wproc:   early_timeout=1; exited_ok=0; wait_status=0; error_code=62;
[1521046444] Warning: Check of host 'Voice IP GSIP' timed out after 30.01 seconds
[1521046444] HOST ALERT: Voice IP GSIP;DOWN;SOFT;1;(Host check timed out after 30.01 seconds)
[1521046444] wproc: Core Worker 30629: job 1443 (pid=15754): Dormant child reaped
[1521046504] wproc: Core Worker 30627: job 1457 (pid=15919) timed out. Killing it
[1521046504] wproc: CHECK job 1457 from worker Core Worker 30627 timed out after 30.01s
[1521046504] wproc:   host=Voice IP GSIP; service=(null);
[1521046504] wproc:   early_timeout=1; exited_ok=0; wait_status=0; error_code=62;
[1521046504] wproc:   stdout line 01: PING WARNING - Packet loss = 96%, RTA = 80.11 ms|rta=80.113998ms;3000.000000;5000.000000;0.000000 pl=96%;80;100;0
[1521046504] Warning: Check of host 'Voice IP GSIP' timed out after 30.01 seconds
[1521046504] HOST ALERT: Voice IP GSIP;DOWN;HARD;2;(Host check timed out after 30.01 seconds)
[1521046504] wproc: Core Worker 30627: job 1457 (pid=15919): Dormant child reaped
[1521046548] HOST ALERT: Voice IP GSIP;UP;HARD;2;PING OK - Packet loss = 66%, RTA = 80.38 ms
[1521046608] wproc: Core Worker 30627: job 1475 (pid=16139) timed out. Killing it
[1521046608] wproc: CHECK job 1475 from worker Core Worker 30627 timed out after 30.00s
[1521046608] wproc:   host=Voice IP GSIP; service=(null);
[1521046608] wproc:   early_timeout=1; exited_ok=0; wait_status=0; error_code=62;
[1521046608] Warning: Check of host 'Voice IP GSIP' timed out after 30.00 seconds
[1521046608] HOST ALERT: Voice IP GSIP;DOWN;SOFT;1;(Host check timed out after 30.00 seconds)
[1521046608] wproc: Core Worker 30627: job 1475 (pid=16139): Dormant child reaped
[1521046664] HOST ALERT: Voice IP GSIP;UP;SOFT;2;PING WARNING - Packet loss = 81%, RTA = 80.44 ms
[1521046798] wproc: Core Worker 30630: job 1510 (pid=16558) timed out. Killing it
[1521046798] wproc: CHECK job 1510 from worker Core Worker 30630 timed out after 30.01s
[1521046798] wproc:   host=Voice IP GSIP; service=(null);
[1521046798] wproc:   early_timeout=1; exited_ok=0; wait_status=0; error_code=62;
[1521046798] Warning: Check of host 'Voice IP GSIP' timed out after 30.01 seconds
[1521046798] HOST ALERT: Voice IP GSIP;DOWN;SOFT;1;(Host check timed out after 30.01 seconds)
[1521046798] wproc: Core Worker 30630: job 1510 (pid=16558): Dormant child reaped
[1521046836] HOST ALERT: Voice IP GSIP;UP;SOFT;2;PING OK - Packet loss = 44%, RTA = 80.33 ms
[1521047322] wproc: Core Worker 30630: job 1596 (pid=17601) timed out. Killing it
[1521047322] wproc: CHECK job 1596 from worker Core Worker 30630 timed out after 30.01s
[1521047322] wproc:   host=Voice IP GSIP; service=(null);
[1521047322] wproc:   early_timeout=1; exited_ok=0; wait_status=0; error_code=62;
[1521047322] Warning: Check of host 'Voice IP GSIP' timed out after 30.01 seconds
[1521047322] HOST ALERT: Voice IP GSIP;DOWN;SOFT;1;(Host check timed out after 30.01 seconds)
[1521047322] wproc: Core Worker 30630: job 1596 (pid=17601): Dormant child reaped
Last edited by tmcdonald on Thu Mar 15, 2018 12:48 pm, edited 1 time in total.
Reason: Please use [code][/code] tags around long output
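
A note for reading the excerpt above: the bracketed numbers are Unix epoch timestamps, and the excerpt contains HOST ALERT entries but no HOST NOTIFICATION entries, which is the line Nagios Core writes when it actually runs a notification command. A small sketch for listing both, assuming the standard nagios.log layout (the `summarize` helper is hypothetical):

```python
from datetime import datetime, timezone

# A few lines copied from the nagios.log excerpt above.
LOG = """\
[1521045860] HOST ALERT: Voice IP GSIP;DOWN;HARD;2;(Host check timed out after 30.01 seconds)
[1521045902] HOST ALERT: Voice IP GSIP;UP;HARD;2;PING OK - Packet loss = 61%, RTA = 80.19 ms
[1521045962] HOST ALERT: Voice IP GSIP;DOWN;SOFT;1;(Host check timed out after 30.01 seconds)
"""


def summarize(log_text):
    """Hypothetical helper: collect HOST ALERT and HOST NOTIFICATION entries with UTC times."""
    alerts, notifications = [], []
    for line in log_text.splitlines():
        if not line.startswith("["):
            continue
        stamp, _, rest = line[1:].partition("] ")
        when = datetime.fromtimestamp(int(stamp), tz=timezone.utc)
        if rest.startswith("HOST ALERT: "):
            # Fields are host;state;state type;attempt;plugin output
            host, state, state_type = rest[len("HOST ALERT: "):].split(";")[:3]
            alerts.append((when, host, state, state_type))
        elif rest.startswith("HOST NOTIFICATION: "):
            notifications.append((when, rest[len("HOST NOTIFICATION: "):]))
    return alerts, notifications


alerts, notifications = summarize(LOG)
for when, host, state, state_type in alerts:
    print(when.isoformat(), host, state, state_type)
print("notifications logged:", len(notifications))
```

Running it over the whole of /usr/local/nagios/var/nagios.log would show whether any notifications were attempted around the times the alerts fired.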
knackglobal
Posts: 13
Joined: Sat Mar 10, 2018 6:29 am

Re: Nagios email alerts.

Post by knackglobal »

The mail delivery itself seems fine. I think you were right that it is a certificate issue, but how can I resolve this error?

Kindly do the needful.
kyang

Re: Nagios email alerts.

Post by kyang »

This doesn't really seem like a Nagios issue, and I'm not exactly sure what the cause is.

I'll try to gather some information to see what's going on, though.

How often are emails being sent? Let's run a tcpdump to capture them as they go out, so I can look at the packets.

You may need to install tcpdump. (I'm not sure what OS you are on, so I'm listing the yum command for RHEL/CentOS.)

yum -y install tcpdump

tcpdump -s 0 -i any port 25 -w /tmp/25.pcap
Wait until Nagios sends out notifications before pressing Ctrl-C.

Once you have confirmed that it has sent emails, please PM or post that 25.pcap file.
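
If waiting for a real alert is inconvenient, one option is to generate a test message yourself while the capture is running. Below is a minimal sketch in Python. It assumes the local MTA accepts SMTP on localhost port 25 (the maillog shows relay=[127.0.0.1], but the port is my assumption), and the function names and subject line are just illustrative.

```python
import smtplib
from email.message import EmailMessage


def build_test_message(sender, recipient):
    """Build a small plain-text message to use as capture traffic."""
    msg = EmailMessage()
    msg["From"] = sender
    msg["To"] = recipient
    msg["Subject"] = "Nagios mail path test"
    msg.set_content("Test message sent while tcpdump is capturing port 25.")
    return msg


def send_via_local_mta(msg, host="localhost", port=25):
    """Hand the message to the local MTA. Host and port are assumptions."""
    with smtplib.SMTP(host, port, timeout=30) as smtp:
        smtp.send_message(msg)


# Addresses taken from the maillog earlier in this thread.
msg = build_test_message("nagios@in-nagios.knackbpo.com", "support@knackglobal.com")
print(msg["From"], "->", msg["To"])

# To actually send while the capture is running, uncomment:
# send_via_local_mta(msg)
```

The message should then show up both in /var/log/maillog and in the capture file.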

Thank you!
knackglobal
Posts: 13
Joined: Sat Mar 10, 2018 6:29 am

Re: Nagios email alerts.

Post by knackglobal »

I am using CentOS Linux release 7.4.1708 (Core).

I am unable to install the tcpdump package.
kyang

Re: Nagios email alerts.

Post by kyang »

yum -y install tcpdump
Did that not work?

I would like to see what Nagios is sending. A tcpdump would help in this case.

Just in case, could you send me your Nagios contact configs and tell me how you are sending these emails out?

Is there anything special about the setup, or is it just regular email delivery?

/usr/local/nagios/etc/objects/contacts.cfg
knackglobal
Posts: 13
Joined: Sat Mar 10, 2018 6:29 am

Re: Nagios email alerts.

Post by knackglobal »

contacts.cfg


define contactgroup{
    contactgroup_name       admins
    alias                   Nagios Administrators
    members                 nagiosadmin
}

define contact{
    contact_name            nagiosadmin             ; Short name of user
    use                     generic-contact         ; Inherit default values from generic-contact template (defined above)
    alias                   Nagios Admin            ; Full name of user
    email                   support@knackglobal.com ; <<***** CHANGE THIS TO YOUR EMAIL ADDRESS ******
}
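
To double-check what a definition like the one above resolves to, a small parser can list the directives per object. This is a rough sketch, not how Nagios itself reads its config: `parse_objects` is a hypothetical helper that only handles simple `define <type>{ ... }` blocks with `;` comments and ignores template inheritance.

```python
import re

# Contact definition as posted above.
CONFIG = """\
define contact{
    contact_name    nagiosadmin     ; Short name of user
    use             generic-contact ; Inherit default values from generic-contact template (defined above)
    alias           Nagios Admin    ; Full name of user
    email           support@knackglobal.com ; <<***** CHANGE THIS TO YOUR EMAIL ADDRESS ******
}
"""


def parse_objects(text):
    """Hypothetical helper: parse 'define <type>{ ... }' blocks into (type, dict) pairs."""
    objects = []
    for obj_type, body in re.findall(r"define\s+(\w+)\s*\{(.*?)\}", text, re.S):
        directives = {}
        for raw in body.splitlines():
            line = raw.split(";", 1)[0].strip()  # drop trailing ';' comments
            if not line:
                continue
            parts = line.split(None, 1)  # directive name, then the rest as its value
            directives[parts[0]] = parts[1].strip() if len(parts) > 1 else ""
        objects.append((obj_type, directives))
    return objects


objects = parse_objects(CONFIG)
for obj_type, directives in objects:
    print(obj_type, directives)
```

Since the contact inherits from generic-contact, the notification commands and periods presumably come from that template, so it is worth checking as well.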
mcapra
Posts: 3739
Joined: Thu May 05, 2016 3:54 pm

Re: Nagios email alerts.

Post by mcapra »

As this seems to be an MTA issue and not a Nagios Core issue, a tcpdump is really the best way to diagnose it properly.

If sendmail is connecting directly to a Google relay, it wouldn't surprise me if Google's relays have started rejecting it for one reason or another, or if your ISP has suddenly started blocking sendmail messages. I would try a different MTA as a troubleshooting step.
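
One way to check the rejection theory against the maillog is to tally the outcomes of hand-offs to Google by DSN class (2.x.x success, 4.x.x temporary failure, 5.x.x permanent failure, per RFC 3463). A rough sketch follows. `tally_outcomes` is a hypothetical helper, the sample lines are copied from the maillog posted earlier in the thread, and it assumes deferred or rejected lines use the same `relay=..., dsn=..., stat=...` layout.

```python
import re
from collections import Counter

# RFC 3463 status classes, keyed by the first digit of the dsn= field.
DSN_CLASSES = {"2": "delivered", "4": "deferred (temporary)", "5": "rejected (permanent)"}

# Lines copied from the maillog posted earlier in this thread.
MAILLOG = """\
Mar 14 19:33:03 in-nagios sendmail[26667]: w2EE30Mq026665: to=<support@knackglobal.com>, ctladdr=<nagios@in-nagios.knackbpo.com> (1000/1000), delay=00:00:03, xdelay=00:00:02, mailer=esmtp, pri=120730, relay=aspmx.l.google.com. [74.125.24.27], dsn=2.0.0, stat=Sent (OK 1521036183 e91-v6si2021417plb.177 - gsmtp)
Mar 14 19:48:19 in-nagios sendmail[28595]: w2EEIIbw028595: to=support@knackglobal.com, ctladdr=nagios (1000/1000), delay=00:00:01, xdelay=00:00:00, mailer=relay, pri=30465, relay=[127.0.0.1] [127.0.0.1], dsn=2.0.0, stat=Sent (w2EEIJf0028596 Message accepted for delivery)
Mar 14 19:48:21 in-nagios sendmail[28598]: w2EEIJf0028596: to=<support@knackglobal.com>, ctladdr=<nagios@in-nagios.knackbpo.com> (1000/1000), delay=00:00:02, xdelay=00:00:02, mailer=esmtp, pri=120730, relay=aspmx.l.google.com. [74.125.24.27], dsn=2.0.0, stat=Sent (OK 1521037101 k72si1948854pgc.334 - gsmtp)
"""


def tally_outcomes(lines, relay_suffix="google.com."):
    """Hypothetical helper: count outcomes for lines whose relay host ends with relay_suffix."""
    counts = Counter()
    for line in lines:
        match = re.search(r"relay=(\S+) \[[^\]]*\], dsn=(\d)\.\d+\.\d+, stat=", line)
        if match and match.group(1).endswith(relay_suffix):
            counts[DSN_CLASSES.get(match.group(2), "unknown")] += 1
    return counts


counts = tally_outcomes(MAILLOG.splitlines())
print(dict(counts))
```

If the tally over the full /var/log/maillog shows only delivered entries, the messages were accepted by Google's relay, and the next place to look would be the receiving side (for example spam filtering) rather than sendmail.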
Former Nagios employee
https://www.mcapra.com/