Nagios email alerts.
-
- Posts: 13
- Joined: Sat Mar 10, 2018 6:29 am
Re: Nagios email alerts.
I installed tcpdump on our server and then ran this command:
tcpdump -s 0 -i any port 25 -w /tmp/25.pcap
It ran for almost 2 hours but produced no output.
Re: Nagios email alerts.
tcpdump will continuously capture all packets traversing a given network interface. It will run exactly as long as you allow it to, which is why @kyang recommended cancelling the command with Ctrl+C once the required data has been captured.
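Since tcpdump will not stop on its own, one way to avoid an open-ended capture is to bound it with a time limit. The following is only a sketch: it assumes the coreutils `timeout` command is available, and `capture_smtp` is a hypothetical helper name that does not come from the original post. It reuses the tcpdump options from @kyang's command unchanged.

```shell
#!/bin/sh
# capture_smtp is a hypothetical helper (not from the thread).
# Usage: capture_smtp OUTPUT_FILE SECONDS   (run as root)
# Captures port 25 traffic on all interfaces and stops after SECONDS.
# timeout sends SIGTERM, and tcpdump then exits much as it would on Ctrl+C.
capture_smtp() {
    out_file=$1
    seconds=$2
    timeout "$seconds" tcpdump -s 0 -i any port 25 -w "$out_file"
    status=$?
    # timeout exits with 124 when it had to stop the command; treat that as success.
    if [ "$status" -eq 124 ]; then
        return 0
    fi
    return "$status"
}

# Example (assumed values): capture for 10 minutes, then check the file size.
# capture_smtp /tmp/25.pcap 600
# ls -l /tmp/25.pcap
```

The resulting file can be read back with `tcpdump -nn -r /tmp/25.pcap`. If that prints nothing, no traffic on port 25 was seen during the capture window.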
Please read the original post by @kyang detailing the specific steps that should be taken with tcpdump:
kyang wrote: This doesn't really seem like a Nagios issue. Also, I'm not exactly sure.
I'll attempt to provide some information to see what's going on though.
How often are emails being sent? Let's run a tcpdump to capture those emails going out. (So I can view the output in the packets.)
You may need to install tcpdump. (I'm not sure what OS you are on, so I'm listing the RHEL/CentOS yum version.)
Code: Select all
yum -y install tcpdump
Wait until Nagios sends out notifications before using Ctrl+C.
Code: Select all
tcpdump -s 0 -i any port 25 -w /tmp/25.pcap
Once you have confirmed it has sent emails, please PM or post that 25.pcap file.
Thank you!
Former Nagios employee
https://www.mcapra.com/
-
- Posts: 13
- Joined: Sat Mar 10, 2018 6:29 am
Re: Nagios email alerts.
This is the output of this command:
# tcpdump -s 0 -i any port 25 -w /tmp/25.pcap
tcpdump: listening on any, link-type LINUX_SLL (Linux cooked), capture size 262144 bytes
Re: Nagios email alerts.
Thanks for the help @mcapra!
knackglobal,
When you ran this command, did you wait for Nagios emails to be sent out?
There should be a 25.pcap file located in your /tmp directory.
Please PM or post that file. Thank you!
Code: Select all
/tmp/25.pcap
-
- Posts: 13
- Joined: Sat Mar 10, 2018 6:29 am
Re: Nagios email alerts.
tcpdump -s 0 -i any port 25 -w /tmp/25.pcap
I ran this command but did not get any email alert. The 25.pcap file was created, but how can I send it to you?
Re: Nagios email alerts.
Try uploading the file on the forum. If you are not able to, send it via a PM to anyone on the Nagios Support team. Thanks!
-
- Posts: 13
- Joined: Sat Mar 10, 2018 6:29 am
Re: Nagios email alerts.
Hi, I have received only one email, for Cam 53 down. Kindly check the logs.
maillog file
Mar 23 17:49:03 in-nagios sendmail[3948]: STARTTLS=client, relay=aspmx.l.google.com., version=TLSv1/SSLv3, verify=FAIL, cipher=AES128-GCM-SHA256, bits=128/128
Mar 23 17:49:05 in-nagios sendmail[3948]: w2NCJ0XL003946: to=<support@knackglobal.com>, ctladdr=<nagios@in-nagios.knackbpo.com> (1000/1000), delay=00:00:05, xdelay=00:00:05, mailer=esmtp, pri=120698, relay=aspmx.l.google.com. [74.125.200.26], dsn=2.0.0, stat=Sent (OK 1521807544 v11si6094846pgb.652 - gsmtp)
nagios.log file
[1521809880] wproc: Core Worker 23130: job 30681 (pid=8832): Dormant child reaped
[1521809885] wproc: Core Worker 23125: job 30683 (pid=8850) timed out. Killing it
[1521809885] wproc: CHECK job 30683 from worker Core Worker 23125 timed out after 30.01s
[1521809885] wproc: host=Cam 58; service=(null);
[1521809885] wproc: early_timeout=1; exited_ok=0; wait_status=0; error_code=62;
[1521809885] wproc: stdout line 01: PING CRITICAL - Packet loss = 100%|rta=5000.000000ms;3000.000000;5000.000000;0.000000 pl=100%;80;100;0
[1521809885] Warning: Check of host 'Cam 58' timed out after 30.01 seconds
[1521809885] wproc: Core Worker 23125: job 30683 (pid=8850): Dormant child reaped
[1521810024] HOST FLAPPING ALERT: Voice IP GSIP;STOPPED; Host appears to have stopped flapping (3.8% change < 5.0% threshold)
[1521810060] wproc: Core Worker 23125: job 30716 (pid=9250) timed out. Killing it
[1521810060] wproc: CHECK job 30716 from worker Core Worker 23125 timed out after 30.01s
[1521810060] wproc: host=Cam 53; service=(null);
[1521810060] wproc: early_timeout=1; exited_ok=0; wait_status=0; error_code=62;
[1521810060] wproc: stdout line 01: PING CRITICAL - Packet loss = 100%|rta=5000.000000ms;3000.000000;5000.000000;0.000000 pl=100%;80;100;0
[1521810060] Warning: Check of host 'Cam 53' timed out after 30.01 seconds
[1521810060] wproc: Core Worker 23125: job 30716 (pid=9250): Dormant child reaped
[1521810065] wproc: Core Worker 23129: job 30716 (pid=9254) timed out. Killing it
[1521810065] wproc: CHECK job 30716 from worker Core Worker 23129 timed out after 30.01s
[1521810065] wproc: host=Cam 58; service=(null);
[1521810065] wproc: early_timeout=1; exited_ok=0; wait_status=0; error_code=62;
[1521810065] wproc: stdout line 01: PING CRITICAL - Packet loss = 100%|rta=5000.000000ms;3000.000000;5000.000000;0.000000 pl=100%;80;100;0
[1521810065] Warning: Check of host 'Cam 58' timed out after 30.01 seconds
[1521810065] wproc: Core Worker 23129: job 30716 (pid=9254): Dormant child reaped
[1521810240] wproc: Core Worker 23127: job 30744 (pid=9585) timed out. Killing it
[1521810240] wproc: CHECK job 30744 from worker Core Worker 23127 timed out after 30.01s
[1521810240] wproc: host=Cam 53; service=(null);
[1521810240] wproc: early_timeout=1; exited_ok=0; wait_status=0; error_code=62;
[1521810240] wproc: stdout line 01: PING CRITICAL - Packet loss = 100%|rta=5000.000000ms;3000.000000;5000.000000;0.000000 pl=100%;80;100;0
[1521810240] Warning: Check of host 'Cam 53' timed out after 30.01 seconds
[1521810240] HOST NOTIFICATION: nagiosadmin;Cam 53;DOWN;notify-host-by-email;(Host check timed out after 30.01 seconds)
[1521810240] wproc: Core Worker 23127: job 30744 (pid=9585): Dormant child reaped
[1521810245] wproc: Core Worker 23126: job 30744 (pid=9587) timed out. Killing it
[1521810245] wproc: CHECK job 30744 from worker Core Worker 23126 timed out after 30.01s
[1521810245] wproc: host=Cam 58; service=(null);
[1521810245] wproc: early_timeout=1; exited_ok=0; wait_status=0; error_code=62;
[1521810245] Warning: Check of host 'Cam 58' timed out after 30.01 seconds
[1521810245] wproc: Core Worker 23126: job 30744 (pid=9587): Dormant child reaped
Re: Nagios email alerts.
As @mcapra mentioned before: if sendmail is connecting directly to a Google relay, it wouldn't surprise me if Google's relays are rejecting it for one reason or another out of the blue, or if your ISP has suddenly started blocking sendmail messages. I would try a different MTA as a troubleshooting step.
The SMTP relay you are using is relay=aspmx.l.google.com.
According to this Google doc, that relay has sending limits. Are you hitting the email limit by chance?
https://support.google.com/a/answer/176600?hl=en
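To see at a glance which relay each message went to and how the relay answered, the delivery lines in the maillog can be reduced to a few fields. This is a minimal sketch, assuming the sendmail log format shown in this thread. `summarize_maillog` is a hypothetical helper name, and /var/log/maillog is an assumed path (the usual location on RHEL/CentOS).

```shell
#!/bin/sh
# summarize_maillog is a hypothetical helper (not from the thread).
# Reads sendmail log lines on stdin and prints, for each delivery line:
#   recipient relay dsn status
# Lines containing " stat=" that do not fit the pattern are printed unchanged.
summarize_maillog() {
    grep ' stat=' | sed -E 's/.* to=<([^>]*)>.* relay=([^ ,]*).* dsn=([^,]*), stat=([^ ]*).*/\1 \2 \3 \4/'
}

# Example (assumed log path):
# summarize_maillog < /var/log/maillog
```

For the delivery line posted above, this prints `support@knackglobal.com aspmx.l.google.com. 2.0.0 Sent`. A dsn beginning with 2 means the relay accepted the message, while entries with a 4.x.x or 5.x.x dsn would point at deferrals or rejections.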
I was able to look at the tcpdump capture. It looks like notifications were sent to support@knackglobal.com. (Viewing the 25.pcap file in Wireshark, I looked at packets 15, 17, and 20.)
It even shows which host notification it was sending out.
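The nagios.log entries are stamped with Unix epoch seconds while the maillog uses wall-clock time, which makes it harder to match a HOST NOTIFICATION to its delivery line. Below is a rough sketch of converting the timestamps, assuming GNU date (for `-d @epoch`). `notifications_with_time` is a hypothetical helper name, and the log path in the example is an assumption.

```shell
#!/bin/sh
# notifications_with_time is a hypothetical helper (not from the thread).
# Reads nagios.log lines on stdin and prints each HOST NOTIFICATION entry
# with its epoch timestamp converted to UTC. Requires GNU date.
notifications_with_time() {
    grep 'HOST NOTIFICATION:' | while IFS= read -r line; do
        epoch=${line#\[}        # drop the leading "["
        epoch=${epoch%%\]*}     # keep only the digits before "]"
        printf '%s %s\n' \
            "$(date -u -d "@$epoch" '+%Y-%m-%d %H:%M:%S')" \
            "${line#*\] }"
    done
}

# Example (assumed log path):
# notifications_with_time < /usr/local/nagios/var/nagios.log
```

For the notification in the posted log, `[1521810240]` becomes `2018-03-23 13:04:00` UTC. Syslog files such as the maillog are typically written in the server's local time zone, so the offset has to be taken into account when comparing the two.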
-
- Posts: 13
- Joined: Sat Mar 10, 2018 6:29 am
Re: Nagios email alerts.
According to the Google doc, the limit seems to be 2000 emails per day. We are not exceeding that limit.
Can you please suggest an alternative relay option to relay=aspmx.l.google.com?
Kindly suggest an alternative MTA for sending email.