Re: Nagios email alerts.

Posted: Tue Feb 20, 2018 10:16 am
by Knack
Kindly check the commands.cfg file:

# 'process-host-perfdata' command definition
define command{
command_name process-host-perfdata
command_line /usr/bin/printf "%b" "$LASTHOSTCHECK$\t$HOSTNAME$\t$HOSTSTATE$\t$HOSTATTEMPT$\t$HOSTSTATETYPE$\t$HOSTEXECUTIONTIME$\t$HOSTOUTPUT$\t$HOSTPERFDATA$\n" >> /usr/local/nagios/var/host-perfdata.out
}


# 'process-service-perfdata' command definition
define command{
command_name process-service-perfdata
command_line /usr/bin/printf "%b" "$LASTSERVICECHECK$\t$HOSTNAME$\t$SERVICEDESC$\t$SERVICESTATE$\t$SERVICEATTEMPT$\t$SERVICESTATETYPE$\t$SERVICEEXECUTIONTIME$\t$SERVICELATENCY$\t$SERVICEOUTPUT$\t$SERVICEPERFDATA$\n" >> /usr/local/nagios/var/service-perfdata.out
}

define command{
command_name check_nrpe
command_line $USER1$/check_nrpe -H $HOSTADDRESS$ -c $ARG1$
}

Re: Nagios email alerts.

Posted: Tue Feb 20, 2018 10:20 am
by Knack
# 'notify-host-by-email' command definition
define command{
command_name notify-host-by-email
command_line /usr/bin/printf "%b" "***** Nagios *****\n\nNotification Type: $NOTIFICATIONTYPE$\nHost: $HOSTNAME$\nState: $HOSTSTATE$\nAddress: $HOSTADDRESS$\nInfo: $HOSTOUTPUT$\n\nDate/Time: $LONGDATETIME$\n" | /usr/bin/mail -s "** $NOTIFICATIONTYPE$ Host Alert: $HOSTNAME$ is $HOSTSTATE$ **" $CONTACTEMAIL$
}

# 'notify-service-by-email' command definition
define command{
command_name notify-service-by-email
command_line /usr/bin/printf "%b" "***** Nagios *****\n\nNotification Type: $NOTIFICATIONTYPE$\n\nService: $SERVICEDESC$\nHost: $HOSTALIAS$\nAddress: $HOSTADDRESS$\nState: $SERVICESTATE$\n\nDate/Time: $LONGDATETIME$\n\nAdditional Info:\n\n$SERVICEOUTPUT$\n" | /usr/bin/mail -s "** $NOTIFICATIONTYPE$ Service Alert: $HOSTALIAS$/$SERVICEDESC$ is $SERVICESTATE$ **" $CONTACTEMAIL$
}

Re: Nagios email alerts.

Posted: Tue Feb 20, 2018 5:27 pm
by npolovenko
@Knack, Looks like something happened to your /usr/bin/mail file. Did you delete it by accident?
[1519065324] wproc: stderr line 01: /bin/sh: /usr/bin/mail: No such file or directory
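
The error above means the shell could not find a binary at the path used in command_line. A minimal sketch for checking a mailer path before pointing Nagios at it; `check_mailer` is a hypothetical helper name, not something Nagios ships:

```shell
#!/bin/sh
# Report whether a mailer path (as used in command_line) is an executable file.
# `check_mailer` is a hypothetical helper name, not part of Nagios.
check_mailer() {
    if [ -x "$1" ] && [ ! -d "$1" ]; then
        printf 'ok: %s\n' "$1"
    else
        printf 'missing: %s\n' "$1"
        return 1
    fi
}
```

For example, `check_mailer /usr/bin/mail || command -v mail` reports the configured path first and, if it is missing, prints whatever `mail` the shell can find on PATH (nothing if none is installed).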

Re: Nagios email alerts.

Posted: Wed Feb 21, 2018 7:23 am
by Knack
Hi,

No, I didn't delete /usr/bin/mail. Actually, the file was never created. Can I create it now?

Do you see any other issue in the log files that would explain why notifications are not arriving?

One more thing I want to ask: can we decrease the ping check interval from 5 minutes to 1 minute?
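
On the interval question: in Nagios Core the `check_interval` directive of a host or service sets how often the check runs, in units of `interval_length` (60 seconds by default), so a value of 1 would mean one minute under that default. A minimal sketch for finding where the intervals are currently set; the function name `list_check_intervals` and the directory in the usage line are assumptions, and `grep -r` assumes GNU or BusyBox grep:

```shell
#!/bin/sh
# List every interval directive under a Nagios config directory,
# including the deprecated spellings (normal_check_interval, retry_check_interval).
# `list_check_intervals` is a hypothetical helper name.
list_check_intervals() {
    grep -rnE \
        '^[[:space:]]*(normal_check_interval|check_interval|retry_check_interval|retry_interval)[[:space:]]' \
        "$1"
}
```

Usage might look like `list_check_intervals /usr/local/nagios/etc/objects` (path assumed). Each hit is printed as `file:line:directive`, which shows which host, service, or template definition would need editing.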

Re: Nagios email alerts.

Posted: Wed Feb 21, 2018 2:01 pm
by npolovenko
@Knack, In your notification commands above, try changing:

/usr/bin/mail -s 
to

/bin/mail -s 
Then restart nagios with:

service nagios restart
Try getting a service check into a critical state and see whether you receive a notification. If that still doesn't work, try this command:

which mail
And show us the output.
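
If waiting for a real critical state is inconvenient, the same `printf | mail -s` pipeline can also be exercised by hand. A rough sketch, assuming the mail binary accepts `-s SUBJECT RECIPIENT` as in the notification commands above; `send_test_alert`, the host name `testhost`, and the message text are made up for illustration:

```shell
#!/bin/sh
# Pipe a hand-written test message through the same "printf | mail -s" shape
# that the notification commands use. `send_test_alert` is a hypothetical helper.
#   $1 = path to the mail binary, $2 = recipient address
send_test_alert() {
    mailer=$1
    recipient=$2
    printf '%b' "***** Nagios *****\n\nNotification Type: TEST\nHost: testhost\nState: DOWN\n" |
        "$mailer" -s "** TEST Host Alert: testhost is DOWN **" "$recipient"
}
```

Calling `send_test_alert /bin/mail you@example.com` (address is a placeholder) as the nagios user comes closest to what the daemon does. A "No such file or directory" error here would point at the path rather than at Nagios itself.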

Re: Nagios email alerts.

Posted: Wed Feb 21, 2018 3:47 pm
by Knack
Hi,

I made the changes you suggested. Please check below.

# 'notify-host-by-email' command definition
define command{
command_name notify-host-by-email
command_line /usr/bin/printf "%b" "***** Nagios *****\n\nNotification Type: $NOTIFICATIONTYPE$\nHost: $HOSTNAME$\nState: $HOSTSTATE$\nAddress: $HOSTADDRESS$\nInfo: $HOSTOUTPUT$\n\nDate/Time: $LONGDATETIME$\n" | /bin/mail -s "** $NOTIFICATIONTYPE$ Host Alert: $HOSTNAME$ is $HOSTSTATE$ **" $CONTACTEMAIL$
}

# 'notify-service-by-email' command definition
define command{
command_name notify-service-by-email
command_line /usr/bin/printf "%b" "***** Nagios *****\n\nNotification Type: $NOTIFICATIONTYPE$\n\nService: $SERVICEDESC$\nHost: $HOSTALIAS$\nAddress: $HOSTADDRESS$\nState: $SERVICESTATE$\n\nDate/Time: $LONGDATETIME$\n\nAdditional Info:\n\n$SERVICEOUTPUT$\n" | /bin/mail -s "** $NOTIFICATIONTYPE$ Service Alert: $HOSTALIAS$/$SERVICEDESC$ is $SERVICESTATE$ **" $CONTACTEMAIL$
}



[root@in-nagios objects]# which mail
/usr/bin/mail

Re: Nagios email alerts.

Posted: Wed Feb 21, 2018 3:49 pm
by Knack
Today's entries from the nagios.log file:

[1519244618] wproc: Core Worker 5342: job 19774 (pid=16806): Dormant child reaped
[1519244625] HOST ALERT: Voice IP GSIP;UP;SOFT;2;PING OK - Packet loss = 28%, RTA = 78.63 ms
[1519244776] wproc: Core Worker 5345: job 20010 (pid=19656) timed out. Killing it
[1519244776] wproc: CHECK job 20010 from worker Core Worker 5345 timed out after 30.01s
[1519244776] wproc: host=Voice IP GSIP; service=(null);
[1519244776] wproc: early_timeout=1; exited_ok=0; wait_status=0; error_code=62;
[1519244776] Warning: Check of host 'Voice IP GSIP' timed out after 30.01 seconds
[1519244776] HOST ALERT: Voice IP GSIP;DOWN;SOFT;1;(Host check timed out after 30.01 seconds)
[1519244776] wproc: Core Worker 5345: job 20010 (pid=19656): Dormant child reaped
[1519244805] HOST ALERT: Voice IP GSIP;UP;SOFT;2;PING WARNING - Packet loss = 82%, RTA = 77.40 ms
[1519244851] wproc: Core Worker 5343: job 20120 (pid=20979) timed out. Killing it
[1519244851] wproc: CHECK job 20120 from worker Core Worker 5343 timed out after 30.01s
[1519244851] wproc: host=Voice IP GSIP; service=(null);
[1519244851] wproc: early_timeout=1; exited_ok=0; wait_status=0; error_code=62;
[1519244851] Warning: Check of host 'Voice IP GSIP' timed out after 30.01 seconds
[1519244851] HOST ALERT: Voice IP GSIP;DOWN;SOFT;1;(Host check timed out after 30.01 seconds)
[1519244851] wproc: Core Worker 5343: job 20120 (pid=20979): Dormant child reaped
[1519244868] HOST ALERT: Voice IP GSIP;UP;SOFT;2;PING OK - Packet loss = 70%, RTA = 77.68 ms
[1519245123] wproc: Core Worker 5341: job 20524 (pid=25847) timed out. Killing it
[1519245123] wproc: CHECK job 20524 from worker Core Worker 5341 timed out after 30.01s
[1519245123] wproc: host=Voice IP GSIP; service=(null);
[1519245123] wproc: early_timeout=1; exited_ok=0; wait_status=0; error_code=62;
[1519245123] Warning: Check of host 'Voice IP GSIP' timed out after 30.01 seconds
[1519245123] HOST ALERT: Voice IP GSIP;DOWN;SOFT;1;(Host check timed out after 30.01 seconds)
[1519245123] wproc: Core Worker 5341: job 20524 (pid=25847): Dormant child reaped
[1519245136] HOST ALERT: Voice IP GSIP;UP;SOFT;2;PING OK - Packet loss = 61%, RTA = 77.33 ms
[1519245421] HOST FLAPPING ALERT: Voice IP GSIP;STOPPED; Host appears to have stopped flapping (3.8% change < 5.0% threshold)
[1519245653] Auto-save of retention data completed successfully.
[1519245845] Caught SIGTERM, shutting down...
[1519245845] Successfully shutdown... (PID=5338)
[1519245845] Event broker module 'NERD' deinitialized successfully.
[1519245846] Nagios 4.3.4 starting... (PID=6953)
[1519245846] Local time is Thu Feb 22 02:14:06 IST 2018
[1519245846] LOG VERSION: 2.0
[1519245846] qh: Socket '/usr/local/nagios/var/rw/nagios.qh' successfully initialized
[1519245846] qh: core query handler registered
[1519245846] nerd: Channel hostchecks registered successfully
[1519245846] nerd: Channel servicechecks registered successfully
[1519245846] nerd: Channel opathchecks registered successfully
[1519245846] nerd: Fully initialized and ready to rock!
[1519245846] wproc: Successfully registered manager as @wproc with query handler
[1519245846] wproc: Registry request: name=Core Worker 6957;pid=6957
[1519245846] wproc: Registry request: name=Core Worker 6955;pid=6955
[1519245846] wproc: Registry request: name=Core Worker 6956;pid=6956
[1519245846] wproc: Registry request: name=Core Worker 6959;pid=6959
[1519245846] wproc: Registry request: name=Core Worker 6960;pid=6960
[1519245846] wproc: Registry request: name=Core Worker 6958;pid=6958
[1519245846] WARNING: The normal_check_interval attribute is deprecated and will be removed in future versions. Please use check_interval instead.
[1519245846] WARNING: The retry_check_interval attribute is deprecated and will be removed in future versions. Please use retry_interval instead.
[1519245846] WARNING: The normal_check_interval attribute is deprecated and will be removed in future versions. Please use check_interval instead.
[1519245846] WARNING: The retry_check_interval attribute is deprecated and will be removed in future versions. Please use retry_interval instead.
[1519245846] Successfully launched command file worker with pid 6961
[1519245893] Caught SIGTERM, shutting down...
[1519245893] Successfully shutdown... (PID=6953)
[1519245893] Event broker module 'NERD' deinitialized successfully.
[1519245894] Nagios 4.3.4 starting... (PID=7920)
[1519245894] Local time is Thu Feb 22 02:14:54 IST 2018
[1519245894] LOG VERSION: 2.0
[1519245894] qh: Socket '/usr/local/nagios/var/rw/nagios.qh' successfully initialized
[1519245894] qh: core query handler registered
[1519245894] nerd: Channel hostchecks registered successfully
[1519245894] nerd: Channel servicechecks registered successfully
[1519245894] nerd: Channel opathchecks registered successfully
[1519245894] nerd: Fully initialized and ready to rock!
[1519245894] wproc: Successfully registered manager as @wproc with query handler
[1519245894] wproc: Registry request: name=Core Worker 7922;pid=7922
[1519245894] wproc: Registry request: name=Core Worker 7927;pid=7927
[1519245894] wproc: Registry request: name=Core Worker 7925;pid=7925
[1519245894] wproc: Registry request: name=Core Worker 7926;pid=7926
[1519245894] wproc: Registry request: name=Core Worker 7924;pid=7924
[1519245894] wproc: Registry request: name=Core Worker 7923;pid=7923
[1519245894] WARNING: The normal_check_interval attribute is deprecated and will be removed in future versions. Please use check_interval instead.
[1519245894] WARNING: The retry_check_interval attribute is deprecated and will be removed in future versions. Please use retry_interval instead.
[1519245894] WARNING: The normal_check_interval attribute is deprecated and will be removed in future versions. Please use check_interval instead.
[1519245894] WARNING: The retry_check_interval attribute is deprecated and will be removed in future versions. Please use retry_interval instead.
[1519245894] Successfully launched command file worker with pid 7928
[1519245943] wproc: Core Worker 7924: job 30 (pid=8296) timed out. Killing it
[1519245943] wproc: CHECK job 30 from worker Core Worker 7924 timed out after 30.01s
[1519245943] wproc: host=Voice IP GSIP; service=(null);
[1519245943] wproc: early_timeout=1; exited_ok=0; wait_status=0; error_code=62;
[1519245943] Warning: Check of host 'Voice IP GSIP' timed out after 30.01 seconds
[1519245943] HOST ALERT: Voice IP GSIP;DOWN;SOFT;1;(Host check timed out after 30.01 seconds)
[1519245943] wproc: Core Worker 7924: job 30 (pid=8296): Dormant child reaped
[1519245969] HOST ALERT: Voice IP GSIP;UP;SOFT;2;PING WARNING - Packet loss = 80%, RTA = 77.35 ms
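
The bracketed number at the start of each nagios.log line is a Unix epoch timestamp. A small sketch for making such lines easier to line up against mail.log, assuming a `date` that accepts `-d @SECONDS` (GNU coreutils or BusyBox); `humanize_nagios_log` is a hypothetical name:

```shell
#!/bin/sh
# Rewrite the leading [epoch] stamp of each nagios.log line as a UTC date.
# Lines without a purely numeric [..] prefix are passed through unchanged.
# Assumes `date -d @SECONDS` is supported (GNU coreutils, BusyBox).
humanize_nagios_log() {
    while IFS= read -r line; do
        case $line in
            \[*\]*) epoch=${line#\[}; epoch=${epoch%%\]*} ;;  # text between "[" and first "]"
            *)      epoch= ;;
        esac
        case $epoch in
            ''|*[!0-9]*)
                printf '%s\n' "$line"
                ;;
            *)
                rest=${line#*\]}                               # text after the first "]"
                stamp=$(date -u -d "@$epoch" '+%Y-%m-%d %H:%M:%S')
                printf '[%s UTC]%s\n' "$stamp" "$rest"
                ;;
        esac
    done
}
```

Fed the `Nagios 4.3.4 starting... (PID=6953)` line, this prints a stamp of 2018-02-21 20:44:06 UTC, which agrees with the `Local time is Thu Feb 22 02:14:06 IST 2018` line logged in the same second (IST is UTC+5:30). Typical use would be `humanize_nagios_log < /usr/local/nagios/var/nagios.log`, where the log path is an assumption.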

Re: Nagios email alerts.

Posted: Wed Feb 21, 2018 4:14 pm
by Knack
Hi, kindly check the mail.log file:

Feb 21 20:39:54 in-nagios postfix/postfix-script[4419]: starting the Postfix mail system
Feb 21 20:39:54 in-nagios postfix/master[4421]: daemon started -- version 2.10.1, configuration /etc/postfix
Feb 21 20:51:47 in-nagios postfix/postfix-script[17050]: stopping the Postfix mail system
Feb 21 20:51:47 in-nagios postfix/master[4421]: terminating on signal 15
Feb 21 20:51:47 in-nagios postfix/postfix-script[17130]: starting the Postfix mail system
Feb 21 20:51:47 in-nagios postfix/master[17132]: daemon started -- version 2.10.1, configuration /etc/postfix
Feb 21 20:51:56 in-nagios postfix/postfix-script[17421]: stopping the Postfix mail system
Feb 21 20:51:56 in-nagios postfix/master[17132]: terminating on signal 15
Feb 21 20:51:56 in-nagios sendmail[17416]: alias database /etc/aliases rebuilt by root
Feb 21 20:51:56 in-nagios sendmail[17416]: /etc/aliases: 76 aliases, longest 10 bytes, 771 bytes total
Feb 21 20:51:56 in-nagios sendmail[17429]: starting daemon (8.14.7): SMTP+queueing@01:00:00
Feb 21 20:51:56 in-nagios sm-msp-queue[17444]: starting daemon (8.14.7): queueing@01:00:00
Feb 21 22:04:26 in-nagios postfix/postfix-script[30803]: starting the Postfix mail system
Feb 21 22:04:26 in-nagios postfix/master[30805]: daemon started -- version 2.10.1, configuration /etc/postfix
Feb 21 23:59:41 in-nagios postfix/pickup[8756]: 82CF8C07AAE3: uid=1000 from=<nagios>
Feb 21 23:59:41 in-nagios postfix/cleanup[25022]: 82CF8C07AAE3: message-id=<20180221182941.82CF8C07AAE3@in-nagios.knackbpo.com>
Feb 21 23:59:41 in-nagios postfix/qmgr[30807]: 82CF8C07AAE3: from=<nagios@in-nagios.knackbpo.com>, size=692, nrcpt=1 (queue active)
Feb 21 23:59:43 in-nagios postfix/smtp[25026]: connect to aspmx.l.google.com[2404:6800:4003:c03::1a]:25: Network is unreachable
Feb 21 23:59:44 in-nagios postfix/smtp[25026]: 82CF8C07AAE3: to=<support@knackglobal.com>, relay=aspmx.l.google.com[74.125.24.26]:25, delay=3.2, delays=0.11/0.01/2.2/0.83, dsn=2.0.0, status=sent (250 2.0.0 OK 1519237784 4-v6si12439603plc.92 - gsmtp)
Feb 21 23:59:44 in-nagios postfix/qmgr[30807]: 82CF8C07AAE3: removed
Feb 21 23:59:56 in-nagios postfix/pickup[8756]: 8D4B7C07AAE3: uid=1000 from=<nagios>
Feb 21 23:59:56 in-nagios postfix/cleanup[25022]: 8D4B7C07AAE3: message-id=<20180221182956.8D4B7C07AAE3@in-nagios.knackbpo.com>
Feb 21 23:59:56 in-nagios postfix/qmgr[30807]: 8D4B7C07AAE3: from=<nagios@in-nagios.knackbpo.com>, size=691, nrcpt=1 (queue active)
Feb 21 23:59:57 in-nagios postfix/smtp[25026]: 8D4B7C07AAE3: to=<support@knackglobal.com>, relay=aspmx.l.google.com[74.125.24.26]:25, delay=1.2, delays=0.05/0/0.62/0.56, dsn=2.0.0, status=sent (250 2.0.0 OK 1519237797 g8-v6si1897205plt.687 - gsmtp)
Feb 21 23:59:57 in-nagios postfix/qmgr[30807]: 8D4B7C07AAE3: removed
Feb 22 00:13:11 in-nagios postfix/pickup[8756]: 84D15C07AAE3: uid=1000 from=<nagios>
Feb 22 00:13:11 in-nagios postfix/cleanup[7158]: 84D15C07AAE3: message-id=<20180221184311.84D15C07AAE3@in-nagios.knackbpo.com>
Feb 22 00:13:11 in-nagios postfix/qmgr[30807]: 84D15C07AAE3: from=<nagios@in-nagios.knackbpo.com>, size=692, nrcpt=1 (queue active)
Feb 22 00:13:12 in-nagios postfix/smtp[7160]: connect to aspmx.l.google.com[2404:6800:4003:c03::1b]:25: Network is unreachable
Feb 22 00:13:13 in-nagios postfix/smtp[7160]: 84D15C07AAE3: to=<support@knackglobal.com>, relay=aspmx.l.google.com[74.125.200.26]:25, delay=2.4, delays=0.07/0.01/1.6/0.76, dsn=2.0.0, status=sent (250 2.0.0 OK 1519238593 x1si264315pgv.124 - gsmtp)
Feb 22 00:13:13 in-nagios postfix/qmgr[30807]: 84D15C07AAE3: removed
Feb 22 00:13:21 in-nagios postfix/pickup[8756]: 99AB2C07AAE3: uid=1000 from=<nagios>
Feb 22 00:13:21 in-nagios postfix/cleanup[7158]: 99AB2C07AAE3: message-id=<20180221184321.99AB2C07AAE3@in-nagios.knackbpo.com>
Feb 22 00:13:21 in-nagios postfix/qmgr[30807]: 99AB2C07AAE3: from=<nagios@in-nagios.knackbpo.com>, size=691, nrcpt=1 (queue active)
Feb 22 00:13:22 in-nagios postfix/smtp[7160]: 99AB2C07AAE3: to=<support@knackglobal.com>, relay=aspmx.l.google.com[74.125.200.26]:25, delay=1.3, delays=0.07/0/0.59/0.59, dsn=2.0.0, status=sent (250 2.0.0 OK 1519238602 j76si31174pfj.146 - gsmtp)
Feb 22 00:13:22 in-nagios postfix/qmgr[30807]: 99AB2C07AAE3: removed
Feb 22 00:45:01 in-nagios postfix/pickup[8756]: 459E8C07AAE3: uid=1000 from=<nagios>
Feb 22 00:45:01 in-nagios postfix/cleanup[8866]: 459E8C07AAE3: message-id=<20180221191501.459E8C07AAE3@in-nagios.knackbpo.com>
Feb 22 00:45:01 in-nagios postfix/qmgr[30807]: 459E8C07AAE3: from=<nagios@in-nagios.knackbpo.com>, size=692, nrcpt=1 (queue active)
Feb 22 00:45:03 in-nagios postfix/smtp[8872]: 459E8C07AAE3: to=<support@knackglobal.com>, relay=aspmx.l.google.com[74.125.24.27]:25, delay=2.5, delays=0.05/0/1.6/0.83, dsn=2.0.0, status=sent (250 2.0.0 OK 1519240503 72-v6si3095965ple.299 - gsmtp)
Feb 22 00:45:03 in-nagios postfix/qmgr[30807]: 459E8C07AAE3: removed
Feb 22 00:45:30 in-nagios postfix/pickup[8756]: 44945C07AAE3: uid=1000 from=<nagios>
Feb 22 00:45:30 in-nagios postfix/cleanup[8866]: 44945C07AAE3: message-id=<20180221191530.44945C07AAE3@in-nagios.knackbpo.com>
Feb 22 00:45:30 in-nagios postfix/qmgr[30807]: 44945C07AAE3: from=<nagios@in-nagios.knackbpo.com>, size=696, nrcpt=1 (queue active)
Feb 22 00:45:31 in-nagios postfix/smtp[8872]: 44945C07AAE3: to=<support@knackglobal.com>, relay=aspmx.l.google.com[74.125.24.27]:25, delay=1.3, delays=0.06/0/0.63/0.56, dsn=2.0.0, status=sent (250 2.0.0 OK 1519240531 w24-v6si7752487pll.479 - gsmtp)
Feb 22 00:45:31 in-nagios postfix/qmgr[30807]: 44945C07AAE3: removed
Feb 22 02:18:13 in-nagios postfix/postfix-script[11593]: stopping the Postfix mail system
Feb 22 02:18:13 in-nagios postfix/master[30805]: terminating on signal 15
Feb 22 02:18:13 in-nagios postfix/postfix-script[11673]: starting the Postfix mail system
Feb 22 02:18:13 in-nagios postfix/master[11675]: daemon started -- version 2.10.1, configuration /etc/postfix
Feb 22 02:30:15 in-nagios postfix/postfix-script[24613]: stopping the Postfix mail system
Feb 22 02:30:15 in-nagios postfix/master[11675]: terminating on signal 15
Feb 22 02:30:15 in-nagios postfix/postfix-script[24693]: starting the Postfix mail system
Feb 22 02:30:15 in-nagios postfix/master[24695]: daemon started -- version 2.10.1, configuration /etc/postfix
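
In the excerpt above, each of the six delivery attempts to support@knackglobal.com ends with status=sent, and both "Network is unreachable" lines are connection attempts to IPv6 addresses that are followed by a successful delivery to an IPv4 address of the same relay. A minimal sketch for summarizing this from a Postfix log, assuming the `to=<...>` and `status=...` fields appear as they do in these lines; `delivery_status` is a hypothetical name:

```shell
#!/bin/sh
# Summarize Postfix delivery lines read from stdin as "STATUS RECIPIENT COUNT".
# Only lines carrying both to=<...> and status=... are counted.
delivery_status() {
    sed -n 's/.* to=<\([^>]*\)>.* status=\([a-z]*\) .*/\2 \1/p' |
        sort | uniq -c | awk '{ print $2, $3, $1 }'
}
```

Run as `delivery_status < mail.log` (file name as used in this thread). Any line other than `sent`, such as `deferred` or `bounced`, would be the place to look next.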

Re: Nagios email alerts.

Posted: Wed Feb 21, 2018 4:17 pm
by Knack
Please check this entry in the log file:

relay=aspmx.l.google.com[74.125.200.26]:25

Re: Nagios email alerts.

Posted: Wed Feb 21, 2018 5:28 pm
by npolovenko
@Knack, it seems like email is working OK now. Are you still not receiving anything?