Migrating Nagios XI to diff domain: Email Notifications

atremblay
Posts: 46
Joined: Wed Apr 05, 2017 1:38 pm

Post by atremblay »

Hello again,

As the company I work for is separating out and selling off some assets, we've had to build a second Nagios environment for that side to manage. Thus far I've been able to solve just about every issue that has come up while doing this. But there's one that's lasted until the end, and I cannot for the life of me find out what's going wrong.

Email Notifications.

Now, just to be clear: email works. If I send test emails to users, they get them. But notifications do not come in. I've been scouring the contact group configs and turned up the verbosity of the phpmailer log, to no avail. Nothing seems to point at anything actually failing; it's just not sending. Usually that means config, but this chain isn't that complex.

Mail server configured > User account configured for notifications > Notification templates associated with user > "All notifications" notification group added to user > Notification group associated with Services/Hosts > Notification preferences enabled for user to receive this type of notification.

Help me here, because I feel I'm going mad trying to track down a gremlin in my system.

Thanks.

Alex
atremblay
Posts: 46
Joined: Wed Apr 05, 2017 1:38 pm

Post by atremblay »

I did pick up this log.

Code:

Oct  1 10:03:03 <Nagios Hostname> postfix/smtp[34704]: connect to <domain name>.com[<domain controller IP Address>]:25: Connection timed out
Oct  1 10:03:33 <Nagios Hostname> postfix/smtp[34704]: connect to <domain name>.com[<domain controller IP Address>]:25: Connection timed out
Oct  1 10:04:03 <Nagios Hostname> postfix/smtp[34704]: connect to <domain name>.com[<domain controller IP Address>]:25: Connection timed out
Oct  1 10:04:33 <Nagios Hostname> postfix/smtp[34704]: connect to <domain name>.com[<domain controller IP Address>]:25: Connection timed out
Oct  1 10:04:33 <Nagios Hostname> postfix/smtp[34704]: AB4E711BA: to=<nagios-notice@<domain name>.com>, relay=none, delay=422737, delays=422617/0.01/120/0, dsn=4.4.1, status=deferred (connect to <domain name>.com[<domain controller IP address>]:25: Connection timed out)

Now, I don't know why the Nagios server would be trying to connect to a domain controller on port 25, and then report that mail failed because that connection timed out. But clearly that ain't right. Our Exchange server is separate from the DCs (not SBS, if that's a thing anymore).

This was found in /var/log/maillog.
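
For reference, here is a minimal shell sketch for pulling the deferred deliveries out of a Postfix log like the one above. The function name `deferred_summary` is made up for illustration, and it assumes Postfix's default short hexadecimal queue IDs; the log path is the one mentioned in this post.

```shell
# Hypothetical helper: reads a Postfix log on stdin and prints one line per
# deferred delivery in the form "queue-id recipient dsn".
# Assumes the default short (hexadecimal) Postfix queue IDs.
# Usage: deferred_summary < /var/log/maillog
deferred_summary() {
  awk '/status=deferred/ {
    qid = ""; rcpt = ""; dsn = ""
    for (i = 1; i <= NF; i++) {
      # First field that looks like "AB4E711BA:" is taken as the queue ID.
      if (qid == "" && $i ~ /^[0-9A-F]+:$/) { qid = $i; sub(/:$/, "", qid) }
      # Recipient appears as "to=<address>,".
      if ($i ~ /^to=</) { rcpt = $i; sub(/^to=</, "", rcpt); sub(/>,?$/, "", rcpt) }
      # Delivery status code appears as "dsn=4.4.1,".
      if ($i ~ /^dsn=/) { dsn = $i; sub(/^dsn=/, "", dsn); sub(/,$/, "", dsn) }
    }
    print qid, rcpt, dsn
  }'
}
```

If this prints entries, the messages are sitting in the local Postfix queue rather than failing inside Nagios XI. One possible explanation for the domain controller address, not confirmed in this thread: when no relay host is set and the recipient domain has no MX record, SMTP delivery falls back to the domain's A record, and in an Active Directory domain that record usually points at the domain controllers. `postconf relayhost` shows the configured relay, if any.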
scottwilkerson
DevOps Engineer
Posts: 19396
Joined: Tue Nov 15, 2011 3:11 pm
Location: Nagios Enterprises

Post by scottwilkerson »

If you go to CCM -> Contacts, click on one of the contacts, and open the Alert Settings tab, what is selected under "Manage Host Notification Commands" and "Manage Service Notification Commands"?

Also, can you confirm that everything is green under Admin -> Monitoring Engine Status?
atremblay
Posts: 46
Joined: Wed Apr 05, 2017 1:38 pm

Post by atremblay »

The Manage Host Notification Commands and Service Commands have nothing set. 0

The same is true in our other Nagios XI environment for the same user account, but notifications work there.

Here's some background, which probably has nothing to do with the issue, but in all your years you may have seen something funny happen with a setup like this. We have a Nagios XI virtual machine, and we're dividing up the environment. We cloned the existing virtual machine, applied a new license, and removed half of the hosts from one VM and the other half from the other VM, so that two servers now monitor what one server did before, split in half. The last step was pointing the mail settings at this other organization's mail server.

Initially I thought there could be an issue there, since they don't run the same mail server we do, but as I mentioned before, I was able to send test messages from the XI interface. Just none of the notifications were making it there. Last thing to tack on there too is that when you check the notifications logs in XI, it shows that a notification was generated. It doesn't show any error with it. Just seems to not be processing it properly to send it to the user account.

And yes, all Monitoring Engine Status checks are green.

Thank you!
scottwilkerson
DevOps Engineer
Posts: 19396
Joined: Tue Nov 15, 2011 3:11 pm
Location: Nagios Enterprises

Post by scottwilkerson »

atremblay wrote: The Manage Host Notification Commands and Service Commands have nothing set. 0
This is really odd unless they were being applied by a template (maybe on the other system they are)

The Manage Host Notification Commands should be xi_host_notification_handler
The Manage Service Notification Commands should be xi_service_notification_handler
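
To check from the command line which contacts and contact templates actually carry these handlers, something like the following sketch could be used. The function name `notification_handlers` is hypothetical, and the file locations in the usage comment are an assumption about where Nagios XI writes its contact definitions; `host_notification_commands` and `service_notification_commands` are the standard Nagios contact directives.

```shell
# Hypothetical helper: reads Nagios object config on stdin and prints, for
# each contact or contact template, "name host-command service-command".
# A "-" means the directive is not set in that block (it may be inherited
# through "use"). Names containing spaces are cut at the first space.
# Usage (paths are an assumption, adjust to your install):
#   cat /usr/local/nagios/etc/contacts.cfg \
#       /usr/local/nagios/etc/contacttemplates.cfg | notification_handlers
notification_handlers() {
  awk '
    $1 == "define" {
      inblk = ($2 == "contact" || $2 == "contact{")
      who = "?"; host = "-"; svc = "-"
    }
    inblk && ($1 == "contact_name" || $1 == "name") { who = $2 }
    inblk && $1 == "host_notification_commands"     { host = $2 }
    inblk && $1 == "service_notification_commands"  { svc = $2 }
    inblk && $1 == "}" { print who, host, svc; inblk = 0 }
  '
}
```

A contact that shows `- -` here but uses a template that lists both handlers is consistent with the commands being applied through the template.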
atremblay wrote: Last thing to tack on there too is that when you check the notifications logs in XI, it shows that a notification was generated. It doesn't show any error with it. Just seems to not be processing it properly to send it to the user account.
This leads me to believe that they must be set in the template and that we have a problem with the handler. In the report, does the Dispatcher column say "Nagios XI"?

If so, we are going to need to check the mail settings in Admin -> Manage Email Settings and verify everything is correct.

We will also want to look at the debug logs listed on the settings page.
atremblay
Posts: 46
Joined: Wed Apr 05, 2017 1:38 pm

Post by atremblay »

Yeah, you were right. The host and service notification handler commands are inherited by each contact through a template, and it seems to be the same way in both systems. Most of our stuff is built that way, applied at the broadest scope possible, to keep it simple to edit later on.

And yes, the Dispatcher is listed as Nagios XI.

The settings on Admin -> Manage Email Settings must be correct, because on that page when I click Send Test Message, it works.

The debug logs on the settings page never seem to receive any mail commands from Nagios XI. Even though I turned up the verbosity, the log only lists my test emails generated from the settings page, and it lists them as successful. To me it seems like Nagios never fully commits to generating an email: nothing seems to be failing, it just never seems to be happening.
ssax
Dreams In Code
Posts: 7682
Joined: Wed Feb 11, 2015 12:54 pm

Post by ssax »

Please run this tail command (and let it run):

Code:

tail -F /usr/local/nagiosxi/var/eventman.log /usr/local/nagiosxi/tmp/phpmailer.log

Then force a notification to be sent (not with the Send a Test Email button) and send us the complete tail command output so we can see what is going on.

Thank you
atremblay
Posts: 46
Joined: Wed Apr 05, 2017 1:38 pm

Post by atremblay »

Good call! I can tell you right now that the event manager isn't processing any events, but there should be some. I can see host up/down states and high bandwidth utilization on some links. As I mentioned before, we've got two Nagios servers, so I'm using the working one to help determine what's going on. It's getting events from devices that I've configured on both Nagios servers, so they should both be processing the same events.

Below you can see all the test emails I sent two days ago. Those were the last successful ones, despite the roughly 50 notifications I would expect from this server each day for the past while.


==> /usr/local/nagiosxi/var/eventman.log <==
PROCESSED 0 EVENTS
...................
PROCESSED 0 EVENTS
....................
PROCESSED 0 EVENTS
.....................
PROCESSED 0 EVENTS
...................
PROCESSED 0 EVENTS
....
==> /usr/local/nagiosxi/tmp/phpmailer.log <==
[10-01-2018 10:56:48] Message sent! (method=smtp;host=<mail server DNS name>;port=25;security=none), Referer: admin/testemail.php
[10-01-2018 11:05:26] SMTP connect() failed. https://github.com/PHPMailer/PHPMailer/ ... leshooting (method=smtp;host=<mail server DNS name>;port=25;smtpauth=true;security=none), Referer: admin/testemail.php
[10-01-2018 11:05:49] SMTP connect() failed. https://github.com/PHPMailer/PHPMailer/ ... leshooting (method=smtp;host=<mail server DNS name>;port=25;smtpauth=true;security=tls), Referer: admin/testemail.php
[10-01-2018 11:06:21] SMTP connect() failed. https://github.com/PHPMailer/PHPMailer/ ... leshooting (method=smtp;host=<mail server DNS name>;port=25;smtpauth=true;security=tls), Referer: admin/testemail.php
[10-01-2018 11:06:29] SMTP connect() failed. https://github.com/PHPMailer/PHPMailer/ ... leshooting (method=smtp;host=<mail server DNS name>;port=25;smtpauth=true;security=ssl), Referer: admin/testemail.php
[10-01-2018 11:06:55] Message sent! (method=sendmail), Referer: admin/testemail.php
[10-01-2018 11:13:35] Message sent! (method=sendmail), Referer: admin/testemail.php
[10-01-2018 11:15:03] Message sent! (method=sendmail), Referer: account/testnotification.php > PHPmailer Test
[10-01-2018 11:15:26] Message sent! (method=sendmail), Referer: admin/testemail.php
[10-01-2018 11:16:52] SMTP connect() failed. https://github.com/PHPMailer/PHPMailer/ ... leshooting (method=smtp;host=<ip_addr>;port=25;smtpauth=true;security=none), Referer: admin/testemail.php

==> /usr/local/nagiosxi/var/eventman.log <==
................
PROCESSED 0 EVENTS
....................
PROCESSED 0 EVENTS
.....................
PROCESSED 0 EVENTS
..................^C
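
The two things to read out of this output can be condensed with a small sketch like the one below. Both function names are hypothetical; the log formats are taken from the output above, and the paths are the ones from the tail command.

```shell
# Hypothetical helper: counts event manager runs and the total number of
# events processed, from eventman.log lines such as "PROCESSED 0 EVENTS".
# Usage: eventman_summary < /usr/local/nagiosxi/var/eventman.log
eventman_summary() {
  awk '/^PROCESSED [0-9]+ EVENTS/ { runs++; total += $2 }
       END { printf "%d runs, %d events\n", runs, total }'
}

# Hypothetical helper: counts phpmailer.log entries per result and Referer,
# printing lines of the form "count sent|failed referer".
# Usage: phpmailer_by_referer < /usr/local/nagiosxi/tmp/phpmailer.log
phpmailer_by_referer() {
  awk '/Referer: / {
    status = ($0 ~ /Message sent!/) ? "sent" : "failed"
    ref = $0
    sub(/.*Referer: /, "", ref)   # keep only what follows "Referer: "
    sub(/ .*/, "", ref)           # drop anything after the page name
    count[status " " ref]++
  }
  END { for (k in count) print count[k], k }' | LC_ALL=C sort -k2
}
```

A total of zero events, together with mail entries whose Referer is only a test page, would match what is seen here: the mail path works, but no notification ever reaches it.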
ssax
Dreams In Code
Posts: 7682
Joined: Wed Feb 11, 2015 12:54 pm

Post by ssax »

Please PM me a copy of your profile of the non-working server, you can download it from Admin > System Profile > Download Profile.

Let me know the exact hostname and servicename you're testing with and any contact who should be receiving it but isn't.

Thank you
ssax
Dreams In Code
Posts: 7682
Joined: Wed Feb 11, 2015 12:54 pm

Post by ssax »

Looks like you have a crashed table that would impact this:

Code:

181004 13:37:51 [ERROR] /usr/libexec/mysqld: Table './nagiosxi/xi_eventqueue' is marked as crashed and should be repaired

Please run these commands and see if it resolves your issue:

Code:

service nagios stop
service ndo2db stop
cd /usr/local/nagiosxi/scripts
./repair_databases.sh
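
To see whether any other tables are flagged the same way, before or after the repair, the database error log can be scanned with a sketch like this. The function name `crashed_tables` is made up, and the log path in the usage comment is an assumption that varies by distribution and database version; the message format is the one quoted above.

```shell
# Hypothetical helper: reads a MySQL/MariaDB error log on stdin and prints
# each distinct table reported as "marked as crashed".
# Usage (log path is an assumption, adjust to your system):
#   crashed_tables < /var/log/mysqld.log
crashed_tables() {
  sed -n "s/.*Table '\([^']*\)' is marked as crashed.*/\1/p" | LC_ALL=C sort -u
}
```

An empty result for log entries written after the repair suggests no table is still being reported as crashed.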