Notifications not going out.

This support forum board is for support questions relating to Nagios XI, our flagship commercial network monitoring solution.
mkeey
Posts: 199
Joined: Mon Sep 25, 2017 11:13 am

Re: Notifications not going out.

Post by mkeey »

Some clarification, please, as I'm not an SMTP person: what is the "relay" server?

Is that the host that Nagios XI runs on? Or is it our company's email server (like MS Exchange)? Or is it something else?
tgriep
Madmin
Posts: 9190
Joined: Thu Oct 30, 2014 9:02 am

Re: Notifications not going out.

Post by tgriep »

The relay server would be your company's email server, the Exchange server.
If you look in the log files that you sent me, look for relay=xxxxxxxxxxxxxxxxxx in the file. The xxxxxxxxxxxxxxxxxx is the name of the relay server that someone will have to look at.
After that line, you will see this line
status=sent (250 Message accepted for delivery)
That means the (relay, Exchange) server received it, so the message is being blocked at that server or beyond.
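
As a rough sketch of the check described above, the snippet below pulls the relay= and status= values out of Postfix delivery lines so you can see at a glance which relay accepted each message. The function name relay_summary is made up for illustration, and the /var/log/maillog path in the usage comment is an assumption that varies by distribution.

```shell
#!/bin/sh
# Hypothetical helper: read Postfix log lines on stdin and print
# "<relay> <status>" for every line that records a delivery attempt.
# Lines without relay=... and status=... (e.g. "removed") are skipped.
relay_summary() {
    sed -n 's/.* relay=\([^,]*\),.* status=\([a-z]*\).*/\1 \2/p'
}

# Usage (log path is an assumption, adjust for your system):
#   relay_summary < /var/log/maillog | sort | uniq -c
```

A status of sent next to the relay name means the handoff to that relay succeeded, so any loss would have to happen on the relay or further downstream.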
Be sure to check out our Knowledgebase for helpful articles and solutions!
mkeey
Posts: 199
Joined: Mon Sep 25, 2017 11:13 am

Re: Notifications not going out.

Post by mkeey »

Sent a private message indicating I don't see what you see.
tgriep
Madmin
Posts: 9190
Joined: Thu Oct 30, 2014 9:02 am

Re: Notifications not going out.

Post by tgriep »

Don't bother looking for the entry in the log files when the test email is sent. The test email function is a separate function from the Notification commands so the entries will not show up in the same logs.
The entries in the phpmailer.log file do show that it was sent to the (relay) SMTP server.

To troubleshoot the Test Email button, I need to see all of the settings in the Admin > Manage Email Settings menu, but I suspect it is the same situation: the email was sent to the (relay) SMTP server, but that server did not forward it on.
Make sure the from address is set in the Manage Email Settings menu.
mkeey
Posts: 199
Joined: Mon Sep 25, 2017 11:13 am

Re: Notifications not going out.

Post by mkeey »

Now, I'm totally confused.

Not really worried about the test button. I'm trying to focus on the notification issue. I do not see any entries in the maillog file for any of the notifications shown in our Nagios XI Event Log.

These notifications occurred according to the Event Log...
SERVICE NOTIFICATION: USER01;viants06;Drive C: Disk Usage;CRITICAL;xi_service_notification_handler;CRITICAL - Socket timeout
SERVICE NOTIFICATION: USER02;viants06;Drive C: Disk Usage;CRITICAL;xi_service_notification_handler;CRITICAL - Socket timeout

When I look in the maillog for that XI, I do not see any emails going to either of those users.

I do see emails from some other software and those do have the RELAY messages...

Apr 18 00:49:31 XISERVER postfix/smtp[7971]: B0BABC000FE: to=<[email protected]>, orig_to=<root>, relay=mlwsmtpserver[REMOVEDIP]:25, delay=0.61, delays=0/0.01/0.37/0.22, dsn=2.0.0, status=sent (250 Message accepted for delivery)
Apr 18 00:49:31 XISERVER postfix/smtp[7971]: B0BABC000FE: to=<[email protected]>, orig_to=<root>, relay=mlwsmtpserver[REMOVEDIP]:25, delay=0.61, delays=0/0.01/0.37/0.22, dsn=2.0.0, status=sent (250 Message accepted for delivery)
Apr 18 00:49:31 XISERVER postfix/smtp[7971]: B0BABC000FE: to=<[email protected]>, orig_to=<root>, relay=mlwsmtpserver[REMOVEDIP]:25, delay=0.61, delays=0/0.01/0.37/0.22, dsn=2.0.0, status=sent (250 Message accepted for delivery)
Apr 18 00:49:31 XISERVER postfix/smtp[7971]: B0BABC000FE: to=<[email protected]>, orig_to=<root>, relay=mlwsmtpserver[REMOVEDIP]:25, delay=0.61, delays=0/0.01/0.37/0.22, dsn=2.0.0, status=sent (250 Message accepted for delivery)
Apr 18 00:49:31 XISERVER postfix/qmgr[2095]: B0BABC000FE: removed

But, they are not related to this case.

So, again, what makes you say that Nagios is actually passing these Notifications to the Linux OS so that it can be sent to the Relay Server?

As for your other question regarding email settings: yes, all three XIs in question have exactly the same setup. The only difference is the From address, which changes so we can distinguish one XI's emails from another's.

Thanks.
tgriep
Madmin
Posts: 9190
Joined: Thu Oct 30, 2014 9:02 am

Re: Notifications not going out.

Post by tgriep »

OK, I see what you are seeing now.
On the server that ends in 22, I see a big gap in the maillog file. Nothing was logged between Apr 18 00:50:04 and Apr 18 23:32:10, which is when the notifications were generated.
The server that ends in 25 has an even bigger gap in the maillog file at that time on April 18th.
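
One quick way to spot a gap like this is to count log lines per hour; hours with no mail activity simply do not appear in the output. This is only a sketch: the function name lines_per_hour is invented here, and it assumes the traditional syslog timestamp layout (month, day, HH:MM:SS) seen in the maillog excerpts in this thread.

```shell
#!/bin/sh
# Hypothetical helper: read syslog-style lines on stdin and print how many
# lines were logged in each hour. Missing hours indicate a gap in logging.
# Assumes input is already in chronological order, as a log file normally is.
lines_per_hour() {
    awk '{ print $1, $2, substr($3, 1, 2) ":00" }' | uniq -c
}

# Usage (log path is an assumption, adjust for your system):
#   lines_per_hour < /var/log/maillog
```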

Go to the /usr/local/nagiosxi/var folder on the bad XI servers.
That is the folder where XI logs the data when it is running the various processes.
Take a look in the eventman.log file around the time the notification was generated; you might find the error there.
If not, check the other .log files in that folder.
mkeey
Posts: 199
Joined: Mon Sep 25, 2017 11:13 am

Re: Notifications not going out.

Post by mkeey »

I can certainly do that, but it does not explain the failure from today.

Plus, I would still like an answer about the Nagios XI database. Would doing a repair on the DB possibly fix this problem?

I think I may have answered my own question. I checked the eventman.log on the server ending in 25 and it shows this...

[nagios@mlwnag25]:[/usr/local/nagiosxi/var]# tail 100 eventman.log
tail: cannot open ‘100’ for reading: No such file or directory
==> eventman.log <==
. <p><pre>SQL Error [nagiosxi] : Table './nagiosxi/xi_events' is marked as crashed and last (automatic?) repair failed</pre></p>
. <p><pre>SQL Error [nagiosxi] : Table './nagiosxi/xi_events' is marked as crashed and last (automatic?) repair failed</pre></p>
. <p><pre>SQL Error [nagiosxi] : Table './nagiosxi/xi_events' is marked as crashed and last (automatic?) repair failed</pre></p>
. <p><pre>SQL Error [nagiosxi] : Table './nagiosxi/xi_events' is marked as crashed and last (automatic?) repair failed</pre></p>
. <p><pre>SQL Error [nagiosxi] : Table './nagiosxi/xi_events' is marked as crashed and last (automatic?) repair failed</pre></p>
. <p><pre>SQL Error [nagiosxi] : Table './nagiosxi/xi_events' is marked as crashed and last (automatic?) repair failed</pre></p>
. <p><pre>SQL Error [nagiosxi] : Table './nagiosxi/xi_events' is marked as crashed and last (automatic?) repair failed</pre></p>
. <p><pre>SQL Error [nagiosxi] : Table './nagiosxi/xi_events' is marked as crashed and last (automatic?) repair failed</pre></p>
. <p><pre>SQL Error [nagiosxi] : Table './nagiosxi/xi_events' is marked as crashed and last (automatic?) repair failed</pre></p>

May I assume we need to run the repair process?
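
A side note on the output above: the "cannot open '100'" message appears because tail treats 100 as a file name; the line count needs the -n option (tail -n 100 eventman.log). As a small sketch, the helper below lists which tables the log reports as crashed. The function name crashed_tables is hypothetical, and the pattern is based only on the error text shown in this post.

```shell
#!/bin/sh
# Hypothetical helper: read eventman.log lines on stdin and print each
# distinct table name that is reported as "marked as crashed".
crashed_tables() {
    sed -n "s/.*Table '\([^']*\)' is marked as crashed.*/\1/p" | sort -u
}

# Usage, run from /usr/local/nagiosxi/var:
#   tail -n 100 eventman.log | crashed_tables
```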
tgriep
Madmin
Posts: 9190
Joined: Thu Oct 30, 2014 9:02 am

Re: Notifications not going out.

Post by tgriep »

Yes, that is probably it: corrupt MySQL tables caused the emails not to be sent.

Run the following commands to stop the processes, truncate some tables, repair the tables and start up the processes.
Run them all as root.

service nagios stop
service ndo2db stop
killall -9 nagios
echo "truncate table xi_events; truncate table xi_meta; truncate table xi_eventqueue;" | mysql -u root -pnagiosxi nagiosxi
mysqlcheck -f -r -u root -pnagiosxi --all-databases --use-frm
service ndo2db start
service nagios start
Run it on both of the servers.
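
After the repair, it may be worth confirming that the three tables named above now pass a check. The sketch below wraps mysqlcheck's --check option in a dry-run helper so the command can be reviewed before it is executed. The run and verify_tables names and the DRY_RUN variable are made up for illustration, and -p is used without a value so the password is prompted for rather than typed on the command line.

```shell
#!/bin/sh
# Hypothetical dry-run wrapper: print the command unless DRY_RUN=0 is set.
run() {
    if [ "${DRY_RUN:-1}" = "1" ]; then
        echo "+ $*"
    else
        "$@"
    fi
}

# Check (not repair) the tables that the steps above truncate.
# Database and table names are taken from the thread.
verify_tables() {
    run mysqlcheck --check -u root -p nagiosxi xi_events xi_meta xi_eventqueue
}

# Usage:
#   verify_tables              # prints the command only
#   DRY_RUN=0 verify_tables    # actually runs mysqlcheck
```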
mkeey
Posts: 199
Joined: Mon Sep 25, 2017 11:13 am

Re: Notifications not going out.

Post by mkeey »

We did this process (found it works best after the SQL Error case on 4/15)...

1) systemctl stop mariadb
2) /usr/local/nagiosxi/scripts/repair_databases.sh
3) systemctl start mariadb

I then deleted the ndo.sock file and rebooted.
Server came up clean and I forced a host failure (bad IP address).
Emails were sent and received by all recipients!

Doing the same process on the other "bad" server now. On that one it did not fix the issue, and we're now getting an error in the eventman.log file for the 22 server...

PHP Warning: file_put_contents(/usr/local/nagiosxi/tmp/phpmailer.log): failed to open stream:
Permission denied in /usr/local/nagiosxi/html/includes/utils-email.inc.php on line 160
User has mobile text notifications disabled.
tgriep
Madmin
Posts: 9190
Joined: Thu Oct 30, 2014 9:02 am

Re: Notifications not going out.

Post by tgriep »

If email debugging was not enabled on the server, then that file probably does not exist, so let's create it just to suppress the errors.
Run this as root.

touch /usr/local/nagiosxi/tmp/phpmailer.log
chown nagios:nagios /usr/local/nagiosxi/tmp/phpmailer.log
chmod 774 /usr/local/nagiosxi/tmp/phpmailer.log
That should stop that error.
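
To confirm the fix took effect, a small check like the one below can report whether the file exists and is writable. This is a sketch only: check_log_writable is a made-up name, and the result is meaningful only when run as the account that writes the log (assumed here to be nagios, going by the ownership set above), since root can write to almost anything.

```shell
#!/bin/sh
# Hypothetical helper: report whether the given path exists and is
# writable by the user running this function.
check_log_writable() {
    if [ ! -e "$1" ]; then
        echo "missing"
    elif [ ! -w "$1" ]; then
        echo "not writable"
    else
        echo "ok"
    fi
}

# Usage (run as the nagios user, not as root):
#   check_log_writable /usr/local/nagiosxi/tmp/phpmailer.log
```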
Re-run the repair on it and make sure it runs to completion.
The commands I posted earlier do an extended repair and rebuild the table structures, so run those instead.