Nagios XI sending very delayed emails after network outage.

yo_marc
Posts: 83
Joined: Thu Aug 11, 2016 1:56 pm

Post by yo_marc »

Hi all - I really need help with this one.

I have an XI 5.5.8 server here that is still sending very delayed emails long after a network outage was resolved (over 5 hours ago).

The monitoring state looks good: none of the emails coming through now match what Nagios XI currently shows.

I am sending email via an SMTP config in the GUI. It seems the only way I can stop these old/stale emails from going out is to put bogus info into that config. I do not see any mail sitting on the server locally.

Can anyone please tell me how I can stop these old/stale notifications from being sent? I've got hundreds of users, and the emails are confusing.

Rebooting did not help...

Post by yo_marc »

I am looking under the covers of 'eventman'. The eventman.log is churning away, apparently sending email for events that happened 9 hours ago.

Running the following query (borrowed and adapted from eventman.php), I am seeing over 50,000 rows returned...

Code:

SELECT * FROM nagiosxi.xi_events WHERE (status_code='0' AND event_time<=NOW()) OR (status_code='".escape_sql_param(EVENTSTATUS_PROCESSING,DB_NAGIOSXI)."'
 AND processing_time + INTERVAL 1 MINUTE <= NOW()) ORDER BY event_id ASC;

CentOS Linux release 7.6.1810 (Core)
XI 5.5.8
mariadb-5.5.60-1.el7_5.x86_64
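
Pulling every row with SELECT * is a heavy way to gauge a backlog of this size; a count grouped by status_code gives the same picture more cheaply. The sketch below is only a stand-in: it uses Python's built-in sqlite3 with an in-memory table that copies just the column names visible in the query above (event_id, status_code, event_time), not the real MariaDB schema. Status 0 is treated as "waiting" because the query above selects on it; the other status value and all rows are made up for illustration.

```python
import sqlite3


def backlog_by_status(conn, now):
    """Count events with event_time at or before `now`, grouped by status_code."""
    rows = conn.execute(
        "SELECT status_code, COUNT(*) FROM xi_events "
        "WHERE event_time <= ? GROUP BY status_code ORDER BY status_code",
        (now,),
    ).fetchall()
    return dict(rows)


def make_demo_db():
    """Build a throwaway in-memory table (hypothetical data, not the XI schema)."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE xi_events ("
        "event_id INTEGER PRIMARY KEY, status_code INTEGER, event_time TEXT)"
    )
    conn.executemany(
        "INSERT INTO xi_events (status_code, event_time) VALUES (?, ?)",
        [
            (0, "2019-03-12 22:39:28"),
            (0, "2019-03-12 22:40:05"),
            (2, "2019-03-12 22:41:00"),
        ],
    )
    return conn


if __name__ == "__main__":
    conn = make_demo_db()
    # Timestamps are compared as "YYYY-MM-DD HH:MM:SS" strings, which sort correctly.
    print(backlog_by_status(conn, "2019-03-13 08:00:00"))
```

On the real server the same GROUP BY could presumably be run through the mysql client against nagiosxi.xi_events, but that has not been checked here.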

Post by yo_marc »

Seeing lots of SNMP entries like this in the eventman.log, FWIW:

Code:

*** GLOBAL HANDLER (snmptrapsender)...
Array
(
    [event_id] => 3373480
    [event_source] => 2
    [event_type] => 1
    [event_time] => 2019-03-12 22:39:28
    [event_meta] => Array
        (
            [handler-type] => service
            [host] => <hostname>
            [service] => Puppet-Agent
            [hostaddress] => <IP address>
            [hoststate] => UP
            [hoststateid] => 0
            [hosteventid] => 1605924
            [hostproblemid] => 0
            [servicestate] => CRITICAL
            [servicestateid] => 2
            [lastservicestate] => OK
            [lastservicestateid] => 0
            [servicestatetype] => SOFT
            [currentattempt] => 1
            [maxattempts] => 5
            [serviceeventid] => 1605927
            [serviceproblemid] => 703058
            [serviceoutput] => CHECK_NRPE: Socket timeout after 10 seconds.
            [longserviceoutput] =>
            [servicedowntime] => 0
        )

    [logging_enabled] => 1
)
SNMP TRAP SENDER NOT CONFIGURED!

Not sure what those are about, or whether they are of any importance.
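
One thing dumps like this are useful for is measuring how far behind the event processor is running, since each one carries an [event_time] field. The following is a rough sketch that pulls those timestamps out of eventman.log-style text and reports their age. The function name, the trimmed sample text, and the "now" value are all assumptions for illustration; only the `[event_time] => YYYY-MM-DD HH:MM:SS` line format comes from the log excerpt above.

```python
import re
from datetime import datetime

# Matches lines such as "[event_time] => 2019-03-12 22:39:28"
EVENT_TIME_RE = re.compile(
    r"\[event_time\] => (\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2})"
)


def event_lags(log_text, now):
    """Return (event_time, age) pairs for every event_time found in log_text."""
    lags = []
    for match in EVENT_TIME_RE.finditer(log_text):
        event_time = datetime.strptime(match.group(1), "%Y-%m-%d %H:%M:%S")
        lags.append((event_time, now - event_time))
    return lags


# Trimmed-down copy of the dump format shown above
SAMPLE = """\
*** GLOBAL HANDLER (snmptrapsender)...
Array
(
    [event_id] => 3373480
    [event_time] => 2019-03-12 22:39:28
)
"""

if __name__ == "__main__":
    # `now` is a made-up timestamp for the demo, not a value from the thread
    now = datetime(2019, 3, 13, 7, 39, 28)
    for event_time, age in event_lags(SAMPLE, now):
        print(event_time, "is", age, "old")
```

If the ages keep growing while the log is churning, the processor is falling further behind rather than catching up.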

Post by yo_marc »

I have the event queue cleared out. I manually deleted the entries from the DB. (Desperate times... Desperate measures. Plan B was to restore from yesterday's backup to accomplish the same task.)

If anyone could give any insight into what those SNMP messages are about, that would be great. I would also appreciate any info on why our mail queue got so backed up: we had over 144,000 events waiting to be processed after a 3-hour network-flapping outage affecting about 250 hosts (corrected from the 100 originally posted).

Our server has about 800 hosts and 4,000 services, with "5, 1, 5" for check-interval, retry-interval, and max-check-attempts, respectively.
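
For a sense of scale, here is a back-of-the-envelope calculation using only the figures quoted in this post. It assumes services are spread evenly across hosts (4,000 / 800 = 5 per host) and that every service on an affected host was affected too; neither assumption is confirmed anywhere in the thread, and the function name is made up.

```python
def events_per_object(total_events, hosts_affected, total_hosts,
                      total_services, outage_minutes):
    """Estimate how many queued events each affected host/service produced."""
    # Assumption: services are evenly distributed across hosts
    services_per_host = total_services / total_hosts
    # Each affected host counts as one object plus its services
    affected_objects = hosts_affected * (1 + services_per_host)
    per_object = total_events / affected_objects
    minutes_between = outage_minutes / per_object
    return affected_objects, per_object, minutes_between


if __name__ == "__main__":
    objects, per_object, gap = events_per_object(
        total_events=144_000,   # "over 144,000 events"
        hosts_affected=250,
        total_hosts=800,
        total_services=4_000,
        outage_minutes=180,     # 3-hour outage
    )
    print(f"{objects:.0f} affected objects, "
          f"{per_object:.0f} events each, "
          f"one every {gap:.2f} minutes")
```

Under those assumptions that works out to roughly one queued event per affected object every two minutes for the whole outage. That would be in line with events being queued on retries and soft state changes (the log dump earlier shows a SOFT state at attempt 1 reaching the global handler), though the thread does not confirm that this is the cause.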
npolovenko
Support Tech
Posts: 3457
Joined: Mon May 15, 2017 5:00 pm

Post by npolovenko »

Hello, @yo_marc. One thing you can do next time to stop spooled email notifications is to run the following command to clear out the mailing queue:

echo "truncate table xi_events; truncate table xi_meta; truncate table xi_eventqueue;" | mysql -uroot -pnagiosxi nagiosxi

Your global SNMP trap sender was attempting to send SNMP traps for critical service check results, but it looks like it is not fully configured on your system. If you want to disable it, go to the components menu, click on settings, and check the box to disable SNMP trap sender integration.
As of May 25th, 2018, all communications with Nagios Enterprises and its employees are covered under our new Privacy Policy.

Post by yo_marc »

Perfect! Thank you very much for the help.

Post by yo_marc »

Strangely, I don't have the SNMP Trap Sender enabled...
I tried enabling it, then disabling it, but I am still seeing those same messages in the eventman.log.

I did notice that with it 'enabled', the last line below (SNMP TRAP SENDER NOT ENABLED!) was absent.

Code:

*** GLOBAL HANDLER (snmptrapsender)...
Array
(
    [event_id] => 3527999
    [event_source] => 2
    [event_type] => 1
    [event_time] => 2019-03-14 11:44:31
    [event_meta] => Array
        (
            [handler-type] => service
            [host] => <host>
            [service] => Scheduled Tasks: Task Scheduler Library
            [hostaddress] => <host> 
            [hoststate] => UP
            [hoststateid] => 0
            [hosteventid] => 1693738
            [hostproblemid] => 0
            [servicestate] => CRITICAL
            [servicestateid] => 2
            [lastservicestate] => CRITICAL
            [lastservicestateid] => 2
            [servicestatetype] => SOFT
            [currentattempt] => 5
            [maxattempts] => 5
            [serviceeventid] => 1697027
            [serviceproblemid] => 747958
            [serviceoutput] => 2 / 23 tasks failed! <info>
            [longserviceoutput] =>
            [servicedowntime] => 0
        )

    [logging_enabled] => 1
)
SNMP TRAP SENDER NOT ENABLED! VALUE='0'

Is there something wrong with the Component, perhaps?

Post by npolovenko »

@yo_marc, can you check the global event handlers component? It should be in the same menu. Make sure you don't have any global event handlers enabled.

Post by yo_marc »

Nothing is enabled there either...

I did go ahead and remove the 'SNMP Trap Sender' component from a test system, and that did remove those SNMP-specific entries from the eventman.log.

I don't think we plan on using that component, so I am probably OK with that 'fix'. But just to be sure it's necessary: in an "alert storm" such as the one we experienced, I assume there will be some performance hit if we are unnecessarily trying to process SNMP forwards... is that correct?

Post by npolovenko »

@yo_marc, I spoke to my colleagues and was told that these event messages are normal. The log will say "SNMP TRAP SENDER NOT ENABLED" even if you don't use the component, and there is no way to disable these messages. Running the command I provided should clear out the event queue. You can use it when something major goes wrong and there are lots of notifications.