Nagios XI clean up old alerts
Posted: Tue Feb 05, 2019 4:09 pm
Running Nagios XI 5.2.9 on CentOS Linux release 7.3.1611.
If the sendmail service is stopped for weeks, all the alerts generated during that time remain queued somewhere.
When we restarted the sendmail service (systemctl start sendmail), people received thousands of old alerts within 2 hours,
before we stopped the sendmail service again.
How can we clean up the old alerts before restarting the sendmail service?
We have tried the following steps, but we still see thousands of files being generated in /var/spool/mqueue.
- Set the mail server to a dummy server or localhost in sendmail.cf. We cannot set it to the real mail server because thousands of emails would be sent.
FallbackSmartHost=fakedserver.company.com
- Stop the Nagios Monitoring Engine, Performance Grapher, and Database Backend.
- Run the database repair script many times.
- Run a shell script to delete all files in /var/spool/mqueue over and over.
- Restart all processes from Monitoring Engine Status.
- Recreate the host that the notifications mentioned but that had been deleted.
- Delete any suspect records in the nagios_tables like nagios.nagios_contact_notificationcommands, nagios.nagios_contactnotifications, nagios.nagios_contactnotificationmethods, etc.
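For reference, the queue cleanup step can be sketched roughly as below. This is only a minimal sketch, not a confirmed fix: purge_queue is a hypothetical helper name, the age threshold is an assumption, and it relies on GNU find (as shipped with CentOS). It should only be run while sendmail is stopped.

```shell
#!/bin/sh
# Hypothetical helper (not part of Nagios XI or sendmail):
#   purge_queue DIR [MIN_AGE_DAYS]
# Deletes regular files under DIR. With MIN_AGE_DAYS > 0, only files
# last modified at least that many days ago are removed.
purge_queue() {
    queue_dir=$1
    min_age_days=${2:-0}

    # Refuse to run against a missing directory.
    if [ ! -d "$queue_dir" ]; then
        echo "no such directory: $queue_dir" >&2
        return 1
    fi

    if [ "$min_age_days" -gt 0 ]; then
        # -mtime +N matches files older than N full 24-hour periods,
        # so +$((days - 1)) means "at least <days> days old".
        find "$queue_dir" -type f -mtime +"$((min_age_days - 1))" -print -delete
    else
        # No age limit: remove every queued file.
        find "$queue_dir" -type f -print -delete
    fi
}
```

Example use, run as root with sendmail stopped: purge_queue /var/spool/mqueue 1 would remove queued files that are at least one day old. If the installation also has a separate submission queue directory such as /var/spool/clientmqueue, the same call could be pointed at that directory.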
I don't see any suspect records in either the nagiosql or the nagiosxi database. nagiosxi.xi_events:event_time is up to date,
nagiosxi.xi_eventqueue is empty, and the messages in nagiosxi.xi_meta are up to date.
We want to monitor the hosts from now on and don't want to receive old alerts. We can't let notifications actually go out before we clean up
those old alerts.
Any suggestions will be appreciated, thanks.