Best Practices for Managing Historical Data in MariaDB (ndoutils) - Reducing Storage Usage in /var/lib/mysql/nagios

This support forum board is for support questions relating to Nagios XI, our flagship commercial network monitoring solution.
Post Reply
igae1
Posts: 57
Joined: Thu May 25, 2017 6:35 am

Best Practices for Managing Historical Data in MariaDB (ndoutils) - Reducing Storage Usage in /var/lib/mysql/nagios

Post by igae1 »

Hello everyone,

I've been noticing a substantial increase in storage usage in the /var directory, specifically in /var/lib/mysql/nagios, which is currently occupying nearly 5GB. It appears that historical data is taking up the most space, with specific files standing out:

Code: Select all

250M    ./nagios_downtimehistory.MYI
286M    ./nagios_statehistory.MYI
407M    ./nagios_commenthistory.MYI
424M    ./nagios_downtimehistory.MYD
1011M   ./nagios_statehistory.MYD
1.4G    ./nagios_commenthistory.MYD
I'm looking for guidance on best practices to manage these historical data files. Ideally, I’d like to store only the last year's worth of data.

Additionally, I’d like to know if it's safe to clean up some older data manually. I noticed we have a high volume of records in nagios_downtimehistory:

Code: Select all

MariaDB [nagios]> SELECT COUNT(*) FROM nagios_downtimehistory;
+----------+
| COUNT(*) |
+----------+
|  3617508 |
+----------+
There’s also evidence of downtimes that may have been created incorrectly:

Code: Select all

MariaDB [nagios]> SELECT MIN(actual_start_time) AS oldest_downtime, MAX(actual_end_time) AS newest_downtime FROM nagios_downtimehistory;
+---------------------+---------------------+
| oldest_downtime     | newest_downtime     |
+---------------------+---------------------+
| 0000-00-00 00:00:00 | 2024-10-30 08:00:00 |
+---------------------+---------------------+
Here’s another query identifying some of the records I’m considering deleting:

Code: Select all

MariaDB [nagios]> SELECT COUNT(*) FROM nagios_downtimehistory WHERE was_started = '0' AND actual_end_time < DATE_SUB(NOW(), INTERVAL 1 MONTH);
+----------+
| COUNT(*) |
+----------+
|    23675 |
+----------+
Questions:
  1. Are the following commands safe to execute in order to delete older, incomplete downtime records and free up space?

    Code: Select all

    DELETE FROM nagios_downtimehistory WHERE was_started = 0 AND actual_end_time < DATE_SUB(NOW(), INTERVAL 1 YEAR);
    OPTIMIZE TABLE nagios_downtimehistory;
  2. Is there a configuration setting in Nagios that can limit the storage of downtimes, comments, and other historical data to only the past year? This would help keep the database lean without manually pruning data periodically.
Any advice or recommendations on best practices for managing and cleaning historical data in MariaDB for Nagios would be greatly appreciated. Thanks in advance!
jsimon
Posts: 343
Joined: Wed Aug 23, 2023 11:27 am

Re: Best Practices for Managing Historical Data in MariaDB (ndoutils) - Reducing Storage Usage in /var/lib/mysql/nagios

Post by jsimon »

Hi igae1,

Take a look at Admin > System Config > Performance Settings > Database. There are settings in here that control the max age of each of those tables, among others. These control a cron script that will clear out old records to your preferences. Hopefully this meets your needs, but let us know if you have any issues with this.
igae1
Posts: 57
Joined: Thu May 25, 2017 6:35 am

Re: Best Practices for Managing Historical Data in MariaDB (ndoutils) - Reducing Storage Usage in /var/lib/mysql/nagios

Post by igae1 »

Hi,

Thanks for the guidance! I went to Admin > System Config > Performance Settings > Database and found some options, including one for nagios_commenthistory (Max Comment History Age set to 180 days) but I didn't see any specific setting for nagios_downtimehistory

Code: Select all

[Nagios XI Database]

Max Commands Age:                 480  (Max time in minutes to keep commands.)
Max Events Age:                   480  (Max time in minutes to keep events.)
Max SNMP Trap Age:                 90  (Max time in DAYS to store SNMP trap data in the database.)
Max Scheduled Reports History Age: 365 (Max time in DAYS to store scheduled report log data.)
Max Expired Auth Token Age:         24 (Max time in HOURS to store expired auth tokens in the database.)
Max Expired Session Age:            24 (Max time in HOURS to store expired (no longer active) session data in the database.)
Max Audit Log Age:                  90 (Max time in DAYS to keep audit log entries.)
Optimize Interval:                  60 (Max time in minutes between optimization executions.)

[NDO Database]

Max External Commands Age:           7  (Max time in DAYS to keep external commands.)
Max Log Entries Age:                30  (Max time in DAYS to keep log entries.)
Max Notifications Age:              30  (Max time in DAYS to keep notifications.)
Max State History Age:             180  (Max time in DAYS to keep state history information.)
Max Timed Events Age:                5  (Max time in minutes to keep timed events.)
Max System Commands Age:             5  (Max time in minutes to keep system commands.)
Max Service Checks Age:              5  (Max time in minutes to keep service checks.)
Max Host Checks Age:                 5  (Max time in minutes to keep host checks.)
Max Event Handlers Age:              5  (Max time in minutes to keep event handlers.)
Max Comment History Age:           180  (Max time in DAYS to keep comment history.)
Optimize Interval:                  60  (Max time in minutes between optimization executions.)

[CCM Database]

Max Logbook Age:                 10080  (Max time in minutes to keep logbook records.)
Optimize Interval:                  60  (Max time in minutes between optimization executions.)
Here’s what I’m seeing in table nagios_commenthistory, looks like it's being managed by entry_time, so it seems to be working as expected.

Code: Select all

MariaDB [nagios]> SELECT MIN(entry_time) AS oldest_entry_time,
                         MAX(entry_time) AS newest_entry_time, 
                         MIN(comment_time) AS oldest_comment_time,
                         MAX(comment_time) AS newest_comment_time 
                  FROM nagios_commenthistory;

+---------------------+---------------------+---------------------+---------------------+
| oldest_entry_time   | newest_entry_time   | oldest_comment_time | newest_comment_time |
+---------------------+---------------------+---------------------+---------------------+
| 2024-05-04 10:36:41 | 2024-10-31 07:18:01 | 2017-10-10 13:43:22 | 2024-10-31 07:18:01 |
+---------------------+---------------------+---------------------+---------------------+
1 row in set (3.21 sec)
Is there any additional configuration required for downtimehistory to automatically clear out older entries? Alternatively, if there’s a safe way to delete these entries manually, I’d appreciate any advice on how to proceed.

Thanks again for the help!
jsimon
Posts: 343
Joined: Wed Aug 23, 2023 11:27 am

Re: Best Practices for Managing Historical Data in MariaDB (ndoutils) - Reducing Storage Usage in /var/lib/mysql/nagios

Post by jsimon »

What version of Nagios XI are you running? I mostly ask because I see "Max Downtime History Age" as the next option after "Max Comment History Age" on my own server.

For reference, this is the command most recently run in my dbmaint.log, if you're looking for a template to match against for a safe command to manually prune:

Code: Select all

SQL: DELETE FROM nagios_downtimehistory WHERE entry_time < FROM_UNIXTIME(1667140512)
I don't see any particular issues in modifying this command with "was_started" to be more specific if there is older history you are looking to preserve.
igae1
Posts: 57
Joined: Thu May 25, 2017 6:35 am

Re: Best Practices for Managing Historical Data in MariaDB (ndoutils) - Reducing Storage Usage in /var/lib/mysql/nagios

Post by igae1 »

Thanks jsimon,

Our Nagios version is 5.8.9 (yes, it's a bit outdated).

So, good to know that there is no problem in deleting this data directly.
Post Reply