"Expiration date" of Nagios Xi log messages?

KDA
Posts: 16
Joined: Wed Aug 11, 2021 1:28 am

"Expiration date" of Nagios Xi log messages?

Post by KDA »

In our Nagios XI setup we want to "keep received SNMP status (poll results) for X days". I am a little confused about what is being logged where (we do not currently have Nagios Log Server), but I assume that this log is where the actual poll results are stored:

/usr/local/nagios/var/nagios.log (we massage the entries in this file with a simple script and thus make them look sufficiently nice)

But here is where the confusion sets in:

I read somewhere that a new log is created daily, which does not seem to be the case, so what I read must be referring to another log file (or files)?

If not, is this file just appended to until... when?

Is there a way to set the "retention time" for entries in this file? In the GUI, the Performance Settings > Database tab lets you set data retention times for various databases, but I do not understand which of the settings there apply to this file.

Or do we have to make a tool that monitors the file and deletes entries older than X days?
dchurch
Posts: 858
Joined: Wed Oct 07, 2020 12:46 pm
Location: Yo mama

Re: "Expiration date" of Nagios Xi log messages?

Post by dchurch »

/usr/local/nagios/var/nagios.log is the Nagios monitoring engine internal process's log. It's "rotated" away (i.e. truncated) when it gets to a certain size (5M).
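For anyone post-processing that file with a script: each line in nagios.log starts with a Unix timestamp in square brackets, so a script can filter entries by age without modifying the file itself. A minimal sketch, assuming that line format; the function name and the day count are illustrative only and not part of Nagios XI:

```python
import re
import time

# Log lines look like "[1628665680] SERVICE ALERT: ..."; the number is a Unix epoch.
LINE_RE = re.compile(r"^\[(\d+)\] (.*)$")


def recent_entries(lines, max_age_days, now=None):
    """Return (timestamp, message) pairs that are no older than max_age_days."""
    now = time.time() if now is None else now
    cutoff = now - max_age_days * 86400
    kept = []
    for line in lines:
        match = LINE_RE.match(line.rstrip("\n"))
        # Lines that do not match the expected format are skipped.
        if match and int(match.group(1)) >= cutoff:
            kept.append((int(match.group(1)), match.group(2)))
    return kept
```

It could be called as recent_entries(open("/usr/local/nagios/var/nagios.log"), 30) to keep only the last 30 days in the massaged output.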

The SNMP trap retention logs are stored in the database.

You can change the time SNMP logs are stored in the database by going to Admin => Performance Settings => Database.

Please see this FAQ below for some good information regarding retention/cleanup:

---

FAQ: Can I truncate the tables first before proceeding with database repair (if I have crashed tables)?

You can truncate before repairing the DB, it's up to you. If you want to back it up first, you'll need to repair it. If you don't care, or already have a backup, truncate it first as it will speed up the DB repair process.

NOTE: You may need to adjust the -h 127.0.0.1, the -uroot, and -pnagiosxi in the commands if your DB is housed/stored/offloaded/contained on a different server and/or you've changed the root mysql password

If you don't care about the data, or already have a backup, you can just truncate the tables which will essentially drop and recreate the table with zero data in it (removing all historical data for the respective reports):

nagios_logentries - Impacts Event Log report length

mysql -uroot -pnagiosxi -h 127.0.0.1 -B nagios -e 'truncate table nagios_logentries;'

nagios_statehistory - Impacts the State History report length

mysql -uroot -pnagiosxi -h 127.0.0.1 -B nagios -e 'truncate table nagios_statehistory;'

nagios_commenthistory - Impacts the comment history

mysql -uroot -pnagiosxi -h 127.0.0.1 -B nagios -e 'truncate table nagios_commenthistory;'

These should technically work to clean the DB tables up manually, provided the tables aren't crashed. If they ARE crashed, you will need to repair the database FIRST in order to run these queries:

nagios_logentries - Impacts Event Log report length

mysql -uroot -pnagiosxi -h 127.0.0.1 -B nagios -e 'DELETE FROM nagios_logentries WHERE logentry_time <= (NOW() - INTERVAL 6 MONTH);'

nagios_statehistory - Impacts the State History report length

mysql -uroot -pnagiosxi -h 127.0.0.1 -B nagios -e 'DELETE FROM nagios_statehistory WHERE state_time <= (NOW() - INTERVAL 6 MONTH);'

nagios_commenthistory - Impacts the comment history

mysql -uroot -pnagiosxi -h 127.0.0.1 -B nagios -e 'DELETE FROM nagios_commenthistory WHERE entry_time <= (NOW() - INTERVAL 6 MONTH);'

Then you should go to Admin > Performance Settings > Databases tab and adjust ALL of the retention intervals to meet your business data policy. These settings control the retention on those DB tables, so they will keep the tables cleaned up from then on.
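
The DELETE queries above hard-code a six-month window. If the requirement is "X days", the same statements can be generated for any day count. A rough sketch only: the table and column names are the ones from the queries above, the helper name is made up for illustration, and the output would still be run through the mysql client as shown earlier:

```python
# Table -> timestamp column, taken from the DELETE queries quoted above.
TABLES = {
    "nagios_logentries": "logentry_time",
    "nagios_statehistory": "state_time",
    "nagios_commenthistory": "entry_time",
}


def cleanup_statements(retention_days):
    """Build one DELETE statement per table for the given retention window."""
    days = int(retention_days)  # int() keeps non-numeric input out of the SQL text
    if days < 1:
        raise ValueError("retention_days must be at least 1")
    return [
        f"DELETE FROM {table} WHERE {column} <= (NOW() - INTERVAL {days} DAY);"
        for table, column in TABLES.items()
    ]


if __name__ == "__main__":
    # Print the statements for a hypothetical 90-day policy.
    for statement in cleanup_statements(90):
        print(statement)
```

This only prints the SQL; it does not connect to the database, so the statements can be reviewed before anything is deleted.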

I would lower them to the smallest possible level and utilize the XI backup/restore process and the Admin > Scheduled Backups process to offload the backups to another server. Since these XI backups contain database backups you can spin them up to grab the data and report on them if needed.

See here for more information:

https://assets.nagios.com/downloads/nag ... ios-XI.pdf

And here:

https://assets.nagios.com/downloads/nag ... tabase.pdf
If you didn't get an 8% raise over the course of the pandemic, you took a pay cut.

Discussion of wages is protected speech under the National Labor Relations Act, and no employer can tell you you can't disclose your pay with your fellow employees.
KDA
Posts: 16
Joined: Wed Aug 11, 2021 1:28 am

Re: "Expiration date" of Nagios Xi log messages?

Post by KDA »

Thank you for an elaborate response. I don't use SNMP traps, however; I only poll, and what I am interested in is simply a history of SNMP poll results.

Such a history of poll results seems to be available in "Performance Graphs, View Host History", filterable by host/service/period and whatnot.

(These are the logs you refer to as SNMP logs? Or are there others?)

>You can change the time SNMP logs are stored in the database by going to Admin => Performance Settings => Database.

For the history I refer to above, which parameter in Admin => Performance Settings => Database affects the retention time? I don't find any that seem to match. And in general, which log is affected by which parameter is a bit vague to me; is there a document somewhere that could help me, perhaps?
dchurch
Posts: 858
Joined: Wed Oct 07, 2020 12:46 pm
Location: Yo mama

Re: "Expiration date" of Nagios Xi log messages?

Post by dchurch »

The poll results are instead referred to as "check results", since they're run as regular checks. Other than the latest check result, what is stored is more "bookkeeping" data, which is controlled by "Max Service Checks Age" and "Max Commands Age".

Are you experiencing a problem, or are you just concerned about data security?
KDA
Posts: 16
Joined: Wed Aug 11, 2021 1:28 am

Re: "Expiration date" of Nagios Xi log messages?

Post by KDA »

>Are you experiencing a problem, or are you just concerned about data security?

The "problem" I am experiencing is that I have a requirement to retain SNMP poll results (or checks) for X days, and I cannot figure out how to do it. I cannot seem to limit the retention time and I cannot seem to find "all" historical poll results; the parameters you supplied don't seem to have the effect I was hoping for.

I have made some screenshots in the attached document "SNMP poll history.doc", where you can see

a) the historical data I find
b) the retention settings

I find no correlation. The params you suggested are set to
Max Service Checks Age = 5 min
Max Commands Age = 480 min

... yet I retain poll checks "forever".

BUT, even though I poll the services every 120 s, I only find a very tiny fraction of these polls. Where are the rest? Do I have to issue mysql commands to get them? If so, do you have any examples? (I have googled and googled! :D )

Again: the majority of the terms used in the "Database settings" page are terms that I have not come across anywhere while playing around with Nagios... so they mean nothing to me. Is there a document that describes these terms somewhere?
ssax
Dreams In Code
Posts: 7682
Joined: Wed Feb 11, 2015 12:54 pm

Re: "Expiration date" of Nagios Xi log messages?

Post by ssax »

Not all of the check results are logged, in order to save space. You can enable state stalking to log more changes in Reports > State History, but it will never log every OK result (the results should still be stored in the graph data if the service is being graphed):

https://assets.nagios.com/downloads/nag ... lking.html

The only way I can think of (event handlers also only run on state changes so that wouldn't work) would be to use OCSP and log every result somewhere via a script that you'd write:

https://assets.nagios.com/downloads/nag ... sp_command

See the OCSP column here for what macros are available:

https://assets.nagios.com/downloads/nag ... olist.html

See page 4 here for defining commands:

https://assets.nagios.com/downloads/nag ... ios-XI.pdf

EDIT: If you do that you would need to make sure that log doesn't fill up your disk on the server
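
To make the OCSP idea a bit more concrete, here is a minimal sketch of such a script. It assumes Nagios Core's ocsp_command and obsess_over_services directives in nagios.cfg point at a command that passes $HOSTNAME$, $SERVICEDESC$, $SERVICESTATE$ and $SERVICEOUTPUT$ as four arguments. The script name, log path and rotation sizes are hypothetical, so adjust them to your setup. It uses a rotating file so the log cannot fill the disk:

```python
#!/usr/bin/env python3
import logging
import sys
from logging.handlers import RotatingFileHandler


def build_logger(path, max_bytes=5_000_000, backups=5):
    """Create a logger whose file is rotated so it cannot grow without bound."""
    logger = logging.getLogger("ocsp_results")
    logger.setLevel(logging.INFO)
    if not logger.handlers:
        handler = RotatingFileHandler(path, maxBytes=max_bytes, backupCount=backups)
        handler.setFormatter(logging.Formatter("%(asctime)s\t%(message)s"))
        logger.addHandler(handler)
    return logger


def format_result(host, service, state, output):
    """Join the macro values into one tab-separated line."""
    clean = output.replace("\t", " ").replace("\n", " ")
    return "\t".join([host, service, state, clean])


# Only log when called with exactly four macro values:
#   "$HOSTNAME$" "$SERVICEDESC$" "$SERVICESTATE$" "$SERVICEOUTPUT$"
if __name__ == "__main__" and len(sys.argv) == 5:
    log = build_logger("/var/log/nagios_check_results.log")  # hypothetical path
    log.info(format_result(*sys.argv[1:5]))
```

With max_bytes and backups as above, the log is capped at roughly 30 MB in total; how many days that covers depends on how many services you poll and how often, so size it against your "X days" requirement.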
KDA
Posts: 16
Joined: Wed Aug 11, 2021 1:28 am

Re: "Expiration date" of Nagios Xi log messages?

Post by KDA »

Thank you again, my personal fog seems to be lifting slowly :-) If the log shown when viewing Performance Graphs->View Host History indeed contains all state transitions, I am ok.

But I still have not been able to find out where and how to set the retention period. I cannot seem to map the parameters in Performance Settings->Database to this log; the names of the parameters tell me very little. And whatever I blindly try doesn't seem to do the trick.

What IS the setting for retaining the log displayed by Performance Graphs->View Host History for x days only?
ssax
Dreams In Code
Posts: 7682
Joined: Wed Feb 11, 2015 12:54 pm

Re: "Expiration date" of Nagios Xi log messages?

Post by ssax »

The performance graph data is stored in RRD files, which have a different setup (I would leave the defaults). They are stored in this directory:

/usr/local/nagios/share/perfdata

https://support.nagios.com/kb/article.php?id=41

The RRD setup uses this file when creating the RRDs from the first result that contains performance data:

/usr/local/nagios/etc/pnp/rra.cfg

Which has these set (do not make changes to these settings unless you understand the changes, as they will globally impact all new RRDs created going forward; if you have questions, ask first):

#
# PNP default RRA config
#
# you will get 400kb of data per datasource
#
# 2880 entries with 1 minute step = 48 hours
#
RRA:AVERAGE:0.5:1:2880
#
# 2880 entries with 5 minute step = 10 days
#
RRA:AVERAGE:0.5:5:2880
#
# 4320 entries with 30 minute step = 90 days
#
RRA:AVERAGE:0.5:30:4320
#
# 5840 entries with 360 minute step = 4 years
#
RRA:AVERAGE:0.5:360:5840

RRA:MAX:0.5:1:2880
RRA:MAX:0.5:5:2880
RRA:MAX:0.5:30:4320
RRA:MAX:0.5:360:5840

RRA:MIN:0.5:1:2880
RRA:MIN:0.5:5:2880
RRA:MIN:0.5:30:4320
RRA:MIN:0.5:360:5840
That is what determines how long information is stored in the RRD files.
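
To see how each RRA line translates into a retention period, the arithmetic is steps per row x rows x base step. A small sketch, assuming the 1-minute base step mentioned in the file's own comments; the function name is just for illustration:

```python
def rra_retention(rra_line, base_step_seconds=60):
    """Return (consolidation function, seconds per row, retention in days)."""
    # Format: RRA:<function>:<xff>:<steps per row>:<rows>
    _, function, _xff, steps, rows = rra_line.strip().split(":")
    seconds_per_row = int(steps) * base_step_seconds
    retention_days = seconds_per_row * int(rows) / 86400
    return function, seconds_per_row, retention_days


if __name__ == "__main__":
    for line in (
        "RRA:AVERAGE:0.5:1:2880",
        "RRA:AVERAGE:0.5:5:2880",
        "RRA:AVERAGE:0.5:30:4320",
        "RRA:AVERAGE:0.5:360:5840",
    ):
        print(line, rra_retention(line))
```

For the four AVERAGE lines above this gives 2, 10, 90 and 1460 days, which matches the comments in rra.cfg (48 hours, 10 days, 90 days, 4 years).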

Please note that the RRDs stay the same size from creation (the data is averaged):

https://support.nagios.com/kb/article/n ... g-768.html