NSCA tuning - HUGE checkresults directory
Posted: Wed Feb 03, 2016 10:08 am
First of all, I'm rather new to Nagios and this forum, so please feel free to reassign this if it is in the wrong place, and try to be patient if I don't know *ANYTHING*. I've been reading a bunch and I think I understand enough to dig into my issue:
I've inherited an old Nagios environment with NSCA implemented in a distributed system on CentOS 5. It is integrated so deeply into my company's datacenter software that we cannot upgrade it. We are building a new infrastructure to replace it, but for now I'm stuck with it. The configuration is several Nagios "distributed" servers running NSCA, sending the check data to a Nagios "Master" cluster. The issue we are having is that the check data on the "distributed" servers (which resides in /nagios/var/spool/checkresults) is constantly at 20,000+ files, 90% of which are older than 30 days. I believe this old data may be caused by the servers getting out of sync due to things like network interruptions, load spikes, or server crashes.
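For reference, here is a minimal read-only sketch of how the file counts above could be checked. It only counts files and deletes nothing. The function name `count_stale_files` and the 30-day threshold are my own illustrative choices, not anything from Nagios or NSCA, and I tried to avoid newer Python features since CentOS 5 ships an old interpreter, but I have not tested it there.

```python
import os
import stat
import time


def count_stale_files(directory, max_age_days=30, now=None):
    """Return (total, stale): regular files directly inside `directory`,
    and how many of them were last modified more than `max_age_days` ago.
    Read-only: nothing is deleted or changed."""
    if now is None:
        now = time.time()
    cutoff = now - max_age_days * 86400  # seconds per day

    total = 0
    stale = 0
    for name in os.listdir(directory):
        path = os.path.join(directory, name)
        try:
            info = os.lstat(path)
        except OSError:
            # The file may have been reaped between listdir() and lstat().
            continue
        if not stat.S_ISREG(info.st_mode):
            continue  # skip subdirectories, symlinks, etc.
        total += 1
        if info.st_mtime < cutoff:
            stale += 1
    return total, stale


if __name__ == "__main__":
    # Path taken from this post; adjust if your spool directory differs.
    spool = "/nagios/var/spool/checkresults"
    if os.path.isdir(spool):
        total, stale = count_stale_files(spool)
        print("%d files, %d older than 30 days" % (total, stale))
```

Running it a few times over a day should show whether the old files are a fixed backlog or still growing.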
What I'm trying to figure out:
1. Is it safe to delete the old files or not?
2. Is there some setting in NSCA that will trim the old data?
3. Is there some setting in NSCA that defines how old data has to be before it will no longer forward it to the "Master" server?
4. Is there some tuning I can do if the files are just expiring before they can be pushed out in time?
I did find this link (http://www.terminalinflection.com/nagio ... ding-nsca/), which is a decent overview and links to an implementation guide, but that guide is for Nagios XI and this environment is more like Nagios 2 or 3. I have yet to find any documentation that talks about managing the checkresults directory. Please note that the location of our checkresults directory may be different from the default.
Thanks in advance for any help you can offer!!