Hope you can help on this one. The daily system backup is getting larger by the day. At the moment each node produces an 8.5 GB tar.gz, which obviously takes hours to complete.
The real problem is that the backups fail more and more often, for all kinds of reasons: "Waiting for available slot." messages, or worse, "{"acknowledged":true,"persistent":{},"transient":{"plugin":{"knapsack":{"export":{"state":"[]"}}}}}" messages appearing in /tmp/backup.log.
The worst case is when it only produces a 26303 byte tar.gz, which is broken and empty. At that point I have to restart Elasticsearch to get it working again.
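To catch the broken, near-empty archive automatically, something like the following could run after each backup. This is only a minimal sketch, assuming Python 3 is available on the node. The function name `backup_looks_valid` and the `min_bytes` threshold are made up for illustration and are not part of Nagios Log Server.

```python
import os
import tarfile
import zlib


def backup_looks_valid(path, min_bytes=1_000_000):
    """Return True if the archive exists, is at least min_bytes in size,
    and contains at least one regular file.

    min_bytes is a hypothetical threshold; pick one that fits your backups.
    """
    # A missing or suspiciously small file fails straight away.
    if not os.path.isfile(path) or os.path.getsize(path) < min_bytes:
        return False
    try:
        # Walk the archive; a corrupt or truncated file raises an error here.
        with tarfile.open(path, "r:gz") as archive:
            return any(member.isfile() for member in archive)
    except (tarfile.TarError, OSError, EOFError, zlib.error):
        return False
```

It would be called with the path of the freshly written archive, for example `backup_looks_valid("/path/to/backup.tar.gz")` (placeholder path), and a `False` result could trigger an alert before the next backup run.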
According to the manual, the system backup includes the audit log alongside dashboards, etc. That made me wonder: what is the retention of the audit log? Especially since we have "save user query to audit" turned on as well.
Digging through the audit log, I found entries going back to 2016, when we started our cluster. That might explain why the backup is so large and keeps growing.
The question is: is there no retention on the audit log (or other logging saved in the system backup)? If not, how can I set a retention, or at least clear out some old data? Maybe something else is going on, but either way I would appreciate some help. Having no backup is never good.
Nagios Log Server version is 2.1.15 on CentOS 7.9.
Kind regards,
Hans Blom