
Estimate disk usage for new deployment

Posted: Wed May 27, 2020 3:56 am
by Chris Hardick
Hi

I am trying to estimate the disk requirements of a new single-instance log server deployment.
We estimate that 13.5GB of syslog events would be received daily.
I would like to retain 28 days in the queryable database without needing to load information that has already been archived.
I would like to retain 365 days of daily archives that could be loaded into the queryable database when necessary (I'm not sure if I have used the right terminology, as this will be our first deployment).

Any help in estimating the amount of disk space would be greatly appreciated. I'm assuming that compression is occurring, but also that there is overhead from additional indexes etc.

Thanks

Chris

Re: Estimate disk usage for new deployment

Posted: Wed May 27, 2020 4:36 pm
by cdienger
It's been a while since I've looked into this, so I set up a quick test that loads a large file. About 10% of the way through, there is little difference between the size of the raw file and the space needed to store it in the Elasticsearch database: it's at ~17% compression so far. Elasticsearch does compress data, but the extra overhead uses up most of the saved space. I'll let it run and follow up tomorrow with the final numbers.

Re: Estimate disk usage for new deployment

Posted: Thu May 28, 2020 4:46 pm
by cdienger
The logs were ~20% compressed once everything had been inserted into the database. Keep in mind your results will vary. I always suggest running a trial with NLS and importing a few days of typical data to get a proper sizing for an environment, but this can at least give you an idea of where to start with disk size.
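
As a rough starting point only, the numbers in this thread can be plugged into a small calculation. This is a sketch, not an official sizing formula: `estimate_disk_gb`, `savings`, and `headroom` are hypothetical names, the 20% figure comes from the single test above, and the archive line assumes no space saving at all because the test did not measure archive size.

```python
def estimate_disk_gb(daily_gb, days, savings=0.0, headroom=0.0):
    """Estimate GB on disk for `days` of retained data.

    savings:  fraction of raw size saved after compression and index
              overhead (0.20 matches the ~20% seen in the test above).
    headroom: extra fraction of free space to keep (assumed safety margin).
    """
    if not 0.0 <= savings < 1.0:
        raise ValueError("savings must be in the range [0, 1)")
    if headroom < 0.0:
        raise ValueError("headroom must not be negative")
    return daily_gb * days * (1.0 - savings) * (1.0 + headroom)


DAILY_GB = 13.5  # daily syslog volume from the original question

# 28 days in the queryable database, using the ~20% saving from the test.
hot_gb = estimate_disk_gb(DAILY_GB, 28, savings=0.20)

# 365 days of archives. Assumption: no saving, i.e. a worst-case
# upper bound equal to the raw data size.
archive_gb = estimate_disk_gb(DAILY_GB, 365)

print(f"Queryable (28 days): {hot_gb:.1f} GB")     # 302.4 GB
print(f"Archives (365 days): {archive_gb:.1f} GB")  # 4927.5 GB
```

Replace `savings` with whatever a trial import of your own data shows, and set `headroom` to leave free space on the volume.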