WillemDH wrote:Could you tell me how much data you think we should be able to handle daily with two nodes?
This is somewhat of a loaded question. As with many things, the answer is: it depends. Below are some of the factors it depends on:
- The type and number of filters added to the logstash config
- Speed of the disks
- Amount of RAM
- Number of people querying the data
- How evenly the data comes in (bursty or a steady stream)
One thing I will point out, performance-wise, is that there is only a marginal benefit to 2 nodes over a single node, as all data is indexed on both instances. The real load reduction comes with 3+ nodes, because each piece of data is still only indexed on 2 instances no matter how many nodes are in the cluster.
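To put rough numbers on that, here is a minimal sketch of the arithmetic. It assumes every piece of data is written to exactly 2 instances (as described above) and that the data is spread evenly across the nodes, which a real cluster will only approximate. The function name is made up for illustration.

```python
def indexing_share_per_node(nodes: int, copies: int = 2) -> float:
    """Fraction of all incoming data that a single node has to index.

    Assumption: each piece of data is indexed on `copies` instances
    (2, per the explanation above) and copies are spread evenly.
    With fewer nodes than copies, every node indexes everything.
    """
    if nodes < 1:
        raise ValueError("nodes must be at least 1")
    return min(copies, nodes) / nodes


if __name__ == "__main__":
    for n in (1, 2, 3, 4):
        share = indexing_share_per_node(n)
        print(f"{n} node(s): each node indexes about {share:.0%} of the data")
```

Under those assumptions, 1 and 2 nodes both leave every node indexing all of the data, while 3 nodes drops it to roughly two thirds and 4 nodes to half.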
WillemDH wrote:Is there some way to see how much data each source uses?
Data usage by source is not available. The closest you can really get is the number of docs per source.
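If you want to pull the doc counts yourself, one possible approach is a terms aggregation against the Elasticsearch backend. The sketch below only builds the request body and reads the buckets out of a response; it does not talk to the cluster. The field name `host` is an assumption about how your sources are identified, and both function names are made up for illustration, so adjust them to match your own data.

```python
import json


def build_docs_per_source_query(field="host", top_n=20):
    """Build a search body that counts docs per distinct value of `field`.

    Assumption: `field` (default "host") is the field that identifies
    the source in your indexed documents.
    """
    return {
        "size": 0,  # no individual hits wanted, only the aggregation
        "aggs": {
            "by_source": {"terms": {"field": field, "size": top_n}},
        },
    }


def docs_per_source(response):
    """Turn a terms-aggregation response into a {source: doc_count} dict."""
    buckets = response["aggregations"]["by_source"]["buckets"]
    return {bucket["key"]: bucket["doc_count"] for bucket in buckets}


if __name__ == "__main__":
    # Print the body so it can be sent to the _search endpoint of your indices.
    print(json.dumps(build_docs_per_source_query(), indent=2))
```

Keep in mind this gives document counts, not bytes, so a source that sends a few very large messages will look smaller than it really is.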