Hi, I found on the wiki (https://wiki.apache.org/solr/SolrPerformanceProblems#RAM) that the optimal amount of RAM for Solr equals the index size. That is, let's say, the ideal case where everything fits in memory.
We plan a small installation with 2 nodes and 8 shards, holding 100M documents in the cluster. We expect each document to take about 5 kB in the index. With a fully in-memory index, the two nodes together would need roughly 500 GB of RAM (see the arithmetic sketch in the P.S. below), i.e. 2x 256 GB to keep everything in memory. And those are really big machines... Is this calculation still correct for recent Solr versions?

Our problem is also somewhat constrained: the data are time-based logs, and searches are generally restricted to the last 3 months, which match roughly 10M documents. How does this affect Solr's memory requirements? Will we still need the whole inverted index in memory, or is there some internal optimization that keeps only the actively used part resident?

The questions:
1) Is the 500 GB memory requirement a correct assumption?
2) Does it help that we have time-based logs where the vast majority of accesses hit only recent data?
3) Is there some best practice for reducing the RAM Solr requires?

Thanks in advance!
Pavel

Side note: We were thinking about partitioning based on Time Routed Aliases, but unfortunately we need to ensure disaster recovery over a poor network connection, and TRA and Cross Data Center Replication are not compatible: CDCR requires a static set of cores, while TRA creates cores dynamically (see the sketch in the second P.S.).
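P.S. To make the sizing arithmetic explicit, here is a minimal back-of-the-envelope sketch. The 100M document count and the 5 kB/doc figure are our own estimates, not measured values:

    # Back-of-the-envelope Solr RAM sizing, assuming the wiki's
    # "ideal" rule of thumb: RAM available for the OS page cache ~= index size.
    NUM_DOCS = 100_000_000   # planned document count (our estimate)
    BYTES_PER_DOC = 5_000    # ~5 kB of index per document (our estimate)
    NODES = 2

    index_size_gb = NUM_DOCS * BYTES_PER_DOC / 1e9
    per_node_gb = index_size_gb / NODES

    print(f"total index size: {index_size_gb:.0f} GB")  # -> 500 GB
    print(f"per node (2 nodes): {per_node_gb:.0f} GB")  # -> 250 GB,
    # which we'd round up to 256 GB, the nearest common machine size.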
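P.P.S. For context, this is roughly how we would have created the Time Routed Alias via the Collections API; the alias name, field name, and parameter values below are illustrative, not our final config. Each interval makes Solr spin up a new collection (and new cores) on the fly, which is what CDCR cannot follow:

    # Sketch: creating a Time Routed Alias through the Collections API.
    # Names and values are placeholders for illustration only.
    import requests

    SOLR = "http://localhost:8983/solr"  # assumed Solr base URL

    params = {
        "action": "CREATEALIAS",
        "name": "logs",                   # alias the application writes to
        "router.name": "time",
        "router.field": "timestamp_dt",   # assumed time field in our schema
        "router.start": "NOW/MONTH",
        "router.interval": "+1MONTH",     # a new collection every month...
        "create-collection.collection.configName": "logs_conf",
        "create-collection.numShards": "8",
    }
    resp = requests.get(f"{SOLR}/admin/collections", params=params)
    resp.raise_for_status()
    # ...so the set of cores grows over time, while CDCR expects a
    # fixed set of source/target cores defined up front.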