On 3/24/2017 1:15 AM, vrindavda wrote:
> Thanks Erick and Emir , for your prompt reply.
>
> We are expecting around 50M documents to sit on 80GB . I understand that
> there is no equation to predict the number/size of server. But considering
> to have minimal fault tolerant architecture, Will 2 shards and 2 replicas
> with 128GB RAM, 4 core solr instance be advisable ? Will that suffice ?
>
> I am planning to use two solr instances for shards and replicas each and 3
> instances for zookeeper. Please suggest if I am in right direction.

If you have two servers with 128GB and the entire index will be 80GB in size, this should work well. The heap would likely be fine at around 8GB, so each server would have a complete copy of the index and would have enough memory available to cache it entirely.

With two servers, you want two replicas, regardless of the number of shards. When I say two replicas, I am talking about a total of two copies -- not a leader and two followers.

If the query rate is very low, then sharding would be worthwhile, because multiple CPUs will be used by a single query. If the query rate is high, then you would want all the documents in a single shard, so the CPUs are not overwhelmed. If you don't know what the query rate will be, assume it will be high.

A more detailed discussion:

https://wiki.apache.org/solr/SolrPerformanceProblems

Thanks,
Shawn
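
For anyone who wants to redo the memory arithmetic above with their own numbers, here is a minimal sketch. It assumes the index copies are spread evenly across the servers and that nothing else significant runs on the box; the function name and parameters are made up for illustration and are not part of Solr.

```python
def cache_headroom_gb(ram_gb, heap_gb, total_index_gb, copies, servers):
    """Rough estimate of memory left over once the OS disk cache holds
    the whole on-server index. Assumes an even spread of index copies
    across servers (a simplification), regardless of shard count."""
    # Index data each server holds: total size times number of copies,
    # divided across the servers.
    index_per_server_gb = total_index_gb * copies / servers
    # Memory not claimed by the Java heap is what the OS can use for caching.
    cache_available_gb = ram_gb - heap_gb
    return cache_available_gb - index_per_server_gb


# Numbers from this thread: 128GB RAM, ~8GB heap, 80GB index,
# two copies in total, two servers.
headroom = cache_headroom_gb(ram_gb=128, heap_gb=8, total_index_gb=80,
                             copies=2, servers=2)
print(headroom)
```

A positive result means the server can, in principle, keep its entire share of the index in the disk cache; a negative result means some of it will have to be read from disk.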