On 9/29/2013 7:21 AM, adfel70 wrote:
> Hi,
> I'm thinking of solr cluster architecture before purchasing machines.
>
> My total index size is around 5TB. I want to have replication factor of 3.
> total 15TB.
> I've understood that I should have 50-100% of the index size as ram, for OS
> cache. Lets say we're talking about around 10TB of memory.
> Now I need to split this memory to multiple servers and get the machine spec
> I want to buy.
> I'm thinking of running multiple solr processes per machine.

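The arithmetic in the question can be sketched roughly as follows. This is only an illustration: the 256GB of RAM per machine is a hypothetical figure that does not come from the thread, the helper name size_cluster is made up, and the calculation ignores JVM heap and other memory needs.

```python
import math

GB_PER_TB = 1024  # assumption: binary units


def size_cluster(index_tb, replication_factor, ram_fraction, ram_per_machine_gb):
    """Return (total index TB, OS cache RAM TB, machine count).

    ram_fraction is the share of the total index size to hold in OS disk
    cache (0.5 to 1.0 in the question). ram_per_machine_gb is hypothetical.
    """
    total_index_tb = index_tb * replication_factor
    cache_ram_tb = total_index_tb * ram_fraction
    machines = math.ceil(cache_ram_tb * GB_PER_TB / ram_per_machine_gb)
    return total_index_tb, cache_ram_tb, machines


if __name__ == "__main__":
    # 5TB index, replication factor 3, 50% and 100% cache coverage
    for fraction in (0.5, 1.0):
        total, ram, machines = size_cluster(5, 3, fraction, 256)
        print(f"{fraction:.0%} cache: {total}TB index, {ram}TB RAM, {machines} machines")
```

With these assumed numbers, the 50-100% guideline spans 7.5TB to 15TB of cache memory, so the 10TB estimate in the question falls inside that range.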
Running multiple Solr instances per machine is a really bad idea. One Solr instance can run many indexes, and there will be far less memory overhead if you're not running multiple servlet containers. Everything is also served from a single TCP port, so there is no need to work out a different port for each instance. Configuration and deployment are simpler as well.

When you have multiple Solr instances per machine, the SolrCloud collections API has a tendency to place some or all of the replicas for each shard on the same machine, which means that the collection won't be fault tolerant. With one instance per machine, you can be absolutely sure that created collections will have all replicas for each shard on different machines.

I will echo the advice you've been given about using SSD. You'll need much less OS disk cache memory with SSD.

Thanks,
Shawn