Hi Jonothan and Markus,

>>Why 3 shards on one machine instead of one larger shard per machine?
Good question! We made this architectural decision several years ago, and I don't remember the exact rationale at the moment. I believe we originally made the decision because some tests showed a sweet spot for I/O performance for shards with 500,000-600,000 documents, but those tests were run before we implemented CommonGrams and when we were still using attached storage. I think we also might have had concerns about Java OOM errors with a really large shard/index, but we now know that we can keep memory usage under control by tweaking the amount of the terms index that gets read into memory.

We should probably do some tests and revisit the question.

The reason we don't have 12 shards on 12 machines is that current performance is good enough that we can't justify buying 8 more machines :)

Tom

-----Original Message-----
From: Markus Jelsma [mailto:markus.jel...@openindex.io]
Sent: Tuesday, August 02, 2011 2:12 PM
To: solr-user@lucene.apache.org
Subject: Re: performance crossover between single index and sharding

Hi Tom,

Very interesting indeed! But I keep wondering why some engineers choose to store multiple shards of the same index on the same machine; there must be significant overhead. The only reason I can think of is ease of maintenance in moving shards to a separate physical machine. I know that rearranging the shard topology can be a real pain in a large existing cluster (e.g. consistent hashing is not consistent anymore and having to shuffle docs to their new shard). Is this the reason you chose this approach?

Cheers,