I think first-to-respond is indeed how it works, but that's only deterministic up to a point. If your small index is in the throes of a commit and everything required for a response happens to be cached on one of the larger shards, the larger shard may well answer first, and then who knows which copy you get?
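
To make the point concrete, here is a toy model of that "first responder wins" rule. It is only a sketch of the behaviour as described in this thread, not Solr's actual merge code; the function name `merge_by_arrival`, the shard names, and the document fields are all made up for illustration.

```python
def merge_by_arrival(responses):
    """Merge shard responses, keeping one document per id.

    `responses` is a list of (shard_name, docs) tuples in the order the
    shards answered. This is a hypothetical model, not a Solr API.
    """
    merged = {}
    for shard, docs in responses:
        for doc in docs:
            # Keep the copy from whichever shard answered first;
            # later duplicates of the same id are dropped.
            merged.setdefault(doc["id"], {**doc, "shard": shard})
    return list(merged.values())


# Normal case: the small in-RAM shard answers first, so its copy wins.
small_first = [
    ("small", [{"id": "42", "body": "new"}]),
    ("large", [{"id": "42", "body": "old"}, {"id": "7", "body": "other"}]),
]

# Small shard is busy with a commit: the large shard answers first instead.
large_first = list(reversed(small_first))

winner_normal = merge_by_arrival(small_first)[0]
winner_during_commit = merge_by_arrival(large_first)[0]
```

With identical index contents, the copy of id 42 that comes back depends purely on arrival order, which is why the overlap window matters even if the small shard is usually faster.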

On Mon, Aug 8, 2011 at 7:10 PM, Shawn Heisey <s...@elyograg.org> wrote:
> On 8/8/2011 4:07 PM, simon wrote:
>>
>> Only one should be returned, but it's non-deterministic. See
>>
>> http://wiki.apache.org/solr/DistributedSearch#Distributed_Searching_Limitations
>
> I had heard it was based on which one responded first. This is part of why
> we have a small index that contains the newest content and only distribute
> content to the other shards once a day. The hope is that the small index
> (less than 1GB, fits into RAM on that virtual machine) will always respond
> faster than the other larger shards (over 18GB each). Is this an incorrect
> assumption on our part?
>
> The build system does do everything it can to ensure that periods of overlap
> are limited to the time it takes to commit a change across all of the
> shards, which should amount to just a few seconds once a day. There might
> be situations when the index gets out of whack and we have duplicate id
> values for a longer time period, but in practice it hasn't happened yet.
>
> Thanks,
> Shawn