Thanks, Yonik. I have a follow-up question: how does Solr ensure consistent results across pages? For example, take my three theoretical Solr instances again. If a, b and c each returned 100 documents with the same score and the user requested only 100 documents, how are those 100 documents chosen from the set available from a, b and c?

On Tue, Jun 7, 2011 at 9:38 AM, Yonik Seeley <yo...@lucidimagination.com> wrote:

> On Tue, Jun 7, 2011 at 9:35 AM, Jamie Johnson <jej2...@gmail.com> wrote:
> > I am currently experimenting with the Solr Cloud code on trunk and just had
> > a quick question. Lets say my setup had 3 nodes a, b and c. Node a has
> > 1000 results which meet a particular query, b has 2000 and c has 3000. When
> > executing this query and asking for row 900 what specifically happens? From
> > reading the Distributed Search Wiki I would expect that node a responds with
> > 900, node b responds with 900 and c responds with 900 and the coordinating
> > node is responsible for taking the top scored items and throwing away the
> > rest, is this correct or is there some additional coordination that happens
> > where nodes a, b and c return back an id and a score and the coordinating
> > node makes an additional request to get back the documents for the ids which
> > make up the top list?
>
> The latter is correct - the first phase only collects enough
> information to merge ids from the shards, and then a second phase
> requests the stored fields, highlighting, etc for the specific docs
> that will be returned.
>
> -Yonik
> http://www.lucidimagination.com
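
To make sure I am reading the two-phase description correctly, here is a rough Python sketch of how I picture it. This is not Solr code: every name in it (phase_one, merge_ids, phase_two, distributed_search, the sample data) is made up for illustration. The tie-breaker (shard name, then rank within the shard) is purely my assumption so that equal scores come out in the same order on every page; I do not know whether Solr does this, which is really what I am asking.

```python
from typing import Dict, List, Tuple

Hit = Tuple[str, float]  # (doc_id, score)


def phase_one(shard_hits: Dict[str, List[Hit]], limit: int) -> Dict[str, List[Hit]]:
    """Phase 1: each shard returns only (id, score) for its top `limit` hits.

    Assumes each shard's list is already sorted by descending score.
    """
    return {shard: hits[:limit] for shard, hits in shard_hits.items()}


def merge_ids(per_shard: Dict[str, List[Hit]], start: int, rows: int) -> List[Tuple[str, str]]:
    """Coordinator merges the lightweight lists and picks the requested window.

    ASSUMPTION: ties on score are broken by shard name, then by rank within
    the shard. This is a hypothetical rule, not confirmed Solr behaviour.
    """
    candidates = [
        (-score, shard, rank, doc_id)
        for shard, hits in per_shard.items()
        for rank, (doc_id, score) in enumerate(hits)
    ]
    candidates.sort()  # highest score first, then the assumed tie-breakers
    window = candidates[start:start + rows]
    return [(shard, doc_id) for _, shard, _, doc_id in window]


def phase_two(winners: List[Tuple[str, str]], stored: Dict[str, Dict[str, dict]]) -> List[dict]:
    """Phase 2: fetch stored fields only for the documents that made the cut."""
    return [stored[shard][doc_id] for shard, doc_id in winners]


def distributed_search(shard_hits, stored, start: int, rows: int) -> List[dict]:
    # Every shard must supply start + rows candidates, since any one of them
    # could own the entire requested window.
    per_shard = phase_one(shard_hits, start + rows)
    winners = merge_ids(per_shard, start, rows)
    return phase_two(winners, stored)


# Hypothetical sample data: three shards, several documents tied at score 1.0.
SHARD_HITS = {
    "a": [("a1", 1.0), ("a2", 1.0)],
    "b": [("b1", 1.0), ("b2", 0.5)],
    "c": [("c1", 2.0)],
}
STORED = {
    shard: {doc_id: {"id": doc_id} for doc_id, _ in hits}
    for shard, hits in SHARD_HITS.items()
}

page1 = distributed_search(SHARD_HITS, STORED, start=0, rows=3)
page2 = distributed_search(SHARD_HITS, STORED, start=3, rows=3)
```

With a fixed tie-breaker like this, page1 and page2 never overlap even though a1, a2 and b1 share a score. Without one, I do not see what stops the same document from showing up on two pages.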