On Wed, Feb 9, 2011 at 1:42 PM, mrw <[email protected]> wrote:
>
> I have a data set indexed over two irons, with M docs per Solr core for a
> total of N cores.
>
> If I perform a query across all N cores with start=0 and rows=30, I get,
> say, numFound=27521. If I simply change the start param to start=27510
> (simulating being on the last page of data), I get a smaller result set
> (say, numFound=21415).
>
> I had expected numFound to be the same in either case, since no other aspect
> of the query had changed. Am I mistaken?

You probably have some duplicate docs in your shards (those with the same id). Solr doesn't know they are dups until it retrieves the ids of the docs to merge, and then it only takes one of the dups and decrements numFound.

-Yonik
http://lucidimagination.com
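
To illustrate why a larger start value can expose more duplicates, here is a minimal Python sketch of the merge step described above. It is a simplified model and not Solr's actual code: the function name merge_shard_results, the tuple layout, and the sample scores and ids are all made up for the example. It assumes each shard reports its own hit count and returns only its top start+rows ids, so a duplicate is noticed only when both copies fall inside that window.

```python
def merge_shard_results(shard_results, start, rows):
    """Merge per-shard hits into one page (hypothetical, simplified model).

    shard_results: list of (shard_num_found, hits), where hits is a list of
    (score, doc_id) tuples sorted by descending score.
    """
    # Start from the sum of the per-shard counts; duplicates are unknown here.
    num_found = sum(count for count, _ in shard_results)

    # Assumption: each shard only contributes its top (start + rows) ids.
    window = start + rows
    candidates = []
    for _, hits in shard_results:
        candidates.extend(hits[:window])
    candidates.sort(key=lambda hit: -hit[0])

    seen = set()
    merged = []
    for _, doc_id in candidates:
        if doc_id in seen:
            # Duplicate id discovered during the merge: keep one copy only
            # and decrement the total.
            num_found -= 1
            continue
        seen.add(doc_id)
        merged.append(doc_id)

    return num_found, merged[start:start + rows]


# Two made-up shards that both contain the low-scoring doc "x".
shard_a = (4, [(0.9, "a"), (0.8, "b"), (0.3, "c"), (0.1, "x")])
shard_b = (4, [(0.85, "d"), (0.7, "e"), (0.2, "f"), (0.1, "x")])

first_page = merge_shard_results([shard_a, shard_b], start=0, rows=2)
last_page = merge_shard_results([shard_a, shard_b], start=6, rows=2)

print(first_page)  # (8, ['a', 'd'])  duplicate not seen yet
print(last_page)   # (7, ['x'])       duplicate seen, numFound decremented
```

In this model the first page never retrieves either copy of "x", so the reported total stays at the raw sum of 8. The last page pulls every id from both shards, finds the duplicate, and reports 7.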
