On 2/8/2017 9:35 PM, cmti95035 wrote:
> I noticed in our production environment that the returned result count is
> inconsistent when doing paging.
>
> For example, for a certain query, for the first page (start = 0, rows = 30),
> the corresponding "numFound" is 3402; and then it returned 3378, 3361 for
> the 2nd and 3rd page, respectively (start = 30, 60 respectively). A sample
> query looks like the following:
> q:TMCN:(美丽 OR ?美丽 OR 美丽? OR 丽美)
> raw query parameters:
> fl=*&start=60&rows=30&shards=172.10.10.3:9080/solr/tm01,172.10.10.3:9080
<snip>
> /solr/tm44,172.10.10.3:9080/solr/tm45&facet=true&facet.missing=false&facet.field=intCls&facet.field=appDate&facet.field=TMStatus
>
> The query was against multiple shards at a time. With limited tries I
> noticed that the return count is consistent if the number of shards are less
> than 5.

When a distributed search returns different numFound values on different
requests for the same query, it almost always means that your uniqueKey
field is not unique between the different shards -- you have documents
using the same uniqueKey value in more than one shard.

The counts differ for two reasons. First, Solr removes duplicates from
the combined results before calculating the total. Second, which shards
get their results back to the coordinating node first varies between
requests, so one query may encounter a different number of duplicate
documents than a subsequent query.

Probably when you reduce the number of shards, you are removing shards
from the list that contain the duplicate documents, so the problem
doesn't happen.

It is *critical* that the uniqueKey field remains unique across the
entire distributed index. Using SolrCloud with *fully* automatic
document routing will typically ensure that everything is unique across
the entire collection, but in other situations, making sure this happens
will be up to you.

Thanks,
Shawn
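
One way to confirm this diagnosis is to collect the uniqueKey values held by each shard and look for values that appear in more than one of them. The following is only a minimal sketch in Python, assuming the IDs have already been exported from each core separately (for example, by querying each shard on its own, without the shards parameter). The helper name find_cross_shard_duplicates and the sample IDs are hypothetical and used purely for illustration.

```python
from collections import defaultdict


def find_cross_shard_duplicates(ids_by_shard):
    """Return {uniqueKey value: sorted shard names} for every value
    that is present in more than one shard."""
    shards_by_id = defaultdict(set)
    for shard, ids in ids_by_shard.items():
        for doc_id in ids:
            # A set means a value repeated inside one shard is counted once.
            shards_by_id[doc_id].add(shard)
    return {
        doc_id: sorted(shards)
        for doc_id, shards in shards_by_id.items()
        if len(shards) > 1
    }


# Made-up sample data: shard name -> uniqueKey values exported from that shard.
ids_by_shard = {
    "tm01": ["a1", "a2", "a3"],
    "tm02": ["b1", "a2"],
    "tm03": ["c1", "a3", "b1"],
}

duplicates = find_cross_shard_duplicates(ids_by_shard)
for doc_id, shards in sorted(duplicates.items()):
    print(doc_id, "found in", ", ".join(shards))
```

Any value this reports exists in more than one shard. Those documents would need to be removed from all but one shard, or reindexed with distinct keys, before the totals can be expected to stay stable.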