We've seen this as well. Before we understood the cause, it seemed very bizarre that hitting different nodes would yield different numFound values, as would using different rows=N values (since the proxying node only de-dupes the documents that are returned in the response).
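To make the rows=N effect concrete, here is a toy Python model using the ids from Tim's example quoted below. It is only a sketch: the merge rule (the proxying node subtracts one from the summed numFound for every duplicate id it actually sees) is my assumption about the behaviour, not code taken from Solr, and the names shard_query, merge and distributed_num_found are hypothetical.

```python
# Toy model of the behaviour described above; NOT Solr's actual code.
# Shard contents mirror the example in the quoted message: shard2 holds
# three documents that share the id "2-1-1".
SHARD1 = ["1-1", "1-2", "1"]
SHARD2 = ["2-1-1", "2-1-1", "2-1-1", "2-1", "2"]


def shard_query(shard_ids, rows):
    """What one shard reports: its full hit count plus its top `rows` ids."""
    return {"numFound": len(shard_ids), "ids": shard_ids[:rows]}


def merge(shard_responses, rows):
    """Assumed proxy-side merge: de-dupe only the ids that were returned."""
    num_found = sum(r["numFound"] for r in shard_responses)
    seen, merged = set(), []
    for resp in shard_responses:
        for doc_id in resp["ids"]:
            if doc_id in seen:
                num_found -= 1  # each duplicate seen lowers the reported total
                continue
            seen.add(doc_id)
            merged.append(doc_id)
    return {"numFound": num_found, "ids": merged[:rows]}


def distributed_num_found(rows):
    """numFound as the proxying node would report it for a given rows value."""
    responses = [shard_query(SHARD1, rows), shard_query(SHARD2, rows)]
    return merge(responses, rows)["numFound"]


if __name__ == "__main__":
    for rows in (1, 2, 3, 10):
        print(rows, distributed_num_found(rows))
```

Under that assumption, rows=10 gives numFound 6 (matching the distributed response below), while rows=1 gives 8 and rows=2 gives 7, because fewer duplicates reach the proxying node. Querying the shard directly always reports 5.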
I think "consistency" and "correctness" should be clearly delineated. Of course we'd rather have consistently correct results, but failing that, I'd rather have consistently incorrect results than inconsistent ones, because inconsistency makes the problem even harder to debug, as was the case here. I think either the node hosting the shard should also do the de-duping, or no one should. It's strange that the proxying node decides to do some sketchy de-dupe over a limited result set.

On Tue, Mar 10, 2015 at 9:09 AM, Timothy Potter <thelabd...@gmail.com> wrote:
>
> Before I open a JIRA, I wanted to put this out to solicit feedback on what
> I'm seeing and what Solr should be doing. So I've indexed the following 8
> docs into a 2-shard collection (Solr 4.8'ish - internal custom branch
> roughly based on 4.8) ... notice that the 3 grand-children of 2-1 have
> dup'd keys:
>
> [
>   {
>     "id":"1",
>     "name":"parent",
>     "_childDocuments_":[
>       {
>         "id":"1-1",
>         "name":"child"
>       },
>       {
>         "id":"1-2",
>         "name":"child"
>       }
>     ]
>   },
>   {
>     "id":"2",
>     "name":"parent",
>     "_childDocuments_":[
>       {
>         "id":"2-1",
>         "name":"child",
>         "_childDocuments_":[
>           {
>             "id":"2-1-1",
>             "name":"grandchild"
>           },
>           {
>             "id":"2-1-1",
>             "name":"grandchild2"
>           },
>           {
>             "id":"2-1-1",
>             "name":"grandchild3"
>           }
>         ]
>       }
>     ]
>   }
> ]
>
> When I query this collection, using:
>
> http://localhost:8984/solr/blockjoin2_shard2_replica1/select?q=*%3A*&wt=json&indent=true&shards.info=true&rows=10
>
> I get:
>
> {
>   "responseHeader":{
>     "status":0,
>     "QTime":9,
>     "params":{
>       "indent":"true",
>       "q":"*:*",
>       "shards.info":"true",
>       "wt":"json",
>       "rows":"10"}},
>   "shards.info":{
>     " http://localhost:8984/solr/blockjoin2_shard1_replica1/|http://localhost:8985/solr/blockjoin2_shard1_replica2/ ":{
>       "numFound":3,
>       "maxScore":1.0,
>       "shardAddress":" http://localhost:8984/solr/blockjoin2_shard1_replica1",
>       "time":4},
>     " http://localhost:8984/solr/blockjoin2_shard2_replica1/|http://localhost:8985/solr/blockjoin2_shard2_replica2/ ":{
>       "numFound":5,
>       "maxScore":1.0,
>       "shardAddress":" http://localhost:8985/solr/blockjoin2_shard2_replica2",
>       "time":4}},
>   "response":{"numFound":6,"start":0,"maxScore":1.0,"docs":[
>       {
>         "id":"1-1",
>         "name":"child"},
>       {
>         "id":"1-2",
>         "name":"child"},
>       {
>         "id":"1",
>         "name":"parent",
>         "_version_":1495272401329455104},
>       {
>         "id":"2-1-1",
>         "name":"grandchild"},
>       {
>         "id":"2-1",
>         "name":"child"},
>       {
>         "id":"2",
>         "name":"parent",
>         "_version_":1495272401361960960}]
>   }}
>
>
> So Solr has de-duped the results.
>
> If I execute this query against the shard that has the dupes (distrib=false):
>
> http://localhost:8984/solr/blockjoin2_shard2_replica1/select?q=*%3A*&wt=json&indent=true&shards.info=true&rows=10&distrib=false
>
> Then the dupes are returned:
>
> {
>   "responseHeader":{
>     "status":0,
>     "QTime":0,
>     "params":{
>       "indent":"true",
>       "q":"*:*",
>       "shards.info":"true",
>       "distrib":"false",
>       "wt":"json",
>       "rows":"10"}},
>   "response":{"numFound":5,"start":0,"docs":[
>       {
>         "id":"2-1-1",
>         "name":"grandchild"},
>       {
>         "id":"2-1-1",
>         "name":"grandchild2"},
>       {
>         "id":"2-1-1",
>         "name":"grandchild3"},
>       {
>         "id":"2-1",
>         "name":"child"},
>       {
>         "id":"2",
>         "name":"parent",
>         "_version_":1495272401361960960}]
>   }}
>
> So I guess my question is why doesn't the non-distrib query do
> de-duping? Mainly confirming this is how it's supposed to work and
> this behavior doesn't strike anyone else as odd ;-)
>
> Cheers,
>
> Tim