On Wed, Sep 2, 2015, at 10:12 PM, tedsolr wrote:
> I've read from http://heliosearch.org/solrs-mergestrategy/ that the
> AnalyticsQuery component only works for a single instance of Solr. I'm
> planning to "migrate" to the SolrCloud soon and I have a custom
> AnalyticsQuery module that collapses what I consider to be duplicate
> documents, keeping stats like a "count" of the dupes. For my purposes
> "dupes" are determined at run time and vary by the search request. Once a
> collection has multiple shards I will not be able to prevent "dupes" from
> appearing across those shards. A custom merge strategy should allow me to
> merge my stats, but I don't see how I can drop duplicate docs at that
> point.
>
> If shard1 returns docs A & B and shard2 returns docs B & C (letters
> denoting what I consider to be unique docs), can my implementation of a
> merge strategy return only docs A, B, & C, rather than A, B, B, & C?

How did you end up with document B in both shard1 and shard2? Can't you
prevent that from happening, and thus not have this issue?

Upayavira
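
If the duplicates really cannot be kept on a single shard, the merge described in the quoted question (A & B from shard1, B & C from shard2, returning A, B, C with combined stats) could be sketched roughly as below. This is plain Java and deliberately does not touch Solr's MergeStrategy API. The class CollapsedDoc, its "key" and "count" fields, and the merge() helper are all hypothetical names made up for illustration, and the sketch assumes each shard can hand back a key that identifies a run-time duplicate. Re-sorting and paging after the merge are not handled.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

public class ShardDedupSketch {

    // Hypothetical stand-in for one collapsed document returned by a shard.
    // "key" identifies what the request treats as a duplicate; "count" is
    // how many dupes that shard already collapsed into this document.
    static final class CollapsedDoc {
        final String key;
        final long count;

        CollapsedDoc(String key, long count) {
            this.key = key;
            this.count = count;
        }
    }

    // Merge per-shard results. The first occurrence of a key fixes its
    // position in the output; later occurrences only add to its count.
    static List<CollapsedDoc> merge(List<List<CollapsedDoc>> shardResults) {
        Map<String, Long> counts = new LinkedHashMap<>();
        for (List<CollapsedDoc> shard : shardResults) {
            for (CollapsedDoc doc : shard) {
                counts.merge(doc.key, doc.count, Long::sum);
            }
        }
        List<CollapsedDoc> merged = new ArrayList<>();
        for (Map.Entry<String, Long> e : counts.entrySet()) {
            merged.add(new CollapsedDoc(e.getKey(), e.getValue()));
        }
        return merged;
    }

    public static void main(String[] args) {
        // Made-up counts, purely for illustration.
        List<CollapsedDoc> shard1 = Arrays.asList(
                new CollapsedDoc("A", 2), new CollapsedDoc("B", 1));
        List<CollapsedDoc> shard2 = Arrays.asList(
                new CollapsedDoc("B", 3), new CollapsedDoc("C", 1));

        for (CollapsedDoc doc : merge(Arrays.asList(shard1, shard2))) {
            System.out.println(doc.key + " count=" + doc.count);
        }
    }
}
```

With the made-up input above, B appears once in the merged list and its count is the sum of the two shard counts. Whether this is workable in practice depends on each shard returning enough documents for the merged page to be complete, which is one reason routing duplicates to the same shard is the simpler option when it is possible.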