The merge strategy probably won't work for the type of distributed collapse
you're describing.

You may want to begin exploring the Streaming API which supports real-time
map/reduce operations,

http://joelsolr.blogspot.com/2015/03/parallel-computing-with-solrcloud.html

Joel Bernstein
http://joelsolr.blogspot.com/

On Wed, Sep 2, 2015 at 5:12 PM, tedsolr <tsm...@sciquest.com> wrote:

> I've read from  http://heliosearch.org/solrs-mergestrategy/
> <http://heliosearch.org/solrs-mergestrategy/>   that the AnalyticsQuery
> component only works for a single instance of Solr. I'm planning to
> "migrate" to the SolrCloud soon and I have a custom AnalyticsQuery module
> that collapses what I consider to be duplicate documents, keeping stats
> like
> a "count" of the dupes. For my purposes "dupes" are determined at run time
> and vary by the search request. Once a collection has multiple shards I
> will
> not be able to prevent "dupes" from appearing across those shards. A custom
> merge strategy should allow me to merge my stats, but I don't see how I can
> drop duplicate docs at that point.
>
> If shard1 returns docs A & B and shard2 returns docs B & C (letters
> denoting
> what I consider to be unique docs), can my implementation of a merge
> strategy return only docs A, B, & C, rather than A, B, B, & C?
>
> thanks!
> solr 5.2.1
>
>
>
> --
> View this message in context:
> http://lucene.472066.n3.nabble.com/Merging-documents-from-a-distributed-search-tp4226802.html
> Sent from the Solr - User mailing list archive at Nabble.com.
>

Reply via email to