Upayavira,

The docs are all unique. In my example the two docs are considered to be
dupes because the requested fields all have the same values.
fields  A      B   C   D    E
Doc 1:  apple  10  15  bye  yellow
Doc 2:  apple  12  15  by   green

The two docs are certainly unique. Say they are on different shards in the
same collection. If the search request has fl=A,C then the two are dupes and
the user wants to see them collapsed. If the search request has fl=A,B,C
then the two are unique from the user's perspective and display separately.
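
To make the rule concrete, here is a minimal sketch in plain Python (not Solr code) of collapsing on the requested fields. The `id` values and the `collapse` helper are made up for illustration; only the A..E values come from the example above.

```python
from collections import defaultdict

# Sample docs mirroring the example above; the "id" values are hypothetical.
docs = [
    {"id": "doc1", "A": "apple", "B": 10, "C": 15, "D": "bye", "E": "yellow"},
    {"id": "doc2", "A": "apple", "B": 12, "C": 15, "D": "by", "E": "green"},
]

def collapse(docs, fl):
    """Group docs whose values for every requested field are equal."""
    groups = defaultdict(list)
    for doc in docs:
        # The group key is the doc projected onto the requested fields.
        key = tuple(doc.get(field) for field in fl)
        groups[key].append(doc["id"])
    return dict(groups)

print(collapse(docs, ["A", "C"]))       # both docs fall into one group
print(collapse(docs, ["A", "B", "C"]))  # each doc is its own group
```

The same two docs are dupes or not depending only on which fields are in the key, which is why the grouping cannot be decided at index time.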

Each doc typically has a couple hundred fields. When viewed through the lens
of just 3 or 4 fields, lots of docs, sometimes thousands, will be rolled up
into one group, and I'll compute some stats on that group. Bringing all those
docs back to the calling app for processing is too slow. The AnalyticsQuery
does a great job of filtering out the dupes on a single shard, but it looks
like I need another solution for multi-shard collections.
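
One shape such a solution could take, sketched in plain Python rather than against any Solr API: each shard reduces its own docs to per-group counts, and only those small summaries are merged centrally. The function names and the choice of a simple count as the stat are assumptions for illustration, not something AnalyticsQuery provides out of the box.

```python
from collections import Counter

def shard_groups(shard_docs, fl):
    """Per-shard step: count docs per distinct tuple of requested field values."""
    return Counter(tuple(doc.get(field) for field in fl) for doc in shard_docs)

def merge_groups(per_shard_counts):
    """Merge step: add up the counts reported by each shard for identical keys."""
    merged = Counter()
    for counts in per_shard_counts:
        merged.update(counts)  # Counter.update adds counts, it does not overwrite
    return merged

# Hypothetical layout: the two example docs live on different shards.
shard1 = [{"A": "apple", "B": 10, "C": 15}]
shard2 = [{"A": "apple", "B": 12, "C": 15}]
fl = ["A", "C"]

merged = merge_groups([shard_groups(shard1, fl), shard_groups(shard2, fl)])
print(merged)
```

Only one entry per distinct group crosses the network instead of every matching doc. This works directly for stats that can be combined from partial results (counts, sums, min, max); something like a median would need more than this.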


