Thanks Joel, that link looks promising. The CloudSolrStream sidesteps my issue with multiple shards, and perhaps the ReducerStream would provide what I need. At first glance, I worry that the buffer would grow too large if it's really holding the values for all the fields in each document (Tuple.getMaps()). I use a Map in my DelegatingCollector to store the "unique" docs, but I only keep the docId, my stats, and the ordinals for each field. Would you expect the new streams API to perform as well as my implementation of an AnalyticsQuery and a DelegatingCollector?
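
To make the memory comparison concrete, here is a rough sketch of the kind of compact per-document entry I mean. It is plain Java with no Solr classes, and every name in it (CompactDocBuffer, Entry, mergeKey, the single sum/count stat) is made up for illustration and is not taken from my actual collector or from the Solr API.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical sketch: these names are illustrative only, not Solr classes.
final class CompactDocBuffer {

    // One small record per "unique" doc: docId, running stats, field ordinals.
    static final class Entry {
        final int docId;       // first docId seen for this merge key
        final int[] ordinals;  // one ordinal per tracked field, not the field values
        long count;            // number of docs merged into this entry
        double sum;            // running sum of the tracked stat

        Entry(int docId, int[] ordinals) {
            this.docId = docId;
            this.ordinals = ordinals;
        }
    }

    private final Map<Long, Entry> entries = new HashMap<>();

    // Called once per collected doc. mergeKey (assumed) identifies docs
    // that should collapse into the same "unique" entry.
    void collect(long mergeKey, int docId, int[] ordinals, double stat) {
        Entry e = entries.computeIfAbsent(mergeKey, k -> new Entry(docId, ordinals));
        e.count++;
        e.sum += stat;
    }

    Entry get(long mergeKey) {
        return entries.get(mergeKey);
    }

    int size() {
        return entries.size();
    }
}
```

The point of the sketch is that each entry costs a few primitives plus one int per field, whereas a buffer of full tuples would hold every field value for every document.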
-- View this message in context: http://lucene.472066.n3.nabble.com/Merging-documents-from-a-distributed-search-tp4226802p4227034.html Sent from the Solr - User mailing list archive at Nabble.com.