Hi,

The streaming API in Solr 6.x has been expanded to support many different
parallel computing workloads. For example, the topic stream supports pub/sub
messaging. The gatherNodes stream supports graph traversal. The facet
stream supports aggregations inside the search engine, while the rollup
stream supports shuffling map/reduce aggregations. Stored queries and
large-scale alerting are on the way...

The sort stream is designed to be used at scale in parallel mode. It can
currently sort about 1,000,000 docs per second on a single worker, so with
20 workers it can sort about 20,000,000 docs per second. The plan is to
eventually switch to a fork/join merge sort so that you also get
parallelism within a single worker.
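
To make the parallel idea concrete, here is a rough Python sketch of the
general pattern: partition the docs across workers, let each worker sort
its own partition, then merge the sorted partitions. This is only an
illustration, not Solr's implementation. The field names (id, a_i), the
function names, and the use of threads as stand-in workers are all
assumptions, and the throughput estimate assumes linear scaling.

```python
import heapq
from concurrent.futures import ThreadPoolExecutor


def partition(docs, part_key, num_workers):
    """Hash-partition docs across workers (part_key is a hypothetical field)."""
    parts = [[] for _ in range(num_workers)]
    for doc in docs:
        parts[hash(doc[part_key]) % num_workers].append(doc)
    return parts


def parallel_sort(docs, sort_key, part_key, num_workers):
    """Sort each partition independently, then merge the sorted partitions."""
    parts = partition(docs, part_key, num_workers)
    # Threads stand in for separate worker nodes in this sketch.
    with ThreadPoolExecutor(max_workers=num_workers) as pool:
        sorted_parts = list(
            pool.map(lambda p: sorted(p, key=lambda d: d[sort_key]), parts)
        )
    # Each partition is already sorted, so a k-way merge gives the final order.
    return list(heapq.merge(*sorted_parts, key=lambda d: d[sort_key]))


def estimated_throughput(num_workers, docs_per_sec_per_worker=1_000_000):
    """Back-of-envelope estimate, assuming throughput scales linearly."""
    return num_workers * docs_per_sec_per_worker


if __name__ == "__main__":
    docs = [{"id": i, "a_i": (i * 7919) % 1000} for i in range(10_000)]
    result = parallel_sort(docs, sort_key="a_i", part_key="id", num_workers=4)
    print(result[0]["a_i"], result[-1]["a_i"])
    print(estimated_throughput(20))
```

Each worker only holds its own partition in memory, which is why adding
workers helps with both speed and memory pressure on large result sets.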



Joel Bernstein
http://joelsolr.blogspot.com/

On Thu, Jun 30, 2016 at 3:43 PM, tedsolr <tsm...@sciquest.com> wrote:

> I've read about the sort stream in v6.1 but it appears to me to break the
> streaming design. If it has to read all the results into memory then it's
> not streaming. Sounds like it could be slow and memory intensive for very
> large result sets. Has anyone had good results with the sort stream when
> there are 10M+ docs returned?
>
>
>
> --
> View this message in context:
> http://lucene.472066.n3.nabble.com/Specify-sorting-of-merged-streams-tp4285026p4285202.html
> Sent from the Solr - User mailing list archive at Nabble.com.
>
