Thanks, Yonik, that makes great sense.

My understanding of "many parts of Solr can already stream" is that not all 
sets of SearchHandler parameters are equal.  One set may be best for classic 
<1 second web search, while another may be best for returning just computed 
analytic facets over 10k rows or more.

I now understand StreamHandler and its relation to JDBC, and I am staying 
away from it for now because it doesn't fit my application.

-----Original Message-----
From: Yonik Seeley [mailto:ysee...@gmail.com] 
Sent: Tuesday, April 19, 2016 5:18 PM
To: solr-user@lucene.apache.org
Subject: Re: Streaming with facets

Part of the difficulty is that "stream" and "streaming" are rather overloaded 
terms.  Many parts of Solr can already stream, with varying degrees of how much 
state is aggregated / internally collected before "streaming" starts.

Faceting can be truly streamed *if* the sort order is by the bucket value 
ascending, since that is the order in which terms are stored in the Lucene 
index.  All of the rest of the bucket information can be computed on the fly 
as it is being sent out.  This is what the JSON Facet API does when 
method="stream".
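
As a rough illustration of the request described above, here is a minimal Python sketch that builds a JSON Request API body for a streamed terms facet. The field name "cat", the facet label "by_cat", and the helper name are hypothetical, chosen only for the example; the sketch builds and prints the body and does not contact a Solr server.

```python
import json


def build_stream_facet_request(field, query="*:*"):
    """Build a request body asking for a terms facet with method "stream".

    Sorting by "index asc" (bucket value ascending) is the order in which
    streaming is possible, as described in the message above.
    """
    return {
        "query": query,
        "limit": 0,  # return no documents, only facet buckets
        "facet": {
            "by_" + field: {  # hypothetical facet label
                "type": "terms",
                "field": field,
                "method": "stream",
                "sort": "index asc",
                "limit": -1,  # no cap on the number of buckets
            }
        },
    }


# "cat" is an assumed field name, not one taken from this thread.
body = build_stream_facet_request("cat")
print(json.dumps(body, indent=2))
```

The printed JSON could then be sent as the body of a POST to a collection's query endpoint with whatever HTTP client is at hand.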

We could extend the current facet streaming for other sorts... this would 
involve calculating & sorting the sort criteria first, and then streaming after 
that point (i.e. other metrics would be calculated on-the-fly as facet buckets 
are being streamed).
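
The two-phase idea in the paragraph above can be sketched in a few lines of Python. This is a toy model only, not Solr's implementation: the in-memory postings map, the price table, and all function names are assumptions made for illustration. The sort criterion (here, the bucket count) is computed and sorted up front, and the remaining metric is computed lazily as each bucket is emitted.

```python
from typing import Callable, Dict, Iterator, List, Set, Tuple

Postings = Dict[str, List[int]]  # term -> ids of documents containing it


def bucket_count(postings: Postings, term: str, matching: Set[int]) -> int:
    """Count the documents in a term's bucket that match the query."""
    return sum(1 for doc_id in postings[term] if doc_id in matching)


def stream_sorted_by_count(
    postings: Postings,
    matching: Set[int],
    metric: Callable[[str], float],
) -> Iterator[Tuple[str, int, float]]:
    # Phase 1 (not streamed): compute the sort criterion for every bucket
    # and sort by count descending, breaking ties by term.
    ranked = sorted(
        ((term, bucket_count(postings, term, matching)) for term in postings),
        key=lambda tc: (-tc[1], tc[0]),
    )
    # Phase 2 (streamed): the other metric is computed only when the
    # bucket is about to be sent out.
    for term, count in ranked:
        if count > 0:
            yield term, count, metric(term)


# Toy data, invented for the example.
postings = {"apple": [1, 2, 3], "banana": [2], "cherry": [3, 4]}
matching = {2, 3, 4}
prices = {1: 1.0, 2: 2.0, 3: 3.0, 4: 4.0}


def avg_price(term: str) -> float:
    """Average price over the matching documents in a bucket."""
    docs = [d for d in postings[term] if d in matching]
    return sum(prices[d] for d in docs) / len(docs)


for bucket in stream_sorted_by_count(postings, matching, avg_price):
    print(bucket)
```

The cost of phase 1 is proportional to the number of buckets, which is the state that has to be collected before streaming can start under a sort other than bucket value ascending.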

-Yonik


On Tue, Apr 19, 2016 at 4:48 PM, Davis, Daniel (NIH/NLM) [C] 
<daniel.da...@nih.gov> wrote:
> So, can someone clarify how faceting works with streaming expressions?
>
> I can see how document search can return documents as it finds them, using 
> any particular ordering desired - just a parse tree of query operators with 
> priority queues (or something more complicated) within each query operator, 
> so you really get the best match as you go for as long as you continue.
>
> For facet values, without knowing Solr's internals, my intuition is that Solr 
> could stream unique facet values, but not counts of matching documents.
>
> Even when I put on my user hat - I don't see how the Streaming API can return 
> both facet values and documents, it looks like it is either documents or 
> facet values as results.
>
> Dan Davis, Systems/Applications Architect (Contractor), Office of 
> Computer and Communications Systems, National Library of Medicine, NIH
>
