https://issues.apache.org/jira/browse/SOLR-8492 shows an example of the
AnalyticsQuery where the merge is being handled by the Streaming API. I
actually think this is nicer than using MergeStrategy. The Streaming
API gives you full control over the merge from the shards.
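
To make the per-shard accumulate-then-merge idea concrete, here is a minimal
sketch in plain Java. It does not use Solr's MergeStrategy, AnalyticsQuery, or
Streaming API. LinearCountingSketch and ShardMergeDemo are hypothetical names,
and linear counting is only a simple stand-in for whatever probabilistic
structure you pick (HLL, for example). Each shard fills its own bitmap, the
merge is a bitwise OR, and only the final integer goes back to the user.

```java
import java.util.BitSet;

/** Toy linear-counting sketch (hypothetical, not a Solr class). */
final class LinearCountingSketch {
    private final int numBits;   // bitmap size, must be a power of two
    private final BitSet bits;

    LinearCountingSketch(int numBits) {
        if (numBits <= 0 || Integer.bitCount(numBits) != 1) {
            throw new IllegalArgumentException("numBits must be a power of two");
        }
        this.numBits = numBits;
        this.bits = new BitSet(numBits);
    }

    /** Per-shard step: hash the value and set one bit in the bitmap. */
    void add(String value) {
        bits.set(mix(value.hashCode()) & (numBits - 1));
    }

    /** Merge step: the union of two sketches is the OR of their bitmaps. */
    void merge(LinearCountingSketch other) {
        if (other.numBits != numBits) {
            throw new IllegalArgumentException("sketch sizes differ");
        }
        bits.or(other.bits);
    }

    /** Final step: turn the bitmap into one integer, n ~ -m * ln(zeros / m). */
    long estimate() {
        int zeros = numBits - bits.cardinality();
        if (zeros == 0) {
            return numBits; // saturated bitmap, the estimate is unreliable
        }
        return Math.round(-numBits * Math.log((double) zeros / numBits));
    }

    /** 32-bit finalizer from MurmurHash3, used to spread String.hashCode(). */
    private static int mix(int h) {
        h ^= h >>> 16;
        h *= 0x85ebca6b;
        h ^= h >>> 13;
        h *= 0xc2b2ae35;
        h ^= h >>> 16;
        return h;
    }
}

final class ShardMergeDemo {
    /** Simulates one shard that saw the ids user-from .. user-(to - 1). */
    static LinearCountingSketch shard(int from, int to) {
        LinearCountingSketch sketch = new LinearCountingSketch(1 << 16);
        for (int i = from; i < to; i++) {
            sketch.add("user-" + i);
        }
        return sketch;
    }

    public static void main(String[] args) {
        // Two shards whose id ranges overlap by 500 values.
        LinearCountingSketch merged = shard(0, 1000);
        merged.merge(shard(500, 1500));
        System.out.println("estimated distinct users: " + merged.estimate());
    }
}
```

Because the merge is a plain OR, merging the shard sketches gives exactly the
same bitmap as building one sketch over all the data, so the overlap between
shards is not double counted.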

Joel Bernstein
http://joelsolr.blogspot.com/

On Wed, Mar 16, 2016 at 12:49 PM, sudsport s <sudssf2...@gmail.com> wrote:

> Hi,
>
> I am planning to write a custom aggregator in Solr that will use
> probabilistic data structures on each shard to accumulate results. After
> the shard results are merged, the result will be sent to the user as an
> integer.
>
> I explored 2 options to do this:
>
> 1. Solr analytics API
> https://cwiki.apache.org/confluence/display/solr/AnalyticsQuery+API
>
> I can implement a merge strategy and a post filter to perform the
> aggregation. I have an example working this way, but I am not sure whether
> it is OK to pass objects larger than 1 MB in the shard response.
> Does Solr use javabin serialization to optimize data gathering from shards?
> The leader shard would then collect these 1 MB probabilistic data
> structures and produce a count to be included in the response.
>
>
> 2. JSON Facet API  http://yonik.com/json-facet-api/
>
> After looking at
>
> https://github.com/apache/lucene-solr/tree/master/solr/core/src/java/org/apache/solr/search/facet
>
> FacetProcessor.java seems very similar to the Solr analytics API.
> Merging seems to happen in a similar way: the shard responses include
> objects such as HLL sketches, which are then merged.
>
> One key difference is that the Solr analytics API is based on a post
> filter while the JSON Facet API is based on ValueSource, but I don't
> understand the impact of using one or the other.
>
>
> Can someone help me out?
>
