Hi,

I am planning to write a custom aggregator in Solr that uses a
probabilistic data structure on each shard to accumulate results. After
the shard results are merged, the final result is sent to the user as an integer.
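
To make the idea concrete, here is a minimal sketch of the per-shard
accumulate / merge / finalize flow I have in mind. It is plain Java with no
Solr classes. The class name LinearCountingSketch, the linear-counting
bitmap, and the hash function are stand-ins I made up for illustration;
they are not the structure I will actually use and not any Solr API.

```java
import java.nio.charset.StandardCharsets;
import java.util.BitSet;

// Hypothetical stand-in for the real probabilistic structure: a
// linear-counting bitmap. Nothing in this class is a Solr API.
class LinearCountingSketch {
    private final int numBits;
    private final BitSet bits;

    LinearCountingSketch(int numBits) {
        this(numBits, new BitSet(numBits));
    }

    private LinearCountingSketch(int numBits, BitSet bits) {
        this.numBits = numBits;
        this.bits = bits;
    }

    // Shard side: hash the value and set one bit.
    void add(String value) {
        bits.set((int) Math.floorMod(hash64(value), (long) numBits));
    }

    // Merge side: the union of two sketches is the bitwise OR of their bitmaps.
    void merge(LinearCountingSketch other) {
        if (other.numBits != numBits) {
            throw new IllegalArgumentException("sketch sizes differ");
        }
        bits.or(other.bits);
    }

    // Final step: linear-counting estimate, n ~ -m * ln(zeroBits / m).
    long estimate() {
        int zeroBits = numBits - bits.cardinality();
        if (zeroBits == 0) {
            return Long.MAX_VALUE; // bitmap saturated, no usable estimate
        }
        return Math.round(-numBits * Math.log((double) zeroBits / numBits));
    }

    // What a shard would put in its response (roughly numBits / 8 bytes at most).
    byte[] toBytes() {
        return bits.toByteArray();
    }

    static LinearCountingSketch fromBytes(int numBits, byte[] data) {
        return new LinearCountingSketch(numBits, BitSet.valueOf(data));
    }

    // FNV-1a over the UTF-8 bytes, followed by the MurmurHash3 64-bit finalizer.
    private static long hash64(String value) {
        long h = 0xcbf29ce484222325L;
        for (byte b : value.getBytes(StandardCharsets.UTF_8)) {
            h ^= (b & 0xff);
            h *= 0x100000001b3L;
        }
        h ^= h >>> 33;
        h *= 0xff51afd7ed558ccdL;
        h ^= h >>> 33;
        h *= 0xc4ceb9fe1a85ec53L;
        h ^= h >>> 33;
        return h;
    }

    public static void main(String[] args) {
        int numBits = 1 << 17; // 16 KB per shard in this toy example
        LinearCountingSketch shard1 = new LinearCountingSketch(numBits);
        LinearCountingSketch shard2 = new LinearCountingSketch(numBits);
        for (int i = 0; i < 6000; i++) shard1.add("doc-" + i);
        for (int i = 4000; i < 10000; i++) shard2.add("doc-" + i);

        // Simulate shipping each shard's bytes to the node that merges them.
        LinearCountingSketch merged = fromBytes(numBits, shard1.toBytes());
        merged.merge(fromBytes(numBits, shard2.toBytes()));
        System.out.println("estimated distinct values: " + merged.estimate());
    }
}
```

In this toy version each shard ships about 16 KB. The structure I actually
want to use would be around 1 MB per shard, which is what prompts the
questions below.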

I explored two options for doing this:

1. Solr analytics API
https://cwiki.apache.org/confluence/display/solr/AnalyticsQuery+API

I can implement a merge strategy and a post filter to perform the
aggregation, and I have an example working this way. However, I am not sure
whether it is OK to pass objects larger than 1 MB in a shard response.
Does Solr use javabin serialization to optimize gathering data from the shards?
The leader shard would then collect these 1 MB probabilistic data structures
and produce the count that is included in the response.


2. JSON Facet API  http://yonik.com/json-facet-api/

After looking at
https://github.com/apache/lucene-solr/tree/master/solr/core/src/java/org/apache/solr/search/facet

FacetProcessor.java seems very similar to the Solr analytics API.
Merging seems to happen in a similar way: the shard responses include
objects such as an HLL, which are then merged.





One key difference is that the Solr analytics API is based on PostFilter
while the JSON Facet API is based on ValueSource, but I don't understand the
impact of using one or the other.


Can someone help me out?
