Re: Huge Facets and Streaming

2017-08-21 Thread Joel Bernstein
The current approach for high-cardinality aggregations is the MapReduce approach: parallel(rollup(search())). But what Yonik describes would be much more efficient. Joel Bernstein http://joelsolr.blogspot.com/ On Mon, Aug 21, 2017 at 3:44 PM, Mikhail Khludnev wrote: > Thanks for sharing this …
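The parallel(rollup(search())) pipeline Joel mentions might look roughly like the following streaming expression. This is a hedged sketch only: collection1, cat_field, the worker count, and the zkHost value are illustrative placeholders, not details from the thread.

```
parallel(collection1,
  rollup(
    search(collection1,
           q="*:*",
           qt="/export",
           fl="cat_field",
           sort="cat_field asc",
           partitionKeys="cat_field"),
    over="cat_field",
    count(*)),
  workers="6",
  sort="cat_field asc",
  zkHost="localhost:9983")
```

The idea is that each worker streams sorted field values from the /export handler and rolls up its own partial counts, so the aggregation cost is spread across workers rather than concentrated on a single merging node.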

Re: Huge Facets and Streaming

2017-08-21 Thread Mikhail Khludnev
Thanks for sharing this idea, Yonik! I've raised https://issues.apache.org/jira/browse/SOLR-11271. On Mon, Aug 21, 2017 at 4:00 PM, Yonik Seeley wrote: > On Mon, Aug 21, 2017 at 6:01 AM, Mikhail Khludnev wrote: > > Hello! > > > > I need to count a really wide facet on a 30-shard index with roughly 100M docs …

Re: Huge Facets and Streaming

2017-08-21 Thread Yonik Seeley
On Mon, Aug 21, 2017 at 6:01 AM, Mikhail Khludnev wrote: > Hello! > > I need to count a really wide facet on a 30-shard index with roughly 100M > docs; the facet response is about 100M values and takes 0.5 GB in a text file. > > So far I have experimented with the old facets. They calculate per-shard facets > fine, but …

Huge Facets and Streaming

2017-08-21 Thread Mikhail Khludnev
Hello! I need to count a really wide facet on a 30-shard index with roughly 100M docs; the facet response is about 100M values and takes 0.5 GB as a text file. So far I have experimented with the old facets. They calculate per-shard facets fine, but then the node which attempts to merge the 30 responses fails due to
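For reference, the "old facets" request Mikhail describes would be along these lines (a sketch; the host, collection, and field names are placeholders — only the facet parameters themselves are standard Solr):

```
http://host:8983/solr/collection1/select?q=*:*&rows=0
    &facet=true
    &facet.field=cat_field
    &facet.limit=-1
    &facet.mincount=1
```

With facet.limit=-1, every distinct value is returned, so each of the 30 per-shard responses can itself contain up to ~100M terms, and the single coordinating node has to hold and merge all of them in memory — which is where the failure he describes occurs.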