[
https://issues.apache.org/jira/browse/SOLR-14470?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Andrzej Bialecki resolved SOLR-14470.
-------------------------------------
Resolution: Fixed
> Add streaming expressions to /export handler
> --------------------------------------------
>
> Key: SOLR-14470
> URL: https://issues.apache.org/jira/browse/SOLR-14470
> Project: Solr
> Issue Type: Improvement
> Components: Export Writer, streaming expressions
> Reporter: Andrzej Bialecki
> Assignee: Andrzej Bialecki
> Priority: Major
> Fix For: 8.6
>
> Time Spent: 1h 20m
> Remaining Estimate: 0h
>
> Many streaming scenarios would greatly benefit from the ability to perform
> partial rollups (or other transformations) as early as possible, in order to
> minimize the amount of data that has to be sent from shards to the
> aggregating node.
> This can be implemented as a subset of streaming expressions that process the
> data directly inside each local {{ExportHandler}} and output only the
> records from the resulting stream.
> Conceptually it would be similar to the way a Hadoop {{Combiner}} works. As is
> the case with {{Combiner}}, because the input data is processed in batches
> there would be no guarantee that only one record per unique sort value would
> be emitted - in fact, in most cases multiple partial aggregations would be
> emitted. Still, in many scenarios this would allow reducing the amount of
> data to be sent by several orders of magnitude.
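
The batching caveat in the description can be illustrated with a small sketch. This is not Solr code and does not use the streaming expression API; the function names (partial_rollups, merge_rollups), the field names ("id", "n") and the batch size are hypothetical, chosen only to show why a shard may emit several partial aggregations for one sort value and why the aggregating node still has to merge them.

```python
from itertools import islice


def partial_rollups(records, key, value, batch_size):
    """Yield per-batch partial sums, the way a combiner on one shard might.

    Each batch is rolled up independently, so a key that spans two
    batches produces two partial records.
    """
    it = iter(records)
    while True:
        batch = list(islice(it, batch_size))
        if not batch:
            return
        sums = {}
        for rec in batch:
            sums[rec[key]] = sums.get(rec[key], 0) + rec[value]
        for k, total in sums.items():
            yield {key: k, "sum": total}


def merge_rollups(partials, key):
    """Final merge on the aggregating node: combine partial sums per key."""
    totals = {}
    for p in partials:
        totals[p[key]] = totals.get(p[key], 0) + p["sum"]
    return totals


# Hypothetical shard data, already sorted by "id" as an export would be.
shard = [
    {"id": "a", "n": 1}, {"id": "a", "n": 2}, {"id": "a", "n": 3},
    {"id": "b", "n": 4}, {"id": "b", "n": 5},
]

partials = list(partial_rollups(shard, "id", "n", batch_size=2))
totals = merge_rollups(partials, "id")
```

With a batch size of 2, the key "a" straddles a batch boundary, so the shard emits two partial records for it. The data sent is still smaller than the raw input, and the final merge recovers the exact totals.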