[ https://issues.apache.org/jira/browse/SOLR-14470?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Andrzej Bialecki resolved SOLR-14470.
-------------------------------------
    Resolution: Fixed

> Add streaming expressions to /export handler
> --------------------------------------------
>
>                 Key: SOLR-14470
>                 URL: https://issues.apache.org/jira/browse/SOLR-14470
>             Project: Solr
>          Issue Type: Improvement
>          Components: Export Writer, streaming expressions
>            Reporter: Andrzej Bialecki
>            Assignee: Andrzej Bialecki
>            Priority: Major
>             Fix For: 8.6
>
>          Time Spent: 1h 20m
>  Remaining Estimate: 0h
>
> Many streaming scenarios would greatly benefit from the ability to perform
> partial rollups (or other transformations) as early as possible, in order to
> minimize the amount of data that has to be sent from shards to the
> aggregating node.
> This can be implemented as a subset of streaming expressions that process the
> data directly inside each local {{ExportHandler}} and output only the
> records from the resulting stream.
> Conceptually it would be similar to the way a Hadoop {{Combiner}} works. As is
> the case with {{Combiner}}, because the input data is processed in batches
> there would be no guarantee that only one record per unique combination of
> sort values would be emitted. In fact, in most cases multiple partial
> aggregations would be emitted for the same sort values. Still, in many
> scenarios this would allow reducing the amount of data to be sent by several
> orders of magnitude.
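
The combiner-style behaviour described in the issue can be illustrated with a minimal sketch. This is not Solr code and does not use any Solr API: `partial_rollup`, `final_rollup`, and `batch_size` are hypothetical names, and a sum over `(sort_key, value)` pairs stands in for whatever rollup the streaming expression would perform. It only shows why batch-wise processing on a shard can emit several partial aggregates for the same key, and why the aggregating node still has to merge them.

```python
from itertools import islice


def batches(records, batch_size):
    """Yield consecutive lists of at most batch_size records."""
    it = iter(records)
    while True:
        batch = list(islice(it, batch_size))
        if not batch:
            return
        yield batch


def partial_rollup(records, batch_size):
    """Shard side (hypothetical): sum values per key within each batch.

    A key that spans a batch boundary is emitted once per batch,
    so the output may contain several partial sums for the same key.
    """
    for batch in batches(records, batch_size):
        sums = {}
        for key, value in batch:
            sums[key] = sums.get(key, 0) + value
        yield from sums.items()


def final_rollup(partials):
    """Aggregator side (hypothetical): merge partial sums into totals."""
    totals = {}
    for key, value in partials:
        totals[key] = totals.get(key, 0) + value
    return totals


# Records as one shard might export them, already sorted by key.
shard_docs = [
    ("a", 1), ("a", 2), ("a", 3), ("b", 4),
    ("b", 5), ("b", 6), ("b", 7), ("c", 8),
]

partials = list(partial_rollup(shard_docs, batch_size=4))
totals = final_rollup(partials)

print(partials)  # [('a', 6), ('b', 4), ('b', 18), ('c', 8)]
print(totals)    # {'a': 6, 'b': 22, 'c': 8}
```

In this toy run the shard sends 4 records instead of 8, and key `b` appears twice because it straddles the batch boundary, which is the "multiple partial aggregations" case the description mentions.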