rdblue commented on code in PR #6622:
URL: https://github.com/apache/iceberg/pull/6622#discussion_r1103898014


##########
spark/v3.3/spark/src/main/java/org/apache/iceberg/spark/source/SparkScanBuilder.java:
##########
@@ -193,6 +337,19 @@ private Schema schemaWithMetadataColumns() {
 
   @Override
   public Scan build() {
+    // if aggregates are pushed down, instead of constructing a SparkBatchQueryScan, creating file
+    // read tasks and sending over the tasks to Spark executors, a SparkLocalScan will be created
+    // and the scan is done locally on the Spark driver instead of the executors. The statistics
+    // info will be retrieved from manifest file and used to build a Spark internal row, which
+    // contains the pushed down aggregate values.
+    if (pushedAggregateRows != null) {

Review Comment:
   I think it would be slightly better to create the scan in the aggregation
   methods. Then this check could be `if (localScan != null) { return localScan; }`,
   which is a bit more generic.
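
   To illustrate the shape of that suggestion, here is a minimal, self-contained sketch. It is not the actual `SparkScanBuilder` code: `Scan`, `LocalScan`, `BatchScan`, `SketchScanBuilder`, and `pushAggregation` below are hypothetical stand-ins, and only the `localScan` field name and the null check come from the comment above. The idea is that the aggregation method builds the local scan as soon as push-down succeeds, so `build()` only has to check whether one exists.

```java
import java.util.List;

public class LocalScanSketch {
  /** Hypothetical stand-in for a scan interface; not Spark's real API. */
  interface Scan {
    String description();
  }

  /** Hypothetical driver-side scan that already holds the aggregate values. */
  static final class LocalScan implements Scan {
    private final List<?> aggregateRow;

    LocalScan(List<?> aggregateRow) {
      this.aggregateRow = aggregateRow;
    }

    @Override
    public String description() {
      return "local scan over " + aggregateRow;
    }
  }

  /** Hypothetical executor-side scan used when nothing was pushed down. */
  static final class BatchScan implements Scan {
    @Override
    public String description() {
      return "batch scan on executors";
    }
  }

  /** Hypothetical builder showing where the local scan gets created. */
  static final class SketchScanBuilder {
    // Stays null unless an aggregation was pushed down successfully.
    private Scan localScan = null;

    /** Assumed aggregation hook: creates the scan here instead of in build(). */
    boolean pushAggregation(List<?> aggregateValues) {
      if (aggregateValues == null || aggregateValues.isEmpty()) {
        return false; // nothing to push down; keep the regular scan path
      }
      this.localScan = new LocalScan(aggregateValues);
      return true;
    }

    Scan build() {
      // The generic check from the review comment.
      if (localScan != null) {
        return localScan;
      }
      return new BatchScan();
    }
  }

  public static void main(String[] args) {
    SketchScanBuilder builder = new SketchScanBuilder();
    System.out.println(builder.build().description());

    builder.pushAggregation(List.of(42L, 7L));
    System.out.println(builder.build().description());
  }
}
```

   With this arrangement `build()` does not need to know why a local scan exists, so any other code path that can answer a query on the driver could set the same field.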



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@iceberg.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org

