xiaoxuandev opened a new pull request, #9457: URL: https://github.com/apache/iceberg/pull/9457
### Notes
Support min/max/count aggregate push down for partition columns.
- Min/max/count aggregate push down does not work when partition columns are not present as data columns, because their stats are not present in the Avro files. Even though the aggregate has been pushed down to the data source, `AggregateEvaluator` fails and the query still goes through a full table scan.
- Add support by updating the evaluator based on `PartitionData`.

### Testing
Created a Hive table with `CREATE EXTERNAL TABLE store_sales (id int, data INT) PARTITIONED BY (ss_sold_date_sk INT)`, then registered it as an Iceberg table.

Tested on Spark 3.5: verified that count/min/max are pushed down successfully, and that simple queries (`select count(ss_sold_date_sk) from store_sales`, `select min(ss_sold_date_sk) from store_sales` and `select max(ss_sold_date_sk) from store_sales`) are faster with the change.
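
To illustrate the idea in the Notes, here is a minimal Python sketch of answering min/max/count for a partition column from per-file metadata alone, with no data scan. It is not the PR's code and does not use Iceberg's API: `FileEntry` and `aggregate_from_partitions` are hypothetical names, and the sketch assumes an identity partition transform, no delete files, and that each file's partition value and record count are available from metadata.

```python
from dataclasses import dataclass
from typing import Iterable, Optional


@dataclass(frozen=True)
class FileEntry:
    """Hypothetical stand-in for one data file's metadata entry."""
    partition_value: Optional[int]  # identity-partition value; None for the null partition
    record_count: int               # number of rows in the file


def aggregate_from_partitions(files: Iterable[FileEntry]) -> dict:
    """Compute count/min/max of the partition column without reading any rows.

    Every row in a file shares the file's partition value, so:
      - count(col) is the sum of record counts over files with a non-null value
      - min(col) / max(col) are the min / max over those files' partition values
    """
    count = 0
    lo: Optional[int] = None
    hi: Optional[int] = None
    for f in files:
        # Null partitions and empty files contribute nothing to any aggregate.
        if f.partition_value is None or f.record_count == 0:
            continue
        count += f.record_count
        lo = f.partition_value if lo is None else min(lo, f.partition_value)
        hi = f.partition_value if hi is None else max(hi, f.partition_value)
    return {"count": count, "min": lo, "max": hi}


if __name__ == "__main__":
    sample = [
        FileEntry(partition_value=20240101, record_count=10),
        FileEntry(partition_value=20240103, record_count=5),
        FileEntry(partition_value=None, record_count=7),
    ]
    print(aggregate_from_partitions(sample))
```

With column stats, the same aggregates would come from per-file lower/upper bounds and value counts. For a partition column that is not stored as a data column, those stats are missing, which is why the sketch falls back to the partition value itself.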