sunchao commented on code in PR #2276:
URL: https://github.com/apache/iceberg/pull/2276#discussion_r1009613399

##########
api/src/main/java/org/apache/iceberg/Scan.java:
##########
@@ -129,6 +129,34 @@ default ThisT select(String... columns) {
    */
   ThisT planWith(ExecutorService executorService);
+  /**
+   * Create a new {@link TableScan} which dictates that when plan tasks via the {@link
+   * #planTasks()}, the scan should preserve partition boundary specified by the provided partition
+   * column names. In other words, the scan will not attempt to combine tasks whose input files have
+   * different partition data w.r.t `columns`.
+   *
+   * @param columns the partition column names to preserve boundary when planning tasks
+   * @return a table scan preserving partition boundary when planning tasks
+   * @throws IllegalArgumentException if any of the input columns is not a partition column, or if
+   * the table is un-partitioned.
+   */
+  ThisT preservePartitions(Collection<String> columns);
+
+  /**
+   * Create a new {@link TableScan} which dictates that when plan tasks via the {@link
+   * #planTasks()}, the scan should preserve partition boundary specified by the provided partition

Review Comment:
   @aokolnychyi Yes, we can do that. In fact, I think this is what we are currently proposing here: https://github.com/apache/spark/pull/38434. cc @huaxingao

--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@iceberg.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org