coderfender commented on code in PR #12824: URL: https://github.com/apache/iceberg/pull/12824#discussion_r2049724630
########## spark/v3.5/spark/src/main/java/org/apache/iceberg/spark/actions/RewriteDataFilesSparkAction.java: ########## @@ -407,15 +409,49 @@ private Builder doExecuteWithPartialProgress( Stream<RewriteFileGroup> toGroupStream( RewriteExecutionContext ctx, Map<StructLike, List<List<FileScanTask>>> groupsByPartition) { - return groupsByPartition.entrySet().stream() + if (maxFilesToRewrite == null) { Review Comment: Good point ! My motivation here is to keep the option as simple and straightforward as possible. Having an upper bound ( I am assuming you meant setting default value of param MAX_FILES_TO_REWRITE as LONG.MAX_VALUE) which would add a side effect to this functionality by limiting the number of file to 2^^63-1 . However unlikely that is, I dont believe optional parameters should interfere in the default behavior and should be isolated for the sake of consistency and clarity. -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: issues-unsubscr...@iceberg.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org --------------------------------------------------------------------- To unsubscribe, e-mail: issues-unsubscr...@iceberg.apache.org For additional commands, e-mail: issues-h...@iceberg.apache.org