coderfender commented on code in PR #12824:
URL: https://github.com/apache/iceberg/pull/12824#discussion_r2049724630


##########
spark/v3.5/spark/src/main/java/org/apache/iceberg/spark/actions/RewriteDataFilesSparkAction.java:
##########
@@ -407,15 +409,49 @@ private Builder doExecuteWithPartialProgress(
 
   Stream<RewriteFileGroup> toGroupStream(
       RewriteExecutionContext ctx, Map<StructLike, List<List<FileScanTask>>> 
groupsByPartition) {
-    return groupsByPartition.entrySet().stream()
+    if (maxFilesToRewrite == null) {

Review Comment:
   Good point ! My motivation here is to keep the option as simple and 
straightforward as possible. Having an upper bound ( I am assuming you meant 
setting default value of  param MAX_FILES_TO_REWRITE  as LONG.MAX_VALUE) which 
would add a side effect to this functionality by limiting  the number of file 
to 2^^63-1 . However unlikely that is, I dont believe optional parameters 
should interfere in the default behavior  and isolate for the sake of 
consistency and clarity. 



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@iceberg.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscr...@iceberg.apache.org
For additional commands, e-mail: issues-h...@iceberg.apache.org

Reply via email to