ruotianwang commented on issue #12076:
URL: https://github.com/apache/iceberg/issues/12076#issuecomment-2616273743

   @manuzhang The `maxCommits` value is read directly from the configuration: 
https://github.com/apache/iceberg/blob/main/spark/v3.5/spark/src/main/java/org/apache/iceberg/spark/actions/RewriteDataFilesSparkAction.java#L458-L460
   
   Regarding this section: 
https://github.com/apache/iceberg/blob/main/spark/v3.5/spark/src/main/java/org/apache/iceberg/spark/actions/RewriteDataFilesSparkAction.java#L354-L375
   
   I think the edge case is when `ctx.totalGroupCount() < maxCommits`
   
   Here is what I saw in our application log:
   ```
   25/01/23 20:24:18 INFO RewriteDataFilesSparkAction: Rewrite Files Ready to be Committed - Rewriting 17929 files (BIN-PACK, file group 1/1, PartitionData{*********} (1/1)) in ********
   25/01/23 20:24:18 INFO BaseMetastoreTableOperations: Refreshing table metadata from new version: s3://*******.metadata.json
   ```
   
   And here is the log from another table's job:
   ```
   25/01/23 03:49:52 INFO RewriteDataFilesSparkAction: Rewrite Files Ready to be Committed - Rewriting 1584 files (BIN-PACK, file group 2/3, PartitionData{********} (2/2)) in *********
   25/01/23 03:49:52 INFO BaseMetastoreTableOperations: Refreshing table metadata from new version: *************.metadata.json
   25/01/23 03:49:57 INFO BaseMetastoreTableOperations: Successfully committed to table ********** in 957 ms
   25/01/23 03:49:57 INFO SnapshotProducer: Committed snapshot 1388669599076500919 (BaseRewriteFiles)
   25/01/23 03:49:57 INFO BaseMetastoreTableOperations: Refreshing table metadata from new version: s3://*******.metadata.json
   ```
   
   You can see that the total number of file groups is 1 and 3, respectively,
even though the partial-progress max commits is configured as 10.
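
To make the edge case concrete, here is a minimal sketch of the arithmetic. It assumes the action derives the number of file groups per commit by ceiling-dividing the total group count by `maxCommits`; that is an assumption about the linked code, not a copy of it, and the class and method names below are hypothetical.

```java
// Hypothetical illustration only; not the Iceberg implementation.
public class GroupsPerCommitSketch {

    // Assumed behavior: ceil(totalGroups / maxCommits), computed with integer math.
    static int groupsPerCommit(int totalGroups, int maxCommits) {
        if (totalGroups <= 0 || maxCommits <= 0) {
            throw new IllegalArgumentException("both arguments must be positive");
        }
        return (totalGroups + maxCommits - 1) / maxCommits;
    }

    // Number of commits needed when each commit carries at most groupsPerCommit groups.
    static int commitsNeeded(int totalGroups, int maxCommits) {
        int perCommit = groupsPerCommit(totalGroups, maxCommits);
        return (totalGroups + perCommit - 1) / perCommit;
    }

    public static void main(String[] args) {
        int maxCommits = 10; // the configured value mentioned above
        // 1 and 3 are the group counts seen in the logs; 25 is an extra example.
        for (int totalGroups : new int[] {1, 3, 25}) {
            System.out.printf(
                "totalGroups=%d maxCommits=%d -> groupsPerCommit=%d, commits=%d%n",
                totalGroups,
                maxCommits,
                groupsPerCommit(totalGroups, maxCommits),
                commitsNeeded(totalGroups, maxCommits));
        }
    }
}
```

Under that assumption, whenever the total group count is below `maxCommits`, each commit holds a single file group, so the number of commits equals the number of groups rather than the configured maximum.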


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@iceberg.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org

