RussellSpitzer opened a new issue, #6367:
URL: https://github.com/apache/iceberg/issues/6367

   ### Apache Iceberg version
   
   1.1.0 (latest release)
   
   ### Query engine
   
   Spark
   
   ### Please describe the bug 🐞
   
   Partial progress currently works as described by the following pseudocode:
   
   
   ```
   Rewrite Job Thread Pool In parallel {
      rewriteFiles for a partition/fileGroup // Datafiles generated here
      add result of rewrite to commit queue 
   }
   
   Commit Thread {
      when enough fileGroups have been rewritten perform a commit // Manifests generated at this point in time
   }
   
   Once in parallel has completed {
      Await termination of the single-threaded commit service (10 minutes or die)
   }
   ```
   
   See
   
https://github.com/apache/iceberg/blob/f5f79a98b5bead5b976378cc2fc45c9454ac7731/spark/v3.2/spark/src/main/java/org/apache/iceberg/spark/actions/RewriteDataFilesSparkAction.java#L350-L357
   
https://github.com/apache/iceberg/blob/f5f79a98b5bead5b976378cc2fc45c9454ac7731/core/src/main/java/org/apache/iceberg/actions/RewriteDataFilesCommitManager.java#L179-L188
   And 
   
https://github.com/apache/iceberg/blob/f5f79a98b5bead5b976378cc2fc45c9454ac7731/core/src/main/java/org/apache/iceberg/actions/RewriteDataFilesCommitManager.java#L228-L240
   
   The original assumption here was that all commits would finish within 10 
minutes of the rewrite completing, since the commit phase should be relatively 
fast and the rewrite phase long. There are a few issues with this. Some users 
may run a very large cluster for the "parallel" phase, which lets them complete 
the rewrites quickly, but the new files require a huge amount of new metadata, 
which in turn requires a large number of new manifest files. 
   
   In one of our internal examples we have a very large partial progress 
rewrite in 10 parts. The rewrites all finish at around the same time, which 
just enqueues the commits to then run in sequence. The timeline looks roughly 
like this (imagine there are only five commit groups):
   
   ```
   All Rewrites Begin
   1/5 of file groups rewritten
   1st Commit Begins
   2/5 of file groups rewritten
   3/5 of file groups rewritten
   4/5 of file groups rewritten
   1st Commit Finishes
   2nd Commit Begins
   5/5 of file groups rewritten
   10 Minute Timer Begins to Finish Commits
   2nd Commit Finishes
   3rd Commit Begins
   // Timeout! 
   ```
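
The timeline above can be reproduced with a small model. This is only a sketch, not the Iceberg code: the class and method names are made up for illustration, and the durations are scaled down from minutes to milliseconds. It queues commits on a single-threaded executor, stops accepting work once the "rewrites" are done, and then waits a bounded time for the queue to drain.

```java
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;

// Simplified model of the failure mode; not Iceberg code. All names are hypothetical.
class CommitTimeoutSketch {

  // Queues `commits` sequential commits on one thread, then waits at most
  // `timeoutMillis` for them, mirroring the "10 minutes or die" wait.
  // Returns true only if every queued commit finished before the timeout.
  static boolean allCommitsFinish(int commits, long commitMillis, long timeoutMillis) {
    ExecutorService commitService = Executors.newSingleThreadExecutor();
    for (int i = 0; i < commits; i++) {
      commitService.submit(() -> {
        try {
          Thread.sleep(commitMillis); // stands in for writing manifests and committing
        } catch (InterruptedException e) {
          Thread.currentThread().interrupt();
        }
      });
    }
    commitService.shutdown(); // all rewrites are done, so no new commits arrive
    try {
      return commitService.awaitTermination(timeoutMillis, TimeUnit.MILLISECONDS);
    } catch (InterruptedException e) {
      Thread.currentThread().interrupt();
      return false;
    } finally {
      commitService.shutdownNow(); // abandon whatever is still queued
    }
  }

  public static void main(String[] args) {
    // Five slow commits queued at once cannot drain inside a short wait.
    System.out.println(allCommitsFinish(5, 200, 300));
    // The same five commits drain easily when each one is fast.
    System.out.println(allCommitsFinish(5, 1, 5000));
  }
}
```

The point of the model is that the wait only starts once the last rewrite finishes, so any commits still queued at that moment have to share a single fixed budget.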
   
   I think the best way to improve this, and to increase the throughput of the 
operation, is to move the actual writing of manifests into the parallel portion 
of the operation. We could probably do this by building our commit groups in 
the service's offer method rather than in the service thread itself; the 
service thread would then only need to check for completed commit groups.
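
A rough sketch of that idea, under stated assumptions: `OfferBatchingSketch`, `pollReady`, and `remainder` are hypothetical names rather than anything in `RewriteDataFilesCommitManager`, file groups are represented as plain strings, and the expensive preparation step is only marked with a comment. The thread calling `offer` closes out a batch as soon as it fills, so the service thread only ever picks up batches that are already complete.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.LinkedBlockingQueue;

// Hypothetical sketch of batching in offer(); not the Iceberg implementation.
class OfferBatchingSketch {
  private final int groupsPerCommit;
  private final List<String> pending = new ArrayList<>();
  private final BlockingQueue<List<String>> readyBatches = new LinkedBlockingQueue<>();

  OfferBatchingSketch(int groupsPerCommit) {
    this.groupsPerCommit = groupsPerCommit;
  }

  // Called from the rewrite threads as each file group finishes.
  void offer(String fileGroup) {
    List<String> batch = null;
    synchronized (pending) {
      pending.add(fileGroup);
      if (pending.size() >= groupsPerCommit) {
        batch = new ArrayList<>(pending);
        pending.clear();
      }
    }
    if (batch != null) {
      // Expensive preparation (for example, writing manifests) would go here,
      // on the offering thread and outside the lock.
      readyBatches.add(batch);
    }
  }

  // Called from the service thread; returns null when no batch is ready yet.
  List<String> pollReady() {
    return readyBatches.poll();
  }

  // Groups that never filled a batch; these would go into a final commit.
  List<String> remainder() {
    synchronized (pending) {
      return new ArrayList<>(pending);
    }
  }

  public static void main(String[] args) {
    OfferBatchingSketch service = new OfferBatchingSketch(2);
    for (int i = 1; i <= 5; i++) {
      service.offer("group-" + i);
    }
    System.out.println(service.pollReady());
    System.out.println(service.pollReady());
    System.out.println(service.remainder());
  }
}
```

Doing the preparation outside the lock is what lets several rewrite threads prepare different batches at the same time; the final commit itself would still be serialized by the service thread.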
   
   
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@iceberg.apache.org.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org

