aokolnychyi opened a new pull request, #8972:
URL: https://github.com/apache/iceberg/pull/8972

   This PR migrates the action for rewriting manifests to use rolling writers.
Currently, we collect all entries in a Spark partition into a list to count
the entries that must be written, and only then decide whether to split them
across multiple manifest files. This is slow because it forces Spark to
materialize every record in the partition before any writing starts. It also
consumes a lot of memory, because the entire Spark partition is held in
memory at once.
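
To illustrate the difference, here is a minimal, language-neutral sketch of the rolling-writer idea in Python. It is not the Iceberg implementation: `RollingWriter`, `rewrite_partition`, and `max_entries_per_file` are hypothetical names, and rolling on an entry count is an assumption made only to keep the example short, since the message above does not say which rolling criterion the PR uses. The point is that entries are consumed one at a time from an iterator, so nothing needs to be counted or buffered up front.

```python
from typing import Iterable, List


class RollingWriter:
    """Hypothetical rolling writer (not Iceberg's API).

    Streams entries into the current output and starts a new one once the
    current output reaches max_entries_per_file.
    """

    def __init__(self, max_entries_per_file: int) -> None:
        if max_entries_per_file <= 0:
            raise ValueError("max_entries_per_file must be positive")
        self.max_entries_per_file = max_entries_per_file
        # Stand-in for finished manifest files. A real writer would flush
        # each file to storage instead of keeping its contents in memory.
        self.closed_files: List[List[str]] = []
        self._current: List[str] = []

    def write(self, entry: str) -> None:
        # Append to the open file and roll over as soon as it is full.
        self._current.append(entry)
        if len(self._current) >= self.max_entries_per_file:
            self._roll()

    def _roll(self) -> None:
        # Close the open file, if it has any entries, and start a new one.
        if self._current:
            self.closed_files.append(self._current)
            self._current = []

    def close(self) -> List[List[str]]:
        # Flush whatever is left in the last, possibly partial, file.
        self._roll()
        return self.closed_files


def rewrite_partition(entries: Iterable[str], max_entries_per_file: int) -> List[List[str]]:
    """Write entries as they arrive; the input is never collected into a list."""
    writer = RollingWriter(max_entries_per_file)
    for entry in entries:
        writer.write(entry)
    return writer.close()
```

Because `rewrite_partition` accepts any iterable, it can be fed a lazy generator, which mirrors how writing can begin before the whole partition has been materialized.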


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@iceberg.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org

