RussellSpitzer commented on issue #12150:
URL: https://github.com/apache/iceberg/issues/12150#issuecomment-2630889126

   These are really questions for a whole book or a series of talks. I 
would recommend checking out the Iceberg Summit videos from last year:
   https://www.youtube.com/playlist?list=PLkifVhhWtccxBSrKFPXOmjAFFEpeYii5K
   
   For short answers:
   
   You should run all of those maintenance operations. The most important for 
most people are Rewrite Metadata and Expire Snapshots. The others are more 
contextual and more expensive to run, so whether to use them depends on your 
usage, imho.
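
   As a rough illustration of the two operations named above, here is a 
minimal sketch that builds the Spark SQL `CALL` statements for them. It 
assumes "Rewrite Metadata" maps to the `rewrite_manifests` procedure, which is 
my reading and not something the comment states. The helper name 
`maintenance_statements`, the catalog and table names, and the 7-day retention 
default are all hypothetical.

```python
from datetime import datetime, timedelta, timezone


def maintenance_statements(catalog, table, retain_days=7, now=None):
    """Build Spark SQL CALL statements for two Iceberg maintenance procedures.

    Hypothetical helper: it only formats strings and never touches a table.
    """
    now = now or datetime.now(timezone.utc)
    # Snapshots older than this cutoff become eligible for expiration.
    cutoff = (now - timedelta(days=retain_days)).strftime("%Y-%m-%d %H:%M:%S")
    return [
        # Assumption: "Rewrite Metadata" refers to compacting manifest files.
        f"CALL {catalog}.system.rewrite_manifests(table => '{table}')",
        # Expire old snapshots so their metadata and unreferenced data files
        # can be cleaned up.
        f"CALL {catalog}.system.expire_snapshots("
        f"table => '{table}', older_than => TIMESTAMP '{cutoff}')",
    ]
```

   Each returned string could then be passed to `spark.sql(...)` in a session 
whose catalog is configured for Iceberg. Note that Spark interprets the 
`TIMESTAMP` literal in the session time zone, so the cutoff may need adjusting.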
   
   The Spark APIs use distributed computing; that's the biggest difference. The 
Java APIs in Iceberg are also much lower level, intended more for users 
building engines or doing custom logic.
   
   MOR (merge-on-read) is faster on write, slower on read. Good for sparse 
deletes.
   COW (copy-on-write) is slower on write, faster on read. Good for dense 
deletes (many deletes in the same file, 30% or more of the file replaced).
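
   The rule of thumb above could be sketched as follows. This is only an 
illustration: the function names and the way the replaced fraction is 
estimated are hypothetical, and the 30% threshold is the rough figure from 
this comment, not a hard limit. The `write.delete.mode`, `write.update.mode` 
and `write.merge.mode` table properties are the ones Iceberg uses to select 
the mode; merge-on-read requires a format v2 table.

```python
def row_level_mode(fraction_of_file_replaced, dense_threshold=0.30):
    """Pick a write mode from the estimated share of each file being replaced.

    Hypothetical helper; dense_threshold mirrors the ~30% rule of thumb.
    """
    if not 0.0 <= fraction_of_file_replaced <= 1.0:
        raise ValueError("fraction_of_file_replaced must be between 0 and 1")
    # Dense changes: rewriting the whole file once is cheaper than reading
    # many delete records on every scan.
    if fraction_of_file_replaced >= dense_threshold:
        return "copy-on-write"
    # Sparse changes: write small delete files and merge them at read time.
    return "merge-on-read"


def set_mode_statement(table, mode):
    """Build an ALTER TABLE statement applying one mode to all row-level ops."""
    props = ", ".join(
        f"'write.{op}.mode' = '{mode}'" for op in ("delete", "update", "merge")
    )
    return f"ALTER TABLE {table} SET TBLPROPERTIES ({props})"
```

   For example, `set_mode_statement("cat.db.t", row_level_mode(0.05))` yields 
a statement that switches the table to merge-on-read. In practice the three 
properties can be set independently if deletes, updates and merges have 
different patterns.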


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@iceberg.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org

