amogh-jahagirdar commented on code in PR #13614:
URL: https://github.com/apache/iceberg/pull/13614#discussion_r2219196348


##########
core/src/main/java/org/apache/iceberg/IncrementalFileCleanup.java:
##########
@@ -50,7 +50,7 @@ class IncrementalFileCleanup extends FileCleanupStrategy {
   @Override
   @SuppressWarnings({"checkstyle:CyclomaticComplexity", "MethodLength"})
   public void cleanFiles(TableMetadata beforeExpiration, TableMetadata 
afterExpiration) {

Review Comment:
   I think there's a bigger question of moving away from incremental cleanup 
and just always doing the reachability analysis in the core cleanup. This is 
what's done in spark and other integrations (which is probably why this issue 
didn't surfaced for quite some time).
   
    If I recall correctly, the original intent for keeping incremental cleanup 
was that at the time branching/tagging was a pretty new introduction and I 
didn't want to regress folks expiration cleanup in the "average" case. However, 
even beyond branching/tagging there are other constructs on table metadata 
(like the ability to cherry pick) that can make incremental cleanup difficult 
to do safely, that maybe in 2.0 we should consider just removing it and always 
doing a reachability analysis? cc @rdblue @RussellSpitzer @stevenzwu 
@aokolnychyi 



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@iceberg.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscr...@iceberg.apache.org
For additional commands, e-mail: issues-h...@iceberg.apache.org

Reply via email to