jasonf20 commented on code in PR #10962:
URL: https://github.com/apache/iceberg/pull/10962#discussion_r1880327776


##########
core/src/main/java/org/apache/iceberg/ManifestFilterManager.java:
##########
@@ -363,6 +363,10 @@ private ManifestFile filterManifest(
   }
 
   private boolean canContainDeletedFiles(ManifestFile manifest, boolean trustManifestReferences) {
+    if (manifest.minSequenceNumber() > 0 && manifest.minSequenceNumber() < minSequenceNumber) {
+      return true;
+    }

Review Comment:
   Perhaps it can be added as a table property. As an anecdote, I have
   encountered tables with frequent updates that have 100K+ inactive delete
   files in their manifests. And since Spark isn't used in this environment,
   performing a cleanup is simple.

   I think the above issue should be addressed, but perhaps we got a bit
   sidetracked in the context of this bug. How would you like to proceed here?
   This is still a valid bug fix. Can you think of a trick to make the test
   reproduce the issue easily without reverting the change to
   `dropDeleteFilesOlderThan`? If not, can we merge it regardless? It's still
   a valid fix.
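
For reference, the early return added in the hunk above can be sketched in isolation. This is only an illustration under stated assumptions: `SequenceNumberCheck`, `mayHoldDroppableDeletes`, and the plain `long` parameters are hypothetical stand-ins for `manifest.minSequenceNumber()` and the manager's `minSequenceNumber` field, not the actual `ManifestFilterManager` code.

```java
// Hypothetical, self-contained stand-in for the check in the hunk above.
// Nothing here is part of the Iceberg API.
public final class SequenceNumberCheck {
  private SequenceNumberCheck() {}

  // Mirrors the added condition: a manifest whose minimum sequence number is
  // positive and below the drop threshold is treated as possibly containing
  // files to delete, so it is not skipped.
  static boolean mayHoldDroppableDeletes(long manifestMinSeq, long dropThreshold) {
    return manifestMinSeq > 0 && manifestMinSeq < dropThreshold;
  }

  public static void main(String[] args) {
    System.out.println(mayHoldDroppableDeletes(3L, 10L));  // below the threshold
    System.out.println(mayHoldDroppableDeletes(0L, 10L));  // not positive, so the check does not apply
    System.out.println(mayHoldDroppableDeletes(10L, 10L)); // equal to the threshold
    System.out.println(mayHoldDroppableDeletes(12L, 10L)); // above the threshold
  }
}
```

Only the first call returns true. In the real method the remaining checks still run when this condition is false, so a false result here does not mean the manifest is skipped.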



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@iceberg.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org

