jasonf20 commented on code in PR #10962: URL: https://github.com/apache/iceberg/pull/10962#discussion_r1864787713
##########
core/src/main/java/org/apache/iceberg/MergingSnapshotProducer.java:
##########
@@ -833,7 +833,17 @@ public List<ManifestFile> apply(TableMetadata base, Snapshot snapshot) {
         filterManager.filterManifests(
             SnapshotUtil.schemaFor(base, targetBranch()),
             snapshot != null ? snapshot.dataManifests(ops.io()) : null);
-    long minDataSequenceNumber =
+
+    long minNewFileSequenceNumber =

Review Comment:
@amogh-jahagirdar Responded [here](https://github.com/apache/iceberg/pull/10962#discussion_r1862492365).

##########
core/src/main/java/org/apache/iceberg/ManifestFilterManager.java:
##########
@@ -363,6 +363,10 @@ private ManifestFile filterManifest(
   }

   private boolean canContainDeletedFiles(ManifestFile manifest, boolean trustManifestReferences) {
+    if (manifest.minSequenceNumber() > 0 && manifest.minSequenceNumber() < minSequenceNumber) {
+      return true;
+    }

Review Comment:
If that's the case, then doesn't the last condition here do nothing, since this check is only reached when one of the earlier checks was already true:

```java
deletePaths.contains(file.location())
    || deleteFiles.contains(file)
    || dropPartitions.contains(file.specId(), file.partition())
    || (isDelete
        && entry.isLive()
        && entry.dataSequenceNumber() > 0
        && entry.dataSequenceNumber() < minSequenceNumber);
```

It seems like `dropDeleteFilesOlderThan` may have no effect anymore (unless `allDeletesReferenceManifests` gets set to false or something similar).

I think not removing by `minSequenceNumber` leaves behind delete files that never get applied to any data files at query time. That's not the end of the world, but it does waste some storage and slightly lengthens scan planning.

Assuming we want to keep this behaviour, perhaps we should just stop calling `dropDeleteFilesOlderThan` in `MergingSnapshotProducer`?

We can try getting a minimal test working by doing an actual delete or something similar from a shared manifest.
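
To make the concern concrete, here is a rough, self-contained sketch of the gating as I understand it. It does not use the real Iceberg classes: `Entry`, `Manifest`, the `hasExplicitDeletes` flag, and the `checkManifestSequence` toggle are all hypothetical stand-ins, and the real `canContainDeletedFiles` looks at more than this. It only shows that the per-entry sequence number condition never runs unless the manifest-level gate lets the manifest through.

```java
import java.util.List;
import java.util.Set;
import java.util.stream.Collectors;

public class DeleteFilterGateSketch {
  // Hypothetical stand-ins for a manifest entry and a delete manifest.
  record Entry(String path, long dataSequenceNumber) {}

  record Manifest(long minSequenceNumber, List<Entry> entries) {}

  // Manifest-level gate. checkManifestSequence toggles the check added in the diff;
  // hasExplicitDeletes stands in for the path/file/partition based checks.
  static boolean canContainDeletedFiles(
      Manifest manifest,
      long minSequenceNumber,
      boolean hasExplicitDeletes,
      boolean checkManifestSequence) {
    if (checkManifestSequence
        && manifest.minSequenceNumber() > 0
        && manifest.minSequenceNumber() < minSequenceNumber) {
      return true;
    }
    return hasExplicitDeletes;
  }

  // Returns the entries that survive filtering.
  static List<Entry> filter(
      Manifest manifest,
      long minSequenceNumber,
      Set<String> deletePaths,
      boolean checkManifestSequence) {
    boolean hasExplicitDeletes = !deletePaths.isEmpty();
    if (!canContainDeletedFiles(
        manifest, minSequenceNumber, hasExplicitDeletes, checkManifestSequence)) {
      // Gate closed: the manifest is kept as-is and the per-entry check below is never reached.
      return manifest.entries();
    }
    return manifest.entries().stream()
        .filter(
            e ->
                !(deletePaths.contains(e.path())
                    || (e.dataSequenceNumber() > 0
                        && e.dataSequenceNumber() < minSequenceNumber)))
        .collect(Collectors.toList());
  }

  public static void main(String[] args) {
    Manifest deletes =
        new Manifest(
            1, List.of(new Entry("old-delete.parquet", 1), new Entry("new-delete.parquet", 5)));

    // No explicit deletes and no manifest-level sequence check: the old delete file survives.
    System.out.println(filter(deletes, 3, Set.of(), false).size()); // 2

    // With the manifest-level sequence check, the old delete file is dropped.
    System.out.println(filter(deletes, 3, Set.of(), true).size()); // 1
  }
}
```

In this toy model, the old delete file is only dropped when either the manifest-level sequence check is present or some unrelated explicit delete happens to open the gate for that manifest.
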
But it seems like `dropDeleteFilesOlderThan` is not exactly doing anything right now.

-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@iceberg.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org