amogh-jahagirdar commented on code in PR #11131: URL: https://github.com/apache/iceberg/pull/11131#discussion_r1779738033
########## core/src/main/java/org/apache/iceberg/ManifestFilterManager.java: ########## @@ -81,6 +81,7 @@ public String partition() { // cache filtered manifests to avoid extra work when commits fail. private final Map<ManifestFile, ManifestFile> filteredManifests = Maps.newConcurrentMap(); + private final Set<String> manifestsWithDeletedFiles = Sets.newConcurrentHashSet(); Review Comment: Opening back up this thread in case we wanted to discuss the field name since I haven't changed that yet:`manifestsWithDeletedFiles` vs `deleteManifests`. This change is keeping track of both data and delete manifests which have entries which are deleted (hence the name `manifestsWithDeletedFiles`). When we go to prune out which manifests *cannot* contain any deleted entries, that's when an additional check on delete manifests is performed since cherrypick overwrite cases make it so that detecting data manifests cannot contain deleted entries not possible. I could also only keep track of specifically the delete manifests and then rename to `deleteManifests` but the advantage that gets lost is the ability to detect if a data manifest has an entry which is deleted, we has to go through the old path. -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: issues-unsubscr...@iceberg.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org --------------------------------------------------------------------- To unsubscribe, e-mail: issues-unsubscr...@iceberg.apache.org For additional commands, e-mail: issues-h...@iceberg.apache.org