RussellSpitzer commented on code in PR #6661: URL: https://github.com/apache/iceberg/pull/6661#discussion_r1132657790
##########
core/src/main/java/org/apache/iceberg/PartitionsTable.java:
##########
@@ -220,21 +257,53 @@ Iterable<Partition> all() {
   static class Partition {
     private final StructLike key;
-    private long recordCount;
-    private int fileCount;
     private int specId;
+    private long dataRecordCount;
+    private int dataFileCount;
+
+    private final Set<DeleteFile> equalityDeleteFiles;
+    private final Set<DeleteFile> positionDeleteFiles;
     Partition(StructLike key) {
       this.key = key;
-      this.recordCount = 0;
-      this.fileCount = 0;
       this.specId = 0;
+      this.dataRecordCount = 0;
+      this.dataFileCount = 0;
+      this.positionDeleteFiles = Sets.newHashSet();
+      this.equalityDeleteFiles = Sets.newHashSet();
+    }
+
+    private void update(FileScanTask task) {

Review Comment:
   I would strongly prefer an approach that either disposes of the sets once the partition info has been constructed, or avoids the set approach altogether. In the current implementation, it looks like we end up keeping the entire set of delete file objects in memory indefinitely.

--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@iceberg.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org

For additional commands, e-mail: issues-h...@iceberg.apache.org
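
To illustrate the first option raised in the review comment (dropping the sets once the partition info is built), here is a minimal sketch. It is not the PR's code and not Iceberg's API: `PartitionSummary`, its methods, and the use of file paths as set keys are all hypothetical stand-ins. It assumes the sets are only needed to deduplicate delete files while scan tasks are being folded in, so they can be collapsed into plain counters afterwards.

```java
import java.util.HashSet;
import java.util.Set;

// Hypothetical stand-in for the Partition class in the diff; not Iceberg's actual API.
final class PartitionSummary {
  private long dataRecordCount = 0L;
  private int dataFileCount = 0;
  private int positionDeleteFileCount = 0;
  private int equalityDeleteFileCount = 0;

  // Working sets used only while the summary is being built. They hold paths
  // (an assumption made for this sketch) rather than full delete-file objects.
  private Set<String> positionDeletePaths = new HashSet<>();
  private Set<String> equalityDeletePaths = new HashSet<>();

  void addDataFile(long recordCount) {
    dataRecordCount += recordCount;
    dataFileCount += 1;
  }

  void addPositionDelete(String path) {
    requireOpen();
    positionDeletePaths.add(path); // duplicates are ignored by the set
  }

  void addEqualityDelete(String path) {
    requireOpen();
    equalityDeletePaths.add(path);
  }

  // Collapses the working sets into counters and releases them, so a finished
  // summary retains only primitive fields.
  void finish() {
    if (isFinished()) {
      return;
    }
    positionDeleteFileCount = positionDeletePaths.size();
    equalityDeleteFileCount = equalityDeletePaths.size();
    positionDeletePaths = null;
    equalityDeletePaths = null;
  }

  boolean isFinished() {
    return positionDeletePaths == null;
  }

  long dataRecordCount() {
    return dataRecordCount;
  }

  int dataFileCount() {
    return dataFileCount;
  }

  int positionDeleteFileCount() {
    return isFinished() ? positionDeleteFileCount : positionDeletePaths.size();
  }

  int equalityDeleteFileCount() {
    return isFinished() ? equalityDeleteFileCount : equalityDeletePaths.size();
  }

  private void requireOpen() {
    if (isFinished()) {
      throw new IllegalStateException("Summary is finished; no more delete files can be added");
    }
  }
}
```

In this sketch a caller would add data and delete files for every scan task of a partition, call `finish()` once, and then read the counters. After `finish()` the sets are eligible for garbage collection, which addresses the memory-retention concern at the cost of making the summary immutable with respect to delete files.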