szehon-ho commented on code in PR #6661:
URL: https://github.com/apache/iceberg/pull/6661#discussion_r1132981630


##########
core/src/main/java/org/apache/iceberg/PartitionsTable.java:
##########
@@ -220,21 +257,53 @@ Iterable<Partition> all() {
 
   static class Partition {
     private final StructLike key;
-    private long recordCount;
-    private int fileCount;
     private int specId;
+    private long dataRecordCount;
+    private int dataFileCount;
+
+    private final Set<DeleteFile> equalityDeleteFiles;
+    private final Set<DeleteFile> positionDeleteFiles;
 
     Partition(StructLike key) {
       this.key = key;
-      this.recordCount = 0;
-      this.fileCount = 0;
       this.specId = 0;
+      this.dataRecordCount = 0;
+      this.dataFileCount = 0;
+      this.positionDeleteFiles = Sets.newHashSet();
+      this.equalityDeleteFiles = Sets.newHashSet();
+    }
+
+    private void update(FileScanTask task) {

Review Comment:
   Actually, I think that approach would be quite expensive, since it needs 
two passes.
   
   Until this whole table is migrated over to some kind of view of the 'files' 
table, probably the only way to do it effectively is to rewrite 
PartitionsTableScan to use the underlying code directly: 
ManifestReader.readDeleteManifest() / ManifestReader.read(), and then iterate 
over the results, instead of going through ManifestGroup.planFiles().
   
   That way, we can iterate through the entries and collect delete files and 
data files in one pass. Any thoughts?
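   
   A rough sketch of the one-pass idea, to show the shape only. Everything 
here is a hypothetical stand-in (the `Entry`, `Content`, and `Stats` types and 
the `collect` method are made up for illustration, with partitions and files 
reduced to strings); it does not call any real Iceberg reader API.

```java
import java.util.HashSet;
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.Set;

// Hypothetical stand-in types; none of these are Iceberg classes.
final class OnePassPartitionStats {

  // Assumed file kinds, for illustration only.
  enum Content { DATA, POSITION_DELETES, EQUALITY_DELETES }

  // One entry read from a data or delete manifest (simplified).
  static final class Entry {
    final String partition;
    final Content content;
    final String path;
    final long recordCount;

    Entry(String partition, Content content, String path, long recordCount) {
      this.partition = partition;
      this.content = content;
      this.path = path;
      this.recordCount = recordCount;
    }
  }

  // Per-partition accumulator, mirroring the fields in the diff above.
  static final class Stats {
    long dataRecordCount = 0;
    int dataFileCount = 0;
    final Set<String> positionDeleteFiles = new HashSet<>();
    final Set<String> equalityDeleteFiles = new HashSet<>();
  }

  // Single pass over all entries: each entry updates exactly one partition.
  static Map<String, Stats> collect(Iterable<Entry> entries) {
    Map<String, Stats> byPartition = new LinkedHashMap<>();
    for (Entry entry : entries) {
      Stats stats = byPartition.computeIfAbsent(entry.partition, k -> new Stats());
      switch (entry.content) {
        case DATA:
          stats.dataFileCount += 1;
          stats.dataRecordCount += entry.recordCount;
          break;
        case POSITION_DELETES:
          // A set keeps a delete file from being counted twice.
          stats.positionDeleteFiles.add(entry.path);
          break;
        case EQUALITY_DELETES:
          stats.equalityDeleteFiles.add(entry.path);
          break;
      }
    }
    return byPartition;
  }
}
```

   Because the delete files come straight from manifest entries rather than 
from each FileScanTask, the same delete file is seen once per manifest entry 
instead of once per data file it applies to.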



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@iceberg.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org

