szehon-ho commented on code in PR #7581:
URL: https://github.com/apache/iceberg/pull/7581#discussion_r1201254518


##########
core/src/main/java/org/apache/iceberg/PartitionsTable.java:
##########
@@ -47,6 +47,16 @@ public class PartitionsTable extends BaseMetadataTable {
         new Schema(
             Types.NestedField.required(1, "partition", 
Partitioning.partitionType(table)),
             Types.NestedField.required(4, "spec_id", Types.IntegerType.get()),
+            Types.NestedField.required(
+                9,
+                "last_updated_at",

Review Comment:
   How about 'last_updated' to be more precise, and match table metadata json?



##########
core/src/main/java/org/apache/iceberg/PartitionsTable.java:
##########
@@ -246,19 +268,27 @@ static class Partition {
     private int posDeleteFileCount;
     private long eqDeleteRecordCount;
     private int eqDeleteFileCount;
+    private long lastUpdatedAt;
+    private long lastUpdatedSnapshotId;
 
     Partition(StructLike key, Types.StructType keyType) {
       this.partitionData = toPartitionData(key, keyType);
       this.specId = 0;
-      this.dataRecordCount = 0;
+      this.dataRecordCount = 0L;
       this.dataFileCount = 0;
-      this.posDeleteRecordCount = 0;
+      this.posDeleteRecordCount = 0L;
       this.posDeleteFileCount = 0;
-      this.eqDeleteRecordCount = 0;
+      this.eqDeleteRecordCount = 0L;
       this.eqDeleteFileCount = 0;
+      this.lastUpdatedAt = 0L;
+      this.lastUpdatedSnapshotId = 0L;
     }
 
-    void update(ContentFile<?> file) {
+    void update(ContentFile<?> file, Snapshot snapshot) {
+      if (snapshot.timestampMillis() * 1000 > this.lastUpdatedAt) {

Review Comment:
   opt: maybe make a variable snapshotCommitTime = snapshot.timestampMillis * 
1000?



##########
core/src/main/java/org/apache/iceberg/PartitionsTable.java:
##########
@@ -155,21 +172,26 @@ private static Iterable<Partition> partitions(Table 
table, StaticTableScan scan)
   }
 
   @VisibleForTesting
-  static CloseableIterable<ContentFile<?>> planFiles(StaticTableScan scan) {
+  static CloseableIterable<ManifestEntry<? extends ContentFile<?>>> 
planEntries(

Review Comment:
   Is it possible to remove `? extends ContentFile` here?



##########
core/src/main/java/org/apache/iceberg/PartitionsTable.java:
##########
@@ -155,21 +172,26 @@ private static Iterable<Partition> partitions(Table 
table, StaticTableScan scan)
   }
 
   @VisibleForTesting
-  static CloseableIterable<ContentFile<?>> planFiles(StaticTableScan scan) {
+  static CloseableIterable<ManifestEntry<? extends ContentFile<?>>> 
planEntries(
+      StaticTableScan scan) {
     Table table = scan.table();
 
     CloseableIterable<ManifestFile> filteredManifests =
         filteredManifests(scan, table, 
scan.snapshot().allManifests(table.io()));
 
-    Iterable<CloseableIterable<ContentFile<?>>> tasks =
+    Iterable<CloseableIterable<ManifestEntry<? extends ContentFile<?>>>> tasks 
=
         CloseableIterable.transform(
             filteredManifests,
             manifest ->
                 CloseableIterable.transform(
                     ManifestFiles.open(manifest, table.io(), table.specs())
                         .caseSensitive(scan.isCaseSensitive())
-                        .select(scanColumns(manifest.content())), // don't 
select stats columns
-                    t -> (ContentFile<?>) t));
+                        .select(scanColumns(manifest.content())) // don't 
select stats columns
+                        .entries(),
+                    t ->

Review Comment:
   maybe worth breaking into new method now?  ie, manifest -> entries(manifest, 
scan)



##########
core/src/main/java/org/apache/iceberg/PartitionsTable.java:
##########
@@ -139,13 +155,14 @@ private static StaticDataTask.Row 
convertPartition(Partition partition) {
   private static Iterable<Partition> partitions(Table table, StaticTableScan 
scan) {
     Types.StructType partitionType = Partitioning.partitionType(table);
     PartitionMap partitions = new PartitionMap(partitionType);
-
-    try (CloseableIterable<ContentFile<?>> files = planFiles(scan)) {
-      for (ContentFile<?> file : files) {
+    try (CloseableIterable<ManifestEntry<? extends ContentFile<?>>> entries = 
planEntries(scan)) {
+      for (ManifestEntry<? extends ContentFile<?>> entry : entries) {
+        Snapshot snapshot = table.snapshot(entry.snapshotId());

Review Comment:
   Hm.. i think it may be possible.  Eg, snapshot1 => add DataFile1, snapshot2 
=> add DataFile2, 
   
   ExpireSnapshot may move Snapshot1, just DataFile1 is not removed.  
   
   Maybe we could make a null here if we don't find a snapshot?



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]


---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]

Reply via email to