aokolnychyi commented on code in PR #9455:
URL: https://github.com/apache/iceberg/pull/9455#discussion_r1475245547


##########
spark/v3.5/spark/src/main/java/org/apache/iceberg/spark/source/SparkTable.java:
##########
@@ -405,15 +407,18 @@ public boolean equals(Object other) {
       return false;
     }
 
-    // use only name in order to correctly invalidate Spark cache
+    // use name only unless branch/snapshotId is given in order to correctly invalidate Spark cache
+    // when branch or snapshotId is given, it's time travel
     SparkTable that = (SparkTable) other;
-    return icebergTable.name().equals(that.icebergTable.name());
+    return icebergTable.name().equals(that.icebergTable.name())
+        && Objects.equals(snapshotId, that.snapshotId);

Review Comment:
   We would need to double-check our caching catalogs and whether `refreshEagerly` has to be part of this comparison.



##########
spark/v3.5/spark/src/main/java/org/apache/iceberg/spark/source/SparkTable.java:
##########
@@ -117,7 +119,7 @@ public class SparkTable
           .build();
 
   private final Table icebergTable;
-  private final Long snapshotId;
+  private Long snapshotId;

Review Comment:
   Do we have to remove the `final` keyword here?



##########
spark/v3.5/spark/src/main/java/org/apache/iceberg/spark/source/SparkTable.java:
##########
@@ -131,12 +133,12 @@ public SparkTable(Table icebergTable, boolean refreshEagerly) {
   public SparkTable(Table icebergTable, String branch, boolean refreshEagerly) {
     this(icebergTable, refreshEagerly);
     this.branch = branch;
+    final Snapshot snapshot = icebergTable.snapshot(branch);
     ValidationException.check(
-        branch == null
-            || SnapshotRef.MAIN_BRANCH.equals(branch)
-            || icebergTable.snapshot(branch) != null,
+        branch == null || SnapshotRef.MAIN_BRANCH.equals(branch) || snapshot != null,
         "Cannot use branch (does not exist): %s",
         branch);
+    this.snapshotId = snapshot.snapshotId();

Review Comment:
   Won't this throw an NPE in some cases, since `snapshot` could be null?
   
   Also, I am not sure this logic is correct. Tables loaded for a particular snapshot ID and for a particular branch may not be logically equal, because more operations could happen to the branch after its initial load.
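   
   To make the NPE concern concrete, here is a minimal, dependency-free sketch. `BranchSnapshotSketch`, its nested `Snapshot` class, `lookup`, and `resolveSnapshotId` are hypothetical stand-ins and not the actual `SparkTable` or Iceberg code. It assumes the lookup returns null for a ref that does not resolve, which is exactly the case where the check passes for `null` or `main` but dereferencing `snapshot` would still fail. It only addresses the null dereference, not the branch-versus-snapshot equality question.
   
   ```java
   import java.util.HashMap;
   import java.util.Map;
   
   // Hypothetical, dependency-free sketch; not the actual SparkTable code.
   final class BranchSnapshotSketch {
     // Mirrors the value of SnapshotRef.MAIN_BRANCH.
     static final String MAIN_BRANCH = "main";
   
     // Stand-in for org.apache.iceberg.Snapshot, reduced to the one accessor used here.
     static final class Snapshot {
       private final long snapshotId;
   
       Snapshot(long snapshotId) {
         this.snapshotId = snapshotId;
       }
   
       long snapshotId() {
         return snapshotId;
       }
     }
   
     // Stand-in for the table lookup; assumed to return null when the ref is unknown.
     static Snapshot lookup(Map<String, Snapshot> refs, String branch) {
       return branch == null ? null : refs.get(branch);
     }
   
     // Same validation as in the diff, followed by a null-safe read of the snapshot ID.
     static Long resolveSnapshotId(Map<String, Snapshot> refs, String branch) {
       Snapshot snapshot = lookup(refs, branch);
       if (branch != null && !MAIN_BRANCH.equals(branch) && snapshot == null) {
         throw new IllegalArgumentException("Cannot use branch (does not exist): " + branch);
       }
       // The check above passes for a null branch or for "main" on a table
       // without snapshots, so snapshot may still be null at this point.
       return snapshot != null ? Long.valueOf(snapshot.snapshotId()) : null;
     }
   
     public static void main(String[] args) {
       Map<String, Snapshot> refs = new HashMap<>();
       refs.put("audit", new Snapshot(42L));
       System.out.println(resolveSnapshotId(refs, "audit"));
       System.out.println(resolveSnapshotId(refs, MAIN_BRANCH));
       System.out.println(resolveSnapshotId(refs, null));
     }
   }
   ```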



##########
spark/v3.5/spark/src/main/java/org/apache/iceberg/spark/source/SparkTable.java:
##########
@@ -405,15 +407,18 @@ public boolean equals(Object other) {
       return false;
     }
 
-    // use only name in order to correctly invalidate Spark cache
+    // use name only unless branch/snapshotId is given in order to correctly invalidate Spark cache
+    // when branch or snapshotId is given, it's time travel
     SparkTable that = (SparkTable) other;
-    return icebergTable.name().equals(that.icebergTable.name());
+    return icebergTable.name().equals(that.icebergTable.name())
+        && Objects.equals(snapshotId, that.snapshotId);

Review Comment:
   An alternative to loading the snapshot ID for a branch could be something like this:
   
   ```
   @Override
   public boolean equals(Object other) {
     if (this == other) {
       return true;
     } else if (other == null || getClass() != other.getClass()) {
       return false;
     }
   
     SparkTable that = (SparkTable) other;
     return icebergTable.name().equals(that.icebergTable.name())
         && normalizedBranch().equals(that.normalizedBranch())
         && Objects.equals(snapshotId, that.snapshotId);
   }
   
   @Override
   public int hashCode() {
     return Objects.hash(icebergTable.name(), normalizedBranch(), snapshotId);
   }
   
   private String normalizedBranch() {
     return branch != null ? branch : SnapshotRef.MAIN_BRANCH;
   }
   ```
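   
   To illustrate how this comparison would behave, here is a small self-contained sketch. `TableKeySketch` is a hypothetical stand-in that carries only the three compared values (table name, branch, snapshot ID); it is not `SparkTable` itself, and the table and branch names are made up for the example.
   
   ```java
   import java.util.Objects;
   
   // Hypothetical stand-in holding only the fields used in the comparison.
   final class TableKeySketch {
     // Mirrors the value of SnapshotRef.MAIN_BRANCH.
     static final String MAIN_BRANCH = "main";
   
     private final String name;
     private final String branch;
     private final Long snapshotId;
   
     TableKeySketch(String name, String branch, Long snapshotId) {
       this.name = name;
       this.branch = branch;
       this.snapshotId = snapshotId;
     }
   
     @Override
     public boolean equals(Object other) {
       if (this == other) {
         return true;
       } else if (other == null || getClass() != other.getClass()) {
         return false;
       }
   
       TableKeySketch that = (TableKeySketch) other;
       return name.equals(that.name)
           && normalizedBranch().equals(that.normalizedBranch())
           && Objects.equals(snapshotId, that.snapshotId);
     }
   
     @Override
     public int hashCode() {
       return Objects.hash(name, normalizedBranch(), snapshotId);
     }
   
     // A null branch is treated as the main branch.
     private String normalizedBranch() {
       return branch != null ? branch : MAIN_BRANCH;
     }
   
     public static void main(String[] args) {
       TableKeySketch plain = new TableKeySketch("db.tbl", null, null);
       TableKeySketch onMain = new TableKeySketch("db.tbl", "main", null);
       TableKeySketch onAudit = new TableKeySketch("db.tbl", "audit", null);
       TableKeySketch atSnapshot = new TableKeySketch("db.tbl", null, 1L);
   
       System.out.println(plain.equals(onMain));
       System.out.println(plain.equals(onAudit));
       System.out.println(plain.equals(atSnapshot));
     }
   }
   ```
   
   With this shape, a table loaded without a branch and one loaded for `main` compare as equal, while a different branch or an explicit snapshot ID yields a distinct entry. No snapshot lookup is needed at construction time.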


