rdblue commented on code in PR #6634:
URL: https://github.com/apache/iceberg/pull/6634#discussion_r1083582534


##########
core/src/main/java/org/apache/iceberg/BaseTransaction.java:
##########
@@ -551,10 +555,19 @@ public void commit(TableMetadata underlyingBase, TableMetadata metadata) {
       }
 
       // track the intermediate snapshot ids for rewriting the snapshot log
-      // an id is intermediate if it isn't the base snapshot id and it is replaced by a new current
-      Long oldId = currentId(current);
-      if (oldId != null && !oldId.equals(currentId(metadata)) && !oldId.equals(currentId(base))) {
-        intermediateSnapshotIds.add(oldId);
+      // an id is intermediate if it isn't the head of the branch in base and it is replaced by a new head of the branch in current

Review Comment:
   I don't think that we need to keep intermediate snapshot IDs like this anymore.
   
   I took a look at this and the intermediate IDs are currently used to ensure that we don't delete files from any committed snapshot. Before we added metadata change tracking to `TableMetadata`, this list was also used to rewrite the history / snapshot log. But that's handled in `TableMetadata.Builder` now, so this is only used for fixing up the deletes.
   
   Fixing up deletes is a much simpler problem. We don't need to know whether a reference was "intermediate" anymore -- that was for the history fixes. So what we need is a list of the new snapshots committed by the transaction to feed into `committedFiles`. We can do that more easily by taking the current set of snapshots and removing the old set of snapshots.
   
   @amogh-jahagirdar, does that make sense? We can do this all in `commitSimpleTransaction` and remove `intermediateSnapshotIds`.
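
To illustrate the set-difference idea, here is a minimal, standalone sketch. It is not the Iceberg implementation: the class `NewSnapshotsSketch` and the method `newSnapshotIds` are hypothetical names, and snapshots are reduced to plain `Long` IDs rather than real snapshot objects. It only shows that "current snapshots minus base snapshots" yields the snapshots added by the transaction, with no notion of "intermediate" needed.

```java
import java.util.Arrays;
import java.util.Collection;
import java.util.LinkedHashSet;
import java.util.Set;

// Hypothetical sketch: snapshots are modeled as bare IDs, not Iceberg objects.
public class NewSnapshotsSketch {

  // Returns the IDs present in the current metadata but absent from the base,
  // i.e. the snapshots that were added by the transaction.
  static Set<Long> newSnapshotIds(Collection<Long> baseIds, Collection<Long> currentIds) {
    // LinkedHashSet keeps the order in which current snapshots were listed.
    Set<Long> added = new LinkedHashSet<>(currentIds);
    added.removeAll(baseIds);
    return added;
  }

  public static void main(String[] args) {
    // Assumed example data: the base table had snapshots 1 and 2,
    // and the transaction committed snapshots 3 and 4 on top of them.
    Collection<Long> baseIds = Arrays.asList(1L, 2L);
    Collection<Long> currentIds = Arrays.asList(1L, 2L, 3L, 4L);

    System.out.println(newSnapshotIds(baseIds, currentIds));
  }
}
```

In the actual change, the resulting set would presumably be what gets passed to `committedFiles`, but how that call is wired up is left to the PR.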



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@iceberg.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org

