amogh-jahagirdar commented on code in PR #10523:
URL: https://github.com/apache/iceberg/pull/10523#discussion_r1676113425


##########
core/src/main/java/org/apache/iceberg/SnapshotProducer.java:
##########
@@ -565,6 +570,10 @@ protected boolean canInheritSnapshotId() {
     return canInheritSnapshotId;
   }
 
+  protected boolean cleanUncommittedAfterCommit() {

Review Comment:
   I think I'd just call this `cleanupAfterCommit`. Ultimately, the only files
that can be cleaned up are the uncommitted ones, so that seems implied to me.
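
   For illustration, a minimal sketch of how the renamed hook could read. The
   `Producer` and `FastAppendLike` classes are hypothetical stand-ins, not the
   real `SnapshotProducer` and `FastAppend`, and the `events` list exists only
   to make the control flow observable.

   ```java
   import java.util.ArrayList;
   import java.util.List;

   public class CleanupHookSketch {
     // Hypothetical stand-in for SnapshotProducer; not the real Iceberg class.
     static class Producer {
       final List<String> events = new ArrayList<>();

       // Hook: subclasses return false to skip cleanup after a successful commit.
       protected boolean cleanupAfterCommit() {
         return true;
       }

       // Stand-in for deleting files that did not end up in the committed snapshot.
       protected void cleanUncommitted() {
         events.add("cleanUncommitted");
       }

       public List<String> commit() {
         events.add("commit");
         if (cleanupAfterCommit()) {
           cleanUncommitted();
         }

         return events;
       }
     }

     // Hypothetical subclass that opts out, as FastAppend does in this PR.
     static class FastAppendLike extends Producer {
       @Override
       protected boolean cleanupAfterCommit() {
         return false;
       }
     }

     public static void main(String[] args) {
       System.out.println(new Producer().commit());
       System.out.println(new FastAppendLike().commit());
     }
   }
   ```

   With the shorter name, the call site reads as "if cleanup after commit is
   enabled, clean the uncommitted files", and "uncommitted" is not repeated.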



##########
core/src/test/java/org/apache/iceberg/TestFastAppend.java:
##########
@@ -317,6 +317,60 @@ public void testRecoveryWithoutManifestList() {
     assertThat(metadata.currentSnapshot().allManifests(FILE_IO)).contains(newManifest);
   }
 
+  @TestTemplate
+  public void testWriteNewManifestsIdempotency() {
+    table.updateProperties().set(TableProperties.MANIFEST_LISTS_ENABLED, "true").commit();

Review Comment:
   Yeah, let's remove it please; it shouldn't matter for these tests.



##########
core/src/main/java/org/apache/iceberg/SnapshotProducer.java:
##########
@@ -423,13 +423,18 @@ public void commit() {
       }
 
       try {
-        LOG.info("Committed snapshot {} ({})", newSnapshotId.get(), getClass().getSimpleName());
+        LOG.info(
+            "Committed snapshot {} ({})",
+            committedSnapshot.get().snapshotId(),
+            getClass().getSimpleName());
 
         // at this point, the commit must have succeeded. after a refresh, the snapshot is loaded by
         // id in case another commit was added between this commit and the refresh.
-        Snapshot saved = ops.refresh().snapshot(newSnapshotId.get());
+        Snapshot saved = committedSnapshot.get();
         if (saved != null) {
-          cleanUncommitted(Sets.newHashSet(saved.allManifests(ops.io())));
+          if (cleanUncommittedAfterCommit()) {
+            cleanUncommitted(Sets.newHashSet(saved.allManifests(ops.io())));
+          }

Review Comment:
   Nit: New line after the if block



##########
core/src/test/java/org/apache/iceberg/TestFastAppend.java:
##########
@@ -324,6 +324,60 @@ public void testRecoveryWithoutManifestList() {
     assertThat(metadata.currentSnapshot().allManifests(FILE_IO)).contains(newManifest);
   }
 
+  @TestTemplate
+  public void testWriteNewManifestsIdempotency() {
+    table.updateProperties().set(TableProperties.MANIFEST_LISTS_ENABLED, "true").commit();
+
+    // inject 3 failures, the last try will succeed
+    TestTables.TestTableOperations ops = table.ops();
+    ops.failCommits(3);
+
+    AppendFiles append = table.newFastAppend().appendFile(FILE_B);
+    Snapshot pending = append.apply();
+    ManifestFile newManifest = pending.allManifests(FILE_IO).get(0);
+    assertThat(new File(newManifest.path())).exists();
+
+    append.commit();
+
+    TableMetadata metadata = readMetadata();
+
+    // contains only a single manifest, does not duplicate manifests on retries
+    validateSnapshot(null, metadata.currentSnapshot(), FILE_B);
+    assertThat(new File(newManifest.path())).exists();
+    assertThat(metadata.currentSnapshot().allManifests(FILE_IO)).contains(newManifest);
+    assertThat(listManifestFiles(tableDir)).containsExactly(new File(newManifest.path()));
+  }
+
+  @TestTemplate
+  public void testWriteNewManifestsCleanup() {
+    table.updateProperties().set(TableProperties.MANIFEST_LISTS_ENABLED, "true").commit();

Review Comment:
   Same as above, we can remove it.



##########
core/src/main/java/org/apache/iceberg/FastAppend.java:
##########
@@ -198,6 +198,14 @@ protected void cleanUncommitted(Set<ManifestFile> committed) {
     }
   }
 
+  @Override
+  protected boolean cleanUncommittedAfterCommit() {
+    // appendManifests are not rewritten, never need cleanup
+    // rewrittenAppendManifests are rewritten in appendManifest, never need cleanup
+    // newManifests are cleaned up in writeNewManifests

Review Comment:
   Nit: I'd make this a method-level comment instead of inlining all this.

   ```
   /**
    * Cleanup after committing is disabled for FastAppend for the following reasons:
    *
    * 1. Appended manifests are never rewritten
    * 2. Manifests which are written out as part of appendFile are cleaned up between commit
    *    attempts in writeNewManifests
    */
   ```
   
   



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@iceberg.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org

