aokolnychyi commented on code in PR #6335:
URL: https://github.com/apache/iceberg/pull/6335#discussion_r1276752392


##########
core/src/main/java/org/apache/iceberg/FastAppend.java:
##########
@@ -178,8 +178,28 @@ public Object updateEvent() {
 
   @Override
   protected void cleanUncommitted(Set<ManifestFile> committed) {
-    if (newManifest != null && !committed.contains(newManifest)) {
-      deleteFile(newManifest.path());
+    if (newManifests != null) {
+      // Delete newManifests that have not been committed and clear them from 
the list.
+      // This is needed for manifests cleanup especially in transaction mode, 
for example:
+      //   Transaction txn = beginTransaction(...)
+      //
+      //   // Operation succeeds and calls cleanUncommitted
+      //   txn.newFastAppend().appendFile(...).commit();
+      //
+      //   // Some other operations ...
+      //
+      //   // Commit fails and needs to clean up newManifests
+      //   txn.commitTransaction()
+      List<ManifestFile> cleanedNewManifests = Lists.newArrayList();

Review Comment:
   Shouldn't this be called `committedNewManifests`? The name suggests it holds 
a set of cleaned up manifests while it is the opposite of that.



##########
core/src/main/java/org/apache/iceberg/FastAppend.java:
##########
@@ -178,8 +178,28 @@ public Object updateEvent() {
 
   @Override
   protected void cleanUncommitted(Set<ManifestFile> committed) {
-    if (newManifest != null && !committed.contains(newManifest)) {
-      deleteFile(newManifest.path());
+    if (newManifests != null) {
+      // Delete newManifests that have not been committed and clear them from 
the list.

Review Comment:
   How useful is this comment given that `cleanUncommitted` has an elaborate 
doc? It does not seem specific to this case but rather a general comment when 
manifests may be cleaned up.
   
   I'd be open to include a few examples (like auto merging and transactions) 
in the parent doc in a separate PR but I am not sure it makes sense here.



##########
core/src/main/java/org/apache/iceberg/MergingSnapshotProducer.java:
##########
@@ -907,9 +907,28 @@ public Object updateEvent() {
   }
 
   private void cleanUncommittedAppends(Set<ManifestFile> committed) {
-    if (cachedNewDataManifest != null && 
!committed.contains(cachedNewDataManifest)) {
-      deleteFile(cachedNewDataManifest.path());
-      this.cachedNewDataManifest = null;
+    if (cachedNewDataManifests != null) {
+      // Delete newManifests that have not been committed and clear them from 
the list.

Review Comment:
   Same comments for this block as in `FastAppend`.



##########
core/src/main/java/org/apache/iceberg/MergingSnapshotProducer.java:
##########
@@ -952,10 +971,8 @@ protected void cleanUncommitted(Set<ManifestFile> 
committed) {
   private Iterable<ManifestFile> prepareNewDataManifests() {
     Iterable<ManifestFile> newManifests;
     if (newDataFiles.size() > 0) {
-      ManifestFile newManifest = newDataFilesAsManifest();
-      newManifests =
-          Iterables.concat(
-              ImmutableList.of(newManifest), appendManifests, 
rewrittenAppendManifests);
+      List<ManifestFile> newManifest = newDataFilesAsManifests();

Review Comment:
   Should this be called something like `dataFileManifests` given that it is a 
list now?



##########
core/src/main/java/org/apache/iceberg/FastAppend.java:
##########
@@ -178,8 +178,28 @@ public Object updateEvent() {
 
   @Override
   protected void cleanUncommitted(Set<ManifestFile> committed) {
-    if (newManifest != null && !committed.contains(newManifest)) {
-      deleteFile(newManifest.path());
+    if (newManifests != null) {
+      // Delete newManifests that have not been committed and clear them from 
the list.
+      // This is needed for manifests cleanup especially in transaction mode, 
for example:
+      //   Transaction txn = beginTransaction(...)
+      //
+      //   // Operation succeeds and calls cleanUncommitted
+      //   txn.newFastAppend().appendFile(...).commit();
+      //
+      //   // Some other operations ...
+      //
+      //   // Commit fails and needs to clean up newManifests
+      //   txn.commitTransaction()
+      List<ManifestFile> cleanedNewManifests = Lists.newArrayList();
+      for (ManifestFile manifestFile : newManifests) {
+        if (!committed.contains(manifestFile)) {

Review Comment:
   What about flipping the if to avoid the negation? Negation is always harder 
to read.
   
   ```
   if (committed.contains(manifest)) {
     committedNewManifests.add(manifest);
   } else {
     deleteFile(manifest.path());
   }
   ```



##########
core/src/main/java/org/apache/iceberg/FastAppend.java:
##########
@@ -143,12 +143,12 @@ private ManifestFile copyManifest(ManifestFile manifest) {
 
   @Override
   public List<ManifestFile> apply(TableMetadata base, Snapshot snapshot) {
-    List<ManifestFile> newManifests = Lists.newArrayList();

Review Comment:
   What about renaming `writeManifests` to `writeNewManifests` to match 
`newManifests` variable and calling this variable simply `manifests` like we do 
in `MergingSnapshotProducer`?



##########
core/src/main/java/org/apache/iceberg/FastAppend.java:
##########
@@ -178,8 +178,28 @@ public Object updateEvent() {
 
   @Override
   protected void cleanUncommitted(Set<ManifestFile> committed) {
-    if (newManifest != null && !committed.contains(newManifest)) {
-      deleteFile(newManifest.path());
+    if (newManifests != null) {
+      // Delete newManifests that have not been committed and clear them from 
the list.
+      // This is needed for manifests cleanup especially in transaction mode, 
for example:
+      //   Transaction txn = beginTransaction(...)
+      //
+      //   // Operation succeeds and calls cleanUncommitted
+      //   txn.newFastAppend().appendFile(...).commit();
+      //
+      //   // Some other operations ...
+      //
+      //   // Commit fails and needs to clean up newManifests
+      //   txn.commitTransaction()
+      List<ManifestFile> cleanedNewManifests = Lists.newArrayList();
+      for (ManifestFile manifestFile : newManifests) {

Review Comment:
   Minor: `manifestFile` -> `manifest` to match the existing naming pattern in 
this class.



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]


---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]

Reply via email to