amogh-jahagirdar commented on code in PR #13556:
URL: https://github.com/apache/iceberg/pull/13556#discussion_r2237827785


##########
core/src/main/java/org/apache/iceberg/CherryPickOperation.java:
##########
@@ -214,13 +220,17 @@ private static void validateReplacedPartitions(
           parentId == null || isCurrentAncestor(meta, parentId),
           "Cannot cherry-pick overwrite, based on non-ancestor of the current 
state: %s",
           parentId);
-      List<DataFile> newFiles =
-          SnapshotUtil.newFiles(parentId, meta.currentSnapshot().snapshotId(), 
meta::snapshot, io);
-      for (DataFile newFile : newFiles) {
-        ValidationException.check(
-            !replacedPartitions.contains(newFile.specId(), 
newFile.partition()),
-            "Cannot cherry-pick replace partitions with changed partition: %s",
-            newFile.partition());
+      try (CloseableIterable<DataFile> newFiles =
+          SnapshotUtil.newFilesBetween(
+              parentId, meta.currentSnapshot().snapshotId(), meta::snapshot, 
io)) {
+        for (DataFile newFile : newFiles) {
+          ValidationException.check(
+              !replacedPartitions.contains(newFile.specId(), 
newFile.partition()),
+              "Cannot cherry-pick replace partitions with changed partition: 
%s",
+              newFile.partition());
+        }
+      } catch (IOException ioe) {
+        LOG.warn("Failed to close task iterable", ioe);
       }

Review Comment:
   I think I understand the intent to not let any failure on close of the 
stream impede the validation of the operation, which is nice but I think we 
should take that separately. I'd imagine there are quite a few parts that can 
leverage this kind of improvement. We could have some kind of separate 
exception for indicating that a failure happened specifically on Close or match 
on the message (but that's brittle and messy, so I'd avoid that).
   
   I'd bubble up the IOException as was the case before and then we can figure 
out how to prevent unneccessary failures in validation separately.



##########
core/src/main/java/org/apache/iceberg/CherryPickOperation.java:
##########
@@ -214,13 +220,17 @@ private static void validateReplacedPartitions(
           parentId == null || isCurrentAncestor(meta, parentId),
           "Cannot cherry-pick overwrite, based on non-ancestor of the current 
state: %s",
           parentId);
-      List<DataFile> newFiles =
-          SnapshotUtil.newFiles(parentId, meta.currentSnapshot().snapshotId(), 
meta::snapshot, io);
-      for (DataFile newFile : newFiles) {
-        ValidationException.check(
-            !replacedPartitions.contains(newFile.specId(), 
newFile.partition()),
-            "Cannot cherry-pick replace partitions with changed partition: %s",
-            newFile.partition());
+      try (CloseableIterable<DataFile> newFiles =
+          SnapshotUtil.newFilesBetween(
+              parentId, meta.currentSnapshot().snapshotId(), meta::snapshot, 
io)) {
+        for (DataFile newFile : newFiles) {
+          ValidationException.check(
+              !replacedPartitions.contains(newFile.specId(), 
newFile.partition()),
+              "Cannot cherry-pick replace partitions with changed partition: 
%s",
+              newFile.partition());
+        }
+      } catch (IOException ioe) {
+        LOG.warn("Failed to close task iterable", ioe);
       }

Review Comment:
   I don't quite agree with this change to handle the `IOException` this way 
since it's possible an IOException gets thrown when trying to read the 
manifests, but this code is assuming the IOException always happens during the 
closing of the `CloseableIterable`  and can silently skip passed files that 
need to be validated.



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@iceberg.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscr...@iceberg.apache.org
For additional commands, e-mail: issues-h...@iceberg.apache.org

Reply via email to