amogh-jahagirdar commented on code in PR #13556: URL: https://github.com/apache/iceberg/pull/13556#discussion_r2237827785
########## core/src/main/java/org/apache/iceberg/CherryPickOperation.java: ########## @@ -214,13 +220,17 @@ private static void validateReplacedPartitions( parentId == null || isCurrentAncestor(meta, parentId), "Cannot cherry-pick overwrite, based on non-ancestor of the current state: %s", parentId); - List<DataFile> newFiles = - SnapshotUtil.newFiles(parentId, meta.currentSnapshot().snapshotId(), meta::snapshot, io); - for (DataFile newFile : newFiles) { - ValidationException.check( - !replacedPartitions.contains(newFile.specId(), newFile.partition()), - "Cannot cherry-pick replace partitions with changed partition: %s", - newFile.partition()); + try (CloseableIterable<DataFile> newFiles = + SnapshotUtil.newFilesBetween( + parentId, meta.currentSnapshot().snapshotId(), meta::snapshot, io)) { + for (DataFile newFile : newFiles) { + ValidationException.check( + !replacedPartitions.contains(newFile.specId(), newFile.partition()), + "Cannot cherry-pick replace partitions with changed partition: %s", + newFile.partition()); + } + } catch (IOException ioe) { + LOG.warn("Failed to close task iterable", ioe); } Review Comment: I think I understand the intent to not let any failure on close of the stream impede the validation of the operation, which is nice but I think we should take that separately. I'd imagine there are quite a few parts that can leverage this kind of improvement. We could have some kind of separate exception for indicating that a failure happened specifically on Close or match on the message (but that's brittle and messy, so I'd avoid that). I'd bubble up the IOException as was the case before and then we can figure out how to prevent unneccessary failures in validation separately. ########## core/src/main/java/org/apache/iceberg/CherryPickOperation.java: ########## @@ -214,13 +220,17 @@ private static void validateReplacedPartitions( parentId == null || isCurrentAncestor(meta, parentId), "Cannot cherry-pick overwrite, based on non-ancestor of the current state: %s", parentId); - List<DataFile> newFiles = - SnapshotUtil.newFiles(parentId, meta.currentSnapshot().snapshotId(), meta::snapshot, io); - for (DataFile newFile : newFiles) { - ValidationException.check( - !replacedPartitions.contains(newFile.specId(), newFile.partition()), - "Cannot cherry-pick replace partitions with changed partition: %s", - newFile.partition()); + try (CloseableIterable<DataFile> newFiles = + SnapshotUtil.newFilesBetween( + parentId, meta.currentSnapshot().snapshotId(), meta::snapshot, io)) { + for (DataFile newFile : newFiles) { + ValidationException.check( + !replacedPartitions.contains(newFile.specId(), newFile.partition()), + "Cannot cherry-pick replace partitions with changed partition: %s", + newFile.partition()); + } + } catch (IOException ioe) { + LOG.warn("Failed to close task iterable", ioe); } Review Comment: I don't quite agree with this change to handle the `IOException` this way since it's possible an IOException gets thrown when trying to read the manifests, but this code is assuming the IOException always happens during the closing of the `CloseableIterable` and can silently skip passed files that need to be validated. -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: issues-unsubscr...@iceberg.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org --------------------------------------------------------------------- To unsubscribe, e-mail: issues-unsubscr...@iceberg.apache.org For additional commands, e-mail: issues-h...@iceberg.apache.org