danielcweeks commented on code in PR #12250: URL: https://github.com/apache/iceberg/pull/12250#discussion_r1954884816
##########
spark/v3.5/spark/src/main/java/org/apache/iceberg/spark/actions/RewritePositionDeleteFilesSparkAction.java:
##########
@@ -404,8 +408,31 @@ private void validateAndInitOptions() {
         maxCommits,
         PARTIAL_PROGRESS_ENABLED);
-    Preconditions.checkArgument(
-        TableUtil.formatVersion(table) <= 2, "Cannot rewrite position deletes for V3 table");
+    if (TableUtil.formatVersion(table) >= 3) {
+      PositionDeletesBatchScan scan =
+          (PositionDeletesBatchScan)
+              MetadataTableUtils.createMetadataTableInstance(
+                      table, MetadataTableType.POSITION_DELETES)
+                  .newBatchScan();
+      Optional<PositionDeletesScanTask> foundPuffinFiles =
+          StreamSupport.stream(
+                  CloseableIterable.transform(
+                          scan.baseTableFilter(filter)
+                              .caseSensitive(caseSensitive)
+                              .select(PositionDeletesTable.DELETE_FILE_PATH)
+                              .ignoreResiduals()
+                              .planFiles(),
+                          task -> (PositionDeletesScanTask) task)
+                      .spliterator(),
+                  false)
+              .filter(t -> t.file().format() == FileFormat.PUFFIN)
+              .findAny();
+
+      if (foundPuffinFiles.isPresent()) {
+        throw new IllegalArgumentException(

Review Comment:
   I don't feel like we want to initiate a scan in the validation portion of the action. Also, I don't think this case should result in an error. If a table has no v2 deletes, it should be a no-op as opposed to failing. The results of the action should indicate that nothing was rewritten.
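To make the suggestion above concrete, here is a rough sketch of the shape it seems to point at: validation only checks options, Puffin files are skipped while planning, and an empty plan yields a result reporting zero rewritten files rather than an exception. Everything in it (`RewriteSketch`, `DeleteFile`, `Result`, `planRewritable`, the `maxCommits` check) is a hypothetical stand-in written for illustration, not Iceberg's API and not the code in this PR. It also assumes that "v2 deletes" means position delete files that are not in Puffin format.

```java
import java.util.List;
import java.util.stream.Collectors;

// Hypothetical, simplified stand-ins. None of these types are Iceberg classes.
public class RewriteSketch {

  public enum FileFormat { PARQUET, AVRO, ORC, PUFFIN }

  public static final class DeleteFile {
    private final String path;
    private final FileFormat format;

    public DeleteFile(String path, FileFormat format) {
      this.path = path;
      this.format = format;
    }

    public String path() { return path; }

    public FileFormat format() { return format; }
  }

  public static final class Result {
    private final int rewrittenDeleteFilesCount;

    public Result(int rewrittenDeleteFilesCount) {
      this.rewrittenDeleteFilesCount = rewrittenDeleteFilesCount;
    }

    public int rewrittenDeleteFilesCount() { return rewrittenDeleteFilesCount; }
  }

  // Validation looks at options only; it does not plan or run a scan.
  static void validateOptions(int maxCommits) {
    if (maxCommits <= 0) {
      throw new IllegalArgumentException("maxCommits must be positive: " + maxCommits);
    }
  }

  // Planning step: keep only the files assumed to be v2 position deletes
  // (anything that is not Puffin) and silently skip the rest.
  static List<DeleteFile> planRewritable(List<DeleteFile> planned) {
    return planned.stream()
        .filter(f -> f.format() != FileFormat.PUFFIN)
        .collect(Collectors.toList());
  }

  public static Result rewrite(List<DeleteFile> planned, int maxCommits) {
    validateOptions(maxCommits);
    List<DeleteFile> candidates = planRewritable(planned);
    if (candidates.isEmpty()) {
      // Nothing to rewrite: report a no-op instead of failing.
      return new Result(0);
    }
    // A real action would group, rewrite, and commit here; the sketch only counts.
    return new Result(candidates.size());
  }

  public static void main(String[] args) {
    List<DeleteFile> onlyPuffin =
        List.of(new DeleteFile("dv-00001.puffin", FileFormat.PUFFIN));
    System.out.println(rewrite(onlyPuffin, 10).rewrittenDeleteFilesCount());
  }
}
```

With this shape, a table that holds only Puffin delete files goes through the action without error, and the caller learns from the result that nothing was rewritten.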