amogh-jahagirdar commented on code in PR #11561: URL: https://github.com/apache/iceberg/pull/11561#discussion_r1863707596
########## spark/v3.5/spark/src/main/java/org/apache/iceberg/spark/source/SparkBatchQueryScan.java: ########## @@ -168,7 +168,7 @@ protected Map<String, DeleteFileSet> rewritableDeletes() { for (ScanTask task : tasks()) { FileScanTask fileScanTask = task.asFileScanTask(); for (DeleteFile deleteFile : fileScanTask.deletes()) { - if (ContentFileUtil.isFileScoped(deleteFile)) { + if (ContentFileUtil.isFileScoped(deleteFile) || ContentFileUtil.isDV(deleteFile)) { Review Comment: Great catch, yes to both points. the file scoped Util correctly handles the DV case by checking the referenced data file, and we for DVs we do need to merge all existing position deletes, regardless of scope. I've fixed that and added a test case for update/delete/merge which will create a mix of file scoped and partition scoped deletes, and then finally produce a delete which has the positions merged from these existing deletes. ########## spark/v3.5/spark/src/main/java/org/apache/iceberg/spark/source/SparkBatchQueryScan.java: ########## @@ -168,7 +168,7 @@ protected Map<String, DeleteFileSet> rewritableDeletes() { for (ScanTask task : tasks()) { FileScanTask fileScanTask = task.asFileScanTask(); for (DeleteFile deleteFile : fileScanTask.deletes()) { - if (ContentFileUtil.isFileScoped(deleteFile)) { + if (ContentFileUtil.isFileScoped(deleteFile) || ContentFileUtil.isDV(deleteFile)) { Review Comment: Great catch, yes to both points. the file scoped Util correctly handles the DV case by checking the referenced data file, and we for DVs we do need to merge all existing position deletes, regardless of scope. I've fixed that and added a test case for update/delete/merge which will create a mix of file scoped and partition scoped deletes, and then finally produce a DV which has the positions merged from these existing deletes. -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: issues-unsubscr...@iceberg.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org --------------------------------------------------------------------- To unsubscribe, e-mail: issues-unsubscr...@iceberg.apache.org For additional commands, e-mail: issues-h...@iceberg.apache.org