aokolnychyi commented on code in PR #9000: URL: https://github.com/apache/iceberg/pull/9000#discussion_r1387199381
##########
core/src/test/java/org/apache/iceberg/TestRewriteManifests.java:
##########
@@ -1105,6 +1108,499 @@ public void testRewriteManifestsOnBranchUnsupported() {
         "Cannot commit to branch someBranch: org.apache.iceberg.BaseRewriteManifests does not support branch commits");
   }
+  @Test
+  public void testRewriteDataManifestsPreservesDeletes() {
+    Assumptions.assumeThat(formatVersion).isGreaterThan(1);
+
+    Table table = load();
+
+    table.newAppend().appendFile(FILE_A).appendFile(FILE_B).commit();
+
+    Snapshot appendSnapshot = table.currentSnapshot();
+    Assertions.assertThat(appendSnapshot.dataManifests(table.io())).hasSize(1);
+    Assertions.assertThat(appendSnapshot.deleteManifests(table.io())).isEmpty();
+
+    table.newRowDelta().addDeletes(FILE_A_DELETES).addDeletes(FILE_A2_DELETES).commit();
+
+    Snapshot deleteSnapshot = table.currentSnapshot();
+    Assertions.assertThat(deleteSnapshot.dataManifests(table.io())).hasSize(1);
+    Assertions.assertThat(deleteSnapshot.deleteManifests(table.io())).hasSize(1);
+
+    table.rewriteManifests().clusterBy(file -> file.path().toString()).commit();

Review Comment:
   The `path` here is `ContentFile#path`, which returns a `CharSequence`, hence the `toString()` call. The closure clusters manifest entries by content file path, which ensures the original manifest with 2 entries is split into 2 new manifests. It is the path of the content file, not the path of the manifest.

-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@iceberg.apache.org
For additional commands, e-mail: issues-h...@iceberg.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org
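
For illustration only, the clustering idea from the review comment can be sketched in plain Java without any Iceberg dependency. Everything here is an assumption made for the sketch: `ClusterBySketch`, `Entry`, `cluster`, and the two file paths are hypothetical stand-ins, and the grouping logic only models the effect described in the comment (one output group per distinct key), not how `BaseRewriteManifests` is actually implemented.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.function.Function;

public class ClusterBySketch {

  // Hypothetical stand-in for a manifest entry; only the content file path matters here.
  static final class Entry {
    final CharSequence path;

    Entry(CharSequence path) {
      this.path = path;
    }
  }

  // Groups entries by the key the closure returns.
  // Each resulting group models one rewritten manifest.
  static <K> Map<K, List<Entry>> cluster(List<Entry> entries, Function<Entry, K> keyFunc) {
    Map<K, List<Entry>> groups = new LinkedHashMap<>();
    for (Entry entry : entries) {
      groups.computeIfAbsent(keyFunc.apply(entry), key -> new ArrayList<>()).add(entry);
    }
    return groups;
  }

  public static void main(String[] args) {
    // One "manifest" holding two entries with distinct, made-up content file paths.
    List<Entry> original =
        Arrays.asList(new Entry("/path/to/data-a.parquet"), new Entry("/path/to/data-b.parquet"));

    // CharSequence does not guarantee value-based equals/hashCode,
    // so the key is converted to a String before it is used for grouping.
    Map<String, List<Entry>> rewritten = cluster(original, entry -> entry.path.toString());

    System.out.println(rewritten.size()); // prints 2: one group per distinct path
  }
}
```

Because every content file has a distinct path, keying on the path puts each entry in its own group, which is why the test can expect 2 manifests after the rewrite.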