amogh-jahagirdar commented on code in PR #17039:
URL: https://github.com/apache/iceberg/pull/17039#discussion_r3521029107
##########
spark/v4.0/spark/src/test/java/org/apache/iceberg/spark/actions/TestRewriteDataFilesAction.java:
##########
@@ -2069,6 +2069,36 @@ public void
testUnpartitionedRewriteDataFilesPreservesLineage() throws NoSuchTab
assertEquals("Rows must match", expectedRecordsWithLineage,
actualRecordsWithLineage);
}
+ @TestTemplate
Review Comment:
Ok here's a more minimal test I came up with so we don't even have to append
any records. We can just do a compact + upgrade + rewrite manifests to move the
new file into a single v3 manifest. since there's a single data file/single
manifest we can then reliably assert the row IDs regardless of any change in
the implementation.
```
+ @TestTemplate
+ public void
testLastUpdatedSequenceInheritedFromDataSequenceAfterUpgrade() {
+ assumeThat(formatVersion).isEqualTo(2);
+
+ Table table = createTable();
+
+ // Two data files in a single v2 commit; both share data sequence
number 1.
+ writeRecords(2 /* files */, 4 /* records */);
+ table.refresh();
+ shouldHaveFiles(table, 2);
+ long committedDataSequence = table.currentSnapshot().sequenceNumber();
+
+ basicRewrite(table).execute();
+ table.refresh();
+ shouldHaveFiles(table, 1);
+
+ DataFile compacted = Iterables.getOnlyElement(currentDataFiles(table));
+ long dataSequence = compacted.dataSequenceNumber();
+ long fileSequence = compacted.fileSequenceNumber();
+ assertThat(dataSequence)
+ .as("Compaction must preserve the original data sequence number")
+ .isEqualTo(committedDataSequence);
+ assertThat(fileSequence)
+ .as("Compaction must bump the file sequence above the preserved
data sequence")
+ shouldHaveFiles(table, 1);
+
+ DataFile compacted = Iterables.getOnlyElement(currentDataFiles(table));
+ long dataSequence = compacted.dataSequenceNumber();
+ long fileSequence = compacted.fileSequenceNumber();
+ assertThat(dataSequence)
+ .as("Compaction must preserve the original data sequence number")
+ .isEqualTo(committedDataSequence);
+ assertThat(fileSequence)
+ .as("Compaction must bump the file sequence above the preserved
data sequence")
+ .isGreaterThan(dataSequence);
+
+ // Upgrade to v3, then move the carried-over file into a new v3
manifest
+ table.updateProperties().set(TableProperties.FORMAT_VERSION,
"3").commit();
+ table.rewriteManifests().rewriteIf(manifest -> true).commit();
+ table.refresh();
+
+ List<Object[]> expected =
+ Lists.newArrayList(
+ row(0L, dataSequence, ANY, ANY, ANY),
+ row(1L, dataSequence, ANY, ANY, ANY),
+ row(2L, dataSequence, ANY, ANY, ANY),
+ row(3L, dataSequence, ANY, ANY, ANY));
+
+ assertEquals(
+ "Row IDs must be 0..3 and last-updated must inherit the data
sequence",
+ expected,
+ currentDataWithLineage());
+ }
```
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at:
[email protected]
---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]