rdblue commented on code in PR #12593:
URL: https://github.com/apache/iceberg/pull/12593#discussion_r2011045094
##########
core/src/main/java/org/apache/iceberg/TableMetadata.java:
##########
@@ -1269,18 +1242,14 @@ public Builder addSnapshot(Snapshot snapshot) {
snapshotsById.put(snapshot.snapshotId(), snapshot);
changes.add(new MetadataUpdate.AddSnapshot(snapshot));
- if (rowLineage) {
+ if (formatVersion >= 3) {
ValidationException.check(
- snapshot.firstRowId() >= nextRowId,
- "Cannot add a snapshot whose 'first-row-id' (%s) is less than the metadata 'next-row-id' (%s) because this will end up generating duplicate row_ids.",
+ snapshot.firstRowId() != null, "Cannot add a snapshot: first-row-id is null");
+ ValidationException.check(
+ snapshot.firstRowId() != null && snapshot.firstRowId() >= nextRowId,
+ "Cannot add a snapshot, first-row-id is behind table next-row-id: %s < %s",
snapshot.firstRowId(),
nextRowId);
- ValidationException.check(
- snapshot.addedRows() != null,
- "Cannot add a snapshot with a null 'added-rows' field when row lineage is enabled");
- Preconditions.checkArgument(
- snapshot.addedRows() >= 0,
- "Cannot decrease 'last-row-id'. 'last-row-id' must increase monotonically. Snapshot reports %s added rows");
Review Comment:
@RussellSpitzer, I moved these validations into `Snapshot` rather than
leaving them here. Now this only validates that `first-row-id` is set for any
`Snapshot` in a v3 table and that `first-row-id` is not behind the table's
`next-row-id`. Neither of those can be checked by the `Snapshot` itself.
I also thought about keeping the checks here rather than moving them to
`Snapshot` so that we can always read metadata, even when it was written
incorrectly. However, when a snapshot is missing `first-row-id`, it isn't safe
to read the snapshot, so I think it is okay to fail when reading the metadata
rather than when using it.
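
To illustrate the split described above, here is a minimal, self-contained
sketch. It is not the Iceberg code: `SnapshotSketch` and `TableSketch` are
hypothetical stand-ins for `Snapshot` and `TableMetadata.Builder`. I am
assuming the checks moved into `Snapshot` are the `added-rows` checks removed
in this hunk, and the way `nextRowId` advances here is also an assumption made
only to keep the example complete.

```java
import java.util.ArrayList;
import java.util.List;

public class RowIdValidationSketch {

  // Hypothetical stand-in for Snapshot: validates only what a snapshot can
  // know from its own fields.
  static final class SnapshotSketch {
    final long snapshotId;
    final Long firstRowId; // nullable, as in the diff
    final Long addedRows;  // nullable

    SnapshotSketch(long snapshotId, Long firstRowId, Long addedRows) {
      // Assumption: a snapshot that carries first-row-id must also carry a
      // non-negative added-rows count.
      if (firstRowId != null) {
        if (addedRows == null) {
          throw new IllegalArgumentException("added-rows is null but first-row-id is set");
        }
        if (addedRows < 0) {
          throw new IllegalArgumentException("added-rows is negative: " + addedRows);
        }
      }
      this.snapshotId = snapshotId;
      this.firstRowId = firstRowId;
      this.addedRows = addedRows;
    }
  }

  // Hypothetical stand-in for the table-level builder: validates what needs
  // table state (format version and next-row-id).
  static final class TableSketch {
    final int formatVersion;
    long nextRowId;
    final List<SnapshotSketch> snapshots = new ArrayList<>();

    TableSketch(int formatVersion, long nextRowId) {
      this.formatVersion = formatVersion;
      this.nextRowId = nextRowId;
    }

    void addSnapshot(SnapshotSketch snapshot) {
      if (formatVersion >= 3) {
        if (snapshot.firstRowId == null) {
          throw new IllegalStateException("Cannot add a snapshot: first-row-id is null");
        }
        if (snapshot.firstRowId < nextRowId) {
          throw new IllegalStateException(
              String.format(
                  "Cannot add a snapshot, first-row-id is behind table next-row-id: %s < %s",
                  snapshot.firstRowId, nextRowId));
        }
        // Assumption for this sketch only: the next row ID moves past the
        // rows assigned to this snapshot.
        nextRowId = snapshot.firstRowId + snapshot.addedRows;
      }
      snapshots.add(snapshot);
    }
  }
}
```

In this sketch a v2 table accepts a snapshot without `first-row-id`, while a
v3 table rejects both a missing `first-row-id` and one that is behind the
table's `next-row-id`.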
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at:
[email protected]