rdblue commented on code in PR #12593: URL: https://github.com/apache/iceberg/pull/12593#discussion_r2011045094
##########
core/src/main/java/org/apache/iceberg/TableMetadata.java:
##########
@@ -1269,18 +1242,14 @@ public Builder addSnapshot(Snapshot snapshot) {
       snapshotsById.put(snapshot.snapshotId(), snapshot);
       changes.add(new MetadataUpdate.AddSnapshot(snapshot));
-      if (rowLineage) {
+      if (formatVersion >= 3) {
         ValidationException.check(
-            snapshot.firstRowId() >= nextRowId,
-            "Cannot add a snapshot whose 'first-row-id' (%s) is less than the metadata 'next-row-id' (%s) because this will end up generating duplicate row_ids.",
+            snapshot.firstRowId() != null, "Cannot add a snapshot: first-row-id is null");
+        ValidationException.check(
+            snapshot.firstRowId() != null && snapshot.firstRowId() >= nextRowId,
+            "Cannot add a snapshot, first-row-id is behind table next-row-id: %s < %s",
             snapshot.firstRowId(),
             nextRowId);
-        ValidationException.check(
-            snapshot.addedRows() != null,
-            "Cannot add a snapshot with a null 'added-rows' field when row lineage is enabled");
-        Preconditions.checkArgument(
-            snapshot.addedRows() >= 0,
-            "Cannot decrease 'last-row-id'. 'last-row-id' must increase monotonically. Snapshot reports %s added rows");

Review Comment:
@RussellSpitzer, I moved these validations into `Snapshot` rather than leaving them here. Now this only validates that `first-row-id` is set for any `Snapshot` in a v3 table and that the `first-row-id` is not behind the table's `next-row-id`. Neither of those can be checked by the `Snapshot` itself, because they depend on the table's format version and `next-row-id`.

Also, I thought about keeping the checks here rather than moving them to `Snapshot` so that we can always read metadata, even when it was written incorrectly. However, when a snapshot is missing `first-row-id`, it isn't safe to read the snapshot, so I think it is okay to fail when reading metadata rather than when using the metadata.

--
This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment.
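For readers following the thread, the split described in the comment can be sketched roughly as below. This is an illustration only, not the code in the PR: `RowIdValidationSketch`, `SnapshotInfo`, and `checkAddSnapshot` are hypothetical names, `IllegalArgumentException` stands in for Iceberg's `ValidationException`, and the exact form of the checks that moved into `Snapshot` is an assumption based on the removed lines in the diff.

```java
// Hypothetical stand-ins for illustration; these are not Iceberg's classes.
final class RowIdValidationSketch {

  /** Minimal snapshot stand-in holding only the fields discussed in the comment. */
  static final class SnapshotInfo {
    final long snapshotId;
    final Long firstRowId; // may be null for pre-v3 snapshots
    final Long addedRows;  // may be null for pre-v3 snapshots

    SnapshotInfo(long snapshotId, Long firstRowId, Long addedRows) {
      // Snapshot-level checks (assumed form): these need no table state,
      // so the snapshot can verify them on its own.
      if (firstRowId != null) {
        if (addedRows == null) {
          throw new IllegalArgumentException("added-rows is required when first-row-id is set");
        }
        if (addedRows < 0) {
          throw new IllegalArgumentException("added-rows cannot be negative: " + addedRows);
        }
      }
      this.snapshotId = snapshotId;
      this.firstRowId = firstRowId;
      this.addedRows = addedRows;
    }
  }

  /**
   * Table-level checks: these depend on the table's format version and
   * next-row-id, which a snapshot cannot see by itself.
   */
  static void checkAddSnapshot(int formatVersion, long nextRowId, SnapshotInfo snapshot) {
    if (formatVersion < 3) {
      return; // first-row-id is only required for v3 tables
    }
    if (snapshot.firstRowId == null) {
      throw new IllegalArgumentException("Cannot add a snapshot: first-row-id is null");
    }
    if (snapshot.firstRowId < nextRowId) {
      throw new IllegalArgumentException(
          String.format(
              "Cannot add a snapshot, first-row-id is behind table next-row-id: %s < %s",
              snapshot.firstRowId, nextRowId));
    }
  }
}
```

In this sketch a malformed snapshot fails as soon as it is constructed (for example while parsing metadata), whereas a stale `first-row-id` is only detected when the snapshot is added to a table, which mirrors the trade-off raised in the second paragraph of the comment.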
To unsubscribe, e-mail: issues-unsubscr...@iceberg.apache.org
For additional commands, e-mail: issues-h...@iceberg.apache.org
For queries about this service, please contact Infrastructure at: us...@infra.apache.org