RussellSpitzer commented on code in PR #12580:
URL: https://github.com/apache/iceberg/pull/12580#discussion_r2010594331
##########
format/spec.md:
##########
@@ -408,16 +408,17 @@ When `null`, a row's `_row_id` field is assigned to the `first_row_id` from its
 Values for `_row_id` and `_last_updated_sequence_number` are either read from the data file or assigned at read time. As a result on read, rows in a table always have non-null values for these fields when lineage is enabled.
 
-When an existing row is moved to a different data file for any reason, writers are required to write `_row_id` and `_last_updated_sequence_number` according to the following rules:
+When an existing row is moved to a different data file for any reason, writers should write `_row_id` and `_last_updated_sequence_number` according to the following rules:
 
 1. The row's existing non-null `_row_id` must be copied into the new data file
 2. If the write has modified the row, the `_last_updated_sequence_number` field must be set to `null` (so that the modification's sequence number replaces the current value)
 3. If the write has not modified the row, the existing non-null `_last_updated_sequence_number` value must be copied to the new data file
 
+The semantics of whether an operation affecting existing rows is modeled as deleting all modified rows and adding new rows, or upsert with preserved row ids is left to the implementing engine.

Review Comment:
   I find this bit a little confusing. Maybe:

   An implementing engine can choose not to follow the above rules when moving a row; in that case, a modification of an existing row will be treated as an independent delete and insert.

   Or something like that?
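
   For illustration only, the three rules in the hunk and the opt-out suggested in the comment could be sketched as below. This is a rough sketch, not code from the PR or from any Iceberg library: the function name `lineage_for_moved_row` and the `preserve_lineage` flag are hypothetical, and it assumes an engine that opts out simply writes `null` for both fields so they are assigned at read time like those of any newly added row.

```python
from typing import Optional, Tuple


def lineage_for_moved_row(
    row_id: int,
    last_updated_sequence_number: int,
    modified: bool,
    preserve_lineage: bool = True,  # hypothetical flag: does the engine follow the rules?
) -> Tuple[Optional[int], Optional[int]]:
    """Return the (_row_id, _last_updated_sequence_number) values to write
    for an existing row that is being moved to a new data file."""
    if not preserve_lineage:
        # Assumption: an engine that opts out treats the change as an
        # independent delete and insert, so the new row carries no lineage.
        return None, None

    if modified:
        # Rules 1 and 2: keep the row id, null out the sequence number so the
        # modification's sequence number replaces the current value.
        return row_id, None

    # Rules 1 and 3: the row was only moved, so copy both values unchanged.
    return row_id, last_updated_sequence_number
```

   Under these assumptions, a compaction-style rewrite would call it with `modified=False`, an update that preserves row ids with `modified=True`, and an engine modeling updates as delete plus insert with `preserve_lineage=False`.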