RussellSpitzer commented on code in PR #12781:
URL: https://github.com/apache/iceberg/pull/12781#discussion_r2047798306


##########
format/spec.md:
##########
@@ -786,9 +790,11 @@ Notes:
 
 #### First Row ID Assignment
 
-When adding a new data manifest file, its `first_row_id` field is assigned the 
value of the snapshot's `first_row_id` plus the sum of `added_rows_count` for 
all data manifests that preceded the manifest in the manifest list.
+The `first_row_id` for existing manifests must be preserved when writing a new 
manifest list. The value of `first_row_id` for delete manifests is always 
`null`. The `first_row_id` is only assigned for data manifests that do not have 
a `first_row_id`. Assignment must account for data files that will be assigned 
`first_row_id` values when the manifest is read.
 
-The `first_row_id` is only assigned for new data manifests. Values for 
existing manifests must be preserved when writing a new manifest list. The 
value of `first_row_id` for delete manifests is always `null`.
+The first manifest without a `first_row_id` is assigned a value that is 
greater than or equal to the `first_row_id` of the snapshot. Subsequent 
manifests without a `first_row_id` are assigned one based on the previous 
manifest to be assigned a `first_row_id`. Each assigned `first_row_id` must 
increase by the row count of all files that will be assigned a `first_row_id` 
via inheritance in the last assigned manifest. That is, each `first_row_id` 
must be greater than or equal to the last assigned `first_row_id` plus the 
total record count of data files with a null `first_row_id` in the last 
assigned manifest.

Review Comment:
   ```suggestion
   The first manifest without a `first_row_id` is assigned a value that is 
greater than or equal to the `first_row_id` of the snapshot. Subsequent 
manifests without a `first_row_id` are assigned a value by summing the 
previously assigned `first_row_id` and the row count of all files that will be 
assigned a `first_row_id` via inheritance in that previously assigned manifest. 
Each `first_row_id` must be greater than or equal to the last assigned 
`first_row_id` plus the total record count of data files with a null 
`first_row_id` in the last assigned manifest.
   ```
   
   I tried to simplify this a bit, but I'm not sure I succeeded.
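
To check that the wording reads the way it is intended, here is a minimal Python sketch of the assignment as described in the paragraph. The `DataFile` and `Manifest` classes and the `assign_first_row_ids` function are hypothetical names made up for illustration, not Iceberg APIs. The sketch also assumes the writer picks the smallest allowed value at each step, whereas the text only requires "greater than or equal to".

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class DataFile:
    # Hypothetical stand-in for a data file entry in a manifest.
    record_count: int
    first_row_id: Optional[int] = None  # None means "assigned via inheritance"


@dataclass
class Manifest:
    # Hypothetical stand-in for a manifest list entry.
    is_data: bool
    files: List[DataFile]
    first_row_id: Optional[int] = None


def assign_first_row_ids(manifests: List[Manifest], snapshot_first_row_id: int) -> int:
    """Assign first_row_id to data manifests that lack one.

    Returns the next unassigned row ID after all manifests are processed.
    """
    next_row_id = snapshot_first_row_id
    for manifest in manifests:
        # Delete manifests always keep a null first_row_id, and existing
        # values on data manifests must be preserved.
        if not manifest.is_data or manifest.first_row_id is not None:
            continue
        manifest.first_row_id = next_row_id
        # Advance by the rows that will inherit an ID when this manifest
        # is read: only data files whose own first_row_id is null.
        next_row_id += sum(
            f.record_count for f in manifest.files if f.first_row_id is None
        )
    return next_row_id


manifests = [
    Manifest(is_data=True, files=[DataFile(10)], first_row_id=0),   # existing
    Manifest(is_data=False, files=[DataFile(5)]),                   # delete manifest
    Manifest(is_data=True, files=[DataFile(100), DataFile(50, first_row_id=7)]),
    Manifest(is_data=True, files=[DataFile(20)]),
]
next_id = assign_first_row_ids(manifests, snapshot_first_row_id=1000)
```

In this example the third manifest gets 1000 and the fourth gets 1100, because only the 100-row file in the third manifest has a null `first_row_id`. The file that already carries an ID does not advance the counter.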



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@iceberg.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org

