rdblue commented on PR #12757: URL: https://github.com/apache/iceberg/pull/12757#issuecomment-2803358324
After working on the implementation and posting a PR to update the spec for how we handle upgrades, I think that this PR is correct and that `next-row-id` can be required in v3.

Rather than filling in null until row IDs are assigned, the strategy I think is correct for upgrading tables is to assign IDs for an entire branch in the first snapshot written after the upgrade to v3. When upgrading, all existing snapshots have no IDs, so `next-row-id` starts at 0. Then, after the upgrade, the next commit assigns IDs by rewriting the manifest list and then increases `next-row-id` by the size of the table: `sum(record_count for all existing or added data files)`.

+1 to making the field required.
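
To make the arithmetic concrete, here is a rough sketch of the first commit after an upgrade. It is not the Iceberg implementation: `DataFile`, `first_row_id`, and `assign_row_ids` are hypothetical names, and the model assigns IDs per data file rather than through the manifest list, purely for illustration.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class DataFile:
    # Hypothetical stand-in for a data file entry; not an Iceberg class.
    path: str
    record_count: int
    first_row_id: Optional[int] = None  # None: no row IDs assigned yet


def assign_row_ids(files: List[DataFile], next_row_id: int) -> int:
    """Give each unassigned file a contiguous ID range; return the new next-row-id."""
    for f in files:
        if f.first_row_id is None:
            f.first_row_id = next_row_id
            next_row_id += f.record_count
    return next_row_id


# Upgraded table: two existing files plus one file added by the first v3 commit.
files = [
    DataFile("existing-1.parquet", 100),
    DataFile("existing-2.parquet", 250),
    DataFile("added-1.parquet", 50),
]
new_next_row_id = assign_row_ids(files, next_row_id=0)
print(new_next_row_id)  # 400 == sum of record_count over existing and added files
```

Because no file has IDs before the upgrade, the first commit advances the counter from 0 by the full table size, which is why the field never needs to be null.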