rdblue commented on code in PR #9661:
URL: https://github.com/apache/iceberg/pull/9661#discussion_r1503431015
##########
format/spec.md:
##########
@@ -1170,9 +1170,9 @@ Each sort field in the fields list is stored as an object with the following pro
 | required | required | required | **`direction`** | `JSON string` | `asc` |
 | required | required | required | **`null-order`** | `JSON string` | `nulls-last`|
 
-Transforms that accept multiple arguments specify source field IDs using `source-ids` instead of `source-id`. Writers producing these transforms in v1 and v2 metadata should additionally produce the `source-id` field by setting it to the first ID from the `source-ids` list. Writers producing these transforms in v3 metadata should populate only the `source-ids` field because v3 readers will fully-support multi-arg transforms by reading this field.
+In v3 metadata, writers must use only `source-ids` because v3 requires reader support for multi-arg transforms. In v1 and v2 metadata, writers must always write `source-id`; for multi-arg transforms, writers must produce `source-ids` and set `source-id` to the first ID from the field ID list.
 
-Older versions of the reference implementation can read tables with transforms unknown to it, without the ability to push down filters or write. But other implementations may break if they encounter unknown transforms.
+Older versions of the reference implementation can read tables with transforms unknown to it, ignoring them. But other implementations may break if they encounter unknown transforms. All v3 readers are required to read tables with unknown transforms, ignoring them. Writers should not write to tables with unknown transforms.

Review Comment:
This is okay for now, but the constraint for writers with an unknown transform is a bit more relaxed. Sort orders are best effort... so technically it's up to the writer. Similarly, the table's partition spec is the _default_ spec because there may be more than one spec that is valid in a table. Neither of these cases is necessarily blocking so "should" is a strong word to use.
I'd remove that language here for the sort order, and update the partition spec language to "Writers should not write using partition specs that use unknown transforms".

-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@iceberg.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org
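
For illustration only, here is a minimal Python sketch of the two rules discussed in this comment. It follows the wording of the `+` lines in the hunk above and the suggested partition spec language, not necessarily the final spec text. The function names, the shape of the `specs` argument, and the list of known transforms are hypothetical and are not taken from any Iceberg library.

```python
import re

# Assumption: the transforms this hypothetical writer understands.
KNOWN_TRANSFORMS = {"identity", "year", "month", "day", "hour", "void"}
# Parameterized transforms such as bucket[16] or truncate[4].
PARAMETERIZED = re.compile(r"^(bucket|truncate)\[\d+\]$")


def sort_field_source_json(source_ids, format_version):
    """Build the source-id / source-ids part of a sort field object.

    Follows the proposed wording: v3 writes only `source-ids`; v1 and v2
    always write `source-id`, and multi-arg transforms also write
    `source-ids` with `source-id` set to the first ID.
    """
    if not source_ids:
        raise ValueError("a sort field needs at least one source field ID")
    if format_version >= 3:
        return {"source-ids": list(source_ids)}
    fields = {"source-id": source_ids[0]}
    if len(source_ids) > 1:
        fields["source-ids"] = list(source_ids)
    return fields


def is_known_transform(name):
    """Return True if this hypothetical writer recognizes the transform."""
    return name in KNOWN_TRANSFORMS or PARAMETERIZED.match(name) is not None


def usable_spec_ids(specs):
    """Return the IDs of partition specs that use only known transforms.

    `specs` maps a spec ID to the list of transform names in that spec
    (an assumed shape). A writer would avoid specs missing from the result
    rather than refusing to write to the table as a whole.
    """
    return [
        spec_id
        for spec_id, transforms in specs.items()
        if all(is_known_transform(t) for t in transforms)
    ]
```

Under these assumptions, a table with one spec using `day` and `bucket[16]` and another using an unrecognized transform would still be writable through the first spec. An unknown transform in the sort order would not block the write at all, since sort orders are best effort.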