rdblue commented on code in PR #9661:
URL: https://github.com/apache/iceberg/pull/9661#discussion_r1503431015


##########
format/spec.md:
##########
@@ -1170,9 +1170,9 @@ Each sort field in the fields list is stored as an object 
with the following pro
 | required | required | required | **`direction`**  | `JSON string`       | 
`asc`       |
 | required | required | required | **`null-order`** | `JSON string`       | 
`nulls-last`|
 
-Transforms that accept multiple arguments specify source field IDs using 
`source-ids` instead of `source-id`. Writers producing these transforms in v1 
and v2 metadata should additionally produce the `source-id` field by setting it 
to the first ID from the `source-ids` list. Writers producing these transforms 
in v3 metadata should populate only the `source-ids` field because v3 readers 
will fully-support multi-arg transforms by reading this field.
+In v3 metadata, writers must use only `source-ids` because v3 requires reader 
support for multi-arg transforms. In v1 and v2 metadata, writers must always 
write `source-id`; for multi-arg transforms, writers must produce `source-ids` 
and set `source-id` to the first ID from the field ID list.
 
-Older versions of the reference implementation can read tables with transforms 
unknown to it, without the ability to push down filters or write. But other 
implementations may break if they encounter unknown transforms.
+Older versions of the reference implementation can read tables with transforms 
unknown to it, ignoring them. But other implementations may break if they 
encounter unknown transforms. All v3 readers are required to read tables with 
unknown transforms, ignoring them. Writers should not write to tables with 
unknown transforms.

Review Comment:
   This is okay for now, but the constraint on writers that encounter an 
unknown transform is more relaxed than this wording implies. Sort orders are 
best effort, so technically it's up to the writer whether to apply them. 
Similarly, the table's partition spec is only the _default_ spec, because more 
than one spec may be valid in a table. Neither case is necessarily blocking, so 
"should" is a strong word to use. I'd remove that language here for the sort 
order, and update the partition spec language to "Writers should not write 
using partition specs that use unknown transforms".
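
A minimal sketch of the relaxed writer behavior described in this comment, not 
the reference implementation. Everything here is an assumption made for 
illustration: the `KNOWN_TRANSFORMS` set, the helper names, and the 
representation of a spec as a list of transform strings are all hypothetical. 
The sketch refuses only partition specs that use unknown transforms, falls back 
to another valid spec when the default is unusable, and treats the sort order 
as best effort.

```python
from typing import Dict, List, Optional

# Hypothetical: the transform names this particular writer understands.
KNOWN_TRANSFORMS = {"identity", "bucket", "truncate", "year", "month", "day", "hour", "void"}


def is_known(transform: str) -> bool:
    """Return True if the writer understands the transform, e.g. "bucket[16]"."""
    # Drop any bracketed parameter so "bucket[16]" is checked as "bucket".
    return transform.split("[", 1)[0] in KNOWN_TRANSFORMS


def pick_writable_spec(specs: Dict[int, List[str]], default_id: int) -> Optional[int]:
    """Pick a partition spec ID the writer can use, preferring the default.

    `specs` maps a spec ID to the transform strings of its fields (assumed shape).
    Returns None when every spec uses at least one unknown transform.
    """
    candidates = [default_id] + sorted(i for i in specs if i != default_id)
    for spec_id in candidates:
        if all(is_known(t) for t in specs[spec_id]):
            return spec_id
    return None


def choose_sort_fields(sort_transforms: List[str]) -> List[str]:
    """Sort orders are best effort: one possible choice is to write unsorted
    when any sort field uses an unknown transform, rather than refusing to write."""
    if all(is_known(t) for t in sort_transforms):
        return sort_transforms
    return []
```

For example, with a hypothetical unknown transform named `zorder`, a table 
whose default spec uses it but which also has an older spec with only known 
transforms would still be writable through the older spec, and an unknown 
sort transform would only cause the writer to skip sorting.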



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@iceberg.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org

