rdblue commented on code in PR #9661: URL: https://github.com/apache/iceberg/pull/9661#discussion_r1501923473
########## format/spec.md: ########## @@ -1128,16 +1142,14 @@ Each partition field in the fields list is stored as an object. See the table fo |**`month`**|`JSON string: "month"`|`"month"`| |**`day`**|`JSON string: "day"`|`"day"`| |**`hour`**|`JSON string: "hour"`|`"hour"`| -|**`Partition Field`** [1,2]|`JSON object: {`<br /> `"source-id": <id int>,`<br /> `"field-id": <field id int>,`<br /> `"name": <name string>,`<br /> `"transform": <transform JSON>`<br />`}`|`{`<br /> `"source-id": 1,`<br /> `"field-id": 1000,`<br /> `"name": "id_bucket",`<br /> `"transform": "bucket[16]"`<br />`}`| In some cases partition specs are stored using only the field list instead of the object format that includes the spec ID, like the deprecated `partition-spec` field in table metadata. The object format should be used unless otherwise noted in this spec. The `field-id` property was added for each partition field in v2. In v1, the reference implementation assigned field ids sequentially in each spec starting at 1,000. See Partition Evolution for more details. -Notes: +Transforms that accept multiple arguments specify source field IDs using `source-ids` instead of `source-id`. Writers producing these transforms in v1 and v2 metadata should additionally produce the `source-id` field by setting it to the first ID from the `source-ids` list. Writers producing these transforms in v3 metadata should populate only the `source-ids` field because v3 readers will fully-support multi-arg transforms by reading this field. -1. For partition fields with a transform with a single argument, the ID of the source field is set on `source-id`, and `source-ids` is omitted. -2. For partition fields with a transform of multiple arguments, the IDs of the source fields are set on `source-ids`. To preserve backward compatibility, `source-id` is set to -1. +Older versions of the reference implementation can read tables with transforms unknown to it, without the ability to push down filters or write. But other implementations may break if they encounter unknown transforms. Review Comment: Should we state that in v3, readers are required to ignore unknown transforms and are not allowed to write to tables that use unknown transforms in the target partition spec? -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: issues-unsubscr...@iceberg.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org --------------------------------------------------------------------- To unsubscribe, e-mail: issues-unsubscr...@iceberg.apache.org For additional commands, e-mail: issues-h...@iceberg.apache.org