advancedxy commented on code in PR #9661:
URL: https://github.com/apache/iceberg/pull/9661#discussion_r1480813301


##########
format/spec.md:
##########
@@ -1117,7 +1117,17 @@ Partition specs are serialized as a JSON object with the following fields:
 |**`spec-id`**|`JSON int`|`0`|
 |**`fields`**|`JSON list: [`<br />&nbsp;&nbsp;`<partition field JSON>,`<br />&nbsp;&nbsp;`...`<br />`]`|`[ {`<br />&nbsp;&nbsp;`"source-id": 4,`<br />&nbsp;&nbsp;`"field-id": 1000,`<br />&nbsp;&nbsp;`"name": "ts_day",`<br />&nbsp;&nbsp;`"transform": "day"`<br />`}, {`<br />&nbsp;&nbsp;`"source-id": 1,`<br />&nbsp;&nbsp;`"field-id": 1001,`<br />&nbsp;&nbsp;`"name": "id_bucket",`<br />&nbsp;&nbsp;`"transform": "bucket[16]"`<br />`} ]`|
 
-Each partition field in the fields list is stored as an object. See the table for more detail:
+Each partition field in the `fields` is stored as a JSON object with the following properties.
+
+| V1       | V2       | V3       | Field            | JSON representation | Example      |

Review Comment:
   > it seems the problem existed before then (that V3 is mentioned without a 
proper introduction)
   
   Maybe that is because the v3 format is not yet completed or adopted by the community.
   
   How about we introduce the `multi-arg` transform in the `### Partitioning` and `### Sorting` sections and point to the details in `Appendix E`? In the appendix, we can document in detail which compatibility flag to use and how partition fields and sort fields are serialized as JSON.
   
   Something like this:
   
   
   ```markdown
   ### Partitioning
   ... omitted ...
   Tables are configured with a **partition spec** that defines how to produce 
a tuple of partition values from a record. A partition spec has a list of 
fields that consist of:
   
   *   A **source column id** or a list of **source column ids** from the 
table’s schema
   *   A **partition field id** that is used to identify a partition field and 
is unique within a partition spec. In v2 table metadata, it is unique across 
all partition specs.
   *   A **transform** that is applied to the source column(s)[1] to produce a 
partition value
   *   A **partition name**
   
   ... omitted ...
   
   Partition field IDs must be reused if an existing partition spec contains an 
equivalent field.
   
   Note:
   1. Multi-arg transforms are added in format version 3. For details on how multi-arg transforms are serialized in JSON, see Appendix E.
   ```
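
To make the proposal concrete, here is a minimal Python sketch of how a partition field with one or several source columns might be serialized. It is not the spec's definition. The `source-ids` key for the multi-arg case, the `PartitionField` class, the `spec_to_json` helper, and the transform name `hypothetical_multi_arg` are all assumptions made for illustration; only `source-id`, `field-id`, `name`, `transform`, `spec-id`, and `fields` come from the table in the diff above.

```python
import json
from dataclasses import dataclass
from typing import Dict, List


@dataclass
class PartitionField:
    """Hypothetical in-memory form of one partition field."""
    source_ids: List[int]  # one id for single-arg transforms, several for multi-arg
    field_id: int
    name: str
    transform: str

    def to_json_dict(self) -> Dict[str, object]:
        # Assumption: single-source fields keep the existing "source-id" key,
        # while multi-source fields use a "source-ids" list instead.
        if len(self.source_ids) == 1:
            source: Dict[str, object] = {"source-id": self.source_ids[0]}
        else:
            source = {"source-ids": list(self.source_ids)}
        return {
            **source,
            "field-id": self.field_id,
            "name": self.name,
            "transform": self.transform,
        }


def spec_to_json(spec_id: int, fields: List[PartitionField]) -> str:
    """Serialize a partition spec as a JSON object with spec-id and fields."""
    return json.dumps(
        {"spec-id": spec_id, "fields": [f.to_json_dict() for f in fields]},
        indent=2,
    )


if __name__ == "__main__":
    fields = [
        PartitionField([4], 1000, "ts_day", "day"),
        # Placeholder multi-arg transform name, not a real Iceberg transform.
        PartitionField([1, 2], 1001, "id_multi", "hypothetical_multi_arg"),
    ]
    print(spec_to_json(0, fields))
```

Keeping `source-id` for the single-column case is one way to leave existing v1/v2 metadata readable. Whether the spec should do that, and which compatibility flag applies, is exactly what the appendix would need to state.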



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@iceberg.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org
