rdblue commented on code in PR #12644:
URL: https://github.com/apache/iceberg/pull/12644#discussion_r2027429855


##########
format/spec.md:
##########
@@ -1414,12 +1414,16 @@ Each partition field in `fields` is stored as a JSON object with the following p
 
 | V1       | V2       | V3       | Field            | JSON representation | Example      |
 |----------|----------|----------|------------------|---------------------|--------------|
-| required | required | omitted  | **`source-id`**  | `JSON int`          | 1            |

Review Comment:
   I noted this below, but I think we need to fix the issue where we have 2 
required fields that may not actually be present. I think we should make 
`source-ids` required and always write it for v3. I think it is best to have 
simple and clear rules.
   
   Then the question is how to handle `source-id`. If we are saying that 
`source-ids` is required and should always be used, then it doesn't make much 
sense to me to continue writing `source-id`. At the same time, I don't really 
have a problem if it is still produced but ignored. I would recommend:
   1. First read `source-ids`
   2. If it is not present, fall back to reading `source-id`
   3. Always write `source-ids` in newer versions
   4. If it is hard to write only `source-ids` then it is okay to also write 
`source-id`
   
   I think that would make it so we don't need to worry about the conflict 
between the two fields. It is also fairly permissive when reading, if we want 
that. I think it would also be fine to validate that `source-ids` is always 
used for v3 tables, but it sounds like that's additional parser work.
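
   The four rules above could be sketched roughly as follows. This is only an 
illustration in Python, not the Iceberg parser: the function names 
`read_source_ids` and `write_source_ids` and the `also_write_source_id` flag 
are hypothetical. Emitting `source-id` only for single-source fields is also an 
assumption, and gating on format version is left out.

```python
from typing import Any, Dict, List


def read_source_ids(field: Dict[str, Any]) -> List[int]:
    """Return the source column IDs for one partition field JSON object."""
    # Rule 1: read `source-ids` first; it wins even if `source-id` is also present.
    if "source-ids" in field:
        return list(field["source-ids"])
    # Rule 2: fall back to the single-valued `source-id`.
    if "source-id" in field:
        return [field["source-id"]]
    raise ValueError("partition field has neither `source-ids` nor `source-id`")


def write_source_ids(
    source_ids: List[int], also_write_source_id: bool = False
) -> Dict[str, Any]:
    """Build the source-column part of a partition field JSON object."""
    # Rule 3: always write `source-ids`.
    out: Dict[str, Any] = {"source-ids": list(source_ids)}
    # Rule 4: optionally also write `source-id`.
    # Assumption: this only makes sense when there is exactly one source column.
    if also_write_source_id and len(source_ids) == 1:
        out["source-id"] = source_ids[0]
    return out
```

   Because the reader never looks at `source-id` when `source-ids` is present, 
a writer that emits both cannot create a conflict between the two fields.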



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@iceberg.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org

