Fokko commented on issue #6798:
URL: https://github.com/apache/iceberg/issues/6798#issuecomment-1430363718

   @haizhou-zhao Thanks again for the elaborate response.
   
   I don't think the OpenAPI spec is incorrect; rather, the generated code 
isn't smart enough to handle the complexity of the spec. We validate the spec 
in CI whenever a PR changes it, so it is technically correct. The issue here is 
that the code was written first and only afterwards captured in the spec.
   
   > Finally, since we talked about potentially correct the OpenAPI spec, I'd 
like to learn more about the "endeavor" we need to take to update it.
   
   Let me give an example. The OpenAPI spec also describes the Iceberg schema, 
and here we run into the difference between primitive types (`"string"`) and 
complex types (`{"type": "list", "element-id": 3, "element-required": true, 
"element": "string"}`). To make use of the generated code, we would need to 
change the type to always be a dictionary instead of a plain string (at least 
this was one of the issues I bumped into with PyIceberg). These schemas live 
everywhere; for example, they are encoded in the metadata and the Parquet 
metadata. If we changed this in an incompatible way, older readers wouldn't be 
able to read newer files, because they don't understand `{"type": "string"}`.
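
   To make the two shapes concrete, here is a minimal sketch of a reader that 
accepts the current encoding. The helper name `describe_type` is hypothetical 
and only for illustration; it is not part of PyIceberg or of the generated 
code. It assumes primitives arrive as plain strings and nested types as 
dictionaries, as in the example above.

```python
from typing import Any, Dict, Union

# A type in the schema JSON is either a plain string (primitive)
# or a dictionary (nested type such as list, map or struct).
TypeJson = Union[str, Dict[str, Any]]


def describe_type(type_json: TypeJson) -> str:
    """Hypothetical helper: render a schema type as a readable string."""
    if isinstance(type_json, str):
        # Primitive types are encoded as a bare string, e.g. "string".
        return type_json
    if isinstance(type_json, dict):
        kind = type_json.get("type")
        if kind == "list":
            return f"list<{describe_type(type_json['element'])}>"
        if kind == "map":
            key = describe_type(type_json["key"])
            value = describe_type(type_json["value"])
            return f"map<{key}, {value}>"
        if kind == "struct":
            fields = ", ".join(
                f"{field['name']}: {describe_type(field['type'])}"
                for field in type_json["fields"]
            )
            return f"struct<{fields}>"
        # A reader written against the current encoding ends up here
        # when it is handed {"type": "string"}.
        raise ValueError(f"Unknown nested type: {kind!r}")
    raise TypeError(f"Unexpected type encoding: {type_json!r}")


list_type = {
    "type": "list",
    "element-id": 3,
    "element-required": True,
    "element": "string",
}
print(describe_type("string"))
print(describe_type(list_type))
```

   A reader written like this rejects `{"type": "string"}` with a 
`ValueError`, which is the kind of breakage an older reader would hit if 
primitives were wrapped in a dictionary.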


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@iceberg.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org
