Fokko commented on issue #6798: URL: https://github.com/apache/iceberg/issues/6798#issuecomment-1430363718
@haizhou-zhao Thanks again for the elaborate response. I don't think the OpenAPI spec is incorrect; rather, the generated code isn't smart enough to handle the complexity of the spec. We validate the spec in CI whenever a PR changes it, so it is technically correct. The issue here is that the code was written first and then captured in the spec.

> Finally, since we talked about potentially correct the OpenAPI spec, I'd like to learn more about the "endeavor" we need to take to update it.

Let me give an example. The OpenAPI spec also describes the Iceberg schema, and here we have the issue of primitive types (`"string"`) and complex types (`{"type": "list", "element-id": 3, "element-required": true, "element": "string"}`). To make use of the generated code, we would need to change the type to always be a dictionary instead of a plain string (at least this was one of the issues I bumped into with PyIceberg). These schemas live everywhere; for example, they are encoded in the metadata and in the Parquet metadata. If we changed this in an incompatible way, older readers wouldn't be able to read newer files, because they don't understand `{"type": "string"}`.
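To make the primitive-versus-complex distinction concrete, here is a minimal Python sketch of a reader that accepts a type written either as a plain string or as a dictionary. It is an illustration only, not PyIceberg's actual parser: the `describe_type` function and the `PRIMITIVES` set are hypothetical names, and the set covers only a handful of primitive types.

```python
from typing import Any, Dict, Union

# Hypothetical, deliberately small subset of primitive type names.
PRIMITIVES = {"boolean", "int", "long", "float", "double", "string"}


def describe_type(node: Union[str, Dict[str, Any]]) -> str:
    """Render a schema type node as a readable string.

    Primitives arrive as plain strings; complex types arrive as dictionaries.
    Only the "list" complex type is handled in this sketch.
    """
    if isinstance(node, str):
        if node not in PRIMITIVES:
            raise ValueError(f"unknown primitive type: {node!r}")
        return node
    if isinstance(node, dict) and node.get("type") == "list":
        # The element is itself a type node, so recurse into it.
        return f"list<{describe_type(node['element'])}>"
    raise ValueError(f"unsupported type node: {node!r}")


list_node = {
    "type": "list",
    "element-id": 3,
    "element-required": True,
    "element": "string",
}

print(describe_type("string"))
print(describe_type(list_node))
```

A reader written this way rejects `{"type": "string"}` with a `ValueError`, which mirrors the compatibility concern: a reader that only knows the plain-string form of a primitive cannot interpret the dictionary form.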