Fokko commented on code in PR #7523: URL: https://github.com/apache/iceberg/pull/7523#discussion_r1184787197
##########
python/pyiceberg/io/pyarrow.py:
##########
@@ -612,14 +612,40 @@ def _get_field_doc(field: pa.Field) -> Optional[str]:
 class _ConvertToIceberg(PyArrowSchemaVisitor[Union[IcebergType, Schema]]):
+    def __init__(self, expected_schema: Optional[Schema] = None):
+        self.expected_schema = expected_schema
+
+    def cast_if_needed(self, field_id: int, field_type: IcebergType) -> IcebergType:

Review Comment:
   For context, I was afraid that this would also affect other logical types, such as dates. But that seems to work: PyArrow reads a date directly as a `date32`, so no promotion is needed. Digging into it, it looks like the source of the bug is upstream in Arrow, which doesn't pick up the UUID logical type on the Parquet field.
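   To make the idea concrete, here is a minimal sketch of the kind of promotion I have in mind. It is not the code in this PR: `FixedType`, `UUIDType`, and `DateType` are hypothetical stand-in dataclasses rather than the pyiceberg classes, and the `expected` mapping from field ID to type is an assumed simplification of the expected schema. It also assumes the UUID column comes back from Arrow as a 16-byte fixed type, which is how I read the behaviour described above.

```python
from dataclasses import dataclass
from typing import Dict, Optional


# Hypothetical stand-ins for the Iceberg types; not the pyiceberg classes.
@dataclass(frozen=True)
class FixedType:
    length: int


@dataclass(frozen=True)
class UUIDType:
    pass


@dataclass(frozen=True)
class DateType:
    pass


def cast_if_needed(
    field_id: int,
    field_type: object,
    expected: Optional[Dict[int, object]] = None,
) -> object:
    """Promote a 16-byte fixed type to UUID when the expected schema asks for it.

    `expected` maps field IDs to the types of the expected schema (an
    assumption made for this sketch). Every other type passes through unchanged.
    """
    if expected is None:
        # No expected schema available, so there is nothing to promote to.
        return field_type
    wanted = expected.get(field_id)
    if isinstance(wanted, UUIDType) and field_type == FixedType(16):
        # Assumed case: the UUID logical type was lost on read.
        return wanted
    return field_type
```

   With this shape, a date field falls through untouched, which matches what I saw: it already arrives with the right type, so only the UUID case needs the promotion.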
