syun64 commented on code in PR #921: URL: https://github.com/apache/iceberg-python/pull/921#discussion_r1678639461
##########
pyiceberg/io/pyarrow.py:
##########
@@ -1549,9 +1552,16 @@ def __init__(self, iceberg_type: PrimitiveType, physical_type_string: str, trunc
         expected_physical_type = _primitive_to_physical(iceberg_type)
         if expected_physical_type != physical_type_string:
-            raise ValueError(
-                f"Unexpected physical type {physical_type_string} for {iceberg_type}, expected {expected_physical_type}"
-            )
+            # Allow promotable physical types
+            # INT32 -> INT64 and FLOAT -> DOUBLE are safe type casts
+            if (physical_type_string == "INT32" and expected_physical_type == "INT64") or (
+                physical_type_string == "FLOAT" and expected_physical_type == "DOUBLE"

Review Comment:
   I added this logic so that StatsAggregator can collect stats for files added through `add_files` whose field types map to broader Iceberg schema types.

   This feels overly specific, and I feel I am duplicating the type [promote](https://github.com/apache/iceberg-python/blob/e27cd9095503cfe9fa7e0a806ba25d42920c68c5/pyiceberg/schema.py#L1551) mappings in a different format. I'm open to other ideas if we want to keep this check on the Parquet physical types.
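
   For discussion, here is a minimal sketch of one way the promotion check could be made table-driven rather than spelled out as chained comparisons. It is not pyiceberg code: `ALLOWED_PHYSICAL_PROMOTIONS` and `check_physical_type` are hypothetical names, and the only pairs listed are the two mentioned in the diff (INT32 -> INT64 and FLOAT -> DOUBLE). It does not remove the duplication with the `promote` mappings; it only keeps the allowed pairs in one place.

```python
# Hypothetical sketch, not part of pyiceberg: names below are illustrative only.
# Each pair is (physical type found in the file, physical type expected from the Iceberg type).
ALLOWED_PHYSICAL_PROMOTIONS = frozenset(
    {
        ("INT32", "INT64"),   # safe widening of integers
        ("FLOAT", "DOUBLE"),  # safe widening of floating point values
    }
)


def check_physical_type(physical_type_string: str, expected_physical_type: str) -> None:
    """Raise ValueError unless the file type equals or safely widens to the expected type."""
    if physical_type_string == expected_physical_type:
        return
    if (physical_type_string, expected_physical_type) in ALLOWED_PHYSICAL_PROMOTIONS:
        return
    raise ValueError(
        f"Unexpected physical type {physical_type_string}, expected {expected_physical_type}"
    )
```

   With this shape, allowing another promotion would mean adding one tuple to the set, assuming the new pair is also a safe cast.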