syun64 commented on code in PR #921:
URL: https://github.com/apache/iceberg-python/pull/921#discussion_r1678639461


##########
pyiceberg/io/pyarrow.py:
##########
@@ -1549,9 +1552,16 @@ def __init__(self, iceberg_type: PrimitiveType, physical_type_string: str, trunc
 
         expected_physical_type = _primitive_to_physical(iceberg_type)
         if expected_physical_type != physical_type_string:
-            raise ValueError(
-                f"Unexpected physical type {physical_type_string} for {iceberg_type}, expected {expected_physical_type}"
-            )
+            # Allow promotable physical types
+            # INT32 -> INT64 and FLOAT -> DOUBLE are safe type casts
+            if (physical_type_string == "INT32" and expected_physical_type == "INT64") or (
+                physical_type_string == "FLOAT" and expected_physical_type == "DOUBLE"

Review Comment:
   I added this logic so that StatsAggregator can collect stats for files 
added through `add_files` whose field types map to broader Iceberg schema 
types. It feels overly specific, and it duplicates the type promotion mappings 
in a different format. I'm open to other ideas if we want to keep this check 
on the Parquet physical types.
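
   One possible shape for this, as a rough sketch only: keep the allowed 
widenings in a single lookup table so the check does not hard-code each pair 
inline. The names `_PROMOTABLE_PHYSICAL_TYPES`, `is_compatible_physical_type` 
and `check_physical_type` are hypothetical and are not existing pyiceberg 
APIs; the table only covers the two promotions already listed in the diff.

```python
from typing import Dict, FrozenSet

# Hypothetical table (not part of pyiceberg): Parquet physical type found in
# the file -> physical types it may be safely widened to.
_PROMOTABLE_PHYSICAL_TYPES: Dict[str, FrozenSet[str]] = {
    "INT32": frozenset({"INT64"}),
    "FLOAT": frozenset({"DOUBLE"}),
}


def is_compatible_physical_type(file_type: str, expected_type: str) -> bool:
    """Return True if the types match or file_type can be widened to expected_type."""
    if file_type == expected_type:
        return True
    return expected_type in _PROMOTABLE_PHYSICAL_TYPES.get(file_type, frozenset())


def check_physical_type(file_type: str, expected_type: str, iceberg_type: str) -> None:
    """Raise ValueError, with the same message as the original check, on a mismatch."""
    if not is_compatible_physical_type(file_type, expected_type):
        raise ValueError(
            f"Unexpected physical type {file_type} for {iceberg_type}, expected {expected_type}"
        )
```

   This still duplicates the promotion rules in string form, so it only 
narrows the concern above; deriving the table from the schema-level promotion 
rules would remove the duplication entirely.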



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@iceberg.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org

