syun64 commented on code in PR #921:
URL: https://github.com/apache/iceberg-python/pull/921#discussion_r1678639461


##########
pyiceberg/io/pyarrow.py:
##########
@@ -1549,9 +1552,16 @@ def __init__(self, iceberg_type: PrimitiveType, physical_type_string: str, trunc
 
         expected_physical_type = _primitive_to_physical(iceberg_type)
         if expected_physical_type != physical_type_string:
-            raise ValueError(
-                f"Unexpected physical type {physical_type_string} for {iceberg_type}, expected {expected_physical_type}"
-            )
+            # Allow promotable physical types
+            # INT32 -> INT64 and FLOAT -> DOUBLE are safe type casts
+            if (physical_type_string == "INT32" and expected_physical_type == "INT64") or (
+                physical_type_string == "FLOAT" and expected_physical_type == "DOUBLE"

Review Comment:
   I've taken this targeted approach to allow StatsAggregator to collect 
stats for files added through `add_files` whose schema maps to broader 
types. It feels overly specific, and I feel as though I am duplicating the 
type promotion mappings in a different format. I'm open to other ideas if we 
want to keep this check on the Parquet physical types.
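
A minimal sketch of one way to make the check less ad hoc, assuming the allowed widenings are kept as a single lookup table of (file physical type, expected physical type) pairs. The names `_ALLOWED_PHYSICAL_PROMOTIONS` and `check_physical_type` are hypothetical and not part of pyiceberg; only the two promotions named in the diff above are included.

```python
# Hypothetical helper, not pyiceberg API: a table-driven version of the check.
# Each pair is (physical type found in the file, physical type expected for the
# Iceberg field). Only the two widenings from the diff are listed.
_ALLOWED_PHYSICAL_PROMOTIONS = frozenset(
    {
        ("INT32", "INT64"),
        ("FLOAT", "DOUBLE"),
    }
)


def check_physical_type(physical_type_string: str, expected_physical_type: str) -> None:
    """Raise ValueError unless the types match or the file type safely widens."""
    if physical_type_string == expected_physical_type:
        return
    if (physical_type_string, expected_physical_type) in _ALLOWED_PHYSICAL_PROMOTIONS:
        return
    raise ValueError(
        f"Unexpected physical type {physical_type_string}, expected {expected_physical_type}"
    )
```

This still duplicates the promotion rules at the physical-type level, but keeps them in one place, so adding or removing a widening is a one-line change.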



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@iceberg.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org

