syun64 commented on code in PR #848: URL: https://github.com/apache/iceberg-python/pull/848#discussion_r1667028145
##########
pyiceberg/io/pyarrow.py:
##########
@@ -918,11 +919,24 @@ def primitive(self, primitive: pa.DataType) -> PrimitiveType:
             return TimeType()
         elif pa.types.is_timestamp(primitive):
             primitive = cast(pa.TimestampType, primitive)
-            if primitive.unit == "us":
-                if primitive.tz == "UTC" or primitive.tz == "+00:00":
-                    return TimestamptzType()
-                elif primitive.tz is None:
-                    return TimestampType()
+            if primitive.unit in ("s", "ms", "us"):
+                # Supported types, will be upcast automatically to 'us'
+                pass
+            elif primitive.unit == "ns":
+                if Config().get_bool("downcast-ns-timestamp-on-write"):

Review Comment:
Hi folks - thank you all for the valuable feedback.

So it sounds like what we want is for the behavior to be controlled by the configuration flag, but for that flag to be passed as a parameter to the `schema_to_pyarrow` API, so that the API's behavior is fully determined by its input parameters.

I've made the following changes:

1. Introduced `downcast_ns_timestamp_to_us` as a new input parameter to the `pyarrow_to_schema` and `to_requested_schema` public APIs.
2. `table`- and `catalog`-level functions (e.g. `_check_schema_compatible` and `_convert_schema_if_needed`) now infer the flag from the Config on write.
3. Always downcast `ns` to `us` on read if there is an `ns` timestamp in the parquet file. We will want to revise this behavior when we introduce nanosecond support in the V3 spec, but until then I think it's a reasonable assumption that data files in Iceberg will only be read with microsecond precision.

https://github.com/apache/iceberg-python/pull/848/files#diff-8d5e63f2a87ead8cebe2fd8ac5dcf2198d229f01e16bb9e06e21f7277c328abdR1030-R1033

--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the URL above to go to the specific comment.
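As a rough illustration of the split described in the review comment (an explicit parameter on the conversion function, with the Config lookup done only by the caller), here is a minimal stand-alone sketch. It is not the code from the PR: the function names `timestamp_to_iceberg`, `convert_on_write`, and `convert_on_read` are hypothetical, plain strings stand in for `TimestampType` / `TimestamptzType`, and a dict stands in for the `Config` object. Only the parameter name `downcast_ns_timestamp_to_us` and the config key `downcast-ns-timestamp-on-write` are taken from the comment and diff.

```python
from typing import Mapping, Optional

# Hypothetical stand-ins: strings instead of pyiceberg's TimestampType /
# TimestamptzType, and a plain mapping instead of pyiceberg's Config.
UTC_ALIASES = ("UTC", "+00:00")


def timestamp_to_iceberg(
    unit: str,
    tz: Optional[str],
    downcast_ns_timestamp_to_us: bool = False,
) -> str:
    """Map an Arrow-style timestamp (unit, tz) to an Iceberg type name.

    The behavior depends only on the arguments; no global config is read here.
    """
    if unit in ("s", "ms", "us"):
        pass  # supported units; the upcast to 'us' is assumed to happen elsewhere
    elif unit == "ns":
        if not downcast_ns_timestamp_to_us:
            raise TypeError("'ns' timestamps require downcasting to 'us'")
    else:
        raise TypeError(f"Unsupported timestamp unit: {unit}")

    if tz in UTC_ALIASES:
        return "timestamptz"
    if tz is None:
        return "timestamp"
    raise TypeError(f"Unsupported timezone: {tz}")


def convert_on_write(unit: str, tz: Optional[str], config: Mapping[str, bool]) -> str:
    """Write path: the caller looks up the flag and passes it down explicitly."""
    flag = bool(config.get("downcast-ns-timestamp-on-write", False))
    return timestamp_to_iceberg(unit, tz, downcast_ns_timestamp_to_us=flag)


def convert_on_read(unit: str, tz: Optional[str]) -> str:
    """Read path: 'ns' is always downcast, regardless of configuration."""
    return timestamp_to_iceberg(unit, tz, downcast_ns_timestamp_to_us=True)
```

With this shape, the low-level function can be tested purely through its arguments, and only the write-path wrapper needs to know where the flag comes from.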
To unsubscribe, e-mail: issues-unsubscr...@iceberg.apache.org
For queries about this service, please contact Infrastructure at: us...@infra.apache.org
For additional commands, e-mail: issues-h...@iceberg.apache.org