Fokko commented on code in PR #848: URL: https://github.com/apache/iceberg-python/pull/848#discussion_r1666824972
########## pyiceberg/io/pyarrow.py: ########## @@ -918,11 +919,24 @@ def primitive(self, primitive: pa.DataType) -> PrimitiveType: return TimeType() elif pa.types.is_timestamp(primitive): primitive = cast(pa.TimestampType, primitive) - if primitive.unit == "us": - if primitive.tz == "UTC" or primitive.tz == "+00:00": - return TimestamptzType() - elif primitive.tz is None: - return TimestampType() + if primitive.unit in ("s", "ms", "us"): + # Supported types, will be upcast automatically to 'us' + pass + elif primitive.unit == "ns": + if Config().get_bool("downcast-ns-timestamp-on-write"): Review Comment: I also got some issues with the nanosecond timestamp when collecting statistics: ``` > ??? E ValueError: Nanosecond resolution temporal type 1615967687249846175 is not safely convertible to microseconds to convert to datetime.datetime. Install pandas to return as Timestamp with nanosecond support or access the .value attribute. ``` At the lines: https://github.com/apache/iceberg-python/blob/7afd6d677d0fbc73d6baa8a5b57bdd889d0afe87/pyiceberg/io/pyarrow.py#L1870-L1871 This got fixed after updating this to: ```python col_aggs[field_id].update_min(statistics.min_raw) col_aggs[field_id].update_max(statistics.max_raw) ``` -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: issues-unsubscr...@iceberg.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org --------------------------------------------------------------------- To unsubscribe, e-mail: issues-unsubscr...@iceberg.apache.org For additional commands, e-mail: issues-h...@iceberg.apache.org