corleyma commented on code in PR #848: URL: https://github.com/apache/iceberg-python/pull/848#discussion_r1663349324

##########
pyiceberg/io/pyarrow.py:
##########
@@ -918,11 +919,24 @@ def primitive(self, primitive: pa.DataType) -> PrimitiveType:
             return TimeType()
         elif pa.types.is_timestamp(primitive):
             primitive = cast(pa.TimestampType, primitive)
-            if primitive.unit == "us":
-                if primitive.tz == "UTC" or primitive.tz == "+00:00":
-                    return TimestamptzType()
-                elif primitive.tz is None:
-                    return TimestampType()
+            if primitive.unit in ("s", "ms", "us"):
+                # Supported types, will be upcast automatically to 'us'
+                pass
+            elif primitive.unit == "ns":
+                if Config().get_bool("downcast-ns-timestamp-on-write"):

Review Comment:
What @Fokko says matches my expectations.

I think the default behavior for pyiceberg should be to fail when attempting to write ns-precision timestamps to a <=v2 table.

I think that, using pyiceberg config, you should be able to have write operations succeed with an automatic downcast to microsecond precision.

I think the `schema_to_pyarrow` and `pyarrow_to_schema` APIs are very useful public APIs whose behavior should be fully controlled by their parameters.

And finally... I'm not sure it is totally germane to this PR, but while I don't have much context for the `add_files` implementation on the Java side, I do think it is a useful API. I imagine it will be particularly useful for projects that want to integrate Iceberg writing but have their own ways of scheduling/parallelizing writes (that is, not using the pyarrow-based write path that ships with pyiceberg). Making sure the `add_files` API is as safe as possible while still being performant sounds like a good idea to me.
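
To make the suggestion concrete, here is a minimal sketch of what "controlled by parameters" could look like for the timestamp-unit decision. It is stdlib-only and does not use pyiceberg or pyarrow; the names `resolve_timestamp_unit`, `downcast_ns`, and `ns_to_us` are hypothetical and chosen for illustration, not taken from the PR.

```python
def resolve_timestamp_unit(unit: str, downcast_ns: bool = False) -> str:
    """Return the unit a timestamp column would be stored with.

    Hypothetical helper: the caller passes the downcast decision in
    explicitly instead of the function reading global configuration.
    """
    if unit in ("s", "ms", "us"):
        # Coarser or equal precision: can be upcast to microseconds losslessly.
        return "us"
    if unit == "ns":
        if downcast_ns:
            # Caller opted in to losing sub-microsecond precision.
            return "us"
        # Default: refuse rather than silently drop precision.
        raise TypeError(
            "Nanosecond timestamps are not supported here; "
            "pass downcast_ns=True to downcast to microseconds"
        )
    raise TypeError(f"Unsupported timestamp unit: {unit!r}")


def ns_to_us(value_ns: int) -> int:
    """Downcast an epoch value from nanoseconds to microseconds.

    Floor division discards the sub-microsecond remainder.
    """
    return value_ns // 1_000
```

In this sketch a write path could read the config value once and pass it as `downcast_ns`, while direct callers of the conversion API get the failing default unless they opt in.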