Re: [PR] Cast 's', 'ms' and 'ns' PyArrow timestamp to 'us' precision on write [iceberg-python]

via GitHub Fri, 05 Jul 2024 06:43:48 -0700


Fokko commented on code in PR #848:
URL: https://github.com/apache/iceberg-python/pull/848#discussion_r1666586779



##########
mkdocs/docs/configuration.md:
##########
@@ -299,4 +299,8 @@ PyIceberg uses multiple threads to parallelize operations. The number of workers
 
 # Backward Compatibility
 
-Previous versions of Java (`<1.4.0`) implementations incorrectly assume the 
optional attribute `current-snapshot-id` to be a required attribute in 
TableMetadata. This means that if `current-snapshot-id` is missing in the 
metadata file (e.g. on table creation), the application will throw an exception 
without being able to load the table. This assumption has been corrected in 
more recent Iceberg versions. However, it is possible to force PyIceberg to 
create a table with a metadata file that will be compatible with previous 
versions. This can be configured by setting the `legacy-current-snapshot-id` 
entry as "True" in the configuration file, or by setting the 
`PYICEBERG_LEGACY_CURRENT_SNAPSHOT_ID` environment variable. Refer to the [PR 
discussion](https://github.com/apache/iceberg-python/pull/473) for more details 
on the issue
+Previous versions of Java (`<1.4.0`) implementations incorrectly assume the 
optional attribute `current-snapshot-id` to be a required attribute in 
TableMetadata. This means that if `current-snapshot-id` is missing in the 
metadata file (e.g. on table creation), the application will throw an exception 
without being able to load the table. This assumption has been corrected in 
more recent Iceberg versions. However, it is possible to force PyIceberg to 
create a table with a metadata file that will be compatible with previous 
versions. This can be configured by setting the `legacy-current-snapshot-id` 
property as "True" in the configuration file, or by setting the 
`PYICEBERG_LEGACY_CURRENT_SNAPSHOT_ID` environment variable. Refer to the [PR 
discussion](https://github.com/apache/iceberg-python/pull/473) for more details 
on the issue
+
+# Nanoseconds Support
+
+PyIceberg currently only supports upto microsecond precision in its 
TimestampType. PyArrow timestamp types in 's' and 'ms' will be upcast 
automatically to 'us' precision timestamps on write. Timestamps in 'ns' 
precision can also be downcast automatically on write if desired. This can be 
configured by setting the `downcast-ns-timestamp-on-write` property as "True" 
in the configuration file, or by setting the 
`PYICEBERG_DOWNCAST_NS_TIMESTAMP_ON_WRITE` environment variable. Refer to the 
[nanoseconds timestamp proposal 
document](https://docs.google.com/document/d/1bE1DcEGNzZAMiVJSZ0X1wElKLNkT9kRkk0hDlfkXzvU/edit#heading=h.ibflcctc9i1d)
 for more details on the long term roadmap for nanoseconds support

Review Comment:
   ```suggestion
   PyIceberg currently only supports microsecond precision in its 
TimestampType. PyArrow timestamp types in 's' and 'ms' will be upcast 
automatically to 'us' precision timestamps on write. Timestamps in 'ns' 
precision can also be downcast automatically on write if desired. This can be 
configured by setting the `downcast-ns-timestamp-on-write` property as "True" 
in the configuration file, or by setting the 
`PYICEBERG_DOWNCAST_NS_TIMESTAMP_ON_WRITE` environment variable. Refer to the 
[nanoseconds timestamp proposal 
document](https://docs.google.com/document/d/1bE1DcEGNzZAMiVJSZ0X1wElKLNkT9kRkk0hDlfkXzvU/edit#heading=h.ibflcctc9i1d)
 for more details on the long term roadmap for nanoseconds support
   ```
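
For readers of this thread, the unit handling described in the suggestion can be sketched in plain Python. This is only an illustration of the arithmetic, not PyIceberg's or PyArrow's implementation: the helper `to_microseconds` and its `downcast_ns` flag are hypothetical names standing in for the `downcast-ns-timestamp-on-write` setting, and the use of floor division when dropping sub-microsecond digits is an assumption.

```python
# Hypothetical helper: illustrates the unit conversions only, not PyIceberg's code.
_US_PER_UNIT = {"s": 1_000_000, "ms": 1_000, "us": 1}


def to_microseconds(value: int, unit: str, downcast_ns: bool = False) -> int:
    """Convert an integer epoch timestamp expressed in `unit` to microseconds."""
    if unit in _US_PER_UNIT:
        # Upcasting 's' and 'ms' is lossless: multiply by microseconds per unit.
        return value * _US_PER_UNIT[unit]
    if unit == "ns":
        if not downcast_ns:
            # Mirrors the opt-in nature of the setting: 'ns' is rejected by default.
            raise ValueError("'ns' timestamps require downcasting to be enabled")
        # Assumption: sub-microsecond digits are dropped via floor division.
        return value // 1_000
    raise ValueError(f"unsupported unit: {unit!r}")


print(to_microseconds(1, "s"))
print(to_microseconds(1_500, "ms"))
print(to_microseconds(1_234_567, "ns", downcast_ns=True))
```

The point of the sketch is that the 's' and 'ms' cases lose no information, which is why they can be applied automatically, while the 'ns' case discards digits and is therefore gated behind an explicit opt-in.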



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]


