kevinjqliu opened a new pull request, #2333:
URL: https://github.com/apache/iceberg-python/pull/2333
# Rationale for this change
I ran into an interesting edge case while testing metadata virtualization
between Delta and Iceberg.
Delta has both [TIMESTAMP and TIMESTAMP_NTZ data
types](https://docs.databricks.com/aws/en/sql/language-manual/sql-ref-datatypes).
TIMESTAMP has a timezone while TIMESTAMP_NTZ has no timezone.
Iceberg, meanwhile, has [timestamp and
timestamptz](https://iceberg.apache.org/spec/#primitive-types): timestamp has
no timezone and timestamptz has a timezone.
So Delta's TIMESTAMP -> Iceberg timestamptz and Delta's TIMESTAMP_NTZ ->
Iceberg timestamp.
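
The mapping above can be written down as a small lookup table. This is only an illustrative sketch: the dictionary `DELTA_TO_ICEBERG_TIMESTAMP` and the helper `iceberg_type_for` are hypothetical names and are not part of pyiceberg or Delta.

```python
# Hypothetical lookup table for the type mapping described above.
# Keys are Delta type names, values are Iceberg primitive type names.
DELTA_TO_ICEBERG_TIMESTAMP = {
    "TIMESTAMP": "timestamptz",    # has a timezone on both sides
    "TIMESTAMP_NTZ": "timestamp",  # no timezone on both sides
}


def iceberg_type_for(delta_type: str) -> str:
    """Return the Iceberg type name for a Delta timestamp type (illustrative helper)."""
    return DELTA_TO_ICEBERG_TIMESTAMP[delta_type.upper()]


print(iceberg_type_for("TIMESTAMP"))      # timestamptz
print(iceberg_type_for("TIMESTAMP_NTZ"))  # timestamp
```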
Regardless of Delta or Iceberg, the [Parquet file stores the timestamp without
the timezone
information](https://github.com/apache/parquet-format/blob/1dbc814b97c9307687a2e4bee55545ab6a2ef106/LogicalTypes.md#timestamp).
So I ended up with a Parquet file with a timestamp column and an Iceberg table
with a timestamptz column, and pyiceberg cannot read this table. It's hard to
recreate the scenario, but I did trace it to the `_to_requested_schema` function.
```
E pyiceberg.exceptions.ResolveError: Cannot promote timestamp to timestamptz
```
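
To illustrate the mismatch outside of pyiceberg, here is a minimal standard-library sketch. It assumes, purely for illustration, that a timezone-naive value read from the file can be interpreted as a UTC instant. The helper name `promote_naive_to_utc` is hypothetical, and this is not necessarily how the fix in `_to_requested_schema` is implemented.

```python
from datetime import datetime, timezone


def promote_naive_to_utc(value: datetime) -> datetime:
    """Attach UTC to a timezone-naive datetime (hypothetical helper).

    Assumption: the naive value was written as a UTC instant.
    """
    if value.tzinfo is not None:
        # Already timezone-aware: normalize to UTC instead of overwriting.
        return value.astimezone(timezone.utc)
    return value.replace(tzinfo=timezone.utc)


file_value = datetime(2024, 1, 1, 12, 0, 0)                        # like `timestamp`
table_value = datetime(2024, 1, 1, 12, 0, 0, tzinfo=timezone.utc)  # like `timestamptz`

# Naive and aware datetimes never compare equal, mirroring the type mismatch.
print(file_value == table_value)                        # False
print(promote_naive_to_utc(file_value) == table_value)  # True
```

The sketch only shows why the two representations cannot be used interchangeably without an explicit rule for the missing timezone.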
# Are these changes tested?
# Are there any user-facing changes?
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at:
[email protected]