HonahX commented on code in PR #585: URL: https://github.com/apache/iceberg-python/pull/585#discussion_r1554855296
########## pyiceberg/catalog/hive.py: ########## @@ -199,6 +184,7 @@ def _annotate_namespace(database: HiveDatabase, properties: Properties) -> HiveD DateType: "date", TimeType: "string", TimestampType: "timestamp", + TimestamptzType: "timestamp", Review Comment: Theoretically, the content here should be constants in [serdeConstants](https://github.com/apache/hive/blob/master/serde/src/gen/thrift/gen-javabean/org/apache/hadoop/hive/serde/serdeConstants.java#L76-L86) since java use these to [test the Hive schema util](https://github.com/apache/iceberg/blob/main/hive-metastore/src/test/java/org/apache/iceberg/hive/TestHiveSchemaUtil.java#L213) Among these types, `timestamp with local time zone` is a new one added in Hive3: [ref](https://github.com/apache/iceberg/pull/1897). Interestingly, when doing integration tests (use pyiceberg to write and spark to read) I found that Hive server will raise error for `timestampabc` but work normally for `timestamp arbitrary string`. Seems as long as the first word is within serdeConstants there will be no error loading the table. Not sure if `timestamp arbitrary string` will cause error in other Hive use cases. How about we default to "timestamp with local time zone" and add a HiveCatalog property (e.g. `hive.hive2-compatible-mode`) to use "timestamp" here when it set to `true`? (Will add the integration test soon) -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: issues-unsubscr...@iceberg.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org --------------------------------------------------------------------- To unsubscribe, e-mail: issues-unsubscr...@iceberg.apache.org For additional commands, e-mail: issues-h...@iceberg.apache.org