[ https://issues.apache.org/jira/browse/SPARK-51359?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=18000560#comment-18000560 ]

Andrew Lamb commented on SPARK-51359:
-------------------------------------
Here is some additional information from [~alkis] on the Parquet mailing list:
https://lists.apache.org/thread/ybthxfznokkvmm66r412f6m00ywmvvh8
> I also checked internally with the Spark OSS team and the plan for having
> INT64 timestamps in Spark by default is to make the change when Delta v5
> and Iceberg v4 are proposed. This is expected to happen around the first
> half of 2026.

> Set INT64 as the default timestamp type for Parquet files
> ---------------------------------------------------------
>
> Key: SPARK-51359
> URL: https://issues.apache.org/jira/browse/SPARK-51359
> Project: Spark
> Issue Type: Bug
> Components: Spark Core
> Affects Versions: 3.5.5
> Reporter: Ganesha Shreedhara
> Priority: Major
> Labels: pull-request-available
>
> The INT96 timestamp type has been deprecated as part of PARQUET-323. However,
> Apache Spark still uses INT96 as the default outputTimestampType for Parquet
> files ([code link|https://github.com/apache/spark/blob/master/sql/catalyst/src/main/scala/org/apache/spark/sql/internal/SQLConf.scala#L1157]).
> This could create incompatibilities when Parquet data written by Spark is
> read by readers that do not support the INT96 type. We should consider
> changing the default outputTimestampType to INT64 unless there is a
> compelling reason to maintain INT96 as the default option.
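
For readers less familiar with the two encodings, here is a rough sketch of what a reader has to do to interpret an INT96 timestamp and map it onto an INT64 microsecond value. It assumes the commonly used INT96 layout (8 bytes of nanoseconds within the day followed by 4 bytes of Julian day number, both little-endian). The function names are hypothetical and are not taken from Spark or parquet-mr; this is only an illustration, not how either project implements the conversion.

```python
import struct

# Julian day number of 1970-01-01, the Unix epoch.
JULIAN_DAY_OF_UNIX_EPOCH = 2_440_588
MICROS_PER_DAY = 86_400 * 1_000_000
NANOS_PER_MICRO = 1_000


def int96_to_micros(raw: bytes) -> int:
    """Decode a 12-byte INT96 timestamp into microseconds since the Unix epoch.

    Assumed layout: little-endian int64 nanoseconds-of-day, then
    little-endian int32 Julian day number. Sub-microsecond precision is dropped.
    """
    if len(raw) != 12:
        raise ValueError("an INT96 timestamp must be exactly 12 bytes")
    nanos_of_day, julian_day = struct.unpack("<qi", raw)
    days_since_epoch = julian_day - JULIAN_DAY_OF_UNIX_EPOCH
    return days_since_epoch * MICROS_PER_DAY + nanos_of_day // NANOS_PER_MICRO


def micros_to_int96(micros: int) -> bytes:
    """Encode microseconds since the Unix epoch using the same assumed layout."""
    # divmod floors, so timestamps before 1970 get a non-negative time of day.
    days_since_epoch, micros_of_day = divmod(micros, MICROS_PER_DAY)
    return struct.pack(
        "<qi",
        micros_of_day * NANOS_PER_MICRO,
        days_since_epoch + JULIAN_DAY_OF_UNIX_EPOCH,
    )


# Round trip for 2024-01-01T00:00:00Z expressed in microseconds.
example = 1_704_067_200_000_000
assert int96_to_micros(micros_to_int96(example)) == example
```

Until the default changes, writers that need INT64 output can set spark.sql.parquet.outputTimestampType to TIMESTAMP_MICROS (or TIMESTAMP_MILLIS) instead of INT96.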