This is an automated email from the ASF dual-hosted git repository.
MaxGekk pushed a commit to branch branch-4.x
in repository https://gitbox.apache.org/repos/asf/spark.git
The following commit(s) were added to refs/heads/branch-4.x by this push:
new ed5ceaf137e1 [SPARK-56969][SQL] Enhance the SQL config `spark.sql.timestampNanosTypes.enabled`
ed5ceaf137e1 is described below
commit ed5ceaf137e1fee7766d201845619ebff58d7f25
Author: Maxim Gekk <[email protected]>
AuthorDate: Tue May 26 17:47:44 2026 +0200
[SPARK-56969][SQL] Enhance the SQL config `spark.sql.timestampNanosTypes.enabled`
### What changes were proposed in this pull request?
This PR completes
[SPARK-56969](https://issues.apache.org/jira/browse/SPARK-56969) on top of the
parser gating added in
[SPARK-56965](https://issues.apache.org/jira/browse/SPARK-56965) /
[#56041](https://github.com/apache/spark/pull/56041).
- Extend `TypeUtils.failUnsupportedDataType` to recursively reject
`TimestampNTZNanosType` and `TimestampLTZNanosType` when
`spark.sql.timestampNanosTypes.enabled` is off.
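- Factor the existing `FEATURE_NOT_ENABLED` error into `DataTypeErrors.timestampNanosTypesNotEnabledError()`, whose message names the conf key, and reuse it for the analysis-time check.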
- Expand the SQLConf doc for `spark.sql.timestampNanosTypes.enabled` and
align its default with `Utils.isTesting` (mirroring
`spark.sql.timeType.enabled`).
- Add a short enablement note for preview nanos timestamp types in
`docs/sql-ref-datatypes.md`.
Part of SPIP
[SPARK-56822](https://issues.apache.org/jira/browse/SPARK-56822).
### Why are the changes needed?
Parser and JSON entry points already gate parameterized nanos timestamp
types behind `spark.sql.timestampNanosTypes.enabled`, but analyzed
schemas/plans could still surface these types through other paths (for example
`CREATE TABLE`, connectors, or programmatic schemas) before downstream
execution support is ready.
Analysis-time gating closes that gap and keeps behavior consistent with the
existing preview flag.
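
As a rough illustration of what "recursively reject" means here, the sketch below models the check on a toy type hierarchy. It does not use Spark's classes: `ToyType`, `ToyGate`, and the exception type are hypothetical stand-ins, and only the shape of the traversal mirrors `containsTimestampNanosType` / `failUnsupportedDataType` in the diff.

```scala
// Toy stand-ins for Spark's DataType hierarchy; all names here are hypothetical.
sealed trait ToyType
case object ToyTimestampMicros extends ToyType
final case class ToyTimestampNanos(precision: Int) extends ToyType
final case class ToyArray(element: ToyType) extends ToyType
final case class ToyStruct(fields: Seq[(String, ToyType)]) extends ToyType

object ToyGate {
  // True if `t` itself, or any type nested inside it, satisfies `p`.
  def existsRecursively(t: ToyType)(p: ToyType => Boolean): Boolean =
    p(t) || (t match {
      case ToyArray(e)   => existsRecursively(e)(p)
      case ToyStruct(fs) => fs.exists { case (_, ft) => existsRecursively(ft)(p) }
      case _             => false
    })

  def containsNanos(t: ToyType): Boolean = existsRecursively(t) {
    case _: ToyTimestampNanos => true
    case _                    => false
  }

  // Fails only when the preview flag is off and a nanos type appears anywhere in `t`.
  def failUnsupported(t: ToyType, nanosEnabled: Boolean): Unit =
    if (!nanosEnabled && containsNanos(t)) {
      throw new IllegalStateException(
        "FEATURE_NOT_ENABLED: set spark.sql.timestampNanosTypes.enabled to true")
    }
}
```

With this shape, a nanos type buried inside a struct or array is caught just like a top-level one, which is the gap the analysis-time check is meant to close.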
### Does this PR introduce _any_ user-facing change?
Yes.
- When `spark.sql.timestampNanosTypes.enabled` is off, analyzed
schemas/plans containing `TIMESTAMP_NTZ(p)` / `TIMESTAMP_LTZ(p)` with `p` in
`[7, 9]` now fail with `FEATURE_NOT_ENABLED`, the same error the parser and JSON gates raise.
- The conf default is now `Utils.isTesting` instead of always `false`, so
tests enable the preview by default while production remains off.
- `docs/sql-ref-datatypes.md` documents how to enable the preview feature.
Unparameterized `TIMESTAMP`, `TIMESTAMP_NTZ`, and `TIMESTAMP_LTZ` behavior
is unchanged.
Example:
```sql
SET spark.sql.timestampNanosTypes.enabled=false;
CREATE TABLE t (c TIMESTAMP_NTZ(9));
-- FEATURE_NOT_ENABLED: ... "spark.sql.timestampNanosTypes.enabled" to "true" ...
```
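
The default-resolution rule described above (off in production, on under test unless explicitly overridden) can be sketched without Spark as follows. `ToyConf` and both of its methods are hypothetical. The two keys checked, `SPARK_TESTING` and `spark.testing`, are an assumption about how Spark detects a test run and are not taken from this commit.

```scala
// Hypothetical model of a conf whose default follows a "testing" flag.
object ToyConf {
  // Assumption: a test run is signalled by an env var or a system property.
  def isTesting(env: Map[String, String], props: Map[String, String]): Boolean =
    env.contains("SPARK_TESTING") || props.contains("spark.testing")

  // An explicit session setting always wins; otherwise fall back to the testing flag.
  def nanosEnabled(explicit: Option[Boolean], testing: Boolean): Boolean =
    explicit.getOrElse(testing)
}
```

This is why the updated suites wrap their rejection checks in `withSQLConf(... -> "false")`: under test the fallback is `true`, so the gate has to be switched off explicitly to exercise the error path.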
### How was this patch tested?
Added/updated unit tests:
- TypeUtilsSuite: default conf behavior, analysis gating on/off,
microsecond types unaffected
- DataTypeParserSuite: explicitly disable conf when testing parser rejection
- DataTypeSuite: explicitly disable conf when testing JSON rejection
Existing nanos parser/JSON tests continue to pass with the conf enabled.
### Was this patch authored or co-authored using generative AI tooling?
Generated-by: Cursor Auto
Closes #56094 from MaxGekk/nanos-conf.
Authored-by: Maxim Gekk <[email protected]>
Signed-off-by: Max Gekk <[email protected]>
(cherry picked from commit 5477eea9c01c6015c417940971e13b02381388b5)
Signed-off-by: Max Gekk <[email protected]>
---
docs/sql-ref-datatypes.md | 1 +
.../apache/spark/sql/errors/DataTypeErrors.scala | 18 +++---
.../org/apache/spark/sql/internal/SqlApiConf.scala | 4 +-
.../apache/spark/sql/catalyst/util/TypeUtils.scala | 12 +++-
.../org/apache/spark/sql/internal/SQLConf.scala | 18 +++---
.../sql/catalyst/parser/DataTypeParserSuite.scala | 34 ++++++------
.../spark/sql/catalyst/util/TypeUtilsSuite.scala | 64 +++++++++++++++++++++-
.../org/apache/spark/sql/types/DataTypeSuite.scala | 28 +++++-----
8 files changed, 131 insertions(+), 48 deletions(-)
diff --git a/docs/sql-ref-datatypes.md b/docs/sql-ref-datatypes.md
index 0ae05d8f46be..fe1b8724d8d6 100644
--- a/docs/sql-ref-datatypes.md
+++ b/docs/sql-ref-datatypes.md
@@ -54,6 +54,7 @@ Spark SQL and DataFrames support the following data types:
- `TimestampNTZType`: Timestamp without time zone(TIMESTAMP_NTZ). It represents values comprising values of fields year, month, day, hour, minute, and second. All operations are performed without taking any time zone into account.
- Note: TIMESTAMP in Spark is a user-specified alias associated with one of the TIMESTAMP_LTZ and TIMESTAMP_NTZ variations. Users can set the default timestamp type as `TIMESTAMP_LTZ`(default value) or `TIMESTAMP_NTZ` via the configuration `spark.sql.timestampType`.
+ - `TimestampNTZNanosType(precision)` / `TimestampLTZNanosType(precision)`: Preview nanosecond-capable variants of `TIMESTAMP_NTZ` and `TIMESTAMP_LTZ` with fractional seconds precision `precision` in `[7, 9]`. Unparameterized `TIMESTAMP`, `TIMESTAMP_NTZ`, and `TIMESTAMP_LTZ` remain microsecond types. Enable the preview feature with `SET spark.sql.timestampNanosTypes.enabled=true;` before using these types in schemas or SQL.
* Interval types
- `YearMonthIntervalType(startField, endField)`: Represents a year-month
interval which is made up of a contiguous subset of the following fields:
diff --git a/sql/api/src/main/scala/org/apache/spark/sql/errors/DataTypeErrors.scala b/sql/api/src/main/scala/org/apache/spark/sql/errors/DataTypeErrors.scala
index b89da2c246a7..955e242d4ab5 100644
--- a/sql/api/src/main/scala/org/apache/spark/sql/errors/DataTypeErrors.scala
+++ b/sql/api/src/main/scala/org/apache/spark/sql/errors/DataTypeErrors.scala
@@ -286,13 +286,17 @@ private[sql] object DataTypeErrors extends DataTypeErrorsBase {
def checkTimestampNanosTypesEnabled(): Unit = {
if (!SqlApiConf.get.timestampNanosTypesEnabled) {
- throw new SparkException(
- errorClass = "FEATURE_NOT_ENABLED",
- messageParameters = Map(
- "featureName" -> "Nanosecond-precision timestamp types",
- "configKey" -> "spark.sql.timestampNanosTypes.enabled",
- "configValue" -> "true"),
- cause = null)
+ throw timestampNanosTypesNotEnabledError()
}
}
+
+ def timestampNanosTypesNotEnabledError(): Throwable = {
+ new SparkException(
+ errorClass = "FEATURE_NOT_ENABLED",
+ messageParameters = Map(
+ "featureName" -> "Nanosecond-precision timestamp types",
+ "configKey" -> "spark.sql.timestampNanosTypes.enabled",
+ "configValue" -> "true"),
+ cause = null)
+ }
}
diff --git a/sql/api/src/main/scala/org/apache/spark/sql/internal/SqlApiConf.scala b/sql/api/src/main/scala/org/apache/spark/sql/internal/SqlApiConf.scala
index 6bd747c74399..99f851829565 100644
--- a/sql/api/src/main/scala/org/apache/spark/sql/internal/SqlApiConf.scala
+++ b/sql/api/src/main/scala/org/apache/spark/sql/internal/SqlApiConf.scala
@@ -21,7 +21,7 @@ import java.util.TimeZone
import scala.util.Try
import org.apache.spark.sql.types.{AtomicType, TimestampType}
-import org.apache.spark.util.SparkClassUtils
+import org.apache.spark.util.{SparkClassUtils, SparkEnvUtils}
/**
* Configuration for all objects that are placed in the `sql/api` project. The normal way of
@@ -113,5 +113,5 @@ private[sql] object DefaultSqlApiConf extends SqlApiConf {
override def legacyParameterSubstitutionConstantsOnly: Boolean = false
override def legacyIdentifierClauseOnly: Boolean = false
override def typesFrameworkEnabled: Boolean = false
- override def timestampNanosTypesEnabled: Boolean = false
+ override def timestampNanosTypesEnabled: Boolean = SparkEnvUtils.isTesting
}
diff --git a/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/util/TypeUtils.scala b/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/util/TypeUtils.scala
index 9c5df04f9569..f82b6c58ddf4 100644
--- a/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/util/TypeUtils.scala
+++ b/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/util/TypeUtils.scala
@@ -22,7 +22,7 @@ import org.apache.spark.sql.catalyst.analysis.TypeCheckResult.DataTypeMismatch
import org.apache.spark.sql.catalyst.expressions.{Expression, RowOrdering}
import org.apache.spark.sql.catalyst.expressions.st.STExpressionUtils.isGeoSpatialType
import org.apache.spark.sql.catalyst.types.{PhysicalDataType, PhysicalNumericType}
-import org.apache.spark.sql.errors.{QueryCompilationErrors, QueryErrorsBase}
+import org.apache.spark.sql.errors.{DataTypeErrors, QueryCompilationErrors, QueryErrorsBase}
import org.apache.spark.sql.internal.SQLConf
import org.apache.spark.sql.types._
@@ -139,10 +139,20 @@ object TypeUtils extends QueryErrorsBase {
if (dataType.existsRecursively(isInterval)) f
}
+ private def containsTimestampNanosType(dataType: DataType): Boolean = {
+ dataType.existsRecursively {
+ case _: TimestampNTZNanosType | _: TimestampLTZNanosType => true
+ case _ => false
+ }
+ }
+
def failUnsupportedDataType(dataType: DataType, conf: SQLConf): Unit = {
if (!conf.isTimeTypeEnabled && dataType.existsRecursively(_.isInstanceOf[TimeType])) {
throw QueryCompilationErrors.unsupportedTimeTypeError()
}
+ if (!conf.timestampNanosTypesEnabled && containsTimestampNanosType(dataType)) {
+ throw DataTypeErrors.timestampNanosTypesNotEnabledError()
+ }
if (!conf.geospatialEnabled && dataType.existsRecursively(isGeoSpatialType)) {
throw new org.apache.spark.sql.AnalysisException(
errorClass = "UNSUPPORTED_FEATURE.GEOSPATIAL_DISABLED",
diff --git a/sql/catalyst/src/main/scala/org/apache/spark/sql/internal/SQLConf.scala b/sql/catalyst/src/main/scala/org/apache/spark/sql/internal/SQLConf.scala
index 270b8aa31a56..328f434195f4 100644
--- a/sql/catalyst/src/main/scala/org/apache/spark/sql/internal/SQLConf.scala
+++ b/sql/catalyst/src/main/scala/org/apache/spark/sql/internal/SQLConf.scala
@@ -647,15 +647,19 @@ object SQLConf {
val TIMESTAMP_NANOS_TYPES_ENABLED =
buildConf("spark.sql.timestampNanosTypes.enabled")
.internal()
- .doc("When true, the parameterized nanosecond-precision timestamp types " +
- "TIMESTAMP_NTZ(p) / TIMESTAMP_LTZ(p) for p in [7, 9] are recognized as " +
- "Spark SQL data types at user-facing entry points. Default is false because " +
- "downstream execution paths (Cast, PhysicalDataType, AnyTimestampType, encoders, " +
- "Connect proto) are not yet wired for these types. See SPARK-56822.")
- .version("4.2.0")
+ .doc("When true, allows nanosecond-capable timestamp types TIMESTAMP_NTZ(p) and " +
+ "TIMESTAMP_LTZ(p) with fractional seconds precision p in [7, 9] at user-facing " +
+ "entry points, including the SQL parser, schemas, and analyzed plans. This is a " +
+ "preview feature under SPARK-56822 and may change in future releases. The default is " +
+ "false in production; tests enable it by default via Utils.isTesting. " +
+ "Unparameterized TIMESTAMP, TIMESTAMP_NTZ, and TIMESTAMP_LTZ remain microsecond " +
+ "types. Enabling this flag does not guarantee full SQL support: casts, Parquet read, " +
+ "typed literals, and other operations may still fail until their respective features " +
+ "are implemented.")
+ .version("4.3.0")
.withBindingPolicy(ConfigBindingPolicy.SESSION)
.booleanConf
- .createWithDefault(false)
+ .createWithDefault(Utils.isTesting)
val EXTENDED_EXPLAIN_PROVIDERS =
buildConf("spark.sql.extendedExplainProviders")
.doc("A comma-separated list of classes that implement the" +
diff --git a/sql/catalyst/src/test/scala/org/apache/spark/sql/catalyst/parser/DataTypeParserSuite.scala b/sql/catalyst/src/test/scala/org/apache/spark/sql/catalyst/parser/DataTypeParserSuite.scala
index b55ed2b9c18a..6543b209ccd8 100644
--- a/sql/catalyst/src/test/scala/org/apache/spark/sql/catalyst/parser/DataTypeParserSuite.scala
+++ b/sql/catalyst/src/test/scala/org/apache/spark/sql/catalyst/parser/DataTypeParserSuite.scala
@@ -211,29 +211,31 @@ class DataTypeParserSuite extends SparkFunSuite with SQLHelper {
}
}
- test("nanos timestamp parser surface is gated by SQL conf, disabled by default") {
+ test("nanos timestamp parser surface is gated by SQL conf when disabled") {
val gatedSpellings = Seq(
"TIMESTAMP_NTZ(7)",
"TIMESTAMP_LTZ(9)",
"TIMESTAMP(9) WITHOUT TIME ZONE",
"TIMESTAMP(9) WITH LOCAL TIME ZONE",
"TIMESTAMP(9)")
- gatedSpellings.foreach { spelling =>
- checkError(
- exception = intercept[SparkException] {
- CatalystSqlParser.parseDataType(spelling)
- },
- condition = "FEATURE_NOT_ENABLED",
- parameters = Map(
- "featureName" -> "Nanosecond-precision timestamp types",
- "configKey" -> "spark.sql.timestampNanosTypes.enabled",
- "configValue" -> "true"))
+ withSQLConf(SQLConf.TIMESTAMP_NANOS_TYPES_ENABLED.key -> "false") {
+ gatedSpellings.foreach { spelling =>
+ checkError(
+ exception = intercept[SparkException] {
+ CatalystSqlParser.parseDataType(spelling)
+ },
+ condition = "FEATURE_NOT_ENABLED",
+ parameters = Map(
+ "featureName" -> "Nanosecond-precision timestamp types",
+ "configKey" -> "spark.sql.timestampNanosTypes.enabled",
+ "configValue" -> "true"))
+ }
+ // Bare unparameterized forms remain accepted even with the gate off.
+ assert(parse("TIMESTAMP_NTZ") === TimestampNTZType)
+ assert(parse("TIMESTAMP_LTZ") === TimestampType)
+ assert(parse("TIMESTAMP WITHOUT TIME ZONE") === TimestampNTZType)
+ assert(parse("TIMESTAMP WITH LOCAL TIME ZONE") === TimestampType)
}
- // Bare unparameterized forms remain accepted even with the gate off.
- assert(parse("TIMESTAMP_NTZ") === TimestampNTZType)
- assert(parse("TIMESTAMP_LTZ") === TimestampType)
- assert(parse("TIMESTAMP WITHOUT TIME ZONE") === TimestampNTZType)
- assert(parse("TIMESTAMP WITH LOCAL TIME ZONE") === TimestampType)
}
// DataType parser accepts certain reserved keywords.
diff --git a/sql/catalyst/src/test/scala/org/apache/spark/sql/catalyst/util/TypeUtilsSuite.scala b/sql/catalyst/src/test/scala/org/apache/spark/sql/catalyst/util/TypeUtilsSuite.scala
index b209b93ce4d1..6c5afd30d8e1 100644
--- a/sql/catalyst/src/test/scala/org/apache/spark/sql/catalyst/util/TypeUtilsSuite.scala
+++ b/sql/catalyst/src/test/scala/org/apache/spark/sql/catalyst/util/TypeUtilsSuite.scala
@@ -17,11 +17,14 @@
package org.apache.spark.sql.catalyst.util
-import org.apache.spark.SparkFunSuite
+import org.apache.spark.{SparkException, SparkFunSuite}
import org.apache.spark.sql.catalyst.analysis.TypeCheckResult.{DataTypeMismatch, TypeCheckSuccess}
+import org.apache.spark.sql.catalyst.plans.SQLHelper
+import org.apache.spark.sql.internal.SQLConf
import org.apache.spark.sql.types._
+import org.apache.spark.util.Utils
-class TypeUtilsSuite extends SparkFunSuite {
+class TypeUtilsSuite extends SparkFunSuite with SQLHelper {
private def typeCheckPass(types: Seq[DataType]): Unit = {
assert(TypeUtils.checkForSameTypeInputExpr(types, "a") == TypeCheckSuccess)
@@ -44,4 +47,61 @@ class TypeUtilsSuite extends SparkFunSuite {
typeCheckPass(ArrayType(StringType, containsNull = true) ::
ArrayType(StringType, containsNull = false) :: Nil)
}
+
+ test("TIMESTAMP_NANOS_TYPES_ENABLED defaults to Utils.isTesting") {
+ assert(SQLConf.get.timestampNanosTypesEnabled === Utils.isTesting)
+ }
+
+ test("failUnsupportedDataType rejects timestamp nanos types when preview is disabled") {
+ val ntzNanos = TimestampNTZNanosType(9)
+ val ltzNanos = TimestampLTZNanosType(9)
+ val nestedNtzNanos = StructType(StructField("ts", ntzNanos) :: Nil)
+
+ withSQLConf(SQLConf.TIMESTAMP_NANOS_TYPES_ENABLED.key -> "false") {
+ val conf = SQLConf.get
+ val expectedParams = Map(
+ "featureName" -> "Nanosecond-precision timestamp types",
+ "configKey" -> "spark.sql.timestampNanosTypes.enabled",
+ "configValue" -> "true")
+ checkError(
+ intercept[SparkException] {
+ TypeUtils.failUnsupportedDataType(ntzNanos, conf)
+ },
+ condition = "FEATURE_NOT_ENABLED",
+ parameters = expectedParams)
+
+ checkError(
+ intercept[SparkException] {
+ TypeUtils.failUnsupportedDataType(ltzNanos, conf)
+ },
+ condition = "FEATURE_NOT_ENABLED",
+ parameters = expectedParams)
+
+ checkError(
+ intercept[SparkException] {
+ TypeUtils.failUnsupportedDataType(nestedNtzNanos, conf)
+ },
+ condition = "FEATURE_NOT_ENABLED",
+ parameters = expectedParams)
+ }
+ }
+
+ test("failUnsupportedDataType allows timestamp nanos types when preview is enabled") {
+ val ntzNanos = TimestampNTZNanosType(9)
+ val ltzNanos = TimestampLTZNanosType(9)
+ val nestedLtzNanos = ArrayType(ltzNanos)
+
+ withSQLConf(SQLConf.TIMESTAMP_NANOS_TYPES_ENABLED.key -> "true") {
+ TypeUtils.failUnsupportedDataType(ntzNanos, SQLConf.get)
+ TypeUtils.failUnsupportedDataType(ltzNanos, SQLConf.get)
+ TypeUtils.failUnsupportedDataType(nestedLtzNanos, SQLConf.get)
+ }
+ }
+
+ test("failUnsupportedDataType does not reject microsecond timestamp types") {
+ withSQLConf(SQLConf.TIMESTAMP_NANOS_TYPES_ENABLED.key -> "false") {
+ TypeUtils.failUnsupportedDataType(TimestampType, SQLConf.get)
+ TypeUtils.failUnsupportedDataType(TimestampNTZType, SQLConf.get)
+ }
+ }
}
diff --git a/sql/catalyst/src/test/scala/org/apache/spark/sql/types/DataTypeSuite.scala b/sql/catalyst/src/test/scala/org/apache/spark/sql/types/DataTypeSuite.scala
index afa657c95ede..ad09b7411e65 100644
--- a/sql/catalyst/src/test/scala/org/apache/spark/sql/types/DataTypeSuite.scala
+++ b/sql/catalyst/src/test/scala/org/apache/spark/sql/types/DataTypeSuite.scala
@@ -1555,19 +1555,21 @@ class DataTypeSuite extends SparkFunSuite with SQLHelper {
}
test("SPARK-56965: JSON parser rejects nanos timestamp types when preview flag is off") {
- Seq(
- "\"timestamp_ltz(7)\"" -> "Nanosecond-precision timestamp types",
- "\"timestamp_ntz(9)\"" -> "Nanosecond-precision timestamp types").foreach {
- case (json, featureName) =>
- checkError(
- exception = intercept[SparkException] {
- DataType.fromJson(json)
- },
- condition = "FEATURE_NOT_ENABLED",
- parameters = Map(
- "featureName" -> featureName,
- "configKey" -> "spark.sql.timestampNanosTypes.enabled",
- "configValue" -> "true"))
+ withSQLConf(SQLConf.TIMESTAMP_NANOS_TYPES_ENABLED.key -> "false") {
+ Seq(
+ "\"timestamp_ltz(7)\"" -> "Nanosecond-precision timestamp types",
+ "\"timestamp_ntz(9)\"" -> "Nanosecond-precision timestamp types").foreach {
+ case (json, featureName) =>
+ checkError(
+ exception = intercept[SparkException] {
+ DataType.fromJson(json)
+ },
+ condition = "FEATURE_NOT_ENABLED",
+ parameters = Map(
+ "featureName" -> featureName,
+ "configKey" -> "spark.sql.timestampNanosTypes.enabled",
+ "configValue" -> "true"))
+ }
}
}