(spark) branch branch-4.x updated: [SPARK-57211][SQL] Cast strings to TIMESTAMP_NTZ(p)/TIMESTAMP_LTZ(p)

uros Wed, 03 Jun 2026 06:08:05 -0700

This is an automated email from the ASF dual-hosted git repository.

uros-b pushed a commit to branch branch-4.x
in repository https://gitbox.apache.org/repos/asf/spark.git



The following commit(s) were added to refs/heads/branch-4.x by this push:
     new a660811c7d82 [SPARK-57211][SQL] Cast strings to TIMESTAMP_NTZ(p)/TIMESTAMP_LTZ(p)
a660811c7d82 is described below

commit a660811c7d822dd1a06b4b9cb1dd1f2b32d86eb6
Author: Maxim Gekk <[email protected]>
AuthorDate: Wed Jun 3 15:07:02 2026 +0200

    [SPARK-57211][SQL] Cast strings to TIMESTAMP_NTZ(p)/TIMESTAMP_LTZ(p)
    
    ### What changes were proposed in this pull request?
    
    This PR wires `Cast` to support casting `StringType` to the nanosecond-capable timestamp types `TimestampNTZNanosType(p)` and `TimestampLTZNanosType(p)` with fractional-seconds precision `p` in `[7, 9]`, on both the interpreted and codegen paths and across all eval modes (`LEGACY`, `ANSI`, `TRY`):
    
    - `CAST(<string> AS TIMESTAMP_NTZ(p))`
    - `CAST(<string> AS TIMESTAMP_LTZ(p))`
    
    Concretely, in `Cast.scala`:
    - Add `StringType -> TimestampNTZNanosType(p)` / `TimestampLTZNanosType(p)` arms to `canCast` and `canAnsiCast`. Try-cast is covered automatically (`canTryCast` delegates to `canAnsiCast`, and `canUseLegacyCastForTryCast` already matches `(StringType, DatetimeType)`, which the nanos types extend).
    - Add `(StringType, TimestampLTZNanosType)` to `Cast.needsTimeZone`. The NTZ string parse does not depend on the session time zone, so it is not listed, mirroring the micro `TIMESTAMP_NTZ` cast.
    - Add interpreted `castToTimestampLTZNanos` / `castToTimestampNTZNanos` and matching codegen, dispatched from `castInternal` / `nullSafeCastFunction` with the precision taken from the target type. The result is a `TimestampNanosVal` (or `null` in legacy/try mode on malformed input).
    - The NTZ cast adopts `allowTimeZone = true` to match the existing micro `TIMESTAMP_NTZ` string cast (a zone suffix in the input is discarded rather than rejected), and resolves the `TODO(SPARK-57032)` left on `stringToTimestampNTZNanosAnsi`.
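
The difference between the eval modes described in the list above can be sketched without Spark. This is a minimal illustration, not the code in `Cast.scala`: `CastModeSketch`, `parse`, `castAnsi`, and `castLegacy` are hypothetical names, and `java.time.LocalDateTime.parse` is an assumed stand-in for the real parse helper, which accepts more input shapes.

```scala
import java.time.{DateTimeException, LocalDateTime}
import scala.util.Try

object CastModeSketch {
  // Hypothetical stand-in for the parse helper: None on malformed input.
  // Only handles "yyyy-MM-dd HH:mm:ss[.fraction]"; the real parser is more lenient.
  def parse(s: String): Option[LocalDateTime] =
    Try(LocalDateTime.parse(s.trim.replace(' ', 'T'))).toOption

  // ANSI mode: malformed input raises an error.
  def castAnsi(s: String): LocalDateTime =
    parse(s).getOrElse(throw new DateTimeException(s"Cannot cast '$s' to a timestamp"))

  // LEGACY and TRY modes: malformed input becomes null instead of an error.
  def castLegacy(s: String): LocalDateTime = parse(s).orNull
}
```

Both paths share one `Option`-returning parser, so the modes differ only in how they treat `None`.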
    
    This reuses the parse entry points added in SPARK-57032 on `SparkDateTimeUtils` (inherited by `DateTimeUtils`), which already return a normalized `TimestampNanosVal` and apply per-precision truncation, so no separate normalization module is required for the string path.
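
As a rough illustration of per-precision truncation, the sketch below keeps only the first `p` fractional digits of a nanosecond-of-second value. `NanosTruncationSketch` and `truncateNanoOfSecond` are hypothetical names used for illustration only; the actual helpers in `SparkDateTimeUtils` may be structured differently.

```scala
object NanosTruncationSketch {
  // Hypothetical helper: drop fractional digits beyond `precision` (truncate, never round).
  def truncateNanoOfSecond(nanoOfSecond: Int, precision: Int): Int = {
    require(precision >= 7 && precision <= 9, s"precision must be in [7, 9], got $precision")
    // 10^(9 - precision): 100 for p = 7, 10 for p = 8, 1 for p = 9.
    val unit = (1 to (9 - precision)).foldLeft(1)((acc, _) => acc * 10)
    nanoOfSecond - nanoOfSecond % unit
  }
}
```

Under this sketch, a fraction of `.123456789` becomes `.1234567` at `p = 7`, `.12345678` at `p = 8`, and is unchanged at `p = 9`.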
    
    Existing preview gating is unchanged: `Cast.checkInputDataTypes` calls `TypeUtils.failUnsupportedDataType`, which throws `FEATURE_NOT_ENABLED` when `spark.sql.timestampNanosTypes.enabled` is off.
    
    ### Why are the changes needed?
    
    This is a sub-task of [SPARK-56822](https://issues.apache.org/jira/browse/SPARK-56822) (SPIP: Timestamps with nanosecond precision).
    
    The logical types, the `TIMESTAMP_NTZ(p)` / `TIMESTAMP_LTZ(p)` SQL syntax, the physical row value `TimestampNanosVal`, and the string-to-nanos parse helpers all exist, but `Cast` had no arms for the nanos types. As a result, `CAST(s AS TIMESTAMP_NTZ(9))` failed type-check with `CAST_WITHOUT_SUGGESTION` even when the preview flag `spark.sql.timestampNanosTypes.enabled` was on. String ingestion is the most common entry point for these types and unblocks typed literals, filters, and CTA [...]
    
    ### Does this PR introduce _any_ user-facing change?
    
    Yes, but only when the preview flag `spark.sql.timestampNanosTypes.enabled` is enabled (it defaults to off in production). With the flag on, `CAST(<string> AS TIMESTAMP_NTZ(p))` and `CAST(<string> AS TIMESTAMP_LTZ(p))` for `p` in `[7, 9]` now produce correct nanosecond values in `LEGACY`, `ANSI`, and `TRY` modes; previously they failed type-checking. With the flag off, the behavior is unchanged (`FEATURE_NOT_ENABLED`). Existing microsecond timestamp string casts are unchanged.
    
    ### How was this patch tested?
    
    - `CastSuiteBase`: success cases for both types over `p` in `[7, 9]` and a corpus of inputs with 7-9 fractional digits; the LTZ test is parameterized over time zones, while the NTZ test checks zone independence (including a discarded zone suffix). A flag-off guard asserts `FEATURE_NOT_ENABLED`.
    - `CastWithAnsiOnSuite`: malformed-input parse errors (`DateTimeException` / `CAST_INVALID_INPUT`).
    - `CastWithAnsiOffSuite` / `TryCastSuite`: malformed input returns `NULL`.
    - Golden-file checks added to `cast.sql` (regenerated with `SPARK_GENERATE_GOLDEN_FILES=1`): positive cases assert the result type via `typeof` (the reverse direction, nanos -> string rendering, is not wired yet and is tracked under SPARK-57162); negative cases exercise the ANSI parse-error path (and `NULL` in non-ANSI mode).
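
The expected values in the LTZ and NTZ success tests differ only in whether a zone is applied to the parsed wall-clock value. The sketch below shows that distinction with plain `java.time`; `ZoneDependenceSketch`, `ntz`, and `ltz` are hypothetical names, and the input is assumed to be in ISO form so that `LocalDateTime.parse` accepts it.

```scala
import java.time.{Instant, LocalDateTime, ZoneId}

object ZoneDependenceSketch {
  // NTZ: the parsed wall-clock value is the result; no zone is consulted.
  def ntz(s: String): LocalDateTime = LocalDateTime.parse(s)

  // LTZ: the same wall clock is interpreted in the given (session) zone.
  def ltz(s: String, zone: ZoneId): Instant = ntz(s).atZone(zone).toInstant
}
```

With this sketch, the same string maps to different instants under `UTC` and `America/Los_Angeles`, while the NTZ value is identical in both cases and the nanosecond fraction is preserved either way.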
    
    Verified locally:
    ```
    $ build/sbt 'catalyst/testOnly *CastSuite *CastWithAnsiOnSuite *CastWithAnsiOffSuite *TryCastSuite'
    $ build/sbt 'sql/testOnly org.apache.spark.sql.SQLQueryTestSuite -- -z cast.sql'
    $ ./dev/scalastyle
    ```
    
    ### Was this patch authored or co-authored using generative AI tooling?
    
    Generated-by: Cursor (Claude Opus 4.8)
    
    Closes #56288 from MaxGekk/nanos-cast-string.
    
    Authored-by: Maxim Gekk <[email protected]>
    Signed-off-by: Uros Bojanic <[email protected]>
    (cherry picked from commit 7d0a8cd314ac827b7298b1ceed89619892c6a4b2)
    Signed-off-by: Uros Bojanic <[email protected]>
---
 .../sql/catalyst/util/SparkDateTimeUtils.scala     |  7 +-
 .../spark/sql/catalyst/expressions/Cast.scala      | 95 +++++++++++++++++++++-
 .../sql/catalyst/expressions/CastSuiteBase.scala   | 49 ++++++++++-
 .../expressions/CastWithAnsiOffSuite.scala         | 12 +++
 .../catalyst/expressions/CastWithAnsiOnSuite.scala | 18 ++++
 .../sql-tests/analyzer-results/cast.sql.out        | 28 +++++++
 .../analyzer-results/nonansi/cast.sql.out          | 28 +++++++
 .../src/test/resources/sql-tests/inputs/cast.sql   |  9 ++
 .../test/resources/sql-tests/results/cast.sql.out  | 66 +++++++++++++++
 .../sql-tests/results/nonansi/cast.sql.out         | 32 ++++++++
 10 files changed, 339 insertions(+), 5 deletions(-)

diff --git a/sql/api/src/main/scala/org/apache/spark/sql/catalyst/util/SparkDateTimeUtils.scala b/sql/api/src/main/scala/org/apache/spark/sql/catalyst/util/SparkDateTimeUtils.scala
index d7200715f937..29f280fdd09c 100644
--- a/sql/api/src/main/scala/org/apache/spark/sql/catalyst/util/SparkDateTimeUtils.scala
+++ b/sql/api/src/main/scala/org/apache/spark/sql/catalyst/util/SparkDateTimeUtils.scala
@@ -946,9 +946,10 @@ trait SparkDateTimeUtils {
       s: UTF8String,
       precision: Int,
       context: QueryContext = null): TimestampNanosVal = {
-    // TODO(SPARK-57032): when this is wired to a user-facing CAST(... AS TIMESTAMP_NTZ(p)), the
-    // cast must decide `allowTimeZone` explicitly (per ANSI/legacy mode) instead of relying on
-    // the `true` default used here, which silently discards a zone suffix.
+    // CAST(... AS TIMESTAMP_NTZ(p)) intentionally uses `allowTimeZone = true` here, mirroring the
+    // micro `TIMESTAMP_NTZ` string cast (`stringToTimestampWithoutTimeZoneAnsi`): a zone suffix in
+    // the input is silently discarded rather than rejected. Callers that need strict NTZ rejection
+    // should call `stringToTimestampNTZNanos` directly with `allowTimeZone = false`.
     stringToTimestampNTZNanos(s, precision).getOrElse {
       throw ExecutionErrors.invalidInputInCastToDatetimeError(
         s,
diff --git a/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/Cast.scala b/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/Cast.scala
index ad3e22dc2257..a1935c739643 100644
--- a/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/Cast.scala
+++ b/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/Cast.scala
@@ -38,7 +38,7 @@ import org.apache.spark.sql.catalyst.util.IntervalUtils.{dayTimeIntervalToByte,
 import org.apache.spark.sql.errors.{QueryErrorsBase, QueryExecutionErrors}
 import org.apache.spark.sql.internal.SQLConf
 import org.apache.spark.sql.types._
-import org.apache.spark.unsafe.types.{BinaryView, UTF8String, VariantVal}
+import org.apache.spark.unsafe.types.{BinaryView, TimestampNanosVal, UTF8String, VariantVal}
 import org.apache.spark.unsafe.types.UTF8String.{IntWrapper, LongWrapper}
 import org.apache.spark.util.ArrayImplicits._
 
@@ -113,6 +113,9 @@ object Cast extends QueryErrorsBase {
     case (DateType, TimestampNTZType) => true
     case (TimestampType, TimestampNTZType) => true
 
+    case (_: StringType, _: TimestampNTZNanosType) => true
+    case (_: StringType, _: TimestampLTZNanosType) => true
+
     case (_: StringType, _: CalendarIntervalType) => true
     case (_: StringType, _: AnsiIntervalType) => true
 
@@ -248,6 +251,9 @@ object Cast extends QueryErrorsBase {
     case (DateType, TimestampNTZType) => true
     case (TimestampType, TimestampNTZType) => true
 
+    case (_: StringType, _: TimestampNTZNanosType) => true
+    case (_: StringType, _: TimestampLTZNanosType) => true
+
     case (_: StringType, DateType) => true
     case (_: StringType, _: TimeType) => true
     case (TimestampType, DateType) => true
@@ -335,6 +341,9 @@ object Cast extends QueryErrorsBase {
     case (TimestampType, DateType) => true
     case (TimestampType, TimestampNTZType) => true
     case (TimestampNTZType, TimestampType) => true
+    // NTZ string is zone-independent (mirroring micro TIMESTAMP_NTZ, which is not listed); only
+    // the LTZ string parse depends on the session time zone.
+    case (_: StringType, _: TimestampLTZNanosType) => true
     case (ArrayType(fromType, _), ArrayType(toType, _)) => needsTimeZone(fromType, toType)
     case (MapType(fromKey, fromValue, _), MapType(toKey, toValue, _)) =>
       needsTimeZone(fromKey, toKey) || needsTimeZone(fromValue, toValue)
@@ -786,6 +795,30 @@ case class Cast(
       buildCast[Long](_, ts => convertTz(ts, ZoneOffset.UTC, zoneId))
   }
 
+  private[this] def castToTimestampLTZNanos(
+      from: DataType,
+      precision: Int): Any => Any = from match {
+    case _: StringType =>
+      buildCast[UTF8String](_, utfs =>
+        if (ansiEnabled) {
+          DateTimeUtils.stringToTimestampLTZNanosAnsi(utfs, precision, zoneId, getContextOrNull())
+        } else {
+          DateTimeUtils.stringToTimestampLTZNanos(utfs, precision, zoneId).orNull
+        })
+  }
+
+  private[this] def castToTimestampNTZNanos(
+      from: DataType,
+      precision: Int): Any => Any = from match {
+    case _: StringType =>
+      buildCast[UTF8String](_, utfs =>
+        if (ansiEnabled) {
+          DateTimeUtils.stringToTimestampNTZNanosAnsi(utfs, precision, getContextOrNull())
+        } else {
+          DateTimeUtils.stringToTimestampNTZNanos(utfs, precision, allowTimeZone = true).orNull
+        })
+  }
+
   private[this] def decimalToTimestamp(d: Decimal): Long = {
     (d.toBigDecimal * MICROS_PER_SECOND).longValue
   }
@@ -1299,6 +1332,8 @@ case class Cast(
         case decimal: DecimalType => castToDecimal(from, decimal)
         case TimestampType => castToTimestamp(from)
         case TimestampNTZType => castToTimestampNTZ(from)
+        case t: TimestampNTZNanosType => castToTimestampNTZNanos(from, t.precision)
+        case t: TimestampLTZNanosType => castToTimestampLTZNanos(from, t.precision)
         case CalendarIntervalType => castToInterval(from)
         case it: DayTimeIntervalType => castToDayTimeInterval(from, it)
         case it: YearMonthIntervalType => castToYearMonthInterval(from, it)
@@ -1409,6 +1444,8 @@ case class Cast(
     case decimal: DecimalType => castToDecimalCode(from, decimal, ctx)
     case TimestampType => castToTimestampCode(from, ctx)
     case TimestampNTZType => castToTimestampNTZCode(from, ctx)
+    case t: TimestampNTZNanosType => castToTimestampNTZNanosCode(from, t.precision, ctx)
+    case t: TimestampLTZNanosType => castToTimestampLTZNanosCode(from, t.precision, ctx)
     case CalendarIntervalType => castToIntervalCode(from)
     case it: DayTimeIntervalType => castToDayTimeIntervalCode(from, it)
     case it: YearMonthIntervalType => castToYearMonthIntervalCode(from, it)
@@ -1772,6 +1809,62 @@ case class Cast(
         code"$evPrim = $dateTimeUtilsCls.convertTz($c, java.time.ZoneOffset.UTC, $zid);"
   }
 
+  private[this] def castToTimestampLTZNanosCode(
+      from: DataType,
+      precision: Int,
+      ctx: CodegenContext): CastFunction = from match {
+    case _: StringType =>
+      val zoneIdClass = classOf[ZoneId]
+      val zid = JavaCode.global(
+        ctx.addReferenceObj("zoneId", zoneId, zoneIdClass.getName),
+        zoneIdClass)
+      val tsOpt = ctx.freshVariable("tsOpt", classOf[Option[TimestampNanosVal]])
+      (c, evPrim, evNull) =>
+        if (ansiEnabled) {
+          val errorContext = getContextOrNullCode(ctx)
+          code"""
+            $evPrim = $dateTimeUtilsCls.stringToTimestampLTZNanosAnsi(
+              $c, $precision, $zid, $errorContext);
+           """
+        } else {
+          code"""
+            scala.Option<TimestampNanosVal> $tsOpt =
+              $dateTimeUtilsCls.stringToTimestampLTZNanos($c, $precision, $zid);
+            if ($tsOpt.isDefined()) {
+              $evPrim = (TimestampNanosVal) $tsOpt.get();
+            } else {
+              $evNull = true;
+            }
+           """
+        }
+  }
+
+  private[this] def castToTimestampNTZNanosCode(
+      from: DataType,
+      precision: Int,
+      ctx: CodegenContext): CastFunction = from match {
+    case _: StringType =>
+      val tsOpt = ctx.freshVariable("tsOpt", classOf[Option[TimestampNanosVal]])
+      (c, evPrim, evNull) =>
+        if (ansiEnabled) {
+          val errorContext = getContextOrNullCode(ctx)
+          code"""
+            $evPrim = $dateTimeUtilsCls.stringToTimestampNTZNanosAnsi(
+              $c, $precision, $errorContext);
+           """
+        } else {
+          code"""
+            scala.Option<TimestampNanosVal> $tsOpt =
+              $dateTimeUtilsCls.stringToTimestampNTZNanos($c, $precision, true);
+            if ($tsOpt.isDefined()) {
+              $evPrim = (TimestampNanosVal) $tsOpt.get();
+            } else {
+              $evNull = true;
+            }
+           """
+        }
+  }
+
   private[this] def castToIntervalCode(from: DataType): CastFunction = from match {
     case _: StringType =>
       val util = IntervalUtils.getClass.getCanonicalName.stripSuffix("$")
diff --git a/sql/catalyst/src/test/scala/org/apache/spark/sql/catalyst/expressions/CastSuiteBase.scala b/sql/catalyst/src/test/scala/org/apache/spark/sql/catalyst/expressions/CastSuiteBase.scala
index e888432ef91e..b33045ad90a8 100644
--- a/sql/catalyst/src/test/scala/org/apache/spark/sql/catalyst/expressions/CastSuiteBase.scala
+++ b/sql/catalyst/src/test/scala/org/apache/spark/sql/catalyst/expressions/CastSuiteBase.scala
@@ -22,7 +22,7 @@ import java.time.{Duration, LocalDate, LocalDateTime, LocalTime, Period}
 import java.time.temporal.ChronoUnit
 import java.util.{Calendar, Locale, TimeZone}
 
-import org.apache.spark.{SparkFunSuite, SparkIllegalArgumentException}
+import org.apache.spark.{SparkException, SparkFunSuite, SparkIllegalArgumentException}
 import org.apache.spark.sql.Row
 import org.apache.spark.sql.catalyst.InternalRow
 import org.apache.spark.sql.catalyst.analysis.TypeCheckResult.DataTypeMismatch
@@ -33,6 +33,7 @@ import org.apache.spark.sql.catalyst.util.DateTimeTestUtils._
 import org.apache.spark.sql.catalyst.util.DateTimeUtils._
 import org.apache.spark.sql.catalyst.util.IntervalUtils
 import org.apache.spark.sql.catalyst.util.IntervalUtils.microsToDuration
+import org.apache.spark.sql.catalyst.util.TimestampNanosTestUtils._
 import org.apache.spark.sql.internal.SQLConf
 import org.apache.spark.sql.types._
 import org.apache.spark.sql.types.DataTypeTestUtils.{dayTimeIntervalTypes, yearMonthIntervalTypes}
@@ -1023,6 +1024,52 @@ abstract class CastSuiteBase extends SparkFunSuite with ExpressionEvalHelper {
       LocalDateTime.of(2021, 6, 17, 0, 0))
   }
 
+  test("SPARK-57211: cast string to timestamp_ltz with nanosecond precision") {
+    foreachNanosPrecision { precision =>
+      val truncate = nanoOfSecTruncator(precision)
+      outstandingZoneIds.foreach { zid =>
+        specialNanosTs.foreach { s =>
+          val ldt = parseSpecialNanosNTZ(s).withNano(truncate(parseSpecialNanosNTZ(s).getNano))
+          val expected = instantToNanosVal(ldt.atZone(zid).toInstant)
+          checkEvaluation(
+            cast(Literal(s), TimestampLTZNanosType(precision), Option(zid.getId)),
+            expected)
+        }
+      }
+    }
+  }
+
+  test("SPARK-57211: cast string to timestamp_ntz with nanosecond precision") {
+    foreachNanosPrecision { precision =>
+      val truncate = nanoOfSecTruncator(precision)
+      specialNanosTs.foreach { s =>
+        val ldt = parseSpecialNanosNTZ(s).withNano(truncate(parseSpecialNanosNTZ(s).getNano))
+        val expected = localDateTimeToNanosVal(ldt)
+        // NTZ result is independent of the session time zone.
+        checkEvaluation(cast(Literal(s), TimestampNTZNanosType(precision)), expected)
+        // A zone suffix is discarded (allowTimeZone = true), mirroring micro TIMESTAMP_NTZ.
+        checkEvaluation(cast(Literal(s + "Z"), TimestampNTZNanosType(precision)), expected)
+      }
+    }
+  }
+
+  test("SPARK-57211: nanosecond timestamp cast requires the preview flag") {
+    withSQLConf(SQLConf.TIMESTAMP_NANOS_TYPES_ENABLED.key -> "false") {
+      val expectedParams = Map(
+        "featureName" -> "Nanosecond-precision timestamp types",
+        "configKey" -> "spark.sql.timestampNanosTypes.enabled",
+        "configValue" -> "true")
+      Seq(TimestampNTZNanosType(9), TimestampLTZNanosType(9)).foreach { to =>
+        checkError(
+          exception = intercept[SparkException] {
+            cast(Literal("2020-01-01 00:00:00"), to, UTC_OPT).checkInputDataTypes()
+          },
+          condition = "FEATURE_NOT_ENABLED",
+          parameters = expectedParams)
+      }
+    }
+  }
+
   test("SPARK-35112: Cast string to day-time interval") {
     checkEvaluation(cast(Literal.create("0 0:0:0"), DayTimeIntervalType()), 0L)
     checkEvaluation(cast(Literal.create(" interval '0 0:0:0' Day TO second   "),
diff --git a/sql/catalyst/src/test/scala/org/apache/spark/sql/catalyst/expressions/CastWithAnsiOffSuite.scala b/sql/catalyst/src/test/scala/org/apache/spark/sql/catalyst/expressions/CastWithAnsiOffSuite.scala
index ec347a14a9a4..9f9a6f275a3f 100644
--- a/sql/catalyst/src/test/scala/org/apache/spark/sql/catalyst/expressions/CastWithAnsiOffSuite.scala
+++ b/sql/catalyst/src/test/scala/org/apache/spark/sql/catalyst/expressions/CastWithAnsiOffSuite.scala
@@ -27,6 +27,7 @@ import org.apache.spark.sql.catalyst.analysis.TypeCoercionSuite
 import org.apache.spark.sql.catalyst.expressions.aggregate.{CollectList, CollectSet}
 import org.apache.spark.sql.catalyst.util.DateTimeConstants._
 import org.apache.spark.sql.catalyst.util.DateTimeTestUtils._
+import org.apache.spark.sql.catalyst.util.TimestampNanosTestUtils.foreachNanosPrecision
 import org.apache.spark.sql.internal.SQLConf
 import org.apache.spark.sql.types._
 import org.apache.spark.sql.types.DayTimeIntervalType.{DAY, HOUR, MINUTE, SECOND}
@@ -56,6 +57,17 @@ class CastWithAnsiOffSuite extends CastSuiteBase {
     checkEvaluation(cast(123L, DecimalType(2, 0)), null)
   }
 
+  test("SPARK-57211: legacy mode cast malformed string to nanosecond timestamp returns null") {
+    Seq("123", "2015-03-18 123142", "2015-03-18X", "abdef").foreach { str =>
+      foreachNanosPrecision { precision =>
+        checkEvaluation(
+          cast(Literal(str), TimestampLTZNanosType(precision), UTC_OPT), null)
+        checkEvaluation(
+          cast(Literal(str), TimestampNTZNanosType(precision)), null)
+      }
+    }
+  }
+
   test("cast from int #2") {
     checkEvaluation(cast(cast(1000, TimestampType), LongType), 1000.toLong)
     checkEvaluation(cast(cast(-1200, TimestampType), LongType), -1200.toLong)
diff --git a/sql/catalyst/src/test/scala/org/apache/spark/sql/catalyst/expressions/CastWithAnsiOnSuite.scala b/sql/catalyst/src/test/scala/org/apache/spark/sql/catalyst/expressions/CastWithAnsiOnSuite.scala
index b76aec6d6ce0..ce7850d8c9c1 100644
--- a/sql/catalyst/src/test/scala/org/apache/spark/sql/catalyst/expressions/CastWithAnsiOnSuite.scala
+++ b/sql/catalyst/src/test/scala/org/apache/spark/sql/catalyst/expressions/CastWithAnsiOnSuite.scala
@@ -28,6 +28,7 @@ import org.apache.spark.sql.catalyst.analysis.TypeCheckResult.DataTypeMismatch
 import org.apache.spark.sql.catalyst.util.DateTimeConstants.MILLIS_PER_SECOND
 import org.apache.spark.sql.catalyst.util.DateTimeTestUtils
 import org.apache.spark.sql.catalyst.util.DateTimeTestUtils.{withDefaultTimeZone, UTC}
+import org.apache.spark.sql.catalyst.util.TimestampNanosTestUtils.foreachNanosPrecision
 import org.apache.spark.sql.errors.QueryErrorsBase
 import org.apache.spark.sql.internal.SQLConf
 import org.apache.spark.sql.types._
@@ -800,6 +801,23 @@ class CastWithAnsiOnSuite extends CastSuiteBase with QueryErrorsBase {
     }
   }
 
+  test("SPARK-57211: ANSI mode cast string to nanosecond timestamp with parse error") {
+    val invalidInputs = Seq(
+      "123", "2015-03-18 123142", "2015-03-18X", "2015/03/18", "abdef", "2015-031-8")
+    DateTimeTestUtils.outstandingZoneIds.foreach { zid =>
+      foreachNanosPrecision { precision =>
+        invalidInputs.foreach { str =>
+          checkExceptionInExpression[DateTimeException](
+            cast(Literal(str), TimestampLTZNanosType(precision), Option(zid.getId)),
+            castErrMsg(str, TimestampLTZNanosType(precision)))
+          checkExceptionInExpression[DateTimeException](
+            cast(Literal(str), TimestampNTZNanosType(precision)),
+            castErrMsg(str, TimestampNTZNanosType(precision)))
+        }
+      }
+    }
+  }
+
   test("ANSI mode: cast string to date with parse error") {
     DateTimeTestUtils.outstandingZoneIds.foreach { zid =>
       def checkCastWithParseError(str: String): Unit = {
diff --git a/sql/core/src/test/resources/sql-tests/analyzer-results/cast.sql.out b/sql/core/src/test/resources/sql-tests/analyzer-results/cast.sql.out
index 053d7af3df45..b077443a9f28 100644
--- a/sql/core/src/test/resources/sql-tests/analyzer-results/cast.sql.out
+++ b/sql/core/src/test/resources/sql-tests/analyzer-results/cast.sql.out
@@ -641,6 +641,34 @@ Project [cast(a as timestamp_ntz) AS CAST(a AS TIMESTAMP_NTZ)#x]
 +- OneRowRelation
 
 
+-- !query
+select typeof(cast('2022-01-01 00:00:00.123456789' as timestamp_ntz(9)))
+-- !query analysis
+Project [typeof(cast(2022-01-01 00:00:00.123456789 as timestamp_ntz(9))) AS typeof(CAST(2022-01-01 00:00:00.123456789 AS TIMESTAMP_NTZ(9)))#x]
++- OneRowRelation
+
+
+-- !query
+select typeof(cast('2022-01-01 00:00:00.123456789' as timestamp_ltz(7)))
+-- !query analysis
+Project [typeof(cast(2022-01-01 00:00:00.123456789 as timestamp_ltz(7))) AS typeof(CAST(2022-01-01 00:00:00.123456789 AS TIMESTAMP_LTZ(7)))#x]
++- OneRowRelation
+
+
+-- !query
+select cast('a' as timestamp_ntz(9)) is null
+-- !query analysis
+Project [isnull(cast(a as timestamp_ntz(9))) AS (CAST(a AS TIMESTAMP_NTZ(9)) IS NULL)#x]
++- OneRowRelation
+
+
+-- !query
+select cast('a' as timestamp_ltz(9)) is null
+-- !query analysis
+Project [isnull(cast(a as timestamp_ltz(9))) AS (CAST(a AS TIMESTAMP_LTZ(9)) IS NULL)#x]
++- OneRowRelation
+
+
 -- !query
 select cast(cast('inf' as double) as timestamp)
 -- !query analysis
diff --git a/sql/core/src/test/resources/sql-tests/analyzer-results/nonansi/cast.sql.out b/sql/core/src/test/resources/sql-tests/analyzer-results/nonansi/cast.sql.out
index 0113716bdf71..1255f2266629 100644
--- a/sql/core/src/test/resources/sql-tests/analyzer-results/nonansi/cast.sql.out
+++ b/sql/core/src/test/resources/sql-tests/analyzer-results/nonansi/cast.sql.out
@@ -505,6 +505,34 @@ Project [cast(a as timestamp_ntz) AS CAST(a AS TIMESTAMP_NTZ)#x]
 +- OneRowRelation
 
 
+-- !query
+select typeof(cast('2022-01-01 00:00:00.123456789' as timestamp_ntz(9)))
+-- !query analysis
+Project [typeof(cast(2022-01-01 00:00:00.123456789 as timestamp_ntz(9))) AS typeof(CAST(2022-01-01 00:00:00.123456789 AS TIMESTAMP_NTZ(9)))#x]
++- OneRowRelation
+
+
+-- !query
+select typeof(cast('2022-01-01 00:00:00.123456789' as timestamp_ltz(7)))
+-- !query analysis
+Project [typeof(cast(2022-01-01 00:00:00.123456789 as timestamp_ltz(7))) AS typeof(CAST(2022-01-01 00:00:00.123456789 AS TIMESTAMP_LTZ(7)))#x]
++- OneRowRelation
+
+
+-- !query
+select cast('a' as timestamp_ntz(9)) is null
+-- !query analysis
+Project [isnull(cast(a as timestamp_ntz(9))) AS (CAST(a AS TIMESTAMP_NTZ(9)) IS NULL)#x]
++- OneRowRelation
+
+
+-- !query
+select cast('a' as timestamp_ltz(9)) is null
+-- !query analysis
+Project [isnull(cast(a as timestamp_ltz(9))) AS (CAST(a AS TIMESTAMP_LTZ(9)) IS NULL)#x]
++- OneRowRelation
+
+
 -- !query
 select cast(cast('inf' as double) as timestamp)
 -- !query analysis
diff --git a/sql/core/src/test/resources/sql-tests/inputs/cast.sql b/sql/core/src/test/resources/sql-tests/inputs/cast.sql
index 9d191dff6702..5065e7c335e7 100644
--- a/sql/core/src/test/resources/sql-tests/inputs/cast.sql
+++ b/sql/core/src/test/resources/sql-tests/inputs/cast.sql
@@ -102,6 +102,15 @@ select cast('a' as timestamp);
 select cast('2022-01-01 00:00:00' as timestamp_ntz);
 select cast('a' as timestamp_ntz);
 
+-- SPARK-57211: cast string to nanosecond-precision timestamps TIMESTAMP_NTZ(p)/TIMESTAMP_LTZ(p).
+-- The reverse direction (nanos -> string) is not wired yet, so positive cases assert the result
+-- type via typeof. Negative cases exercise the ANSI parse-error path and use IS NULL so the result
+-- column stays non-nanos (a bare nanos result column is not yet serializable by JDBC/thrift).
+select typeof(cast('2022-01-01 00:00:00.123456789' as timestamp_ntz(9)));
+select typeof(cast('2022-01-01 00:00:00.123456789' as timestamp_ltz(7)));
+select cast('a' as timestamp_ntz(9)) is null;
+select cast('a' as timestamp_ltz(9)) is null;
+
 select cast(cast('inf' as double) as timestamp);
 select cast(cast('inf' as float) as timestamp);
 
diff --git a/sql/core/src/test/resources/sql-tests/results/cast.sql.out b/sql/core/src/test/resources/sql-tests/results/cast.sql.out
index ca2f739113f1..10b6f4526889 100644
--- a/sql/core/src/test/resources/sql-tests/results/cast.sql.out
+++ b/sql/core/src/test/resources/sql-tests/results/cast.sql.out
@@ -1288,6 +1288,72 @@ org.apache.spark.SparkDateTimeException
 }
 
 
+-- !query
+select typeof(cast('2022-01-01 00:00:00.123456789' as timestamp_ntz(9)))
+-- !query schema
+struct<typeof(CAST(2022-01-01 00:00:00.123456789 AS TIMESTAMP_NTZ(9))):string>
+-- !query output
+timestamp_ntz(9)
+
+
+-- !query
+select typeof(cast('2022-01-01 00:00:00.123456789' as timestamp_ltz(7)))
+-- !query schema
+struct<typeof(CAST(2022-01-01 00:00:00.123456789 AS TIMESTAMP_LTZ(7))):string>
+-- !query output
+timestamp_ltz(7)
+
+
+-- !query
+select cast('a' as timestamp_ntz(9)) is null
+-- !query schema
+struct<>
+-- !query output
+org.apache.spark.SparkDateTimeException
+{
+  "errorClass" : "CAST_INVALID_INPUT",
+  "sqlState" : "22018",
+  "messageParameters" : {
+    "ansiConfig" : "\"spark.sql.ansi.enabled\"",
+    "expression" : "'a'",
+    "sourceType" : "\"STRING\"",
+    "targetType" : "\"TIMESTAMP_NTZ(9)\""
+  },
+  "queryContext" : [ {
+    "objectType" : "",
+    "objectName" : "",
+    "startIndex" : 8,
+    "stopIndex" : 36,
+    "fragment" : "cast('a' as timestamp_ntz(9))"
+  } ]
+}
+
+
+-- !query
+select cast('a' as timestamp_ltz(9)) is null
+-- !query schema
+struct<>
+-- !query output
+org.apache.spark.SparkDateTimeException
+{
+  "errorClass" : "CAST_INVALID_INPUT",
+  "sqlState" : "22018",
+  "messageParameters" : {
+    "ansiConfig" : "\"spark.sql.ansi.enabled\"",
+    "expression" : "'a'",
+    "sourceType" : "\"STRING\"",
+    "targetType" : "\"TIMESTAMP_LTZ(9)\""
+  },
+  "queryContext" : [ {
+    "objectType" : "",
+    "objectName" : "",
+    "startIndex" : 8,
+    "stopIndex" : 36,
+    "fragment" : "cast('a' as timestamp_ltz(9))"
+  } ]
+}
+
+
 -- !query
 select cast(cast('inf' as double) as timestamp)
 -- !query schema
diff --git a/sql/core/src/test/resources/sql-tests/results/nonansi/cast.sql.out b/sql/core/src/test/resources/sql-tests/results/nonansi/cast.sql.out
index 64d7b3597055..2b73fe4e63da 100644
--- a/sql/core/src/test/resources/sql-tests/results/nonansi/cast.sql.out
+++ b/sql/core/src/test/resources/sql-tests/results/nonansi/cast.sql.out
@@ -584,6 +584,38 @@ struct<CAST(a AS TIMESTAMP_NTZ):timestamp_ntz>
 NULL
 
 
+-- !query
+select typeof(cast('2022-01-01 00:00:00.123456789' as timestamp_ntz(9)))
+-- !query schema
+struct<typeof(CAST(2022-01-01 00:00:00.123456789 AS TIMESTAMP_NTZ(9))):string>
+-- !query output
+timestamp_ntz(9)
+
+
+-- !query
+select typeof(cast('2022-01-01 00:00:00.123456789' as timestamp_ltz(7)))
+-- !query schema
+struct<typeof(CAST(2022-01-01 00:00:00.123456789 AS TIMESTAMP_LTZ(7))):string>
+-- !query output
+timestamp_ltz(7)
+
+
+-- !query
+select cast('a' as timestamp_ntz(9)) is null
+-- !query schema
+struct<(CAST(a AS TIMESTAMP_NTZ(9)) IS NULL):boolean>
+-- !query output
+true
+
+
+-- !query
+select cast('a' as timestamp_ltz(9)) is null
+-- !query schema
+struct<(CAST(a AS TIMESTAMP_LTZ(9)) IS NULL):boolean>
+-- !query output
+true
+
+
 -- !query
 select cast(cast('inf' as double) as timestamp)
 -- !query schema


---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]
