(spark) branch branch-4.2 updated: Revert "[SPARK-56975][SS] Reject user-specified schema in DataStreamReader.table()"

ashrigondekar Fri, 29 May 2026 17:02:51 -0700

This is an automated email from the ASF dual-hosted git repository.

anishshri-db pushed a commit to branch branch-4.2
in repository https://gitbox.apache.org/repos/asf/spark.git



The following commit(s) were added to refs/heads/branch-4.2 by this push:
     new 3e40e71246ca Revert "[SPARK-56975][SS] Reject user-specified schema in DataStreamReader.table()"
3e40e71246ca is described below

commit 3e40e71246caddcecbb9ce1c3cdf444a8c1f30b9
Author: You Zhou <[email protected]>
AuthorDate: Fri May 29 17:01:03 2026 -0700

    Revert "[SPARK-56975][SS] Reject user-specified schema in DataStreamReader.table()"
    
    ### What changes were proposed in this pull request?
    This reverts commit `05b4d81f3f938ff140886d6f66ad66d08c66d5b2` (SPARK-56975), which made `DataStreamReader.table()` reject a user-specified schema by calling `assertNoSpecifiedSchema("table")`. This restores the previous behavior, where a user-specified schema passed before `.table()` is accepted (and ignored).
    
    ### Why are the changes needed?
    SPARK-56975 is a behavior-breaking change. Code that previously ran successfully, e.g. `spark.readStream.schema(s).table(name)`, now throws an `AnalysisException` (`_LEGACY_ERROR_TEMP_1189`). While a schema has no effect on `.table()`, rejecting it outright breaks existing user workloads that set a schema on the `DataStreamReader` before calling `.table()`.
    
    A user-facing behavior change like this must go through the project's breaking-change process, which was not followed for SPARK-56975. We are reverting it to restore backward compatibility; a proper deprecation path can be pursued separately if the stricter behavior is still desired.
    
    ### Does this PR introduce _any_ user-facing change?
    Yes. It restores the pre-SPARK-56975 behavior: `DataStreamReader.table()` again accepts (and silently ignores) a user-specified schema instead of throwing `AnalysisException` (`_LEGACY_ERROR_TEMP_1189`). Since SPARK-56975 only landed in unreleased branches (`master` and `branch-4.2`), there is no change relative to any released Spark version.
    
    ### How was this patch tested?
    This is a straight `git revert`. Existing `DataStreamTableAPISuite` tests pass; the test added by SPARK-56975 (`"read: user-specified schema is not allowed with table API"`) is removed as part of the revert.
    
    ### Was this patch authored or co-authored using generative AI tooling?
    No.
    
    Closes #56189 from PorridgeSwim/revert-SPARK-56975.
    
    Lead-authored-by: You Zhou <[email protected]>
    Co-authored-by: You Zhou <[email protected]>
    Signed-off-by: Anish Shrigondekar <[email protected]>
    (cherry picked from commit 6039af85e961aa6c3a361ed63665f3f8d3878f10)
    Signed-off-by: Anish Shrigondekar <[email protected]>
---
 .../org/apache/spark/sql/classic/DataStreamReader.scala  |  1 -
 .../sql/streaming/test/DataStreamTableAPISuite.scala     | 16 ----------------
 2 files changed, 17 deletions(-)

diff --git a/sql/core/src/main/scala/org/apache/spark/sql/classic/DataStreamReader.scala b/sql/core/src/main/scala/org/apache/spark/sql/classic/DataStreamReader.scala
index c8df93768808..eb3120cac05a 100644
--- a/sql/core/src/main/scala/org/apache/spark/sql/classic/DataStreamReader.scala
+++ b/sql/core/src/main/scala/org/apache/spark/sql/classic/DataStreamReader.scala
@@ -102,7 +102,6 @@ final class DataStreamReader private[sql](sparkSession: SparkSession)
   /** @inheritdoc */
   def table(tableName: String): DataFrame = {
     require(tableName != null, "The table name can't be null")
-    assertNoSpecifiedSchema("table")
     val identifier = sparkSession.sessionState.sqlParser.parseMultipartIdentifier(tableName)
     val unresolved = UnresolvedRelation(
       identifier,
diff --git a/sql/core/src/test/scala/org/apache/spark/sql/streaming/test/DataStreamTableAPISuite.scala b/sql/core/src/test/scala/org/apache/spark/sql/streaming/test/DataStreamTableAPISuite.scala
index 3930beec084d..e2c74533e7f3 100644
--- a/sql/core/src/test/scala/org/apache/spark/sql/streaming/test/DataStreamTableAPISuite.scala
+++ b/sql/core/src/test/scala/org/apache/spark/sql/streaming/test/DataStreamTableAPISuite.scala
@@ -84,22 +84,6 @@ class DataStreamTableAPISuite extends StreamTest with BeforeAndAfter {
     checkErrorTableNotFound(e, "`non_exist_table`")
   }
 
-  test("read: user-specified schema is not allowed with table API") {
-    val tblName = "my_table"
-    withTable(tblName) {
-      spark.range(3).write.format("parquet").saveAsTable(tblName)
-      val e = intercept[AnalysisException] {
-        spark.readStream
-          .schema(new StructType().add("a", IntegerType))
-          .table(tblName)
-      }
-      checkError(
-        exception = e,
-        condition = "_LEGACY_ERROR_TEMP_1189",
-        parameters = Map("operation" -> "table"))
-    }
-  }
-
   test("read: stream table API with temp view") {
     val tblName = "my_table"
     val stream = MemoryStream[Int]

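For illustration only, the call pattern that this revert makes valid again can be sketched as below. This is a minimal sketch, not part of the commit: the object name `SchemaBeforeTableExample`, the helper `readTableWithSchema`, and the local session settings are hypothetical, and it assumes a Spark build with this revert applied is on the classpath. The table setup mirrors the removed test; per the commit message the schema passed to `.schema(...)` is accepted and ignored, which this sketch does not independently verify.

```scala
import org.apache.spark.sql.{DataFrame, SparkSession}
import org.apache.spark.sql.types.{IntegerType, StructType}

// Hypothetical example object; not part of the Spark code base.
object SchemaBeforeTableExample {

  // Sets a schema on the DataStreamReader and then calls table().
  // With SPARK-56975 applied this call throws AnalysisException;
  // with the revert applied it is expected to return a streaming DataFrame.
  def readTableWithSchema(spark: SparkSession, tblName: String): DataFrame = {
    spark.readStream
      .schema(new StructType().add("a", IntegerType))
      .table(tblName)
  }

  def main(args: Array[String]): Unit = {
    // Assumption: a throwaway local session is sufficient for the illustration.
    val spark = SparkSession.builder()
      .master("local[1]")
      .appName("schema-before-table")
      .getOrCreate()

    val tblName = "my_table"
    // Same table setup as the removed test, with overwrite so reruns do not fail.
    spark.range(3).write.mode("overwrite").format("parquet").saveAsTable(tblName)

    val df = readTableWithSchema(spark, tblName)
    // Prints whatever schema Spark resolved for the streaming read.
    df.printSchema()

    spark.stop()
  }
}
```

On a branch that still contains SPARK-56975, the same call would instead fail with `_LEGACY_ERROR_TEMP_1189`, which is the breakage the commit message describes.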

---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]
