This is an automated email from the ASF dual-hosted git repository.

cloud-fan pushed a commit to branch branch-4.2
in repository https://gitbox.apache.org/repos/asf/spark.git


The following commit(s) were added to refs/heads/branch-4.2 by this push:
     new 425bfab0f73a [SPARK-57109][SQL] SYSTEM.SESSION should not be part of 
SYSTEM_PATH
425bfab0f73a is described below

commit 425bfab0f73aad49b7c01c868a50b74e6b0804d8
Author: Serge Rielau <[email protected]>
AuthorDate: Sun May 31 09:00:02 2026 +0800

    [SPARK-57109][SQL] SYSTEM.SESSION should not be part of SYSTEM_PATH
    
    ### What changes were proposed in this pull request?
    
    Remove `system.session` from the expansion of the `SYSTEM_PATH` virtual 
schema in `SET PATH`.
    
    - `SET PATH = SYSTEM_PATH` now expands to just `system.builtin` (the 
system-managed namespaces under the `system` catalog).
    - `DEFAULT_PATH` is unchanged: it still expands to `system.builtin`, 
`system.session`, and the current schema (when `spark.sql.defaultPath` is 
empty).
    
    Mechanically this decouples `SQLConf.systemPathOrder` from 
`SQLConf.defaultPathOrder` -- the former is now a fixed `[system.builtin]`, 
while the latter still seeds both system entries plus any catalog entries. The 
standalone-test fallback in 
`SessionCatalog.sessionFunctionKindsInResolutionOrder` switches to 
`defaultPathOrder(Seq.empty)` so unqualified temp functions remain resolvable 
in tests with no `CatalogManager` bound.
    
    Docs (`docs/sql-ref-syntax-aux-conf-mgmt-set-path.md`) and the SQL golden 
test (`sql-path.sql`) are updated accordingly. A new `SET PATH = SYSTEM_PATH, 
CURRENT_SCHEMA` example is added, since that combination is the canonical 
"system functions plus my working schema" path.
    
    ### Why are the changes needed?
    
    `SYSTEM_PATH` is meant to make access to system-managed objects easy and to 
scale to additional system schemas in the future (e.g. `system.ai`, 
`system.geo`, `system.ml`). The contents of `system.session`, however, are 
user-defined (temporary views, functions, and variables), so it does not fit 
that bucket. It also encourages users to qualify references to temporary 
objects with `session.` so they cannot be silently confused with or shadowed by 
persistent objects.
    
    See [SPARK-57109](https://issues.apache.org/jira/browse/SPARK-57109) 
(subtask of [SPARK-54806](https://issues.apache.org/jira/browse/SPARK-54806)).
    
    ### Does this PR introduce _any_ user-facing change?
    
    Yes. `SET PATH = SYSTEM_PATH` previously expanded to two entries 
(`system.builtin` and `system.session`); it now expands to one 
(`system.builtin`). Users who relied on the old behavior should write `SET PATH 
= SYSTEM_PATH, system.session, ...` or use `DEFAULT_PATH`, which is unchanged.
    
    The feature is gated by `spark.sql.path.enabled` (default `false`) and has 
only been available on master, so no released Spark version is affected.
    
    ### How was this patch tested?
    
    - Updated/extended `SetPathSuite` to pin the new contracts:
      - `SET PATH = SYSTEM_PATH` expands to exactly `[system.builtin]`.
      - `SET PATH = DEFAULT_PATH` still includes `system.builtin`, 
`system.session`, and the current schema.
      - `SET PATH = SYSTEM_PATH, CURRENT_SCHEMA` composes to `[system.builtin, 
current schema]`.
    - Regenerated the `sql-path.sql` analyzer-results and results golden files; 
added a `SYSTEM_PATH, CURRENT_SCHEMA` example to the input.
    - Ran `SetPathSuite` (62/62), `SQLQueryTestSuite -- -z sql-path` (2/2), 
`RelationQualificationSuite` + `FunctionQualificationSuite` (82/82), 
`SessionCatalogSuite` (103/103), `CatalogManagerSuite` (16/16), `SQLViewSuite` 
+ `SQLFunctionSuite` (70/70), and `SQLQueryTestSuite -- -z keywords` (6/6). All 
passing locally.
    
    ### Was this patch authored or co-authored using generative AI tooling?
    
    Generated-by: Cursor (Claude Opus 4.7)
    
    Closes #56161 from srielau/SPARK-57109-remove-session-from-system.
    
    Lead-authored-by: Serge Rielau <[email protected]>
    Co-authored-by: Wenchen Fan <[email protected]>
    Signed-off-by: Wenchen Fan <[email protected]>
    (cherry picked from commit 734a19bfd124abcea3f74ff5dc0c14b4b788da61)
    Signed-off-by: Wenchen Fan <[email protected]>
---
 docs/sql-ref-syntax-aux-conf-mgmt-set-path.md      | 13 ++++-
 .../sql/catalyst/catalog/SessionCatalog.scala      | 12 +++--
 .../org/apache/spark/sql/internal/SQLConf.scala    | 14 ++++--
 .../sql-tests/analyzer-results/sql-path.sql.out    | 26 ++++++++++
 .../test/resources/sql-tests/inputs/sql-path.sql   | 11 +++++
 .../resources/sql-tests/results/sql-path.sql.out   | 34 ++++++++++++-
 .../scala/org/apache/spark/sql/SetPathSuite.scala  | 56 ++++++++++++++++++++--
 7 files changed, 149 insertions(+), 17 deletions(-)

diff --git a/docs/sql-ref-syntax-aux-conf-mgmt-set-path.md 
b/docs/sql-ref-syntax-aux-conf-mgmt-set-path.md
index 0f8ddf410294..64e698fa193a 100644
--- a/docs/sql-ref-syntax-aux-conf-mgmt-set-path.md
+++ b/docs/sql-ref-syntax-aux-conf-mgmt-set-path.md
@@ -101,7 +101,9 @@ path_element
 
 * **`SYSTEM_PATH`**
 
-  Expands to the two system namespaces, `system.builtin` and `system.session`.
+  Expands to the system-managed namespaces under the `system` catalog. Today 
this is just
+  `system.builtin`, but it is reserved for future system-managed schemas (for 
example, hosting
+  built-in AI, geospatial, or ML functions).
 
 * **`PATH`**
 
@@ -169,9 +171,16 @@ path_element
 
 -- DEFAULT_PATH and SYSTEM_PATH shortcuts.
 > SET PATH = DEFAULT_PATH;
+> SELECT current_path();
+ system.builtin,system.session,spark_catalog.default
 > SET PATH = SYSTEM_PATH;
 > SELECT current_path();
- system.builtin,system.session
+ system.builtin
+
+-- SYSTEM_PATH composes naturally with the working schema.
+> SET PATH = SYSTEM_PATH, CURRENT_SCHEMA;
+> SELECT current_path();
+ system.builtin,spark_catalog.default
 
 -- Append an entry by referring to the current path.
 > SET PATH = spark_catalog.default, system.builtin;
diff --git 
a/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/catalog/SessionCatalog.scala
 
b/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/catalog/SessionCatalog.scala
index 69af5ab80eec..1b36a29438c9 100644
--- 
a/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/catalog/SessionCatalog.scala
+++ 
b/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/catalog/SessionCatalog.scala
@@ -114,7 +114,7 @@ class SessionCatalog(
 
   /**
    * When set, unqualified builtin/temp function resolution uses this fixed 
kind order instead of
-   * [[catalogManagerForSessionFunctionKinds]] / [[SQLConf.systemPathOrder]]. 
For unit tests only;
+   * [[catalogManagerForSessionFunctionKinds]] / [[SQLConf.defaultPathOrder]]. 
For unit tests only;
    * production relies on the catalog manager binding.
    */
   @volatile private var sessionFunctionKindsTestOverride: 
Option[Seq[SessionFunctionKind]] = None
@@ -128,8 +128,9 @@ class SessionCatalog(
    * applied).
    *
    * When unset (e.g. standalone [[SessionCatalog]] in tests), kinds derive 
from
-   * [[SQLConf.systemPathOrder]] -- the seeded default path -- without 
assuming other legacy
-   * resolution-order conf beyond seeding `defaultPathOrder`.
+   * [[SQLConf.defaultPathOrder]] with no catalog entries -- equivalent to the 
system-namespace
+   * entries of the spark-built-in default path. This includes both 
`system.builtin` and
+   * `system.session` so unqualified temp functions are still resolvable in 
test setups.
    */
   @volatile private var catalogManagerForSessionFunctionKinds: 
Option[CatalogManager] = None
 
@@ -154,7 +155,8 @@ class SessionCatalog(
   /**
    * Session function kinds in resolution order for unqualified lookups: test 
override if set,
    * else live PATH from [[catalogManagerForSessionFunctionKinds]], else
-   * [[SQLConf.systemPathOrder]].
+   * [[SQLConf.defaultPathOrder]] with no catalog entries (so `system.builtin` 
and
+   * `system.session` are both reachable in standalone test mode).
    *
    * MUST NOT be called while holding [[SessionCatalog]]'s intrinsic lock (see 
SPARK-56939):
    * the path-driven branch delegates to [[CatalogManager]], which has its own 
intrinsic lock
@@ -169,7 +171,7 @@ class SessionCatalog(
           // (currentCatalog, currentNamespace, path) triple in a single 
critical section.
           cm.sessionFunctionKindsForUnqualifiedResolution()
         case None =>
-          CatalogManager.systemFunctionKindsFromPath(conf.systemPathOrder)
+          
CatalogManager.systemFunctionKindsFromPath(conf.defaultPathOrder(Seq.empty))
       }
     }
 
diff --git 
a/sql/catalyst/src/main/scala/org/apache/spark/sql/internal/SQLConf.scala 
b/sql/catalyst/src/main/scala/org/apache/spark/sql/internal/SQLConf.scala
index 351a676ef921..80c4f0accb2d 100644
--- a/sql/catalyst/src/main/scala/org/apache/spark/sql/internal/SQLConf.scala
+++ b/sql/catalyst/src/main/scala/org/apache/spark/sql/internal/SQLConf.scala
@@ -8601,8 +8601,9 @@ class SQLConf extends Serializable with Logging with 
SqlApiConf {
 
   /**
    * Orders the given catalog path entries by 
[[sessionFunctionResolutionOrder]], inserting
-   * system.session and system.builtin. Used by both the legacy single-schema 
resolution and
-   * by SET PATH's DEFAULT_PATH / SYSTEM_PATH expansion to keep ordering in 
sync.
+   * system.session and system.builtin. Used by the legacy single-schema 
resolution and by
+   * SET PATH's DEFAULT_PATH expansion (when `spark.sql.defaultPath` is empty) 
to keep
+   * ordering in sync. SYSTEM_PATH no longer flows through here -- see 
[[systemPathOrder]].
    *
    * @param catalogEntries persistent catalog path entries (may be empty).
    */
@@ -8623,8 +8624,13 @@ class SQLConf extends Serializable with Logging with 
SqlApiConf {
     }
   }
 
-  /** System-only path (builtin + session) ordered by 
[[sessionFunctionResolutionOrder]]. */
-  def systemPathOrder: Seq[Seq[String]] = defaultPathOrder(Seq.empty)
+  /**
+   * System-only path used by `SET PATH = SYSTEM_PATH`. Contains the 
system-managed namespaces
+   * under the `system` catalog whose contents are wholly defined by Spark 
itself; today that
+   * is only `system.builtin`, but the shortcut is reserved for future 
system-managed schemas
+   * (e.g. AI, geospatial, ML).
+   */
+  def systemPathOrder: Seq[Seq[String]] = Seq(Seq("system", "builtin"))
 
   override def legacyParameterSubstitutionConstantsOnly: Boolean =
     getConf(SQLConf.LEGACY_PARAMETER_SUBSTITUTION_CONSTANTS_ONLY)
diff --git 
a/sql/core/src/test/resources/sql-tests/analyzer-results/sql-path.sql.out 
b/sql/core/src/test/resources/sql-tests/analyzer-results/sql-path.sql.out
index 3a494d1cd3b7..793b38a6172a 100644
--- a/sql/core/src/test/resources/sql-tests/analyzer-results/sql-path.sql.out
+++ b/sql/core/src/test/resources/sql-tests/analyzer-results/sql-path.sql.out
@@ -77,6 +77,32 @@ Project [current_path() AS current_path()#x]
 +- OneRowRelation
 
 
+-- !query
+USE spark_catalog.default
+-- !query analysis
+SetCatalogAndNamespace
++- ResolvedNamespace V2SessionCatalog(spark_catalog), [default]
+
+
+-- !query
+SET PATH = SYSTEM_PATH, CURRENT_SCHEMA
+-- !query analysis
+SetPathCommand [SystemPath, CurrentSchema]
+
+
+-- !query
+SELECT current_path()
+-- !query analysis
+Project [current_path() AS current_path()#x]
++- OneRowRelation
+
+
+-- !query
+SET PATH = DEFAULT_PATH
+-- !query analysis
+SetPathCommand [DefaultPath]
+
+
 -- !query
 SET PATH = spark_catalog.default, system.builtin
 -- !query analysis
diff --git a/sql/core/src/test/resources/sql-tests/inputs/sql-path.sql 
b/sql/core/src/test/resources/sql-tests/inputs/sql-path.sql
index e9d1d149e7fa..1137c747320c 100644
--- a/sql/core/src/test/resources/sql-tests/inputs/sql-path.sql
+++ b/sql/core/src/test/resources/sql-tests/inputs/sql-path.sql
@@ -97,10 +97,21 @@ SELECT current_path();
 
 
 -- 2.3 SYSTEM_PATH shortcut 
----------------------------------------------------
+--
+-- SYSTEM_PATH expands to the system-managed namespaces under the `system`
+-- catalog. Today that is just `system.builtin`; the shortcut is reserved for
+-- future system-managed schemas.
 
 SET PATH = SYSTEM_PATH;
 SELECT current_path();
 
+-- SYSTEM_PATH composes naturally with CURRENT_SCHEMA to give "system functions
+-- plus my working schema".
+USE spark_catalog.default;
+SET PATH = SYSTEM_PATH, CURRENT_SCHEMA;
+SELECT current_path();
+SET PATH = DEFAULT_PATH;
+
 
 -- 2.4 PATH keyword (append to live path) 
--------------------------------------
 
diff --git a/sql/core/src/test/resources/sql-tests/results/sql-path.sql.out 
b/sql/core/src/test/resources/sql-tests/results/sql-path.sql.out
index 52d01ccb80ba..20d74b155258 100644
--- a/sql/core/src/test/resources/sql-tests/results/sql-path.sql.out
+++ b/sql/core/src/test/resources/sql-tests/results/sql-path.sql.out
@@ -92,7 +92,39 @@ SELECT current_path()
 -- !query schema
 struct<current_path():string>
 -- !query output
-system.builtin,system.session
+system.builtin
+
+
+-- !query
+USE spark_catalog.default
+-- !query schema
+struct<>
+-- !query output
+
+
+
+-- !query
+SET PATH = SYSTEM_PATH, CURRENT_SCHEMA
+-- !query schema
+struct<>
+-- !query output
+
+
+
+-- !query
+SELECT current_path()
+-- !query schema
+struct<current_path():string>
+-- !query output
+system.builtin,spark_catalog.default
+
+
+-- !query
+SET PATH = DEFAULT_PATH
+-- !query schema
+struct<>
+-- !query output
+
 
 
 -- !query
diff --git a/sql/core/src/test/scala/org/apache/spark/sql/SetPathSuite.scala 
b/sql/core/src/test/scala/org/apache/spark/sql/SetPathSuite.scala
index 238b52ab7cd9..b40e3799a9c0 100644
--- a/sql/core/src/test/scala/org/apache/spark/sql/SetPathSuite.scala
+++ b/sql/core/src/test/scala/org/apache/spark/sql/SetPathSuite.scala
@@ -274,14 +274,60 @@ class SetPathSuite extends SharedSparkSession {
     }
   }
 
-  test("PATH enabled: SET PATH = SYSTEM_PATH includes system.builtin and 
system.session") {
+  test("PATH enabled: SET PATH = SYSTEM_PATH expands to system-managed 
namespaces") {
+    // SPARK-57109: SYSTEM_PATH expands to the system-managed namespaces under 
the `system`
+    // catalog. Today that is just `system.builtin`; the shortcut is reserved 
for future
+    // system-managed schemas.
     withPathEnabled {
       sql("SET PATH = SYSTEM_PATH")
       val entries = pathEntries(currentPath())
-      assert(entries.contains("system.builtin"),
-        s"SYSTEM_PATH should include system.builtin; got: $entries")
-      assert(entries.contains("system.session"),
-        s"SYSTEM_PATH should include system.session; got: $entries")
+      assert(entries === Seq("system.builtin"),
+        s"SYSTEM_PATH should expand to exactly [system.builtin]; got: 
$entries")
+    }
+  }
+
+  test("PATH enabled: SET PATH = DEFAULT_PATH includes system.builtin, 
system.session, " +
+    "and the current schema") {
+    // SPARK-57109: pin the spark-built-in default ordering used when 
`spark.sql.defaultPath`
+    // is empty, so a future change to SYSTEM_PATH cannot silently drift the 
DEFAULT_PATH
+    // contract. The default `sessionFunctionResolutionOrder` is "second" 
(builtin first, then
+    // session, then catalog entries); ordering tests for the other modes live 
below.
+    withPathEnabled {
+      sql("USE spark_catalog.default")
+      sql("SET PATH = DEFAULT_PATH")
+      val entries = pathEntries(currentPath())
+      assert(entries === Seq("system.builtin", "system.session", 
"spark_catalog.default"),
+        s"DEFAULT_PATH should expand to system.builtin, system.session, and 
the current " +
+          s"schema; got: $entries")
+    }
+  }
+
+  test("PATH enabled: SET PATH = SYSTEM_PATH, CURRENT_SCHEMA composes 
cleanly") {
+    // SPARK-57109: SYSTEM_PATH plus CURRENT_SCHEMA is the canonical "system 
functions plus my
+    // working schema" path; verify the expansion is exactly those two entries 
in order.
+    withPathEnabled {
+      sql("USE spark_catalog.default")
+      sql("SET PATH = SYSTEM_PATH, CURRENT_SCHEMA")
+      val entries = pathEntries(currentPath())
+      assert(entries === Seq("system.builtin", "spark_catalog.default"),
+        s"SYSTEM_PATH, CURRENT_SCHEMA should expand to [system.builtin, " +
+          s"current schema]; got: $entries")
+    }
+  }
+
+  test("PATH enabled: SET PATH = SYSTEM_PATH, system.session is the documented 
migration form") {
+    // SPARK-57109: callers who relied on the old SYSTEM_PATH expansion 
(system.builtin +
+    // system.session) can name system.session explicitly. Because SYSTEM_PATH 
now expands to
+    // only system.builtin, listing system.session alongside it is legal and 
yields
+    // [system.builtin, system.session]. If SYSTEM_PATH ever re-expanded to 
carry system.session
+    // again, this entry would collide and raise DUPLICATE_SQL_PATH_ENTRY -- 
which is the
+    // regression this test guards against.
+    withPathEnabled {
+      sql("SET PATH = SYSTEM_PATH, system.session")
+      val entries = pathEntries(currentPath())
+      assert(entries === Seq("system.builtin", "system.session"),
+        s"SYSTEM_PATH, system.session should expand to [system.builtin, 
system.session]; " +
+          s"got: $entries")
     }
   }
 


---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]

Reply via email to