This is an automated email from the ASF dual-hosted git repository.
cloud-fan pushed a commit to branch branch-4.x
in repository https://gitbox.apache.org/repos/asf/spark.git
The following commit(s) were added to refs/heads/branch-4.x by this push:
new d75c74f0060f [SPARK-57109][SQL] SYSTEM.SESSION should not be part of
SYSTEM_PATH
d75c74f0060f is described below
commit d75c74f0060f98d8a01ca9e131dbcde7675912db
Author: Serge Rielau <[email protected]>
AuthorDate: Sun May 31 09:00:02 2026 +0800
[SPARK-57109][SQL] SYSTEM.SESSION should not be part of SYSTEM_PATH
### What changes were proposed in this pull request?
Remove `system.session` from the expansion of the `SYSTEM_PATH` virtual
schema in `SET PATH`.
- `SET PATH = SYSTEM_PATH` now expands to just `system.builtin` (the
system-managed namespaces under the `system` catalog).
- `DEFAULT_PATH` is unchanged: it still expands to `system.builtin`,
`system.session`, and the current schema (when `spark.sql.defaultPath` is
empty).
Mechanically this decouples `SQLConf.systemPathOrder` from
`SQLConf.defaultPathOrder` -- the former is now a fixed `[system.builtin]`,
while the latter still seeds both system entries plus any catalog entries. The
standalone-test fallback in
`SessionCatalog.sessionFunctionKindsInResolutionOrder` switches to
`defaultPathOrder(Seq.empty)` so unqualified temp functions remain resolvable
in tests with no `CatalogManager` bound.
Docs (`docs/sql-ref-syntax-aux-conf-mgmt-set-path.md`) and the SQL golden
test (`sql-path.sql`) are updated accordingly. A new `SET PATH = SYSTEM_PATH,
CURRENT_SCHEMA` example is added, since that combination is the canonical
"system functions plus my working schema" path.
### Why are the changes needed?
`SYSTEM_PATH` is meant to make access to system-managed objects easy and to
scale to additional system schemas in the future (e.g. `system.ai`,
`system.geo`, `system.ml`). The contents of `system.session`, however, are
user-defined (temporary views, functions, and variables), so it does not fit
that bucket. It also encourages users to qualify references to temporary
objects with `session.` so they cannot be silently confused with or shadowed by
persistent objects.
See [SPARK-57109](https://issues.apache.org/jira/browse/SPARK-57109)
(subtask of [SPARK-54806](https://issues.apache.org/jira/browse/SPARK-54806)).
### Does this PR introduce _any_ user-facing change?
Yes. `SET PATH = SYSTEM_PATH` previously expanded to two entries
(`system.builtin` and `system.session`); it now expands to one
(`system.builtin`). Users who relied on the old behavior should write `SET PATH
= SYSTEM_PATH, system.session, ...` or use `DEFAULT_PATH`, which is unchanged.
The feature is gated by `spark.sql.path.enabled` (default `false`) and has
only been available on master, so no released Spark version is affected.
### How was this patch tested?
- Updated/extended `SetPathSuite` to pin the new contracts:
- `SET PATH = SYSTEM_PATH` expands to exactly `[system.builtin]`.
- `SET PATH = DEFAULT_PATH` still includes `system.builtin`,
`system.session`, and the current schema.
- `SET PATH = SYSTEM_PATH, CURRENT_SCHEMA` composes to `[system.builtin,
current schema]`.
- Regenerated the `sql-path.sql` analyzer-results and results golden files;
added a `SYSTEM_PATH, CURRENT_SCHEMA` example to the input.
- Ran `SetPathSuite` (62/62), `SQLQueryTestSuite -- -z sql-path` (2/2),
`RelationQualificationSuite` + `FunctionQualificationSuite` (82/82),
`SessionCatalogSuite` (103/103), `CatalogManagerSuite` (16/16), `SQLViewSuite`
+ `SQLFunctionSuite` (70/70), and `SQLQueryTestSuite -- -z keywords` (6/6). All
passing locally.
### Was this patch authored or co-authored using generative AI tooling?
Generated-by: Cursor (Claude Opus 4.7)
Closes #56161 from srielau/SPARK-57109-remove-session-from-system.
Lead-authored-by: Serge Rielau <[email protected]>
Co-authored-by: Wenchen Fan <[email protected]>
Signed-off-by: Wenchen Fan <[email protected]>
(cherry picked from commit 734a19bfd124abcea3f74ff5dc0c14b4b788da61)
Signed-off-by: Wenchen Fan <[email protected]>
---
docs/sql-ref-syntax-aux-conf-mgmt-set-path.md | 13 ++++-
.../sql/catalyst/catalog/SessionCatalog.scala | 12 +++--
.../org/apache/spark/sql/internal/SQLConf.scala | 14 ++++--
.../sql-tests/analyzer-results/sql-path.sql.out | 26 ++++++++++
.../test/resources/sql-tests/inputs/sql-path.sql | 11 +++++
.../resources/sql-tests/results/sql-path.sql.out | 34 ++++++++++++-
.../scala/org/apache/spark/sql/SetPathSuite.scala | 56 ++++++++++++++++++++--
7 files changed, 149 insertions(+), 17 deletions(-)
diff --git a/docs/sql-ref-syntax-aux-conf-mgmt-set-path.md
b/docs/sql-ref-syntax-aux-conf-mgmt-set-path.md
index 0f8ddf410294..64e698fa193a 100644
--- a/docs/sql-ref-syntax-aux-conf-mgmt-set-path.md
+++ b/docs/sql-ref-syntax-aux-conf-mgmt-set-path.md
@@ -101,7 +101,9 @@ path_element
* **`SYSTEM_PATH`**
- Expands to the two system namespaces, `system.builtin` and `system.session`.
+ Expands to the system-managed namespaces under the `system` catalog. Today
this is just
+ `system.builtin`, but it is reserved for future system-managed schemas (for
example, hosting
+ built-in AI, geospatial, or ML functions).
* **`PATH`**
@@ -169,9 +171,16 @@ path_element
-- DEFAULT_PATH and SYSTEM_PATH shortcuts.
> SET PATH = DEFAULT_PATH;
+> SELECT current_path();
+ system.builtin,system.session,spark_catalog.default
> SET PATH = SYSTEM_PATH;
> SELECT current_path();
- system.builtin,system.session
+ system.builtin
+
+-- SYSTEM_PATH composes naturally with the working schema.
+> SET PATH = SYSTEM_PATH, CURRENT_SCHEMA;
+> SELECT current_path();
+ system.builtin,spark_catalog.default
-- Append an entry by referring to the current path.
> SET PATH = spark_catalog.default, system.builtin;
diff --git
a/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/catalog/SessionCatalog.scala
b/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/catalog/SessionCatalog.scala
index 69af5ab80eec..1b36a29438c9 100644
---
a/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/catalog/SessionCatalog.scala
+++
b/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/catalog/SessionCatalog.scala
@@ -114,7 +114,7 @@ class SessionCatalog(
/**
* When set, unqualified builtin/temp function resolution uses this fixed
kind order instead of
- * [[catalogManagerForSessionFunctionKinds]] / [[SQLConf.systemPathOrder]].
For unit tests only;
+ * [[catalogManagerForSessionFunctionKinds]] / [[SQLConf.defaultPathOrder]].
For unit tests only;
* production relies on the catalog manager binding.
*/
@volatile private var sessionFunctionKindsTestOverride:
Option[Seq[SessionFunctionKind]] = None
@@ -128,8 +128,9 @@ class SessionCatalog(
* applied).
*
* When unset (e.g. standalone [[SessionCatalog]] in tests), kinds derive
from
- * [[SQLConf.systemPathOrder]] -- the seeded default path -- without
assuming other legacy
- * resolution-order conf beyond seeding `defaultPathOrder`.
+ * [[SQLConf.defaultPathOrder]] with no catalog entries -- equivalent to the
system-namespace
+ * entries of the spark-built-in default path. This includes both
`system.builtin` and
+ * `system.session` so unqualified temp functions are still resolvable in
test setups.
*/
@volatile private var catalogManagerForSessionFunctionKinds:
Option[CatalogManager] = None
@@ -154,7 +155,8 @@ class SessionCatalog(
/**
* Session function kinds in resolution order for unqualified lookups: test
override if set,
* else live PATH from [[catalogManagerForSessionFunctionKinds]], else
- * [[SQLConf.systemPathOrder]].
+ * [[SQLConf.defaultPathOrder]] with no catalog entries (so `system.builtin`
and
+ * `system.session` are both reachable in standalone test mode).
*
* MUST NOT be called while holding [[SessionCatalog]]'s intrinsic lock (see
SPARK-56939):
* the path-driven branch delegates to [[CatalogManager]], which has its own
intrinsic lock
@@ -169,7 +171,7 @@ class SessionCatalog(
// (currentCatalog, currentNamespace, path) triple in a single
critical section.
cm.sessionFunctionKindsForUnqualifiedResolution()
case None =>
- CatalogManager.systemFunctionKindsFromPath(conf.systemPathOrder)
+
CatalogManager.systemFunctionKindsFromPath(conf.defaultPathOrder(Seq.empty))
}
}
diff --git
a/sql/catalyst/src/main/scala/org/apache/spark/sql/internal/SQLConf.scala
b/sql/catalyst/src/main/scala/org/apache/spark/sql/internal/SQLConf.scala
index 0aed28e92558..41a511666af8 100644
--- a/sql/catalyst/src/main/scala/org/apache/spark/sql/internal/SQLConf.scala
+++ b/sql/catalyst/src/main/scala/org/apache/spark/sql/internal/SQLConf.scala
@@ -8657,8 +8657,9 @@ class SQLConf extends Serializable with Logging with
SqlApiConf {
/**
* Orders the given catalog path entries by
[[sessionFunctionResolutionOrder]], inserting
- * system.session and system.builtin. Used by both the legacy single-schema
resolution and
- * by SET PATH's DEFAULT_PATH / SYSTEM_PATH expansion to keep ordering in
sync.
+ * system.session and system.builtin. Used by the legacy single-schema
resolution and by
+ * SET PATH's DEFAULT_PATH expansion (when `spark.sql.defaultPath` is empty)
to keep
+ * ordering in sync. SYSTEM_PATH no longer flows through here -- see
[[systemPathOrder]].
*
* @param catalogEntries persistent catalog path entries (may be empty).
*/
@@ -8679,8 +8680,13 @@ class SQLConf extends Serializable with Logging with
SqlApiConf {
}
}
- /** System-only path (builtin + session) ordered by
[[sessionFunctionResolutionOrder]]. */
- def systemPathOrder: Seq[Seq[String]] = defaultPathOrder(Seq.empty)
+ /**
+ * System-only path used by `SET PATH = SYSTEM_PATH`. Contains the
system-managed namespaces
+ * under the `system` catalog whose contents are wholly defined by Spark
itself; today that
+ * is only `system.builtin`, but the shortcut is reserved for future
system-managed schemas
+ * (e.g. AI, geospatial, ML).
+ */
+ def systemPathOrder: Seq[Seq[String]] = Seq(Seq("system", "builtin"))
override def legacyParameterSubstitutionConstantsOnly: Boolean =
getConf(SQLConf.LEGACY_PARAMETER_SUBSTITUTION_CONSTANTS_ONLY)
diff --git
a/sql/core/src/test/resources/sql-tests/analyzer-results/sql-path.sql.out
b/sql/core/src/test/resources/sql-tests/analyzer-results/sql-path.sql.out
index 3a494d1cd3b7..793b38a6172a 100644
--- a/sql/core/src/test/resources/sql-tests/analyzer-results/sql-path.sql.out
+++ b/sql/core/src/test/resources/sql-tests/analyzer-results/sql-path.sql.out
@@ -77,6 +77,32 @@ Project [current_path() AS current_path()#x]
+- OneRowRelation
+-- !query
+USE spark_catalog.default
+-- !query analysis
+SetCatalogAndNamespace
++- ResolvedNamespace V2SessionCatalog(spark_catalog), [default]
+
+
+-- !query
+SET PATH = SYSTEM_PATH, CURRENT_SCHEMA
+-- !query analysis
+SetPathCommand [SystemPath, CurrentSchema]
+
+
+-- !query
+SELECT current_path()
+-- !query analysis
+Project [current_path() AS current_path()#x]
++- OneRowRelation
+
+
+-- !query
+SET PATH = DEFAULT_PATH
+-- !query analysis
+SetPathCommand [DefaultPath]
+
+
-- !query
SET PATH = spark_catalog.default, system.builtin
-- !query analysis
diff --git a/sql/core/src/test/resources/sql-tests/inputs/sql-path.sql
b/sql/core/src/test/resources/sql-tests/inputs/sql-path.sql
index e9d1d149e7fa..1137c747320c 100644
--- a/sql/core/src/test/resources/sql-tests/inputs/sql-path.sql
+++ b/sql/core/src/test/resources/sql-tests/inputs/sql-path.sql
@@ -97,10 +97,21 @@ SELECT current_path();
-- 2.3 SYSTEM_PATH shortcut
----------------------------------------------------
+--
+-- SYSTEM_PATH expands to the system-managed namespaces under the `system`
+-- catalog. Today that is just `system.builtin`; the shortcut is reserved for
+-- future system-managed schemas.
SET PATH = SYSTEM_PATH;
SELECT current_path();
+-- SYSTEM_PATH composes naturally with CURRENT_SCHEMA to give "system functions
+-- plus my working schema".
+USE spark_catalog.default;
+SET PATH = SYSTEM_PATH, CURRENT_SCHEMA;
+SELECT current_path();
+SET PATH = DEFAULT_PATH;
+
-- 2.4 PATH keyword (append to live path)
--------------------------------------
diff --git a/sql/core/src/test/resources/sql-tests/results/sql-path.sql.out
b/sql/core/src/test/resources/sql-tests/results/sql-path.sql.out
index 52d01ccb80ba..20d74b155258 100644
--- a/sql/core/src/test/resources/sql-tests/results/sql-path.sql.out
+++ b/sql/core/src/test/resources/sql-tests/results/sql-path.sql.out
@@ -92,7 +92,39 @@ SELECT current_path()
-- !query schema
struct<current_path():string>
-- !query output
-system.builtin,system.session
+system.builtin
+
+
+-- !query
+USE spark_catalog.default
+-- !query schema
+struct<>
+-- !query output
+
+
+
+-- !query
+SET PATH = SYSTEM_PATH, CURRENT_SCHEMA
+-- !query schema
+struct<>
+-- !query output
+
+
+
+-- !query
+SELECT current_path()
+-- !query schema
+struct<current_path():string>
+-- !query output
+system.builtin,spark_catalog.default
+
+
+-- !query
+SET PATH = DEFAULT_PATH
+-- !query schema
+struct<>
+-- !query output
+
-- !query
diff --git a/sql/core/src/test/scala/org/apache/spark/sql/SetPathSuite.scala
b/sql/core/src/test/scala/org/apache/spark/sql/SetPathSuite.scala
index 238b52ab7cd9..b40e3799a9c0 100644
--- a/sql/core/src/test/scala/org/apache/spark/sql/SetPathSuite.scala
+++ b/sql/core/src/test/scala/org/apache/spark/sql/SetPathSuite.scala
@@ -274,14 +274,60 @@ class SetPathSuite extends SharedSparkSession {
}
}
- test("PATH enabled: SET PATH = SYSTEM_PATH includes system.builtin and
system.session") {
+ test("PATH enabled: SET PATH = SYSTEM_PATH expands to system-managed
namespaces") {
+ // SPARK-57109: SYSTEM_PATH expands to the system-managed namespaces under
the `system`
+ // catalog. Today that is just `system.builtin`; the shortcut is reserved
for future
+ // system-managed schemas.
withPathEnabled {
sql("SET PATH = SYSTEM_PATH")
val entries = pathEntries(currentPath())
- assert(entries.contains("system.builtin"),
- s"SYSTEM_PATH should include system.builtin; got: $entries")
- assert(entries.contains("system.session"),
- s"SYSTEM_PATH should include system.session; got: $entries")
+ assert(entries === Seq("system.builtin"),
+ s"SYSTEM_PATH should expand to exactly [system.builtin]; got:
$entries")
+ }
+ }
+
+ test("PATH enabled: SET PATH = DEFAULT_PATH includes system.builtin,
system.session, " +
+ "and the current schema") {
+ // SPARK-57109: pin the spark-built-in default ordering used when
`spark.sql.defaultPath`
+ // is empty, so a future change to SYSTEM_PATH cannot silently drift the
DEFAULT_PATH
+ // contract. The default `sessionFunctionResolutionOrder` is "second"
(builtin first, then
+ // session, then catalog entries); ordering tests for the other modes live
below.
+ withPathEnabled {
+ sql("USE spark_catalog.default")
+ sql("SET PATH = DEFAULT_PATH")
+ val entries = pathEntries(currentPath())
+ assert(entries === Seq("system.builtin", "system.session",
"spark_catalog.default"),
+ s"DEFAULT_PATH should expand to system.builtin, system.session, and
the current " +
+ s"schema; got: $entries")
+ }
+ }
+
+ test("PATH enabled: SET PATH = SYSTEM_PATH, CURRENT_SCHEMA composes
cleanly") {
+ // SPARK-57109: SYSTEM_PATH plus CURRENT_SCHEMA is the canonical "system
functions plus my
+ // working schema" path; verify the expansion is exactly those two entries
in order.
+ withPathEnabled {
+ sql("USE spark_catalog.default")
+ sql("SET PATH = SYSTEM_PATH, CURRENT_SCHEMA")
+ val entries = pathEntries(currentPath())
+ assert(entries === Seq("system.builtin", "spark_catalog.default"),
+ s"SYSTEM_PATH, CURRENT_SCHEMA should expand to [system.builtin, " +
+ s"current schema]; got: $entries")
+ }
+ }
+
+ test("PATH enabled: SET PATH = SYSTEM_PATH, system.session is the documented
migration form") {
+ // SPARK-57109: callers who relied on the old SYSTEM_PATH expansion
(system.builtin +
+ // system.session) can name system.session explicitly. Because SYSTEM_PATH
now expands to
+ // only system.builtin, listing system.session alongside it is legal and
yields
+ // [system.builtin, system.session]. If SYSTEM_PATH ever re-expanded to
carry system.session
+ // again, this entry would collide and raise DUPLICATE_SQL_PATH_ENTRY --
which is the
+ // regression this test guards against.
+ withPathEnabled {
+ sql("SET PATH = SYSTEM_PATH, system.session")
+ val entries = pathEntries(currentPath())
+ assert(entries === Seq("system.builtin", "system.session"),
+ s"SYSTEM_PATH, system.session should expand to [system.builtin,
system.session]; " +
+ s"got: $entries")
}
}
---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]