wangzhigang1999 commented on code in PR #7385:
URL: https://github.com/apache/kyuubi/pull/7385#discussion_r3063133983
##########
kyuubi-server/src/main/scala/org/apache/kyuubi/engine/EngineRef.scala:
##########
@@ -113,6 +116,17 @@ private[kyuubi] class EngineRef(
@VisibleForTesting
private[kyuubi] val subdomain: String =
conf.get(ENGINE_SHARE_LEVEL_SUBDOMAIN) match {
+ // DATA_AGENT engines must be isolated by datasource (JDBC URL). This
takes precedence over
Review Comment:
Thanks for pushing on this — "must" really was under-motivated, let me walk
through the thinking.
Starting from what's actually in this PR, the provider SPI is
single-datasource by contract:
- `DataAgentProvider.load(KyuubiConf)` is called once per engine from
engine-level conf.
- `ENGINE_DATA_AGENT_JDBC_URL` is a single `OptionalConfigEntry[String]`.
- `ProviderRunRequest` carries `question` / `modelName` / `approvalMode` —
no datasource field.
So a provider instance is bound to exactly one JDBC URL when it's loaded,
and a caller has no way to dispatch a request against a different URL. The
`EngineRef.subdomain` branch is just the routing side of that same contract,
keeping the two halves in sync.
As for why the SPI is shaped that way: I tried the multi-datasource version
in a previous project and ran into real pain — SQL dialect divergence (tools
built around one dialect don't transfer cleanly to another), connection-pool
and credential isolation, schema introspection cost, and the general
awkwardness of one process quietly talking to several production databases. The
takeaway for me was that binding a process to a single datasource makes tool
design, prompt construction, and credential handling a lot simpler, so I
carried that lesson into the SPI here.
Multi-datasource isn't ruled out forever — it would just mean rethinking the
SPI (per-request datasource, or a provider keyed by datasource), the config
surface, and the routing together, which feels like its own discussion. Happy
to revisit if a concrete use case comes up.
On the comment itself, I'll tighten it to say the *why* directly:
```scala
// A DATA_AGENT engine is bound 1:1 to a JDBC datasource because the
provider SPI
// is constructed once per engine from a single
kyuubi.engine.data.agent.jdbc.url,
// and ProviderRunRequest carries no datasource field — there is no way to
dispatch
// a request against a different JDBC URL than the one the provider was
loaded
// with. Sessions targeting different datasources must therefore route to
distinct
// engines.
```
WDYT?
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at:
[email protected]
---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]