Hi EJ,

In the current state of PR [3960], the DataSourceResolver interface enables
customizing per-realm DataSource resolution.

However, the Apache Polaris code itself does not gain any new per-realm
logic related to DataSources. The OSS side keeps working as before.

I'm not sure whether Subham intended to delegate this logic to custom
implementations or not, but as the PR stands the code looks pretty
reasonable to me.

In the current state of the codebase, I do not think we can completely
avoid dealing with realms as they relate to DataSources. The decision about
how to cross-link them has to be made somewhere. IMHO, the
proposed DefaultDataSourceResolver looks like a good place to make that
decision. It is certainly subject to further evolution if we need to make
adjustments later.

Before [3960], all realms implicitly received the default DataSource. This
is now explicit in the code.
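
To illustrate the idea, here is a rough sketch. It is not the code from the
PR: the method signature, the RealmContext and PooledDataSource records, and
the constructor are simplified assumptions of mine (PooledDataSource stands in
for javax.sql.DataSource). Only the names DataSourceResolver,
DefaultDataSourceResolver, RealmContext and StoreType come from the PR
description.

```java
import java.util.Objects;

// Store types named in the PR description.
enum StoreType { METASTORE, METRICS, EVENTS }

// Hypothetical stand-in for the Polaris RealmContext.
record RealmContext(String realmIdentifier) {}

// Hypothetical stand-in for javax.sql.DataSource, to keep the sketch self-contained.
record PooledDataSource(String name) {}

// Assumed shape of the SPI: pick a DataSource for a realm and a store type.
interface DataSourceResolver {
    PooledDataSource resolve(RealmContext realm, StoreType storeType);
}

// Default behaviour: every realm and every store type gets the same DataSource.
// This makes explicit what previously happened implicitly.
final class DefaultDataSourceResolver implements DataSourceResolver {
    private final PooledDataSource defaultDataSource;

    DefaultDataSourceResolver(PooledDataSource defaultDataSource) {
        this.defaultDataSource = Objects.requireNonNull(defaultDataSource);
    }

    @Override
    public PooledDataSource resolve(RealmContext realm, StoreType storeType) {
        // Both arguments are ignored on purpose in the default implementation.
        return defaultDataSource;
    }
}
```

A downstream project could supply its own DataSourceResolver that inspects the
realm or the store type, while the default above keeps existing deployments
unchanged.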

[3960] https://github.com/apache/polaris/pull/3960

Cheers,
Dmitri.

On Thu, Mar 12, 2026 at 2:29 AM EJ Wang <[email protected]>
wrote:

> Hi Subham, Dmitri, Yufei, JB,
>
> I’m generally aligned with the direction here, and I left a few more
> detailed comments on the PR.
> At a high level, my main concern is that the current proposal may still be
> a bit ahead of the current story/scope. I’d lean toward keeping the first
> step narrower, preferring purpose-based routing over per-realm routing for
> v1, and making the supported model/config+migration story more explicit
> before the broader contract hardens.
>
> -ej
>
> On Wed, Mar 11, 2026 at 12:37 PM Jean-Baptiste Onofré <[email protected]>
> wrote:
>
> > Hi Subham,
> >
> > Thanks for this contribution. It's an interesting feature.
> >
> > As mentioned in the GitHub issues, I am fine with moving forward with a
> PR
> > as long as it remains a Draft PR to help drive the discussion. I suggest
> > linking a GitHub Discussion or a Design Doc within the PR to help build
> > consensus.
> >
> > That said, I have a few initial comments:
> >
> > 1. I like the SPI approach used in the PR. This should become a standard
> in
> > Polaris to facilitate custom implementations.
> > 2. I agree that having a data source "per purpose" is a good idea. The
> main
> > question is how we should handle the split:
> > - Very granular (per entity)
> > - By table "meaning"
> > - By realm (this may not be granular enough)
> >
> > From a user standpoint, I believe we should keep it simple. It would be a
> > great first step forward. For example, in the configuration (e.g.,
> > application.properties), it could look like:
> > - polaris.datasource.entities=
> > - polaris.datasource.events=
> > - polaris.datasource.grants=
> >
> > Regards,
> > JB
> >
> > On Tue, Mar 10, 2026 at 11:19 PM Yufei Gu <[email protected]> wrote:
> >
> > > Hi Subham,
> > >
> > > Thanks for working on this. Given the complexity and long term
> > implications
> > > discussed in https://github.com/apache/polaris/issues/3890, I think a
> > > short
> > > design doc could still be helpful to capture the intended architecture
> > and
> > > future evolution. Here are a few questions listed in the issue. I
> believe
> > > these should be answered before jumping to an implementation.
> > >
> > >
> > >    1. Should we split each potential noisy table into its own dedicated
> > >    data source. For example, one data source for events, one for
> metrics,
> > > and
> > >    one for idempotency.
> > >    2. Should we allow flexible grouping. For example, events and
> > >    idempotency tables sharing one data source, while metrics uses
> > another.
> > >    3. Should we consider different DS per realm instead of table-level
> > >    splitting?
> > >    4. How should schema version information be managed. If tables live
> in
> > >    different data sources, how do we track and coordinate schema
> > evolution.
> > >    5. Should different data sources be allowed to point to different
> > >    schemas or databases. This likely aligns with the isolation goal,
> but
> > it
> > >    implies that cross table joins become difficult or impossible at the
> > >    database level, leaving only in memory joins as an option.
> > >    6. Should different data sources be allowed to point to the same
> > schema.
> > >    If not, we need validation logic to detect and prevent
> > misconfiguration.
> > >
> > >
> > > Yufei
> > >
> > >
> > > On Tue, Mar 10, 2026 at 7:33 AM Dmitri Bourlatchkov <[email protected]>
> > > wrote:
> > >
> > > > Hi Subham,
> > > >
> > > > Thanks again for your contribution!
> > > >
> > > > I believe PR 3960 moves in the right direction by establishing an SPI
> > to
> > > > delegate DataSource resolution logic to the runtime environment.
> > > >
> > > > It immediately allows custom implementations in downstream projects
> (if
> > > > people wish to do that) and opens a way for supporting multiple
> > > DataSources
> > > > in Apache Polaris (in follow-up PRs).
> > > >
> > > > I think the PR is pretty clear in itself and does not require any
> extra
> > > > design docs. Let's review it in GH and merge when we have consensus.
> > > >
> > > > Cheers,
> > > > Dmitri.
> > > >
> > > > On Tue, Mar 10, 2026 at 8:27 AM Subham Sangwan <
> > > > [email protected]>
> > > > wrote:
> > > >
> > > > > Hi Polaris Dev Team, I have opened PR #3960 [1] to introduce the
> > > > > foundational groundwork for multi-datasource support in JDBC
> > > persistence,
> > > > > addressing Issue #3890 [2]. The goal is to enable physical isolation
> > of
> > > > > different persistence workloads (METASTORE, METRICS, EVENTS) into
> > > > dedicated
> > > > > connection pools or databases. This will allow Polaris to better
> > handle
> > > > > high-traffic environments by preventing "noisy neighbor" effects on
> > the
> > > > > core entity tables.
> > > > >
> > > > > Key Highlights:
> > > > >
> > > > >    - DataSourceResolver: A new pluggable interface for routing JDBC
> > > > >    connections based on RealmContext and StoreType.
> > > > >    - Modular Design: Decoupled the resolution implementation into
> the
> > > > >    runtime-common module.
> > > > >    - Consistency: Utilizes a type-safe StoreType enum and aligns
> with
> > > > >    existing RealmContext patterns.
> > > > >
> > > > > The PR has been refined with feedback from @dimas-b and is now
> ready
> > > for
> > > > > community review. I'd appreciate any feedback on the overall
> > approach.
> > > > >
> > > > > Best regards,
> > > > >
> > > > > Subham Sangwan
> > > > > GitHub: Subham-KRLX
> > > > >
> > > > > [1] https://github.com/apache/polaris/pull/3960
> > > > > [2] https://github.com/apache/polaris/issues/3890
> > > > >
> > > >
> > >
> >
>
