Hi Subham,

Thanks for this contribution. It's an interesting feature.

As mentioned in the GitHub issues, I am fine with moving forward with a PR as long as it remains a Draft PR to help drive the discussion. I suggest linking a GitHub Discussion or a design doc within the PR to help build consensus.

That said, I have a few initial comments:

1. I like the SPI approach used in the PR. This should become a standard in Polaris to facilitate custom implementations.
2. I agree that having a data source "per purpose" is a good idea. The main question is how we should handle the split:
   - Very granular (per entity)
   - By table "meaning"
   - By realm (this may not be granular enough)

From a user standpoint, I believe we should keep it simple. It would be a great first step forward. For example, in the configuration (e.g., application.properties), it could look like:

- polaris.datasource.entities=
- polaris.datasource.events=
- polaris.datasource.grants=

Regards,
JB

On Tue, Mar 10, 2026 at 11:19 PM Yufei Gu <[email protected]> wrote:

> Hi Subham,
>
> Thanks for working on this. Given the complexity and long-term implications
> discussed in https://github.com/apache/polaris/issues/3890, I think a short
> design doc could still be helpful to capture the intended architecture and
> future evolution. Here are a few questions listed in the issue. I believe
> these should be answered before jumping to an implementation.
>
> 1. Should we split each potentially noisy table into its own dedicated
>    data source? For example, one data source for events, one for metrics,
>    and one for idempotency.
> 2. Should we allow flexible grouping? For example, events and
>    idempotency tables sharing one data source, while metrics uses another.
> 3. Should we consider a different DS per realm instead of table-level
>    splitting?
> 4. How should schema version information be managed? If tables live in
>    different data sources, how do we track and coordinate schema evolution?
> 5. Should different data sources be allowed to point to different
>    schemas or databases?
>    This likely aligns with the isolation goal, but it implies that
>    cross-table joins become difficult or impossible at the database level,
>    leaving only in-memory joins as an option.
> 6. Should different data sources be allowed to point to the same schema?
>    If not, we need validation logic to detect and prevent misconfiguration.
>
> Yufei
>
> On Tue, Mar 10, 2026 at 7:33 AM Dmitri Bourlatchkov <[email protected]> wrote:
>
> > Hi Subham,
> >
> > Thanks again for your contribution!
> >
> > I believe PR 3960 moves in the right direction by establishing an SPI to
> > delegate DataSource resolution logic to the runtime environment.
> >
> > It immediately allows custom implementations in downstream projects (if
> > people wish to do that) and opens a way for supporting multiple DataSources
> > in Apache Polaris (in follow-up PRs).
> >
> > I think the PR is pretty clear in itself and does not require any extra
> > design docs. Let's review it in GH and merge when we have consensus.
> >
> > Cheers,
> > Dmitri.
> >
> > On Tue, Mar 10, 2026 at 8:27 AM Subham Sangwan <[email protected]> wrote:
> >
> > > Hi Polaris Dev Team,
> > >
> > > I have opened PR #3960 [1] to introduce the foundational groundwork for
> > > multi-datasource support in JDBC persistence, addressing Issue #3890 [2].
> > > The goal is to enable physical isolation of different persistence
> > > workloads (METASTORE, METRICS, EVENTS) into dedicated connection pools
> > > or databases. This will allow Polaris to better handle high-traffic
> > > environments by preventing "noisy neighbor" effects on the core entity
> > > tables.
> > >
> > > Key Highlights:
> > >
> > > - DataSourceResolver: A new pluggable interface for routing JDBC
> > >   connections based on RealmContext and StoreType.
> > > - Modular Design: Decoupled the resolution implementation into the
> > >   runtime-common module.
> > > - Consistency: Utilizes a type-safe StoreType enum and aligns with
> > >   existing RealmContext patterns.
> > >
> > > The PR has been refined with feedback from @dimas-b and is now ready for
> > > community review. I'd appreciate any feedback on the overall approach.
> > >
> > > Best regards,
> > >
> > > Subham Sangwan
> > > GitHub: Subham-KRLX
> > >
> > > [1] https://github.com/apache/polaris/pull/3960
> > > [2] https://github.com/apache/polaris/issues/3890
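
To make the "per purpose" configuration idea in this thread a bit more concrete, here is a minimal sketch of how a resolver could map a store type to a configured data source name, falling back to a default when no dedicated entry exists. Everything in it is an assumption for illustration: the class PerPurposeResolver, the key format polaris.datasource.<purpose>, and the resolve method are hypothetical and are not the interface from PR #3960. Only the store type names (METASTORE, METRICS, EVENTS) come from the thread. The sketch also ignores the realm dimension and returns a name rather than an actual JDBC DataSource.

```java
import java.util.EnumMap;
import java.util.Locale;
import java.util.Map;
import java.util.Properties;

// Store types named in the thread; the enum shape itself is an assumption.
enum StoreType { METASTORE, METRICS, EVENTS }

// Hypothetical resolver: maps each store type to a configured data source name.
final class PerPurposeResolver {
  private final Map<StoreType, String> names = new EnumMap<>(StoreType.class);
  private final String defaultName;

  PerPurposeResolver(Properties config, String defaultName) {
    this.defaultName = defaultName;
    for (StoreType type : StoreType.values()) {
      // Assumed key format: polaris.datasource.<purpose>, lower-cased.
      String key = "polaris.datasource." + type.name().toLowerCase(Locale.ROOT);
      String value = config.getProperty(key);
      // Empty or missing entries mean "use the default data source".
      if (value != null && !value.isBlank()) {
        names.put(type, value.trim());
      }
    }
  }

  // Returns the dedicated data source name, or the default if none is configured.
  String resolve(StoreType type) {
    return names.getOrDefault(type, defaultName);
  }
}

final class PerPurposeResolverDemo {
  public static void main(String[] args) {
    Properties config = new Properties();
    config.setProperty("polaris.datasource.events", "events-ds");
    PerPurposeResolver resolver = new PerPurposeResolver(config, "default-ds");
    System.out.println(resolver.resolve(StoreType.EVENTS));  // events-ds
    System.out.println(resolver.resolve(StoreType.METRICS)); // default-ds
  }
}
```

The default fallback keeps a single-datasource deployment working without any new configuration, which fits the "keep it simple" goal. Flexible grouping would fall out of pointing two keys at the same name, though whether that should be allowed is one of the open questions above.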
