flyrain opened a new issue, #3890: URL: https://github.com/apache/polaris/issues/3890
### Is your feature request related to a problem? Please describe.

https://github.com/apache/polaris/pull/3385 took a stab at multi-datasource support for the metrics table. It turned out to be pretty complicated. Similar requirements are emerging for the events and idempotency tables, which suggests that we need a more holistic design rather than handling each table type independently.

The motivation for supporting multiple data sources (multi-DS) is to isolate different persistence workloads. For example, events or metrics persistence may generate enough traffic to impact the other tables, especially the critical entity table. Isolated data sources can mitigate this noisy-neighbor effect.

However, there are open questions and design considerations:

1. Should we split each potentially noisy table into its own dedicated data source? For example, one data source for events, one for metrics, and one for idempotency.
2. Should we allow flexible grouping? For example, the events and idempotency tables could share one data source while metrics uses another.
3. Should we consider a different data source per realm instead of table-level splitting?
4. How should schema version information be managed? If tables live in different data sources, how do we track and coordinate schema evolution?
5. Should different data sources be allowed to point to different schemas or databases? This likely aligns with the isolation goal, but it implies that cross-table joins become difficult or impossible at the database level, leaving in-memory joins as the only option.
6. Should different data sources be allowed to point to the same schema? If not, we need validation logic to detect and prevent misconfiguration.

Given the number of architectural implications, this likely requires a dedicated design discussion before proceeding with incremental changes. Without a coherent design, we risk introducing inconsistencies across metrics, events, and idempotency handling.
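
To make question 6 more concrete, here is a minimal sketch of what such validation could look like. Every name in it (`DataSourceConfigValidator`, `DataSourceSpec`, `findSchemaCollisions`) is hypothetical and does not exist in Polaris today, and the comparison is deliberately naive: it assumes each data source is described by a JDBC URL plus a schema name and treats two data sources as colliding when both match after trimming and lower-casing the schema.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Locale;
import java.util.Map;

/** Hypothetical sketch only; none of these types exist in Polaris. */
public class DataSourceConfigValidator {

  /** Hypothetical description of one configured data source. */
  public static final class DataSourceSpec {
    final String name;
    final String jdbcUrl;
    final String schema;

    public DataSourceSpec(String name, String jdbcUrl, String schema) {
      this.name = name;
      this.jdbcUrl = jdbcUrl;
      this.schema = schema;
    }
  }

  /**
   * Returns one message per data source that targets the same JDBC URL and
   * schema as an earlier one. An empty list means no overlap was detected.
   */
  public static List<String> findSchemaCollisions(List<DataSourceSpec> specs) {
    // Maps a normalized "url#schema" key to the first data source that claimed it.
    Map<String, String> ownerByTarget = new LinkedHashMap<>();
    List<String> problems = new ArrayList<>();
    for (DataSourceSpec spec : specs) {
      // Naive normalization (an assumption): a real check would also need to
      // handle host aliases, default schemas, and case-sensitive identifiers.
      String target =
          spec.jdbcUrl.trim() + "#" + spec.schema.trim().toLowerCase(Locale.ROOT);
      String previous = ownerByTarget.putIfAbsent(target, spec.name);
      if (previous != null) {
        problems.add(
            "Data sources '" + previous + "' and '" + spec.name
                + "' both point to " + target);
      }
    }
    return problems;
  }
}
```

If sharing a schema were disallowed, a check along these lines could run at startup and fail fast on a non-empty result. If sharing were allowed, the same result could be logged as a warning instead.
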
### Describe the solution you'd like

_No response_

### Describe alternatives you've considered

_No response_

### Additional context

_No response_
