Hi Nandor,

Good point! The new metrics and events APIs should definitely be marked as "experimental" or "beta" initially. That's a common practice for new APIs in Polaris.

Cheers,
Dmitri.

On Thu, Mar 19, 2026 at 4:30 PM Nándor Kollár <[email protected]> wrote:

> Hi All,
>
> I don’t see a problem with having two REST APIs for catalog events. We can mark the local API as experimental (if there’s a way to do that in Polaris), document that it isn’t stable, and deprecate it once the Iceberg API is released. Alternatively, we could keep and continue improving it with features that are missing from the Iceberg REST spec but are relevant for Polaris.
>
> Nandor
>
> Dmitri Bourlatchkov <[email protected]> wrote (on Thu, Mar 19, 2026, 17:44):
>
> > Hi All,
> >
> > Polaris can support both a local API and new endpoints in the IRC API once the Iceberg community adopts the latter.
> >
> > What is the concern with having different APIs to access the same data?
> >
> > Cheers,
> > Dmitri.
> >
> > On Tue, Mar 17, 2026 at 3:49 PM EJ Wang <[email protected]> wrote:
> >
> > > I share the same feeling as Yufei from reviewing the PRs. I want to avoid creating discrepancies if the IRC side ends up supporting it.
> > >
> > > -ej
> > >
> > > On Wed, Mar 11, 2026 at 2:47 PM Yufei Gu <[email protected]> wrote:
> > >
> > > > Thanks Anand for working on this.
> > > >
> > > > Given the IRC event endpoint is WIP, I think it'd be best to be consistent with the IRC endpoint. For more context on the IRC event, the event endpoint is not finalized yet. I'd recommend speeding up the IRC side work to avoid any inconsistencies between Polaris and IRC spec.
> > > >
> > > > I have a few questions about the metrics endpoint:
> > > > 1. Do we need to expose them via Polaris REST endpoint? Can users grab the metrics from the backend directly? I understand the RBAC won't be there, but it provides flexibility for users. For example, some users may choose a different persistence model such as a KV store or storing the metrics as objects in S3, which usually scales better than an RDBMS like Postgres. My understanding is that Polaris is not intended to be a full metrics system anyway, but rather to provide a way for downstream systems to consume these data.
> > > > 2. Should we consider it as an IRC endpoint? Given the metrics report endpoint in IRC, the Iceberg community might also be interested in serving back metrics, similar to the event endpoint. In that case, there is a risk of fragmentation if we create a Polaris endpoint now. We should avoid that if possible. It might be worth checking with the Iceberg community first.
> > > >
> > > > Happy to hear others’ thoughts on this.
> > > >
> > > > Yufei
> > > >
> > > > On Wed, Mar 11, 2026 at 9:27 AM Nándor Kollár <[email protected]> wrote:
> > > >
> > > > > I agree with Dmitri. The direction outlined in the proposal looks good to me, and the finer details can be worked out as implementation gets underway. We can adjust the design doc accordingly, later on.
> > > > >
> > > > > Thanks,
> > > > > Nandor
> > > > >
> > > > > Dmitri Bourlatchkov <[email protected]> wrote (on Tue, Mar 10, 2026, 20:43):
> > > > >
> > > > > > Hi Anand and All,
> > > > > >
> > > > > > The proposal LGTM in its current form.
> > > > > >
> > > > > > My personal approach is that a proposal does not have to be as "polished" as the final API spec. As long as we have consensus on the general approach and the basic API principles, I think we can proceed to implementation and iron out final wrinkles during actual API spec and code PRs.
> > > > > >
> > > > > > Would this approach work for everyone?
> > > > > >
> > > > > > Re-posting my comment from GH [1] here for visibility, in case people have different opinions and wish to discuss this in more depth:
> > > > > >
> > > > > > > From my POV, Polaris is a platform, indeed. In this sense, I think it is critical to enable users to control what features are at play in runtime, since different users have different use cases. This is why I originally advocated for isolating Metrics Persistence from MetaStore Persistence.
> > > > > > >
> > > > > > > If a user decides to leverage the "native" Polaris (scan/commit) Metrics Persistence, I do not see any disadvantage in also exposing an (optional) REST API for loading these metrics from Polaris Persistence.
> > > > > > >
> > > > > > > The degree of support and sophistication that goes into this sub-system is up to the community. If we have contributors (like @obelix74) who are willing to evolve it, I see no harm in some functionality overlap with more focused metrics platforms. Again, the key point is for all users to have control and be able to opt in or out of this feature in their specific deployments.
> > > > > > >
> > > > > > > Of course, offloading scan/commit metrics storage to a specialized observability system is possible too (assuming someone develops integration code for that, which is very welcome).
> > > > > >
> > > > > > [1] https://github.com/apache/polaris/pull/3924#discussion_r2913947744
> > > > > >
> > > > > > Thanks,
> > > > > > Dmitri.
> > > > > >
> > > > > > On Tue, Mar 10, 2026 at 12:37 PM Anand Kumar Sankaran via dev <[email protected]> wrote:
> > > > > >
> > > > > > > Hi EJ Wang and Dmitri,
> > > > > > >
> > > > > > > I addressed all your concerns about the proposal, in particular https://github.com/apache/polaris/pull/3924#discussion_r2908317696.
> > > > > > >
> > > > > > > Does this address your concerns?
> > > > > > >
> > > > > > > - Anand
> > > > > > >
> > > > > > > From: Dmitri Bourlatchkov <[email protected]>
> > > > > > > Date: Monday, March 9, 2026 at 1:17 PM
> > > > > > > To: [email protected] <[email protected]>
> > > > > > > Subject: Re: Proposal for REST endpoints for table metrics and events
> > > > > > >
> > > > > > > Hi EJ,
> > > > > > >
> > > > > > > You make good points about the metrics API extensibility and evolution.
> > > > > > >
> > > > > > > However, we need to consider practical aspects too. Anand appears to have some specific use cases in mind, and I assume his proposal addresses them.
> > > > > > >
> > > > > > > Starting with an API + implementation that works for some real world applications will validate the feature's usability.
> > > > > > >
> > > > > > > We can revamp the API completely in its v2 after v1 is merged. New major API versions do not have to be backward-compatible with older versions of the same API [1].
> > > > > > >
> > > > > > > In my personal experience, a v1 API can hardly be expected to cover all use cases and extensions well. We can certainly take a bit more time to polish it, but from my POV it might be best to iterate in terms of API versions rather than on unmerged commits in the initial proposal. Just my 2 cents :)
> > > > > > >
> > > > > > > That said, we should flag the new APIs in this proposal as "beta"... at least initially (which is the usual practice in Polaris).
> > > > > > >
> > > > > > > > I wonder if it would help to evaluate the Events API and Metrics API a bit more independently.
> > > > > > >
> > > > > > > That makes sense to me. However, the current proposal progressed a lot since its initial submission and contained both APIs. I would not want to lose this momentum.
> > > > > > >
> > > > > > > It might still be advisable to implement the events and metrics APIs separately and gather additional feedback at that time.
> > > > > > >
> > > > > > > [1] https://polaris.apache.org/in-dev/unreleased/evolution/
> > > > > > >
> > > > > > > Cheers,
> > > > > > > Dmitri.
> > > > > > >
> > > > > > > On Mon, Mar 9, 2026 at 3:48 PM EJ Wang <[email protected]> wrote:
> > > > > > >
> > > > > > > > Hi Anand,
> > > > > > > >
> > > > > > > > I think the proposal is moving in a better direction, especially on the Events side, and I appreciate the iteration so far. That said, I still have some concerns about the Metrics side, but they are less about individual parameters or endpoint shape, and more about product boundary.
> > > > > > > >
> > > > > > > > 2 cents: I wonder if it would help to evaluate the Events API and Metrics API a bit more independently.
> > > > > > > >
> > > > > > > > The Events side feels relatively close to Polaris' catalog/change-log scope. It is easier to justify as part of the core/community surface, especially if the goal is to expose completed catalog mutations in a way that aligns with Iceberg-style events.
> > > > > > > >
> > > > > > > > The Metrics side feels different to me. Once we start adding more and more type-specific filters, query semantics, and schema shape for individual metric families, it seems easy for Polaris to drift toward a built-in observability backend. My bias would be for Polaris to support a smaller set of community-recognized built-in metrics well, while providing good extensibility points for deployments that want richer querying, visualization, or use-case-specific metrics.
> > > > > > > >
> > > > > > > > Related to that, I am not yet convinced the current metrics model is generic enough as a long-term direction. Even after consolidating to a single endpoint, the design still feels fairly tied to the current scan/commit shape. I worry that otherwise each new metric family will keep pulling us into more storage/schema/API reshaping inside Polaris core.
> > > > > > > >
> > > > > > > > So the framing question I would suggest is something like:
> > > > > > > >
> > > > > > > > > What is the minimal built-in metrics surface Polaris should own in core, and where should we instead rely on extensibility / sink-export / plugin-style integration?
> > > > > > > >
> > > > > > > > For me, getting that boundary right matters more than settling every query parameter detail first.
> > > > > > > >
> > > > > > > > -ej
> > > > > > > >
> > > > > > > > On Tue, Mar 3, 2026 at 12:29 PM Anand Kumar Sankaran via dev <[email protected]> wrote:
> > > > > > > >
> > > > > > > > > Hi Yufei and Dmitri,
> > > > > > > > >
> > > > > > > > > Here is a proposal for the REST endpoints for metrics and events.
> > > > > > > > >
> > > > > > > > > https://github.com/apache/polaris/pull/3924/changes
> > > > > > > > >
> > > > > > > > > I did not see any precedent for raising a PR for proposals, so I am trying this. Please let me know what you think.
> > > > > > > > >
> > > > > > > > > - Anand
> > > > > > > > >
> > > > > > > > > From: Anand Kumar Sankaran <[email protected]>
> > > > > > > > > Date: Monday, March 2, 2026 at 10:25 AM
> > > > > > > > > To: [email protected] <[email protected]>
> > > > > > > > > Subject: Re: Polaris Telemetry and Audit Trail
> > > > > > > > >
> > > > > > > > > About the REST API, based on my use cases:
> > > > > > > > >
> > > > > > > > > 1. I want to be able to query commit metrics to track files added / removed per commit, along with record counts. The ingestion pipeline that writes this data is owned by us and we are guaranteed to write this information for each write.
> > > > > > > > > 2. I want to be able to query scan metrics for read. I understand clients do not fulfill this requirement.
> > > > > > > > > 3. I want to be able to query the events table (events are persisted) - this may supersede #2, I am not sure yet.
> > > > > > > > >
> > > > > > > > > All this information is in the JDBC based persistence model and is persisted in the metastore. I currently don’t have a need to query Prometheus or OpenTelemetry. I do publish some events to Prometheus and they are forwarded to our dashboards elsewhere.
> > > > > > > > >
> > > > > > > > > About the CLI utilities, I meant the admin user utilities. In one of the earliest drafts of my proposal, Prashant mentioned that the metrics tables can grow indefinitely and that a similar problem exists with the events table as well. We discussed that cleaning up of old records from both metrics tables and events tables can be done via a CLI utility.
> > > > > > > > >
> > > > > > > > > I see that Yufei has covered the discussion about datasources.
> > > > > > > > >
> > > > > > > > > - Anand
> > > > > > > > >
> > > > > > > > > From: Yufei Gu <[email protected]>
> > > > > > > > > Date: Friday, February 27, 2026 at 9:54 PM
> > > > > > > > > To: [email protected] <[email protected]>
> > > > > > > > > Subject: Re: Polaris Telemetry and Audit Trail
> > > > > > > > >
> > > > > > > > > As I mentioned in https://github.com/apache/polaris/issues/3890, supporting multiple data sources is not a trivial change. I would strongly recommend starting with a design document to carefully evaluate the architectural implications and long term impact.
> > > > > > > > >
> > > > > > > > > A REST endpoint to query metrics seems reasonable given the current JDBC based persistence model. That said, we may also consider alternative storage models. For example, if we later adopt a time series system such as Prometheus to store metrics, the query model and access patterns would be fundamentally different. Designing the REST API without considering these potential evolutions may limit flexibility. I'd suggest starting with the use case.
> > > > > > > > >
> > > > > > > > > Yufei
> > > > > > > > >
> > > > > > > > > On Fri, Feb 27, 2026 at 3:42 PM Dmitri Bourlatchkov <[email protected]> wrote:
> > > > > > > > >
> > > > > > > > > > Hi Anand,
> > > > > > > > > >
> > > > > > > > > > Sharing my view... subject to discussion:
> > > > > > > > > >
> > > > > > > > > > 1. Adding a non-IRC REST API to Polaris is perfectly fine.
> > > > > > > > > >
> > > > > > > > > > Figuring out specific endpoint URIs and payloads might require a few roundtrips, so opening a separate thread for that might be best. Contributors commonly create Google Docs for new API proposals too (they are fairly easy to update as the email discussion progresses).
> > > > > > > > > >
> > > > > > > > > > There was a suggestion to try Markdown (with PRs) for proposals [1] ... feel free to give it a try if you are comfortable with that.
> > > > > > > > > >
> > > > > > > > > > 2. Could you clarify whether you mean end user utilities or admin user utilities? In the latter case those might be more suitable for the Admin CLI (java) not the Python CLI, IMHO.
> > > > > > > > > >
> > > > > > > > > > Why would these utilities be common with events? IMHO, event use cases are distinct from scan/commit metrics.
> > > > > > > > > >
> > > > > > > > > > 3. I'd prefer separating metrics persistence from MetaStore persistence at the code level, so that they could be mixed and matched independently. The separate datasource question will become a non-issue with that approach, I guess.
> > > > > > > > > >
> > > > > > > > > > The rationale for separating scan metrics and metastore persistence is that "cascading deletes" between them are hardly ever required. Furthermore, the data and query patterns are very different so different technologies might be beneficial in each case.
> > > > > > > > > >
> > > > > > > > > > [1] https://lists.apache.org/thread/yto2wp982t43h1mqjwnslswhws5z47cy
> > > > > > > > > >
> > > > > > > > > > Cheers,
> > > > > > > > > > Dmitri.
> > > > > > > > > >
> > > > > > > > > > On Fri, Feb 27, 2026 at 6:19 PM Anand Kumar Sankaran via dev <[email protected]> wrote:
> > > > > > > > > >
> > > > > > > > > > > Thanks all. This PR is merged now.
> > > > > > > > > > >
> > > > > > > > > > > Here are the follow-up features / work needed. These were all part of the merged PR at some point in time and were removed to reduce scope.
> > > > > > > > > > >
> > > > > > > > > > > Please let me know what you think.
> > > > > > > > > > >
> > > > > > > > > > > 1. A REST API to paginate through table metrics. This will be a non-IRC-standard addition.
> > > > > > > > > > > 2. Utilities for managing old records, should be common with events. There was some discussion that it belongs to the CLI.
> > > > > > > > > > > 3. Separate datasource (metrics, events, even other tables?).
> > > > > > > > > > >
> > > > > > > > > > > Anything else?
> > > > > > > > > > >
> > > > > > > > > > > - Anand
