Re: [Proposal] Table-Level Storage Credential Overrides

Dmitri Bourlatchkov Wed, 11 Mar 2026 14:59:01 -0700

Hi Srinivas,

Adnan brings up a good point about consensus. Could you summarize how you
see the rough plan for implementing this feature?


Thanks,
Dmitri.

On Tue, Mar 3, 2026 at 1:19 AM Adnan Hemani via dev <[email protected]>
wrote:

> Hey, I just took a quick look. Sorry, since it's a large PR, I do not have
> time to review in depth at the moment, but I'm not sure that we're
> completely on the same page based on the conversation in this thread.
>
> I mentioned #3409 because it (and you also mentioned this regarding Option
> 2 in your message on 2/10) does not need any new CRUD endpoints - so I'm
> not sure why those are being introduced in this PR. Personally, I don't
> think the PR as it stands now accurately reflects the community consensus
> on the ML.
>
> Best,
> Adnan Hemani
>
> On Mon, Mar 2, 2026 at 10:25 AM Srinivas Rishindra <[email protected]
> >
> wrote:
>
> > Hi All,
> >
> > I have created a draft pull request to share the progress on this feature
> > and gather early feedback:
> > https://github.com/apache/polaris/pull/3923/changes .
> >
> > Please note that this is still a work in progress; additional efforts are
> > required for comprehensive testing and code cleanup. I am sharing this
> > draft now to ensure the current implementation aligns with the community's
> > expectations and the general direction.
> >
> > I look forward to your thoughts and suggestions.
> >
> > Best regards,
> > Srinivas Rishindra
> >
> > On Mon, Feb 23, 2026 at 8:26 AM Srinivas Rishindra <
> [email protected]>
> > wrote:
> >
> >>
> >> Hi Sung and Adnan,
> >>
> >> Thank you for your comments.
> >>
> >> *To Sung:*
> >>
> >> While I don't have concrete production workflows available to me at the
> >> moment, I can offer an illustrative use case to highlight the broader
> >> vision. The general idea is to make the catalog abstraction much more of a
> >> logical construct, rather than one that tightly couples to a physical
> >> storage configuration or an IAM policy. Currently, a catalog is restricted
> >> to a single cloud provider or IAM role, forcing users into
> >> infrastructure-driven boundaries.
> >>
> >> Consider an organization with multiple departments like Sales, Marketing,
> >> and Engineering, where each gets its own catalog. Within the Sales catalog,
> >> data governance mandates that US data resides in AWS, European data in GCP,
> >> and Chinese data in Alibaba Cloud. Currently, these differing storage
> >> configurations would force the admin to artificially create separate
> >> catalogs per region. By decoupling storage from the catalog level, a sales
> >> associate can interact with their accounts as a unified logical unit (e.g.,
> >> a namespace per associate, tables per account), while the admin handles the
> >> underlying geographic storage complexity behind the scenes.
> >>
> >> *To Adnan:*
> >>
> >> I understand your concerns regarding the implementation complexity of
> >> Option 1, particularly how it would impact APIs like CreateTable. I
> >> agree that starting with Option 2 is a pragmatic first step to make
> >> progress, and we can evaluate migrating to Option 1 in the future as user
> >> needs evolve.
> >>
> >> I also reviewed PR #3409 <https://github.com/apache/polaris/pull/3409>
> >> and its corresponding issue, #2970 (Support Per-Catalog AWS Credentials
> >> in MinIO Deployments) <https://github.com/apache/polaris/issues/2970>.
> >> The discussion in that issue correctly highlighted the security risks of
> >> persisting raw secrets directly in the configuration object. By leveraging
> >> the approach from PR #3409, where named storage credentials are predefined
> >> in the server config and referenced by a storageName property, we can
> >> cleanly implement Option 2. Embedding just the storageName reference at
> >> the table or namespace level elegantly resolves the primary drawbacks I
> >> initially listed for Option 2: it prevents duplicating sensitive
> >> credentials, allows admins to rotate credentials centrally, and offers
> >> reusability without requiring a new top-level entity.
> >>
> >> Unless there are any objections, I will work on implementing Option 2 and
> >> publish a PR. Please let me know if this sounds like a reasonable path
> >> forward.
> >>
> >> Best regards,
> >>
> >> Srinivas
> >>
> >> On Fri, Feb 20, 2026 at 3:22 AM Adnan Hemani via dev <
> >> [email protected]> wrote:
> >>
> >>> Hi all,
> >>>
> >>> Sorry for the late reply. I still have some concerns about Option 1's
> >>> implementation details, which IMO may render it unusable or
> functionally
> >>> handicapped - my comments are on the original design document. If we
> >>> choose
> >>> Option 1 in the future, I think we will eventually need further scoping
> >>> or
> >>> discussion on how APIs like CreateTable will work.
> >>>
> >>> Could we potentially implement Option 2 in the short-term using the
> >>> approach in #3409 <https://github.com/apache/polaris/pull/3409>? Maybe
> >>> that
> >>> will help us keep more of the storage configs in alignment with each
> >>> other
> >>> (resolving the con about re-usability and solving some of the
> credential
> >>> rotation concerns as well).
> >>>
> >>> Best,
> >>> Adnan Hemani
> >>>
> >>> On Thu, Feb 19, 2026 at 8:58 AM Sung Yun <[email protected]> wrote:
> >>>
> >>> > Hi Srinivas,
> >>> >
> >>> > Thanks for the recap.
> >>> >
> >>> > I generally agree that Option 1 is the most semantically sound long
> >>> term
> >>> > approach, assuming credentials themselves live in a secrets manager
> >>> and the
> >>> > storage configuration only holds references. That feels like the most
> >>> > extensible direction as Polaris evolves.
> >>> >
> >>> > I also agree with Dmitri that there are really two different concerns
> >>> > here. One is how storage configuration is modeled and persisted in
> >>> Polaris
> >>> > as an Entity. The other is how the effective configuration is
> resolved
> >>> for
> >>> > a given table across catalog, namespace, and table boundaries. Those
> >>> do not
> >>> > have to be solved by the same abstraction.
> >>> >
> >>> > From that perspective, Option 4 is appealing from an implementation
> >>> > standpoint, but I share the concern about semantic confusion. Reusing
> >>> the
> >>> > resolution and inheritance logic that Policy already has makes sense,
> >>> but
> >>> > using the Policy entity itself to represent storage connectivity
> feels
> >>> > unintuitive and potentially confusing for future users and
> developers.
> >>> >
> >>> > Option 1 is IMHO probably the most correct model, but it also
> requires
> >>> the
> >>> > most upfront investment. Building on Yufei’s point, it would really
> >>> help to
> >>> > ground this in concrete user workflows. I think seeking answers to
> how
> >>> > common storage configuration reuse is across many tables, and how
> they
> >>> are
> >>> > typically managed (at the namespace level or at the table level) would
> >>> help
> >>> > us decide whether to invest in Option 1 now or phase toward it over
> >>> time.
> >>> >
> >>> > Cheers,
> >>> > Sung
> >>> >
> >>> > On 2026/02/17 23:44:23 Srinivas Rishindra wrote:
> >>> > > I agree with YuFei. Until we identify more concrete use cases, the
> >>> > *inline
> >>> > > model* seems to be the best starting point. It is particularly
> >>> > well-suited
> >>> > > for sparse configurations, where only a few tables in a namespace
> >>> require
> >>> > > overrides while the rest remain unchanged.
> >>> > >
> >>> > > *Next Steps:* Unless there are any objections, I will update the
> >>> design
> >>> > doc
> >>> > > to reflect this approach. Once approved, I will proceed with
> >>> > implementation.
> >>> > >
> >>> > > On Wed, Feb 11, 2026 at 3:49 PM Yufei Gu <[email protected]>
> >>> wrote:
> >>> > >
> >>> > > > I’d suggest we start from concrete use cases.
> >>> > > >
> >>> > > > If the inline model (Option 2) works well for the primary
> scenarios,
> >>> > e.g.,
> >>> > > > relatively sparse table level storage overrides, we could adopt
> it
> >>> as a
> >>> > > > first phase. It keeps the implementation simple and lets us
> >>> validate
> >>> > real
> >>> > > > world needs before introducing additional abstractions.
> >>> > > >
> >>> > > > However, if we anticipate frequent configuration rotation or
> strong
> >>> > reuse
> >>> > > > requirements across many tables, Option 1 is more compelling. In
> >>> that
> >>> > case,
> >>> > > > I'd recommend reusing the existing policy framework where
> possible,
> >>> > since
> >>> > > > it already provides inheritance and attachment semantics. That
> >>> could
> >>> > help
> >>> > > > us avoid introducing significant new complexity into Polaris
> while
> >>> > still
> >>> > > > supporting the richer model.
> >>> > > > Yufei
> >>> > > >
> >>> > > >
> >>> > > > On Wed, Feb 11, 2026 at 9:12 AM Dmitri Bourlatchkov <
> >>> [email protected]>
> >>> > > > wrote:
> >>> > > >
> >>> > > > > Hi Srinivas,
> >>> > > > >
> >>> > > > > Thanks for the discussion recap! It's very useful to keep the
> dev
> >>> > thread
> >>> > > > > and meetings aligned.
> >>> > > > >
> >>> > > > > Option 1:
> >>> > > > > Credential Rotation: Highly efficient. Because the
> configuration
> >>> is
> >>> > > > > referenced by ID, rotating a cloud IAM role or secret requires
> >>> > updating
> >>> > > > > only the single StorageConfiguration entity. [...]
> >>> > > > >
> >>> > > > >
> >>> > > > > This seems to imply that credentials are stored as part of the
> >>> > Storage
> >>> > > > > Configuration Entity. If so, I do not think this approach is
> >>> ideal. I
> >>> > > > > believe the secret data should ideally be accessed via the
> >>> Secrets
> >>> > > > Manager
> >>> > > > > [1]. While that discussion is still in progress, I believe it
> >>> > > > interconnects
> >>> > > > > with this proposal.
> >>> > > > >
> >>> > > > > [...] All thousands of downstream
> >>> > > > > tables referencing it would immediately use the new credentials
> >>> > without
> >>> > > > > metadata updates.
> >>> > > > >
> >>> > > > >
> >>> > > > > Immediacy is probably from the end-user's perspective.
> >>> Internally,
> >>> > > > > different Polaris processes may switch to the updated config at
> >>> > > > > different moments in time... I do not think it is a problem in
> >>> this
> >>> > case,
> >>> > > > > just wanted to highlight it to make sure distributed system
> >>> aspects
> >>> > are
> >>> > > > not
> >>> > > > > left out :)
> >>> > > > >
> >>> > > > > Option 2:
> >>> > > > > Credential Rotation: Credential rotation is difficult [...]
> >>> > > > >
> >>> > > > >
> >>> > > > > Again, I believe actual credentials should be accessed via the
> >>> > Secrets
> >>> > > > > Manager [1] so some indirection will be present.
> >>> > > > >
> >>> > > > > Config updates will need to happen individually in each case,
> but
> >>> > actual
> >>> > > > > secrets could be shared and updated centrally via the Secrets
> >>> > Manager.
> >>> > > > >
> >>> > > > > ATM, given the complexity points about option 1 that were
> >>> brought up
> >>> > in
> >>> > > > the
> >>> > > > > community sync, I tend to favour this option for implementing
> >>> this
> >>> > > > > proposal. However, this is not a strong requirement by any
> means,
> >>> > just my
> >>> > > > > personal opinion. Other opinions are welcome.
> >>> > > > >
> >>> > > > > Depending on how secret references are handled in code (needs a
> >>> POC,
> >>> > I
> >>> > > > > guess), there could be some synergy with Tornike's approach
> from
> >>> > [3699].
> >>> > > > >
> >>> > > > > Option 3: Named Catalog-Level Configurations (Hybrid) [...]
> >>> > > > >
> >>> > > > >
> >>> > > > > I would like to clarify the UX story in this case. Do we expect
> >>> end
> >>> > users
> >>> > > > > to manage Storage Configuration in this case or the Polaris
> >>> owner?
> >>> > > > >
> >>> > > > > In the latter case, it seems similar to Tornike's proposal in
> >>> [3699]
> >>> > but
> >>> > > > > generalized to all storage types. The Polaris Admin / Owner
> could
> >>> > use a
> >>> > > > > non-public API to work with this configuration (e.g. plain
> >>> Quarkus
> >>> > > > > configuration or possibly Admin CLI).
> >>> > > > >
> >>> > > > > Option 4: Leverage Existing Policy Framework [...]
> >>> > > > >
> >>> > > > >
> >>> > > > > I tend to agree with the "semantic confusion" point.
> >>> > > > >
> >>> > > > > It should be fine to reuse policy-related code in the
> >>> implementation
> >>> > (if
> >>> > > > > possible), but I believe Storage Configuration and related
> >>> credential
> >>> > > > > management form a distinct use case / feature and deserve
> >>> dedicated
> >>> > > > > handling in Polaris and the API / UX level.
> >>> > > > >
> >>> > > > > [1]
> >>> https://lists.apache.org/thread/68r3gcx70f0qhbtz3w4zhb8f9s4vvw1f
> >>> > > > >
> >>> > > > > [3699] https://github.com/apache/polaris/pull/3699
> >>> > > > >
> >>> > > > > Thanks,
> >>> > > > > Dmitri.
> >>> > > > >
> >>> > > > > On Tue, Feb 10, 2026 at 10:19 PM Srinivas Rishindra <
> >>> > > > > [email protected]>
> >>> > > > > wrote:
> >>> > > > >
> >>> > > > > > Hi Everyone,
> >>> > > > > >
> >>> > > > > > We had an opportunity to discuss this feature and my recent
> >>> > proposal at
> >>> > > > > > the last community sync meeting. I would like to summarize
> our
> >>> > > > > discussion
> >>> > > > > > and enumerate the various options we considered to help us
> >>> reach a
> >>> > > > > > consensus.
> >>> > > > > >
> >>> > > > > > To recap, storage configuration is currently restricted at
> the
> >>> > catalog
> >>> > > > > > level. This limits flexibility for users who need to organize
> >>> > tables
> >>> > > > > across
> >>> > > > > > different storage configurations or cloud providers within a
> >>> single
> >>> > > > > > catalog. There appears to be general agreement on the utility
> >>> of
> >>> > this
> >>> > > > > > feature; however, we still need to align on the specific
> >>> > implementation
> >>> > > > > > approach.
> >>> > > > > >
> >>> > > > > > Here are the various options that were considered.
> >>> > > > > > *Option 0: Make credentials available as part of table properties.*
> >>> > > > > > (This was my original proposal, but I abandoned it after becoming
> >>> > > > > > aware of the security implications.)
> >>> > > > > >
> >>> > > > > > *Option 1: First-Class Storage Configuration Entity *
> >>> > > > > >
> >>> > > > > > This approach proposes elevating StorageConfiguration to a
> >>> > standalone,
> >>> > > > > > top-level resource in the Polaris backend (similar to a
> >>> Principal,
> >>> > > > > > Namespace or Table), independent of the Catalog or Table.
> This
> >>> is
> >>> > the
> >>> > > > > > approach in my most recent proposal doc.
> >>> > > > > >
> >>> > > > > > - Data Model: A new StorageConfiguration entity is created with its own
> >>> > > > > >   unique identifier and lifecycle. Tables and Namespaces would store a
> >>> > > > > >   reference ID pointing to this entity rather than embedding the
> >>> > > > > >   credentials directly.
> >>> > > > > >
> >>> > > > > > - Security: This model offers the cleanest security boundary. We can
> >>> > > > > >   introduce a specific USAGE privilege on the configuration entity. A
> >>> > > > > >   user would need both CREATE_TABLE on the Namespace *and* USAGE on the
> >>> > > > > >   specific StorageConfiguration to link them.
> >>> > > > > >
> >>> > > > > > - Credential Rotation: Highly efficient. Because the configuration is
> >>> > > > > >   referenced by ID, rotating a cloud IAM role or secret requires
> >>> > > > > >   updating only the single StorageConfiguration entity. All thousands of
> >>> > > > > >   downstream tables referencing it would immediately use the new
> >>> > > > > >   credentials without metadata updates.
> >>> > > > > >
> >>> > > > > > - Inheritance: The reference could be set at the Catalog, Namespace, or
> >>> > > > > >   Table level. If a Table does not specify a reference, it would inherit
> >>> > > > > >   the reference from its parent Namespace (and so on), preserving the
> >>> > > > > >   current hierarchical behavior while adding granularity.
> >>> > > > > >
> >>> > > > > > • Pros: Maximum flexibility and reusability (Many-to-Many). Updating one
> >>> > > > > > config object propagates to all associated tables.
> >>> > > > > >
> >>> > > > > > • Cons: Highest engineering cost. Requires new CRUD APIs, DB schema
> >>> > > > > > changes (mapping tables), and complex authorization logic (two-stage
> >>> > > > > > auth checks). Risk of accumulating "orphaned" configs.
> >>> > > > > >
> >>> > > > > > Option 2: The "Embedded Field" Model
> >>> > > > > >
> >>> > > > > > This approach extends the existing Table and Namespace entities to
> >>> > > > > > include a storageConfig field. The field can default to 'null', in which
> >>> > > > > > case the parent's storageConfig is used at runtime.
> >>> > > > > >
> >>> > > > > > *Data Model:* No new top-level entity is created. The storage
> >>> > details
> >>> > > > > > (e.g., roleArn) are stored directly into a new, dedicated
> >>> column or
> >>> > > > > > structure within the existing Table/Namespace entity.
> >>> > > > > >
> >>> > > > > > Complexity: This could reduce the engineering overhead
> >>> > significantly.
> >>> > > > > There
> >>> > > > > > are no new CRUD endpoints for configuration objects, no
> >>> referential
> >>> > > > > > integrity checks (e.g., preventing the deletion of a config
> >>> used by
> >>> > > > > active
> >>> > > > > > tables).
> >>> > > > > >
> >>> > > > > > Credential Rotation: Credential rotation is difficult. If an
> >>> IAM
> >>> > role
> >>> > > > > > changes, an administrator must identify and issue UPDATE
> >>> > operations for
> >>> > > > > > every individual table or namespace that uses that specific
> >>> > > > > configuration,
> >>> > > > > > potentially affecting thousands of objects.
> >>> > > > > >
> >>> > > > > > • Pros: Lowest engineering cost. No new entities or complex
> >>> > mappings
> >>> > > > are
> >>> > > > > > required. Easy to reason about authorization (auth is tied
> >>> > strictly to
> >>> > > > > the
> >>> > > > > > entity).
> >>> > > > > >
> >>> > > > > > • Cons: No reusability. Configs must be duplicated across
> >>> tables;
> >>> > > > > rotating
> >>> > > > > > credentials for 1,000 tables could require 1,000 update
> calls.
> >>> > > > > >
> >>> > > > > > Option 3: Named Catalog-Level Configurations (Hybrid)
> >>> > > > > >
> >>> > > > > > This can be a combination of Option 1 and Option 2.
> >>> > > > > > An admin can define a registry of "Named Storage Configurations"
> >>> > stored
> >>> > > > > within
> >>> > > > > > the Catalog. Sub-entities (Namespaces/Tables) reference these
> >>> > configs
> >>> > > > by
> >>> > > > > > name (e.g., storage-config: "finance-secure-role").
> >>> > > > > >
> >>> > > > > > *Data Model:* No separate top level entity is created. The
> >>> Catalog
> >>> > > > Entity
> >>> > > > > > potentially needs to be modified to accommodate named storage
> >>> > > > > > configurations.
> >>> > > > > >
> >>> > > > > > Credential Rotation: Credential Rotation can be done at the
> >>> catalog
> >>> > > > level
> >>> > > > > > for each named Storage Configuration.
> >>> > > > > >
> >>> > > > > > Inheritance: Works much the same as proposed in Option 1 and Option 2.
> >>> > > > > >
> >>> > > > > > Security: Not as secure as Option 1 but still useful. A
> >>> principal
> >>> > with
> >>> > > > > > proper access can attach any named storage configuration
> >>> defined
> >>> > at the
> >>> > > > > > catalog level to any arbitrary entity within the catalog.
> >>> > > > > >
> >>> > > > > > • Pros: Good balance of reusability and simplicity. Allows
> >>> > updating a
> >>> > > > > > config in one place (the Catalog definition) without needing
> a
> >>> > > > full-blown
> >>> > > > > > global entity system.
> >>> > > > > >
> >>> > > > > > • Cons: Scope is limited to the Catalog (cannot share configs across
> >>> > > > > > catalogs).
> >>> > > > > >
> >>> > > > > > Option 4: Leverage Existing Policy Framework
> >>> > > > > >
> >>> > > > > > This approach leverages the existing Apache Polaris Policy
> >>> > Framework
> >>> > > > > > (currently used for features like snapshot expiry) to manage
> >>> > storage
> >>> > > > > > settings.
> >>> > > > > >
> >>> > > > > > Data Model: Storage configurations are defined as "Policies"
> >>> at the
> >>> > > > > Catalog
> >>> > > > > > level. These Policies contain the credential details and can
> be
> >>> > > > attached
> >>> > > > > to
> >>> > > > > > Namespaces or Tables using the existing policy attachment
> APIs.
> >>> > > > > >
> >>> > > > > > Inheritance:  This aligns naturally with Polaris's existing
> >>> > > > architecture,
> >>> > > > > > where policies cascade from Catalog → Namespace → Table. The
> >>> > vending
> >>> > > > > logic
> >>> > > > > > would simply resolve the "effective" storage policy for a
> >>> table at
> >>> > > > query
> >>> > > > > > time.
> >>> > > > > >
> >>> > > > > > Security: This utilizes the existing Polaris Privileges and
> >>> > attachment
> >>> > > > > > privileges. Administrators can define authorized storage
> >>> policies
> >>> > > > > > centrally, and users can only select from these pre-approved
> >>> > policies,
> >>> > > > > > preventing them from inputting arbitrary or insecure role
> ARNs.
> >>> > > > > >
> >>> > > > > > • Pros:
> >>> > > > > >   . Zero New Infrastructure: Reuses the existing "Policy"
> >>> entity,
> >>> > > > > > persistence layer, and inheritance logic, significantly
> >>> reducing
> >>> > > > > > engineering effort
> >>> > > > > >   . Proven Inheritance: The logic for resolving policies from
> >>> > child to
> >>> > > > > > parent is already implemented and tested
> >>> > > > > >
> >>> > > > > > • Cons:
> >>> > > > > >   . Semantic Confusion: Policies are typically used for
> >>> "governance
> >>> > > > > rules"
> >>> > > > > > (e.g., snapshot expiry, compaction) rather than "connectivity
> >>> > > > > > configuration." Using them for credentials might be
> unintuitive
> >>> > > > > >   . Authorization Complexity: The authorizer would need to
> >>> load and
> >>> > > > > > evaluate policies to determine how to access data,
> potentially
> >>> > coupling
> >>> > > > > > governance logic with data access paths
> >>> > > > > >
> >>> > > > > > We can start with one of these options and migrate to another as the
> >>> > > > > > feature and user needs develop. Please let me know your thoughts on the
> >>> > > > > > options above, or on anything I might have missed, so that we can work
> >>> > > > > > towards a consensus on how to implement this feature.
> >>> > > > > >
> >>> > > > > >
> >>> > > > > > On Thu, Feb 5, 2026 at 8:08 AM Tornike Gurgenidze <
> >>> > > > > [email protected]>
> >>> > > > > > wrote:
> >>> > > > > >
> >>> > > > > > > Hi,
> >>> > > > > > >
> >>> > > > > > > To follow up on Dmitri's point about credentials, there's
> >>> > already a
> >>> > > > PR
> >>> > > > > > > <https://github.com/apache/polaris/pull/3409> up that is
> >>> going
> >>> > to
> >>> > > > > allow
> >>> > > > > > > predefining named storage credentials in polaris config
> like
> >>> the
> >>> > > > > > following:
> >>> > > > > > >
> >>> > > > > > >    - polaris.storage.aws.<storage-name>.access-key
> >>> > > > > > >    - polaris.storage.aws.<storage-name>.secret-key
> >>> > > > > > >
> >>> > > > > > > then storage configuration will simply refer to it by name
> >>> and
> >>> > > > > > > inherit credentials.
> >>> > > > > > >
> >>> > > > > > > I think that can go hand in hand with table-level
> overrides.
> >>> > > > Overriding
> >>> > > > > > > each and every aws property for every table doesn't sound
> >>> ideal.
> >>> > > > > > Defining a
> >>> > > > > > > storage configuration upfront and referring to it by name
> >>> should
> >>> > be a
> >>> > > > > > > simpler solution. I can extend the scope of the PR above to
> >>> allow
> >>> > > > > > > predefining other aws properties as well like endpoint-url
> >>> and
> >>> > > > region.
> >>> > > > > > >
> >>> > > > > > > Another point that came up in the discussion surrounding extra
> >>> > > > > > > credentials is how to make sure that no one can simply hijack
> >>> > > > > > > preconfigured credentials.
> >>> > > > > > > The simplest solution I see there is to ship off properties
> >>> to
> >>> > OPA
> >>> > > > > during
> >>> > > > > > > catalog (and table) creation and allow users to write
> >>> policies
> >>> > based
> >>> > > > on
> >>> > > > > > > them. If we want to enable internal rbac to have a similar
> >>> > capability
> >>> > > > > we
> >>> > > > > > > can go further and move from config based storage
> definition
> >>> to a
> >>> > > > > > separate
> >>> > > > > > > `/storage-config` rest resource in management API that will
> >>> come
> >>> > with
> >>> > > > > > > necessary grants and permissions.
> >>> > > > > > >
> >>> > > > > > > On Thu, Feb 5, 2026 at 5:43 AM Dmitri Bourlatchkov <
> >>> > [email protected]
> >>> > > > >
> >>> > > > > > > wrote:
> >>> > > > > > >
> >>> > > > > > > > Hi Srinivas,
> >>> > > > > > > >
> >>> > > > > > > > Thanks for the proposal. It looks good to me overall, a
> >>> very
> >>> > timely
> >>> > > > > > > feature
> >>> > > > > > > > to add to Polaris.
> >>> > > > > > > >
> >>> > > > > > > > I added some comments in the doc and I see this topic on
> >>> the
> >>> > > > > Community
> >>> > > > > > > Sync
> >>> > > > > > > > agenda for Feb 5. Looking forward to discussing it
> online.
> >>> > > > > > > >
> >>> > > > > > > > I have three points to highlight:
> >>> > > > > > > >
> >>> > > > > > > > * Dealing with passwords probably connects to the Secrets
> >>> > Manager
> >>> > > > > > > > discussion [1]
> >>> > > > > > > >
> >>> > > > > > > > * Persistence needs to consider non-RDBMS backends. OSS
> >>> code
> >>> > has
> >>> > > > both
> >>> > > > > > > > PostgreSQL and MongoDB, but private Persistence
> >>> > implementations are
> >>> > > > > > > > possible too. I believe we need a proper SPI for this,
> not
> >>> > just a
> >>> > > > > > > > relational schema example.
> >>> > > > > > > >
> >>> > > > > > > > * Associating entities (tables, namespaces) to Storage
> >>> > > > Configuration
> >>> > > > > is
> >>> > > > > > > > likely a plugin point that downstream projects may want
> to
> >>> > > > customize.
> >>> > > > > > I'd
> >>> > > > > > > > propose making another SPI for this. This SPI is probably
> >>> > different
> >>> > > > > > from
> >>> > > > > > > > the new Persistence SPI mentioned above since the concern
> >>> here
> >>> > is
> >>> > > > not
> >>> > > > > > > > persistence per se, but the logic of finding the right
> >>> storage
> >>> > > > > config.
> >>> > > > > > > >
> >>> > > > > > > > [1]
> >>> > > > https://lists.apache.org/thread/68r3gcx70f0qhbtz3w4zhb8f9s4vvw1f
> >>> > > > > > > >
> >>> > > > > > > > Cheers,
> >>> > > > > > > > Dmitri.
> >>> > > > > > > >
> >>> > > > > > > > On Mon, Feb 2, 2026 at 4:18 PM Srinivas Rishindra <
> >>> > > > > > > [email protected]>
> >>> > > > > > > > wrote:
> >>> > > > > > > >
> >>> > > > > > > > > Hi all,
> >>> > > > > > > > >
> >>> > > > > > > > > We had an opportunity to discuss the community sprint last week. Based
> >>> > > > > > > > > on that discussion, I have created a new design doc, which I am
> >>> > > > > > > > > attaching here. Instead of passing credentials via table properties,
> >>> > > > > > > > > this design introduces Inheritable Storage Configurations as a
> >>> > > > > > > > > first-class feature. Please let me know your thoughts on the document.
> >>> > > > > > > > >
> >>> > > > > > > > > https://docs.google.com/document/d/1hbDkE-w84Pn_112iW2vCnlDKPDtyg8flaYcFGjvD120/edit?usp=sharing
> >>> > > > > > > > >
> >>> > > > > > > > >
> >>> > > > > > > > > On Mon, Jan 26, 2026 at 10:42 PM Yufei Gu <
> >>> > [email protected]>
> >>> > > > > > > wrote:
> >>> > > > > > > > >
> >>> > > > > > > > > > Hi Srinivas,
> >>> > > > > > > > > >
> >>> > > > > > > > > > Thanks for sharing this proposal. Persisting long
> lived
> >>> > > > > credentials
> >>> > > > > > > > such
> >>> > > > > > > > > as
> >>> > > > > > > > > > an S3 secret access key directly in table properties
> >>> raises
> >>> > > > > > > significant
> >>> > > > > > > > > > security concerns. Here is an alternative approach
> >>> > previously
> >>> > > > > > > > discussed,
> >>> > > > > > > > > > which enables storage configuration at the table or
> >>> > namespace
> >>> > > > > > level,
> >>> > > > > > > > and
> >>> > > > > > > > > it
> >>> > > > > > > > > > is probably a more secure and promising direction
> >>> overall.
> >>> > > > > > > > > >
> >>> > > > > > > > > > Yufei
> >>> > > > > > > > > >
> >>> > > > > > > > > >
> >>> > > > > > > > > > On Mon, Jan 26, 2026 at 8:18 PM Srinivas Rishindra <
> >>> > > > > > > > > [email protected]
> >>> > > > > > > > > > >
> >>> > > > > > > > > > wrote:
> >>> > > > > > > > > >
> >>> > > > > > > > > > > Dear All,
> >>> > > > > > > > > > >
> >>> > > > > > > > > > > I have developed a design proposal for Table-Level
> >>> > Storage
> >>> > > > > > > Credential
> >>> > > > > > > > > > > Overrides in Apache Polaris.
> >>> > > > > > > > > > >
> >>> > > > > > > > > > > The core objective is to allow specific storage
> >>> > properties to
> >>> > > > > be
> >>> > > > > > > > > defined
> >>> > > > > > > > > > at
> >>> > > > > > > > > > > the table level rather than the catalog level,
> >>> enabling a
> >>> > > > > single
> >>> > > > > > > > > logical
> >>> > > > > > > > > > > catalog to support tables across disparate storage
> >>> > systems.
> >>> > > > > > > > Crucially,
> >>> > > > > > > > > > the
> >>> > > > > > > > > > > implementation ensures these overrides participate
> >>> in the
> >>> > > > > > > credential
> >>> > > > > > > > > > > vending process to maintain secure, scoped access.
> >>> > > > > > > > > > >
> >>> > > > > > > > > > > I have also implemented a Proof of Concept (POC)
> pull
> >>> > request
> >>> > > > > to
> >>> > > > > > > > > > > demonstrate the idea. While the current MVP focuses
> >>> on
> >>> > S3, I
> >>> > > > > > intend
> >>> > > > > > > > to
> >>> > > > > > > > > > > expand scope to include Azure and GCS pending
> >>> community
> >>> > > > > feedback.
> >>> > > > > > > > > > >
> >>> > > > > > > > > > > I look forward to your thoughts and suggestions on
> >>> this
> >>> > > > > proposal.
> >>> > > > > > > > > > >
> >>> > > > > > > > > > > Links:
> >>> > > > > > > > > > >
> >>> > > > > > > > > > > - Design Doc: Table-Level Storage Credential Overrides (
> >>> > > > > > > > > > > https://docs.google.com/document/d/1tf4N8GKeyAAYNoP0FQ1zT1Ba3P1nVGgdw3nmnhSm-u0/edit?usp=sharing
> >>> > > > > > > > > > > )
> >>> > > > > > > > > > > - POC PR: https://github.com/apache/polaris/pull/3563
> >>> > > > > > > > > > >
> >>> > > > > > > > > > > Best regards,
> >>> > > > > > > > > > >
> >>> > > > > > > > > > > Srinivas Rishindra Pothireddi
> >>> > > > > > > > > > >
> >>> > > > > > > > > >
> >>> > > > > > > > >
> >>> > > > > > > >
> >>> > > > > > >
> >>> > > > > >
> >>> > > > >
> >>> > > >
> >>> > >
> >>> >
> >>>
> >>
>
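
For reference, the fallback discussed in the quoted thread (a table or
namespace carries only a storage name, and an unset value defers to the
parent) could be sketched roughly as below. This is only an illustration
under stated assumptions, not Polaris code: the Entity class, the
NAMED_STORAGE registry, the registry keys and values, and both function
names are hypothetical and do not come from the PRs mentioned above.

```python
from dataclasses import dataclass
from typing import Dict, Optional


@dataclass
class Entity:
    """Hypothetical stand-in for a catalog, namespace, or table."""
    name: str
    parent: Optional["Entity"] = None
    storage_name: Optional[str] = None  # None means "inherit from parent"


# Hypothetical registry of admin-defined named storage configurations.
# The values hold non-secret settings only; secrets stay elsewhere.
NAMED_STORAGE: Dict[str, Dict[str, str]] = {
    "us-aws": {"provider": "aws", "region": "us-east-1"},
    "eu-gcp": {"provider": "gcp", "region": "europe-west1"},
}


def resolve_storage_name(entity: Entity) -> Optional[str]:
    """Walk table -> namespace -> catalog and return the first name set."""
    node: Optional[Entity] = entity
    while node is not None:
        if node.storage_name is not None:
            return node.storage_name
        node = node.parent
    return None


def effective_storage(entity: Entity,
                      registry: Dict[str, Dict[str, str]]) -> Dict[str, str]:
    """Look up the resolved name in the registry, failing loudly if absent."""
    name = resolve_storage_name(entity)
    if name is None:
        raise LookupError(f"no storage name set on {entity.name} or its parents")
    if name not in registry:
        raise KeyError(f"unknown storage name: {name}")
    return registry[name]


# Example hierarchy: only one table overrides the catalog default.
catalog = Entity("sales", storage_name="us-aws")
namespace = Entity("associate_a", parent=catalog)
table_eu = Entity("account_1", parent=namespace, storage_name="eu-gcp")
table_default = Entity("account_2", parent=namespace)
```

With this shape, rotating credentials would touch only the registry entry,
while tables and namespaces keep referring to it by name.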
