XJDKC commented on code in PR #1506:
URL: https://github.com/apache/polaris/pull/1506#discussion_r2087269395


##########
spec/polaris-management-service.yml:
##########
@@ -938,6 +940,38 @@ components:
           format: password
           description: Bearer token (input-only)
 
+    SigV4AuthenticationParameters:
+      type: object
+      description: AWS Signature Version 4 authentication
+      allOf:
+        - $ref: '#/components/schemas/AuthenticationParameters'
+      properties:
+        roleArn:
+          type: string
+          description: The aws IAM role arn assumed by polaris userArn when signing requests
+          example: "arn:aws:iam::123456789001:role/role-that-has-remote-catalog-access"
+        roleSessionName:
+          type: string
+          description: The role session name to be used by the SigV4 protocol for signing requests
+          example: "polaris-remote-catalog-access"
+        externalId:
+          type: string
+          description: An optional external id used to establish a trust relationship with AWS in the trust policy
+          example: "external-id-1234"
+        signingRegion:
+          type: string
+          description: Region to be used by the SigV4 protocol for signing requests
+          example: "us-west-2"
+        signingName:
+          type: string
+          description: The service name to be used by the SigV4 protocol for signing requests, the default signing name is "execute-api" is if not provided
+          example: "glue"
+        serviceIdentity:
+          $ref: '#/components/schemas/ServiceIdentityInfo'

Review Comment:
   Hey Dmitri, I discussed the proposal of adding a separate set of management 
APIs for handling service identity info in my design doc (topic 2, proposal 3): 
[Apache Polaris Creds Management 
Proposal](https://docs.google.com/document/d/1MAW87DtyHWPPNIEkUCRVUKBGjhh5bPn0GbtV7fifm30/edit?usp=sharing)
   
   Here are the pros and cons:
   * Pros
     * **Clean separation of concerns**: Identity info isn’t mixed into catalog 
or storage config anymore.
     * **Simplifies catalog schema**: Keeps it focused strictly on user input.
     * **More extensible**: We can evolve the service identity model without 
touching the catalog schema or breaking clients.
     * **Flexible**: Works well for both self-managed and SaaS-style 
deployments.
   * Cons
     * **Requires extra coordination**: Users must make an additional API call 
to fetch identity info and coordinate across two APIs for full setup.
     * **Less cohesive**: Identity context is no longer colocated with the 
catalog entity. This assumes **Polaris uses the same identity across all 
catalogs in a given realm**.
     * **Not aligned with current behavior**: Today, identity fields are 
visible in the catalog response; this approach would change that.
     * **Dynamic fields challenge**: Some service-managed fields like 
consentUrl depend on user-provided input (e.g., Azure tenant ID) and can’t be 
precomputed globally; they need to be generated after Polaris receives the 
config.
   
   Also, some vendors may want to use different service identities to 
access different external services.
   **e.g., use SigV4 auth to access Glue or Amazon API Gateway (hosting Polaris 
behind the API Gateway) while the table is stored in Azure Blob Storage (or 
S3-compatible storage)**.
   
   In that case, would it be better to include serviceIdentityInfo as a 
top-level field in both the storage config and connection config? That way, the 
scope is clearly limited to the relevant config, reducing the blast radius and 
keeping things more modular.
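   
   A rough sketch of what that could look like in the spec. This is only an 
illustration, not part of this PR: the schema names `ConnectionConfigInfo` and 
`StorageConfigInfo` and the property layout are assumptions, and only 
`ServiceIdentityInfo` and `AuthenticationParameters` are taken from the diff above.
   
   ```yaml
   # Hypothetical sketch: serviceIdentity as a top-level field on each config.
   components:
     schemas:
       ConnectionConfigInfo:          # assumed name for the connection config schema
         type: object
         properties:
           authenticationParameters:
             $ref: '#/components/schemas/AuthenticationParameters'
           serviceIdentity:           # identity used to reach the remote catalog
             $ref: '#/components/schemas/ServiceIdentityInfo'
       StorageConfigInfo:             # assumed name for the storage config schema
         type: object
         properties:
           serviceIdentity:           # identity used to reach table storage
             $ref: '#/components/schemas/ServiceIdentityInfo'
   ```
   
   With this shape, each config carries its own identity, so a catalog could use 
one identity for the connection and another for storage.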


