XJDKC commented on code in PR #1506:
URL: https://github.com/apache/polaris/pull/1506#discussion_r2093410457
##########
spec/polaris-management-service.yml:
##########
@@ -938,6 +940,38 @@ components:
format: password
description: Bearer token (input-only)
+ SigV4AuthenticationParameters:
+ type: object
+ description: AWS Signature Version 4 authentication
+ allOf:
+ - $ref: '#/components/schemas/AuthenticationParameters'
+ properties:
+ roleArn:
+ type: string
+ description: The aws IAM role arn assumed by polaris userArn when
signing requests
+ example:
"arn:aws:iam::123456789001:role/role-that-has-remote-catalog-access"
+ roleSessionName:
+ type: string
+ description: The role session name to be used by the SigV4 protocol
for signing requests
+ example: "polaris-remote-catalog-access"
+ externalId:
+ type: string
+ description: An optional external id used to establish a trust
relationship with AWS in the trust policy
+ example: "external-id-1234"
+ signingRegion:
+ type: string
+ description: Region to be used by the SigV4 protocol for signing
requests
+ example: "us-west-2"
+ signingName:
+ type: string
+ description: The service name to be used by the SigV4 protocol for
signing requests, the default signing name is "execute-api" is if not provided
+ example: "glue"
+ serviceIdentity:
+ $ref: '#/components/schemas/ServiceIdentityInfo'
Review Comment:
Yeah, that matches my idea as well! I will update the spec soon.
**This isn't directly related to the spec changes, but I want to clarify how
Polaris retrieves connection credentials at runtime. We can also discuss it in
the follow-up PR!**
The `serviceIdentity` field is intended to be surfaced to Polaris users, for
example, so they can update their IAM role's trust policy to allow Polaris to
assume it. This is the user-facing identity info.
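To make the user-facing side concrete, here is a minimal sketch of the trust policy a user would attach to their role once they know the Polaris identity. The class and method names are hypothetical and not part of Polaris. The policy shape is the usual `sts:AssumeRole` trust statement, with an optional `sts:ExternalId` condition.

```java
import java.util.Optional;

/** Hypothetical helper, not part of Polaris: renders an IAM role trust policy. */
final class TrustPolicySketch {

  /**
   * @param polarisUserArn the identity surfaced to the user via serviceIdentity
   * @param externalId optional external id; adds an sts:ExternalId condition when present
   */
  static String trustPolicy(String polarisUserArn, Optional<String> externalId) {
    // The condition is appended after the Action entry only if an external id was given.
    String condition = externalId
        .map(id -> ",\n      \"Condition\": {\"StringEquals\": {\"sts:ExternalId\": \"" + id + "\"}}")
        .orElse("");
    return "{\n"
        + "  \"Version\": \"2012-10-17\",\n"
        + "  \"Statement\": [\n"
        + "    {\n"
        + "      \"Effect\": \"Allow\",\n"
        + "      \"Principal\": {\"AWS\": \"" + polarisUserArn + "\"},\n"
        + "      \"Action\": \"sts:AssumeRole\"" + condition + "\n"
        + "    }\n"
        + "  ]\n"
        + "}";
  }

  public static void main(String[] args) {
    System.out.println(trustPolicy(
        "arn:aws:iam::111122223333:user/polaris-service-user",
        Optional.of("external-id-1234")));
  }
}
```

The ARN and external id in `main` are just the example values from this thread.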
But in a multi-tenant Polaris deployment, each catalog may be assigned a
different service identity behind the scenes. Polaris needs to know where to
retrieve the actual credentials tied to that identity, e.g., AWS user
credentials stored in Secrets Manager.
So how does Polaris know where to find those credentials?
For storage config, we already follow this pattern. There's a hidden field
in `internalProperties` called
[storage_integration_identifier](https://github.com/apache/polaris/blob/apache-polaris-0.10.0-beta-incubating-rc2/polaris-core/src/main/java/org/apache/polaris/core/entity/PolarisEntityConstants.java#L49-L50),
which points to the service identity's credentials. Polaris loads a
`PolarisStorageIntegration` based on this value and uses it to retrieve
credentials and generate subscoped credentials as needed: [example
here](https://github.com/apache/polaris/blob/apache-polaris-0.10.0-beta-incubating-rc2/polaris-core/src/main/java/org/apache/polaris/core/persistence/transactional/TransactionalMetaStoreManagerImpl.java#L2002-L2012).
**Option 1: We could apply the same pattern for connection config:**
* Use `serviceIdentity` to expose the Polaris identity to the user
* Use a hidden field (e.g., `connection_integration_identifier`) to locate the
actual credentials Polaris needs to access external services
At runtime, Polaris would use that identifier to fetch the real credential
(e.g. from a secret manager), then assume the user-provided IAM role and
generate temporary, subscoped credentials.
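A rough sketch of that runtime flow, just to illustrate the idea. Everything here is an assumption: `SecretStore` and `RoleAssumer` are hypothetical stand-ins for a secret manager lookup and an STS AssumeRole call, and none of these types exist in Polaris today.

```java
import java.util.Map;

/** Hypothetical sketch of Option 1; none of these types exist in Polaris. */
final class ConnectionCredentialSketch {

  /** Long-lived credential of the Polaris service identity (e.g. an IAM user key pair). */
  record ServiceCredential(String accessKeyId, String secretAccessKey) {}

  /** Short-lived credential obtained by assuming the user-provided role. */
  record TemporaryCredential(String accessKeyId, String secretAccessKey, String sessionToken) {}

  /** Stand-in for a secret manager lookup keyed by the hidden identifier. */
  interface SecretStore {
    ServiceCredential load(String identifier);
  }

  /** Stand-in for an STS AssumeRole call made with the service credential. */
  interface RoleAssumer {
    TemporaryCredential assume(
        ServiceCredential caller, String roleArn, String roleSessionName, String externalId);
  }

  /** Assumed name of the hidden field in internalProperties. */
  static final String HIDDEN_KEY = "connection_integration_identifier";

  static TemporaryCredential resolve(
      Map<String, String> internalProperties,
      String roleArn,
      String roleSessionName,
      String externalId,
      SecretStore secrets,
      RoleAssumer sts) {
    // Step 1: read the hidden identifier that points at the service identity's credential.
    String identifier = internalProperties.get(HIDDEN_KEY);
    if (identifier == null) {
      throw new IllegalStateException("missing " + HIDDEN_KEY + " in internalProperties");
    }
    // Step 2: fetch the real credential from the secret store.
    ServiceCredential service = secrets.load(identifier);
    // Step 3: assume the user-provided role to get temporary credentials.
    return sts.assume(service, roleArn, roleSessionName, externalId);
  }
}
```

Keeping the lookup and the role assumption behind two small interfaces means either one can be swapped per deployment without touching the entity model.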
**Option 2: Use a reference field in the persistence model
`ServiceIdentityDpo`**
Alternatively, instead of introducing another hidden sibling field like
`connection_integration_identifier`, we could embed a reference directly inside
the service identity model (`ServiceIdentityDpo`), similar to how
`UserSecretReference`
([OAuthClientCredentialsParametersDpo.java#L64-L65](https://github.com/apache/polaris/blob/apache-polaris-0.10.0-beta-incubating-rc2/polaris-core/src/main/java/org/apache/polaris/core/connection/OAuthClientCredentialsParametersDpo.java#L64-L65))
works.
For example, we could add a `serviceInfoAuthLocatorUrn` field to the
internal `ServiceIdentityDpo`. **The spec would remain unchanged, but the
persistence model would carry the locator used to find the backing
credentials.**
```
{
"name": "aws-catalog",
"connectionConfigInfo": {
"authenticationParameters": { /* ... */ },
"serviceIdentity": {
"type": "AWS_IAM",
"serviceArn": "arn:aws:iam::111122223333:user/polaris-service-user",
"serviceInfoAuthLocatorUrn":
"urn:polaris-service-identities:realm-id:catalog-id:connection"
}
},
"storageConfigInfo": {
"serviceIdentity": { /* ... */ }
}
}
```
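As a sketch of how the persistence side could look under this option: the internal model keeps the locator, while the API view drops it. The record shapes and the `realm:catalog:usage` reading of the URN are assumptions inferred from the example above, not an agreed format.

```java
/** Hypothetical sketch of Option 2; record shapes and URN layout are assumptions. */
final class ServiceIdentityLocatorSketch {

  /** Parsed form of the locator URN, assuming a realm:catalog:usage layout. */
  record Locator(String realmId, String catalogId, String usage) {}

  /** User-facing view: no locator, so the spec stays unchanged. */
  record ServiceIdentityInfo(String type, String serviceArn) {}

  /** Internal persistence view: carries the locator for the backing credentials. */
  record ServiceIdentityDpo(String type, String serviceArn, String serviceInfoAuthLocatorUrn) {
    ServiceIdentityInfo toApi() {
      return new ServiceIdentityInfo(type, serviceArn);
    }
  }

  static final String PREFIX = "urn:polaris-service-identities:";

  static Locator parse(String urn) {
    if (urn == null || !urn.startsWith(PREFIX)) {
      throw new IllegalArgumentException("not a service identity locator: " + urn);
    }
    // Limit -1 keeps trailing empty segments so they are rejected below.
    String[] parts = urn.substring(PREFIX.length()).split(":", -1);
    if (parts.length != 3) {
      throw new IllegalArgumentException("expected realm:catalog:usage, got: " + urn);
    }
    for (String part : parts) {
      if (part.isEmpty()) {
        throw new IllegalArgumentException("empty segment in: " + urn);
      }
    }
    return new Locator(parts[0], parts[1], parts[2]);
  }
}
```

A colon-delimited layout like this breaks if realm or catalog ids can themselves contain `:`, so the real format would need an escaping rule or a different separator.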
Personally, I would prefer option 2. WDYT?
cc: @dimas-b @adutra @dennishuo