dennishuo commented on code in PR #1506:
URL: https://github.com/apache/polaris/pull/1506#discussion_r2074711395


##########
spec/polaris-management-service.yml:
##########
@@ -938,6 +940,40 @@ components:
           format: password
           description: Bearer token (input-only)
 
+    SigV4AuthenticationParameters:
+      type: object
+      description: AWS Signature Version 4 authentication
+      allOf:
+        - $ref: '#/components/schemas/AuthenticationParameters'
+      properties:
+        roleArn:
+          type: string
+          description: The aws IAM role arn assumed by polaris userArn when 
signing requests
+          example: 
"arn:aws:iam::123456789001:role/role-that-has-remote-catalog-access"
+        roleSessionName:
+          type: string
+          description: The role session name to be used by the SigV4 protocol 
for signing requests
+          example: "polaris-remote-catalog-access"
+        externalId:
+          type: string
+          description: An optional external id used to establish a trust 
relationship with AWS in the trust policy
+          example: "external-id-1234"
+        signingRegion:
+          type: string
+          description: Region to be used by the SigV4 protocol for signing 
requests
+          example: "us-west-2"
+        signingName:
+          type: string
+          description: The service name to be used by the SigV4 protocol for 
signing requests, the default signing name is "execute-api" is if not provided
+          example: "glue"
+        userArn:
+          type: string
+          description: The aws user arn used to assume the aws role, this 
represents the polaris service itself
+          example: "arn:aws:iam::123456789001:user/polaris-service-user"

Review Comment:
   Right, that flow is indeed how the `StorageConfigInfo` works today for 
managed Polaris services that bind their own service-owned IAM User at the time 
of the `CreateCatalog` call; in general the binding of the IAM User to a 
StorageConfig (or a ConnectionConfig) may not be known until the 
`CreateCatalog` actually runs.
   
   I think the problems with trying to do it at step 1 are:
   
   - The binding might be dynamic and transactionally done performed during the 
CreateCatalog
   - There's no separate API to express pre-planning for intending to create a 
Federated Catalog
   
   It's true that if the service provider has very static base IAM Users or if 
the user interaction is very manual (e.g. if it's a small-scale/informal use 
case where an internal "user" DMs the "service provider" on Slack), then the 
userArn can be known already before creating the catalog.
   
   Looks like there was some discussion about this already in 
https://github.com/apache/polaris/pull/1191
   
   I know @XJDKC has proposed having a stronger delineation between 
user-provided and service-provided properties, and I agree it's worth cleaning 
up.
   
   The "not used by Polaris" might just be a shortcoming of the current default 
implementations -- in a way we *should* be enforcing consistency of the userArn 
when setting up the `stsClient` in 
https://github.com/apache/polaris/blob/40aea3ac39ac485a76f3aaa930694eeab8ec14d8/service/common/src/main/java/org/apache/polaris/service/storage/StorageConfiguration.java#L63
 
   
   
       default Supplier<StsClient> stsClientSupplier() {
         return Suppliers.memoize(
             () -> {
               StsClientBuilder stsClientBuilder = StsClient.builder();
               if (awsAccessKey().isPresent() && awsSecretKey().isPresent()) {
                 LoggerFactory.getLogger(StorageConfiguration.class)
                     .warn("Using hard-coded AWS credentials - this is not 
recommended for production");
                 StaticCredentialsProvider awsCredentialsProvider =
                     StaticCredentialsProvider.create(
                         AwsBasicCredentials.create(awsAccessKey().get(), 
awsSecretKey().get()));
                 stsClientBuilder.credentialsProvider(awsCredentialsProvider);
               }
               return stsClientBuilder.build();
             });
       }
   
   
   But the default implementation is just kept simple by basically taking a 
single root IAM User from the runtime environment that ends up being the 
userArn used implicitly for *all* StorageConfigurations in the service.
   
   One step to expand on that architecture in a way that better illustrates how 
the fields actually should be used by the code would be if we allow something 
like setting per-realm root IAM Users or IAM Roles in the Quarkus config for 
being the "service identity" that will `assumeRole` on the *user*-specified 
catalog-level roleArn. Then the stsClient Supplier would be initially 
authenticating as the realm-specific "service identity" that matches what's in 
the Catalog's properties.



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]

Reply via email to