Re: [PR] AWS: Support multiple storage credential prefixes [iceberg]

via GitHub Wed, 16 Apr 2025 09:51:42 -0700


singhpk234 commented on code in PR #12799:
URL: https://github.com/apache/iceberg/pull/12799#discussion_r2047326144



##########
aws/src/main/java/org/apache/iceberg/aws/s3/S3FileIO.java:
##########
@@ -98,14 +100,12 @@ public class S3FileIO
   private String credential = null;
   private SerializableSupplier<S3Client> s3;
   private SerializableSupplier<S3AsyncClient> s3Async;
-  private S3FileIOProperties s3FileIOProperties;
   private SerializableMap<String, String> properties = null;
-  private transient volatile S3Client client;
-  private transient volatile S3AsyncClient asyncClient;
   private MetricsContext metrics = MetricsContext.nullMetrics();
   private final AtomicBoolean isResourceClosed = new AtomicBoolean(false);
   private transient StackTraceElement[] createStack;
   private List<StorageCredential> storageCredentials = ImmutableList.of();
+  private transient volatile Cache<String, PrefixedS3Client> s3ClientCache;

Review Comment:
   Here is what i had in mind and have seen people doing in past 
   1. Execution interceptor can be configured as part of 
ClientOverrideConfiguration when we are creating a S3 SDK client, I am assuming 
we wire the 
[VendedCredProvider](https://github.com/apache/iceberg/blob/main/aws/src/main/java/org/apache/iceberg/aws/s3/VendedCredentialsProvider.java)
 for the same way but giving the creds map to the class constructor, so we can 
do the same way here for the ExecutionInterceptor and pass it in SDK client 
creation 
   2. Now how to provide / have  refreshed creds is this `ExecutionInterceptor` 
will give us the handle of modify request and you can meddle with SDK header 
   something like this 
   ```
       @Override
       public SdkHttpRequest modifyHttpRequest(Context.BeforeTransmission 
context, ExecutionAttributes executionAttributes) {
           // you can also inspect from the request which path it has in this 
request  
           // and then based on the path you can wire in the creds which suits 
best for the path here :
           SdkHttpRequest.Builder builder = context.httpRequest().toBuilder();
           AwsCredentials credentials = credentialsProvider.get();
   
           if (credentials != null && credentials.accessKeyId() != null && 
credentials.secretAccessKey() != null) {
               builder.putHeader("X-Amz-Access-Key", credentials.accessKeyId());
               builder.putHeader("X-Amz-Secret-Key", 
credentials.secretAccessKey());
               if (credentials.sessionToken().isPresent()) {
                   builder.putHeader("X-Amz-Security-Token", 
credentials.sessionToken().get());
               }
           } else {
               System.err.println("Warning: AWS credentials not found.");
           }
           return builder.build();
       }
    ```
    
    ----- 
    
    The main reason for not having one client per path is each client will have 
a connection pool and we may have to tune it like what is the min pool size for 
HTTP, if we can by gaurantee it will < 5 (which based on my understanding we 
can't right now) i think then we can think more. 
    
    Would love to know your thoughts for the same considering above 



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@iceberg.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscr...@iceberg.apache.org
For additional commands, e-mail: issues-h...@iceberg.apache.org

Re: [PR] AWS: Support multiple storage credential prefixes [iceberg]

Reply via email to