singhpk234 commented on code in PR #12799: URL: https://github.com/apache/iceberg/pull/12799#discussion_r2047326144
########## aws/src/main/java/org/apache/iceberg/aws/s3/S3FileIO.java: ########## @@ -98,14 +100,12 @@ public class S3FileIO private String credential = null; private SerializableSupplier<S3Client> s3; private SerializableSupplier<S3AsyncClient> s3Async; - private S3FileIOProperties s3FileIOProperties; private SerializableMap<String, String> properties = null; - private transient volatile S3Client client; - private transient volatile S3AsyncClient asyncClient; private MetricsContext metrics = MetricsContext.nullMetrics(); private final AtomicBoolean isResourceClosed = new AtomicBoolean(false); private transient StackTraceElement[] createStack; private List<StorageCredential> storageCredentials = ImmutableList.of(); + private transient volatile Cache<String, PrefixedS3Client> s3ClientCache; Review Comment: Here is what i had in mind and have seen people doing in past 1. Execution interceptor can be configured as part of ClientOverrideConfiguration when we are creating a S3 SDK client, I am assuming we wire the [VendedCredProvider](https://github.com/apache/iceberg/blob/main/aws/src/main/java/org/apache/iceberg/aws/s3/VendedCredentialsProvider.java) for the same way but giving the creds map to the class constructor, so we can do the same way here for the ExecutionInterceptor and pass it in SDK client creation 2. Now how to provide / have refreshed creds is this `ExecutionInterceptor` will give us the handle of modify request and you can meddle with SDK header something like this ``` @Override public SdkHttpRequest modifyHttpRequest(Context.BeforeTransmission context, ExecutionAttributes executionAttributes) { // you can also inspect from the request which path it has in this request // and then based on the path you can wire in the creds which suits best for the path here : SdkHttpRequest.Builder builder = context.httpRequest().toBuilder(); AwsCredentials credentials = credentialsProvider.get(); if (credentials != null && credentials.accessKeyId() != null && credentials.secretAccessKey() != null) { builder.putHeader("X-Amz-Access-Key", credentials.accessKeyId()); builder.putHeader("X-Amz-Secret-Key", credentials.secretAccessKey()); if (credentials.sessionToken().isPresent()) { builder.putHeader("X-Amz-Security-Token", credentials.sessionToken().get()); } } else { System.err.println("Warning: AWS credentials not found."); } return builder.build(); } ``` ----- The main reason for not having one client per path is each client will have a connection pool and we may have to tune it like what is the min pool size for HTTP, if we can by gaurantee it will < 5 (which based on my understanding we can't right now) i think then we can think more. Would love to know your thoughts for the same considering above -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: issues-unsubscr...@iceberg.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org --------------------------------------------------------------------- To unsubscribe, e-mail: issues-unsubscr...@iceberg.apache.org For additional commands, e-mail: issues-h...@iceberg.apache.org