Re: [PR] AWS: Integrate S3 analytics accelerator library [iceberg]

2025-03-12 Thread via GitHub
danielcweeks commented on PR #12299: URL: https://github.com/apache/iceberg/pull/12299#issuecomment-2718328147 @SanjayMarreddi and @jackye1995 This PR added a new dependency, but it doesn't look like the LICENSE file was updated. This needs to be taken care of before the next release (cc @

Re: [PR] AWS: Integrate S3 analytics accelerator library [iceberg]

2025-03-11 Thread via GitHub
jackye1995 merged PR #12299: URL: https://github.com/apache/iceberg/pull/12299 -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: issues-unsubscr...@iceber

Re: [PR] AWS: Integrate S3 analytics accelerator library [iceberg]

2025-03-11 Thread via GitHub
jackye1995 commented on PR #12299: URL: https://github.com/apache/iceberg/pull/12299#issuecomment-2716569441 Looks like all comments are addressed, thanks @SanjayMarreddi for all the work! Let us know when you have the follow up PRs for async client configs and doc update! -- This is an

Re: [PR] AWS: Integrate S3 analytics accelerator library [iceberg]

2025-03-11 Thread via GitHub
jackye1995 commented on code in PR #12299: URL: https://github.com/apache/iceberg/pull/12299#discussion_r1990534076 ## aws/src/main/java/org/apache/iceberg/aws/s3/S3FileIO.java: ## @@ -121,23 +136,49 @@ public S3FileIO(SerializableSupplier s3) { * @param s3FileIOProperties S

Re: [PR] AWS: Integrate S3 analytics accelerator library [iceberg]

2025-03-11 Thread via GitHub
jackye1995 commented on code in PR #12299: URL: https://github.com/apache/iceberg/pull/12299#discussion_r1990522002 ## aws/src/main/java/org/apache/iceberg/aws/s3/S3InputFile.java: ## @@ -36,20 +37,53 @@ public static S3InputFile fromLocation( MetricsContext metrics) {

Re: [PR] AWS: Integrate S3 analytics accelerator library [iceberg]

2025-03-11 Thread via GitHub
HonahX commented on code in PR #12299: URL: https://github.com/apache/iceberg/pull/12299#discussion_r1990498449 ## aws/src/main/java/org/apache/iceberg/aws/s3/S3InputFile.java: ## @@ -36,20 +37,53 @@ public static S3InputFile fromLocation( MetricsContext metrics) { r

Re: [PR] AWS: Integrate S3 analytics accelerator library [iceberg]

2025-03-11 Thread via GitHub
jackye1995 commented on PR #12299: URL: https://github.com/apache/iceberg/pull/12299#issuecomment-2715732944 @HonahX could you take a look? Given the fact that we plan to refactor the HTTPClientProperties and other related classes as the next step, it's probably good for you to take a look

Re: [PR] AWS: Integrate S3 analytics accelerator library [iceberg]

2025-03-11 Thread via GitHub
kevinjqliu commented on code in PR #12299: URL: https://github.com/apache/iceberg/pull/12299#discussion_r1990064772 ## aws/src/main/java/org/apache/iceberg/aws/s3/S3OutputFile.java: ## @@ -75,7 +95,7 @@ public PositionOutputStream createOrOverwrite() { @Override public I

Re: [PR] AWS: Integrate S3 analytics accelerator library [iceberg]

2025-03-11 Thread via GitHub
jackye1995 commented on PR #12299: URL: https://github.com/apache/iceberg/pull/12299#issuecomment-2715593232 > Are there plans to replace the current s3 client with the async client? Maybe after many versions, once we have enough confidence that it is stable. But probably not in the s

Re: [PR] AWS: Integrate S3 analytics accelerator library [iceberg]

2025-03-11 Thread via GitHub
jackye1995 commented on code in PR #12299: URL: https://github.com/apache/iceberg/pull/12299#discussion_r1990058529 ## aws/src/main/java/org/apache/iceberg/aws/s3/S3OutputFile.java: ## @@ -75,7 +95,7 @@ public PositionOutputStream createOrOverwrite() { @Override public I

Re: [PR] AWS: Integrate S3 analytics accelerator library [iceberg]

2025-03-11 Thread via GitHub
jackye1995 commented on PR #12299: URL: https://github.com/apache/iceberg/pull/12299#issuecomment-2715582900 > It would also be great to outline the migration path going forward. Yes, I think in general there is data point supporting using async client & CRT client makes the performa

Re: [PR] AWS: Integrate S3 analytics accelerator library [iceberg]

2025-03-11 Thread via GitHub
kevinjqliu commented on code in PR #12299: URL: https://github.com/apache/iceberg/pull/12299#discussion_r1990053279 ## aws/src/main/java/org/apache/iceberg/aws/s3/S3OutputFile.java: ## @@ -75,7 +95,7 @@ public PositionOutputStream createOrOverwrite() { @Override public I

Re: [PR] AWS: Integrate S3 analytics accelerator library [iceberg]

2025-03-11 Thread via GitHub
kevinjqliu commented on PR #12299: URL: https://github.com/apache/iceberg/pull/12299#issuecomment-2715574083 I verified that the async client should only affect S3 FileIO when the feature flag is enabled. `s3Async()` is the factory function that returns a `S3AsyncClient`. It is ca

Re: [PR] AWS: Integrate S3 analytics accelerator library [iceberg]

2025-03-11 Thread via GitHub
SanjayMarreddi commented on code in PR #12299: URL: https://github.com/apache/iceberg/pull/12299#discussion_r1989998534 ## aws/src/main/java/org/apache/iceberg/aws/s3/S3FileIOProperties.java: ## @@ -72,6 +72,32 @@ public class S3FileIOProperties implements Serializable { pu

Re: [PR] AWS: Integrate S3 analytics accelerator library [iceberg]

2025-03-11 Thread via GitHub
kevinjqliu commented on code in PR #12299: URL: https://github.com/apache/iceberg/pull/12299#discussion_r1990004839 ## aws/src/integration/java/org/apache/iceberg/aws/s3/TestS3FileIOIntegration.java: ## @@ -255,6 +256,48 @@ public void testNewInputStreamWithMultiRegionAccessPoin

Re: [PR] AWS: Integrate S3 analytics accelerator library [iceberg]

2025-03-11 Thread via GitHub
kevinjqliu commented on PR #12299: URL: https://github.com/apache/iceberg/pull/12299#issuecomment-2715494642 > If you are using the normal code path today with the feature off, with all the separated code paths, you should not be affected at all. yea looking at `aws/src/main/java/org/

Re: [PR] AWS: Integrate S3 analytics accelerator library [iceberg]

2025-03-11 Thread via GitHub
SanjayMarreddi commented on code in PR #12299: URL: https://github.com/apache/iceberg/pull/12299#discussion_r1989992832 ## aws/src/integration/java/org/apache/iceberg/aws/s3/TestS3FileIOIntegration.java: ## @@ -255,6 +256,48 @@ public void testNewInputStreamWithMultiRegionAccess

Re: [PR] AWS: Integrate S3 analytics accelerator library [iceberg]

2025-03-11 Thread via GitHub
SanjayMarreddi commented on code in PR #12299: URL: https://github.com/apache/iceberg/pull/12299#discussion_r1989988680 ## kafka-connect/build.gradle: ## Review Comment: Yeah sure, noted. Thanks -- This is an automated message from the Apache Git Service. To respond to

Re: [PR] AWS: Integrate S3 analytics accelerator library [iceberg]

2025-03-11 Thread via GitHub
SanjayMarreddi commented on code in PR #12299: URL: https://github.com/apache/iceberg/pull/12299#discussion_r1989988126 ## aws/src/main/java/org/apache/iceberg/aws/AwsClientFactories.java: ## @@ -118,6 +119,14 @@ public S3Client s3() { .build(); } +@Overrid

Re: [PR] AWS: Integrate S3 analytics accelerator library [iceberg]

2025-03-11 Thread via GitHub
jackye1995 commented on code in PR #12299: URL: https://github.com/apache/iceberg/pull/12299#discussion_r1989985092 ## kafka-connect/build.gradle: ## Review Comment: We are not adding this to the aws-bundle yet, so it should be fine, but @SanjayMarreddi we should probably

Re: [PR] AWS: Integrate S3 analytics accelerator library [iceberg]

2025-03-11 Thread via GitHub
jackye1995 commented on PR #12299: URL: https://github.com/apache/iceberg/pull/12299#issuecomment-2715451404 > Have you posted this on the iceberg devlist? Not really, I did not really expect it to be a community discussion since this is a very vendor specific integration for S3 (alth

Re: [PR] AWS: Integrate S3 analytics accelerator library [iceberg]

2025-03-11 Thread via GitHub
kevinjqliu commented on code in PR #12299: URL: https://github.com/apache/iceberg/pull/12299#discussion_r1989925583 ## gradle/libs.versions.toml: ## @@ -22,6 +22,7 @@ [versions] activation = "1.1.1" aliyun-sdk-oss = "3.10.2" +analyticsaccelerator = "1.0.0" Review Comment:

Re: [PR] AWS: Integrate S3 analytics accelerator library [iceberg]

2025-03-11 Thread via GitHub
jackye1995 commented on code in PR #12299: URL: https://github.com/apache/iceberg/pull/12299#discussion_r1989966659 ## aws/src/main/java/org/apache/iceberg/aws/AwsClientFactories.java: ## @@ -118,6 +119,14 @@ public S3Client s3() { .build(); } +@Override +

Re: [PR] AWS: Integrate S3 analytics accelerator library [iceberg]

2025-03-11 Thread via GitHub
geruh commented on code in PR #12299: URL: https://github.com/apache/iceberg/pull/12299#discussion_r1989952400 ## aws/src/main/java/org/apache/iceberg/aws/AwsClientFactories.java: ## @@ -118,6 +119,14 @@ public S3Client s3() { .build(); } +@Override +pu

Re: [PR] AWS: Integrate S3 analytics accelerator library [iceberg]

2025-02-28 Thread via GitHub
SanjayMarreddi commented on code in PR #12299: URL: https://github.com/apache/iceberg/pull/12299#discussion_r1975861782 ## aws/src/main/java/org/apache/iceberg/aws/s3/S3FileIOProperties.java: ## @@ -640,12 +663,21 @@ public S3FileIOProperties(Map properties) { prop

Re: [PR] AWS: Integrate S3 analytics accelerator library [iceberg]

2025-02-28 Thread via GitHub
SanjayMarreddi commented on code in PR #12299: URL: https://github.com/apache/iceberg/pull/12299#discussion_r1975440490 ## aws/src/main/java/org/apache/iceberg/aws/s3/S3URI.java: ## @@ -31,7 +31,7 @@ * * Note: Path-style access is deprecated and not supported by this implem

Re: [PR] AWS: Integrate S3 analytics accelerator library [iceberg]

2025-02-27 Thread via GitHub
jackye1995 commented on code in PR #12299: URL: https://github.com/apache/iceberg/pull/12299#discussion_r1974106727 ## aws/src/main/java/org/apache/iceberg/aws/s3/S3FileIOProperties.java: ## @@ -640,12 +663,21 @@ public S3FileIOProperties(Map properties) { properti

Re: [PR] AWS: Integrate S3 analytics accelerator library [iceberg]

2025-02-27 Thread via GitHub
jackye1995 commented on code in PR #12299: URL: https://github.com/apache/iceberg/pull/12299#discussion_r1974106727 ## aws/src/main/java/org/apache/iceberg/aws/s3/S3FileIOProperties.java: ## @@ -640,12 +663,21 @@ public S3FileIOProperties(Map properties) { properti

Re: [PR] AWS: Integrate S3 analytics accelerator library [iceberg]

2025-02-27 Thread via GitHub
jackye1995 commented on code in PR #12299: URL: https://github.com/apache/iceberg/pull/12299#discussion_r1974098939 ## aws/src/main/java/org/apache/iceberg/aws/s3/S3URI.java: ## @@ -31,7 +31,7 @@ * * Note: Path-style access is deprecated and not supported by this implementa

Re: [PR] AWS: Integrate S3 analytics accelerator library [iceberg]

2025-02-25 Thread via GitHub
SanjayMarreddi commented on code in PR #12299: URL: https://github.com/apache/iceberg/pull/12299#discussion_r1970198866 ## aws/src/main/java/org/apache/iceberg/aws/s3/S3FileIOProperties.java: ## @@ -72,6 +72,25 @@ public class S3FileIOProperties implements Serializable { pu

Re: [PR] AWS: Integrate S3 analytics accelerator library [iceberg]

2025-02-24 Thread via GitHub
SanjayMarreddi commented on code in PR #12299: URL: https://github.com/apache/iceberg/pull/12299#discussion_r1968300773 ## aws/src/main/java/org/apache/iceberg/aws/s3/S3FileIOProperties.java: ## @@ -640,12 +663,21 @@ public S3FileIOProperties(Map properties) { prop

Re: [PR] AWS: Integrate S3 analytics accelerator library [iceberg]

2025-02-24 Thread via GitHub
SanjayMarreddi commented on code in PR #12299: URL: https://github.com/apache/iceberg/pull/12299#discussion_r1968199157 ## aws/src/main/java/org/apache/iceberg/aws/s3/S3InputFile.java: ## @@ -82,6 +97,27 @@ public long getLength() { @Override public SeekableInputStream n

Re: [PR] AWS: Integrate S3 analytics accelerator library [iceberg]

2025-02-24 Thread via GitHub
SanjayMarreddi commented on code in PR #12299: URL: https://github.com/apache/iceberg/pull/12299#discussion_r1968093949 ## aws/src/main/java/org/apache/iceberg/aws/s3/analyticsaccelerator/S3SeekableInputStreamFactorySupplier.java: ## @@ -0,0 +1,56 @@ +/* + * Licensed to the Apac

Re: [PR] AWS: Integrate S3 analytics accelerator library [iceberg]

2025-02-20 Thread via GitHub
SanjayMarreddi commented on code in PR #12299: URL: https://github.com/apache/iceberg/pull/12299#discussion_r1964165208 ## aws/src/main/java/org/apache/iceberg/aws/s3/S3InputFile.java: ## @@ -82,6 +97,27 @@ public long getLength() { @Override public SeekableInputStream n

Re: [PR] AWS: Integrate S3 analytics accelerator library [iceberg]

2025-02-19 Thread via GitHub
jackye1995 commented on code in PR #12299: URL: https://github.com/apache/iceberg/pull/12299#discussion_r1962198159 ## aws/src/main/java/org/apache/iceberg/aws/s3/S3InputFile.java: ## @@ -82,6 +97,27 @@ public long getLength() { @Override public SeekableInputStream newSt

Re: [PR] AWS: Integrate S3 analytics accelerator library [iceberg]

2025-02-19 Thread via GitHub
jackye1995 commented on code in PR #12299: URL: https://github.com/apache/iceberg/pull/12299#discussion_r1962195167 ## aws/src/main/java/org/apache/iceberg/aws/s3/S3InputFile.java: ## @@ -82,6 +97,27 @@ public long getLength() { @Override public SeekableInputStream newSt

Re: [PR] AWS: Integrate S3 analytics accelerator library [iceberg]

2025-02-19 Thread via GitHub
jackye1995 commented on code in PR #12299: URL: https://github.com/apache/iceberg/pull/12299#discussion_r1962192776 ## aws/src/main/java/org/apache/iceberg/aws/s3/S3FileIOProperties.java: ## @@ -72,6 +72,16 @@ public class S3FileIOProperties implements Serializable { public

Re: [PR] AWS: Integrate S3 analytics accelerator library [iceberg]

2025-02-19 Thread via GitHub
jackye1995 commented on code in PR #12299: URL: https://github.com/apache/iceberg/pull/12299#discussion_r1962192384 ## aws/src/main/java/org/apache/iceberg/aws/s3/S3FileIOProperties.java: ## @@ -640,6 +654,11 @@ public S3FileIOProperties(Map properties) { properties

Re: [PR] AWS: Integrate S3 analytics accelerator library [iceberg]

2025-02-19 Thread via GitHub
jackye1995 commented on code in PR #12299: URL: https://github.com/apache/iceberg/pull/12299#discussion_r1962191718 ## aws/src/main/java/org/apache/iceberg/aws/s3/S3FileIOProperties.java: ## @@ -754,6 +773,14 @@ public boolean isRemoteSigningEnabled() { return this.isRemote

Re: [PR] AWS: Integrate S3 analytics accelerator library [iceberg]

2025-02-19 Thread via GitHub
jackye1995 commented on code in PR #12299: URL: https://github.com/apache/iceberg/pull/12299#discussion_r1962190553 ## aws/src/main/java/org/apache/iceberg/aws/s3/S3FileIOProperties.java: ## @@ -640,6 +654,11 @@ public S3FileIOProperties(Map properties) { properties

Re: [PR] AWS: Integrate S3 analytics accelerator library [iceberg]

2025-02-19 Thread via GitHub
SanjayMarreddi commented on PR #12299: URL: https://github.com/apache/iceberg/pull/12299#issuecomment-2668573039 Hey @geruh, thanks for the review! - Yes, we will be adding some integration tests for it. - We have some internal benchmark results, we need some more time to validate the

Re: [PR] AWS: Integrate S3 analytics accelerator library [iceberg]

2025-02-18 Thread via GitHub
geruh commented on code in PR #12299: URL: https://github.com/apache/iceberg/pull/12299#discussion_r1960726499 ## aws/src/main/java/org/apache/iceberg/aws/s3/analyticsaccelerator/S3SeekableInputStreamFactorySupplier.java: ## @@ -0,0 +1,56 @@ +/* + * Licensed to the Apache Softwa

Re: [PR] AWS: Integrate S3 analytics accelerator library [iceberg]

2025-02-18 Thread via GitHub
geruh commented on PR #12299: URL: https://github.com/apache/iceberg/pull/12299#issuecomment-2667150596 Hey @SanjayMarreddi, I think this is a cool addition!! I was wondering if your team has any benchmark numbers or performance metrics for this? Meanwhile I'll ping some others to ge

Re: [PR] AWS: Integrate S3 analytics accelerator library [iceberg]

2025-02-18 Thread via GitHub
geruh commented on code in PR #12299: URL: https://github.com/apache/iceberg/pull/12299#discussion_r1960726499 ## aws/src/main/java/org/apache/iceberg/aws/s3/analyticsaccelerator/S3SeekableInputStreamFactorySupplier.java: ## @@ -0,0 +1,56 @@ +/* + * Licensed to the Apache Softwa

Re: [PR] AWS: Integrate S3 analytics accelerator library [iceberg]

2025-02-18 Thread via GitHub
geruh commented on code in PR #12299: URL: https://github.com/apache/iceberg/pull/12299#discussion_r1960724511 ## aws/src/main/java/org/apache/iceberg/aws/s3/analyticsaccelerator/S3SeekableInputStreamFactorySupplier.java: ## @@ -0,0 +1,56 @@ +/* + * Licensed to the Apache Softwa

Re: [PR] AWS: Integrate S3 analytics accelerator library [iceberg]

2025-02-18 Thread via GitHub
geruh commented on code in PR #12299: URL: https://github.com/apache/iceberg/pull/12299#discussion_r1960709819 ## build.gradle: ## @@ -470,6 +470,8 @@ project(':iceberg-aws') { compileOnly("software.amazon.awssdk:dynamodb") compileOnly("software.amazon.awssdk:lakeforma

Re: [PR] AWS: Integrate S3 analytics accelerator library [iceberg]

2025-02-18 Thread via GitHub
SanjayMarreddi commented on code in PR #12299: URL: https://github.com/apache/iceberg/pull/12299#discussion_r1959792119 ## aws/src/main/java/org/apache/iceberg/aws/s3/analyticsaccelerator/S3SeekableInputStreamFactorySupplier.java: ## @@ -0,0 +1,56 @@ +/* + * Licensed to the Apac

[PR] AWS: Integrate S3 analytics accelerator library [iceberg]

2025-02-17 Thread via GitHub
SanjayMarreddi opened a new pull request, #12299: URL: https://github.com/apache/iceberg/pull/12299 Description - This is an initial PR for integrating [analytics-accelerator-s3](https://github.com/awslabs/analytics-accelerator-s3) into `iceberg/aws` - In short, our library a