[ 
https://issues.apache.org/jira/browse/HADOOP-17197?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17174996#comment-17174996
 ] 

Steve Loughran commented on HADOOP-17197:
-----------------------------------------

-1. no. sorry.

Yes, it's weighty, but it has allowed us to support things like STS token
issuance (delegation tokens) without forcing all applications to add new jars
to their classpath. It also guarantees consistency between the core libraries
(say, aws sdk core) and modules (spark-kinesis) which want to talk to the other
services.

bq. The aws-java-sdk-bundle jar file is shaded as well, so it includes all 
transitive dependencies.

This is by design. We switched to the integrated shaded JAR as we were fed up
of dealing with problems related to JSON parser versions, incompatible
httpclient versions (HADOOP-13044), joda time and versions of the JDK, etc. etc.
The big shaded JAR avoids all these problems and delegates the task of
guaranteeing a consistent set of dependencies to the AWS team.

Qualifying an AWS SDK is often hard enough:
https://hadoop.apache.org/docs/current/hadoop-aws/tools/hadoop-aws/testing.html#Qualifying_an_AWS_SDK_Update

You don't get to see it in Impala, but we have to deal with shaded JARs still
exporting things, error messages telling you off when you call abort(), a
warning message every time you open a file about no metrics being passed in,
etc. etc. There is no way I'd consider making things harder than they already
are.

> Decrease size of s3a dependencies
> ---------------------------------
>
>                 Key: HADOOP-17197
>                 URL: https://issues.apache.org/jira/browse/HADOOP-17197
>             Project: Hadoop Common
>          Issue Type: Sub-task
>          Components: fs/s3
>            Reporter: Sahil Takiar
>            Priority: Major
>
> S3A currently has a dependency on the aws-java-sdk-bundle, which includes the 
> SDKs for all AWS services. The jar file for the current version is about 120 
> MB, but continues to grow (the latest is about 170 MB). Organic growth is 
> expected as more and more AWS services are created.
> The aws-java-sdk-bundle jar file is shaded as well, so it includes all 
> transitive dependencies.
> It would be nice if S3A could depend on smaller jar files in order to 
> decrease the size of jar files pulled in transitively by clients. Decreasing 
> the size of dependencies is particularly important for Docker files, where 
> image pull times can be affected by image size.
> One solution here would be for S3A to publish its own shaded jar which 
> includes the SDKs for all needed AWS Services (e.g. S3, DynamoDB, etc.) along 
> with the transitive dependencies for the individual SDKs.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

