[ 
https://issues.apache.org/jira/browse/HADOOP-14661?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17234612#comment-17234612
 ] 

Asier Arambarri Beldarrain edited comment on HADOOP-14661 at 11/18/20, 1:25 PM:
--------------------------------------------------------------------------------

This is the solution that worked in my case for this error with S3A and Spark 
(I believe it can be replicated in other environments).

The key is the value of the *fs.s3a.s3.client.factory.impl* property. By 
default, it is set to *DefaultS3ClientFactory*.

So what's wrong with this factory? Well, it doesn't include any requester-pays 
related header, as seen in its source code:

[https://github.com/apache/hadoop/blob/e02b102/hadoop-tools/hadoop-aws/src/main/java/org/apache/hadoop/fs/s3a/DefaultS3ClientFactory.java]

The solution is to implement a custom client factory that extends the default 
one and *adds the request-payer header* to the awsConf client configuration.

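{code:java}
import com.amazonaws.ClientConfiguration;
import com.amazonaws.auth.AWSCredentialsProvider;
import com.amazonaws.services.s3.AmazonS3;
import com.amazonaws.services.s3.AmazonS3Client;
import org.apache.hadoop.fs.s3a.DefaultS3ClientFactory;

public class RequestPayerS3ClientFactory extends DefaultS3ClientFactory {
    @Override
    protected AmazonS3 newAmazonS3Client(AWSCredentialsProvider credentials,
                                         ClientConfiguration awsConf) {
        // Attach the requester-pays header to every request made by this client.
        awsConf.addHeader("x-amz-request-payer", "requester");
        return new AmazonS3Client(credentials, awsConf);
    }
}
{code}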
 
_This is just a simplification: you could also check whether a configuration 
value is set to true before adding the header. As written, this factory assumes 
every request will be paid for by the requester if needed._
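
To illustrate the optional check mentioned above, here is a minimal sketch of the decision logic on its own. It is not the factory itself: plain maps stand in for Hadoop's Configuration and the SDK's ClientConfiguration so it runs without either dependency, and the class name, method name and the property name fs.s3a.requester.pays.enabled are assumptions made for this sketch (the comment above does not name a property).

```java
import java.util.HashMap;
import java.util.Map;

// Stand-in sketch: plain maps replace Hadoop's Configuration and the AWS SDK's
// ClientConfiguration so the logic can run without either dependency.
class RequesterPaysHeaders {
    // Property name assumed for this sketch only.
    static final String ENABLED_KEY = "fs.s3a.requester.pays.enabled";
    static final String HEADER_NAME = "x-amz-request-payer";
    static final String HEADER_VALUE = "requester";

    // Returns the extra headers to attach: the request-payer header when the
    // flag is "true" (case-insensitive), otherwise an empty map.
    static Map<String, String> headersFor(Map<String, String> conf) {
        Map<String, String> headers = new HashMap<>();
        if (Boolean.parseBoolean(conf.getOrDefault(ENABLED_KEY, "false"))) {
            headers.put(HEADER_NAME, HEADER_VALUE);
        }
        return headers;
    }
}
```

In the real factory the same test would presumably wrap the awsConf.addHeader(...) call, reading the flag from the Hadoop configuration available to the factory instead of a map.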

 

Once you *compile the class and add it to your classpath*, set the property in 
the Hadoop configuration:

{code}
spark.sparkContext.hadoopConfiguration.set("fs.s3a.s3.client.factory.impl",
  "your.package.RequestPayerS3ClientFactory")
{code}
 
This way the S3 client is created through the overridden newAmazonS3Client 
method, so its requests now include the x-amz-request-payer header.

 

 


> S3A to support Requester Pays Buckets
> -------------------------------------
>
>                 Key: HADOOP-14661
>                 URL: https://issues.apache.org/jira/browse/HADOOP-14661
>             Project: Hadoop Common
>          Issue Type: Sub-task
>          Components: common, util
>    Affects Versions: 3.0.0-alpha3
>            Reporter: Mandus Momberg
>            Assignee: Mandus Momberg
>            Priority: Minor
>         Attachments: HADOOP-14661.patch
>
>   Original Estimate: 2h
>  Remaining Estimate: 2h
>
> Amazon S3 has the ability to charge the requester for the cost of accessing 
> S3. This is called Requester Pays Buckets. 
> In order to access these buckets, each request needs to be signed with a 
> specific header. 
> http://docs.aws.amazon.com/AmazonS3/latest/dev/RequesterPaysBuckets.html


