[jira] [Commented] (HADOOP-11183) Memory-based S3AOutputstream

Steve Loughran (JIRA) Thu, 12 Feb 2015 04:11:54 -0800

    [ 
https://issues.apache.org/jira/browse/HADOOP-11183?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14318086#comment-14318086
 ]


Steve Loughran commented on HADOOP-11183:
-----------------------------------------

I don't know enough about the AWS library to comment on that aspect; others 
will have to.

One thing I would like to see (and which presumably could be applied to other 
bits of the S3a code) is better translation of AWS exceptions into common IOE 
subclasses, e.g. an auth exception, a file-already-exists exception, ....

There are enough places where things are being caught and wrapped that you 
could have a generic static 
{{IOException convertException(AmazonClientException)}} method somewhere to do 
the conversion. IOEs on their own are fairly uninformative.

Also: when wrapping exceptions, always include the {{toString()}} value of the 
nested exception in the extended text. That way, even if the nested exceptions 
get lost or stack traces don't get printed, the underlying problem is there for 
someone to look at.
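
A minimal sketch of what such a helper could look like, not the patch's code. 
To stay self-contained it uses a hypothetical {{FakeServiceException}} as a 
stand-in for the AWS SDK's service exception, and the status-code mapping 
(401/403 to access denied, 404 to file not found) is an assumption for 
illustration. The class name, the {{operation}} and {{path}} parameters, and 
the choice of the java.nio.file exception classes are likewise illustrative. 
The nested exception's {{toString()}} goes into the message text, and the 
original is still attached as the cause.

```java
import java.io.FileNotFoundException;
import java.io.IOException;
import java.nio.file.AccessDeniedException;

final class S3AExceptionTranslation {

    /** Hypothetical stand-in for an AWS SDK service exception carrying an HTTP status. */
    static class FakeServiceException extends RuntimeException {
        private final int statusCode;

        FakeServiceException(String message, int statusCode) {
            super(message);
            this.statusCode = statusCode;
        }

        int getStatusCode() {
            return statusCode;
        }
    }

    /**
     * Translate a service exception into a more specific IOException.
     * The mapping below is an assumed one, chosen only to show the pattern.
     */
    static IOException convertException(String operation, String path,
                                        FakeServiceException e) {
        // Embed e.toString() so the root problem survives even if the
        // cause chain or the stack trace is dropped by a caller or a log.
        String text = operation + " on " + path + ": " + e;
        IOException ioe;
        switch (e.getStatusCode()) {
            case 401:
            case 403:
                ioe = new AccessDeniedException(path, null, text);
                break;
            case 404:
                ioe = new FileNotFoundException(text);
                break;
            default:
                ioe = new IOException(text);
        }
        // None of the constructors above take a cause, so attach it here.
        ioe.initCause(e);
        return ioe;
    }

    private S3AExceptionTranslation() {
    }
}
```

A call site would then catch the SDK exception and rethrow the result, for 
example {{throw convertException("PUT", key, e);}}, so every wrap point 
produces the same kind of message.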

> Memory-based S3AOutputstream
> ----------------------------
>
>                 Key: HADOOP-11183
>                 URL: https://issues.apache.org/jira/browse/HADOOP-11183
>             Project: Hadoop Common
>          Issue Type: Sub-task
>          Components: fs/s3
>    Affects Versions: 2.6.0
>            Reporter: Thomas Demoor
>            Assignee: Thomas Demoor
>         Attachments: HADOOP-11183.001.patch, HADOOP-11183.002.patch, 
> HADOOP-11183.003.patch, info-003.md, info-S3AFastOutputStream-sync.md
>
>
> Currently s3a buffers files on disk(s) before uploading. This JIRA 
> investigates adding a memory-based upload implementation.
> The motivation is evidently performance: this would be beneficial for users 
> with high network bandwidth to S3 (EC2?) or users that run Hadoop directly on 
> an S3-compatible object store (FYI: my contributions are made in the name of 
> Amplidata). 



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
