[
https://issues.apache.org/jira/browse/HADOOP-13600?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16165701#comment-16165701
]
ASF GitHub Bot commented on HADOOP-13600:
-----------------------------------------
Github user sahilTakiar commented on the issue:
https://github.com/apache/hadoop/pull/157
Updates:
* Moved the parallel rename logic into a dedicated class called
`ParallelDirectoryRenamer` (a rough sketch of the approach is below)
* A few other bug fixes; the core logic remains the same
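For anyone skimming the thread, this is the general shape of the idea, only as a
rough sketch rather than the code in the patch; `renameAll`, `srcKeys`, and the
`copyFile` callback are stand-ins for the real listing and S3 copy calls:

```java
// Sketch of the parallel-rename idea: launch one copy task per file and
// wait for all of them, so total time approaches the longest single copy.
// The BiConsumer stands in for the real S3A copy operation.
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Future;
import java.util.function.BiConsumer;

public class ParallelRenameSketch {

  /**
   * Copy every key under srcPrefix to dstPrefix using the supplied pool.
   * Failures surface once all copy tasks have been submitted.
   */
  public static void renameAll(List<String> srcKeys,
                               String srcPrefix,
                               String dstPrefix,
                               ExecutorService copyPool,
                               BiConsumer<String, String> copyFile)
      throws InterruptedException, ExecutionException {
    List<Future<?>> copies = new ArrayList<>();
    for (String srcKey : srcKeys) {
      String dstKey = dstPrefix + srcKey.substring(srcPrefix.length());
      // Each copy runs on the pool; submission itself is cheap.
      copies.add(copyPool.submit(() -> copyFile.accept(srcKey, dstKey)));
    }
    // Wait for every copy; get() rethrows any copy failure.
    for (Future<?> copy : copies) {
      copy.get();
    }
  }
}
```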
@steveloughran your last comment on HADOOP-13786 suggested you might move the
retry logic out into a separate patch. Are you planning to do that? If not, do
you think this patch needs to wait for all of the work in HADOOP-13786 to be
completed?
If there are concerns with the retry behavior, we could also set the default
size of the copy thread pool to 1; that way the feature is essentially off
by default.
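On the default-to-1 point, a minimal sketch of how the pool size could gate the
behavior; note that the property name `fs.s3a.rename.copy.threads` is made up
for illustration and is not an existing s3a configuration key:

```java
// Hypothetical sketch: size the copy thread pool from configuration,
// falling back to 1 so parallel copies are effectively disabled by default.
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;

import org.apache.hadoop.conf.Configuration;

public class CopyPoolFactory {
  // NOTE: this key is illustrative only; it is not an existing s3a property.
  static final String COPY_THREADS_KEY = "fs.s3a.rename.copy.threads";
  static final int COPY_THREADS_DEFAULT = 1;

  static ExecutorService createCopyPool(Configuration conf) {
    int threads = Math.max(1, conf.getInt(COPY_THREADS_KEY, COPY_THREADS_DEFAULT));
    return Executors.newFixedThreadPool(threads);
  }
}
```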
Also, what do you mean by "isn't going to be resilient to large copies where
you are much more likely to hit parallel IO"? What parallel IO are you
referring to?
> S3a rename() to copy files in a directory in parallel
> -----------------------------------------------------
>
> Key: HADOOP-13600
> URL: https://issues.apache.org/jira/browse/HADOOP-13600
> Project: Hadoop Common
> Issue Type: Sub-task
> Components: fs/s3
> Affects Versions: 2.7.3
> Reporter: Steve Loughran
> Assignee: Sahil Takiar
> Attachments: HADOOP-13600.001.patch
>
>
> Currently a directory rename copies files one by one, making the operation
> O(files * data). If the copy operations were launched in parallel, the
> duration of the rename may be reducible to the duration of the longest copy.
> For a directory with many files, the saving will be significant.