[ 
https://issues.apache.org/jira/browse/HADOOP-11684?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14936822#comment-14936822
 ] 

Thomas Demoor commented on HADOOP-11684:
----------------------------------------

One has to take into account that s3a runs within the "Hadoop container" 
(Mapper / Reducer / ...). The new defaults allow for 3 (active uploads = 
threads.max) + 1 (queued upload = max.total.tasks) + 1 (active upload = in 
calling thread due to CallerRuns) = 5 concurrent uploads *per Hadoop container* 
on the node. This should easily fill up the network pipe of the node, whereas, 
on my setup, the current (much higher) defaults cause starvation.
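
As a rough illustration of that arithmetic (a sketch only, not code from any of the attached patches): the class below assumes threads.max maps to the pool size and max.total.tasks to the queue capacity of a plain java.util.concurrent.ThreadPoolExecutor with CallerRunsPolicy. The names CallerRunsBound and observe are hypothetical.

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.LinkedBlockingQueue;
import java.util.concurrent.ThreadPoolExecutor;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicInteger;

final class CallerRunsBound {
    /** Returns {pool threads busy, tasks queued, tasks run by the caller}. */
    static int[] observe(int threadsMax, int maxTotalTasks, int submissions) {
        // Assumption: threads.max -> pool size, max.total.tasks -> queue capacity.
        ThreadPoolExecutor pool = new ThreadPoolExecutor(
                threadsMax, threadsMax, 60L, TimeUnit.SECONDS,
                new LinkedBlockingQueue<>(maxTotalTasks),
                new ThreadPoolExecutor.CallerRunsPolicy());
        Thread caller = Thread.currentThread();
        CountDownLatch release = new CountDownLatch(1);
        AtomicInteger ranInCaller = new AtomicInteger();

        Runnable upload = () -> {
            if (Thread.currentThread() == caller) {
                // Pool and queue were full, so CallerRunsPolicy ran the task here.
                ranInCaller.incrementAndGet();
                return;
            }
            try {
                release.await(); // stand-in for a long-running part upload
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        };

        for (int i = 0; i < submissions; i++) {
            pool.execute(upload);
        }
        int[] counts = {pool.getPoolSize(), pool.getQueue().size(), ranInCaller.get()};

        // Let the simulated uploads finish and tear the pool down.
        release.countDown();
        pool.shutdown();
        try {
            pool.awaitTermination(10, TimeUnit.SECONDS);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
        return counts;
    }

    public static void main(String[] args) {
        int[] c = observe(3, 1, 5);
        System.out.println("pool=" + c[0] + " queued=" + c[1] + " caller=" + c[2]);
    }
}
```

With observe(3, 1, 5), a single submitting thread should see three tasks held by pool threads, one in the queue and one executed in its own thread, i.e. the 3 + 1 + 1 = 5 above.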

Thus, with CallerRuns (003.patch), extra upload attempts made by *other 
threads* each run in their own calling thread and become concurrent uploads 
6, 7, 8, ..., likely running the JVM out of memory. [[email protected]], do you 
agree we need the approach from 002.patch, which is robust against this 
behaviour?
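
To make the difference concrete, here is a minimal sketch of a pool that blocks submitters instead of running the work in their threads. It is one common way to do this (a counting semaphore in front of a fixed pool) and is not taken from 002.patch; the class BlockingUploadPool, its methods and the demo driver are hypothetical names used for illustration.

```java
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.RejectedExecutionException;
import java.util.concurrent.Semaphore;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicInteger;

final class BlockingUploadPool {
    private final ExecutorService delegate;
    private final Semaphore permits;
    private final AtomicInteger inFlight = new AtomicInteger();
    private final AtomicInteger peak = new AtomicInteger();

    BlockingUploadPool(int activeTasks, int waitingTasks) {
        delegate = Executors.newFixedThreadPool(activeTasks);
        // One permit per task that may be running or waiting at any time.
        permits = new Semaphore(activeTasks + waitingTasks, true);
    }

    /** Blocks the calling thread until a slot is free, then hands the task to the pool. */
    void submit(Runnable task) throws InterruptedException {
        permits.acquire();
        peak.accumulateAndGet(inFlight.incrementAndGet(), Math::max);
        try {
            delegate.execute(() -> {
                try {
                    task.run();
                } finally {
                    inFlight.decrementAndGet();
                    permits.release();
                }
            });
        } catch (RejectedExecutionException e) {
            // Only reachable after shutdown; give the slot back.
            inFlight.decrementAndGet();
            permits.release();
            throw e;
        }
    }

    int peakInFlight() {
        return peak.get();
    }

    void shutdownAndWait() throws InterruptedException {
        delegate.shutdown();
        delegate.awaitTermination(30, TimeUnit.SECONDS);
    }

    /** Hypothetical driver: several caller threads compete for the same pool. */
    static int demo(int active, int waiting, int callers, int tasksPerCaller) {
        BlockingUploadPool pool = new BlockingUploadPool(active, waiting);
        Thread[] threads = new Thread[callers];
        for (int i = 0; i < callers; i++) {
            threads[i] = new Thread(() -> {
                try {
                    for (int j = 0; j < tasksPerCaller; j++) {
                        pool.submit(() -> sleepQuietly(5)); // simulated part upload
                    }
                } catch (InterruptedException e) {
                    Thread.currentThread().interrupt();
                }
            });
            threads[i].start();
        }
        try {
            for (Thread t : threads) {
                t.join();
            }
            pool.shutdownAndWait();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
        return pool.peakInFlight();
    }

    private static void sleepQuietly(long millis) {
        try {
            Thread.sleep(millis);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
    }

    public static void main(String[] args) {
        System.out.println("peak in flight: " + demo(3, 1, 8, 10));
    }
}
```

However many threads call submit, the number of tasks running or queued cannot exceed activeTasks + waitingTasks, because every task holds a permit from submission until completion. With CallerRuns, by contrast, each additional calling thread adds one more concurrent upload.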

We've been running MR-style workflows on our test cluster with 002.patch for 
about 2 months now and haven't run into any issues. Of course, additional 
testing (more workflows) would be welcome.



> S3a to use thread pool that blocks clients
> ------------------------------------------
>
>                 Key: HADOOP-11684
>                 URL: https://issues.apache.org/jira/browse/HADOOP-11684
>             Project: Hadoop Common
>          Issue Type: Sub-task
>          Components: fs/s3
>    Affects Versions: 2.7.0
>            Reporter: Thomas Demoor
>            Assignee: Thomas Demoor
>         Attachments: HADOOP-11684-001.patch, HADOOP-11684-002.patch, 
> HADOOP-11684-003.patch
>
>
> Currently, if fs.s3a.max.total.tasks tasks are queued and another (part) 
> upload wants to start, a RejectedExecutionException is thrown. 
> We should use a thread pool that blocks clients, throttling them nicely, 
> rather than throwing an exception, for instance something similar to 
> https://github.com/apache/incubator-s4/blob/master/subprojects/s4-comm/src/main/java/org/apache/s4/comm/staging/BlockingThreadPoolExecutorService.java



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
