[
https://issues.apache.org/jira/browse/HADOOP-13826?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15709695#comment-15709695
]
Thomas Demoor commented on HADOOP-13826:
----------------------------------------
I think [~mackrorysd]'s implementation is heading in the right direction.
Some questions / suggestions:
* The {{controlTypes}} carry little payload, so they have little impact on
memory and bandwidth. Consequently, I think we can allow a lot of active
threads here and the waiting room can be unbounded. I hope this would fix the
issues [~mackrorysd] is still encountering. (In contrast to my earlier thinking
above, I don't think the number of active threads needs to be shared between
the two types, as it seems unlikely that {{controlTypes}} will use significant
resources.)
* The {{subTaskTypes}} have the potential to overwhelm memory and bandwidth
and should thus be run from the bounded thread pool. We need to take care
that all relevant classes are captured here.
* I am not 100% sure whether what I propose here would eliminate all deadlocks.
I do not (yet) fully understand the deadlock scenario from the discussion
above. If you have more insight, please help me out.
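To make the suggestion above concrete, here is a minimal sketch of the two-pool idea. It is not the patch under review: the class {{TwoPoolSketch}}, the method {{runAll}} and its parameters are hypothetical names, and plain {{java.util.concurrent}} executors stand in for the S3A thread pools. Control tasks go to a pool with no limit on threads, while subtasks share a pool with a fixed number of threads (its queue is unbounded in this sketch, which is a simplification).

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

public class TwoPoolSketch {

    /**
     * Runs controlTasks control tasks; each submits partsPerTask subtasks
     * and waits for all of them. Returns the number of completed subtasks.
     * All names here are hypothetical and only illustrate the idea.
     */
    public static int runAll(int controlTasks, int partsPerTask,
                             int subTaskThreads) throws Exception {
        // Assumption: control tasks are cheap, so no limit on their threads.
        ExecutorService controlPool = Executors.newCachedThreadPool();
        // Assumption: subtasks are expensive, so cap the active threads.
        ExecutorService subTaskPool = Executors.newFixedThreadPool(subTaskThreads);
        try {
            List<Future<Integer>> controls = new ArrayList<>();
            for (int c = 0; c < controlTasks; c++) {
                controls.add(controlPool.submit(() -> {
                    List<Future<Integer>> parts = new ArrayList<>();
                    for (int p = 0; p < partsPerTask; p++) {
                        // Stand-in for uploading or copying one part.
                        parts.add(subTaskPool.submit(() -> 1));
                    }
                    int done = 0;
                    for (Future<Integer> part : parts) {
                        // A waiting control task never holds a subtask thread.
                        done += part.get();
                    }
                    return done;
                }));
            }
            int total = 0;
            for (Future<Integer> control : controls) {
                total += control.get();
            }
            return total;
        } finally {
            controlPool.shutdown();
            subTaskPool.shutdown();
        }
    }

    public static void main(String[] args) throws Exception {
        System.out.println(runAll(8, 4, 2));
    }
}
```

Because subtasks in this sketch never wait on other tasks, the bounded pool always drains, and waiting control tasks cannot starve it. Whether the same holds for every task class the SDK submits is exactly the open question in the last bullet.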
> S3A Deadlock in multipart copy due to thread pool limits.
> ---------------------------------------------------------
>
> Key: HADOOP-13826
> URL: https://issues.apache.org/jira/browse/HADOOP-13826
> Project: Hadoop Common
> Issue Type: Bug
> Components: fs/s3
> Affects Versions: 2.7.3
> Reporter: Sean Mackrory
> Assignee: Sean Mackrory
> Priority: Critical
> Attachments: HADOOP-13826.001.patch, HADOOP-13826.002.patch
>
>
> In testing HIVE-15093 we have encountered deadlocks in the s3a connector. The
> TransferManager javadocs
> (http://docs.aws.amazon.com/AWSJavaSDK/latest/javadoc/com/amazonaws/services/s3/transfer/TransferManager.html)
> explain how this is possible:
> {quote}It is not recommended to use a single threaded executor or a thread
> pool with a bounded work queue as control tasks may submit subtasks that
> can't complete until all sub tasks complete. Using an incorrectly configured
> thread pool may cause a deadlock (I.E. the work queue is filled with control
> tasks that can't finish until subtasks complete but subtasks can't execute
> because the queue is filled).{quote}
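The scenario in the quoted javadoc can be reproduced in miniature with a plain executor. The sketch below is an illustration only and makes several assumptions: {{SharedPoolStall}} and {{controlTaskStalls}} are hypothetical names, it does not use the AWS SDK, and it exhausts the pool's threads rather than filling a bounded queue (closer to the "single threaded executor" case in the quote). Every worker thread is taken by a control task that then waits on a subtask submitted to the same pool. A timeout is used so that the demonstration terminates instead of hanging.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.TimeoutException;

public class SharedPoolStall {

    /**
     * Returns true if at least one control task timed out while waiting for
     * a subtask that was queued behind control tasks in the same pool.
     * All names here are hypothetical and only illustrate the scenario.
     */
    public static boolean controlTaskStalls(int poolThreads, long timeoutMillis)
            throws Exception {
        ExecutorService shared = Executors.newFixedThreadPool(poolThreads);
        // Ensures every worker thread is held by a control task before
        // any subtask is submitted.
        CountDownLatch allStarted = new CountDownLatch(poolThreads);
        try {
            List<Future<Boolean>> controls = new ArrayList<>();
            for (int i = 0; i < poolThreads; i++) {
                controls.add(shared.submit(() -> {
                    allStarted.countDown();
                    allStarted.await();
                    // The subtask is queued, but no thread is free to run it.
                    Future<Integer> part = shared.submit(() -> 1);
                    try {
                        part.get(timeoutMillis, TimeUnit.MILLISECONDS);
                        return false;
                    } catch (TimeoutException e) {
                        // Without the timeout this wait would never end.
                        return true;
                    }
                }));
            }
            boolean stalled = false;
            for (Future<Boolean> control : controls) {
                stalled |= control.get();
            }
            return stalled;
        } finally {
            shared.shutdownNow();
        }
    }

    public static void main(String[] args) throws Exception {
        System.out.println(controlTaskStalls(2, 200));
    }
}
```

No subtask can start until some control task returns, and no control task can return until its subtask finishes or its wait times out, so the first control task to return has necessarily timed out.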