[ https://issues.apache.org/jira/browse/HADOOP-19664?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=18016496#comment-18016496 ]
Ahmar Suhail commented on HADOOP-19664:
---------------------------------------
Sync.
It's using async currently, but the readVectored + AAL use case is not ideal
for the async client. We already have our own thread pool, and each thread is
responsible for making a single S3 request and then immediately starts reading
data from that input stream to fill the internal buffers.
With the async client, this means you need to join() immediately, and at
higher concurrency things get stuck in the Netty thread pool and in the
AsyncResponseTransformer.toBlockingInputStream() of
s3AsyncClient.getObject(builder.build(), AsyncResponseTransformer.toBlockingInputStream()).
The S3 async client works well (I think) when you have high concurrency but
don't need to join on the data immediately, so the Netty IO pool is sufficient
to satisfy those requests.
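
To make the access pattern concrete, here is a minimal, stdlib-only sketch of
"one blocking request per pool thread, drained immediately". It is not the AAL
or S3A code: RangeFetcher, readRanges and SyncRangeReads are hypothetical names,
and an in-memory byte array stands in for the S3 object so that the example runs
without the AWS SDK.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

public class SyncRangeReads {

    /** Hypothetical stand-in for a blocking GET of one byte range. */
    interface RangeFetcher {
        InputStream open(long offset, int length) throws IOException;
    }

    /**
     * Submits one task per range. Each task opens its stream on the pool
     * thread and reads it right away into a buffer, so no second I/O pool
     * sits between the request and the read.
     */
    static List<byte[]> readRanges(RangeFetcher fetcher, long[] offsets,
                                   int[] lengths, ExecutorService pool)
            throws Exception {
        List<Future<byte[]>> pending = new ArrayList<>();
        for (int i = 0; i < offsets.length; i++) {
            final long offset = offsets[i];
            final int length = lengths[i];
            pending.add(pool.submit(() -> {
                try (InputStream in = fetcher.open(offset, length)) {
                    // Blocking read: fill the buffer for this range.
                    return in.readNBytes(length);
                }
            }));
        }
        // Collect the buffers in the order the ranges were requested.
        List<byte[]> buffers = new ArrayList<>();
        for (Future<byte[]> f : pending) {
            buffers.add(f.get());
        }
        return buffers;
    }

    public static void main(String[] args) throws Exception {
        byte[] object = "0123456789abcdefghij".getBytes(StandardCharsets.US_ASCII);
        // In-memory fetcher used only for illustration.
        RangeFetcher fetcher =
            (offset, length) -> new ByteArrayInputStream(object, (int) offset, length);
        ExecutorService pool = Executors.newFixedThreadPool(4);
        try {
            for (byte[] b : readRanges(fetcher, new long[] {0, 10}, new int[] {4, 5}, pool)) {
                System.out.println(new String(b, StandardCharsets.US_ASCII));
            }
        } finally {
            pool.shutdown();
        }
    }
}
```

With an async client, each task would instead have to join() on a future before
it could start reading, which is the extra hop described above.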
> S3A Analytics-Accelerator: Move AAL to use Java sync client
> -----------------------------------------------------------
>
> Key: HADOOP-19664
> URL: https://issues.apache.org/jira/browse/HADOOP-19664
> Project: Hadoop Common
> Issue Type: Sub-task
> Components: fs/s3
> Affects Versions: 3.5.0
> Reporter: Ahmar Suhail
> Priority: Major
>
> The Java sync client is giving the best performance for our use case, especially
> for readVectored(), where a large number of requests can be made concurrently.