[
https://issues.apache.org/jira/browse/HADOOP-18889?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17768038#comment-17768038
]
Steve Loughran commented on HADOOP-18889:
-----------------------------------------
Also: if the hostname is valid but there is no service there, the SDK will retry
a lot, as will s3a (going to cut back on the SDK retries).
If you do this in a {{fs -ls}} and then try to interrupt the operation, you get
an NPE:
{code}
^C^C^C^C^C^C^C^C^C^C^C^C2023-09-22 15:09:31,237 [main] DEBUG fs.FsShell (FsShell.java:run(344)) - Error
java.lang.NullPointerException
    at org.apache.hadoop.fs.s3a.S3AFileSystem.lambda$listObjects$13(S3AFileSystem.java:2904)
    at org.apache.hadoop.fs.statistics.impl.IOStatisticsBinding.invokeTrackingDuration(IOStatisticsBinding.java:547)
    at org.apache.hadoop.fs.statistics.impl.IOStatisticsBinding.lambda$trackDurationOfOperation$5(IOStatisticsBinding.java:528)
    at org.apache.hadoop.fs.s3a.Invoker.retryUntranslated(Invoker.java:468)
    at org.apache.hadoop.fs.s3a.Invoker.retryUntranslated(Invoker.java:431)
    at org.apache.hadoop.fs.s3a.S3AFileSystem.listObjects(S3AFileSystem.java:2895)
    at org.apache.hadoop.fs.s3a.S3AFileSystem$ListingOperationCallbacksImpl.lambda$listObjectsAsync$0(S3AFileSystem.java:2514)
    at org.apache.hadoop.fs.s3a.impl.CallableSupplier.get(CallableSupplier.java:87)
    at java.util.concurrent.CompletableFuture$AsyncSupply.run(CompletableFuture.java:1604)
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
    at java.lang.Thread.run(Thread.java:750)
-ls: Fatal internal error
java.lang.NullPointerException
{code}
This happens because the connection-refused retry logic runs inside the async
list, which doesn't check whether the fs has been closed.
The process has effectively hung; even with the retries cut back, it's a slow
failure:
{code}
minimums=((object_list_request.failures.min=0)
(op_get_file_status.min=1)
(op_glob_status.min=5)
(op_list_status.failures.min=35624));
maximums=((object_list_request.failures.max=140)
(op_get_file_status.max=1)
(op_glob_status.max=5)
(op_list_status.failures.max=35624));
means=((object_list_request.failures.mean=(samples=6, sum=164, mean=27.3333))
(op_get_file_status.mean=(samples=1, sum=1, mean=1.0000))
(op_glob_status.mean=(samples=1, sum=5, mean=5.0000))
(op_list_status.failures.mean=(samples=1, sum=35624, mean=35624.0000)));
{code}
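A rough sketch of the kind of check the async list retry path is missing: a retry loop that looks at a "closed" flag and the thread's interrupt status before each attempt. This is not the S3A {{Invoker}} code; {{ClosableRetry}}, its fields and its method names are all hypothetical, and the fixed sleep stands in for whatever retry policy is really in use.

```java
import java.io.IOException;
import java.io.InterruptedIOException;
import java.util.concurrent.Callable;
import java.util.concurrent.atomic.AtomicBoolean;

/** Hypothetical sketch: a retry loop that gives up once its owner is closed. */
public class ClosableRetry {
  private final AtomicBoolean closed = new AtomicBoolean(false);
  private final int maxAttempts;
  private final long sleepMillis;

  public ClosableRetry(int maxAttempts, long sleepMillis) {
    this.maxAttempts = maxAttempts;
    this.sleepMillis = sleepMillis;
  }

  /** Mark the owner as closed; later attempts fail fast. */
  public void close() {
    closed.set(true);
  }

  /** Run the operation, retrying on IOException until success, exhaustion, close or interrupt. */
  public <T> T retry(Callable<T> operation) throws IOException {
    IOException last = null;
    for (int attempt = 1; attempt <= maxAttempts; attempt++) {
      // Fail fast instead of touching state that close() may have torn down.
      if (closed.get()) {
        throw new IOException("closed; abandoning retries before attempt " + attempt);
      }
      if (Thread.currentThread().isInterrupted()) {
        throw new InterruptedIOException("interrupted before attempt " + attempt);
      }
      try {
        return operation.call();
      } catch (IOException e) {
        last = e; // assumed retryable for the purposes of this sketch
      } catch (Exception e) {
        throw new IOException(e);
      }
      try {
        Thread.sleep(sleepMillis);
      } catch (InterruptedException e) {
        Thread.currentThread().interrupt(); // preserve the interrupt status
        throw new InterruptedIOException("interrupted while waiting to retry");
      }
    }
    throw last != null ? last : new IOException("no attempts made");
  }
}
```

With something like this, a ^C that closes the filesystem would surface as a prompt IOException from the listing rather than an NPE after many more attempts.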
Conclusion: we need to review retries and timeouts, and make sure there's no
excessive retry-around-retry nesting.
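To illustrate why retry-around-retry matters: when one retrying layer wraps another, the attempt counts multiply rather than add. The toy below counts raw calls for an operation that always fails; the class name, method names and the 3 x 4 limits are made up for illustration and are not the actual SDK or s3a settings.

```java
import java.util.concurrent.atomic.AtomicInteger;
import java.util.function.Supplier;

/** Hypothetical sketch: counting raw calls made by two nested retry layers. */
public class NestedRetryCount {

  /** Call op up to 'attempts' times; return true on the first success. */
  static boolean retry(int attempts, Supplier<Boolean> op) {
    for (int i = 0; i < attempts; i++) {
      if (op.get()) {
        return true;
      }
    }
    return false;
  }

  /** Number of raw calls when an always-failing call sits under two retry layers. */
  static int callsWhenAlwaysFailing(int outerAttempts, int innerAttempts) {
    AtomicInteger calls = new AtomicInteger();
    retry(outerAttempts, () -> retry(innerAttempts, () -> {
      calls.incrementAndGet();
      return false; // simulate "connection refused" every time
    }));
    return calls.get();
  }

  public static void main(String[] args) {
    // 3 outer attempts x 4 inner attempts = 12 raw calls
    System.out.println(callsWhenAlwaysFailing(3, 4));
  }
}
```

Add a per-attempt timeout or backoff to each layer and the total time to fail scales the same way, which is why the limits of every layer need reviewing together.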
> S3A: V2 SDK client does not work with third-party store
> -------------------------------------------------------
>
> Key: HADOOP-18889
> URL: https://issues.apache.org/jira/browse/HADOOP-18889
> Project: Hadoop Common
> Issue Type: Sub-task
> Components: fs/s3
> Affects Versions: 3.4.0
> Reporter: Steve Loughran
> Assignee: Steve Loughran
> Priority: Critical
>
> Testing against an external store without specifying a region now blows up
> because the region is queried off eu-west-1.
> What are we to do here? Require the region setting, *which wasn't needed
> before*? What region do we even provide for third-party stores?