[ https://issues.apache.org/jira/browse/HADOOP-19712?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=18024107#comment-18024107 ]

ASF GitHub Bot commented on HADOOP-19712:
-----------------------------------------

steveloughran opened a new pull request, #8006:
URL: https://github.com/apache/hadoop/pull/8006

   
   Reworked how entrySet() and values() work, using .forEach() iteration, after reviewing what ConcurrentHashMap does internally: it performs a (safe) traversal.
   
   Added an EvaluatingStatisticsMap.forEach() implementation which maps the passed-in BiConsumer down to evaluators.forEach(), evaluating each value as it goes.
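   A rough sketch of the shape of those two changes; the class below and its addFunction() method are illustrative stand-ins, not the actual EvaluatingStatisticsMap code:
   
   ```java
   import java.util.AbstractMap.SimpleImmutableEntry;
   import java.util.HashSet;
   import java.util.Map;
   import java.util.Set;
   import java.util.concurrent.ConcurrentHashMap;
   import java.util.function.BiConsumer;
   import java.util.function.Function;
   
   // Illustrative stand-in for a statistics map whose values are computed
   // on demand by per-key evaluator functions.
   class LazyStatsMap<E> {
     private final Map<String, Function<String, E>> evaluators =
         new ConcurrentHashMap<>();
   
     void addFunction(String key, Function<String, E> eval) {
       evaluators.put(key, eval);
     }
   
     // forEach() maps the passed-in BiConsumer down to evaluators.forEach(),
     // evaluating each value as it goes; no thread pool is involved, and
     // ConcurrentHashMap.forEach() is a safe traversal.
     public void forEach(BiConsumer<? super String, ? super E> action) {
       evaluators.forEach((key, eval) -> action.accept(key, eval.apply(key)));
     }
   
     // entrySet() rebuilt on that same traversal instead of parallelStream().
     public Set<Map.Entry<String, E>> entrySet() {
       Set<Map.Entry<String, E>> entries = new HashSet<>();
       forEach((key, value) ->
           entries.add(new SimpleImmutableEntry<>(key, value)));
       return entries;
     }
   }
   ```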
   
   Use that in IOStatisticsBinding.snapshot() code.
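   In the snapshot path, the copy can then be driven by that forEach() rather than by entrySet(); a minimal sketch with an assumed shape and names, not the real IOStatisticsBinding signatures:
   
   ```java
   import java.util.Map;
   import java.util.TreeMap;
   
   final class SnapshotSketch {
     // Snapshot a (possibly evaluating) statistics map into a plain
     // TreeMap by iterating with forEach(), never materializing entrySet().
     static <E> Map<String, E> snapshotToMap(Map<String, E> source) {
       Map<String, E> dest = new TreeMap<>();
       source.forEach(dest::put);  // each value evaluated once, inline
       return dest;
     }
   }
   ```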
   
   Tests for all this.
   
   
   ### For code changes:
   
   - [X] Does the title of this PR start with the corresponding JIRA issue id (e.g. 'HADOOP-17799. Your PR title ...')?
   - [X] Object storage: have the integration tests been executed and the endpoint declared according to the connector-specific documentation?
   - [ ] If adding new dependencies to the code, are these dependencies licensed in a way that is compatible for inclusion under [ASF 2.0](http://www.apache.org/legal/resolved.html#category-a)?
   - [ ] If applicable, have you updated the `LICENSE`, `LICENSE-binary`, `NOTICE-binary` files?
   
   




> S3A: Deadlock observed in IOStatistics EvaluatingStatisticsMap.entrySet()
> -------------------------------------------------------------------------
>
>                 Key: HADOOP-19712
>                 URL: https://issues.apache.org/jira/browse/HADOOP-19712
>             Project: Hadoop Common
>          Issue Type: Bug
>          Components: fs, fs/s3
>    Affects Versions: 3.3.6, 3.4.2
>            Reporter: Steve Loughran
>            Assignee: Steve Loughran
>            Priority: Major
>
> We have evidence that `IOStatisticsSupport.snapshotIOStatistics()` can hang, 
> specifically on the statistics collected by an S3AInputStream, which are 
> merged into the FS stats in close():
> {code}
> jdk.internal.misc.Unsafe.park(Native Method)
> java.util.concurrent.locks.LockSupport.park(LockSupport.java:341)
> java.util.concurrent.ForkJoinTask.awaitDone(ForkJoinTask.java:468)
> java.util.concurrent.ForkJoinTask.invoke(ForkJoinTask.java:687)
> java.util.stream.ReduceOps$ReduceOp.evaluateParallel(ReduceOps.java:927)
> java.util.stream.AbstractPipeline.evaluate(AbstractPipeline.java:233)
> java.util.stream.ReferencePipeline.collect(ReferencePipeline.java:682)
> org.apache.hadoop.fs.statistics.impl.EvaluatingStatisticsMap.entrySet(EvaluatingStatisticsMap.java:166)
> java.util.Collections$UnmodifiableMap.entrySet(Collections.java:1529)
> org.apache.hadoop.fs.statistics.impl.IOStatisticsBinding.copyMap(IOStatisticsBinding.java:172)
> org.apache.hadoop.fs.statistics.impl.IOStatisticsBinding.snapshotMap(IOStatisticsBinding.java:216)
> org.apache.hadoop.fs.statistics.impl.IOStatisticsBinding.snapshotMap(IOStatisticsBinding.java:199)
> org.apache.hadoop.fs.statistics.IOStatisticsSnapshot.snapshot(IOStatisticsSnapshot.java:165)
> org.apache.hadoop.fs.statistics.IOStatisticsSnapshot.<init>(IOStatisticsSnapshot.java:125)
> org.apache.hadoop.fs.statistics.IOStatisticsSupport.snapshotIOStatistics(IOStatisticsSupport.java:49)
> {code}
> The code in question calls `parallelStream()`, which runs on the common 
> ForkJoinPool, a fixed pool of threads shared by all uses of the API:
> {code}
>     Set<Entry<String, E>> r = evalEntries.parallelStream().map((e) ->
>         new EntryImpl<>(e.getKey(), e.getValue().apply(e.getKey())))
>         .collect(Collectors.toSet());
> {code}
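> As a standalone illustration (not Hadoop code; the class name is made up): every 
> parallelStream() in a JVM runs on the one shared ForkJoinPool.commonPool(), so 
> evaluators that block here can park workers needed by every other user of the pool:
> {code}
> import java.util.List;
> import java.util.concurrent.ForkJoinPool;
>
> public class WhichPool {
>   public static void main(String[] args) {
>     System.out.println("common pool parallelism: "
>         + ForkJoinPool.getCommonPoolParallelism());
>     // elements run on shared workers with names like
>     // "ForkJoinPool.commonPool-worker-1"; the calling thread may
>     // execute some elements itself
>     List.of("a", "b", "c", "d").parallelStream()
>         .forEach(s -> System.out.println(
>             s + " on " + Thread.currentThread().getName()));
>   }
> }
> {code}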
> Proposed: 
> * move off parallelStream() to stream() (see the sketch after this list)
> * review the code for any other way this iteration could lead to a deadlock, 
> e.g. the apply() calls
> * could we do the merge more efficiently?
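> A minimal sketch of the first bullet; the only change from the snippet above 
> is stream() for parallelStream(), so evaluation stays on the calling thread 
> and no shared pool is involved:
> {code}
>     Set<Entry<String, E>> r = evalEntries.stream().map((e) ->
>         new EntryImpl<>(e.getKey(), e.getValue().apply(e.getKey())))
>         .collect(Collectors.toSet());
> {code}
> Whether the apply() calls themselves can still block is covered by the second 
> review item.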



