[
https://issues.apache.org/jira/browse/HADOOP-18635?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17693226#comment-17693226
]
ASF GitHub Bot commented on HADOOP-18635:
-----------------------------------------
steveloughran commented on PR #5402:
URL: https://github.com/apache/hadoop/pull/5402#issuecomment-1443788914
I really don't like how the results come back.
I'm going to propose adding IOStatistics support to distcp instead. That lines
things up for future work, and it avoids modifying the source config so that it
suddenly becomes a two-way exchange of data.
1. DistCp to implement IOStatisticsSource
2. until the job finishes, getIOStatistics() to return null
3. when the job has finished, build a statistics store and set the counter:
```
// create the statistics store through its builder
IOStatisticsStore iostats = IOStatisticsBinding.iostatisticsStore()
    .withCounters(DISTCP_TOTAL_BYTES_COPIED)
    .build();
// then set the counter to the value retrieved from the finished job
iostats.setCounter(DISTCP_TOTAL_BYTES_COPIED, <counter>);
```
This is extra work and you have to learn a new API, but
* IOStatisticsAssertions has the asserts
* IOStatisticsLogging has pretty printing
* you can take an IOStatisticsSnapshot and send it over the wire as JSON or as
a Java serialized object
* it lines things up perfectly for us to collect more detailed stats, not just
from the workers (trickier...) but also the cost of directory scanning, cleanup etc.
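
To make the three steps above concrete, here is a minimal, self-contained sketch of the "null until the job finishes" contract. It deliberately does not use the Hadoop classes: `StatsSource` is a simplified stand-in for IOStatisticsSource, a plain map stands in for the statistics store, and `DistCpJobSketch`, `finish()` and the counter key string are hypothetical names invented for illustration only.

```java
import java.util.Collections;
import java.util.HashMap;
import java.util.Map;

/** Simplified stand-in for Hadoop's IOStatisticsSource; NOT the real interface. */
interface StatsSource {
  /** Returns a read-only view of the counters, or null if none are available yet. */
  Map<String, Long> getIOStatistics();
}

/** Hypothetical model of a distcp-like job that publishes counters only on completion. */
class DistCpJobSketch implements StatsSource {
  // Assumed key name, for illustration only.
  static final String DISTCP_TOTAL_BYTES_COPIED = "distcp_total_bytes_copied";

  // Stays null until the job has finished; volatile so other threads see the update.
  private volatile Map<String, Long> stats;

  /** Called once when the job completes, with the value read from the job counters. */
  void finish(long bytesCopied) {
    Map<String, Long> counters = new HashMap<>();
    counters.put(DISTCP_TOTAL_BYTES_COPIED, bytesCopied);
    stats = Collections.unmodifiableMap(counters);
  }

  @Override
  public Map<String, Long> getIOStatistics() {
    return stats;
  }
}

class DistCpStatsDemo {
  public static void main(String[] args) {
    DistCpJobSketch job = new DistCpJobSketch();
    // Before completion there is nothing to report.
    System.out.println("before: " + job.getIOStatistics());
    job.finish(4096L);
    // After completion the caller reads the counter instead of re-reading the config.
    System.out.println("after: "
        + job.getIOStatistics().get(DistCpJobSketch.DISTCP_TOTAL_BYTES_COPIED));
  }
}
```

The point of the shape is that callers such as Hive only ever read from the job object, so the source configuration stays an input and never has to carry results back.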
> Expose distcp counters to user via config parameter and distcp constants
> ------------------------------------------------------------------------
>
> Key: HADOOP-18635
> URL: https://issues.apache.org/jira/browse/HADOOP-18635
> Project: Hadoop Common
> Issue Type: Improvement
> Components: tools/distcp
> Affects Versions: 3.4.0
> Reporter: Amit Saonerkar
> Priority: Major
> Labels: pull-request-available
> Fix For: 3.4.0
>
>
> Currently, users or applications such as Hive cannot directly access the distcp
> counters, such as the total number of bytes copied by a distcp operation.
> This Jira is to enable this functionality in the distcp tool.