[ 
https://issues.apache.org/jira/browse/HADOOP-13065?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15285257#comment-15285257
 ] 

Colin Patrick McCabe commented on HADOOP-13065:
-----------------------------------------------

bq. I filed HADOOP-13140 to track the effort.

Thanks, [~liuml07].

bq. [~mingma] wrote: BTW, is the network distance metrics something general for 
all file system or it is more specific to HDFS? For example, local file system 
doesn't need that. If it is more HDFS specific, wonder if we should move it to 
HDFS specific metrics.

I agree that it's more conceptually consistent to put the distance-related 
metrics in HDFS-specific code.  However, we would have to develop an optimized 
thread-local mechanism to do this, to avoid causing a performance regression in 
HDFS stream performance.  Perhaps it would be better to simply move this to 
HDFS's existing per-stream ReadStatistics for now.

> Add a new interface for retrieving FS and FC Statistics
> -------------------------------------------------------
>
>                 Key: HADOOP-13065
>                 URL: https://issues.apache.org/jira/browse/HADOOP-13065
>             Project: Hadoop Common
>          Issue Type: Improvement
>          Components: fs
>            Reporter: Ram Venkatesh
>            Assignee: Mingliang Liu
>             Fix For: 2.8.0
>
>         Attachments: HADOOP-13065-007.patch, HADOOP-13065.008.patch, 
> HADOOP-13065.009.patch, HADOOP-13065.010.patch, HADOOP-13065.011.patch, 
> HADOOP-13065.012.patch, HADOOP-13065.013.patch, HDFS-10175.000.patch, 
> HDFS-10175.001.patch, HDFS-10175.002.patch, HDFS-10175.003.patch, 
> HDFS-10175.004.patch, HDFS-10175.005.patch, HDFS-10175.006.patch, 
> TestStatisticsOverhead.java
>
>
> Currently FileSystem.Statistics exposes the following statistics:
> BytesRead
> BytesWritten
> ReadOps
> LargeReadOps
> WriteOps
> These are in-turn exposed as job counters by MapReduce and other frameworks. 
> There is logic within DfsClient to map operations to these counters that can 
> be confusing, for instance, mkdirs counts as a writeOp.
> Proposed enhancement:
> Add a statistic for each DfsClient operation including create, append, 
> createSymlink, delete, exists, mkdirs, rename and expose them as new 
> properties on the Statistics object. The operation-specific counters can be 
> used for analyzing the load imposed by a particular job on HDFS. 
> For example, we can use them to identify jobs that end up creating a large 
> number of files.
> Once this information is available in the Statistics object, the app 
> frameworks like MapReduce can expose them as additional counters to be 
> aggregated and recorded as part of job summary.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]

Reply via email to