[
https://issues.apache.org/jira/browse/HADOOP-16830?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17121005#comment-17121005
]
Steve Loughran commented on HADOOP-16830:
-----------------------------------------
Interesting thought. We don't actually care what metrics get collected -what I
do want is to have it extensible and unique to each object instance being
used, rather than shared across all instances or tied to a single thread.
A key goal is to allow applications to aggregate statistics; for counters this
is a simple addition. And I've been avoiding quantiles and other metrics which
can go down as well as up because they don't really aggregate.
Performance metrics are interesting though: they can be aggregated. An
aggregate mean can be recalculated if the size of each set of values is known;
a max value is simply the largest of the per-set maxima, isn't it?
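
A rough sketch of that arithmetic, for illustration only. The class and method
names are hypothetical and not part of any Hadoop API; the assumption is that
each source reports its mean together with the number of samples behind it.

```java
/** Hypothetical helpers; not part of Hadoop. */
final class PerfAggregation {
  private PerfAggregation() {
  }

  /**
   * Recalculate the mean of two sets of values from their individual means,
   * weighting each mean by the size of its set.
   */
  static double combinedMean(double meanA, long sizeA,
      double meanB, long sizeB) {
    long total = sizeA + sizeB;
    if (total == 0) {
      // no samples on either side: define the mean as zero
      return 0.0;
    }
    return (meanA * sizeA + meanB * sizeB) / total;
  }

  /** The aggregate maximum is the larger of the two maxima. */
  static long combinedMax(long maxA, long maxB) {
    return Math.max(maxA, maxB);
  }
}
```

For example, a mean of 10 over 2 samples and a mean of 15 over 8 samples
combine to (20 + 120) / 10 = 14. Without the set sizes the two means could not
be merged correctly.
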
Maybe the trick here is for each value to be not just an integer but a type
(counter, mean-performance, min-perf, max-perf) and at least two values (needed
for the mean recalculation). Initially we'd just have that integer enum and
either enough fields to cover all eventualities or an array.
We need the results to be marshallable (protobuf, json, serializable) and
stable enough for apps to use.
Yes, this will need changes to the initial design -but it's better to have
something extensible now rather than realise later we missed an opportunity.
That points to an array of values, doesn't it? Maybe a unit too, for something
like (type, unit, long values[]), plus public helper methods to combine two
fields of a specific type. Would this work? Or are we overengineering it?
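
To make the (type, unit, long values[]) idea concrete, here is a minimal
sketch. Everything in it is an assumption for discussion: the names StatType
and StatEntry, the String unit, and the layout of the values array (one value
for counter/min/max, {sample count, sum} for a mean) are hypothetical and not
an existing or agreed API.

```java
/** Hypothetical statistic types; the real set is still open. */
enum StatType { COUNTER, MEAN, MIN, MAX }

/** Hypothetical (type, unit, long values[]) entry. */
final class StatEntry {
  final StatType type;
  final String unit;
  // Assumed layout: COUNTER/MIN/MAX use values[0];
  // MEAN uses values[0] = sample count, values[1] = sum of samples.
  final long[] values;

  StatEntry(StatType type, String unit, long... values) {
    this.type = type;
    this.unit = unit;
    this.values = values.clone();
  }

  /** Combine two entries of the same type and unit into a new entry. */
  static StatEntry combine(StatEntry a, StatEntry b) {
    if (a.type != b.type || !a.unit.equals(b.unit)) {
      throw new IllegalArgumentException("type or unit mismatch");
    }
    switch (a.type) {
      case COUNTER:
        return new StatEntry(a.type, a.unit, a.values[0] + b.values[0]);
      case MEAN:
        // add counts and sums; the mean is recalculated on demand
        return new StatEntry(a.type, a.unit,
            a.values[0] + b.values[0], a.values[1] + b.values[1]);
      case MIN:
        return new StatEntry(a.type, a.unit,
            Math.min(a.values[0], b.values[0]));
      case MAX:
        return new StatEntry(a.type, a.unit,
            Math.max(a.values[0], b.values[0]));
      default:
        throw new IllegalStateException("unknown type " + a.type);
    }
  }

  /** Mean of a MEAN entry; zero when there are no samples. */
  double mean() {
    if (type != StatType.MEAN) {
      throw new IllegalStateException("not a mean: " + type);
    }
    return values[0] == 0 ? 0.0 : (double) values[1] / values[0];
  }
}
```

Storing the count and sum rather than the mean itself keeps every field a
long, which should marshal cleanly, and makes combining entries a pair of
additions.
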
The S3A committers do actually aggregate the filesystem's StorageStatistics,
but as that is per-fs, and Spark has many workers sharing it, it's not that
useful. This API is intended to be something Spark/Tez/Impala can adopt and
aggregate for meaningful reporting. What would suit best here?
> Add public IOStatistics API; S3A to support
> -------------------------------------------
>
> Key: HADOOP-16830
> URL: https://issues.apache.org/jira/browse/HADOOP-16830
> Project: Hadoop Common
> Issue Type: Sub-task
> Components: fs, fs/s3
> Affects Versions: 3.3.0
> Reporter: Steve Loughran
> Assignee: Steve Loughran
> Priority: Major
>
> Applications like to collect the statistics which specific operations take,
> by collecting exactly those operations done during the execution of FS API
> calls by their individual worker threads, and returning these to their job
> driver
> * S3A has a statistics API for some streams, but it's a non-standard one;
> Impala &c can't use it
> * FileSystem storage statistics are public, but as they aren't cross-thread,
> they don't aggregate properly
> Proposed
> # A new IOStatistics interface to serve up statistics
> # S3A to implement
> # other stores to follow
> # Pass-through from the usual wrapper classes (FS data input/output streams)
> It's hard to think about how best to offer an API for operation context
> stats, and how to actually implement.
> ThreadLocal isn't enough because the helper threads need to update the
> thread-local value of the instigator
> My Initial PoC doesn't address that issue, but it shows what I'm thinking of