[PATCH] D78429: [clangd] Metric tracking through Tracer

Sam McCall via Phabricator via cfe-commits Sun, 19 Apr 2020 16:02:13 -0700

sammccall added a comment.

Thanks for tackling this! Mostly API stuff, obviously :-)

================
Comment at: clang-tools-extra/clangd/Trace.h:35
+/// itself to active tracer on destruction.
+struct Metric {
+  enum Type {
----------------
Conceptually there's a difference between a metric (the thing being measured, 
such as "latency distribution by method") and the measurement (such as "code 
completion took 90ms this time").
The attributes such as name, type, permitted labels are associated with the 
metric - these are schema. Whereas the measurement is a tuple `(metric, actual 
label, value)` - it doesn't make sense to have measurements for the same metric 
with different types. And the metric is immutable while the measurement is 
transient.

For the API (to emit data) it seems natural to model this split using constexpr 
objects for the metrics, and methods for the measurement:
```
constexpr Metric IncomingLatencyMS("/clangd/latency", Distribution);
...
void handleCodeCompletion() {
  ...
  IncomingLatencyMS.record(90);
}
```
(IOW I think our favorite metrics framework got this right)

For the SPI (to consume data), things are a bit muddier. For I think we want to 
avoid having a protocol to explicitly register metrics, so there's an 
attraction to just reporting the flat `(name, type, label, value)` tuple. I 
also quite like the idea of reusing the type above and reporting this as 
`(const Metric&, label, value)` which makes extending Metric more natural.

================
Comment at: clang-tools-extra/clangd/Trace.h:36
+struct Metric {
+  enum Type {
+    /// Occurence of an event, e.g. cache access.
----------------
I think we're missing an important type: a value directly provided by the 
measurement, such as memory usage.

================
Comment at: clang-tools-extra/clangd/Trace.h:36
+struct Metric {
+  enum Type {
+    /// Occurence of an event, e.g. cache access.
----------------
sammccall wrote:
> I think we're missing an important type: a value directly provided by the 
> measurement, such as memory usage.
This enum is about defining the relationship between the metric and the 
measurements, but also about the (related) nature of the metric itself.

The current names are measurement-centric, which is good for the former but not 
so much for the latter, and a bit confusing if we set this enum when 
initializing the *metric*. I'd consider having this be metric-centric and 
documenting the measurements instead. e.g.

```
enum MetricType {
  /// A number whose value is meaningful, and may vary over time.
  /// Each measurement replaces the current value.
  Value,

  /// An aggregate number whose rate of change over time is meaningful.
  /// Each measurement is an increment for the counter.
  Counter,

  /// A distribution of values with a meaningful mean and count.
  /// Each measured value is a sample for the distribution.
  /// The distribution is assumed not to vary, samples are aggregated over time.
  Distribution,
};
```

================
Comment at: clang-tools-extra/clangd/Trace.h:42
+  };
+  /// Uniquely identifies the metric.
+  const llvm::StringLiteral Name;
----------------
An example to show the naming scheme?
I'd suggest just "method_latency" or so for most things - we can add "foo/bar" 
or "foo.bar" for hierarchy if needed, but I don't think a leading slash or a 
"clangd" in the path add anything. 

================
Comment at: clang-tools-extra/clangd/Trace.h:46
+  /// of a cache access.
+  std::vector<std::string> Labels;
+  double Value = 0;
----------------
(Not clear whether this is the allowed set of labels for a metric, bag of tags 
for a measurement, tuple of coordinates for a measurement...)

What are the design goals around labels? Can we make them as simple as possible 
(or remove them) for now?

Take the example of cache hit/miss.
There are at least two ways to model this without labels:
 - `cache.hit` and `cache.miss` counters
 - `cache_hit` distribution: measure 1 for hit, 0 for miss. Mean is the hit 
rate, count is the number of attempts.

The latter seems to capture the structure pretty well, but doesn't generalize 
(what if there are three categories?). The former seems almost equivalent to 
labels though. 

Benefits of labels:
 - part of the measurement, therefore can be dynamic (like filenames)
 - explicit that it's meaningful to aggregate across labels (naming conventions 
are OK for this too for one dimension)
 - generalizes sensibly to multiple dimensions (schema/docs become important)
 - quoting/escaping is annoying for labels with funny characters (e.g. file 
paths)
 - can bridge to other systems without knowledge of metric structure/semantics 
(this seems like a bad idea)

The first benefit seems pretty strong actually. I would suggest only supporting 
a single label though - it's a way simpler interface, we avoid complicated 
metrics whose schema might need documentation and enforcement,  and we avoid 
any hint of memory allocation in the common case (just passing around a 
stringref that points to a literal, a filename we own, etc).

================
Comment at: clang-tools-extra/clangd/Trace.h:48
+  double Value = 0;
+  Type Ty;
+
----------------
nit: since you have an unscoped enum, give the good name to the member and the 
bad name to the enum (which users need rarely spell)

================
Comment at: clang-tools-extra/clangd/Trace.h:57
+/// Creates a context that will report its duration as a metric on destruction.
+Context latencyTrackingContext(llvm::StringRef OpName);
+
----------------
Events we measure the duration of seem so closely related to trace spans that I 
don't think we can provide both without making it clear how they should relate 
to each other.

How horrible do you think it would be to have a second `Metric* Measure = 
nullptr` in the `Span` constructor?

It's hopelessly coupling together these two things that could otherwise be 
split, but...

Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D78429/new/

https://reviews.llvm.org/D78429

_______________________________________________
cfe-commits mailing list
[email protected]
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits

[PATCH] D78429: [clangd] Metric tracking through Tracer

Reply via email to