sammccall added a comment.
Thanks for tackling this! Mostly API stuff, obviously :-)
================
Comment at: clang-tools-extra/clangd/Trace.h:35
+/// itself to active tracer on destruction.
+struct Metric {
+ enum Type {
----------------
Conceptually there's a difference between a metric (the thing being measured,
such as "latency distribution by method") and the measurement (such as "code
completion took 90ms this time").
The attributes such as name, type, permitted labels are associated with the
metric - these are schema. Whereas the measurement is a tuple `(metric, actual
label, value)` - it doesn't make sense to have measurements for the same metric
with different types. And the metric is immutable while the measurement is
transient.
For the API (to emit data) it seems natural to model this split using constexpr
objects for the metrics, and methods for the measurement:
```
constexpr Metric IncomingLatencyMS("/clangd/latency", Distribution);
...
void handleCodeCompletion() {
...
IncomingLatencyMS.record(90);
}
```
(IOW I think our favorite metrics framework got this right)
For the SPI (to consume data), things are a bit muddier. For I think we want to
avoid having a protocol to explicitly register metrics, so there's an
attraction to just reporting the flat `(name, type, label, value)` tuple. I
also quite like the idea of reusing the type above and reporting this as
`(const Metric&, label, value)` which makes extending Metric more natural.
================
Comment at: clang-tools-extra/clangd/Trace.h:36
+struct Metric {
+ enum Type {
+ /// Occurence of an event, e.g. cache access.
----------------
I think we're missing an important type: a value directly provided by the
measurement, such as memory usage.
================
Comment at: clang-tools-extra/clangd/Trace.h:36
+struct Metric {
+ enum Type {
+ /// Occurence of an event, e.g. cache access.
----------------
sammccall wrote:
> I think we're missing an important type: a value directly provided by the
> measurement, such as memory usage.
This enum is about defining the relationship between the metric and the
measurements, but also about the (related) nature of the metric itself.
The current names are measurement-centric, which is good for the former but not
so much for the latter, and a bit confusing if we set this enum when
initializing the *metric*. I'd consider having this be metric-centric and
documenting the measurements instead. e.g.
```
enum MetricType {
/// A number whose value is meaningful, and may vary over time.
/// Each measurement replaces the current value.
Value,
/// An aggregate number whose rate of change over time is meaningful.
/// Each measurement is an increment for the counter.
Counter,
/// A distribution of values with a meaningful mean and count.
/// Each measured value is a sample for the distribution.
/// The distribution is assumed not to vary, samples are aggregated over time.
Distribution,
};
```
================
Comment at: clang-tools-extra/clangd/Trace.h:42
+ };
+ /// Uniquely identifies the metric.
+ const llvm::StringLiteral Name;
----------------
An example to show the naming scheme?
I'd suggest just "method_latency" or so for most things - we can add "foo/bar"
or "foo.bar" for hierarchy if needed, but I don't think a leading slash or a
"clangd" in the path add anything.
================
Comment at: clang-tools-extra/clangd/Trace.h:46
+ /// of a cache access.
+ std::vector<std::string> Labels;
+ double Value = 0;
----------------
(Not clear whether this is the allowed set of labels for a metric, bag of tags
for a measurement, tuple of coordinates for a measurement...)
What are the design goals around labels? Can we make them as simple as possible
(or remove them) for now?
Take the example of cache hit/miss.
There are at least two ways to model this without labels:
- `cache.hit` and `cache.miss` counters
- `cache_hit` distribution: measure 1 for hit, 0 for miss. Mean is the hit
rate, count is the number of attempts.
The latter seems to capture the structure pretty well, but doesn't generalize
(what if there are three categories?). The former seems almost equivalent to
labels though.
Benefits of labels:
- part of the measurement, therefore can be dynamic (like filenames)
- explicit that it's meaningful to aggregate across labels (naming conventions
are OK for this too for one dimension)
- generalizes sensibly to multiple dimensions (schema/docs become important)
- quoting/escaping is annoying for labels with funny characters (e.g. file
paths)
- can bridge to other systems without knowledge of metric structure/semantics
(this seems like a bad idea)
The first benefit seems pretty strong actually. I would suggest only supporting
a single label though - it's a way simpler interface, we avoid complicated
metrics whose schema might need documentation and enforcement, and we avoid
any hint of memory allocation in the common case (just passing around a
stringref that points to a literal, a filename we own, etc).
================
Comment at: clang-tools-extra/clangd/Trace.h:48
+ double Value = 0;
+ Type Ty;
+
----------------
nit: since you have an unscoped enum, give the good name to the member and the
bad name to the enum (which users need rarely spell)
================
Comment at: clang-tools-extra/clangd/Trace.h:57
+/// Creates a context that will report its duration as a metric on destruction.
+Context latencyTrackingContext(llvm::StringRef OpName);
+
----------------
Events we measure the duration of seem so closely related to trace spans that I
don't think we can provide both without making it clear how they should relate
to each other.
How horrible do you think it would be to have a second `Metric* Measure =
nullptr` in the `Span` constructor?
It's hopelessly coupling together these two things that could otherwise be
split, but...
Repository:
rG LLVM Github Monorepo
CHANGES SINCE LAST ACTION
https://reviews.llvm.org/D78429/new/
https://reviews.llvm.org/D78429
_______________________________________________
cfe-commits mailing list
[email protected]
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits