ilya-biryukov added a comment.

Hi Jan,

Sure! And sorry for posting these metrics for a while (other patches mentioned 
them too) without a proper explanation.

We simulate a bunch of completions at random points in random files from our 
internal codebase.  We assume the desired completion item is the one written in 
the code.
Intuitively, the higher it is ranked, the better. To measure this, we compute 
the following metrics:

- MRR <https://en.wikipedia.org/wiki/Mean_reciprocal_rank>
- `Top-N` - percentage of completions where the desired item is among the 
first `N` items.
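
As a rough illustration of the two metrics above, here is a minimal sketch. It is not the code our evaluation actually runs; the function names and the convention that rank `0` means "desired item not returned" (contributing `0` to MRR) are assumptions made for this example.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Ranks holds, for each simulated completion, the 1-based position of the
// desired item in the result list. Assumption of this sketch: 0 means the
// item was not returned at all and contributes nothing to either metric.
double meanReciprocalRank(const std::vector<std::size_t> &Ranks) {
  assert(!Ranks.empty() && "need at least one simulated completion");
  double Sum = 0.0;
  for (std::size_t Rank : Ranks)
    if (Rank != 0)
      Sum += 1.0 / static_cast<double>(Rank);
  return Sum / static_cast<double>(Ranks.size());
}

// Percentage of completions where the desired item is among the first N.
double topNPercent(const std::vector<std::size_t> &Ranks, std::size_t N) {
  assert(!Ranks.empty() && "need at least one simulated completion");
  std::size_t Hits = 0;
  for (std::size_t Rank : Ranks)
    if (Rank != 0 && Rank <= N)
      ++Hits;
  return 100.0 * static_cast<double>(Hits) /
         static_cast<double>(Ranks.size());
}
```

For example, with ranks `{1, 2, 4, 0}` this gives an MRR of `0.4375` and a `Top-3` of `50%`.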

We also independently calculate those metrics for interesting groups of 
completions:

- `OVERALL`. All completions.
- `INITIALISMS`. Completions where the query (what the user typed) matches the 
first characters of each segment in the desired completion item, e.g. `SI` or 
`SIC` for `SomeInterestingClass`.
- `EXPLICIT_MEMBER_ACCESS`. Desired completion item is a class member and the 
completion is in a member access expression, e.g. `vector().^push_back()`.
- `WANT_LOCAL`. Desired completion item is in the same file as the completion 
itself.
- `CROSS_NAMESPACE`. Simulated completion removes the namespace prefix, in 
addition to the identifier, e.g. we expect to complete `std::vector` not just 
`vector`.
- `WITH_EXPECTED_TYPE`. Only completions in a context where an expected type is 
available, e.g. `int* a = ^`.

For each of the picked positions in a file, we try to complete a prefix of the 
desired completion item of length up to `5` and the full identifier (except 
initialisms, more on them below).
E.g. for the following source code:

  int test() {
    std::vector<int> vec;
    vec.^push_back(10); // say, simulation runs here
  }

We would run the simulation for the following completions: `vec.^`, `vec.p^`, 
`vec.pu^`, `vec.pus^`, `vec.push^` and `vec.push_^`.
You can see the breakdown of the metrics for each of the prefix lengths in each 
of the completion groups.
Individual metrics for a fixed length of the prefix are written in the `Filter 
length 0-5` sections.
We also try completion with the full identifier (e.g. `vec.push_back^`), 
metrics for those are written in the `Full identifiers` section.
Aggregated metrics for all completions in a group are written in the `All 
measurements` section.
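
To make the prefix scheme concrete, here is a small sketch of how the list of queries for one position could be generated. The function name `simulatedQueries` is made up for this example, and skipping a prefix that already equals the whole identifier (so a short name is not tried twice) is an assumption, not necessarily what our tooling does.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Returns the queries typed before the cursor: prefixes of length
// 0..MaxPrefixLen, followed by the full identifier.
std::vector<std::string> simulatedQueries(const std::string &Identifier,
                                          std::size_t MaxPrefixLen = 5) {
  std::vector<std::string> Queries;
  std::size_t Limit = std::min(MaxPrefixLen, Identifier.size());
  for (std::size_t Len = 0; Len <= Limit; ++Len) {
    // Assumption: a prefix equal to the whole identifier is reported only
    // once, as the "full identifier" query added below.
    if (Len == Identifier.size())
      break;
    Queries.push_back(Identifier.substr(0, Len));
  }
  Queries.push_back(Identifier);
  // Invariant: the full identifier is always the last query.
  assert(Queries.back() == Identifier);
  return Queries;
}
```

For `push_back` this yields the empty query, `p`, `pu`, `pus`, `push`, `push_` and finally `push_back`, matching the completions listed above.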

The "initialisms" group is special: for those we use the first characters of 
the segments inside the desired completion item rather than the prefix, e.g. 
`vec.p^`, `vec.pb^`.
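
A sketch of how such initialism queries could be derived is below. The segmentation rule (a segment starts at the beginning of the name, after an underscore, or at an uppercase letter that follows a lowercase one) and both function names are assumptions made for illustration; the real segmentation may differ.

```cpp
#include <cassert>
#include <cctype>
#include <cstddef>
#include <string>
#include <vector>

// Collects the first character of every segment of Identifier.
// Assumed segmentation: start of name, after '_', or a camelCase hump.
std::string initialism(const std::string &Identifier) {
  std::string Result;
  bool AtSegmentStart = true;
  char Prev = '\0';
  for (char C : Identifier) {
    if (C == '_') {
      AtSegmentStart = true;
      Prev = C;
      continue;
    }
    bool Hump = std::isupper(static_cast<unsigned char>(C)) &&
                std::islower(static_cast<unsigned char>(Prev));
    if (AtSegmentStart || Hump)
      Result.push_back(C);
    AtSegmentStart = false;
    Prev = C;
  }
  // An initialism can never be longer than the name it was derived from.
  assert(Result.size() <= Identifier.size());
  return Result;
}

// Non-empty prefixes of the initialism, e.g. "p" and "pb" for "push_back".
std::vector<std::string> initialismQueries(const std::string &Identifier) {
  std::string Full = initialism(Identifier);
  std::vector<std::string> Queries;
  for (std::size_t Len = 1; Len <= Full.size(); ++Len)
    Queries.push_back(Full.substr(0, Len));
  return Queries;
}
```

With this rule, `push_back` gives `p` and `pb`, and `SomeInterestingClass` gives `S`, `SI` and `SIC`.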


Repository:
  rL LLVM

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D59300/new/

https://reviews.llvm.org/D59300


