sammccall added inline comments.

================
Comment at: clang-tools-extra/clangd/FindSymbols.cpp:44
 
+// Returns true if \p Query can be found as a sub-scope inside \p Scope.
+bool approximateScopeMatch(llvm::StringRef Scope,
----------------
kadircet wrote:
> sammccall wrote:
> > I had a little trouble following this...
> > It seems a little simpler (fewer vars to track) if we avoid the up-front 
> > split on scopes.
> > 
> > ```
> > assert(Scope.empty() || Scope.endswith("::")); // or handle in some way
> > // Walk through Scope, consuming matching tokens from Query.
> > StringRef First;
> > while (!Scope.empty() && !Query.empty()) {
> >   tie(First, Scope) = Scope.split("::");
> >   if (Query.front() == First)
> >     Query = Query.drop_front();
> > }
> > return Query.empty(); // all components of query matched
> > ```
> > 
> > in fact we can avoid preprocessing query too:
> > 
> > ```
> > // Query is just a StringRef
> > assert(Scope.empty() || Scope.endswith("::")); // or handle in some way
> > assert(Query.empty() || Query.endswith("::"));
> > 
> > // Walk through Scope, consuming matching tokens from Query.
> > StringRef First;
> > while (!Scope.empty() && !Query.empty()) {
> >   tie(First, Scope) = Scope.split("::");
> >   Query.consume_front(StringRef(First.data(), First.size() + 2) /*Including 
> > ::*/);
> > }
> > return Query.empty(); // all components of query matched
> > ```
> Yes but they would do different things. I believe the confusion is caused by 
> usage of `sub-scope` without a clear definition.  The codes you've suggested 
> are performing sub-sequence matches rather than sub-string(i.e. we are 
> looking for a contigious segment in `Scope` that matches `Query`).
> 
> I believe a query of the form `a::c::` shouldn't be matched by `a::b::c::`. I 
> can try simplifying the logic, but it would be nice to agree on the behaviour 
> :D.
> 
> Sorry if I miscommunicated this so far.
Ah right, I was indeed misreading the code. Let's have some definitions...

given query `a::b::Foo` with scope `a::b::`

| | a::b:: | `W::a::b::` | `a::X::b::` | `a::b::Y` |
| exact | * | | | |
| prefix |*| | | * |
| suffix | *|* | | |
| substring | * | * | | * |
| subsequence | * | * | * | * |

These support correcting different types of "errors":
 - exact: none
 - prefix: may omit namespaces immediately before Foo
 - suffix: query may be rooted anywhere (other than global ns)
 - substring: query rooted anywhere, omit namespaces before Foo
 - subsequence: may omit any component

We know "exact" is too strict.

I think "prefix" and by extension "substring" aren't particularly compelling 
rules as the "immediately before Foo" requirement is arbitrary.
Why does `a::b::Foo` match `a::b::c::Foo` and not `a::c::b::Foo`? In each case 
we've omitted a namespace inside the query, the only difference is what it's 
sandwiched between.

Suffix makes intuitive sense, it accepts strings that make sense *somewhere* in 
the codebase.
Subsequence makes intuitive sense too: you're allowed to forget uninteresting 
components, similar to how fuzzy-match lets you omit uninteresting words.

I'd prefer one of those and don't really mind which - I'd assumed subsequence 
was intended. Suffix is way easier to implement though :-)


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D88814/new/

https://reviews.llvm.org/D88814

_______________________________________________
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits

Reply via email to