[PATCH] D105904: [clangd] Support `#pragma mark` in the outline

David Goldman via Phabricator via cfe-commits Wed, 04 Aug 2021 09:50:16 -0700

dgoldman marked 2 inline comments as done.
dgoldman added inline comments.


================
Comment at: clang-tools-extra/clangd/FindSymbols.cpp:664
+      }
+      // NextPragma is after us but before the next symbol. Our reign is over.
+      break;
----------------
kadircet wrote:
> I suppose this is not correct in cases like.
> 
> ```
> #pragma mark - Outer
> foo {
>   #pragma mark - Foo
> }
> bar {
>   # pragma mark - Bar
> }
> baz {
>   # pragma mark - Baz
> }
> ```
> 
> We'll first move `foo` under `Outer`, then start processing `bar` we notice 
> that It comes after pragma `Foo`, and then also notice that pragma `Foo` is 
> not contained in `bar` hence break. but all of `foo,bar,baz` should've been 
> moved to be under `Outer` instead.
Yeah that's a good point, we'd have to do something like your suggested 
solution or maintain more state to consider the children that were moved.


================
Comment at: clang-tools-extra/clangd/FindSymbols.cpp:535
+/// by range.
+std::vector<DocumentSymbol> mergePragmas(std::vector<DocumentSymbol> &Syms,
+                                         std::vector<PragmaMarkSymbol> 
&Pragmas,
----------------
kadircet wrote:
> dgoldman wrote:
> > kadircet wrote:
> > > dgoldman wrote:
> > > > kadircet wrote:
> > > > > dgoldman wrote:
> > > > > > kadircet wrote:
> > > > > > > dgoldman wrote:
> > > > > > > > sammccall wrote:
> > > > > > > > > FWIW the flow control/how we make progress seem hard to 
> > > > > > > > > follow here to me.
> > > > > > > > > 
> > > > > > > > > In particular I think I'm struggling with the statefulness of 
> > > > > > > > > "is there an open mark group".
> > > > > > > > > 
> > > > > > > > > Possible simplifications:
> > > > > > > > >  - define a dummy root symbol, which seems clearer than the 
> > > > > > > > > vector<symbols> + range
> > > > > > > > >  - avoid reverse-sorting the list of pragma symbols, and just 
> > > > > > > > > consume from the front of an ArrayRef instead
> > > > > > > > >  - make the outer loop over pragmas, rather than symbols. It 
> > > > > > > > > would first check if the pragma belongs directly here or not, 
> > > > > > > > > and if so, loop over symbols to work out which should become 
> > > > > > > > > children. This seems very likely to be efficient enough in 
> > > > > > > > > practice (few pragmas, or most children are grouped into 
> > > > > > > > > pragmas)
> > > > > > > > > define a dummy root symbol, which seems clearer than the 
> > > > > > > > > vector<symbols> + range
> > > > > > > > 
> > > > > > > > I guess? Then we'd take in a `DocumentSymbol & and a 
> > > > > > > > ArrayRef<PragmaMarkSymbol> & (or just by value and then return 
> > > > > > > > it as well). The rest would be the same though
> > > > > > > > 
> > > > > > > > > In particular I think I'm struggling with the statefulness of 
> > > > > > > > > "is there an open mark group".
> > > > > > > > 
> > > > > > > > We need to track the current open group if there is one in 
> > > > > > > > order to move children to it.
> > > > > > > > 
> > > > > > > > > make the outer loop over pragmas, rather than symbols. It 
> > > > > > > > > would first check if the pragma belongs directly here or not, 
> > > > > > > > > and if so, loop over symbols to work out which should become 
> > > > > > > > > children. This seems very likely to be efficient enough in 
> > > > > > > > > practice (few pragmas, or most children are grouped into 
> > > > > > > > > pragmas)
> > > > > > > > 
> > > > > > > > The important thing here is knowing where the pragma mark ends 
> > > > > > > > - if it doesn't, it actually gets all of the children. So we'd 
> > > > > > > > have to peak at the next pragma mark, add all symbols before it 
> > > > > > > > to us as children, and then potentially recurse to nest it 
> > > > > > > > inside of a symbol. I'll try it out and see if it's simpler.
> > > > > > > > 
> > > > > > > > 
> > > > > > > ```
> > > > > > > while(Pragmas) {
> > > > > > > // We'll figure out where the Pragmas.front() should go.
> > > > > > > Pragma P = Pragmas.front();
> > > > > > > DocumentSymbol *Cur = Root;
> > > > > > > while(Cur->contains(P)) {
> > > > > > >   auto *OldCur = Cur;
> > > > > > >   for(auto *C : Cur->children) {
> > > > > > >      // We assume at most 1 child can contain the pragma (as 
> > > > > > > pragmas are on a single line, and children have disjoint ranges)
> > > > > > >      if (C->contains(P)) {
> > > > > > >          Cur = C;
> > > > > > >          break;
> > > > > > >      }
> > > > > > >   }
> > > > > > >   // Cur is immediate parent of P
> > > > > > >   if (OldCur == Cur) {
> > > > > > >     // Just insert P into children if it is not a group and we 
> > > > > > > are done.
> > > > > > >     // Otherwise we need to figure out when current pragma is 
> > > > > > > terminated:
> > > > > > > // if next pragma is not contained in Cur, or is contained in one 
> > > > > > > of the children, It is at the end of Cur, nest all the children 
> > > > > > > that appear after P under the symbol node for P.
> > > > > > > // Otherwise nest all the children that appear after P but before 
> > > > > > > next pragma under the symbol node for P.
> > > > > > > // Pop Pragmas and break
> > > > > > >   }
> > > > > > > }
> > > > > > > }
> > > > > > > ```
> > > > > > > 
> > > > > > > Does that make sense, i hope i am not missing something obvious? 
> > > > > > > Complexity-wise in the worst case we'll go all the way down to a 
> > > > > > > leaf once per pragma, since there will only be a handful of 
> > > > > > > pragmas most of the time it shouldn't be too bad.
> > > > > > I've implemented your suggestion. I don't think it's simpler, but 
> > > > > > LMK, maybe it can be improved.
> > > > > oops, i was looking into an older revision and missed mergepragmas2, 
> > > > > i think it looks quite similar to this one but we can probably get 
> > > > > rid of the recursion as well and simplify a couple more cases
> > > > This makes sense,  I think that works for the most part besides 
> > > > dropping the recursion, specifically for
> > > > 
> > > > ```
> > > >       // Next pragma is contained in the Sym, it belongs there and 
> > > > doesn't
> > > >       // affect us at all.
> > > >       if (Sym.range.contains(NextPragma.DocSym.range)) {
> > > >         Sym.children = mergePragmas2(Sym.children, Pragmas, Sym.range);
> > > >         continue;
> > > >       }
> > > > ```
> > > > 
> > > > I guess we could explicitly forbid 3+ layers of nesting and handle it 
> > > > inline there? But I'm not sure it's worth the effort to rewrite all of 
> > > > this - the recursion shouldn't be deep and we avoid needing to shift 
> > > > vector elements over by recreating a new one.
> > > Sorry I don't follow why we can't get rid of the recursion in this case.
> > > 
> > > Two loop solution I described above literally tries to find the document 
> > > symbol node, such that the current pragma is contained in that node && 
> > > current pragma isn't contained in any of that node's children. Afterwards 
> > > it inserts the pragma into that node and starts traversing the tree from 
> > > root again for the next pragma.
> > > 
> > > Again I don't follow where the `3+ layers of nesting` constraint came 
> > > from. But I do feel like the iterative version is somewhat easier to 
> > > reason about (especially keeping track of what's happening with 
> > > `pragmas.front()` and the way it bails out via `parentrange` check). 
> > > Shifting of the vector is definitely unfortunate but I think it shouldn't 
> > > imply big performance hits in practice as we are only shifting the 
> > > children of a single node.
> > Yeah, I understand the first part,  I think specifically handling the group 
> > case after you discover where it needs to be inserted is a bit more 
> > complicated, something like the following:
> > 
> > 
> > ```
> >         // Pragma is a group, so we need to figure out where it terminates:
> >         // - If the next Pragma is not contained in Cur, P owns all of its
> >         //   parent's children which occur after P.
> >         // - If the next pragma is contained in Cur but actually belongs to 
> > one
> >         //   of the parent's children, we temporarily skip over it and look 
> > at
> >         //   the next pragma to decide where we end.
> >         // - Otherwise nest all of its parent's children which occur after 
> > P but
> >         //   before the next pragma.
> > ```
> > 
> > And yeah, shifting in the worst case is definitely worst (due to repeat 
> > shifting) although it shouldn't be too common in practice (things like a 
> > large @implementation block would probably have the most children).
> > we temporarily skip over it and look at the next pragma to decide where we 
> > end.
> 
> Ah you are right I was missing this part. As you suggested we'll need to loop 
> over all the remaining pragmas, which will make the complexity `((quadratic 
> in terms of # of pragmas) * (depth of document symbol tree))` I think it is 
> still acceptable in practice. I don't think there'll be more than a dozen 
> pragmas in a file. (moreover i think this problem exists with the current 
> implementation you've provided, see my comment below)
> 
> Are you fine with reshaping the implementation in that way then? Because I 
> still feel like the code will end up looking a lot more straightforward, even 
> if not the most optimal and we can try to optimize the rest if it shows up in 
> the latency graphs.
Implementing this now - there's one other edge case here, in that we don't want 
to treat `#pragma mark Blah` as a group. I guess I can maintain state similar 
to the first solution to solve that.


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D105904/new/

https://reviews.llvm.org/D105904

_______________________________________________
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits

[PATCH] D105904: [clangd] Support `#pragma mark` in the outline

Reply via email to