dgoldman marked 2 inline comments as done. dgoldman added inline comments.
================ Comment at: clang-tools-extra/clangd/FindSymbols.cpp:664 + } + // NextPragma is after us but before the next symbol. Our reign is over. + break; ---------------- kadircet wrote: > I suppose this is not correct in cases like. > > ``` > #pragma mark - Outer > foo { > #pragma mark - Foo > } > bar { > # pragma mark - Bar > } > baz { > # pragma mark - Baz > } > ``` > > We'll first move `foo` under `Outer`, then start processing `bar` we notice > that It comes after pragma `Foo`, and then also notice that pragma `Foo` is > not contained in `bar` hence break. but all of `foo,bar,baz` should've been > moved to be under `Outer` instead. Yeah that's a good point, we'd have to do something like your suggested solution or maintain more state to consider the children that were moved. ================ Comment at: clang-tools-extra/clangd/FindSymbols.cpp:535 +/// by range. +std::vector<DocumentSymbol> mergePragmas(std::vector<DocumentSymbol> &Syms, + std::vector<PragmaMarkSymbol> &Pragmas, ---------------- kadircet wrote: > dgoldman wrote: > > kadircet wrote: > > > dgoldman wrote: > > > > kadircet wrote: > > > > > dgoldman wrote: > > > > > > kadircet wrote: > > > > > > > dgoldman wrote: > > > > > > > > sammccall wrote: > > > > > > > > > FWIW the flow control/how we make progress seem hard to > > > > > > > > > follow here to me. > > > > > > > > > > > > > > > > > > In particular I think I'm struggling with the statefulness of > > > > > > > > > "is there an open mark group". > > > > > > > > > > > > > > > > > > Possible simplifications: > > > > > > > > > - define a dummy root symbol, which seems clearer than the > > > > > > > > > vector<symbols> + range > > > > > > > > > - avoid reverse-sorting the list of pragma symbols, and just > > > > > > > > > consume from the front of an ArrayRef instead > > > > > > > > > - make the outer loop over pragmas, rather than symbols. It > > > > > > > > > would first check if the pragma belongs directly here or not, > > > > > > > > > and if so, loop over symbols to work out which should become > > > > > > > > > children. This seems very likely to be efficient enough in > > > > > > > > > practice (few pragmas, or most children are grouped into > > > > > > > > > pragmas) > > > > > > > > > define a dummy root symbol, which seems clearer than the > > > > > > > > > vector<symbols> + range > > > > > > > > > > > > > > > > I guess? Then we'd take in a `DocumentSymbol & and a > > > > > > > > ArrayRef<PragmaMarkSymbol> & (or just by value and then return > > > > > > > > it as well). The rest would be the same though > > > > > > > > > > > > > > > > > In particular I think I'm struggling with the statefulness of > > > > > > > > > "is there an open mark group". > > > > > > > > > > > > > > > > We need to track the current open group if there is one in > > > > > > > > order to move children to it. > > > > > > > > > > > > > > > > > make the outer loop over pragmas, rather than symbols. It > > > > > > > > > would first check if the pragma belongs directly here or not, > > > > > > > > > and if so, loop over symbols to work out which should become > > > > > > > > > children. This seems very likely to be efficient enough in > > > > > > > > > practice (few pragmas, or most children are grouped into > > > > > > > > > pragmas) > > > > > > > > > > > > > > > > The important thing here is knowing where the pragma mark ends > > > > > > > > - if it doesn't, it actually gets all of the children. So we'd > > > > > > > > have to peak at the next pragma mark, add all symbols before it > > > > > > > > to us as children, and then potentially recurse to nest it > > > > > > > > inside of a symbol. I'll try it out and see if it's simpler. > > > > > > > > > > > > > > > > > > > > > > > ``` > > > > > > > while(Pragmas) { > > > > > > > // We'll figure out where the Pragmas.front() should go. > > > > > > > Pragma P = Pragmas.front(); > > > > > > > DocumentSymbol *Cur = Root; > > > > > > > while(Cur->contains(P)) { > > > > > > > auto *OldCur = Cur; > > > > > > > for(auto *C : Cur->children) { > > > > > > > // We assume at most 1 child can contain the pragma (as > > > > > > > pragmas are on a single line, and children have disjoint ranges) > > > > > > > if (C->contains(P)) { > > > > > > > Cur = C; > > > > > > > break; > > > > > > > } > > > > > > > } > > > > > > > // Cur is immediate parent of P > > > > > > > if (OldCur == Cur) { > > > > > > > // Just insert P into children if it is not a group and we > > > > > > > are done. > > > > > > > // Otherwise we need to figure out when current pragma is > > > > > > > terminated: > > > > > > > // if next pragma is not contained in Cur, or is contained in one > > > > > > > of the children, It is at the end of Cur, nest all the children > > > > > > > that appear after P under the symbol node for P. > > > > > > > // Otherwise nest all the children that appear after P but before > > > > > > > next pragma under the symbol node for P. > > > > > > > // Pop Pragmas and break > > > > > > > } > > > > > > > } > > > > > > > } > > > > > > > ``` > > > > > > > > > > > > > > Does that make sense, i hope i am not missing something obvious? > > > > > > > Complexity-wise in the worst case we'll go all the way down to a > > > > > > > leaf once per pragma, since there will only be a handful of > > > > > > > pragmas most of the time it shouldn't be too bad. > > > > > > I've implemented your suggestion. I don't think it's simpler, but > > > > > > LMK, maybe it can be improved. > > > > > oops, i was looking into an older revision and missed mergepragmas2, > > > > > i think it looks quite similar to this one but we can probably get > > > > > rid of the recursion as well and simplify a couple more cases > > > > This makes sense, I think that works for the most part besides > > > > dropping the recursion, specifically for > > > > > > > > ``` > > > > // Next pragma is contained in the Sym, it belongs there and > > > > doesn't > > > > // affect us at all. > > > > if (Sym.range.contains(NextPragma.DocSym.range)) { > > > > Sym.children = mergePragmas2(Sym.children, Pragmas, Sym.range); > > > > continue; > > > > } > > > > ``` > > > > > > > > I guess we could explicitly forbid 3+ layers of nesting and handle it > > > > inline there? But I'm not sure it's worth the effort to rewrite all of > > > > this - the recursion shouldn't be deep and we avoid needing to shift > > > > vector elements over by recreating a new one. > > > Sorry I don't follow why we can't get rid of the recursion in this case. > > > > > > Two loop solution I described above literally tries to find the document > > > symbol node, such that the current pragma is contained in that node && > > > current pragma isn't contained in any of that node's children. Afterwards > > > it inserts the pragma into that node and starts traversing the tree from > > > root again for the next pragma. > > > > > > Again I don't follow where the `3+ layers of nesting` constraint came > > > from. But I do feel like the iterative version is somewhat easier to > > > reason about (especially keeping track of what's happening with > > > `pragmas.front()` and the way it bails out via `parentrange` check). > > > Shifting of the vector is definitely unfortunate but I think it shouldn't > > > imply big performance hits in practice as we are only shifting the > > > children of a single node. > > Yeah, I understand the first part, I think specifically handling the group > > case after you discover where it needs to be inserted is a bit more > > complicated, something like the following: > > > > > > ``` > > // Pragma is a group, so we need to figure out where it terminates: > > // - If the next Pragma is not contained in Cur, P owns all of its > > // parent's children which occur after P. > > // - If the next pragma is contained in Cur but actually belongs to > > one > > // of the parent's children, we temporarily skip over it and look > > at > > // the next pragma to decide where we end. > > // - Otherwise nest all of its parent's children which occur after > > P but > > // before the next pragma. > > ``` > > > > And yeah, shifting in the worst case is definitely worst (due to repeat > > shifting) although it shouldn't be too common in practice (things like a > > large @implementation block would probably have the most children). > > we temporarily skip over it and look at the next pragma to decide where we > > end. > > Ah you are right I was missing this part. As you suggested we'll need to loop > over all the remaining pragmas, which will make the complexity `((quadratic > in terms of # of pragmas) * (depth of document symbol tree))` I think it is > still acceptable in practice. I don't think there'll be more than a dozen > pragmas in a file. (moreover i think this problem exists with the current > implementation you've provided, see my comment below) > > Are you fine with reshaping the implementation in that way then? Because I > still feel like the code will end up looking a lot more straightforward, even > if not the most optimal and we can try to optimize the rest if it shows up in > the latency graphs. Implementing this now - there's one other edge case here, in that we don't want to treat `#pragma mark Blah` as a group. I guess I can maintain state similar to the first solution to solve that. Repository: rG LLVM Github Monorepo CHANGES SINCE LAST ACTION https://reviews.llvm.org/D105904/new/ https://reviews.llvm.org/D105904 _______________________________________________ cfe-commits mailing list cfe-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits