sammccall accepted this revision. sammccall added a comment. This revision is now accepted and ready to land.
Nice find! ================ Comment at: clang-tools-extra/clangd/index/dex/Iterator.cpp:91 /// sync. void sync() { ReachedEnd |= Children.front()->reachedEnd(); ---------------- I stared at this for a while and this change makes me think about the whole algo differently. Effectively, we're just iterating through the items C of front trying to find a valid sync point. With the slight optimization that when some other child skips over C to X, we catch up the front to X. This leads to two possible simplifications (strictly, unrelated to this patch and totally up to you): 1. The initial setup and outer loop has unclear bounds - it's basically a loop over the items of Children.front() but that's not clear. Consider rather: ``` auto &Front = *Children.front(); // smallest while ((ReachedEnd |= Front->reachedEnd())) { auto SyncID = Front.peek(); bool ValidSync = true; for (auto &Child : Children { ... if (Child->peek() > SyncID) { Front->advanceTo(Child->peek()); // XXX ValidSync = false; break; } ... } if (ValidSync) return; } ``` This seems basically equivalent and I find the flow control easier to reason about. 2. Is the "slight optimization" important? i.e. is `Front->advanceTo(Child->peek())` actually a significant win over `Front->advance()`? I think the flow control is even clearer in this version: ``` for (auto &Front = *Children.front() ;(ReachedEnd |= Front->reachedEnd()); Front.advance()) { auto SyncID = Front.peek(); bool ValidSync = true; for (auto &Child : Children { ... if (Child->peek() > SyncID) { ValidSync = false; break; } ... } if (ValidSync) return; } ``` Because now the outer loop is *entirely* just a loop over the items of Front. Intuitively the average number that advanceTo() advances by here is between 1 and 2 (we know it's at least 1, but the remaining items are mostly from the longer iterator so we probably don't skip over any from the shorter iterator). And advanceTo is more complicated, if advanceTo turns out to be 2x as expensive as advance in practice, this isn't a win. So this is maybe worth checking. ================ Comment at: clang-tools-extra/clangd/index/dex/Iterator.cpp:112 NeedsAdvance = true; + // Reset and make sure advanceTo happens much less frequently on + // large posting lists. This accounts for 45-60% performance boost. ---------------- First, say why: Now we try again with a new sync point. We always syncing with the first child as this is the cheapest (smallest). We only call advanceTo on the large posting lists once we have a fairly likely sync point. Repository: rG LLVM Github Monorepo CHANGES SINCE LAST ACTION https://reviews.llvm.org/D106528/new/ https://reviews.llvm.org/D106528 _______________________________________________ cfe-commits mailing list cfe-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits