nikic added a comment. Herald added a subscriber: danielkiss. New compile-time numbers: https://llvm-compile-time-tracker.com/compare.php?from=d7c119d89c5f6d0789cfd0a139c80e23912c0bb0&to=e0a1a6cac1b982023f8ceba8285d1ee7bc96bd32&stat=instructions The regression is now reduced to 0.2%. I assume that's the overhead of those CallbackVHs.
I have no familiarity with BFI, so possibly stupid question: There is already some similar handling as part of BFIImpl here: https://github.com/llvm/llvm-project/blob/0f14b2e6cbb54c84ed3b00b0db521f5ce2d1e3f2/llvm/include/llvm/Analysis/BlockFrequencyInfoImpl.h#L1043-L1058 What is the difference to that / why are both needed? ================ Comment at: llvm/include/llvm/Analysis/BlockFrequencyInfo.h:43 + class BFICallbackVH final : public CallbackVH { + std::weak_ptr<ImplType> BFI; + const BasicBlock *BB; ---------------- Is the shared_ptr/weak_ptr usage here necessary? This type of map deletion CallbackVH is a common pattern, but this is the first time I see it with weak_ptr. ================ Comment at: llvm/include/llvm/Analysis/BlockFrequencyInfo.h:44 + std::weak_ptr<ImplType> BFI; + const BasicBlock *BB; + ---------------- It should not be necessary to explicitly store `BB` here, it is already part of the value handle (you can access it via `*this` for example). ================ Comment at: llvm/include/llvm/Analysis/BlockFrequencyInfo.h:48 + BFICallbackVH(std::shared_ptr<ImplType> BFI, const BasicBlock *BB); + virtual ~BFICallbackVH() = default; + ---------------- I don't believe this virtual dtor is needed (class is final). For that matter, the method below also don't need to marked virtual. ================ Comment at: llvm/lib/Analysis/BlockFrequencyInfo.cpp:212 + for (const auto *BB : RegisteredBlocks) + BlockWatchers.emplace_back(BFICallbackVH(BFI, BB)); + ---------------- May want to clear BlockWatchers in `BlockFrequencyInfo::releaseMemory()`? ================ Comment at: llvm/lib/Transforms/Scalar/LoopDistribute.cpp:1062 + LoopStandardAnalysisResults AR = {AA, AC, DT, LI, SE, + TLI, TTI, nullptr, nullptr}; return LAM.getResult<LoopAccessAnalysis>(L, AR); ---------------- Huh, surprised that clang-format allows this. CHANGES SINCE LAST ACTION https://reviews.llvm.org/D86156/new/ https://reviews.llvm.org/D86156 _______________________________________________ cfe-commits mailing list cfe-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits