yurai007 created this revision. yurai007 added reviewers: nikic, xbolva00, aeubanks, ChuanqiXu, v.g.vassilev, serge-sans-paille, rsmith. Herald added subscribers: dexonsmith, pengfei, hiraditya. yurai007 requested review of this revision. Herald added projects: clang, LLVM. Herald added subscribers: llvm-commits, cfe-commits.
While building huge code bases it's not uncommon to see perf reports with following FoldingSet items: 1.56% 0.47% clang clang-14 [.] llvm::FoldingSetBase::FindNodeOrInsertPos 0.30% 0.01% clang clang-14 [.] llvm::ContextualFoldingSet<clang::FunctionProtoType, clang::ASTContext&>::NodeEquals 0.25% 0.02% clang clang-14 [.] llvm::FoldingSetBase::InsertNode 0.23% 0.12% clang clang-14 [.] llvm::FoldingSetBase::GrowBucketCount 0.22% 0.21% clang clang-14 [.] llvm::FoldingSetNodeID::AddPointer 0.47% 0.06% clang clang-14 [.] llvm::FoldingSetBase::InsertNode or 1.12% 0.75% clang++ libLLVM-13.so [.] llvm::FoldingSetBase::GrowBucketCount 0.49% 0.48% clang++ libLLVM-13.so [.] llvm::FoldingSetNodeID::AddPointer 0.41% 0.09% clang++ libLLVM-13.so [.] llvm::FoldingSetNodeID::operator== etc. Among many FoldingSet users most notable seem to be ASTContext and CodeGenTypes. The reasons that we spend not-so-tiny amount of time in FoldingSet calls from there, are following: 1. Default FoldingSet capacity for 2^6 items very often is not enough. For PointerTypes/ElaboratedTypes/ParenTypes it's not unlikely to observe growing it to 256 or 512 items. FunctionProtoTypes can easily exceed 1k items capacity growing up to 4k or even 8k size. 2. FoldingSetBase::GrowBucketCount cost itself is not very bad (pure reallocations are rather cheap thanks to BumpPtrAllocator) What matters is high collision rate when lot of items end up in same bucket slowing down FoldingSetBase::FindNodeOrInsertPos and trashing CPU cache (as items with same hash are organized in intrusive linked list which need to be traversed). 3. Lack of AddInteger/AddPointer and computeHash inlining slows down NodeEquals/Profile/:operator== calls. Inlining makes FunctionProtoTypes/PointerTypes/ElaboratedTypes/ParenTypes Profile functions faster but since NodeEquals is still called indirectly through function pointer from FindNodeOrInsertPos there is room for further inlining improvements. After addressing above issues I built Linux (with default config) on isolated CPU cores in silent x86-64 Linux environment. Compile time statistics diff produced by perf before and after change are following: instructions -0.4%, cycles -0.9% size-text change of output Clang binary is below +0.1%. Similarly like in: https://reviews.llvm.org/D118169 for code bases containing smaller translation units it's expected to get less significant speedup with this patch. Repository: rG LLVM Github Monorepo https://reviews.llvm.org/D118385 Files: clang/include/clang/AST/ASTContext.h clang/lib/AST/ASTContext.cpp clang/lib/CodeGen/CodeGenTypes.h llvm/include/llvm/ADT/FoldingSet.h llvm/lib/Support/FoldingSet.cpp
Index: llvm/lib/Support/FoldingSet.cpp =================================================================== --- llvm/lib/Support/FoldingSet.cpp +++ llvm/lib/Support/FoldingSet.cpp @@ -12,7 +12,6 @@ //===----------------------------------------------------------------------===// #include "llvm/ADT/FoldingSet.h" -#include "llvm/ADT/Hashing.h" #include "llvm/ADT/StringRef.h" #include "llvm/Support/Allocator.h" #include "llvm/Support/ErrorHandling.h" @@ -25,12 +24,6 @@ //===----------------------------------------------------------------------===// // FoldingSetNodeIDRef Implementation -/// ComputeHash - Compute a strong hash value for this FoldingSetNodeIDRef, -/// used to lookup the node in the FoldingSetBase. -unsigned FoldingSetNodeIDRef::ComputeHash() const { - return static_cast<unsigned>(hash_combine_range(Data, Data+Size)); -} - bool FoldingSetNodeIDRef::operator==(FoldingSetNodeIDRef RHS) const { if (Size != RHS.Size) return false; return memcmp(Data, RHS.Data, Size*sizeof(*Data)) == 0; @@ -49,41 +42,6 @@ /// Add* - Add various data types to Bit data. /// -void FoldingSetNodeID::AddPointer(const void *Ptr) { - // Note: this adds pointers to the hash using sizes and endianness that - // depend on the host. It doesn't matter, however, because hashing on - // pointer values is inherently unstable. Nothing should depend on the - // ordering of nodes in the folding set. - static_assert(sizeof(uintptr_t) <= sizeof(unsigned long long), - "unexpected pointer size"); - AddInteger(reinterpret_cast<uintptr_t>(Ptr)); -} -void FoldingSetNodeID::AddInteger(signed I) { - Bits.push_back(I); -} -void FoldingSetNodeID::AddInteger(unsigned I) { - Bits.push_back(I); -} -void FoldingSetNodeID::AddInteger(long I) { - AddInteger((unsigned long)I); -} -void FoldingSetNodeID::AddInteger(unsigned long I) { - if (sizeof(long) == sizeof(int)) - AddInteger(unsigned(I)); - else if (sizeof(long) == sizeof(long long)) { - AddInteger((unsigned long long)I); - } else { - llvm_unreachable("unexpected sizeof(long)"); - } -} -void FoldingSetNodeID::AddInteger(long long I) { - AddInteger((unsigned long long)I); -} -void FoldingSetNodeID::AddInteger(unsigned long long I) { - AddInteger(unsigned(I)); - AddInteger(unsigned(I >> 32)); -} - void FoldingSetNodeID::AddString(StringRef String) { unsigned Size = String.size(); @@ -145,12 +103,6 @@ Bits.append(ID.Bits.begin(), ID.Bits.end()); } -/// ComputeHash - Compute a strong hash value for this FoldingSetNodeID, used to -/// lookup the node in the FoldingSetBase. -unsigned FoldingSetNodeID::ComputeHash() const { - return FoldingSetNodeIDRef(Bits.data(), Bits.size()).ComputeHash(); -} - /// operator== - Used to compare two nodes to each other. /// bool FoldingSetNodeID::operator==(const FoldingSetNodeID &RHS) const { Index: llvm/include/llvm/ADT/FoldingSet.h =================================================================== --- llvm/include/llvm/ADT/FoldingSet.h +++ llvm/include/llvm/ADT/FoldingSet.h @@ -15,6 +15,7 @@ #ifndef LLVM_ADT_FOLDINGSET_H #define LLVM_ADT_FOLDINGSET_H +#include "llvm/ADT/Hashing.h" #include "llvm/ADT/SmallVector.h" #include "llvm/ADT/iterator.h" #include "llvm/Support/Allocator.h" @@ -292,7 +293,9 @@ /// ComputeHash - Compute a strong hash value for this FoldingSetNodeIDRef, /// used to lookup the node in the FoldingSetBase. - unsigned ComputeHash() const; + unsigned ComputeHash() const { + return static_cast<unsigned>(hash_combine_range(Data, Data + Size)); + } bool operator==(FoldingSetNodeIDRef) const; @@ -322,13 +325,33 @@ : Bits(Ref.getData(), Ref.getData() + Ref.getSize()) {} /// Add* - Add various data types to Bit data. - void AddPointer(const void *Ptr); - void AddInteger(signed I); - void AddInteger(unsigned I); - void AddInteger(long I); - void AddInteger(unsigned long I); - void AddInteger(long long I); - void AddInteger(unsigned long long I); + void AddPointer(const void *Ptr) { + // Note: this adds pointers to the hash using sizes and endianness that + // depend on the host. It doesn't matter, however, because hashing on + // pointer values is inherently unstable. Nothing should depend on the + // ordering of nodes in the folding set. + static_assert(sizeof(uintptr_t) <= sizeof(unsigned long long), + "unexpected pointer size"); + AddInteger(reinterpret_cast<uintptr_t>(Ptr)); + } + void AddInteger(signed I) { Bits.push_back(I); } + void AddInteger(unsigned I) { Bits.push_back(I); } + void AddInteger(long I) { AddInteger((unsigned long)I); } + void AddInteger(unsigned long I) { + if (sizeof(long) == sizeof(int)) + AddInteger(unsigned(I)); + else if (sizeof(long) == sizeof(long long)) { + AddInteger((unsigned long long)I); + } else { + llvm_unreachable("unexpected sizeof(long)"); + } + } + void AddInteger(long long I) { AddInteger((unsigned long long)I); } + void AddInteger(unsigned long long I) { + AddInteger(unsigned(I)); + AddInteger(unsigned(I >> 32)); + } + void AddBoolean(bool B) { AddInteger(B ? 1U : 0U); } void AddString(StringRef String); void AddNodeID(const FoldingSetNodeID &ID); @@ -342,7 +365,9 @@ /// ComputeHash - Compute a strong hash value for this FoldingSetNodeID, used /// to lookup the node in the FoldingSetBase. - unsigned ComputeHash() const; + unsigned ComputeHash() const { + return FoldingSetNodeIDRef(Bits.data(), Bits.size()).ComputeHash(); + } /// operator== - Used to compare two nodes to each other. bool operator==(const FoldingSetNodeID &RHS) const; Index: clang/lib/CodeGen/CodeGenTypes.h =================================================================== --- clang/lib/CodeGen/CodeGenTypes.h +++ clang/lib/CodeGen/CodeGenTypes.h @@ -76,7 +76,7 @@ llvm::DenseMap<const Type*, llvm::StructType *> RecordDeclTypes; /// Hold memoized CGFunctionInfo results. - llvm::FoldingSet<CGFunctionInfo> FunctionInfos; + llvm::FoldingSet<CGFunctionInfo> FunctionInfos{9}; /// This set keeps track of records that we're currently converting /// to an IR type. For example, when converting: Index: clang/lib/AST/ASTContext.cpp =================================================================== --- clang/lib/AST/ASTContext.cpp +++ clang/lib/AST/ASTContext.cpp @@ -973,7 +973,7 @@ ASTContext::ASTContext(LangOptions &LOpts, SourceManager &SM, IdentifierTable &idents, SelectorTable &sels, Builtin::Context &builtins, TranslationUnitKind TUKind) - : ConstantArrayTypes(this_()), FunctionProtoTypes(this_()), + : ConstantArrayTypes(this_(), 8), FunctionProtoTypes(this_(), 12), TemplateSpecializationTypes(this_()), DependentTemplateSpecializationTypes(this_()), AutoTypes(this_()), SubstTemplateTemplateParmPacks(this_()), Index: clang/include/clang/AST/ASTContext.h =================================================================== --- clang/include/clang/AST/ASTContext.h +++ clang/include/clang/AST/ASTContext.h @@ -211,7 +211,7 @@ mutable SmallVector<Type *, 0> Types; mutable llvm::FoldingSet<ExtQuals> ExtQualNodes; mutable llvm::FoldingSet<ComplexType> ComplexTypes; - mutable llvm::FoldingSet<PointerType> PointerTypes; + mutable llvm::FoldingSet<PointerType> PointerTypes{9}; mutable llvm::FoldingSet<AdjustedType> AdjustedTypes; mutable llvm::FoldingSet<BlockPointerType> BlockPointerTypes; mutable llvm::FoldingSet<LValueReferenceType> LValueReferenceTypes; @@ -243,9 +243,9 @@ SubstTemplateTypeParmPackTypes; mutable llvm::ContextualFoldingSet<TemplateSpecializationType, ASTContext&> TemplateSpecializationTypes; - mutable llvm::FoldingSet<ParenType> ParenTypes; + mutable llvm::FoldingSet<ParenType> ParenTypes{9}; mutable llvm::FoldingSet<UsingType> UsingTypes; - mutable llvm::FoldingSet<ElaboratedType> ElaboratedTypes; + mutable llvm::FoldingSet<ElaboratedType> ElaboratedTypes{9}; mutable llvm::FoldingSet<DependentNameType> DependentNameTypes; mutable llvm::ContextualFoldingSet<DependentTemplateSpecializationType, ASTContext&>
_______________________________________________ cfe-commits mailing list cfe-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits