Author: Michael Park
Date: 2025-12-15T23:33:22-08:00
New Revision: 1928c1ea9b57e9c44325d436bc7bb2f4585031f3

URL: 
https://github.com/llvm/llvm-project/commit/1928c1ea9b57e9c44325d436bc7bb2f4585031f3
DIFF: 
https://github.com/llvm/llvm-project/commit/1928c1ea9b57e9c44325d436bc7bb2f4585031f3.diff

LOG: [C++20][Modules] Improve namespace look-up performance for modules. 
(#171769)

## Problem

Given code such as `N::foo();`, we perform name look-up on `N`. In the
case where `N` is a namespace declared in imported modules, one
namespace decl (the "key declaration") for each module that declares a
namespace `foo` is loaded and stored. In large scales where there are
many such modules, (e.g., 1,500) and many uses (e.g., 500,000), this
becomes extremely inefficient because every look-up (500,000 of them)
return 1,500 results.

The following synthetic script demonstrates the problem:
```bash
#/usr/bin/env bash

CLANG=${CLANG:-clang++}
NUM_MODULES=${NUM_MODULES:-1500}
NUM_USES=${NUM_USES:-500000}
USE_MODULES=${USE_MODULES:-true}

TMPDIR=$(mktemp -d)
echo "Working in temp directory: $TMPDIR"
cd $TMPDIR

trap "rm -rf \"$TMPDIR\"" EXIT

echo "namespace N { inline void foo() {} }" > m1.h
for i in $(seq 2 $NUM_MODULES); do echo "namespace N {}" > m${i}.h; done

if $USE_MODULES; then
  seq 1 $NUM_MODULES | xargs -I {} -P $(nproc) bash -c "$CLANG -std=c++20 
-fmodule-header m{}.h"
fi

> a.cpp
if $USE_MODULES; then
  for i in $(seq 1 $NUM_MODULES); do echo "import \"m${i}.h\";" >> a.cpp; done
else
  for i in $(seq 1 $NUM_MODULES); do echo "#include \"m${i}.h\"" >> a.cpp; done
fi

echo "int main() {" >> a.cpp
for i in $(seq 1 $NUM_USES); do echo "  N::foo();" >> a.cpp; done
echo "}" >> a.cpp

if $USE_MODULES; then
  time $CLANG -std=c++20 -Wno-experimental-header-units -c a.cpp -o /dev/null \
      $(for i in $(seq 1 $NUM_MODULES); do echo -n "-fmodule-file=m${i}.pcm "; 
done)
else
  time $CLANG -std=c++20 -Wno-experimental-header-units -c a.cpp -o /dev/null
fi
```

As of 575d6892bcc5cef926cfc1b95225148262c96a15, without modules
(`USE_MODULES=false`) this takes about **4.5s**, whereas with modules
(`USE_MODULES=true`), this takes about **37s**.

With this PR, without modules there's no change (as expected) at 4.5s,
but with modules it improves to about **5.2s**.

## Approach

The approach taken here aims to maintain status-quo with respect to the
input and output of modules. That is, the `ASTReader` and `ASTWriter`
both read and write the same declarations as it did before. The
difference is in the middle part: the [`StoredDeclsMap` in
`DeclContext`](https://github.com/llvm/llvm-project/blob/release/21.x/clang/include/clang/AST/DeclBase.h#L2024-L2030).
The `StoredDeclsMap` is roughly a `map<DeclarationName,
StoredDeclsList>`. Currently, we read all of the external namespace
decls from `ASTReader`, they all get stored into the `StoredDeclsList`,
and the `ASTWriter` iterates through that list and writes out the
results.

This PR continues to read all of the external namespace decls from
`ASTReader`, but only stores one namespace decl in the
`StoredDeclsList`. This is okay since the reading of the decls handles
all of the merging and chaining of the namespace decls, and as long as
they're loaded and chained, returning one for look-up purposes is
sufficient.

The other half of the problem is to write out all of the external
namespaces that we used to store in `StoredDeclsList` but no longer. For
this, we take advantage of the
[`KeyDecls`](https://github.com/llvm/llvm-project/blob/release/21.x/clang/include/clang/Serialization/ASTReader.h#L1342-L1347)
data structure in `ASTReader`. `KeyDecls` is roughly a `map<Decl *,
vector<GlobalDeclID>>`, and it stores a mapping from the canonical decl
of a redeclarable decl to a list of `GlobalDeclID`s where each ID
represents a "key declaration" from each imported module. More to the
point, if we read external namespaces `N1`, `N2`, `N3` in `ASTReader`,
we'll either have `N1` mapped to `[N2, N3]`, or some newly local
canonical decl mapped to `[N1, N2, N3]`. Either way, we can visit `N1`,
`N2`, and `N3` by doing `ASTReader::forEachImportedKeyDecls(N1,
Visitor)`, and we leverage this to maintain the current behavior of
writing out all of the imported namespace decls in `ASTWriter`.

## Alternatives Attempted

- Tried reading fewer declarations on the `ASTReader` side, and writing
out fewer declarations on the `ASTWriter` side, and neither options
worked at all.
- Tried trying to split `StoredDeclsList` into two pieces, one with
non-namespace decls and one with only namespace decls, but that didn't
work well... I think because the order of the declarations matter
sometimes, and maybe also because the declaration replacement logic gets
more complicated.
- Tried to deduplicate at the `SemaLookup` level. Basically, retrieve
all the stored decls but deduplicate populating the `LookupResult`
[here](https://github.com/llvm/llvm-project/blob/release/21.x/clang/lib/Sema/SemaLookup.cpp#L1137-L1144).
This did improve things slightly, but not quite enough, and this
solution seemed cleaner in the end anyway.

Added: 
    clang/unittests/Serialization/NamespaceLookupTest.cpp

Modified: 
    clang/lib/Serialization/ASTReader.cpp
    clang/lib/Serialization/ASTWriter.cpp
    clang/unittests/Serialization/CMakeLists.txt

Removed: 
    


################################################################################
diff  --git a/clang/lib/Serialization/ASTReader.cpp 
b/clang/lib/Serialization/ASTReader.cpp
index aec61322fb8be..e8600d1e914cd 100644
--- a/clang/lib/Serialization/ASTReader.cpp
+++ b/clang/lib/Serialization/ASTReader.cpp
@@ -555,7 +555,25 @@ namespace {
 
 using MacroDefinitionsMap =
     llvm::StringMap<std::pair<StringRef, bool /*IsUndef*/>>;
-using DeclsMap = llvm::DenseMap<DeclarationName, SmallVector<NamedDecl *, 8>>;
+
+class DeclsSet {
+  SmallVector<NamedDecl *, 64> Decls;
+  llvm::SmallPtrSet<NamedDecl *, 8> Found;
+
+public:
+  operator ArrayRef<NamedDecl *>() const { return Decls; }
+
+  bool empty() const { return Decls.empty(); }
+
+  bool insert(NamedDecl *ND) {
+    auto [_, Inserted] = Found.insert(ND);
+    if (Inserted)
+      Decls.push_back(ND);
+    return Inserted;
+  }
+};
+
+using DeclsMap = llvm::DenseMap<DeclarationName, DeclsSet>;
 
 } // namespace
 
@@ -8702,14 +8720,23 @@ bool ASTReader::FindExternalVisibleDeclsByName(const 
DeclContext *DC,
     return false;
 
   // Load the list of declarations.
-  SmallVector<NamedDecl *, 64> Decls;
-  llvm::SmallPtrSet<NamedDecl *, 8> Found;
+  DeclsSet DS;
 
   auto Find = [&, this](auto &&Table, auto &&Key) {
     for (GlobalDeclID ID : Table.find(Key)) {
       NamedDecl *ND = cast<NamedDecl>(GetDecl(ID));
-      if (ND->getDeclName() == Name && Found.insert(ND).second)
-        Decls.push_back(ND);
+      if (ND->getDeclName() != Name)
+        continue;
+      // Special case for namespaces: There can be a lot of redeclarations of
+      // some namespaces, and we import a "key declaration" per imported 
module.
+      // Since all declarations of a namespace are essentially interchangeable,
+      // we can optimize namespace look-up by only storing the key declaration
+      // of the current TU, rather than storing N key declarations where N is
+      // the # of imported modules that declare that namespace.
+      // TODO: Try to generalize this optimization to other redeclarable decls.
+      if (isa<NamespaceDecl>(ND))
+        ND = cast<NamedDecl>(getKeyDeclaration(ND));
+      DS.insert(ND);
     }
   };
 
@@ -8744,8 +8771,8 @@ bool ASTReader::FindExternalVisibleDeclsByName(const 
DeclContext *DC,
     Find(It->second.Table, Name);
   }
 
-  SetExternalVisibleDeclsForName(DC, Name, Decls);
-  return !Decls.empty();
+  SetExternalVisibleDeclsForName(DC, Name, DS);
+  return !DS.empty();
 }
 
 void ASTReader::completeVisibleDeclsMap(const DeclContext *DC) {
@@ -8763,7 +8790,16 @@ void ASTReader::completeVisibleDeclsMap(const 
DeclContext *DC) {
 
     for (GlobalDeclID ID : It->second.Table.findAll()) {
       NamedDecl *ND = cast<NamedDecl>(GetDecl(ID));
-      Decls[ND->getDeclName()].push_back(ND);
+      // Special case for namespaces: There can be a lot of redeclarations of
+      // some namespaces, and we import a "key declaration" per imported 
module.
+      // Since all declarations of a namespace are essentially interchangeable,
+      // we can optimize namespace look-up by only storing the key declaration
+      // of the current TU, rather than storing N key declarations where N is
+      // the # of imported modules that declare that namespace.
+      // TODO: Try to generalize this optimization to other redeclarable decls.
+      if (isa<NamespaceDecl>(ND))
+        ND = cast<NamedDecl>(getKeyDeclaration(ND));
+      Decls[ND->getDeclName()].insert(ND);
     }
 
     // FIXME: Why a PCH test is failing if we remove the iterator after 
findAll?
@@ -8773,9 +8809,9 @@ void ASTReader::completeVisibleDeclsMap(const DeclContext 
*DC) {
   findAll(ModuleLocalLookups, NumModuleLocalVisibleDeclContexts);
   findAll(TULocalLookups, NumTULocalVisibleDeclContexts);
 
-  for (DeclsMap::iterator I = Decls.begin(), E = Decls.end(); I != E; ++I) {
-    SetExternalVisibleDeclsForName(DC, I->first, I->second);
-  }
+  for (auto &[Name, DS] : Decls)
+    SetExternalVisibleDeclsForName(DC, Name, DS);
+
   const_cast<DeclContext *>(DC)->setHasExternalVisibleStorage(false);
 }
 

diff  --git a/clang/lib/Serialization/ASTWriter.cpp 
b/clang/lib/Serialization/ASTWriter.cpp
index dc7f93e0483e4..899fd69c2045e 100644
--- a/clang/lib/Serialization/ASTWriter.cpp
+++ b/clang/lib/Serialization/ASTWriter.cpp
@@ -4396,20 +4396,20 @@ class ASTDeclContextNameLookupTrait
 
   template <typename Coll> data_type getData(const Coll &Decls) {
     unsigned Start = DeclIDs.size();
-    for (NamedDecl *D : Decls) {
+    auto AddDecl = [this](NamedDecl *D) {
       NamedDecl *DeclForLocalLookup =
           getDeclForLocalLookup(Writer.getLangOpts(), D);
 
       if (Writer.getDoneWritingDeclsAndTypes() &&
           !Writer.wasDeclEmitted(DeclForLocalLookup))
-        continue;
+        return;
 
       // Try to avoid writing internal decls to reduced BMI.
       // See comments in ASTWriter::WriteDeclContextLexicalBlock for details.
       if (Writer.isGeneratingReducedBMI() &&
           !DeclForLocalLookup->isFromExplicitGlobalModule() &&
           IsInternalDeclFromFileContext(DeclForLocalLookup))
-        continue;
+        return;
 
       auto ID = Writer.GetDeclRef(DeclForLocalLookup);
 
@@ -4423,7 +4423,7 @@ class ASTDeclContextNameLookupTrait
             ModuleLocalDeclsMap.insert({Key, DeclIDsTy{ID}});
           else
             Iter->second.push_back(ID);
-          continue;
+          return;
         }
         break;
       case LookupVisibility::TULocal: {
@@ -4432,7 +4432,7 @@ class ASTDeclContextNameLookupTrait
           TULocalDeclsMap.insert({D->getDeclName(), DeclIDsTy{ID}});
         else
           Iter->second.push_back(ID);
-        continue;
+        return;
       }
       case LookupVisibility::GenerallyVisibile:
         // Generally visible decls go into the general lookup table.
@@ -4440,6 +4440,24 @@ class ASTDeclContextNameLookupTrait
       }
 
       DeclIDs.push_back(ID);
+    };
+    for (NamedDecl *D : Decls) {
+      if (ASTReader *Chain = Writer.getChain();
+          Chain && isa<NamespaceDecl>(D) && D->isFromASTFile() &&
+          D == Chain->getKeyDeclaration(D)) {
+        // In ASTReader, we stored only the key declaration of a namespace decl
+        // for this TU rather than storing all of the key declarations from 
each
+        // imported module. If we have an external namespace decl, this is that
+        // key declaration and we need to re-expand it to write out all of the
+        // key declarations from each imported module again.
+        //
+        // See comment 'ASTReader::FindExternalVisibleDeclsByName' for details.
+        Chain->forEachImportedKeyDecl(D, [&AddDecl](const Decl *D) {
+          AddDecl(cast<NamedDecl>(const_cast<Decl *>(D)));
+        });
+      } else {
+        AddDecl(D);
+      }
     }
     return std::make_pair(Start, DeclIDs.size());
   }

diff  --git a/clang/unittests/Serialization/CMakeLists.txt 
b/clang/unittests/Serialization/CMakeLists.txt
index 6782e6b4d7330..a5cc1ed83af49 100644
--- a/clang/unittests/Serialization/CMakeLists.txt
+++ b/clang/unittests/Serialization/CMakeLists.txt
@@ -2,6 +2,7 @@ add_clang_unittest(SerializationTests
   ForceCheckFileInputTest.cpp
   InMemoryModuleCacheTest.cpp
   ModuleCacheTest.cpp
+  NamespaceLookupTest.cpp
   NoCommentsTest.cpp
   PreambleInNamedModulesTest.cpp
   LoadSpecLazilyTest.cpp

diff  --git a/clang/unittests/Serialization/NamespaceLookupTest.cpp 
b/clang/unittests/Serialization/NamespaceLookupTest.cpp
new file mode 100644
index 0000000000000..eefa4be9fbee5
--- /dev/null
+++ b/clang/unittests/Serialization/NamespaceLookupTest.cpp
@@ -0,0 +1,247 @@
+//== unittests/Serialization/NamespaceLookupOptimizationTest.cpp =======//
+//
+// Part of the LLVM Project, under the Apache License v2.0 with LLVM 
Exceptions.
+// See https://llvm.org/LICENSE.txt for license information.
+// SPDX-License-Identifier: Apache-2.0 WITH LLVM-exception
+//
+//===----------------------------------------------------------------------===//
+
+#include "clang/Driver/CreateInvocationFromArgs.h"
+#include "clang/Frontend/CompilerInstance.h"
+#include "clang/Frontend/FrontendAction.h"
+#include "clang/Frontend/FrontendActions.h"
+#include "clang/Parse/ParseAST.h"
+#include "clang/Serialization/ASTReader.h"
+#include "clang/Tooling/Tooling.h"
+#include "gtest/gtest.h"
+
+using namespace llvm;
+using namespace clang;
+using namespace clang::tooling;
+
+namespace {
+
+class NamespaceLookupTest : public ::testing::Test {
+  void SetUp() override {
+    ASSERT_FALSE(
+        sys::fs::createUniqueDirectory("namespace-lookup-test", TestDir));
+  }
+
+  void TearDown() override { sys::fs::remove_directories(TestDir); }
+
+public:
+  SmallString<256> TestDir;
+
+  void addFile(StringRef Path, StringRef Contents) {
+    ASSERT_FALSE(sys::path::is_absolute(Path));
+
+    SmallString<256> AbsPath(TestDir);
+    sys::path::append(AbsPath, Path);
+
+    ASSERT_FALSE(
+        sys::fs::create_directories(llvm::sys::path::parent_path(AbsPath)));
+
+    std::error_code EC;
+    llvm::raw_fd_ostream OS(AbsPath, EC);
+    ASSERT_FALSE(EC);
+    OS << Contents;
+  }
+
+  std::string GenerateModuleInterface(StringRef ModuleName,
+                                      StringRef Contents) {
+    std::string FileName = llvm::Twine(ModuleName + ".cppm").str();
+    addFile(FileName, Contents);
+
+    IntrusiveRefCntPtr<llvm::vfs::FileSystem> VFS =
+        llvm::vfs::createPhysicalFileSystem();
+    DiagnosticOptions DiagOpts;
+    IntrusiveRefCntPtr<DiagnosticsEngine> Diags =
+        CompilerInstance::createDiagnostics(*VFS, DiagOpts);
+    CreateInvocationOptions CIOpts;
+    CIOpts.Diags = Diags;
+    CIOpts.VFS = VFS;
+
+    std::string CacheBMIPath =
+        llvm::Twine(TestDir + "/" + ModuleName + ".pcm").str();
+    std::string PrebuiltModulePath =
+        "-fprebuilt-module-path=" + TestDir.str().str();
+    const char *Args[] = {"clang++",
+                          "-std=c++20",
+                          "--precompile",
+                          PrebuiltModulePath.c_str(),
+                          "-working-directory",
+                          TestDir.c_str(),
+                          "-I",
+                          TestDir.c_str(),
+                          FileName.c_str(),
+                          "-o",
+                          CacheBMIPath.c_str()};
+    std::shared_ptr<CompilerInvocation> Invocation =
+        createInvocation(Args, CIOpts);
+    EXPECT_TRUE(Invocation);
+
+    CompilerInstance Instance(std::move(Invocation));
+    Instance.setDiagnostics(Diags);
+    Instance.getFrontendOpts().OutputFile = CacheBMIPath;
+    // Avoid memory leaks.
+    Instance.getFrontendOpts().DisableFree = false;
+    GenerateModuleInterfaceAction Action;
+    EXPECT_TRUE(Instance.ExecuteAction(Action));
+    EXPECT_FALSE(Diags->hasErrorOccurred());
+
+    return CacheBMIPath;
+  }
+};
+
+struct NamespaceLookupResult {
+  int NumLocalNamespaces = 0;
+  int NumExternalNamespaces = 0;
+};
+
+class NamespaceLookupConsumer : public ASTConsumer {
+  NamespaceLookupResult &Result;
+
+public:
+  explicit NamespaceLookupConsumer(NamespaceLookupResult &Result)
+      : Result(Result) {}
+
+  void HandleTranslationUnit(ASTContext &Context) override {
+    TranslationUnitDecl *TU = Context.getTranslationUnitDecl();
+    ASSERT_TRUE(TU);
+    ASTReader *Chain = 
dyn_cast_or_null<ASTReader>(Context.getExternalSource());
+    ASSERT_TRUE(Chain);
+    for (const Decl *D :
+         TU->lookup(DeclarationName(&Context.Idents.get("N")))) {
+      if (!isa<NamespaceDecl>(D))
+        continue;
+      if (!D->isFromASTFile()) {
+        ++Result.NumLocalNamespaces;
+      } else {
+        ++Result.NumExternalNamespaces;
+        EXPECT_EQ(D, Chain->getKeyDeclaration(D));
+      }
+    }
+  }
+};
+
+class NamespaceLookupAction : public ASTFrontendAction {
+  NamespaceLookupResult &Result;
+
+public:
+  explicit NamespaceLookupAction(NamespaceLookupResult &Result)
+      : Result(Result) {}
+
+  std::unique_ptr<ASTConsumer>
+  CreateASTConsumer(CompilerInstance &CI, StringRef /*Unused*/) override {
+    return std::make_unique<NamespaceLookupConsumer>(Result);
+  }
+};
+
+TEST_F(NamespaceLookupTest, ExternalNamespacesOnly) {
+  GenerateModuleInterface("M1", R"cpp(
+export module M1;
+namespace N {}
+  )cpp");
+  GenerateModuleInterface("M2", R"cpp(
+export module M2;
+namespace N {}
+  )cpp");
+  GenerateModuleInterface("M3", R"cpp(
+export module M3;
+namespace N {}
+  )cpp");
+  const char *test_file_contents = R"cpp(
+import M1;
+import M2;
+import M3;
+  )cpp";
+  std::string DepArg = "-fprebuilt-module-path=" + TestDir.str().str();
+  NamespaceLookupResult Result;
+  EXPECT_TRUE(runToolOnCodeWithArgs(
+      std::make_unique<NamespaceLookupAction>(Result), test_file_contents,
+      {
+          "-std=c++20",
+          DepArg.c_str(),
+          "-I",
+          TestDir.c_str(),
+      },
+      "main.cpp"));
+
+  EXPECT_EQ(0, Result.NumLocalNamespaces);
+  EXPECT_EQ(1, Result.NumExternalNamespaces);
+}
+
+TEST_F(NamespaceLookupTest, ExternalReplacedByLocal) {
+  GenerateModuleInterface("M1", R"cpp(
+export module M1;
+namespace N {}
+  )cpp");
+  GenerateModuleInterface("M2", R"cpp(
+export module M2;
+namespace N {}
+  )cpp");
+  GenerateModuleInterface("M3", R"cpp(
+export module M3;
+namespace N {}
+  )cpp");
+  const char *test_file_contents = R"cpp(
+import M1;
+import M2;
+import M3;
+
+namespace N {}
+  )cpp";
+  std::string DepArg = "-fprebuilt-module-path=" + TestDir.str().str();
+  NamespaceLookupResult Result;
+  EXPECT_TRUE(runToolOnCodeWithArgs(
+      std::make_unique<NamespaceLookupAction>(Result), test_file_contents,
+      {
+          "-std=c++20",
+          DepArg.c_str(),
+          "-I",
+          TestDir.c_str(),
+      },
+      "main.cpp"));
+
+  EXPECT_EQ(1, Result.NumLocalNamespaces);
+  EXPECT_EQ(0, Result.NumExternalNamespaces);
+}
+
+TEST_F(NamespaceLookupTest, LocalAndExternalInterleaved) {
+  GenerateModuleInterface("M1", R"cpp(
+export module M1;
+namespace N {}
+  )cpp");
+  GenerateModuleInterface("M2", R"cpp(
+export module M2;
+namespace N {}
+  )cpp");
+  GenerateModuleInterface("M3", R"cpp(
+export module M3;
+namespace N {}
+  )cpp");
+  const char *test_file_contents = R"cpp(
+import M1;
+
+namespace N {}
+
+import M2;
+import M3;
+  )cpp";
+  std::string DepArg = "-fprebuilt-module-path=" + TestDir.str().str();
+  NamespaceLookupResult Result;
+  EXPECT_TRUE(runToolOnCodeWithArgs(
+      std::make_unique<NamespaceLookupAction>(Result), test_file_contents,
+      {
+          "-std=c++20",
+          DepArg.c_str(),
+          "-I",
+          TestDir.c_str(),
+      },
+      "main.cpp"));
+
+  EXPECT_EQ(1, Result.NumLocalNamespaces);
+  EXPECT_EQ(1, Result.NumExternalNamespaces);
+}
+
+} // namespace


        
_______________________________________________
cfe-commits mailing list
[email protected]
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits

Reply via email to