https://github.com/Bigcheese created 
https://github.com/llvm/llvm-project/pull/132853

Instead of eagerly populating the `clang::ModuleMap` when looking up a module 
by name, this patch changes `HeaderSearch` to only load the modules that are 
actually used.

This introduces `ModuleMap::findOrLoadModule`, which loads modules from module 
maps that have been parsed but not yet loaded. It must not be called from 
anywhere that the module loading code itself calls into, as that can create 
infinite recursion.
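
For readers who want the shape of the idea without reading the diff, here is a 
minimal, self-contained sketch of "parse eagerly, load lazily by name". It is 
not the patch's code: `LazyModuleTable`, `ParsedDecl`, `LoadedModule`, 
`recordParsed`, and `loadCount` are hypothetical names made up for 
illustration, and the real `findOrLoadModule` builds `clang::Module` objects 
through `ModuleMapParser` rather than copying header names.

```cpp
#include <cassert>
#include <map>
#include <string>
#include <utility>
#include <vector>

// Hypothetical stand-in for a parsed (but not loaded) module declaration.
struct ParsedDecl {
  std::string Name;   // top-level module name
  std::string Header; // a single header, for illustration only
};

// Hypothetical stand-in for a fully loaded module.
struct LoadedModule {
  std::string Name;
  std::vector<std::string> Headers;
};

class LazyModuleTable {
  // Declarations seen while parsing, keyed by top-level module name.
  std::map<std::string, std::vector<ParsedDecl>> Parsed;
  // Modules that have actually been materialized.
  std::map<std::string, LoadedModule> Loaded;
  unsigned LoadCount = 0;

public:
  // Cheap step: remember the declaration without building a module.
  void recordParsed(const ParsedDecl &D) { Parsed[D.Name].push_back(D); }

  // Pure lookup: never triggers loading.
  const LoadedModule *find(const std::string &Name) const {
    auto It = Loaded.find(Name);
    return It == Loaded.end() ? nullptr : &It->second;
  }

  // Lookup that materializes the module on first use.
  const LoadedModule *findOrLoad(const std::string &Name) {
    if (const LoadedModule *M = find(Name))
      return M;
    auto It = Parsed.find(Name);
    if (It == Parsed.end())
      return nullptr; // never parsed, so nothing to load
    assert(!It->second.empty() && "recorded names always have a declaration");
    LoadedModule M{Name, {}};
    for (const ParsedDecl &D : It->second)
      M.Headers.push_back(D.Header);
    ++LoadCount;
    return &Loaded.emplace(Name, std::move(M)).first->second;
  }

  unsigned loadCount() const { return LoadCount; }
};
```

In this toy model, recording `A` and `B` and then calling `findOrLoad("A")` 
materializes only `A`; a later `find("B")` still returns null, which is the 
same behaviour the new lazy-by-name-lookup.c test checks for.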

This currently just reparses module maps when looking up a module by header. 
This is fine as redeclarations are allowed from the same file, but future 
patches will also make looking up a module by header lazy.
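
A rough sketch of why reparsing is harmless, assuming a redeclaration is 
identified by comparing definition locations (the diff adds a 
`SameModuleDecl` check of this kind to `handleModuleDecl`). Everything below 
is a hypothetical toy: `Loc`, `DeclResult`, and `ModuleTable` are not clang 
types, and the real function has additional skip conditions and shadowing 
logic that are left out here.

```cpp
#include <cassert>
#include <map>
#include <string>

// Hypothetical source location: a file name plus an offset into it.
struct Loc {
  std::string File;
  unsigned Offset;
  bool operator==(const Loc &O) const {
    return File == O.File && Offset == O.Offset;
  }
};

enum class DeclResult { Created, SkippedSameDecl, Redefinition };

class ModuleTable {
  // Where each known module was first defined.
  std::map<std::string, Loc> DefinitionLoc;

public:
  DeclResult declare(const std::string &Name, const Loc &L) {
    auto Res = DefinitionLoc.emplace(Name, L);
    if (Res.second)
      return DeclResult::Created; // first time we see this module
    if (Res.first->second == L)
      return DeclResult::SkippedSameDecl; // same declaration, parsed again
    return DeclResult::Redefinition; // a different declaration of the name
  }
};
```

Declaring the same module twice from the same location yields 
`SkippedSameDecl`, while a declaration of the same name from a different file 
yields `Redefinition`.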

This patch changes the shadow.m test to use explicitly built modules and 
`#import`. This test and the shadow feature are very brittle and do not work in 
general. The test relied on pcm files being left behind by prior failing clang 
invocations that were then reused by the last invocation. If you clean the 
cache then the last invocation will always fail. This is because the input 
module map and the `-fmodule-map-file=` module map are parsed in the same 
module scope, and `-fmodule-map-file=` is forwarded to implicit module builds. 
That means you are guaranteed to hit a module redeclaration error if the TU 
actually imports the module it is trying to shadow.

This patch changes when we load A2's module map to after the `A` module has 
been loaded, which sets the `IsFromModuleFile` bit on `A`. This means that A2's 
`A` is skipped entirely instead of creating a shadow module, and we get textual 
inclusion. It is possible to construct a case where this would happen before 
this patch too.

An upcoming patch in this series will rework shadowing to work in the general 
case, but that's only possible once header -> module lookup is lazy too.

>From f650b8a571db3bbf4a60e1b2b50ca4d390c47605 Mon Sep 17 00:00:00 2001
From: Michael Spencer <bigchees...@gmail.com>
Date: Wed, 29 Jan 2025 12:49:29 -0800
Subject: [PATCH] [clang][modules] Lazily load by name lookups in module maps

Instead of eagerly populating the `clang::ModuleMap` when looking up
a module by name, this patch changes `HeaderSearch` to only load the
modules that are actually used.

This introduces `ModuleMap::findOrLoadModule`, which loads modules from
module maps that have been parsed but not yet loaded. It must not be
called from anywhere that the module loading code itself calls into, as
that can create infinite recursion.

This currently just reparses module maps when looking up a module by
header. This is fine as redeclarations are allowed from the same file,
but future patches will also make looking up a module by header lazy.

This patch changes the shadow.m test to use explicitly built modules
and `#import`. This test and the shadow feature are very brittle and
do not work in general. The test relied on pcm files being left behind
by prior failing clang invocations that were then reused by the last
invocation. If you clean the cache then the last invocation will
always fail. This is because the input module map and the
`-fmodule-map-file=` module map are parsed in the same module scope,
and `-fmodule-map-file=` is forwarded to implicit module builds. That
means you are guaranteed to hit a module redeclaration error if the TU
actually imports the module it is trying to shadow.

This patch changes when we load A2's module map to after the `A`
module has been loaded, which sets the `IsFromModuleFile` bit on `A`.
This means that A2's `A` is skipped entirely instead of creating a
shadow module, and we get textual inclusion. It is possible to
construct a case where this would happen before this patch too.

An upcoming patch in this series will rework shadowing to work in the
general case, but that's only possible once header -> module lookup is
lazy too.
---
 clang/include/clang/Basic/DiagnosticGroups.td |   1 +
 .../include/clang/Basic/DiagnosticLexKinds.td |   6 +
 clang/include/clang/Lex/HeaderSearch.h        |  41 ++++-
 clang/include/clang/Lex/ModuleMap.h           |  20 +++
 clang/include/clang/Lex/ModuleMapFile.h       |   9 ++
 clang/lib/Frontend/CompilerInstance.cpp       |   4 +-
 clang/lib/Lex/HeaderSearch.cpp                | 103 ++++++++++--
 clang/lib/Lex/ModuleMap.cpp                   | 151 ++++++++++++++++--
 clang/lib/Lex/ModuleMapFile.cpp               |   3 +
 clang/lib/Sema/SemaModule.cpp                 |   2 +-
 clang/test/Modules/Inputs/shadow/A1/A1.h      |   0
 .../Modules/Inputs/shadow/A1/module.modulemap |   4 +-
 clang/test/Modules/Inputs/shadow/A2/A2.h      |   0
 .../Modules/Inputs/shadow/A2/module.modulemap |   4 +-
 clang/test/Modules/lazy-by-name-lookup.c      |  31 ++++
 clang/test/Modules/modulemap-locations.m      |   3 +-
 clang/test/Modules/shadow.m                   |  11 +-
 17 files changed, 354 insertions(+), 39 deletions(-)
 create mode 100644 clang/test/Modules/Inputs/shadow/A1/A1.h
 create mode 100644 clang/test/Modules/Inputs/shadow/A2/A2.h
 create mode 100644 clang/test/Modules/lazy-by-name-lookup.c

diff --git a/clang/include/clang/Basic/DiagnosticGroups.td b/clang/include/clang/Basic/DiagnosticGroups.td
index b9f08d96151c9..1abb63ba3aea6 100644
--- a/clang/include/clang/Basic/DiagnosticGroups.td
+++ b/clang/include/clang/Basic/DiagnosticGroups.td
@@ -576,6 +576,7 @@ def ModuleImport : DiagGroup<"module-import">;
 def ModuleConflict : DiagGroup<"module-conflict">;
 def ModuleFileExtension : DiagGroup<"module-file-extension">;
 def ModuleIncludeDirectiveTranslation : DiagGroup<"module-include-translation">;
+def ModuleMap : DiagGroup<"module-map">;
 def RoundTripCC1Args : DiagGroup<"round-trip-cc1-args">;
 def NewlineEOF : DiagGroup<"newline-eof">;
 def Nullability : DiagGroup<"nullability">;
diff --git a/clang/include/clang/Basic/DiagnosticLexKinds.td b/clang/include/clang/Basic/DiagnosticLexKinds.td
index 912b8bd46e194..a6866ef868dcd 100644
--- a/clang/include/clang/Basic/DiagnosticLexKinds.td
+++ b/clang/include/clang/Basic/DiagnosticLexKinds.td
@@ -836,6 +836,12 @@ def warn_pp_date_time : Warning<
   ShowInSystemHeader, DefaultIgnore, InGroup<DiagGroup<"date-time">>;
 
 // Module map parsing
+def remark_mmap_parse : Remark<
+  "parsing modulemap '%0'">, ShowInSystemHeader, InGroup<ModuleMap>;
+def remark_mmap_load : Remark<
+  "loading modulemap '%0'">, ShowInSystemHeader, InGroup<ModuleMap>;
+def remark_mmap_load_module : Remark<
+  "loading parsed module '%0'">, ShowInSystemHeader, InGroup<ModuleMap>;
 def err_mmap_unknown_token : Error<"skipping stray token">;
 def err_mmap_expected_module : Error<"expected module declaration">;
 def err_mmap_expected_module_name : Error<"expected module name">;
diff --git a/clang/include/clang/Lex/HeaderSearch.h b/clang/include/clang/Lex/HeaderSearch.h
index f3dac905318c6..2c1e245fbfd37 100644
--- a/clang/include/clang/Lex/HeaderSearch.h
+++ b/clang/include/clang/Lex/HeaderSearch.h
@@ -332,13 +332,24 @@ class HeaderSearch {
   /// The mapping between modules and headers.
   mutable ModuleMap ModMap;
 
+  enum ModuleMapDirectoryState {
+    MMDS_Parsed,
+    MMDS_Loaded,
+    MMDS_Invalid,
+  };
+
   /// Describes whether a given directory has a module map in it.
-  llvm::DenseMap<const DirectoryEntry *, bool> DirectoryHasModuleMap;
+  llvm::DenseMap<const DirectoryEntry *, ModuleMapDirectoryState>
+      DirectoryHasModuleMap;
 
   /// Set of module map files we've already loaded, and a flag indicating
   /// whether they were valid or not.
   llvm::DenseMap<const FileEntry *, bool> LoadedModuleMaps;
 
+  /// Set of module map files we've already pre-parsed, and a flag indicating
+  /// whether they were valid or not.
+  llvm::DenseMap<const FileEntry *, bool> PreParsedModuleMaps;
+
   // A map of discovered headers with their associated include file name.
   llvm::DenseMap<const FileEntry *, llvm::SmallString<64>> IncludeNames;
 
@@ -435,7 +446,7 @@ class HeaderSearch {
 
   /// Consider modules when including files from this directory.
   void setDirectoryHasModuleMap(const DirectoryEntry* Dir) {
-    DirectoryHasModuleMap[Dir] = true;
+    DirectoryHasModuleMap[Dir] = MMDS_Loaded;
   }
 
   /// Forget everything we know about headers so far.
@@ -717,6 +728,23 @@ class HeaderSearch {
                          unsigned *Offset = nullptr,
                          StringRef OriginalModuleMapFile = StringRef());
 
+  /// Read the contents of the given module map file.
+  ///
+  /// \param File The module map file.
+  /// \param IsSystem Whether this file is in a system header directory.
+  /// \param ID If the module map file is already mapped (perhaps as part of
+  ///        processing a preprocessed module), the ID of the file.
+  /// \param Offset [inout] An offset within ID to start parsing. On exit,
+  ///        filled by the end of the parsed contents (either EOF or the
+  ///        location of an end-of-module-map pragma).
+  /// \param OriginalModuleMapFile The original path to the module map file,
+  ///        used to resolve paths within the module (this is required when
+  ///        building the module from preprocessed source).
+  /// \returns true if an error occurred, false otherwise.
+  bool preLoadModuleMapFile(FileEntryRef File, bool IsSystem,
+                            FileID ID = FileID(),
+                            StringRef OriginalModuleMapFile = StringRef());
+
   /// Collect the set of all known, top-level modules.
   ///
   /// \param Modules Will be filled with the set of known, top-level modules.
@@ -936,6 +964,10 @@ class HeaderSearch {
                                             FileID ID = FileID(),
                                             unsigned *Offset = nullptr);
 
+  LoadModuleMapResult parseModuleMapFileImpl(FileEntryRef File, bool IsSystem,
+                                             DirectoryEntryRef Dir,
+                                             FileID ID = FileID());
+
   /// Try to load the module map file in the given directory.
   ///
   /// \param DirName The name of the directory where we will look for a module
@@ -958,6 +990,11 @@ class HeaderSearch {
   /// named directory.
   LoadModuleMapResult loadModuleMapFile(DirectoryEntryRef Dir, bool IsSystem,
                                         bool IsFramework);
+
+  LoadModuleMapResult parseModuleMapFile(StringRef DirName, bool IsSystem,
+                                         bool IsFramework);
+  LoadModuleMapResult parseModuleMapFile(DirectoryEntryRef Dir, bool IsSystem,
+                                         bool IsFramework);
 };
 
 /// Apply the header search options to get given HeaderSearch object.
diff --git a/clang/include/clang/Lex/ModuleMap.h b/clang/include/clang/Lex/ModuleMap.h
index 9de1b3b546c11..8c460778891ba 100644
--- a/clang/include/clang/Lex/ModuleMap.h
+++ b/clang/include/clang/Lex/ModuleMap.h
@@ -18,6 +18,7 @@
 #include "clang/Basic/LangOptions.h"
 #include "clang/Basic/Module.h"
 #include "clang/Basic/SourceLocation.h"
+#include "clang/Lex/ModuleMapFile.h"
 #include "llvm/ADT/ArrayRef.h"
 #include "llvm/ADT/DenseMap.h"
 #include "llvm/ADT/DenseSet.h"
@@ -267,6 +268,18 @@ class ModuleMap {
   /// Describes whether we haved parsed a particular file as a module
   /// map.
   llvm::DenseMap<const FileEntry *, bool> ParsedModuleMap;
+  llvm::DenseMap<const FileEntry *, const modulemap::ModuleMapFile *>
+      PreParsedModuleMap;
+
+  std::vector<std::unique_ptr<modulemap::ModuleMapFile>> PreParsedModuleMaps;
+
+  /// Map from top level module name to a list of ModuleDecls in the order they
+  /// were discovered. This allows handling shadowing correctly and diagnosing
+  /// redefinitions.
+  llvm::StringMap<SmallVector<std::pair<const modulemap::ModuleMapFile *,
+                                        const modulemap::ModuleDecl *>,
+                              1>>
+      PreParsedModules;
 
   /// Resolve the given export declaration into an actual export
   /// declaration.
@@ -483,6 +496,8 @@ class ModuleMap {
   /// \returns The named module, if known; otherwise, returns null.
   Module *findModule(StringRef Name) const;
 
+  Module *findOrLoadModule(StringRef Name);
+
   Module *findOrInferSubmodule(Module *Parent, StringRef Name);
 
   /// Retrieve a module with the given name using lexical name lookup,
@@ -698,6 +713,11 @@ class ModuleMap {
   void addHeader(Module *Mod, Module::Header Header,
                  ModuleHeaderRole Role, bool Imported = false);
 
+  /// Parse a module map without creating `clang::Module` instances.
+  bool preParseModuleMapFile(FileEntryRef File, bool IsSystem,
+                             DirectoryEntryRef Dir, FileID ID = FileID(),
+                             SourceLocation ExternModuleLoc = SourceLocation());
+
   /// Parse the given module map file, and record any modules we
   /// encounter.
   ///
diff --git a/clang/include/clang/Lex/ModuleMapFile.h b/clang/include/clang/Lex/ModuleMapFile.h
index 1219cc2b50753..7d0e36e9ab86c 100644
--- a/clang/include/clang/Lex/ModuleMapFile.h
+++ b/clang/include/clang/Lex/ModuleMapFile.h
@@ -133,8 +133,17 @@ using TopLevelDecl = std::variant<ModuleDecl, ExternModuleDecl>;
 /// This holds many reference types (StringRef, SourceLocation, etc.) whose
 /// lifetimes are bound by the SourceManager and FileManager used.
 struct ModuleMapFile {
+  /// The FileID used to parse this module map. This is always a local ID.
+  FileID ID;
+
+  /// The directory in which the module map was discovered. Declarations in
+  /// the module map are relative to this directory.
+  OptionalDirectoryEntryRef Dir;
+
   /// Beginning of the file, used for moduleMapFileRead callback.
   SourceLocation Start;
+
+  bool IsSystem;
   std::vector<TopLevelDecl> Decls;
 
   void dump(llvm::raw_ostream &out) const;
diff --git a/clang/lib/Frontend/CompilerInstance.cpp b/clang/lib/Frontend/CompilerInstance.cpp
index bff5326e89973..162961bd84065 100644
--- a/clang/lib/Frontend/CompilerInstance.cpp
+++ b/clang/lib/Frontend/CompilerInstance.cpp
@@ -579,13 +579,13 @@ struct ReadModuleNames : ASTReaderListener {
     ModuleMap &MM = PP.getHeaderSearchInfo().getModuleMap();
     for (const std::string &LoadedModule : LoadedModules)
       MM.cacheModuleLoad(*PP.getIdentifierInfo(LoadedModule),
-                         MM.findModule(LoadedModule));
+                         MM.findOrLoadModule(LoadedModule));
     LoadedModules.clear();
   }
 
   void markAllUnavailable() {
     for (const std::string &LoadedModule : LoadedModules) {
-      if (Module *M = PP.getHeaderSearchInfo().getModuleMap().findModule(
+      if (Module *M = PP.getHeaderSearchInfo().getModuleMap().findOrLoadModule(
               LoadedModule)) {
         M->HasIncompatibleModuleFile = true;
 
diff --git a/clang/lib/Lex/HeaderSearch.cpp b/clang/lib/Lex/HeaderSearch.cpp
index ad9263f2994f2..90ab460e98b39 100644
--- a/clang/lib/Lex/HeaderSearch.cpp
+++ b/clang/lib/Lex/HeaderSearch.cpp
@@ -300,7 +300,7 @@ Module *HeaderSearch::lookupModule(StringRef ModuleName,
                                    SourceLocation ImportLoc, bool AllowSearch,
                                    bool AllowExtraModuleMapSearch) {
   // Look in the module map to determine if there is a module by this name.
-  Module *Module = ModMap.findModule(ModuleName);
+  Module *Module = ModMap.findOrLoadModule(ModuleName);
   if (Module || !AllowSearch || !HSOpts->ImplicitModuleMaps)
     return Module;
 
@@ -360,11 +360,11 @@ Module *HeaderSearch::lookupModule(StringRef ModuleName, StringRef SearchName,
     // checked
     DirectoryEntryRef NormalDir = *Dir.getDirRef();
     // Search for a module map file in this directory.
-    if (loadModuleMapFile(NormalDir, IsSystem,
-                          /*IsFramework*/false) == LMM_NewlyLoaded) {
+    if (parseModuleMapFile(NormalDir, IsSystem,
+                           /*IsFramework*/ false) == LMM_NewlyLoaded) {
       // We just loaded a module map file; check whether the module is
       // available now.
-      Module = ModMap.findModule(ModuleName);
+      Module = ModMap.findOrLoadModule(ModuleName);
       if (Module)
         break;
     }
@@ -374,10 +374,10 @@ Module *HeaderSearch::lookupModule(StringRef ModuleName, StringRef SearchName,
     SmallString<128> NestedModuleMapDirName;
     NestedModuleMapDirName = Dir.getDirRef()->getName();
     llvm::sys::path::append(NestedModuleMapDirName, ModuleName);
-    if (loadModuleMapFile(NestedModuleMapDirName, IsSystem,
-                          /*IsFramework*/false) == LMM_NewlyLoaded){
+    if (parseModuleMapFile(NestedModuleMapDirName, IsSystem,
+                           /*IsFramework*/ false) == LMM_NewlyLoaded) {
       // If we just loaded a module map file, look for the module again.
-      Module = ModMap.findModule(ModuleName);
+      Module = ModMap.findOrLoadModule(ModuleName);
       if (Module)
         break;
     }
@@ -394,7 +394,7 @@ Module *HeaderSearch::lookupModule(StringRef ModuleName, StringRef SearchName,
         loadSubdirectoryModuleMaps(Dir);
 
       // Look again for the module.
-      Module = ModMap.findModule(ModuleName);
+      Module = ModMap.findOrLoadModule(ModuleName);
       if (Module)
         break;
     }
@@ -1583,7 +1583,7 @@ bool HeaderSearch::hasModuleMap(StringRef FileName,
       // Success. All of the directories we stepped through inherit this module
       // map file.
       for (unsigned I = 0, N = FixUpDirectories.size(); I != N; ++I)
-        DirectoryHasModuleMap[FixUpDirectories[I]] = true;
+        DirectoryHasModuleMap[FixUpDirectories[I]] = MMDS_Loaded;
       return true;
 
     case LMM_NoDirectory:
@@ -1801,6 +1801,33 @@ HeaderSearch::loadModuleMapFileImpl(FileEntryRef File, bool IsSystem,
   return LMM_NewlyLoaded;
 }
 
+HeaderSearch::LoadModuleMapResult
+HeaderSearch::parseModuleMapFileImpl(FileEntryRef File, bool IsSystem,
+                                     DirectoryEntryRef Dir, FileID ID) {
+  // Check whether we've already parsed this module map, and mark it as being
+  // parsed in case we recursively try to parse it from itself.
+  auto AddResult = PreParsedModuleMaps.insert(std::make_pair(File, true));
+  if (!AddResult.second)
+    return AddResult.first->second ? LMM_AlreadyLoaded : LMM_InvalidModuleMap;
+
+  if (ModMap.preParseModuleMapFile(File, IsSystem, Dir, ID)) {
+    PreParsedModuleMaps[File] = false;
+    return LMM_InvalidModuleMap;
+  }
+
+  // Try to load a corresponding private module map.
+  if (OptionalFileEntryRef PMMFile =
+          getPrivateModuleMap(File, FileMgr, Diags)) {
+    if (ModMap.preParseModuleMapFile(*PMMFile, IsSystem, Dir)) {
+      PreParsedModuleMaps[File] = false;
+      return LMM_InvalidModuleMap;
+    }
+  }
+
+  // This directory has a module map.
+  return LMM_NewlyLoaded;
+}
+
 OptionalFileEntryRef
 HeaderSearch::lookupModuleMapFile(DirectoryEntryRef Dir, bool IsFramework) {
   if (!HSOpts->ImplicitModuleMaps)
@@ -1853,7 +1880,7 @@ Module *HeaderSearch::loadFrameworkModule(StringRef Name, DirectoryEntryRef Dir,
     break;
   }
 
-  return ModMap.findModule(Name);
+  return ModMap.findOrLoadModule(Name);
 }
 
 HeaderSearch::LoadModuleMapResult
@@ -1869,8 +1896,16 @@ HeaderSearch::LoadModuleMapResult
 HeaderSearch::loadModuleMapFile(DirectoryEntryRef Dir, bool IsSystem,
                                 bool IsFramework) {
   auto KnownDir = DirectoryHasModuleMap.find(Dir);
-  if (KnownDir != DirectoryHasModuleMap.end())
-    return KnownDir->second ? LMM_AlreadyLoaded : LMM_InvalidModuleMap;
+  if (KnownDir != DirectoryHasModuleMap.end()) {
+    switch (KnownDir->second) {
+    case MMDS_Parsed:
+      break;
+    case MMDS_Loaded:
+      return LMM_AlreadyLoaded;
+    case MMDS_Invalid:
+      return LMM_InvalidModuleMap;
+    };
+  }
 
   if (OptionalFileEntryRef ModuleMapFile =
           lookupModuleMapFile(Dir, IsFramework)) {
@@ -1880,9 +1915,49 @@ HeaderSearch::loadModuleMapFile(DirectoryEntryRef Dir, bool IsSystem,
     // E.g. Foo.framework/Modules/module.modulemap
     //      ^Dir                  ^ModuleMapFile
     if (Result == LMM_NewlyLoaded)
-      DirectoryHasModuleMap[Dir] = true;
+      DirectoryHasModuleMap[Dir] = MMDS_Loaded;
+    else if (Result == LMM_InvalidModuleMap)
+      DirectoryHasModuleMap[Dir] = MMDS_Invalid;
+    return Result;
+  }
+  return LMM_InvalidModuleMap;
+}
+
+HeaderSearch::LoadModuleMapResult
+HeaderSearch::parseModuleMapFile(StringRef DirName, bool IsSystem,
+                                 bool IsFramework) {
+  if (auto Dir = FileMgr.getOptionalDirectoryRef(DirName))
+    return parseModuleMapFile(*Dir, IsSystem, IsFramework);
+
+  return LMM_NoDirectory;
+}
+
+HeaderSearch::LoadModuleMapResult
+HeaderSearch::parseModuleMapFile(DirectoryEntryRef Dir, bool IsSystem,
+                                 bool IsFramework) {
+  // Check if this modulemap has already been fully loaded, if so skip it.
+  auto KnownDir = DirectoryHasModuleMap.find(Dir);
+  if (KnownDir != DirectoryHasModuleMap.end()) {
+    switch (KnownDir->second) {
+    case MMDS_Parsed:
+    case MMDS_Loaded:
+      return LMM_AlreadyLoaded;
+    case MMDS_Invalid:
+      return LMM_InvalidModuleMap;
+    };
+  }
+
+  if (OptionalFileEntryRef ModuleMapFile =
+          lookupModuleMapFile(Dir, IsFramework)) {
+    LoadModuleMapResult Result =
+        parseModuleMapFileImpl(*ModuleMapFile, IsSystem, Dir);
+    // Add Dir explicitly in case ModuleMapFile is in a subdirectory.
+    // E.g. Foo.framework/Modules/module.modulemap
+    //      ^Dir                  ^ModuleMapFile
+    if (Result == LMM_NewlyLoaded)
+      DirectoryHasModuleMap[Dir] = MMDS_Parsed;
     else if (Result == LMM_InvalidModuleMap)
-      DirectoryHasModuleMap[Dir] = false;
+      DirectoryHasModuleMap[Dir] = MMDS_Invalid;
     return Result;
   }
   return LMM_InvalidModuleMap;
diff --git a/clang/lib/Lex/ModuleMap.cpp b/clang/lib/Lex/ModuleMap.cpp
index e6985a40433ec..f44faa648ac47 100644
--- a/clang/lib/Lex/ModuleMap.cpp
+++ b/clang/lib/Lex/ModuleMap.cpp
@@ -1052,6 +1052,9 @@ Module *ModuleMap::inferFrameworkModule(DirectoryEntryRef FrameworkDir,
           bool IsFrameworkDir = Parent.ends_with(".framework");
           if (OptionalFileEntryRef ModMapFile =
                   HeaderInfo.lookupModuleMapFile(*ParentDir, IsFrameworkDir)) {
+            // TODO: Pre-parsing a module map should populate
+            //       `InferredDirectories` so we don't need to do a full load
+            //       here.
             parseModuleMapFile(*ModMapFile, Attrs.IsSystem, *ParentDir);
             inferred = InferredDirectories.find(*ParentDir);
           }
@@ -1321,6 +1324,84 @@ void ModuleMap::addHeader(Module *Mod, Module::Header Header,
     Cb->moduleMapAddHeader(HeaderEntry.getName());
 }
 
+bool ModuleMap::preParseModuleMapFile(clang::FileEntryRef File, bool IsSystem,
+                                      clang::DirectoryEntryRef Dir,
+                                      clang::FileID ID,
+                                      SourceLocation ExternModuleLoc) {
+  llvm::DenseMap<const FileEntry *, const modulemap::ModuleMapFile *>::iterator
+      Known = PreParsedModuleMap.find(File);
+  if (Known != PreParsedModuleMap.end())
+    return Known->second == nullptr;
+
+  // If the module map file wasn't already entered, do so now.
+  if (ID.isInvalid()) {
+    ID = SourceMgr.translateFile(File);
+    if (ID.isInvalid() || SourceMgr.isLoadedFileID(ID)) {
+      auto FileCharacter =
+          IsSystem ? SrcMgr::C_System_ModuleMap : SrcMgr::C_User_ModuleMap;
+      ID = SourceMgr.createFileID(File, ExternModuleLoc, FileCharacter);
+    }
+  }
+
+  std::optional<llvm::MemoryBufferRef> Buffer = SourceMgr.getBufferOrNone(ID);
+  if (!Buffer) {
+    PreParsedModuleMap[File] = nullptr;
+    return true;
+  }
+
+  Diags.Report(diag::remark_mmap_parse) << File.getName();
+  std::optional<modulemap::ModuleMapFile> MaybeMMF =
+      modulemap::parseModuleMap(ID, Dir, SourceMgr, Diags, IsSystem, nullptr);
+
+  if (!MaybeMMF) {
+    PreParsedModuleMap[File] = nullptr;
+    return true;
+  }
+
+  PreParsedModuleMaps.push_back(
+      std::make_unique<modulemap::ModuleMapFile>(std::move(*MaybeMMF)));
+  const modulemap::ModuleMapFile &MMF = *PreParsedModuleMaps.back();
+  std::vector<const modulemap::ExternModuleDecl *> PendingExternalModuleMaps;
+  for (const auto &Decl : MMF.Decls) {
+    std::visit(llvm::makeVisitor(
+                   [&](const modulemap::ModuleDecl &MD) {
+                     // Only use the first part of the name even for submodules.
+                     // This will correctly load the submodule declarations when
+                     // the module is loaded.
+                     auto &ModuleDecls =
+                         PreParsedModules[StringRef(MD.Id.front().first)];
+                     ModuleDecls.push_back(std::pair(&MMF, &MD));
+                   },
+                   [&](const modulemap::ExternModuleDecl &EMD) {
+                     PendingExternalModuleMaps.push_back(&EMD);
+                   }),
+               Decl);
+  }
+
+  for (const modulemap::ExternModuleDecl *EMD : PendingExternalModuleMaps) {
+    StringRef FileNameRef = EMD->Path;
+    SmallString<128> ModuleMapFileName;
+    if (llvm::sys::path::is_relative(FileNameRef)) {
+      ModuleMapFileName += Dir.getName();
+      llvm::sys::path::append(ModuleMapFileName, EMD->Path);
+      FileNameRef = ModuleMapFileName;
+    }
+
+    if (auto EFile =
+            SourceMgr.getFileManager().getOptionalFileRef(FileNameRef)) {
+      preParseModuleMapFile(*EFile, IsSystem, EFile->getDir(), FileID(),
+                            ExternModuleLoc);
+    }
+  }
+
+  PreParsedModuleMap[File] = &MMF;
+
+  for (const auto &Cb : Callbacks)
+    Cb->moduleMapFileRead(SourceLocation(), File, IsSystem);
+
+  return false;
+}
+
 FileID ModuleMap::getContainingModuleMapFileID(const Module *Module) const {
   if (Module->DefinitionLoc.isInvalid())
     return {};
@@ -1459,7 +1540,6 @@ bool ModuleMap::resolveConflicts(Module *Mod, bool Complain) {
 
 namespace clang {
 class ModuleMapParser {
-  modulemap::ModuleMapFile &MMF;
   SourceManager &SourceMgr;
 
   DiagnosticsEngine &Diags;
@@ -1516,13 +1596,15 @@ class ModuleMapParser {
   using Attributes = ModuleMap::Attributes;
 
 public:
-  ModuleMapParser(modulemap::ModuleMapFile &MMF, SourceManager &SourceMgr,
-                  DiagnosticsEngine &Diags, ModuleMap &Map, FileID ModuleMapFID,
+  ModuleMapParser(SourceManager &SourceMgr, DiagnosticsEngine &Diags,
+                  ModuleMap &Map, FileID ModuleMapFID,
                   DirectoryEntryRef Directory, bool IsSystem)
-      : MMF(MMF), SourceMgr(SourceMgr), Diags(Diags), Map(Map),
+      : SourceMgr(SourceMgr), Diags(Diags), Map(Map),
         ModuleMapFID(ModuleMapFID), Directory(Directory), IsSystem(IsSystem) {}
 
-  bool parseModuleMapFile();
+  bool parseModuleDecl(const modulemap::ModuleDecl &MD);
+  bool parseExternModuleDecl(const modulemap::ExternModuleDecl &EMD);
+  bool parseModuleMapFile(const modulemap::ModuleMapFile &MMF);
 };
 
 } // namespace clang
@@ -1661,7 +1743,11 @@ void ModuleMapParser::handleModuleDecl(const modulemap::ModuleDecl &MD) {
         Map.LangOpts.CurrentModule == ModuleName &&
         SourceMgr.getDecomposedLoc(ModuleNameLoc).first !=
             SourceMgr.getDecomposedLoc(Existing->DefinitionLoc).first;
-    if (LoadedFromASTFile || Inferred || PartOfFramework || ParsedAsMainInput) {
+    // TODO: Remove this check when we can avoid loading module maps multiple
+    //       times.
+    bool SameModuleDecl = ModuleNameLoc == Existing->DefinitionLoc;
+    if (LoadedFromASTFile || Inferred || PartOfFramework || ParsedAsMainInput ||
+        SameModuleDecl) {
       ActiveModule = PreviousActiveModule;
       // Skip the module definition.
       return;
@@ -2104,7 +2190,18 @@ void ModuleMapParser::handleInferredModuleDecl(
   }
 }
 
-bool ModuleMapParser::parseModuleMapFile() {
+bool ModuleMapParser::parseModuleDecl(const modulemap::ModuleDecl &MD) {
+  handleModuleDecl(MD);
+  return HadError;
+}
+
+bool ModuleMapParser::parseExternModuleDecl(
+    const modulemap::ExternModuleDecl &EMD) {
+  handleExternModuleDecl(EMD);
+  return HadError;
+}
+
+bool ModuleMapParser::parseModuleMapFile(const modulemap::ModuleMapFile &MMF) {
   for (const auto &Decl : MMF.Decls) {
     std::visit(
         llvm::makeVisitor(
@@ -2117,6 +2214,28 @@ bool ModuleMapParser::parseModuleMapFile() {
   return HadError;
 }
 
+Module *ModuleMap::findOrLoadModule(StringRef Name) {
+  llvm::StringMap<Module *>::const_iterator Known = Modules.find(Name);
+  if (Known != Modules.end())
+    return Known->getValue();
+
+  auto PreParsedMod = PreParsedModules.find(Name);
+  if (PreParsedMod == PreParsedModules.end())
+    return nullptr;
+
+  Diags.Report(diag::remark_mmap_load_module) << Name;
+
+  for (const auto &ModuleDecl : PreParsedMod->second) {
+    const modulemap::ModuleMapFile &MMF = *ModuleDecl.first;
+    ModuleMapParser Parser(SourceMgr, Diags, const_cast<ModuleMap &>(*this),
+                           MMF.ID, *MMF.Dir, MMF.IsSystem);
+    if (Parser.parseModuleDecl(*ModuleDecl.second))
+      return nullptr;
+  }
+
+  return findModule(Name);
+}
+
 bool ModuleMap::parseModuleMapFile(FileEntryRef File, bool IsSystem,
                                    DirectoryEntryRef Dir, FileID ID,
                                    unsigned *Offset,
@@ -2129,9 +2248,16 @@ bool ModuleMap::parseModuleMapFile(FileEntryRef File, bool IsSystem,
 
   // If the module map file wasn't already entered, do so now.
   if (ID.isInvalid()) {
-    auto FileCharacter =
-        IsSystem ? SrcMgr::C_System_ModuleMap : SrcMgr::C_User_ModuleMap;
-    ID = SourceMgr.createFileID(File, ExternModuleLoc, FileCharacter);
+    ID = SourceMgr.translateFile(File);
+    // TODO: The way we compute affecting module maps requires this to be a
+    //       local FileID. This should be changed to reuse loaded FileIDs when
+    //       available, and change the way that affecting module maps are
+    //       computed to not require this.
+    if (ID.isInvalid() || SourceMgr.isLoadedFileID(ID)) {
+      auto FileCharacter =
+          IsSystem ? SrcMgr::C_System_ModuleMap : SrcMgr::C_User_ModuleMap;
+      ID = SourceMgr.createFileID(File, ExternModuleLoc, FileCharacter);
+    }
   }
 
   assert(Target && "Missing target information");
@@ -2145,8 +2271,9 @@ bool ModuleMap::parseModuleMapFile(FileEntryRef File, bool IsSystem,
       modulemap::parseModuleMap(ID, Dir, SourceMgr, Diags, IsSystem, Offset);
   bool Result = false;
   if (MMF) {
-    ModuleMapParser Parser(*MMF, SourceMgr, Diags, *this, ID, Dir, IsSystem);
-    Result = Parser.parseModuleMapFile();
+    Diags.Report(diag::remark_mmap_load) << File.getName();
+    ModuleMapParser Parser(SourceMgr, Diags, *this, ID, Dir, IsSystem);
+    Result = Parser.parseModuleMapFile(*MMF);
   }
   ParsedModuleMap[File] = Result;
 
diff --git a/clang/lib/Lex/ModuleMapFile.cpp b/clang/lib/Lex/ModuleMapFile.cpp
index 5cf4a4c3d96c1..f457de85243cc 100644
--- a/clang/lib/Lex/ModuleMapFile.cpp
+++ b/clang/lib/Lex/ModuleMapFile.cpp
@@ -169,7 +169,10 @@ modulemap::parseModuleMap(FileID ID, clang::DirectoryEntryRef Dir,
 
   if (Failed)
     return std::nullopt;
+  Parser.MMF.ID = ID;
+  Parser.MMF.Dir = Dir;
   Parser.MMF.Start = Start;
+  Parser.MMF.IsSystem = IsSystem;
   return std::move(Parser.MMF);
 }
 
diff --git a/clang/lib/Sema/SemaModule.cpp b/clang/lib/Sema/SemaModule.cpp
index 76589bff40be9..7549d09e9fb27 100644
--- a/clang/lib/Sema/SemaModule.cpp
+++ b/clang/lib/Sema/SemaModule.cpp
@@ -393,7 +393,7 @@ Sema::ActOnModuleDecl(SourceLocation StartLoc, SourceLocation ModuleLoc,
   case ModuleDeclKind::PartitionInterface: {
     // We can't have parsed or imported a definition of this module or parsed a
     // module map defining it already.
-    if (auto *M = Map.findModule(ModuleName)) {
+    if (auto *M = Map.findOrLoadModule(ModuleName)) {
       Diag(Path[0].second, diag::err_module_redefinition) << ModuleName;
       if (M->DefinitionLoc.isValid())
         Diag(M->DefinitionLoc, diag::note_prev_module_definition);
diff --git a/clang/test/Modules/Inputs/shadow/A1/A1.h b/clang/test/Modules/Inputs/shadow/A1/A1.h
new file mode 100644
index 0000000000000..e69de29bb2d1d
diff --git a/clang/test/Modules/Inputs/shadow/A1/module.modulemap b/clang/test/Modules/Inputs/shadow/A1/module.modulemap
index 9439a431b1dbe..3a47280776ae2 100644
--- a/clang/test/Modules/Inputs/shadow/A1/module.modulemap
+++ b/clang/test/Modules/Inputs/shadow/A1/module.modulemap
@@ -2,4 +2,6 @@ module A {
   header "A.h"
 }
 
-module A1 {}
+module A1 {
+  header "A1.h"
+}
diff --git a/clang/test/Modules/Inputs/shadow/A2/A2.h b/clang/test/Modules/Inputs/shadow/A2/A2.h
new file mode 100644
index 0000000000000..e69de29bb2d1d
diff --git a/clang/test/Modules/Inputs/shadow/A2/module.modulemap b/clang/test/Modules/Inputs/shadow/A2/module.modulemap
index 935d89bb425e0..9e6fe6448ead8 100644
--- a/clang/test/Modules/Inputs/shadow/A2/module.modulemap
+++ b/clang/test/Modules/Inputs/shadow/A2/module.modulemap
@@ -2,4 +2,6 @@ module A {
   header "A.h"
 }
 
-module A2 {}
+module A2 {
+  header "A2.h"
+}
diff --git a/clang/test/Modules/lazy-by-name-lookup.c b/clang/test/Modules/lazy-by-name-lookup.c
new file mode 100644
index 0000000000000..11a3a5cda709d
--- /dev/null
+++ b/clang/test/Modules/lazy-by-name-lookup.c
@@ -0,0 +1,31 @@
+// RUN: rm -rf %t
+// RUN: split-file %s %t
+// RUN: %clang_cc1 -fmodules -fimplicit-module-maps -I%t \
+// RUN:   -fmodules-cache-path=%t/cache %t/tu.c -fsyntax-only -Rmodule-map \
+// RUN:   -verify
+
+//--- module.modulemap
+
+module A {
+  header "A.h"
+}
+
+module B {
+  header "B.h"
+}
+
+//--- A.h
+
+//--- B.h
+
+//--- tu.c
+
+#pragma clang __debug module_lookup A // does module map search for A
+#pragma clang __debug module_map A // A is now in the ModuleMap,
+#pragma clang __debug module_map B // expected-warning{{unknown module 'B'}}
+                                   // but B isn't.
+#include <B.h> // Now load B via header search
+
+// expected-remark@*{{parsing modulemap}}
+// expected-remark@*{{loading parsed module 'A'}}
+// expected-remark@*{{loading modulemap}}
\ No newline at end of file
diff --git a/clang/test/Modules/modulemap-locations.m b/clang/test/Modules/modulemap-locations.m
index 330a03d1fb633..17a60cfd3273b 100644
--- a/clang/test/Modules/modulemap-locations.m
+++ b/clang/test/Modules/modulemap-locations.m
@@ -12,7 +12,8 @@
 @import Module_ModuleMap_F;
 @import Module_ModuleMap_F.Private;
 @import Both_F;
-@import Inferred;
+@import Inferred; // expected-warning@*{{as a module map name is deprecated, rename it to module.modulemap}}
+                  // expected-warning@*{{as a module map name is deprecated, rename it to module.private.modulemap}}
 
 void test(void) {
   will_be_found1();
diff --git a/clang/test/Modules/shadow.m b/clang/test/Modules/shadow.m
index 44320af2b0c66..c45d0185d4d80 100644
--- a/clang/test/Modules/shadow.m
+++ b/clang/test/Modules/shadow.m
@@ -1,13 +1,14 @@
 // RUN: rm -rf %t
-// RUN: not %clang_cc1 -fmodules -fimplicit-module-maps -fmodules-cache-path=%t -I %S/Inputs/shadow/A1 -I %S/Inputs/shadow/A2 %s -fsyntax-only 2>&1 | FileCheck %s -check-prefix=REDEFINITION
-// RUN: not %clang_cc1 -fmodules -fimplicit-module-maps -fmodules-cache-path=%t -fmodule-map-file=%S/Inputs/shadow/A1/module.modulemap -fmodule-map-file=%S/Inputs/shadow/A2/module.modulemap %s -fsyntax-only 2>&1 | FileCheck %s -check-prefix=REDEFINITION
+// RUN: not %clang_cc1 -fmodules -fimplicit-module-maps -fmodules-cache-path=%t -I %S/Inputs/shadow/A1 -I %S/Inputs/shadow/A2 -I %S/Inputs/shadow %s -fsyntax-only 2>&1 | FileCheck %s -check-prefix=REDEFINITION
+// RUN: not %clang_cc1 -fmodules -fimplicit-module-maps -fmodules-cache-path=%t -fmodule-map-file=%S/Inputs/shadow/A1/module.modulemap -fmodule-map-file=%S/Inputs/shadow/A2/module.modulemap %S/Inputs/shadow %s -fsyntax-only 2>&1 | FileCheck %s -check-prefix=REDEFINITION
 // REDEFINITION: error: redefinition of module 'A'
 // REDEFINITION: note: previously defined
 
-// RUN: %clang_cc1 -fmodules -fimplicit-module-maps -fmodules-cache-path=%t -fmodule-map-file=%S/Inputs/shadow/A1/module.modulemap -I %S/Inputs/shadow %s -verify
+// RUN: %clang_cc1 -fmodules -fimplicit-module-maps -fmodules-cache-path=%t -x objective-c-header %S/Inputs/shadow/A1/module.modulemap -emit-module -o %t/A.pcm -fmodule-name=A
+// RUN: %clang_cc1 -fmodules -fimplicit-module-maps -fmodules-cache-path=%t -fmodule-map-file=%S/Inputs/shadow/A1/module.modulemap -fmodule-file=A=%t/A.pcm -I %S/Inputs/shadow %s -verify
 
-@import A1;
-@import A2;
+#import "A1/A1.h"
+#import "A2/A2.h"
 @import A;
 
 #import "A2/A.h" // expected-note {{implicitly imported}}
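
For readers following the `findModule` versus `findOrLoadModule` distinction exercised by `lazy-by-name-lookup.c`, here is a rough, self-contained sketch of the lookup behaviour. It is a toy model, not the code in this patch: `ToyModuleMap`, `ToyModule`, `ParsedDecl`, and `addParsed` are hypothetical names made up for illustration, and the real `clang::ModuleMap` tracks far more state.

```cpp
#include <map>
#include <string>
#include <utility>

// Hypothetical stand-in for a module declaration that has been parsed from a
// module map file but not yet turned into a module object.
struct ParsedDecl {
  std::string Name;
  std::string Header;
};

// Hypothetical stand-in for a fully loaded module.
struct ToyModule {
  std::string Name;
  std::string Header;
};

// Toy model only: illustrates "parsed but not loaded" versus "loaded".
class ToyModuleMap {
  std::map<std::string, ParsedDecl> Parsed; // parsed, not yet loaded
  std::map<std::string, ToyModule> Loaded;  // loaded on demand

public:
  // Record a declaration without loading it.
  void addParsed(ParsedDecl D) {
    std::string Key = D.Name;
    Parsed.emplace(std::move(Key), std::move(D));
  }

  // Sees only modules that have already been loaded.
  const ToyModule *findModule(const std::string &Name) const {
    auto It = Loaded.find(Name);
    return It == Loaded.end() ? nullptr : &It->second;
  }

  // Loads the module from its parsed declaration if it is not loaded yet.
  const ToyModule *findOrLoadModule(const std::string &Name) {
    if (const ToyModule *M = findModule(Name))
      return M;
    auto It = Parsed.find(Name);
    if (It == Parsed.end())
      return nullptr; // Unknown module: nothing parsed under this name.
    auto Result =
        Loaded.emplace(Name, ToyModule{It->second.Name, It->second.Header});
    return &Result.first->second;
  }
};
```

In this model, after `addParsed` for both `A` and `B`, a by-name lookup of `A` through `findOrLoadModule` makes `A` visible to `findModule` while `B` stays unloaded, which is the shape of what the new test checks with the `module_lookup` and `module_map` debug pragmas.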
