[Lldb-commits] [lldb] 00cd6c0 - [Preprocessor] Reduce the memory overhead of `#define` directives (Recommit)

2022-02-14 Thread Alex Lorenz via lldb-commits

Author: Alex Lorenz
Date: 2022-02-14T09:27:44-08:00
New Revision: 00cd6c04202acf71f74c670b2dd4343929d1f45f

URL: https://github.com/llvm/llvm-project/commit/00cd6c04202acf71f74c670b2dd4343929d1f45f
DIFF: https://github.com/llvm/llvm-project/commit/00cd6c04202acf71f74c670b2dd4343929d1f45f.diff

LOG: [Preprocessor] Reduce the memory overhead of `#define` directives (Recommit)

Recently we observed high memory pressure caused by clang during some parallel
builds. We discovered that several of our projects have a very large number of
#define directives in their TUs (on the order of millions), which caused huge
memory consumption in clang due to the many allocations made for MacroInfo. We
would like to reduce clang's memory overhead for a single #define, which lowers
the overhead for these files and, in turn, the memory pressure on the system
during highly parallel builds. This change achieves that by removing the
SmallVector in MacroInfo and instead storing the tokens in an array allocated
with the bump pointer allocator once all tokens have been lexed.
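
The storage scheme described above can be sketched in isolation as follows.
This is a minimal illustration, not clang's implementation: `Tok`,
`BumpArena` and `MacroDef` are hypothetical stand-ins for clang::Token,
llvm::BumpPtrAllocator and MacroInfo, and the slab size is an arbitrary
assumption.

```cpp
#include <cassert>
#include <cstddef>
#include <memory>
#include <vector>

// Hypothetical stand-in for clang::Token (small and trivially copyable).
struct Tok {
  unsigned Kind;
  unsigned Loc;
};

// Hypothetical, much-simplified bump allocator: memory is carved out of
// large slabs and individual objects are never freed.
class BumpArena {
  static constexpr std::size_t SlabSize = 4096; // arbitrary assumption
  std::vector<std::unique_ptr<unsigned char[]>> Slabs;
  std::size_t Used = SlabSize;

public:
  // Returns uninitialized storage for N objects of type T.
  template <typename T> T *allocate(std::size_t N) {
    std::size_t Bytes = N * sizeof(T);
    std::size_t Aligned = (Used + alignof(T) - 1) & ~(alignof(T) - 1);
    if (Slabs.empty() || Aligned + Bytes > SlabSize) {
      // Start a new slab; oversized requests get a slab of their own.
      Slabs.push_back(std::make_unique<unsigned char[]>(
          Bytes > SlabSize ? Bytes : SlabSize));
      Aligned = 0;
    }
    Used = Aligned + Bytes;
    return reinterpret_cast<T *>(Slabs.back().get() + Aligned);
  }

  std::size_t numSlabs() const { return Slabs.size(); }
};

// Hypothetical stand-in for MacroInfo: a pointer and a count replace the
// per-macro growable vector.
class MacroDef {
  const Tok *Tokens = nullptr;
  unsigned NumTokens = 0;

public:
  // Called once, after every token of the macro body has been lexed.
  void setTokens(const std::vector<Tok> &Lexed, BumpArena &Arena) {
    assert(Tokens == nullptr && NumTokens == 0 &&
           "Token list already allocated!");
    if (Lexed.empty())
      return;
    Tok *Dest = Arena.allocate<Tok>(Lexed.size());
    std::uninitialized_copy(Lexed.begin(), Lexed.end(), Dest);
    Tokens = Dest;
    NumTokens = static_cast<unsigned>(Lexed.size());
  }

  const Tok *tokens_begin() const { return Tokens; }
  const Tok *tokens_end() const { return Tokens + NumTokens; }
  unsigned getNumTokens() const { return NumTokens; }
};
```

In this sketch the macro object carries only a pointer and a count, and many
small token arrays share one slab, so both the per-macro footprint and the
number of separate heap allocations can shrink.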

The added unit test with 1000000 #define directives illustrates the problem.
Prior to this change, on arm64 macOS, clang's PP bump pointer allocator
allocated 272007616 bytes, and used roughly 272 bytes per #define. After this
change, clang's PP bump pointer allocator allocates 120002016 bytes, and uses
only roughly 120 bytes per #define.
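
The per-directive figures follow from dividing the allocator totals by the
number of directives. A small check of that arithmetic, assuming 1000000
directives (a count assumed here because it is consistent with the quoted
totals; `NumDefines`, `bytesPerDefine` and `percentSaved` are names made up
for this sketch):

```cpp
#include <cassert>
#include <cstdint>

// Assumed directive count; consistent with the totals quoted in the log.
constexpr std::uint64_t NumDefines = 1000000;

// Average allocator bytes per #define, rounded down.
constexpr std::uint64_t bytesPerDefine(std::uint64_t TotalBytes) {
  return TotalBytes / NumDefines;
}

// Whole-percent reduction from Before to After, rounded down.
constexpr std::uint64_t percentSaved(std::uint64_t Before,
                                     std::uint64_t After) {
  return (Before - After) * 100 / Before;
}

static_assert(bytesPerDefine(272007616) == 272, "before the change");
static_assert(bytesPerDefine(120002016) == 120, "after the change");
```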

For an example test file that we have internally with 7.8 million #define
directives, this change produces the following improvement on arm64 macOS:
the persistent allocation footprint for this test case file as it's being
compiled to LLVM IR went down 22% from 5.28 GB to 4.07 GB, and the total
allocations went down 14% from 8.26 GB to 7.05 GB. Furthermore, this change
reduced the total number of allocations made by the system for this clang
invocation from 1454853 to 133663, an order of magnitude improvement.

The recommit fixes the LLDB build failure.

Differential Revision: https://reviews.llvm.org/D117348

Added: 
clang/unittests/Lex/PPMemoryAllocationsTest.cpp

Modified: 
clang/include/clang/Lex/MacroInfo.h
clang/lib/Lex/MacroInfo.cpp
clang/lib/Lex/PPDirectives.cpp
clang/lib/Serialization/ASTReader.cpp
clang/lib/Serialization/ASTWriter.cpp
clang/unittests/Lex/CMakeLists.txt
lldb/source/Plugins/ExpressionParser/Clang/ClangModulesDeclVendor.cpp

Removed: 




diff --git a/clang/include/clang/Lex/MacroInfo.h b/clang/include/clang/Lex/MacroInfo.h
index 0347a7a37186b..1947bc8fc509e 100644
--- a/clang/include/clang/Lex/MacroInfo.h
+++ b/clang/include/clang/Lex/MacroInfo.h
@@ -54,11 +54,14 @@ class MacroInfo {
   /// macro, this includes the \c __VA_ARGS__ identifier on the list.
   IdentifierInfo **ParameterList = nullptr;
 
+  /// This is the list of tokens that the macro is defined to.
+  const Token *ReplacementTokens = nullptr;
+
   /// \see ParameterList
   unsigned NumParameters = 0;
 
-  /// This is the list of tokens that the macro is defined to.
-  SmallVector<Token, 8> ReplacementTokens;
+  /// \see ReplacementTokens
+  unsigned NumReplacementTokens = 0;
 
   /// Length in characters of the macro definition.
   mutable unsigned DefinitionLength;
@@ -230,26 +233,47 @@ class MacroInfo {
   bool isWarnIfUnused() const { return IsWarnIfUnused; }
 
   /// Return the number of tokens that this macro expands to.
-  unsigned getNumTokens() const { return ReplacementTokens.size(); }
+  unsigned getNumTokens() const { return NumReplacementTokens; }
 
   const Token &getReplacementToken(unsigned Tok) const {
-    assert(Tok < ReplacementTokens.size() && "Invalid token #");
+    assert(Tok < NumReplacementTokens && "Invalid token #");
     return ReplacementTokens[Tok];
   }
 
-  using tokens_iterator = SmallVectorImpl<Token>::const_iterator;
+  using const_tokens_iterator = const Token *;
 
-  tokens_iterator tokens_begin() const { return ReplacementTokens.begin(); }
-  tokens_iterator tokens_end() const { return ReplacementTokens.end(); }
-  bool tokens_empty() const { return ReplacementTokens.empty(); }
-  ArrayRef<Token> tokens() const { return ReplacementTokens; }
+  const_tokens_iterator tokens_begin() const { return ReplacementTokens; }
+  const_tokens_iterator tokens_end() const {
+    return ReplacementTokens + NumReplacementTokens;
+  }
+  bool tokens_empty() const { return NumReplacementTokens == 0; }
+  ArrayRef<Token> tokens() const {
+    return llvm::makeArrayRef(ReplacementTokens, NumReplacementTokens);
+  }
 
-  /// Add the specified token to the replacement text for the macro.
-  void AddTokenToBody(const Token &Tok) {
+  llvm::MutableArrayRef<Token>
+  allocateTokens(unsigned NumTokens, llvm::BumpPtrAllocator &PPAllocator) {
+    assert(ReplacementTokens == nullptr && NumReplacementTokens == 0 &&
+           "Token list already allocated!");
+    NumReplacementTokens = NumTokens;
+Token *NewReplacementT

[Lldb-commits] [lldb] r366956 - [Support] move FileCollector from LLDB to llvm/Support

2019-10-04 Thread Alex Lorenz via lldb-commits
Author: arphaman
Date: Wed Jul 24 15:59:20 2019
New Revision: 366956

URL: http://llvm.org/viewvc/llvm-project?rev=366956&view=rev
Log:
[Support] move FileCollector from LLDB to llvm/Support

The file collector class is useful for creating reproducers,
not just for LLDB, but for other tools as well in LLVM/Clang.

Differential Revision: https://reviews.llvm.org/D65237

Removed:
lldb/trunk/source/Utility/FileCollector.cpp
lldb/trunk/unittests/Utility/FileCollectorTest.cpp
Modified:
lldb/trunk/include/lldb/Utility/FileCollector.h
lldb/trunk/source/Utility/CMakeLists.txt
lldb/trunk/unittests/Utility/CMakeLists.txt

Modified: lldb/trunk/include/lldb/Utility/FileCollector.h
URL: http://llvm.org/viewvc/llvm-project/lldb/trunk/include/lldb/Utility/FileCollector.h?rev=366956&r1=366955&r2=366956&view=diff
==
--- lldb/trunk/include/lldb/Utility/FileCollector.h (original)
+++ lldb/trunk/include/lldb/Utility/FileCollector.h Wed Jul 24 15:59:20 2019
@@ -11,65 +11,29 @@
 
 #include "lldb/Utility/FileSpec.h"
 
-#include "llvm/ADT/SmallVector.h"
-#include "llvm/ADT/StringMap.h"
-#include "llvm/ADT/StringSet.h"
-#include "llvm/ADT/Twine.h"
-#include "llvm/Support/VirtualFileSystem.h"
-
-#include <mutex>
+#include "llvm/Support/FileCollector.h"
 
 namespace lldb_private {
 
 /// Collects files into a directory and generates a mapping that can be used by
 /// the VFS.
-class FileCollector {
+class FileCollector : public llvm::FileCollector {
 public:
-  FileCollector(const FileSpec &root, const FileSpec &overlay);
-
-  void AddFile(const llvm::Twine &file);
-  void AddFile(const FileSpec &file) { return AddFile(file.GetPath()); }
+  FileCollector(const FileSpec &root, const FileSpec &overlay) :
+    llvm::FileCollector(root.GetPath(), overlay.GetPath()) {}
 
-  /// Write the yaml mapping (for the VFS) to the given file.
-  std::error_code WriteMapping(const FileSpec &mapping_file);
+  using llvm::FileCollector::AddFile;
 
-  /// Copy the files into the root directory.
-  ///
-  /// When stop_on_error is true (the default) we abort as soon as one file
-  /// cannot be copied. This is relatively common, for example when a file was
-  /// removed after it was added to the mapping.
-  std::error_code CopyFiles(bool stop_on_error = true);
-
-protected:
-  void AddFileImpl(llvm::StringRef src_path);
-
-  bool MarkAsSeen(llvm::StringRef path) { return m_seen.insert(path).second; }
-
-  bool GetRealPath(llvm::StringRef src_path,
-                   llvm::SmallVectorImpl<char> &result);
-
-  void AddFileToMapping(llvm::StringRef virtual_path,
-                        llvm::StringRef real_path) {
-    m_vfs_writer.addFileMapping(virtual_path, real_path);
+  void AddFile(const FileSpec &file) {
+  std::string path = file.GetPath();
+  llvm::FileCollector::AddFile(path);
   }
 
-  /// Synchronizes adding files.
-  std::mutex m_mutex;
-
-  /// The root directory where files are copied.
-  FileSpec m_root;
-
-  /// The root directory where the VFS overlay lives.
-  FileSpec m_overlay_root;
-
-  /// Tracks already seen files so they can be skipped.
-  llvm::StringSet<> m_seen;
-
-  /// The yaml mapping writer.
-  llvm::vfs::YAMLVFSWriter m_vfs_writer;
-
-  /// Caches real_path calls when resolving symlinks.
-  llvm::StringMap<std::string> m_symlink_map;
+  /// Write the yaml mapping (for the VFS) to the given file.
+  std::error_code WriteMapping(const FileSpec &mapping_file) {
+    std::string path = mapping_file.GetPath();
+    return llvm::FileCollector::WriteMapping(path);
+  }
 };
 
 } // namespace lldb_private
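
The using-declaration in the wrapper above matters because a member function
declared in a derived class hides every base-class overload of the same name.
A minimal, self-contained sketch of that rule, with hypothetical names
(`Path`, `BaseCollector` and `Collector` stand in for FileSpec,
llvm::FileCollector and the LLDB wrapper; none of this is the actual LLVM
code):

```cpp
#include <cassert>
#include <string>

// Hypothetical stand-in for FileSpec.
struct Path {
  std::string Text;
};

// Hypothetical stand-in for the base collector, which works on strings.
class BaseCollector {
public:
  int Added = 0;
  std::string Last;

  void AddFile(const std::string &File) {
    ++Added;
    Last = File;
  }
};

// Hypothetical stand-in for the thin wrapper that also accepts Path.
class Collector : public BaseCollector {
public:
  // Re-expose the base overload. Without this line, the AddFile declared
  // below would hide it and a call with a std::string would not compile.
  using BaseCollector::AddFile;

  void AddFile(const Path &File) { BaseCollector::AddFile(File.Text); }
};
```

With the using-declaration in place, callers can pass either a string or a
`Path` to `Collector::AddFile`, and both calls end up in the base
implementation.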

Modified: lldb/trunk/source/Utility/CMakeLists.txt
URL: http://llvm.org/viewvc/llvm-project/lldb/trunk/source/Utility/CMakeLists.txt?rev=366956&r1=366955&r2=366956&view=diff
==
--- lldb/trunk/source/Utility/CMakeLists.txt (original)
+++ lldb/trunk/source/Utility/CMakeLists.txt Wed Jul 24 15:59:20 2019
@@ -23,7 +23,6 @@ add_lldb_library(lldbUtility
   DataEncoder.cpp
   DataExtractor.cpp
   Environment.cpp
-  FileCollector.cpp
   Event.cpp
   FileSpec.cpp
   IOObject.cpp

Removed: lldb/trunk/source/Utility/FileCollector.cpp
URL: http://llvm.org/viewvc/llvm-project/lldb/trunk/source/Utility/FileCollector.cpp?rev=366955&view=auto
==
--- lldb/trunk/source/Utility/FileCollector.cpp (original)
+++ lldb/trunk/source/Utility/FileCollector.cpp (removed)
@@ -1,182 +0,0 @@
-//===-- FileCollector.cpp ---*- C++ -*-===//
-//
-// Part of the LLVM Project, under the Apache License v2.0 with LLVM Exceptions.
-// See https://llvm.org/LICENSE.txt for license information.
-// SPDX-License-Identifier: Apache-2.0 WITH LLVM-exception
-//
-//===--===//
-
-#include "lldb/Utility/FileCollector.h"
-
-#include "llvm/ADT/S