[Lldb-commits] [PATCH] D115324: Added the ability to cache the finalized symbol tables subsequent debug sessions to start faster.

2021-12-15 Thread Pavel Labath via Phabricator via lldb-commits
labath added a comment.

Thanks for the quick response.

I did another pass over the patch, and I think I have a fairly good 
understanding of how it works. I have another round of comments, but nothing 
major.




Comment at: lldb/include/lldb/Core/DataFileCache.h:9
+
+#ifndef LLDB_UTILITY_DATAFILECACHE_H
+#define LLDB_UTILITY_DATAFILECACHE_H





Comment at: lldb/include/lldb/Core/DataFileCache.h:129-132
+if (m_uuid.hasValue() != rhs.m_uuid.hasValue())
+  return true;
+if (m_uuid.hasValue() && *m_uuid != *rhs.m_uuid)
+  return true;

the Optional class already has comparison operators that do the right thing.



Comment at: lldb/include/lldb/Core/Mangled.h:72-73
+  bool operator!=(const Mangled &rhs) const {
+return m_mangled != rhs.m_mangled ||
+   GetDemangledName() != rhs.GetDemangledName();
+  }





Comment at: lldb/source/Core/Module.cpp:1679
+
+std::string Module::GetCacheKey() {
+  std::string key;

The cache key and cache signature algorithms are fairly similar but subtly 
different (IIUC, one is used to locate the cache entry on the disk and the 
second one to verify that it refers to the correct file). I think it'd be 
better to have the two functions next to each other, to highlight the 
differences. If you moved the signature construction function here (by changing 
it from a Module constructor to a CacheSignature getter function), then that 
would also the dependency problem I mentioned last time as I believe these 
constructors were the only non-utility dependency of that code (I don't think 
that code has to be moved back to Utility, but it *could* be moved, it need 
arises).



Comment at: lldb/source/Symbol/Symbol.cpp:649
+/// The only tricky thing in this encoding is encoding all of the bits in the
+/// bitfields. We use a trick to store all bitfields as a 16 bit value and we
+/// do the same thing when decoding the symbol. There are test that ensure this

I guess this isn't a "trick" anymore, just manual encoding of boolean bitfields 
into an int.



Comment at: lldb/source/Symbol/Symbol.cpp:645
+  file.AppendU16(m_type_data);
+  file.AppendU16((&m_type_data)[1]);
+  m_mangled.Encode(file, strtab);

clayborg wrote:
> labath wrote:
> > This is quite dodgy. It would be better to change the storage so that we 
> > don't have to do this. Perhaps something like
> > ```
> > union {
> >   uint16_t flag_storage;
> >   struct {
> > uint16_t m_type_data_resolved : 1;
> > ...
> >   };
> > };
> > ```
> I encoded them manually now. It was too disruptive to the code to try and add 
> an anonymous union with an anonymous struct as there were many compiler 
> warnings about it being a GNU extension and made the code way too messy, 
> especially in the constructor as you can't init each value easily.
That works too.



Comment at: lldb/source/Symbol/Symtab.cpp:118-147
+if (!pair.second.IsEmpty()) {
+  std::multimap index_to_names;
+  s->Printf("\n0x%2.2x name to index table:\n", pair.first);
+  for (const auto &entry: pair.second)
+index_to_names.insert(std::make_pair(entry.value, entry.cstring));
+  std::set names;
+  uint32_t last_idx = UINT32_MAX;

If I managed to understand what this is doing correctly, it could be more 
simply implemented as:
```
if (pair.second.IsEmpty())
  continue;
s->Printf("\n0x%2.2x name to index table:\n", pair.first);
std::map> index_to_names;
for (const auto &entry: pair.second)
  index_to_names[entry.value].push_back(entry.cstring); // automatically 
creates a new map entry for the first string
for (const auto &pair: index_to_names) {
  s->Printf("0x%8.8x: ", pair.first);
  for (const auto &name: pair.second) // vector always non-empty
s->Printf("\"%s\" ", name.AsCString());
  s->EOL();
}
```



Comment at: lldb/source/Symbol/Symtab.cpp:303
 m_name_indexes_computed = true;
+// During unit testing we don't have a object file.
 ElapsedTime elapsed(m_objfile->GetModule()->GetSymtabIndexTime());

I guess this is no longer relevant.



Comment at: lldb/source/Symbol/Symtab.cpp:1214
+  if (Encode(file)) {
+auto key = GetCacheKey();
+cache->SetCachedData(key, file.GetData());

this is not a good use of auto per the llvm coding standards



Comment at: lldb/source/Symbol/Symtab.cpp:1221
+
+void EncodeCStrMap(DataEncoder &encoder, ConstStringTable &strtab,
+   const UniqueCStringMap &cstr_map) {

add `static` (for the next function as well)



Comment at: 
lldb/test/API/functionalities/module_cache/universal/TestModuleCacheUniversal.py:21
+self.cache_dir = os.path.join(self.getBuildDir(), 'lldb-module-cache'

[Lldb-commits] [PATCH] D115324: Added the ability to cache the finalized symbol tables subsequent debug sessions to start faster.

2021-12-15 Thread Greg Clayton via Phabricator via lldb-commits
clayborg updated this revision to Diff 394656.
clayborg marked 7 inline comments as done.
clayborg added a comment.

- Fixed all review issues
- Moved tests to separate test files in SymbolTests


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D115324/new/

https://reviews.llvm.org/D115324

Files:
  lldb/include/lldb/Core/DataFileCache.h
  lldb/include/lldb/Core/Mangled.h
  lldb/include/lldb/Core/Module.h
  lldb/include/lldb/Core/ModuleList.h
  lldb/include/lldb/Host/FileSystem.h
  lldb/include/lldb/Symbol/ObjectFile.h
  lldb/include/lldb/Symbol/Symbol.h
  lldb/include/lldb/Symbol/Symtab.h
  lldb/include/lldb/Utility/DataEncoder.h
  lldb/include/lldb/lldb-forward.h
  lldb/packages/Python/lldbsuite/test/lldbtest.py
  lldb/source/Core/CMakeLists.txt
  lldb/source/Core/CoreProperties.td
  lldb/source/Core/DataFileCache.cpp
  lldb/source/Core/Mangled.cpp
  lldb/source/Core/Module.cpp
  lldb/source/Core/ModuleList.cpp
  lldb/source/Host/common/FileSystem.cpp
  lldb/source/Symbol/ObjectFile.cpp
  lldb/source/Symbol/Symbol.cpp
  lldb/source/Symbol/Symtab.cpp
  lldb/source/Utility/DataEncoder.cpp
  lldb/test/API/functionalities/module_cache/bsd/Makefile
  lldb/test/API/functionalities/module_cache/bsd/TestModuleCacheBSD.py
  lldb/test/API/functionalities/module_cache/bsd/a.c
  lldb/test/API/functionalities/module_cache/bsd/b.c
  lldb/test/API/functionalities/module_cache/bsd/c.c
  lldb/test/API/functionalities/module_cache/bsd/main.c
  lldb/test/API/functionalities/module_cache/simple_exe/Makefile
  lldb/test/API/functionalities/module_cache/simple_exe/TestModuleCacheSimple.py
  lldb/test/API/functionalities/module_cache/simple_exe/main.c
  lldb/test/API/functionalities/module_cache/universal/Makefile
  
lldb/test/API/functionalities/module_cache/universal/TestModuleCacheUniversal.py
  lldb/test/API/functionalities/module_cache/universal/main.c
  lldb/unittests/Symbol/CMakeLists.txt
  lldb/unittests/Symbol/MangledTest.cpp
  lldb/unittests/Symbol/SymbolTest.cpp
  lldb/unittests/Symbol/SymtabTest.cpp

Index: lldb/unittests/Symbol/SymtabTest.cpp
===
--- /dev/null
+++ lldb/unittests/Symbol/SymtabTest.cpp
@@ -0,0 +1,305 @@
+//===-- SymbolTest.cpp ===//
+//
+// Part of the LLVM Project, under the Apache License v2.0 with LLVM Exceptions.
+// See https://llvm.org/LICENSE.txt for license information.
+// SPDX-License-Identifier: Apache-2.0 WITH LLVM-exception
+//
+//===--===//
+
+#include "Plugins/ObjectFile/Mach-O/ObjectFileMachO.h"
+#include "Plugins/SymbolFile/DWARF/SymbolFileDWARF.h"
+#include "Plugins/TypeSystem/Clang/TypeSystemClang.h"
+#include "TestingSupport/SubsystemRAII.h"
+#include "TestingSupport/TestUtilities.h"
+
+#include "lldb/Core/DataFileCache.h"
+#include "lldb/Core/Module.h"
+#include "lldb/Host/FileSystem.h"
+#include "lldb/Host/HostInfo.h"
+#include "lldb/Symbol/Symbol.h"
+#include "lldb/Symbol/Symtab.h"
+#include "lldb/Utility/DataEncoder.h"
+#include "lldb/Utility/DataExtractor.h"
+
+#include 
+
+#include "gtest/gtest.h"
+
+using namespace lldb;
+using namespace lldb_private;
+
+class SymtabTest : public testing::Test {
+  SubsystemRAII
+  subsystem;
+};
+
+static void EncodeDecode(const Symtab &object, ByteOrder byte_order) {
+  const uint8_t addr_size = 8;
+  DataEncoder file(byte_order, addr_size);
+
+  object.Encode(file);
+  llvm::ArrayRef bytes = file.GetData();
+  DataExtractor data(bytes.data(), bytes.size(), byte_order, addr_size);
+  Symtab decoded_object(object.GetObjectFile());
+  offset_t data_offset = 0;
+  bool uuid_mismatch = false;
+  decoded_object.Decode(data, &data_offset, uuid_mismatch);
+  ASSERT_EQ(object.GetNumSymbols(), decoded_object.GetNumSymbols());
+  for (size_t i = 0; i < object.GetNumSymbols(); ++i)
+EXPECT_EQ(*object.SymbolAtIndex(i), *decoded_object.SymbolAtIndex(i));
+}
+
+static void EncodeDecode(const Symtab &object) {
+  EncodeDecode(object, eByteOrderLittle);
+  EncodeDecode(object, eByteOrderBig);
+}
+
+TEST_F(SymtabTest, EncodeDecodeSymtab) {
+
+  auto ExpectedFile = TestFile::fromYaml(R"(
+--- !mach-o
+FileHeader:
+  magic:   0xFEEDFACF
+  cputype: 0x10C
+  cpusubtype:  0x0
+  filetype:0x2
+  ncmds:   17
+  sizeofcmds:  792
+  flags:   0x200085
+  reserved:0x0
+LoadCommands:
+  - cmd: LC_SEGMENT_64
+cmdsize: 72
+segname: __PAGEZERO
+vmaddr:  0
+vmsize:  4294967296
+fileoff: 0
+filesize:0
+maxprot: 0
+initprot:0
+nsects:  0
+flags:   0
+  - cmd: LC_SEGMENT_64
+cmdsize: 232
+segname: __TEXT
+vmaddr:  4294967296
+vmsize:  16384
+fileoff: 0
+filesize:16384

[Lldb-commits] [PATCH] D115324: Added the ability to cache the finalized symbol tables subsequent debug sessions to start faster.

2021-12-15 Thread Greg Clayton via Phabricator via lldb-commits
clayborg added inline comments.



Comment at: lldb/source/Core/Module.cpp:1679
+
+std::string Module::GetCacheKey() {
+  std::string key;

labath wrote:
> The cache key and cache signature algorithms are fairly similar but subtly 
> different (IIUC, one is used to locate the cache entry on the disk and the 
> second one to verify that it refers to the correct file). I think it'd be 
> better to have the two functions next to each other, to highlight the 
> differences. If you moved the signature construction function here (by 
> changing it from a Module constructor to a CacheSignature getter function), 
> then that would also the dependency problem I mentioned last time as I 
> believe these constructors were the only non-utility dependency of that code 
> (I don't think that code has to be moved back to Utility, but it *could* be 
> moved, it need arises).
So the signature only contains an optional UUID, optional mod time, and 
optional object mod time, so they really are not related. The cache key must be 
the same for the same file on disk even if the file gets updated. So the cache 
key is designed to stay the same for the same file on disk, but also allow for 
the same file on disk to have more than one architecture (universal files) or 
for a file to contain multiple files like .a files that contain .o files. Since 
we only include the basename of the path for the module file, we also include 
the hash from Module::Hash() which _does_ include the full file path. That way 
we can have many "a.out" cache files from different directories and the main 
thing that will differ is the hash that is appended at the end. 

The DataCacheFile.cpp was relying on the ModuleList.cpp in order to read the 
"symbols.* preferences via "ModuleList::GetGlobalModuleListProperties()", so 
the ModuleList is actually what was the part that violated the dependency issue.



Comment at: lldb/source/Symbol/Symbol.cpp:649
+/// The only tricky thing in this encoding is encoding all of the bits in the
+/// bitfields. We use a trick to store all bitfields as a 16 bit value and we
+/// do the same thing when decoding the symbol. There are test that ensure this

labath wrote:
> I guess this isn't a "trick" anymore, just manual encoding of boolean 
> bitfields into an int.
It is. I tried doing your approach of making an anonymous union and an 
anonymous struct but I got a bunch of warnings that these were GNU extensions 
and I feared that some buildbots would fail with warnings as errors. I also 
didn't know if this would mess up the windows buildbots in case they didn't 
support these. So to be safe I just manually encode stuff to be safe.



Comment at: lldb/source/Symbol/Symtab.cpp:118-147
+if (!pair.second.IsEmpty()) {
+  std::multimap index_to_names;
+  s->Printf("\n0x%2.2x name to index table:\n", pair.first);
+  for (const auto &entry: pair.second)
+index_to_names.insert(std::make_pair(entry.value, entry.cstring));
+  std::set names;
+  uint32_t last_idx = UINT32_MAX;

labath wrote:
> If I managed to understand what this is doing correctly, it could be more 
> simply implemented as:
> ```
> if (pair.second.IsEmpty())
>   continue;
> s->Printf("\n0x%2.2x name to index table:\n", pair.first);
> std::map> index_to_names;
> for (const auto &entry: pair.second)
>   index_to_names[entry.value].push_back(entry.cstring); // automatically 
> creates a new map entry for the first string
> for (const auto &pair: index_to_names) {
>   s->Printf("0x%8.8x: ", pair.first);
>   for (const auto &name: pair.second) // vector always non-empty
> s->Printf("\"%s\" ", name.AsCString());
>   s->EOL();
> }
> ```
Whoops this was just dumping code I added to verify that the name tables were 
being reproduced correctly. No one will want to see these. I will remove the 
code.



Comment at: 
lldb/test/API/functionalities/module_cache/universal/TestModuleCacheUniversal.py:21
+self.cache_dir = os.path.join(self.getBuildDir(), 'lldb-module-cache')
+print('build-dir: ' + self.getBuildDir())
+# Set the lldb module cache directory to a directory inside the build

labath wrote:
> leftover from debugging?
yes! good catch



Comment at: 
lldb/test/API/functionalities/module_cache/universal/TestModuleCacheUniversal.py:57-58
+
+self.assertEqual(len(cache_files), 2,
+"make sure there are two files in the module cache directory 
(%s) for %s" % (self.cache_dir, exe_basename))

labath wrote:
> IIUC, the python tests check the caching behavior, while the (c++) unit tests 
> check that the actual contents of the cache entries can be read/written 
> correctly. I like that.
Exactly. I wanted to make sure that encoding could be done with unit testing as 
that is easier. I could also verify that every bit i

[Lldb-commits] [PATCH] D115324: Added the ability to cache the finalized symbol tables subsequent debug sessions to start faster.

2021-12-15 Thread Greg Clayton via Phabricator via lldb-commits
clayborg added a comment.

Pavel: I fixed all issues you identified. Let me know if there is anything else 
you would like to see changed.


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D115324/new/

https://reviews.llvm.org/D115324

___
lldb-commits mailing list
lldb-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/lldb-commits


[Lldb-commits] [lldb] 8b62429 - Use StringRef instead of char* (NFC)

2021-12-15 Thread Adrian Prantl via lldb-commits

Author: Adrian Prantl
Date: 2021-12-15T14:48:39-08:00
New Revision: 8b624290635fea64dfa14587b650dbb5077879a2

URL: 
https://github.com/llvm/llvm-project/commit/8b624290635fea64dfa14587b650dbb5077879a2
DIFF: 
https://github.com/llvm/llvm-project/commit/8b624290635fea64dfa14587b650dbb5077879a2.diff

LOG: Use StringRef instead of char* (NFC)

Added: 


Modified: 
lldb/include/lldb/Expression/UserExpression.h
lldb/source/Expression/UserExpression.cpp

Removed: 




diff  --git a/lldb/include/lldb/Expression/UserExpression.h 
b/lldb/include/lldb/Expression/UserExpression.h
index 8236c417f73a0..3874a60e06f09 100644
--- a/lldb/include/lldb/Expression/UserExpression.h
+++ b/lldb/include/lldb/Expression/UserExpression.h
@@ -266,10 +266,8 @@ class UserExpression : public Expression {
   0x1001; ///< ValueObject::GetError() returns this if there is no result
   /// from the expression.
 
-  const char *GetFixedText() {
-if (m_fixed_text.empty())
-  return nullptr;
-return m_fixed_text.c_str();
+  llvm::StringRef GetFixedText() {
+return m_fixed_text;
   }
 
 protected:

diff  --git a/lldb/source/Expression/UserExpression.cpp 
b/lldb/source/Expression/UserExpression.cpp
index b61781c0b82bd..6c6fbda4ec180 100644
--- a/lldb/source/Expression/UserExpression.cpp
+++ b/lldb/source/Expression/UserExpression.cpp
@@ -254,9 +254,7 @@ UserExpression::Evaluate(ExecutionContext &exe_ctx,
   if (fixed_expression == nullptr)
 fixed_expression = &tmp_fixed_expression;
 
-  const char *fixed_text = user_expression_sp->GetFixedText();
-  if (fixed_text != nullptr)
-fixed_expression->append(fixed_text);
+  *fixed_expression = user_expression_sp->GetFixedText().str();
 
   // If there is a fixed expression, try to parse it:
   if (!parse_success) {
@@ -285,8 +283,8 @@ UserExpression::Evaluate(ExecutionContext &exe_ctx,
 } else {
   // The fixed expression also didn't parse. Let's check for any new
   // Fix-Its we could try.
-  if (fixed_expression_sp->GetFixedText()) {
-*fixed_expression = fixed_expression_sp->GetFixedText();
+  if (!fixed_expression_sp->GetFixedText().empty()) {
+*fixed_expression = fixed_expression_sp->GetFixedText().str();
   } else {
 // Fixed expression didn't compile without a fixit, don't retry and
 // don't tell the user about it.



___
lldb-commits mailing list
lldb-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/lldb-commits


[Lldb-commits] [lldb] 11c2af0 - Remove redundant check (NFC)

2021-12-15 Thread Adrian Prantl via lldb-commits

Author: Adrian Prantl
Date: 2021-12-15T14:54:03-08:00
New Revision: 11c2af04f27a293de974ffdfc8faefe247a4e722

URL: 
https://github.com/llvm/llvm-project/commit/11c2af04f27a293de974ffdfc8faefe247a4e722
DIFF: 
https://github.com/llvm/llvm-project/commit/11c2af04f27a293de974ffdfc8faefe247a4e722.diff

LOG: Remove redundant check (NFC)

Added: 


Modified: 
lldb/source/Expression/UserExpression.cpp

Removed: 




diff  --git a/lldb/source/Expression/UserExpression.cpp 
b/lldb/source/Expression/UserExpression.cpp
index 6c6fbda4ec18..692594b03f16 100644
--- a/lldb/source/Expression/UserExpression.cpp
+++ b/lldb/source/Expression/UserExpression.cpp
@@ -263,8 +263,7 @@ UserExpression::Evaluate(ExecutionContext &exe_ctx,
 user_expression_sp.reset();
 
 execution_results = lldb::eExpressionParseError;
-if (fixed_expression && !fixed_expression->empty() &&
-options.GetAutoApplyFixIts()) {
+if (!fixed_expression->empty() && options.GetAutoApplyFixIts()) {
   const uint64_t max_fix_retries = options.GetRetriesWithFixIts();
   for (uint64_t i = 0; i < max_fix_retries; ++i) {
 // Try parsing the fixed expression.



___
lldb-commits mailing list
lldb-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/lldb-commits