Author: Jacob Lalonde
Date: 2025-04-08T09:47:52-07:00
New Revision: f869d6efeec825c384dd9410fd29f90078e40c30

URL: 
https://github.com/llvm/llvm-project/commit/f869d6efeec825c384dd9410fd29f90078e40c30
DIFF: 
https://github.com/llvm/llvm-project/commit/f869d6efeec825c384dd9410fd29f90078e40c30.diff

LOG: [LLDB][Minidump]Update MinidumpFileBuilder to read and write in chunks 
(#129307)

I recently received an internal error report that LLDB was OOM'ing when
creating a Minidump. In my 64b refactor we made a decision to acquire
buffers the size of the largest memory region so we could read all of
the contents in one call. This made error handling very simple (and
simpler coding for me!) but had the trade off of large allocations if
huge pages were enabled.

This patch is one I've had on the back burner for awhile, but we can
read and write the Minidump memory sections in discrete chunks which we
already do for writing to disk.

I had to refactor the error handling a bit, but it remains the same. We
make a best effort attempt to read as much of the memory region as
possible, but fail immediately if we receive an error writing to disk. I
did not add new tests for this because our existing test suite is quite
good, but I did manually verify a few Minidumps couldn't read beyond the
red_zone.

```
(lldb) reg read $sp
     rsp = 0x00007fffffffc3b0
(lldb) p/x 0x00007fffffffc3b0 - 128
(long) 0x00007fffffffc330
(lldb) memory read 0x00007fffffffc330
0x7fffffffc330: 60 c3 ff ff ff 7f 00 00 60 cd ff ff ff 7f 00 00  
`.......`.......
0x7fffffffc340: 60 c3 ff ff ff 7f 00 00 65 e6 26 00 00 00 00 00  
`.......e.&.....
(lldb) memory read 0x00007fffffffc329
error: could not parse memory info (Success!)
```

I'm not sure how to quantify the memory improvement other than we would
allocate the largest size regardless of the size. So a 2gb unreadable
region would cause a 2gb allocation even if we were reading 4096 kb. Now
we will take the range size or the max chunk size of 128 mb.

Added: 
    

Modified: 
    lldb/include/lldb/Target/Process.h
    lldb/source/Plugins/ObjectFile/Minidump/MinidumpFileBuilder.cpp
    lldb/source/Plugins/ObjectFile/Minidump/MinidumpFileBuilder.h
    lldb/source/Target/Process.cpp

Removed: 
    


################################################################################
diff  --git a/lldb/include/lldb/Target/Process.h 
b/lldb/include/lldb/Target/Process.h
index 2e827d4c5cb74..7bed610b2830d 100644
--- a/lldb/include/lldb/Target/Process.h
+++ b/lldb/include/lldb/Target/Process.h
@@ -1589,6 +1589,52 @@ class Process : public 
std::enable_shared_from_this<Process>,
   size_t ReadMemoryFromInferior(lldb::addr_t vm_addr, void *buf, size_t size,
                                 Status &error);
 
+  // Callback definition for read Memory in chunks
+  //
+  // Status, the status returned from ReadMemoryFromInferior
+  // addr_t, the bytes_addr, start + bytes read so far.
+  // void*, pointer to the bytes read
+  // bytes_size, the count of bytes read for this chunk
+  typedef std::function<IterationAction(
+      lldb_private::Status &error, lldb::addr_t bytes_addr, const void *bytes,
+      lldb::offset_t bytes_size)>
+      ReadMemoryChunkCallback;
+
+  /// Read of memory from a process in discrete chunks, terminating
+  /// either when all bytes are read, or the supplied callback returns
+  /// IterationAction::Stop
+  ///
+  /// \param[in] vm_addr
+  ///     A virtual load address that indicates where to start reading
+  ///     memory from.
+  ///
+  /// \param[in] buf
+  ///    If NULL, a buffer of \a chunk_size will be created and used for the
+  ///    callback. If non NULL, this buffer must be at least \a chunk_size 
bytes
+  ///    and will be used for storing chunked memory reads.
+  ///
+  /// \param[in] chunk_size
+  ///     The minimum size of the byte buffer, and the chunk size of memory
+  ///     to read.
+  ///
+  /// \param[in] total_size
+  ///     The total number of bytes to read.
+  ///
+  /// \param[in] callback
+  ///     The callback to invoke when a chunk is read from memory.
+  ///
+  /// \return
+  ///     The number of bytes that were actually read into \a buf and
+  ///     written to the provided callback.
+  ///     If the returned number is greater than zero, yet less than \a
+  ///     size, then this function will get called again with \a
+  ///     vm_addr, \a buf, and \a size updated appropriately. Zero is
+  ///     returned in the case of an error.
+  lldb::offset_t ReadMemoryInChunks(lldb::addr_t vm_addr, void *buf,
+                                    lldb::addr_t chunk_size,
+                                    lldb::offset_t total_size,
+                                    ReadMemoryChunkCallback callback);
+
   /// Read a NULL terminated C string from memory
   ///
   /// This function will read a cache page at a time until the NULL

diff  --git a/lldb/source/Plugins/ObjectFile/Minidump/MinidumpFileBuilder.cpp 
b/lldb/source/Plugins/ObjectFile/Minidump/MinidumpFileBuilder.cpp
index c5013ea5e3be4..6ed184273572b 100644
--- a/lldb/source/Plugins/ObjectFile/Minidump/MinidumpFileBuilder.cpp
+++ b/lldb/source/Plugins/ObjectFile/Minidump/MinidumpFileBuilder.cpp
@@ -969,6 +969,66 @@ Status MinidumpFileBuilder::DumpDirectories() const {
   return error;
 }
 
+Status MinidumpFileBuilder::ReadWriteMemoryInChunks(
+    lldb_private::DataBufferHeap &data_buffer,
+    const lldb_private::CoreFileMemoryRange &range, uint64_t &bytes_read) {
+
+  const lldb::addr_t addr = range.range.start();
+  const lldb::addr_t size = range.range.size();
+  Log *log = GetLog(LLDBLog::Object);
+  Status addDataError;
+  Process::ReadMemoryChunkCallback callback =
+      [&](Status &error, lldb::addr_t current_addr, const void *buf,
+          uint64_t bytes_read) -> lldb_private::IterationAction {
+    if (error.Fail() || bytes_read == 0) {
+      LLDB_LOGF(log,
+                "Failed to read memory region at: 0x%" PRIx64
+                ". Bytes read: %" PRIx64 ", error: %s",
+                current_addr, bytes_read, error.AsCString());
+
+      // If we failed in a memory read, we would normally want to skip
+      // this entire region, if we had already written to the minidump
+      // file, we can't easily rewind that state.
+      //
+      // So if we do encounter an error while reading, we just return
+      // immediately, any prior bytes read will still be included but
+      // any bytes partially read before the error are ignored.
+      return lldb_private::IterationAction::Stop;
+    }
+
+    // Write to the minidump file with the chunk potentially flushing to
+    // disk.
+    // This error will be captured by the outer scope and is considered fatal.
+    // If we get an error writing to disk we can't easily guarauntee that we
+    // won't corrupt the minidump.
+    addDataError = AddData(buf, bytes_read);
+    if (addDataError.Fail())
+      return lldb_private::IterationAction::Stop;
+
+    // If we have a partial read, report it, but only if the partial read
+    // didn't finish reading the entire region.
+    if (bytes_read != data_buffer.GetByteSize() &&
+        current_addr + bytes_read != size) {
+      LLDB_LOGF(log,
+                "Memory region at: %" PRIx64 " partiall read 0x%" PRIx64
+                " bytes out of %" PRIx64 " bytes.",
+                current_addr, bytes_read,
+                data_buffer.GetByteSize() - bytes_read);
+
+      // If we've read some bytes, we stop trying to read more and return
+      // this best effort attempt
+      return lldb_private::IterationAction::Stop;
+    }
+
+    // No problems, keep going!
+    return lldb_private::IterationAction::Continue;
+  };
+
+  bytes_read = m_process_sp->ReadMemoryInChunks(
+      addr, data_buffer.GetBytes(), data_buffer.GetByteSize(), size, callback);
+  return addDataError;
+}
+
 static uint64_t
 GetLargestRangeSize(const std::vector<CoreFileMemoryRange> &ranges) {
   uint64_t max_size = 0;
@@ -987,8 +1047,8 @@ 
MinidumpFileBuilder::AddMemoryList_32(std::vector<CoreFileMemoryRange> &ranges,
 
   Log *log = GetLog(LLDBLog::Object);
   size_t region_index = 0;
-  auto data_up =
-      std::make_unique<DataBufferHeap>(GetLargestRangeSize(ranges), 0);
+  lldb_private::DataBufferHeap data_buffer(
+      std::min(GetLargestRangeSize(ranges), MAX_WRITE_CHUNK_SIZE), 0);
   for (const auto &core_range : ranges) {
     // Take the offset before we write.
     const offset_t offset_for_data = GetCurrentDataEndOffset();
@@ -1003,18 +1063,15 @@ 
MinidumpFileBuilder::AddMemoryList_32(std::vector<CoreFileMemoryRange> &ranges,
     ++region_index;
 
     progress.Increment(1, "Adding Memory Range " + core_range.Dump());
-    const size_t bytes_read =
-        m_process_sp->ReadMemory(addr, data_up->GetBytes(), size, error);
-    if (error.Fail() || bytes_read == 0) {
-      LLDB_LOGF(log, "Failed to read memory region. Bytes read: %zu, error: 
%s",
-                bytes_read, error.AsCString());
-      // Just skip sections with errors or zero bytes in 32b mode
+    uint64_t bytes_read = 0;
+    error = ReadWriteMemoryInChunks(data_buffer, core_range, bytes_read);
+    if (error.Fail())
+      return error;
+
+    // If we completely failed to read this range
+    // we can just omit any of the book keeping.
+    if (bytes_read == 0)
       continue;
-    } else if (bytes_read != size) {
-      LLDB_LOGF(
-          log, "Memory region at: %" PRIx64 " failed to read %" PRIx64 " 
bytes",
-          addr, size);
-    }
 
     MemoryDescriptor descriptor;
     descriptor.StartOfMemoryRange =
@@ -1026,11 +1083,6 @@ 
MinidumpFileBuilder::AddMemoryList_32(std::vector<CoreFileMemoryRange> &ranges,
     descriptors.push_back(descriptor);
     if (m_thread_by_range_end.count(end) > 0)
       m_thread_by_range_end[end].Stack = descriptor;
-
-    // Add the data to the buffer, flush as needed.
-    error = AddData(data_up->GetBytes(), bytes_read);
-    if (error.Fail())
-      return error;
   }
 
   // Add a directory that references this list
@@ -1088,6 +1140,8 @@ 
MinidumpFileBuilder::AddMemoryList_64(std::vector<CoreFileMemoryRange> &ranges,
   list_header.BaseRVA = memory_ranges_base_rva;
   m_data.AppendData(&list_header, sizeof(llvm::minidump::Memory64ListHeader));
 
+  lldb_private::DataBufferHeap data_buffer(
+      std::min(GetLargestRangeSize(ranges), MAX_WRITE_CHUNK_SIZE), 0);
   bool cleanup_required = false;
   std::vector<MemoryDescriptor_64> descriptors;
   // Enumerate the ranges and create the memory descriptors so we can append
@@ -1106,8 +1160,6 @@ 
MinidumpFileBuilder::AddMemoryList_64(std::vector<CoreFileMemoryRange> &ranges,
 
   Log *log = GetLog(LLDBLog::Object);
   size_t region_index = 0;
-  auto data_up =
-      std::make_unique<DataBufferHeap>(GetLargestRangeSize(ranges), 0);
   for (const auto &core_range : ranges) {
     const addr_t addr = core_range.range.start();
     const addr_t size = core_range.range.size();
@@ -1120,27 +1172,19 @@ 
MinidumpFileBuilder::AddMemoryList_64(std::vector<CoreFileMemoryRange> &ranges,
     ++region_index;
 
     progress.Increment(1, "Adding Memory Range " + core_range.Dump());
-    const size_t bytes_read =
-        m_process_sp->ReadMemory(addr, data_up->GetBytes(), size, error);
-    if (error.Fail()) {
-      LLDB_LOGF(log, "Failed to read memory region. Bytes read: %zu, error: 
%s",
-                bytes_read, error.AsCString());
-      error.Clear();
+    uint64_t bytes_read = 0;
+    error = ReadWriteMemoryInChunks(data_buffer, core_range, bytes_read);
+    if (error.Fail())
+      return error;
+
+    if (bytes_read == 0) {
       cleanup_required = true;
       descriptors[region_index].DataSize = 0;
     }
     if (bytes_read != size) {
-      LLDB_LOGF(
-          log, "Memory region at: %" PRIx64 " failed to read %" PRIx64 " 
bytes",
-          addr, size);
       cleanup_required = true;
       descriptors[region_index].DataSize = bytes_read;
     }
-
-    // Add the data to the buffer, flush as needed.
-    error = AddData(data_up->GetBytes(), bytes_read);
-    if (error.Fail())
-      return error;
   }
 
   // Early return if there is no cleanup needed.

diff  --git a/lldb/source/Plugins/ObjectFile/Minidump/MinidumpFileBuilder.h 
b/lldb/source/Plugins/ObjectFile/Minidump/MinidumpFileBuilder.h
index 48293ee1bf5e5..a3f8f00ee215d 100644
--- a/lldb/source/Plugins/ObjectFile/Minidump/MinidumpFileBuilder.h
+++ b/lldb/source/Plugins/ObjectFile/Minidump/MinidumpFileBuilder.h
@@ -142,6 +142,14 @@ class MinidumpFileBuilder {
   lldb_private::Status AddDirectory(llvm::minidump::StreamType type,
                                     uint64_t stream_size);
   lldb::offset_t GetCurrentDataEndOffset() const;
+
+  // Read a memory region from the process and write it to the file
+  // in fixed size chunks.
+  lldb_private::Status
+  ReadWriteMemoryInChunks(lldb_private::DataBufferHeap &data_buffer,
+                          const lldb_private::CoreFileMemoryRange &range,
+                          uint64_t &bytes_read);
+
   // Stores directories to fill in later
   std::vector<llvm::minidump::Directory> m_directories;
   // When we write off the threads for the first time, we need to clean them up

diff  --git a/lldb/source/Target/Process.cpp b/lldb/source/Target/Process.cpp
index 0b7ba343c11f2..a9787823b9108 100644
--- a/lldb/source/Target/Process.cpp
+++ b/lldb/source/Target/Process.cpp
@@ -2184,6 +2184,50 @@ size_t Process::ReadMemoryFromInferior(addr_t addr, void 
*buf, size_t size,
   return bytes_read;
 }
 
+lldb::offset_t Process::ReadMemoryInChunks(lldb::addr_t vm_addr, void *buf,
+                                           lldb::addr_t chunk_size,
+                                           lldb::offset_t size,
+                                           ReadMemoryChunkCallback callback) {
+  // Safety check to prevent an infinite loop.
+  if (chunk_size == 0)
+    return 0;
+
+  // Buffer for when a NULL buf is provided, initialized
+  // to 0 bytes, we set it to chunk_size and then replace buf
+  // with the new buffer.
+  DataBufferHeap data_buffer;
+  if (!buf) {
+    data_buffer.SetByteSize(chunk_size);
+    buf = data_buffer.GetBytes();
+  }
+
+  uint64_t bytes_remaining = size;
+  uint64_t bytes_read = 0;
+  Status error;
+  while (bytes_remaining > 0) {
+    // Get the next read chunk size as the minimum of the remaining bytes and
+    // the write chunk max size.
+    const lldb::addr_t bytes_to_read = std::min(bytes_remaining, chunk_size);
+    const lldb::addr_t current_addr = vm_addr + bytes_read;
+    const lldb::addr_t bytes_read_for_chunk =
+        ReadMemoryFromInferior(current_addr, buf, bytes_to_read, error);
+
+    bytes_read += bytes_read_for_chunk;
+    // If the bytes read in this chunk would cause us to overflow, something
+    // went wrong and we should fail fast.
+    if (bytes_read_for_chunk > bytes_remaining)
+      return 0;
+    else
+      bytes_remaining -= bytes_read_for_chunk;
+
+    if (callback(error, current_addr, buf, bytes_read_for_chunk) ==
+        IterationAction::Stop)
+      break;
+  }
+
+  return bytes_read;
+}
+
 uint64_t Process::ReadUnsignedIntegerFromMemory(lldb::addr_t vm_addr,
                                                 size_t integer_byte_size,
                                                 uint64_t fail_value,


        
_______________________________________________
lldb-commits mailing list
lldb-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/lldb-commits

Reply via email to