jasonmolenda created this revision. jasonmolenda added reviewers: jingham, clayborg. jasonmolenda added a project: LLDB. Herald added subscribers: lldb-commits, JDevlieghere. Herald added a reviewer: JDevlieghere. jasonmolenda requested review of this revision.
I implemented something that's been talked about for a while: "skinny corefiles", user process corefiles that include only dirty memory. This patchset implements them in lldb's "process save-core" and in lldb's corefile reading, for Mach-O files.

Because of how the system libraries are shared across all processes on Darwin systems, dumping all available memory for a process involves writing multiple gigabytes for even a trivial command line app. Most of that corefile is unmodified memory pages for the system libraries; it's a very expensive and slow operation.

This patchset does a few things:

1. Adds support in debugserver to find the list of pages in a memory region that are dirty. It adds a key-value pair to qHostInfo to advertise how large a VM memory page is on the target system, and it includes a list of dirty pages for a memory region in the qMemoryRegionInfo reply to lldb's queries about memory regions.

2. Adds a new LC_NOTE "all image infos" in the corefile, which specifies all binary images that are present in the inferior process and where they are loaded in memory. The filepaths, UUIDs, and segment name+load addresses are required because lldb will not have access to dyld's dyld_all_image_infos structure or the ability to read these binaries out of memory. (Normally the segment load addresses could be found by reading the Mach-O load commands for the binaries from the corefile.)

3. Adds a new LC_NOTE "executing uuids" in the corefile, which is a list of the UUIDs of the binary images that are actually executing code at the point when the corefile was written. That is, the set of binaries that appear on any backtrace on any thread at the point of the coredump. A graphical app on macOS these days can easily have 500 different binary images loaded in the process; likely only a couple dozen of those are actually executing. Knowing which binary images are executing allows us to do a more expensive search for these most-important binaries, like we do with crashlog.py.

4. Changes the Mach-O corefile creation to only include dirty pages when communicating with a debugserver that provides this information.

5. Reads and uses the "all image infos" and "executing uuids" notes in ProcessMachCore.

6. Finally, of course, adds a test case with 3 binaries (one executable, two dylibs) that hits a breakpoint and saves a skinny corefile. It then moves the executable to a hidden location that lldb can only discover by calling the dsymForUUID script (to test "executing uuids"), deletes one dylib, and leaves the other at the expected location. It loads the skinny corefile into lldb and confirms that the present libraries are present, the absent library is absent (and its non-dirty memory is unreadable), and that we can read variables on the heap/stack correctly.

Before this change, the minimum size for a corefile on macOS 10.15 was around 2GB and it would take around 5 minutes to dump. With this change, a user process corefile for a simple test binary is around 500KB and takes a few seconds to dump. Larger real-world applications will have larger amounts of stack & heap and will have correspondingly larger corefiles.

I'm not thrilled that I'm touching the same parts of lldb as David Spickett in https://reviews.llvm.org/D87442; we're going to create conflicts depending on which of us lands first. Both of us need to modify the qMemoryRegionInfo packet and lldb's MemoryRegionInfo class to store our additional information.

The majority of this patch concerns ObjectFileMachO and ProcessMachCore, as well as debugserver; I expect those to be of less interest to any reviewer/commenters. But of course I welcome any suggestions/comments on those parts as well as the other parts of lldb I touched along the way. I did need to add some API/structs in the ObjectFile base class so ObjectFileMachO could pass information up to ProcessMachCore that is very specific to what I'm doing here.

Repository:
  rG LLVM Github Monorepo

https://reviews.llvm.org/D88387

Files:
  lldb/docs/lldb-gdb-remote.txt
  lldb/include/lldb/Symbol/ObjectFile.h
  lldb/include/lldb/Target/MemoryRegionInfo.h
  lldb/source/Plugins/ObjectFile/Mach-O/ObjectFileMachO.cpp
  lldb/source/Plugins/ObjectFile/Mach-O/ObjectFileMachO.h
  lldb/source/Plugins/Process/gdb-remote/GDBRemoteCommunicationClient.cpp
  lldb/source/Plugins/Process/gdb-remote/GDBRemoteCommunicationClient.h
  lldb/source/Plugins/Process/mach-core/ProcessMachCore.cpp
  lldb/test/API/macosx/skinny-corefile/Makefile
  lldb/test/API/macosx/skinny-corefile/TestSkinnyCorefile.py
  lldb/test/API/macosx/skinny-corefile/main.c
  lldb/test/API/macosx/skinny-corefile/present.c
  lldb/test/API/macosx/skinny-corefile/present.h
  lldb/test/API/macosx/skinny-corefile/to-be-removed.c
  lldb/test/API/macosx/skinny-corefile/to-be-removed.h
  lldb/tools/debugserver/source/DNBDefs.h
  lldb/tools/debugserver/source/MacOSX/MachVMMemory.cpp
  lldb/tools/debugserver/source/RNBRemote.cpp

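
For readers skimming the description, here is a rough protocol-level sketch of items 1 and 4: pulling the `dirty-pages` key out of a qMemoryRegionInfo reply and merging adjacent dirty pages into runs before they become load commands. It is written in Python only for brevity and is not the code in this patch. The function names and the sample reply are hypothetical, and the sketch ignores permissions, which the patch also compares before merging.

```python
def dirty_pages_from_reply(reply):
    """Return the dirty page addresses from a 'key:value;...' reply.

    Returns None when the reply has no dirty-pages key at all (a stub
    that does not report dirty pages), and [] when the key is present
    but empty (a region with no dirty pages).
    """
    fields = {}
    for pair in reply.split(";"):
        if not pair:
            continue
        key, _, value = pair.partition(":")
        fields[key] = value
    if "dirty-pages" not in fields:
        return None
    # Addresses are comma-separated hex values with a 0x prefix.
    return [int(tok, 16) for tok in fields["dirty-pages"].split(",") if tok]


def coalesce_dirty_pages(pages, page_size):
    """Merge page addresses into (start, size) runs of contiguous pages."""
    runs = []
    for addr in sorted(pages):
        if runs and runs[-1][0] + runs[-1][1] == addr:
            # This page starts where the previous run ends: extend the run.
            start, size = runs[-1]
            runs[-1] = (start, size + page_size)
        else:
            runs.append((addr, page_size))
    return runs


# Hypothetical reply; vm-page-size is assumed to be 0x1000 here.
sample = ("start:100000000;size:8000;permissions:rw;"
          "dirty-pages:0x100000000,0x100001000,0x100004000;")
for start, size in coalesce_dirty_pages(dirty_pages_from_reply(sample), 0x1000):
    print(hex(start), hex(size))
```

Because each run is appended to the result as soon as it starts and then extended in place, the final run is never left behind when the loop ends. A loop that instead keeps a separate "pending" entry needs an explicit flush after the last page.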
Index: lldb/tools/debugserver/source/RNBRemote.cpp =================================================================== --- lldb/tools/debugserver/source/RNBRemote.cpp +++ lldb/tools/debugserver/source/RNBRemote.cpp @@ -18,6 +18,7 @@ #include <libproc.h> #include <mach-o/loader.h> #include <mach/exception_types.h> +#include <mach/mach_vm.h> #include <mach/task_info.h> #include <pwd.h> #include <signal.h> @@ -4422,7 +4423,7 @@ __FILE__, __LINE__, p, "Invalid address in qMemoryRegionInfo packet"); } - DNBRegionInfo region_info = {0, 0, 0}; + DNBRegionInfo region_info; DNBProcessMemoryRegionInfo(m_ctx.ProcessID(), address, &region_info); std::ostringstream ostrm; @@ -4442,6 +4443,18 @@ if (region_info.permissions & eMemoryPermissionsExecutable) ostrm << 'x'; ostrm << ';'; + + ostrm << "dirty-pages:"; + if (region_info.dirty_pages.size() > 0) { + bool first = true; + for (nub_addr_t addr : region_info.dirty_pages) { + if (!first) + ostrm << ","; + first = false; + ostrm << "0x" << std::hex << addr; + } + } + ostrm << ";"; } return SendPacket(ostrm.str()); } @@ -4963,6 +4976,8 @@ strm << "default_packet_timeout:10;"; #endif + strm << "vm-page-size:" << std::dec << vm_page_size << ";"; + return SendPacket(strm.str()); } Index: lldb/tools/debugserver/source/MacOSX/MachVMMemory.cpp =================================================================== --- lldb/tools/debugserver/source/MacOSX/MachVMMemory.cpp +++ lldb/tools/debugserver/source/MacOSX/MachVMMemory.cpp @@ -72,6 +72,49 @@ return count; } +#define MAX_STACK_ALLOC_DISPOSITIONS \ + (16 * 1024 / sizeof(int)) // 16K of allocations + +std::vector<nub_addr_t> get_dirty_pages(task_t task, mach_vm_address_t addr, + mach_vm_size_t size) { + std::vector<nub_addr_t> dirty_pages; + + int pages_to_query = size / vm_page_size; + // Don't try to fetch too many pages' dispositions in a single call or we + // could blow our stack out. 
+ mach_vm_size_t dispositions_size = + std::min(pages_to_query, (int)MAX_STACK_ALLOC_DISPOSITIONS); + int dispositions[dispositions_size]; + + mach_vm_size_t chunk_count = + ((pages_to_query + MAX_STACK_ALLOC_DISPOSITIONS - 1) / + MAX_STACK_ALLOC_DISPOSITIONS); + + for (mach_vm_size_t cur_disposition_chunk = 0; + cur_disposition_chunk < chunk_count; cur_disposition_chunk++) { + mach_vm_size_t dispositions_already_queried = + cur_disposition_chunk * MAX_STACK_ALLOC_DISPOSITIONS; + + mach_vm_size_t chunk_pages_to_query = std::min( + pages_to_query - dispositions_already_queried, dispositions_size); + mach_vm_address_t chunk_page_aligned_start_addr = + addr + (dispositions_already_queried * vm_page_size); + + kern_return_t kr = mach_vm_page_range_query( + task, chunk_page_aligned_start_addr, + chunk_pages_to_query * vm_page_size, (mach_vm_address_t)dispositions, + &chunk_pages_to_query); + if (kr != KERN_SUCCESS) + return dirty_pages; + for (mach_vm_size_t i = 0; i < chunk_pages_to_query; i++) { + uint64_t dirty_addr = chunk_page_aligned_start_addr + (i * vm_page_size); + if (dispositions[i] & VM_PAGE_QUERY_PAGE_DIRTY) + dirty_pages.push_back(dirty_addr); + } + } + return dirty_pages; +} + nub_bool_t MachVMMemory::GetMemoryRegionInfo(task_t task, nub_addr_t address, DNBRegionInfo *region_info) { MachVMRegion vmRegion(task); @@ -80,6 +123,8 @@ region_info->addr = vmRegion.StartAddress(); region_info->size = vmRegion.GetByteSize(); region_info->permissions = vmRegion.GetDNBPermissions(); + region_info->dirty_pages = + get_dirty_pages(task, vmRegion.StartAddress(), vmRegion.GetByteSize()); } else { region_info->addr = address; region_info->size = 0; Index: lldb/tools/debugserver/source/DNBDefs.h =================================================================== --- lldb/tools/debugserver/source/DNBDefs.h +++ lldb/tools/debugserver/source/DNBDefs.h @@ -18,6 +18,7 @@ #include <stdio.h> #include <sys/syslimits.h> #include <unistd.h> +#include <vector> // Define nub_addr_t 
and the invalid address value from the architecture #if defined(__x86_64__) || defined(__arm64__) || defined(__aarch64__) @@ -316,9 +317,12 @@ }; struct DNBRegionInfo { +public: + DNBRegionInfo() : addr(0), size(0), permissions(0), dirty_pages() {} nub_addr_t addr; nub_addr_t size; uint32_t permissions; + std::vector<nub_addr_t> dirty_pages; }; enum DNBProfileDataScanType { Index: lldb/test/API/macosx/skinny-corefile/to-be-removed.h =================================================================== --- /dev/null +++ lldb/test/API/macosx/skinny-corefile/to-be-removed.h @@ -0,0 +1,2 @@ +void to_be_removed_init (int in); +int to_be_removed (char *main_heap_buf, int main_const_data, int main_dirty_data); Index: lldb/test/API/macosx/skinny-corefile/to-be-removed.c =================================================================== --- /dev/null +++ lldb/test/API/macosx/skinny-corefile/to-be-removed.c @@ -0,0 +1,21 @@ +#include <stdio.h> +#include <stdlib.h> + +#include "present.h" +#include "to-be-removed.h" + +const int to_be_removed_const_data = 5; +int to_be_removed_dirty_data = 10; + +void to_be_removed_init(int in) { to_be_removed_dirty_data += 10; } + +int to_be_removed(char *main_heap_buf, int main_const_data, + int main_dirty_data) { + char *to_be_removed_heap_buf = (char *)malloc(256); + sprintf(to_be_removed_heap_buf, "got string '%s' have int %d %d %d", + main_heap_buf, to_be_removed_dirty_data, main_const_data, + main_dirty_data); + printf("%s\n", to_be_removed_heap_buf); + return present(to_be_removed_heap_buf, to_be_removed_const_data, + to_be_removed_dirty_data); +} Index: lldb/test/API/macosx/skinny-corefile/present.h =================================================================== --- /dev/null +++ lldb/test/API/macosx/skinny-corefile/present.h @@ -0,0 +1,2 @@ +void present_init (int in); +int present (char *to_be_removed_heap_buf, int to_be_removed_const_data, int to_be_removed_dirty_data); Index: lldb/test/API/macosx/skinny-corefile/present.c 
=================================================================== --- /dev/null +++ lldb/test/API/macosx/skinny-corefile/present.c @@ -0,0 +1,22 @@ +#include <stdio.h> +#include <stdlib.h> + +#include "present.h" + +const int present_const_data = 5; +int present_dirty_data = 10; + +void present_init(int in) { present_dirty_data += 10; } + +int present(char *to_be_removed_heap_buf, int to_be_removed_const_data, + int to_be_removed_dirty_data) { + char *present_heap_buf = (char *)malloc(256); + sprintf(present_heap_buf, "have ints %d %d %d %d", to_be_removed_const_data, + to_be_removed_dirty_data, present_dirty_data, present_const_data); + printf("%s\n", present_heap_buf); + puts(to_be_removed_heap_buf); + + puts("break here"); + + return present_const_data + present_dirty_data; +} Index: lldb/test/API/macosx/skinny-corefile/main.c =================================================================== --- /dev/null +++ lldb/test/API/macosx/skinny-corefile/main.c @@ -0,0 +1,20 @@ +#include <stdio.h> +#include <stdlib.h> +#include <string.h> + +#include "present.h" +#include "to-be-removed.h" + +const int main_const_data = 5; +int main_dirty_data = 10; +int main(int argc, char **argv) { + + to_be_removed_init(argc); + present_init(argc); + main_dirty_data += argc; + + char *heap_buf = (char *)malloc(80); + strcpy(heap_buf, "this is a string on the heap"); + + return to_be_removed(heap_buf, main_const_data, main_dirty_data); +} Index: lldb/test/API/macosx/skinny-corefile/TestSkinnyCorefile.py =================================================================== --- /dev/null +++ lldb/test/API/macosx/skinny-corefile/TestSkinnyCorefile.py @@ -0,0 +1,163 @@ +"""Test that lldb can create a skinny corefile, and load all available libraries correctly.""" + + + +import os +import re +import shutil +import subprocess + +import lldb +from lldbsuite.test.decorators import * +from lldbsuite.test.lldbtest import * +from lldbsuite.test import lldbutil + + +class 
TestFirmwareCorefiles(TestBase): + + 
mydir = TestBase.compute_mydir(__file__) + + @skipIf(debug_info=no_match(["dsym"]), bugnumber="This test is looking explicitly for a dSYM") + @skipUnlessDarwin + def test_lc_note(self): + self.build() + self.aout_exe = self.getBuildArtifact("a.out") + self.aout_dsym = self.getBuildArtifact("a.out.dSYM") + self.to_be_removed_dylib = self.getBuildArtifact("libto-be-removed.dylib") + self.to_be_removed_dsym = self.getBuildArtifact("libto-be-removed.dylib.dSYM") + self.corefile = self.getBuildArtifact("process.core") + self.dsym_for_uuid = self.getBuildArtifact("dsym-for-uuid.sh") + + # After the corefile is created, we'll move a.out and a.out.dSYM + # into hide.noindex and lldb will have to use the + # LLDB_APPLE_DSYMFORUUID_EXECUTABLE script to find them. + self.hide_dir = self.getBuildArtifact("hide.noindex") + lldbutil.mkdir_p(self.hide_dir) + self.hide_aout_exe = self.getBuildArtifact("hide.noindex/a.out") + self.hide_aout_dsym = self.getBuildArtifact("hide.noindex/a.out.dSYM") + + # We can hook in our dsym-for-uuid shell script to lldb with + # this env var instead of requiring a defaults write. + os.environ['LLDB_APPLE_DSYMFORUUID_EXECUTABLE'] = self.dsym_for_uuid + self.addTearDownHook(lambda: os.environ.pop('LLDB_APPLE_DSYMFORUUID_EXECUTABLE', None)) + + dwarfdump_uuid_regex = re.compile( + 'UUID: ([-0-9a-fA-F]+) \(([^\(]+)\) .*') + dwarfdump_cmd_output = subprocess.check_output( + ('/usr/bin/dwarfdump --uuid "%s"' % self.aout_exe), shell=True).decode("utf-8") + aout_uuid = None + for line in dwarfdump_cmd_output.splitlines(): + match = dwarfdump_uuid_regex.search(line) + if match: + aout_uuid = match.group(1) + self.assertNotEqual(aout_uuid, None, "Could not get uuid of built a.out") + + ### Create our dsym-for-uuid shell script which returns self.hide_aout_exe. + shell_cmds = [ + '#! 
/bin/sh', + '# the last argument is the uuid', + 'while [ $# -gt 1 ]', + 'do', + ' shift', + 'done', + 'ret=0', + 'echo "<?xml version=\\"1.0\\" encoding=\\"UTF-8\\"?>"', + 'echo "<!DOCTYPE plist PUBLIC \\"-//Apple//DTD PLIST 1.0//EN\\" \\"http://www.apple.com/DTDs/PropertyList-1.0.dtd\\">"', + 'echo "<plist version=\\"1.0\\">"', + '', + 'if [ "$1" = "%s" ]' % aout_uuid, + 'then', + ' uuid=%s' % aout_uuid, + ' bin=%s' % self.hide_aout_exe, + ' dsym=%s.dSYM/Contents/Resources/DWARF/%s' % (self.hide_aout_exe, os.path.basename(self.hide_aout_exe)), + 'fi', + 'if [ -z "$uuid" -o -z "$bin" -o ! -f "$bin" ]', + 'then', + ' echo "<key>DBGError</key><string>not found</string>"', + ' echo "</plist>"', + ' exit 1', + 'fi', + 'echo "<dict><key>$uuid</key><dict>"', + '', + 'echo "<key>DBGArchitecture</key><string>x86_64</string>"', + 'echo "<key>DBGDSYMPath</key><string>$dsym</string>"', + 'echo "<key>DBGSymbolRichExecutable</key><string>$bin</string>"', + 'echo "</dict></dict></plist>"', + 'exit $ret' + ] + + with open(self.dsym_for_uuid, "w") as writer: + for l in shell_cmds: + writer.write(l + '\n') + + os.chmod(self.dsym_for_uuid, 0o755) + + + # Launch a live process with a.out, libto-be-removed.dylib, + # libpresent.dylib all in their original locations, create + # a corefile at the breakpoint. + (target, process, t, bp) = lldbutil.run_to_source_breakpoint ( + self, "break here", lldb.SBFileSpec('present.c')) + + self.assertTrue(process.IsValid()) + + if self.TraceOn(): + self.runCmd("bt") + self.runCmd("image list") + + self.runCmd("process save-core " + self.corefile) + process.Kill() + target.Clear() + + # Move the main binary and its dSYM into the hide.noindex + # directory. Now the only way lldb can find them is with + # the LLDB_APPLE_DSYMFORUUID_EXECUTABLE shell script - + # so we're testing that this dSYM discovery method works. 
+ os.rename(self.aout_exe, self.hide_aout_exe) + os.rename(self.aout_dsym, self.hide_aout_dsym) + + # Completely remove the libto-be-removed.dylib, so we're + # testing that lldb handles an unavailable binary correctly, + # and non-dirty memory from this binary (e.g. the executing + # instructions) are NOT included in the corefile. + os.unlink(self.to_be_removed_dylib) + shutil.rmtree(self.to_be_removed_dsym) + + + # Now load the corefile + self.target = self.dbg.CreateTarget('') + self.process = self.target.LoadCore(self.corefile) + self.assertTrue(self.process.IsValid()) + if self.TraceOn(): + self.runCmd("image list") + self.runCmd("bt") + + self.assertTrue(self.process.IsValid()) + self.assertTrue(self.process.GetSelectedThread().IsValid()) + + # f0 is present() in libpresent.dylib + f0 = self.process.GetSelectedThread().GetFrameAtIndex(0) + to_be_removed_dirty_data = f0.FindVariable("to_be_removed_dirty_data") + self.assertEqual(to_be_removed_dirty_data.GetValueAsUnsigned(), 20) + + present_heap_buf = f0.FindVariable("present_heap_buf") + self.assertTrue("have ints 5 20 20 5" in present_heap_buf.GetSummary()) + + + # f1 is to_be_removed() in libto-be-removed.dylib + # it has been removed since the corefile was created, + # and the instructions for this frame should NOT be included + # in the corefile. They were not dirty pages. + f1 = self.process.GetSelectedThread().GetFrameAtIndex(1) + err = lldb.SBError() + uint = self.process.ReadUnsignedFromMemory(f1.GetPC(), 4, err) + self.assertTrue(err.Fail()) + + + # TODO Future testing could check that read-only constant data + # (main_const_data, present_const_data) can be read both as an + # SBValue and in an expression -- which means lldb needs to read + # them out of the binaries, they are not present in the corefile. + # And checking file-scope dirty data (main_dirty_data, + # present_dirty_data) the same way would be good, instead of just + # checking the heap and stack like are being done right now. 
Index: lldb/test/API/macosx/skinny-corefile/Makefile =================================================================== --- /dev/null +++ lldb/test/API/macosx/skinny-corefile/Makefile @@ -0,0 +1,15 @@ +LD_EXTRAS = -L. -lto-be-removed -lpresent +C_SOURCES = main.c + +include Makefile.rules + +a.out: libto-be-removed libpresent + +libto-be-removed: libpresent + $(MAKE) -f $(MAKEFILE_RULES) \ + DYLIB_ONLY=YES DYLIB_C_SOURCES=to-be-removed.c DYLIB_NAME=to-be-removed \ + LD_EXTRAS="-L. -lpresent" + +libpresent: + $(MAKE) -f $(MAKEFILE_RULES) \ + DYLIB_ONLY=YES DYLIB_C_SOURCES=present.c DYLIB_NAME=present Index: lldb/source/Plugins/Process/mach-core/ProcessMachCore.cpp =================================================================== --- lldb/source/Plugins/Process/mach-core/ProcessMachCore.cpp +++ lldb/source/Plugins/Process/mach-core/ProcessMachCore.cpp @@ -276,6 +276,21 @@ m_core_range_infos.Sort(); } + // If we have an "executing uuids" LC_NOTE, force a dsymForUUID + // style lookup for those binaries / dSYMs. The corefile may have + // hundreds of binary images included in the process, but only a + // handful of them were actually executing code when the corefile was + // taken. We can do an expensive search for this more limited set of + // images. + std::vector<UUID> executing_uuids = core_objfile->GetCorefileExecutingUUIDs(); + for (const UUID &uuid : executing_uuids) { + ModuleSpec module_spec; + module_spec.GetUUID() = uuid; + Symbols::DownloadObjectAndSymbolFile(module_spec, true); + if (FileSystem::Instance().Exists(module_spec.GetFileSpec())) { + GetTarget().GetOrCreateModule(module_spec, false); + } + } bool found_main_binary_definitively = false; @@ -403,6 +418,41 @@ } } + // If we have a "all image infos" LC_NOTE, try to load all of the + // binaries listed, and set their Section load addresses in the Target. 
+ ObjectFile::MachOCorefileAllImageInfos image_infos = + core_objfile->GetCorefileAllImageInfos(); + if (found_main_binary_definitively == false && image_infos.IsValid()) { + m_dyld_plugin_name = DynamicLoaderDarwinKernel::GetPluginNameStatic(); + found_main_binary_definitively = true; + for (const ObjectFile::MachOCorefileImageEntry &image : + image_infos.all_image_infos) { + ModuleSpec module_spec; + module_spec.GetUUID() = image.uuid; + module_spec.GetFileSpec() = FileSpec(image.filename.c_str()); + Status error; + ModuleSP module_sp = + GetTarget().GetOrCreateModule(module_spec, false, &error); + if (module_sp.get() && module_sp->GetObjectFile()) { + if (module_sp->GetObjectFile()->GetType() == + ObjectFile::eTypeExecutable) { + GetTarget().SetExecutableModule(module_sp, eLoadDependentsNo); + } + for (auto name_vmaddr_tuple : image.segment_load_addresses) { + SectionList *sectlist = module_sp->GetObjectFile()->GetSectionList(); + if (sectlist) { + SectionSP sect_sp = + sectlist->FindSectionByName(std::get<0>(name_vmaddr_tuple)); + if (sect_sp) { + GetTarget().SetSectionLoadAddress(sect_sp, + std::get<1>(name_vmaddr_tuple)); + } + } + } + } + } + } + if (!found_main_binary_definitively && (m_dyld_addr == LLDB_INVALID_ADDRESS || m_mach_kernel_addr == LLDB_INVALID_ADDRESS)) { Index: lldb/source/Plugins/Process/gdb-remote/GDBRemoteCommunicationClient.h =================================================================== --- lldb/source/Plugins/Process/gdb-remote/GDBRemoteCommunicationClient.h +++ lldb/source/Plugins/Process/gdb-remote/GDBRemoteCommunicationClient.h @@ -244,6 +244,10 @@ std::chrono::seconds GetHostDefaultPacketTimeout(); + // Returns the size of VM pages on the target system; + // 0 is returned for an un-set value. 
+ int GetTargetVMPageSize(); + const ArchSpec &GetProcessArchitecture(); void GetRemoteQSupported(); @@ -585,6 +589,7 @@ uint32_t m_gdb_server_version; // from reply to qGDBServerVersion, zero if // qGDBServerVersion is not supported std::chrono::seconds m_default_packet_timeout; + int m_target_vm_page_size; // target system VM page size; 0 if unspecified uint64_t m_max_packet_size; // as returned by qSupported std::string m_qSupported_response; // the complete response to qSupported Index: lldb/source/Plugins/Process/gdb-remote/GDBRemoteCommunicationClient.cpp =================================================================== --- lldb/source/Plugins/Process/gdb-remote/GDBRemoteCommunicationClient.cpp +++ lldb/source/Plugins/Process/gdb-remote/GDBRemoteCommunicationClient.cpp @@ -16,6 +16,7 @@ #include "lldb/Core/ModuleSpec.h" #include "lldb/Host/HostInfo.h" +#include "lldb/Host/StringConvert.h" #include "lldb/Host/XML.h" #include "lldb/Symbol/Symbol.h" #include "lldb/Target/MemoryRegionInfo.h" @@ -102,7 +103,7 @@ m_num_supported_hardware_watchpoints(0), m_host_arch(), m_process_arch(), m_os_build(), m_os_kernel(), m_hostname(), m_gdb_server_name(), m_gdb_server_version(UINT32_MAX), m_default_packet_timeout(0), - m_max_packet_size(0), m_qSupported_response(), + m_target_vm_page_size(0), m_max_packet_size(0), m_qSupported_response(), m_supported_async_json_packets_is_valid(false), m_supported_async_json_packets_sp(), m_qXfer_memory_map(), m_qXfer_memory_map_loaded(false) {} @@ -321,6 +322,7 @@ m_gdb_server_name.clear(); m_gdb_server_version = UINT32_MAX; m_default_packet_timeout = seconds(0); + m_target_vm_page_size = 0; m_max_packet_size = 0; m_qSupported_response.clear(); m_supported_async_json_packets_is_valid = false; @@ -1245,6 +1247,12 @@ SetPacketTimeout(m_default_packet_timeout); ++num_keys_decoded; } + } else if (name.equals("vm-page-size")) { + int page_size; + if (!value.getAsInteger(0, page_size)) { + m_target_vm_page_size = page_size; + 
++num_keys_decoded; + } } } @@ -1380,6 +1388,12 @@ return m_default_packet_timeout; } +int GDBRemoteCommunicationClient::GetTargetVMPageSize() { + if (m_qHostInfo_is_valid == eLazyBoolCalculate) + GetHostInfo(); + return m_target_vm_page_size; +} + addr_t GDBRemoteCommunicationClient::AllocateMemory(size_t size, uint32_t permissions) { if (m_supports_alloc_dealloc_memory != eLazyBoolNo) { @@ -1529,6 +1543,24 @@ std::string name; name_extractor.GetHexByteString(name); region_info.SetName(name.c_str()); + } else if (name.equals("dirty-pages")) { + std::vector<addr_t> dirty_page_list; + std::string comma_sep_str = value.str(); + size_t comma_pos; + addr_t page; + while ((comma_pos = comma_sep_str.find(',')) != std::string::npos) { + comma_sep_str[comma_pos] = '\0'; + page = StringConvert::ToUInt64(comma_sep_str.c_str(), + LLDB_INVALID_ADDRESS, 16); + if (page != LLDB_INVALID_ADDRESS) + dirty_page_list.push_back(page); + comma_sep_str.erase(0, comma_pos + 1); + } + page = StringConvert::ToUInt64(comma_sep_str.c_str(), + LLDB_INVALID_ADDRESS, 16); + if (page != LLDB_INVALID_ADDRESS) + dirty_page_list.push_back(page); + region_info.SetDirtyPageList(dirty_page_list); } else if (name.equals("error")) { StringExtractorGDBRemote error_extractor(value); std::string error_string; @@ -1538,6 +1570,9 @@ } } + if (GetTargetVMPageSize() != 0) + region_info.SetPageSize(GetTargetVMPageSize()); + if (region_info.GetRange().IsValid()) { // We got a valid address range back but no permissions -- which means // this is an unmapped page Index: lldb/source/Plugins/ObjectFile/Mach-O/ObjectFileMachO.h =================================================================== --- lldb/source/Plugins/ObjectFile/Mach-O/ObjectFileMachO.h +++ lldb/source/Plugins/ObjectFile/Mach-O/ObjectFileMachO.h @@ -15,6 +15,7 @@ #include "lldb/Symbol/ObjectFile.h" #include "lldb/Utility/FileSpec.h" #include "lldb/Utility/RangeMap.h" +#include "lldb/Utility/StreamString.h" #include "lldb/Utility/UUID.h" // This class 
needs to be hidden as eventually belongs in a plugin that @@ -116,6 +117,11 @@ lldb_private::UUID &uuid, ObjectFile::BinaryType &type) override; + lldb_private::ObjectFile::MachOCorefileAllImageInfos + GetCorefileAllImageInfos() override; + + std::vector<lldb_private::UUID> GetCorefileExecutingUUIDs() override; + lldb::RegisterContextSP GetThreadContextAtIndex(uint32_t idx, lldb_private::Thread &thread) override; @@ -209,6 +215,14 @@ bool SectionIsLoadable(const lldb_private::Section *section); + static lldb::offset_t CreateAllImageInfosPayload( + const lldb::ProcessSP &process_sp, lldb::offset_t file_offset, + lldb_private::StreamString &all_image_infos_payload); + + static lldb::offset_t CreateExecutingUUIDsPayload( + const lldb::ProcessSP &process_sp, lldb::offset_t file_offset, + lldb_private::StreamString &all_image_infos_payload); + llvm::MachO::mach_header m_header; static lldb_private::ConstString GetSegmentNameTEXT(); static lldb_private::ConstString GetSegmentNameDATA(); Index: lldb/source/Plugins/ObjectFile/Mach-O/ObjectFileMachO.cpp =================================================================== --- lldb/source/Plugins/ObjectFile/Mach-O/ObjectFileMachO.cpp +++ lldb/source/Plugins/ObjectFile/Mach-O/ObjectFileMachO.cpp @@ -6153,6 +6153,14 @@ return num_loaded_sections > 0; } +// Temp struct used to combine contiguous memory regions with +// identical permissions. 
+struct page_object { + addr_t addr; + addr_t size; + uint32_t prot; +}; + bool ObjectFileMachO::SaveCore(const lldb::ProcessSP &process_sp, const FileSpec &outfile, Status &error) { if (!process_sp) @@ -6191,14 +6199,9 @@ Status range_error = process_sp->GetMemoryRegionInfo(0, range_info); const uint32_t addr_byte_size = target_arch.GetAddressByteSize(); const ByteOrder byte_order = target_arch.GetByteOrder(); + std::vector<page_object> pages_to_copy; if (range_error.Success()) { while (range_info.GetRange().GetRangeBase() != LLDB_INVALID_ADDRESS) { - const addr_t addr = range_info.GetRange().GetRangeBase(); - const addr_t size = range_info.GetRange().GetByteSize(); - - if (size == 0) - break; - // Calculate correct protections uint32_t prot = 0; if (range_info.GetReadable() == MemoryRegionInfo::eYes) @@ -6208,40 +6211,80 @@ if (range_info.GetExecutable() == MemoryRegionInfo::eYes) prot |= VM_PROT_EXECUTE; - if (prot != 0) { - uint32_t cmd_type = LC_SEGMENT_64; - uint32_t segment_size = sizeof(segment_command_64); - if (addr_byte_size == 4) { - cmd_type = LC_SEGMENT; - segment_size = sizeof(segment_command); + const addr_t addr = range_info.GetRange().GetRangeBase(); + const addr_t size = range_info.GetRange().GetByteSize(); + + if (size == 0) + break; + + int pagesize = range_info.GetPageSize(); + llvm::Optional<std::vector<addr_t>> dirty_page_list = + range_info.GetDirtyPageList(); + if (dirty_page_list.hasValue()) { + for (addr_t dirtypage : dirty_page_list.getValue()) { + page_object obj; + obj.addr = dirtypage; + obj.size = pagesize; + obj.prot = prot; + if (prot != 0) + pages_to_copy.push_back(obj); } - segment_command_64 segment = { - cmd_type, // uint32_t cmd; - segment_size, // uint32_t cmdsize; - {0}, // char segname[16]; - addr, // uint64_t vmaddr; // uint32_t for 32-bit Mach-O - size, // uint64_t vmsize; // uint32_t for 32-bit Mach-O - 0, // uint64_t fileoff; // uint32_t for 32-bit Mach-O - size, // uint64_t filesize; // uint32_t for 32-bit Mach-O - 
prot, // uint32_t maxprot; - prot, // uint32_t initprot; - 0, // uint32_t nsects; - 0}; // uint32_t flags; - segment_load_commands.push_back(segment); } else { - // No protections and a size of 1 used to be returned from old - // debugservers when we asked about a region that was past the - // last memory region and it indicates the end... - if (size == 1) - break; + page_object obj; + obj.addr = addr; + obj.size = size; + obj.prot = prot; + if (prot != 0) + pages_to_copy.push_back(obj); } - range_error = process_sp->GetMemoryRegionInfo( range_info.GetRange().GetRangeEnd(), range_info); if (range_error.Fail()) break; } + // Combine contiguous entries that have the same + // protections so we don't have an excess of + // load commands. + std::vector<page_object> combined_page_objects; + page_object last_obj; + last_obj.addr = LLDB_INVALID_ADDRESS; + for (page_object obj : pages_to_copy) { + if (last_obj.addr == LLDB_INVALID_ADDRESS) { + last_obj = obj; + continue; + } + if (last_obj.addr + last_obj.size == obj.addr && + last_obj.prot == obj.prot) { + last_obj.size += obj.size; + continue; + } + combined_page_objects.push_back(last_obj); + last_obj = obj; + } + + for (page_object obj : combined_page_objects) { + uint32_t cmd_type = LC_SEGMENT_64; + uint32_t segment_size = sizeof(segment_command_64); + if (addr_byte_size == 4) { + cmd_type = LC_SEGMENT; + segment_size = sizeof(segment_command); + } + segment_command_64 segment = { + cmd_type, // uint32_t cmd; + segment_size, // uint32_t cmdsize; + {0}, // char segname[16]; + obj.addr, // uint64_t vmaddr; // uint32_t for 32-bit Mach-O + obj.size, // uint64_t vmsize; // uint32_t for 32-bit Mach-O + 0, // uint64_t fileoff; // uint32_t for 32-bit Mach-O + obj.size, // uint64_t filesize; // uint32_t for 32-bit Mach-O + obj.prot, // uint32_t maxprot; + obj.prot, // uint32_t initprot; + 0, // uint32_t nsects; + 0}; // uint32_t flags; + segment_load_commands.push_back(segment); + } + StreamString buffer(Stream::eBinary, 
addr_byte_size, byte_order); mach_header_64 mach_header; @@ -6312,6 +6355,14 @@ mach_header.sizeofcmds += 8 + LC_THREAD_data.GetSize(); } + // LC_NOTE "all image infos" + mach_header.ncmds++; + mach_header.sizeofcmds += sizeof(struct note_command); + + // LC_NOTE "executing uuids" + mach_header.ncmds++; + mach_header.sizeofcmds += sizeof(struct note_command); + // Write the mach header buffer.PutHex32(mach_header.magic); buffer.PutHex32(mach_header.cputype); @@ -6327,10 +6378,59 @@ // Skip the mach header and all load commands and align to the next // 0x1000 byte boundary addr_t file_offset = buffer.GetSize() + mach_header.sizeofcmds; - if (file_offset & 0x00000fff) { - file_offset += 0x00001000ull; - file_offset &= (~0x00001000ull + 1); - } + + file_offset = llvm::alignTo(file_offset, 16); + + // Create the "all image infos" LC_NOTE payload + StreamString all_image_infos_payload(Stream::eBinary, addr_byte_size, + byte_order); + offset_t all_image_infos_payload_start = file_offset; + file_offset = CreateAllImageInfosPayload(process_sp, file_offset, + all_image_infos_payload); + + // Add the "all image infos" LC_NOTE load command + struct note_command all_image_info_note = { + LC_NOTE, /* uint32_t cmd */ + sizeof(struct note_command), /* uint32_t cmdsize */ + "all image infos", /* char data_owner[16] */ + all_image_infos_payload_start, /* uint64_t offset */ + file_offset - all_image_infos_payload_start /* uint64_t size */ + }; + buffer.PutHex32(all_image_info_note.cmd); + buffer.PutHex32(all_image_info_note.cmdsize); + buffer.PutRawBytes(all_image_info_note.data_owner, + sizeof(all_image_info_note.data_owner)); + buffer.PutHex64(all_image_info_note.offset); + buffer.PutHex64(all_image_info_note.size); + + // Align to a 16-byte boundary for the next payload, + // "executing uuids" + file_offset = llvm::alignTo(file_offset, 16); + + // Create the "executing uuids" LC_NOTE payload + StreamString executing_uuids_payload(Stream::eBinary, addr_byte_size, + byte_order); + 
offset_t executing_uuids_payload_start = file_offset; + file_offset = CreateExecutingUUIDsPayload(process_sp, file_offset, + executing_uuids_payload); + + // Add the "executing uuids" LC_NOTE load command + struct note_command executing_uuids_note = { + LC_NOTE, /* uint32_t cmd */ + sizeof(struct note_command), /* uint32_t cmdsize */ + "executing uuids", /* char data_owner[16] */ + executing_uuids_payload_start, /* uint64_t offset */ + file_offset - executing_uuids_payload_start /* uint64_t size */ + }; + buffer.PutHex32(executing_uuids_note.cmd); + buffer.PutHex32(executing_uuids_note.cmdsize); + buffer.PutRawBytes(executing_uuids_note.data_owner, + sizeof(executing_uuids_note.data_owner)); + buffer.PutHex64(executing_uuids_note.offset); + buffer.PutHex64(executing_uuids_note.size); + + // Align to 4096-byte page boundary for the LC_SEGMENTs. + file_offset = llvm::alignTo(file_offset, 4096); for (auto &segment : segment_load_commands) { segment.fileoff = file_offset; @@ -6347,14 +6447,6 @@ // Write out all of the segment load commands for (const auto &segment : segment_load_commands) { - printf("0x%8.8x 0x%8.8x [0x%16.16" PRIx64 " - 0x%16.16" PRIx64 - ") [0x%16.16" PRIx64 " 0x%16.16" PRIx64 - ") 0x%8.8x 0x%8.8x 0x%8.8x 0x%8.8x]\n", - segment.cmd, segment.cmdsize, segment.vmaddr, - segment.vmaddr + segment.vmsize, segment.fileoff, - segment.filesize, segment.maxprot, segment.initprot, - segment.nsects, segment.flags); - buffer.PutHex32(segment.cmd); buffer.PutHex32(segment.cmdsize); buffer.PutRawBytes(segment.segname, sizeof(segment.segname)); @@ -6389,6 +6481,33 @@ error = core_file.get()->Write(buffer.GetString().data(), bytes_written); if (error.Success()) { + + if (core_file.get()->SeekFromStart(all_image_info_note.offset) == + -1) { + error.SetErrorStringWithFormat( + "Unable to seek to corefile pos to write all image infos"); + return false; + } + + bytes_written = all_image_infos_payload.GetString().size(); + error = core_file.get()->Write( + 
all_image_infos_payload.GetString().data(), bytes_written); + if (!error.Success()) + return false; + + if (core_file.get()->SeekFromStart(executing_uuids_note.offset) == + -1) { + error.SetErrorStringWithFormat( + "Unable to seek to corefile pos to write executing uuids"); + return false; + } + + bytes_written = executing_uuids_payload.GetString().size(); + error = core_file.get()->Write( + executing_uuids_payload.GetString().data(), bytes_written); + if (!error.Success()) + return false; + // Now write the file data for all memory segments in the process for (const auto &segment : segment_load_commands) { if (core_file.get()->SeekFromStart(segment.fileoff) == -1) { @@ -6442,3 +6561,363 @@ } return false; } + +struct all_image_infos_header { + uint32_t version; // currently 1 + uint32_t imgcount; // number of binary images + uint64_t entries_fileoff; // file offset in the corefile of where the array of + // struct entry's begin. + uint32_t entries_size; // size of 'struct entry'. + uint32_t unused; +}; + +struct image_entry { + uint64_t filepath_offset; // offset in corefile to c-string of the file path, + // UINT64_MAX if unavailable. + uuid_t uuid; // uint8_t[16]. should be set to all zeroes if + // uuid is unknown. + uint64_t load_address; // UINT64_MAX if unknown. + uint64_t seg_addrs_offset; // offset to the array of struct segment_vmaddr's. + uint32_t segment_count; // The number of segments for this binary. 
+ uint32_t unused; + + image_entry() { + filepath_offset = UINT64_MAX; + memset(&uuid, 0, sizeof(uuid_t)); + segment_count = 0; + load_address = UINT64_MAX; + seg_addrs_offset = UINT64_MAX; + unused = 0; + } + image_entry(const image_entry &rhs) { + filepath_offset = rhs.filepath_offset; + memcpy(&uuid, &rhs.uuid, sizeof(uuid_t)); + segment_count = rhs.segment_count; + seg_addrs_offset = rhs.seg_addrs_offset; + load_address = rhs.load_address; + unused = rhs.unused; + } +}; + +struct segment_vmaddr { + char segname[16]; + uint64_t vmaddr; + uint64_t unused; + + segment_vmaddr() { + memset(&segname, 0, 16); + vmaddr = UINT64_MAX; + unused = 0; + } + segment_vmaddr(const segment_vmaddr &rhs) { + memcpy(&segname, &rhs.segname, 16); + vmaddr = rhs.vmaddr; + unused = rhs.unused; + } +}; + +// Write the payload for the "all image infos" LC_NOTE into +// the supplied all_image_infos_payload, assuming that this +// will be written into the corefile starting at +// initial_file_offset. +// +// The placement of this payload is a little tricky. We're +// laying this out as +// +// 1. header (struct all_image_info_header) +// 2. Array of fixed-size (struct image_entry)'s, one +// per binary image present in the process. +// 3. Arrays of (struct segment_vmaddr)'s, a varying number +// for each binary image. +// 4. Variable length c-strings of binary image filepaths, +// one per binary. +// +// To compute where everything will be laid out in the +// payload, we need to iterate over the images and calculate +// how many segment_vmaddr structures each image will need, +// and how long each image's filepath c-string is. There +// are some multiple passes over the image list while calculating +// everything. 
+ +offset_t ObjectFileMachO::CreateAllImageInfosPayload( + const lldb::ProcessSP &process_sp, offset_t initial_file_offset, + StreamString &all_image_infos_payload) { + Target &target = process_sp->GetTarget(); + const ModuleList &modules = target.GetImages(); + size_t modules_count = modules.GetSize(); + + struct all_image_infos_header infos; + infos.version = 1; + infos.imgcount = modules_count; + infos.entries_size = sizeof(image_entry); + infos.entries_fileoff = initial_file_offset + sizeof(all_image_infos_header); + infos.unused = 0; + + all_image_infos_payload.PutHex32(infos.version); + all_image_infos_payload.PutHex32(infos.imgcount); + all_image_infos_payload.PutHex64(infos.entries_fileoff); + all_image_infos_payload.PutHex32(infos.entries_size); + all_image_infos_payload.PutHex32(infos.unused); + + // First create the structures for all of the segment name+vmaddr vectors + // for each module, so we will know the size of them as we add the + // module entries. + std::vector<std::vector<segment_vmaddr>> modules_segment_vmaddrs; + for (size_t i = 0; i < modules_count; i++) { + ModuleSP module = modules.GetModuleAtIndex(i); + + SectionList *sections = module->GetSectionList(); + size_t sections_count = sections->GetSize(); + std::vector<segment_vmaddr> segment_vmaddrs; + for (size_t j = 0; j < sections_count; j++) { + SectionSP section = sections->GetSectionAtIndex(j); + if (!section->GetParent().get()) { + addr_t vmaddr = section->GetLoadBaseAddress(&target); + if (vmaddr == LLDB_INVALID_ADDRESS) + continue; + ConstString name = section->GetName(); + segment_vmaddr seg_vmaddr; + strncpy(seg_vmaddr.segname, name.AsCString(), + sizeof(seg_vmaddr.segname)); + seg_vmaddr.vmaddr = vmaddr; + seg_vmaddr.unused = 0; + segment_vmaddrs.push_back(seg_vmaddr); + } + } + modules_segment_vmaddrs.push_back(segment_vmaddrs); + } + + offset_t size_of_vmaddr_structs = 0; + for (size_t i = 0; i < modules_segment_vmaddrs.size(); i++) { + size_of_vmaddr_structs += + 
modules_segment_vmaddrs[i].size() * sizeof(segment_vmaddr); + } + + offset_t size_of_filepath_cstrings = 0; + for (size_t i = 0; i < modules_count; i++) { + ModuleSP module_sp = modules.GetModuleAtIndex(i); + size_of_filepath_cstrings += module_sp->GetFileSpec().GetPath().size() + 1; + } + + // Calculate the file offsets of our "all image infos" payload in the + // corefile. initial_file_offset the original value passed in to this method. + + offset_t start_of_entries = + initial_file_offset + sizeof(all_image_infos_header); + offset_t start_of_seg_vmaddrs = + start_of_entries + sizeof(image_entry) * modules_count; + offset_t start_of_filenames = start_of_seg_vmaddrs + size_of_vmaddr_structs; + + offset_t final_file_offset = start_of_filenames + size_of_filepath_cstrings; + + // Now write the one-per-module 'struct image_entry' into the + // StringStream; keep track of where the struct segment_vmaddr + // entries for each module will end up in the corefile. + + offset_t current_string_offset = start_of_filenames; + offset_t current_segaddrs_offset = start_of_seg_vmaddrs; + std::vector<struct image_entry> image_entries; + for (size_t i = 0; i < modules_count; i++) { + ModuleSP module_sp = modules.GetModuleAtIndex(i); + + struct image_entry ent; + memcpy(&ent.uuid, module_sp->GetUUID().GetBytes().data(), sizeof(ent.uuid)); + if (modules_segment_vmaddrs[i].size() > 0) { + ent.segment_count = modules_segment_vmaddrs[i].size(); + ent.seg_addrs_offset = current_segaddrs_offset; + } + ent.filepath_offset = current_string_offset; + ObjectFile *objfile = module_sp->GetObjectFile(); + if (objfile) { + Address base_addr(objfile->GetBaseAddress()); + if (base_addr.IsValid()) { + ent.load_address = base_addr.GetLoadAddress(&target); + } + } + + all_image_infos_payload.PutHex64(ent.filepath_offset); + all_image_infos_payload.PutRawBytes(ent.uuid, sizeof(ent.uuid)); + all_image_infos_payload.PutHex64(ent.load_address); + all_image_infos_payload.PutHex64(ent.seg_addrs_offset); + 
all_image_infos_payload.PutHex32(ent.segment_count); + all_image_infos_payload.PutHex32(ent.unused); + + current_segaddrs_offset += ent.segment_count * sizeof(segment_vmaddr); + current_string_offset += module_sp->GetFileSpec().GetPath().size() + 1; + } + + // Now write the struct segment_vmaddr entries into the StringStream. + + for (size_t i = 0; i < modules_segment_vmaddrs.size(); i++) { + if (modules_segment_vmaddrs[i].size() == 0) + continue; + for (struct segment_vmaddr segvm : modules_segment_vmaddrs[i]) { + all_image_infos_payload.PutRawBytes(segvm.segname, sizeof(segvm.segname)); + all_image_infos_payload.PutHex64(segvm.vmaddr); + all_image_infos_payload.PutHex64(segvm.unused); + } + } + + for (size_t i = 0; i < modules_count; i++) { + ModuleSP module_sp = modules.GetModuleAtIndex(i); + std::string filepath = module_sp->GetFileSpec().GetPath(); + all_image_infos_payload.PutRawBytes(filepath.data(), filepath.size() + 1); + } + + return final_file_offset; +} + +ObjectFile::MachOCorefileAllImageInfos +ObjectFileMachO::GetCorefileAllImageInfos() { + ObjectFile::MachOCorefileAllImageInfos image_infos; + + // Look for an "all image infos" LC_NOTE. + lldb::offset_t offset = MachHeaderSizeFromMagic(m_header.magic); + for (uint32_t i = 0; i < m_header.ncmds; ++i) { + const uint32_t cmd_offset = offset; + load_command lc; + if (m_data.GetU32(&offset, &lc.cmd, 2) == nullptr) + break; + if (lc.cmd == LC_NOTE) { + char data_owner[17]; + m_data.CopyData(offset, 16, data_owner); + data_owner[16] = '\0'; + offset += 16; + uint64_t fileoff = m_data.GetU64_unchecked(&offset); + offset += 4; /* size unused */ + + if (strcmp("all image infos", data_owner) == 0) { + offset = fileoff; + // Read the struct all_image_infos_header. 
+ uint32_t version = m_data.GetU32(&offset); + if (version != 1) { + return image_infos; + } + uint32_t imgcount = m_data.GetU32(&offset); + uint64_t entries_fileoff = m_data.GetU64(&offset); + offset += 4; // uint32_t entries_size; + offset += 4; // uint32_t unused; + + offset = entries_fileoff; + for (uint32_t i = 0; i < imgcount; i++) { + // Read the struct image_entry. + offset_t filepath_offset = m_data.GetU64(&offset); + uuid_t uuid; + memcpy(&uuid, m_data.GetData(&offset, sizeof(uuid_t)), + sizeof(uuid_t)); + uint64_t load_address = m_data.GetU64(&offset); + offset_t seg_addrs_offset = m_data.GetU64(&offset); + uint32_t segment_count = m_data.GetU32(&offset); + offset += 4; // unused + + MachOCorefileImageEntry image_entry; + image_entry.filename = (const char *)m_data.GetCStr(&filepath_offset); + image_entry.uuid = UUID::fromData(uuid, sizeof(uuid_t)); + image_entry.load_address = load_address; + + offset_t seg_vmaddrs_offset = seg_addrs_offset; + for (uint32_t j = 0; j < segment_count; j++) { + char segname[17]; + m_data.CopyData(seg_vmaddrs_offset, 16, segname); + segname[16] = '\0'; + seg_vmaddrs_offset += 16; + uint64_t vmaddr = m_data.GetU64(&seg_vmaddrs_offset); + seg_vmaddrs_offset += 8; /* unused */ + + std::tuple<ConstString, addr_t> new_seg{ConstString(segname), + vmaddr}; + image_entry.segment_load_addresses.push_back(new_seg); + } + image_infos.all_image_infos.push_back(image_entry); + } + } + } + offset = cmd_offset + lc.cmdsize; + } + + return image_infos; +} + +offset_t ObjectFileMachO::CreateExecutingUUIDsPayload( + const lldb::ProcessSP &process_sp, offset_t file_offset, + StreamString &executing_uuids_payload) { + + std::set<std::string> uuids; + ThreadList &thread_list(process_sp->GetThreadList()); + for (uint32_t i = 0; i < thread_list.GetSize(); i++) { + ThreadSP thread_sp = thread_list.GetThreadAtIndex(i); + uint32_t stack_frame_count = thread_sp->GetStackFrameCount(); + for (uint32_t j = 0; j < stack_frame_count; j++) { + StackFrameSP 
stack_frame_sp = thread_sp->GetStackFrameAtIndex(j); + Address pc = stack_frame_sp->GetFrameCodeAddress(); + ModuleSP module_sp = pc.GetModule(); + if (module_sp) { + UUID uuid = module_sp->GetUUID(); + if (uuid.IsValid()) { + uuids.insert(uuid.GetAsString()); + } + } + } + } + + // struct executing_uuids { + // uint32_t version; // version is 1 + // uint32_t uuid_count; // number of UUIDs + // uuid_t uuid[]; // array of uuid_t', uuid_count of them + // // uuid_t is 16 bytes in a Mach-O file. + // }; + + executing_uuids_payload.PutHex32(1); + executing_uuids_payload.PutHex32(uuids.size()); + + for (std::string uuidstr : uuids) { + uuid_t uuid; + uuid_parse(uuidstr.c_str(), uuid); + executing_uuids_payload.PutRawBytes(uuid, sizeof(uuid)); + } + + file_offset += executing_uuids_payload.GetString().size(); + + return file_offset; +} + +std::vector<UUID> ObjectFileMachO::GetCorefileExecutingUUIDs() { + std::vector<UUID> uuids; + + // Look for an "executing uuids" LC_NOTE. + lldb::offset_t offset = MachHeaderSizeFromMagic(m_header.magic); + for (uint32_t i = 0; i < m_header.ncmds; ++i) { + const uint32_t cmd_offset = offset; + load_command lc; + if (m_data.GetU32(&offset, &lc.cmd, 2) == nullptr) + break; + if (lc.cmd == LC_NOTE) { + char data_owner[17]; + m_data.CopyData(offset, 16, data_owner); + data_owner[16] = '\0'; + offset += 16; + uint64_t fileoff = m_data.GetU64_unchecked(&offset); + offset += 4; /* size unused */ + + if (strcmp("executing uuids", data_owner) == 0) { + offset = fileoff; + // Read the struct executing_uuids. + uint32_t version = m_data.GetU32(&offset); + if (version != 1) { + return uuids; + } + uint32_t uuidcount = m_data.GetU32(&offset); + for (uint32_t i = 0; i < uuidcount; i++) { + // Read the struct image_entry. 
+ uuid_t uuid; + memcpy(&uuid, m_data.GetData(&offset, sizeof(uuid_t)), + sizeof(uuid_t)); + uuids.push_back(UUID::fromData(uuid, sizeof(uuid))); + } + } + } + offset = cmd_offset + lc.cmdsize; + } + + return uuids; +} Index: lldb/include/lldb/Target/MemoryRegionInfo.h =================================================================== --- lldb/include/lldb/Target/MemoryRegionInfo.h +++ lldb/include/lldb/Target/MemoryRegionInfo.h @@ -10,8 +10,11 @@ #ifndef LLDB_TARGET_MEMORYREGIONINFO_H #define LLDB_TARGET_MEMORYREGIONINFO_H +#include <vector> + #include "lldb/Utility/ConstString.h" #include "lldb/Utility/RangeMap.h" +#include "llvm/ADT/Optional.h" #include "llvm/Support/FormatProviders.h" namespace lldb_private { @@ -33,7 +36,14 @@ void Clear() { m_range.Clear(); - m_read = m_write = m_execute = eDontKnow; + m_read = m_write = m_execute = m_mapped = m_flash = eDontKnow; + m_name.Clear(); + m_blocksize = 0; + m_pagesize = 0; + if (m_dirty_pages.hasValue()) { + m_dirty_pages.getValue().clear(); + } + m_dirty_pages.reset(); } const RangeType &GetRange() const { return m_range; } @@ -96,6 +106,27 @@ bool operator!=(const MemoryRegionInfo &rhs) const { return !(*this == rhs); } + /// Get the target system's VM page size in bytes. + /// \return + /// 0 is returned if this information is unavailable. + int GetPageSize() { return m_pagesize; } + + /// Get a vector of target VM pages that are dirty -- that have been + /// modified -- within this memory region. This is an Optional return + /// value; it will only be available if the remote stub was able to + /// detail this. 
+ llvm::Optional<std::vector<lldb::addr_t>> GetDirtyPageList() { + return m_dirty_pages; + } + + void SetPageSize(int pagesize) { m_pagesize = pagesize; } + + void SetDirtyPageList(std::vector<lldb::addr_t> pagelist) { + if (m_dirty_pages.hasValue()) + m_dirty_pages.getValue().clear(); + m_dirty_pages = pagelist; + } + protected: RangeType m_range; OptionalBool m_read = eDontKnow; @@ -105,6 +136,8 @@ ConstString m_name; OptionalBool m_flash = eDontKnow; lldb::offset_t m_blocksize = 0; + int m_pagesize = 0; + llvm::Optional<std::vector<lldb::addr_t>> m_dirty_pages; }; inline bool operator<(const MemoryRegionInfo &lhs, Index: lldb/include/lldb/Symbol/ObjectFile.h =================================================================== --- lldb/include/lldb/Symbol/ObjectFile.h +++ lldb/include/lldb/Symbol/ObjectFile.h @@ -666,6 +666,44 @@ /// Creates a plugin-specific call frame info virtual std::unique_ptr<CallFrameInfo> CreateCallFrameInfo(); + /// A corefile may include metadata about all of the binaries that were + /// present in the process when the corefile was taken. This is only + /// implemented for Mach-O files for now; we'll generalize it when we + /// have other systems that can include the same. + struct MachOCorefileImageEntry { + std::string filename; + UUID uuid; + lldb::addr_t load_address = LLDB_INVALID_ADDRESS; + std::vector<std::tuple<ConstString, lldb::addr_t>> segment_load_addresses; + }; + + struct MachOCorefileAllImageInfos { + std::vector<MachOCorefileImageEntry> all_image_infos; + bool IsValid() { return all_image_infos.size() > 0; } + }; + + /// Get the list of binary images that were present in the process + /// when the corefile was produced. + /// \return + /// The MachOCorefileAllImageInfos object returned will have + /// IsValid() == false if the information is unavailable. 
+ virtual MachOCorefileAllImageInfos GetCorefileAllImageInfos() { + return MachOCorefileAllImageInfos(); + } + + /// Get a list of UUIDs of binaries that were currently executing + /// at the time when the corefile was taken. A process may have + /// hundreds of binary images loaded, but only a dozen actually + /// executing code on any of the stacks. This allows for lldb to + /// do some more expensive searches for these binaries. + /// \return + /// A vector of UUIDs of binaries that were currently executing + /// is returned; an empty vector if this information was + /// unavailable. + virtual std::vector<UUID> GetCorefileExecutingUUIDs() { + return std::vector<UUID>(); + } + protected: // Member variables. FileSpec m_file; Index: lldb/docs/lldb-gdb-remote.txt =================================================================== --- lldb/docs/lldb-gdb-remote.txt +++ lldb/docs/lldb-gdb-remote.txt @@ -790,10 +790,14 @@ osmajor: optional, specifies the major version number of the OS (e.g. for macOS 10.12.2, it would be 10) osminor: optional, specifies the minor version number of the OS (e.g. for macOS 10.12.2, it would be 12) ospatch: optional, specifies the patch level number of the OS (e.g. for macOS 10.12.2, it would be 2) +vm-page-size: optional, specifies the target system VM page size, base 10. + Needed for the "dirty-pages:" list in the qMemoryRegionInfo + packet, where a list of dirty pages is sent from the remote + stub. This page size tells lldb how large each dirty page is. addressing_bits: optional, specifies how many bits in addresses are significant for addressing, base 10. If bits 38..0 in a 64-bit pointer are significant for addressing, - then the value is 39. This is needed on e.g. Aarch64 + then the value is 39. This is needed on e.g. AArch64 v8.3 ABIs that use pointer authentication, so lldb knows which bits to clear/set to get the actual addresses. 
@@ -1090,6 +1094,16 @@ // a hex encoded string value that // contains an error string + dirty-pages:[<hexaddr>][,<hexaddr>]; // A list of memory pages within this + // region that are "dirty" -- they have been modified. + // Page addresses are in base16. The size of a page can + // be found from the qHostInfo's vm-page-size key-value. + // + // If the stub supports identifying dirty pages within a + // memory region, this key should always be present. This + // key name with no pages listed ("dirty-pages:;") indicates + // no dirty pages in this memory region. + If the address requested is not in a mapped region (e.g. we've jumped through a NULL pointer and are at 0x0) currently lldb expects to get back the size of the unmapped region -- that is, the distance to the next valid region.
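To illustrate how a client might consume the "dirty-pages:" value together with the "vm-page-size" value from qHostInfo, here is a minimal sketch. It is not code from this patch: ParseDirtyPages and CoalesceDirtyPages are hypothetical helper names, and the sketch assumes the key name and the trailing ';' have already been stripped from the value by the packet parser.

```cpp
#include <algorithm>
#include <cassert>
#include <cstdint>
#include <string>
#include <utility>
#include <vector>

// Hypothetical helper (not in the patch): parse the value of a
// "dirty-pages:" key, e.g. "100000000,100004000", into page addresses.
// Addresses are base 16 and comma separated; an empty value means the
// region has no dirty pages. std::stoull throws on malformed input.
std::vector<uint64_t> ParseDirtyPages(const std::string &value) {
  std::vector<uint64_t> pages;
  size_t pos = 0;
  while (pos < value.size()) {
    size_t comma = value.find(',', pos);
    if (comma == std::string::npos)
      comma = value.size();
    if (comma > pos)
      pages.push_back(std::stoull(value.substr(pos, comma - pos), nullptr, 16));
    pos = comma + 1;
  }
  return pages;
}

// Hypothetical helper (not in the patch): merge adjacent dirty pages into
// [start, end) address ranges, using the page size the stub advertised.
std::vector<std::pair<uint64_t, uint64_t>>
CoalesceDirtyPages(std::vector<uint64_t> pages, uint64_t page_size) {
  assert(page_size != 0 && "page size must be known to size dirty ranges");
  std::sort(pages.begin(), pages.end());
  std::vector<std::pair<uint64_t, uint64_t>> ranges;
  for (uint64_t page : pages) {
    if (!ranges.empty() && page < ranges.back().second)
      continue; // duplicate page address, already covered
    if (!ranges.empty() && page == ranges.back().second)
      ranges.back().second = page + page_size; // extend the current range
    else
      ranges.emplace_back(page, page + page_size); // start a new range
  }
  return ranges;
}
```

Coalescing is only one possible design choice; a corefile writer could equally emit one segment per dirty page, at the cost of more load commands.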
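The placement of the "all image infos" payload (header, then the image_entry array, then the segment_vmaddr arrays, then the filepath c-strings) can be sketched as plain offset arithmetic. This is an illustration only, not the patch's code: ImageDesc, PayloadLayout and ComputeLayout are hypothetical names, and the record sizes are the byte counts implied by the struct definitions in the diff (24, 48 and 32 bytes).

```cpp
#include <cassert>
#include <cstdint>
#include <string>
#include <vector>

// Serialized sizes in bytes, derived from the structs in the patch.
constexpr uint64_t kHeaderSize = 24;        // all_image_infos_header
constexpr uint64_t kImageEntrySize = 48;    // image_entry
constexpr uint64_t kSegmentVMAddrSize = 32; // segment_vmaddr

// Hypothetical input: what the layout pass needs to know about one image.
struct ImageDesc {
  std::string filepath;
  uint32_t segment_count;
};

// Hypothetical result: corefile offsets of each part of the payload.
struct PayloadLayout {
  uint64_t entries_start;
  uint64_t seg_vmaddrs_start;
  uint64_t filepaths_start;
  uint64_t end;
  std::vector<uint64_t> seg_addrs_offset; // per image; UINT64_MAX if none
  std::vector<uint64_t> filepath_offset;  // per image
};

PayloadLayout ComputeLayout(uint64_t initial_file_offset,
                            const std::vector<ImageDesc> &images) {
  PayloadLayout layout;
  layout.entries_start = initial_file_offset + kHeaderSize;
  layout.seg_vmaddrs_start =
      layout.entries_start + kImageEntrySize * images.size();

  // First pass: total size of the two variable-length areas.
  uint64_t seg_bytes = 0, path_bytes = 0;
  for (const ImageDesc &img : images) {
    seg_bytes += img.segment_count * kSegmentVMAddrSize;
    path_bytes += img.filepath.size() + 1; // include the NUL terminator
  }
  layout.filepaths_start = layout.seg_vmaddrs_start + seg_bytes;
  layout.end = layout.filepaths_start + path_bytes;

  // Second pass: per-image offsets into those areas.
  uint64_t seg_cursor = layout.seg_vmaddrs_start;
  uint64_t path_cursor = layout.filepaths_start;
  for (const ImageDesc &img : images) {
    layout.seg_addrs_offset.push_back(img.segment_count ? seg_cursor
                                                        : UINT64_MAX);
    layout.filepath_offset.push_back(path_cursor);
    seg_cursor += img.segment_count * kSegmentVMAddrSize;
    path_cursor += img.filepath.size() + 1;
  }
  assert(seg_cursor == layout.filepaths_start && path_cursor == layout.end);
  return layout;
}
```

Because every offset is an absolute corefile offset, a reader can seek straight to an image's segment array or filepath without walking the earlier entries.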