Author: Jason Molenda
Date: 2025-04-21T10:58:58-07:00
New Revision: 1ab9e53e4910557f2a4c0f00c7a7f8b5ac493b48

URL: 
https://github.com/llvm/llvm-project/commit/1ab9e53e4910557f2a4c0f00c7a7f8b5ac493b48
DIFF: 
https://github.com/llvm/llvm-project/commit/1ab9e53e4910557f2a4c0f00c7a7f8b5ac493b48.diff

LOG: [lldb][Mach-O corefiles] Don't init Target arch to corefile (#136065)

This patch is making three changes, when loading a Mach-O corefile:

1. At the start of `DoLoadCore`, if a binary was provided in addition to
the corefile, initialize the Target's ArchSpec.

2. Before ProcessMachCore does its "exhaustive search" fallback, looking
through the corefile contents for a userland dyld or mach kernel, we
must make sure the Target has an ArchSpec, or methods that check the
address word size, or initialize a DataExtractor based on the Target
arch will not succeed.

3. Add logging when setting the Target's arch listing exactly what that
setting was based on -- the corefile itself, or the main binary.

Jonas landed a change last August (started with a patch from me) which
removed the Target ArchSpec initialization at the start of DoLoadCore,
in a scenario where the corefile had arch armv7 and the main binary had
arch armv7em (Cortex-M), and there was python code in the main binary's
dSYM which sets the operating system threads provider based on the
Target arch. It did different things for armv7 or armv7em, and so it
would fail.

Jonas' patch removed any ArchSpec setting at the start of DoLoadCore, so
we wouldn't have an incorrect arch value, but that broke the exhaustive
search for kernel binaries, because we didn't have an address word size
or endianness.

This patch should navigate the needs of both use cases.

I spent a good bit of time trying to construct a test to capture all of
these requirements -- but it turns out to be a good bit difficult,
encompassing both a genuine kernel corefiles and a microcontroller
firmware corefiles.

rdar://146821929

Added: 
    

Modified: 
    lldb/source/Plugins/Process/mach-core/ProcessMachCore.cpp

Removed: 
    


################################################################################
diff  --git a/lldb/source/Plugins/Process/mach-core/ProcessMachCore.cpp 
b/lldb/source/Plugins/Process/mach-core/ProcessMachCore.cpp
index 281f3a0db8f69..39584bad2d745 100644
--- a/lldb/source/Plugins/Process/mach-core/ProcessMachCore.cpp
+++ b/lldb/source/Plugins/Process/mach-core/ProcessMachCore.cpp
@@ -426,6 +426,22 @@ void ProcessMachCore::LoadBinariesViaExhaustiveSearch() {
   std::vector<addr_t> dylds_found;
   std::vector<addr_t> kernels_found;
 
+  // To do an exhaustive search, we'll need to create data extractors
+  // to get correctly sized/endianness fields.  If we had a main binary
+  // already, we would have set the Target to that - so here we'll use
+  // the corefile's cputype/cpusubtype as the best guess.
+  if (!GetTarget().GetArchitecture().IsValid()) {
+    // The corefile's architecture is our best starting point.
+    ArchSpec arch(m_core_module_sp->GetArchitecture());
+    if (arch.IsValid()) {
+      LLDB_LOGF(log,
+                "ProcessMachCore::%s: Setting target ArchSpec based on "
+                "corefile mach-o cputype/cpusubtype",
+                __FUNCTION__);
+      GetTarget().SetArchitecture(arch);
+    }
+  }
+
   const size_t num_core_aranges = m_core_aranges.GetSize();
   for (size_t i = 0; i < num_core_aranges; ++i) {
     const VMRangeToFileOffset::Entry *entry = 
m_core_aranges.GetEntryAtIndex(i);
@@ -569,6 +585,7 @@ Status ProcessMachCore::DoLoadCore() {
     error = Status::FromErrorString("invalid core module");
     return error;
   }
+  Log *log(GetLog(LLDBLog::DynamicLoader | LLDBLog::Target));
 
   ObjectFile *core_objfile = m_core_module_sp->GetObjectFile();
   if (core_objfile == nullptr) {
@@ -578,20 +595,47 @@ Status ProcessMachCore::DoLoadCore() {
 
   SetCanJIT(false);
 
+  // If we have an executable binary in the Target already,
+  // use that to set the Target's ArchSpec.
+  //
+  // Don't initialize the ArchSpec based on the corefile's cputype/cpusubtype
+  // here, the corefile creator may not know the correct subtype of the code
+  // that is executing, initialize the Target to that, and if the
+  // main binary has Python code which initializes based on the Target arch,
+  // get the wrong subtype value.
+  ModuleSP exe_module_sp = GetTarget().GetExecutableModule();
+  if (exe_module_sp && exe_module_sp->GetArchitecture().IsValid()) {
+    LLDB_LOGF(log,
+              "ProcessMachCore::%s: Was given binary + corefile, setting "
+              "target ArchSpec to binary to start",
+              __FUNCTION__);
+    GetTarget().SetArchitecture(exe_module_sp->GetArchitecture());
+  }
+
   CreateMemoryRegions();
 
   LoadBinariesAndSetDYLD();
 
   CleanupMemoryRegionPermissions();
 
-  ModuleSP exe_module_sp = GetTarget().GetExecutableModule();
+  exe_module_sp = GetTarget().GetExecutableModule();
   if (exe_module_sp && exe_module_sp->GetArchitecture().IsValid()) {
+    LLDB_LOGF(log,
+              "ProcessMachCore::%s: have executable binary in the Target "
+              "after metadata/scan.  Setting Target's ArchSpec based on "
+              "that.",
+              __FUNCTION__);
     GetTarget().SetArchitecture(exe_module_sp->GetArchitecture());
   } else {
     // The corefile's architecture is our best starting point.
     ArchSpec arch(m_core_module_sp->GetArchitecture());
-    if (arch.IsValid())
+    if (arch.IsValid()) {
+      LLDB_LOGF(log,
+                "ProcessMachCore::%s: Setting target ArchSpec based on "
+                "corefile mach-o cputype/cpusubtype",
+                __FUNCTION__);
       GetTarget().SetArchitecture(arch);
+    }
   }
 
   AddressableBits addressable_bits = core_objfile->GetAddressableBits();


        
_______________________________________________
lldb-commits mailing list
lldb-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/lldb-commits

Reply via email to