[Lldb-commits] [PATCH] D106584: [lldb] Improve checking of file cache read eligibility for mach-O

Jason Molenda via Phabricator via lldb-commits Sat, 24 Jul 2021 01:24:01 -0700

jasonmolenda added a comment.

Sorry for the delay in replying, I needed to think & look at this a bit.

In D106584#2900239 <https://reviews.llvm.org/D106584#2900239>, @augusto2112 
wrote:

> Hi Jason,
>
> I could place this check on the caller, but I think here is the right place 
> for it, maybe I can convince you.
>
>> the perf impact of forcing all reads to be out of memory would be 
>> dramatically bad for a remote system debug session, this can't be done.
>
> On macOS, this would only affect images that have the LC_SEGMENT_SPLIT_INFO 
> load command.

I tried expanding some shared caches and played around a bit to make sure I 
understood what we're dealing with.  You know all of this, but just to make 
sure everyone is following along, we have three possible states:

1. Binary on disk, before it gets included in a shared cache.  Will have 
LC_SEGMENT_SPLIT_INFO.
2. Binary embedded in the shared cache binary blog (either on disk or in memory)
3. Binary on disk, extracted from the shared cache.  LC_SEGMENT_SPLIT_INFO is 
absent, cannot be reconstructed from the shared cache binary blob.

With simulator process debugging, the debugger can find the pre-shared-cache 
distinct binaries on the local system, with the LC_SEGMENT_SPLIT_INFOs.

With iOS debugging, the debugger will have post-shared-cache distinct binaries 
on the local system, without LC_SEGMENT_SPLIT_INFOs.

I'm not sure what watch debugging looks like -- instead of copying the shared 
cache off of a watch and expanding it, Xcode downloads a disk image from Apple 
with the shared cache contents.  I don't know if that disk image has 
pre-shared-cache or post-shared-cache binaries; it's likely that 
pre-shared-cache binaries are included in the disk image.

In your bug, you've got a table which has relative offsets in it.  The offsets 
are to things in the binary image.  They may cross segment boundaries. The 
binary-on-disk will have TEXT, DATA, and LINKEDIT next to one another.  When 
it's incorporated into the shared cache, the segments will be separated - and 
these offsets will be updated in the process of creating the shared cache.  
(are the offsets in the pre-shared-cache binary even valid?  Not important)  
You have a bug where you're reading these offsets and getting them from the 
pre-shared-cache on-disk-binary, which are incorrect, and want to force reading 
them from the shared cache memory.

(post-shared-cache on-disk-binaries may have the same offsets as it was in the 
original shared cache -- but it's actually going to be a MORE subtle bug in 
that case because people plug in different iPhones and Xcode only extracts the 
shared cache for one of them if they're all running the same build.  But 
different model iphones will have different shared caches, so using the offsets 
from the post-shared-cache on-disk-binary-image from one device for another 
device will be wrong.)

The simulator version of this bug will be fixed, with less perf hit but still 
probably too much, by ignoring the pre-shared-cache on-disk-binary -- we'll 
have to read all of the symbol names out of memory via debugserver.  If this 
was over a USB cable, you'd be looking at a minute hit - but for a local debug 
session, it's probably closer to the order of 5-6 seconds.  I'm just guessing, 
I haven't measured it.  But it's a lot of packet traffic.

You'll still have the bug in the scenario outlined above, where I have two 
different iOS devices running the same iOS build.  Xcode will expand the shared 
cache for the first one I plug in, and lldb will use the same binary images 
(without LC_SEGMENT_SPLIT_INFOs) for both debug sessions.  And it's a 
questionmark about watch debugging - I suspect those will have 
LC_SEGMENT_SPLIT_INFOs and so app launch perf would take a gigantic hit to read 
symbol names all out of memory at debug time.

I think we need to concentrate on special casing the code that reads these 
offsets, to force it to use live memory every time.  I know in theory this 
problem could happen anywhere -- we could have these cross-segments offsets 
anywhere in the binary -- but in practice, we know it currently happens in one 
spot that the debugger looks at.

I admit I'd be a lot happier if we could always use the file cache for 
read-only Sections and know that the data in the file matches the data in the 
memory.  But if we only have one type of data where this isn't safe (so far?! 
:), I'm not willing to toss the optimization.  At the very least, we'd need to 
find ways to retain the same perf characteristics as we have right now, and I 
think that might be tricky.

>> I do like the debug-lldb check that the file cache read matches the live 
>> memory part. :)
>
> At least that we can keep then :) Do you think it's should be a crash, or 
> maybe a log? Wasn't sure what was best.

I think an assert is good, there's no scenario where this should get hit -- if 
it does, then we really have a bug in using the filecache data and we need to 
halt immediately in CI or on an lldb developer's desktop.  I'm a little less 
sure about thing this on for DEBUG builds all the time because it does change 
lldb's perf behavior a lot, specifically adding lots of packet traffic, but I 
think we should give it a try and see if it turns out to be too much of a 
change.

Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D106584/new/

https://reviews.llvm.org/D106584

_______________________________________________
lldb-commits mailing list
lldb-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/lldb-commits

[Lldb-commits] [PATCH] D106584: [lldb] Improve checking of file cache read eligibility for mach-O

Reply via email to