llvmbot wrote:

<!--LLVM PR SUMMARY COMMENT-->

@llvm/pr-subscribers-lldb

Author: Jason Molenda (jasonmolenda)

<details>
<summary>Changes</summary>

This fixes an uncommon bug with debugserver controlling an inferior process 
that is hitting an internal breakpoint &amp; continuing when multiple 
interrupts are sent by SB API to lldb.  In the reproducing setup (which 
required a machine with unique timing characteristics), lldb is sent 
SBProcess::Stop and then shortly after, SBProcess::SendAsyncSignal.  The driver 
process only sees that the inferior is publicly running at this point, even 
though it's hitting an internal breakpoint, disabling it, step instructioning, 
re-enabling the breakpoint, then continuing.

The packet sequence lldb sends to debugserver looks like

1. vCont;s   // instruction step
2. ^c        // async interrupt
3. Z....     // re-enable breakpoint
4. c         // resume inferior execution
5. ^c        // async interrupt

When debugserver needs to interrupt a running process 
(`MachProcess::Interrupt`), the main thread in debugserver sends a SIGSTOP 
posix signal to the inferior process, and notes that it has sent this signal by 
setting `m_sent_interrupt_signo`.

When we send the first async interrupt while instruction stepping, the signal 
is sent (probably after the inferior has already stopped) but lldb can only 
*receive* the mach exception that includes the SIGSTOP when the process is 
running.  So at the point of step (3), we have a SIGSTOP outstanding in the 
kernel, and
`m_sent_interrupt_signo` is set to SIGSTOP.

When we resume the inferior (`c` in step 4), debugserver sees that 
`m_sent_interrupt_signo` is still set for an outstanding SIGSTOP, but at this 
point we've already stopped so it's an unnecessary stop. It records that (1) 
we've got a SIGSTOP still coming that debugserver sent and (2) we should ignore 
it by also setting `m_auto_resume_signo` to the same signal value.

Once we've resumed the process, the mach exception thread 
(`MachTask::ExceptionThread`) receives the outstanding mach exception, adds it 
to a queue to be processed
(`MachProcess::ExceptionMessageReceived`) and when we've collected all 
outstanding mach exceptions, it calls
`MachProcess::ExceptionMessageBundleComplete` top evaluate them.

`MachProcess::ExceptionMessageBundleComplete` halts the process (without 
updating the MachProcess `m_state`) while evaluating them. It sees that this 
incoming SIGSTOP was meant to be ignored (`m_auto_resume_signo` is set), so it 
`MachProcess::PrivateResume`'s the process again.

At the same time `MachTask::ExceptionThread` is receiving and processing the 
ME, `MachProcess::Interrupt` is called with another interrupt that debugserver 
received.  This method checks that we're still eStateRunning (we are) but then 
sees that we have an outstanding SIGSTOP already (`m_sent_interrupt_signo`) and 
does nothing, assuming that we will stop shortly from that one.  It then 
returns to call `RNBRemote::HandlePacket_last_signal` to print the status -- 
but because the process is still `eStateRunning`, this does nothing.

So the first ^c (resulting in a pending SIGSTOP) is received and we resume the 
process silently.  And the second ^c is ignored because we've got one interrupt 
already being processed.

The fix was very simple.  In `MachProcess::Interrupt` when we detect that we 
have a SIGSTOP out in the wild (`m_sent_interrupt_signo`), we need to clear 
`m_auto_resume_signo` which is used to indicate that this SIGSTOP is meant to 
be ignored, because it was from before our most recent resume.

MachProcess::Interrupt holds the `m_exception_and_signal_mutex` mutex already 
(after Jonas's commit last week), and all of 
`MachProcess::ExceptionMessageBundleComplete` holds that same mutex, so we know 
we can modify `m_auto_resume_signo` here and it will be handled correctly when 
the outstanding mach exception is finally processed.

rdar://145872120

---
Full diff: https://github.com/llvm/llvm-project/pull/132128.diff


1 Files Affected:

- (modified) lldb/tools/debugserver/source/MacOSX/MachProcess.mm (+4) 


``````````diff
diff --git a/lldb/tools/debugserver/source/MacOSX/MachProcess.mm 
b/lldb/tools/debugserver/source/MacOSX/MachProcess.mm
index 4742beeb6a4a3..447ebbe7fb9e5 100644
--- a/lldb/tools/debugserver/source/MacOSX/MachProcess.mm
+++ b/lldb/tools/debugserver/source/MacOSX/MachProcess.mm
@@ -1594,6 +1594,10 @@ static uint64_t bits(uint64_t value, uint32_t msbit, 
uint32_t lsbit) {
                          m_sent_interrupt_signo);
       }
     } else {
+      // We've requested that the process stop anew; if we had recorded this
+      // requested stop as being in place when we resumed (& therefore would
+      // throw it away), clear that.
+      m_auto_resume_signo = 0;
       DNBLogThreadedIf(LOG_PROCESS, "MachProcess::Interrupt() - previously "
                                     "sent an interrupt signal %i that hasn't "
                                     "been received yet, interrupt aborted",

``````````

</details>


https://github.com/llvm/llvm-project/pull/132128
_______________________________________________
lldb-commits mailing list
lldb-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/lldb-commits

Reply via email to