kpdev42 added a comment.

In D144392#4139001 <https://reviews.llvm.org/D144392#4139001>, @labath wrote:

> I'm afraid you're fixing this at the wrong end. This kind of complex thread 
> control does not belong inside the debug stub.
>
> IIUC, the problematic sequence of events is:
>
> 1. lldb sends a `vCont:s` packet
> 2. lldb-server resumes (PTRACE_SINGLESTEP)
> 3. process immediatelly stops again (without stepping) due to a pending signal
> 4. lldb-server decides to inject the signal and resumes (steps) again
> 5. process stops inside the signal handler
> 6. confusion ensues
>
> If that's the case, then I think the bug is at step 4, specifically in this 
> code:
>
>   // Check if debugger should stop at this signal or just ignore it and resume
>   // the inferior.
>   if (m_signals_to_ignore.contains(signo)) {
>      ResumeThread(thread, thread.GetState(), signo);
>      return;
>   }
>
> I believe we should not be attempting to inject the signal if the thread was 
> stepping. I think we should change this to report the signal to the client 
> and let it figure out what to do. I.e., change this code into something like:
>
>   if (m_signals_to_ignore.contains(signo) && thread.GetState() == 
> eStateRunning) {
>      ResumeThread(thread, eStateRunning, signo);
>      return;
>   }
>   // else report the signal as usual
>
> If that is not enough to fix the bug, then we can figure out what needs to be 
> done on the client. @jingham might have something to say about that.

The full problematic sequence is this:

- current pc is at a breakpoint, user hits continue (or step, etc.)
- `ThreadPlanStepOverBreakpoint` is pushed
- the plan wants `eStateStepping`, so lldb sens a `vCont;s` packet to the 
lldb-server
- lldb-server resumes with `PTRACE_SINGLESTEP`
- process stops due to a (pass=true, stop=false) signal

Then there are two possibilities, depending on signal filtering settings:

1. The signal is ignored (notify=false)
  - lldb-server injects the signal back to the process and resumes (steps again)
2. The signal is not ignored (notify=true)
  - lldb-server sends the stop reason with signal to the lldb client
  - lldb does not care, because it is configured to not stop, so wants to step 
again, sends the packet

After 1. and 2., the events are the same:

- process stops due to stepping into the signal handler, lldb client sees a 
successful step
- `StepOverBreakpoint` plan sees a pc != breakpoint address, thinks its job is 
done, pops off successfuly with an autocontinue
- process resumes, gets out of the handler right back to the breakpoint address
- the breakpoint hits, so the user sees a second breakpoint hit, but the 
instruction under that address was still not executed

Technically, this is correct, because the pc was at a breakpoint, the process 
did execute some instructions and went back to the breakpoint, and the program 
state could have changed. But this is very annoying when a lot of signals are 
received in the background (and the signal is not interesting to the user, as 
it, for example, comes from a low level library, the same reason real-time 
signals are `stop=false` `notify=false` by default right now: 
https://reviews.llvm.org/D12795)

So it does not depend on the signal filtering (it can be configured anyway), 
but the problem I would think is that client does not handle the situation with 
signals while stepping (at least stepping off a breakpoint) properly, and 
lldb-server does not help.

Gdb does this for example:

> GDB optimizes for stepping the mainline code. If a signal that has handle 
> nostop and handle pass set arrives while a stepping command (e.g., stepi, 
> step, next) is in progress, GDB lets the signal handler run and then resumes 
> stepping the mainline code once the signal handler returns. In other words, 
> GDB steps over the signal handler. This prevents signals that you’ve 
> specified as not interesting (with handle nostop) from changing the focus of 
> debugging unexpectedly.

(https://sourceware.org/gdb/onlinedocs/gdb/Signals.html), which seems 
reasonable.

If this logic does not belong in lldb-server, then it could be fixed right 
inside the `StepOverBreakpoint` plan, by enabling the breakpoint being stepped 
over when a signal is received, waiting for it to hit when the potential 
handler has returned, and then trying to step off again. But then doing a 
`step-into` from a line with a breakpoint will step over the signal handler, 
and from a line without a breakpoint will stop inside the handler, if a signal 
is received. Then probably creating a new `ThreadPlan` instead with the same 
logic, executing on top of the plan stack, is the way to go?

Anyway, in my attempts at fixing it on the client side, additionally, for some 
reason the following situation is often triggered:

- process steps onto a breakpoint, but right before executing the instruction 
and actually hitting (the `SIGTRAP` signal) another signal (`stop=false`) is 
delivered
- the process wants to automatically continue
- the following code 
(https://github.com/llvm/llvm-project/blob/959216f9b1f1fe0c8817a4e9104a38929247f987/lldb/source/Target/Thread.cpp#L639):

  BreakpointSiteSP bp_site_sp =
      GetProcess()->GetBreakpointSiteList().FindByAddress(thread_pc);
  if (bp_site_sp) {
  ...
      ThreadPlanSP step_bp_plan_sp(new ThreadPlanStepOverBreakpoint(*this));
      ...
          QueueThreadPlan(step_bp_plan_sp, false);
   

pushes a step over breakpoint plan, and the breakpoint is skipped. This should 
not happen, but I couldn't reproduce this without the patch.


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D144392/new/

https://reviews.llvm.org/D144392

_______________________________________________
lldb-commits mailing list
lldb-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/lldb-commits

Reply via email to