kpdev42 added a comment.
In D144392#4139001 <https://reviews.llvm.org/D144392#4139001>, @labath wrote:
> I'm afraid you're fixing this at the wrong end. This kind of complex thread
> control does not belong inside the debug stub.
>
> IIUC, the problematic sequence of events is:
>
> 1. lldb sends a `vCont:s` packet
> 2. lldb-server resumes (PTRACE_SINGLESTEP)
> 3. process immediatelly stops again (without stepping) due to a pending signal
> 4. lldb-server decides to inject the signal and resumes (steps) again
> 5. process stops inside the signal handler
> 6. confusion ensues
>
> If that's the case, then I think the bug is at step 4, specifically in this
> code:
>
> // Check if debugger should stop at this signal or just ignore it and resume
> // the inferior.
> if (m_signals_to_ignore.contains(signo)) {
> ResumeThread(thread, thread.GetState(), signo);
> return;
> }
>
> I believe we should not be attempting to inject the signal if the thread was
> stepping. I think we should change this to report the signal to the client
> and let it figure out what to do. I.e., change this code into something like:
>
> if (m_signals_to_ignore.contains(signo) && thread.GetState() ==
> eStateRunning) {
> ResumeThread(thread, eStateRunning, signo);
> return;
> }
> // else report the signal as usual
>
> If that is not enough to fix the bug, then we can figure out what needs to be
> done on the client. @jingham might have something to say about that.
The full problematic sequence is this:
- current pc is at a breakpoint, user hits continue (or step, etc.)
- `ThreadPlanStepOverBreakpoint` is pushed
- the plan wants `eStateStepping`, so lldb sens a `vCont;s` packet to the
lldb-server
- lldb-server resumes with `PTRACE_SINGLESTEP`
- process stops due to a (pass=true, stop=false) signal
Then there are two possibilities, depending on signal filtering settings:
1. The signal is ignored (notify=false)
- lldb-server injects the signal back to the process and resumes (steps again)
2. The signal is not ignored (notify=true)
- lldb-server sends the stop reason with signal to the lldb client
- lldb does not care, because it is configured to not stop, so wants to step
again, sends the packet
After 1. and 2., the events are the same:
- process stops due to stepping into the signal handler, lldb client sees a
successful step
- `StepOverBreakpoint` plan sees a pc != breakpoint address, thinks its job is
done, pops off successfuly with an autocontinue
- process resumes, gets out of the handler right back to the breakpoint address
- the breakpoint hits, so the user sees a second breakpoint hit, but the
instruction under that address was still not executed
Technically, this is correct, because the pc was at a breakpoint, the process
did execute some instructions and went back to the breakpoint, and the program
state could have changed. But this is very annoying when a lot of signals are
received in the background (and the signal is not interesting to the user, as
it, for example, comes from a low level library, the same reason real-time
signals are `stop=false` `notify=false` by default right now:
https://reviews.llvm.org/D12795)
So it does not depend on the signal filtering (it can be configured anyway),
but the problem I would think is that client does not handle the situation with
signals while stepping (at least stepping off a breakpoint) properly, and
lldb-server does not help.
Gdb does this for example:
> GDB optimizes for stepping the mainline code. If a signal that has handle
> nostop and handle pass set arrives while a stepping command (e.g., stepi,
> step, next) is in progress, GDB lets the signal handler run and then resumes
> stepping the mainline code once the signal handler returns. In other words,
> GDB steps over the signal handler. This prevents signals that you’ve
> specified as not interesting (with handle nostop) from changing the focus of
> debugging unexpectedly.
(https://sourceware.org/gdb/onlinedocs/gdb/Signals.html), which seems
reasonable.
If this logic does not belong in lldb-server, then it could be fixed right
inside the `StepOverBreakpoint` plan, by enabling the breakpoint being stepped
over when a signal is received, waiting for it to hit when the potential
handler has returned, and then trying to step off again. But then doing a
`step-into` from a line with a breakpoint will step over the signal handler,
and from a line without a breakpoint will stop inside the handler, if a signal
is received. Then probably creating a new `ThreadPlan` instead with the same
logic, executing on top of the plan stack, is the way to go?
Anyway, in my attempts at fixing it on the client side, additionally, for some
reason the following situation is often triggered:
- process steps onto a breakpoint, but right before executing the instruction
and actually hitting (the `SIGTRAP` signal) another signal (`stop=false`) is
delivered
- the process wants to automatically continue
- the following code
(https://github.com/llvm/llvm-project/blob/959216f9b1f1fe0c8817a4e9104a38929247f987/lldb/source/Target/Thread.cpp#L639):
BreakpointSiteSP bp_site_sp =
GetProcess()->GetBreakpointSiteList().FindByAddress(thread_pc);
if (bp_site_sp) {
...
ThreadPlanSP step_bp_plan_sp(new ThreadPlanStepOverBreakpoint(*this));
...
QueueThreadPlan(step_bp_plan_sp, false);
pushes a step over breakpoint plan, and the breakpoint is skipped. This should
not happen, but I couldn't reproduce this without the patch.
Repository:
rG LLVM Github Monorepo
CHANGES SINCE LAST ACTION
https://reviews.llvm.org/D144392/new/
https://reviews.llvm.org/D144392
_______________________________________________
lldb-commits mailing list
[email protected]
https://lists.llvm.org/cgi-bin/mailman/listinfo/lldb-commits