I have a question about the handling of SendContinuePacketAndWaitForResponse
when an attempt to interrupt the process times out.
One oddity I noticed is in the handling of the eStateInvalid return from
SendContinuePacketAndWaitForResponse (in ProcessGDBRemote::AsyncThread). When
there's an async interrupt request in flight and the ReadPacket in
SendContinuePacketAndWaitForResponse (SBPAWFR hereafter) wakes up from its
periodic timeout and it's been "too long" with nothing from debugserver,
SBPAWFR returns eStateInvalid. The code handling that in the
AsyncThread does:
  case eStateInvalid: {
    // Check to see if we were trying to attach and if we got back
    // the "E87" error code from debugserver -- this indicates that
    // the process is not debuggable. Return a slightly more
    // helpful error message about why the attach failed.
    if (::strstr(continue_cstr, "vAttach") != nullptr &&
        response.GetError() == 0x87) {
      process->SetExitStatus(-1, "cannot attach to process due to "
                                 "System Integrity Protection");
    } else if (::strstr(continue_cstr, "vAttach") != nullptr &&
               response.GetStatus().Fail()) {
      process->SetExitStatus(-1, response.GetStatus().AsCString());
    } else {
      process->SetExitStatus(-1, "lost connection");
    }
    break;
  }
So when an interrupt fails, we immediately give up on the process, setting the
exit status to -1.
Note, however, that this branch of the if doesn't set "done" to true. The
switch is inside a 'while (!done)' loop, so after handling the eStateInvalid
response, we go back to wait for another packet from the debug stub.
SetExitStatus calls SetPrivateState(eStateExited), which will produce another
event, and normally we'll handle that and shut down nicely. But by returning
to fetch another packet from debugserver, we're leaving ourselves open to
getting a delayed stop we are no longer set up to deal with.
The background is that there's a longstanding but infrequent crash in lldb,
which I'm trying to fix. I've never been able to reproduce it locally, so I
only have the end state.
The crashing thread is the internal-state thread. It is crashing because it
was handling a stop event, and went to ask the thread what to do, and the
ThreadPlanStack is corrupt in some form or other. The symptoms vary, but they
all violate basic invariants of the ThreadPlanStack (e.g. it crashes because
the stack has no elements, which should never happen). At the time of the crash, there's
always another thread which is some ways past the "wait for the process to
stop" part of the Destroy operation on the process, and has begun tearing down
Process objects. That's pretty clearly why the ThreadPlanStack is in a bad
state...
So far as I can see, the only way that could happen is if the attempt to
interrupt timed out, but we didn't stop listening for packets, and didn't
immediately shut down the internal state thread. So then debugserver woke up,
and sent the stop packet, and the internal state thread tried to process it,
but we were already in mid Destroy, and that went poorly...
The fact that we go back for another packet after getting eStateInvalid seems
to be what allows this to happen.
lldb has lacked the "done = true;" that I would have expected here since at
least the great reformatting, so I don't have any useful history on this.
I added the "done = true;" and ran the testsuite, and there weren't any new
failures. But this code is pretty complex, so there might be some reason for
waiting to see if the stub returns another packet after the interrupt fails.
But I can't think of one.
Can anybody else think of a reason why not to set done to true here?
Jim
_______________________________________________
lldb-dev mailing list
[email protected]
https://lists.llvm.org/cgi-bin/mailman/listinfo/lldb-dev