> On Jan 21, 2021, at 2:33 PM, Jim Ingham <jing...@apple.com> wrote:
> 
> 
> 
>> On Jan 21, 2021, at 12:51 PM, walter erquinigo via Phabricator 
>> <revi...@reviews.llvm.org> wrote:
>> 
>> wallace added a comment.
>> 
>> Sorry for returning late to this diff, but I have some additional 
>> information. This is what's happening:
>> 
>> - Before the exec, the base thread plan holds a reference to the thread 
>> pointer
>> - During the exec, the original thread is destroyed in 
>> ProcessGDBRemote::SetLastStopPacket, and later a new Thread pointer is 
>> created for the same tid. Notice that the base thread plan is still holding 
>> a reference to the previous pointer even though there's a new Thread pointer 
>> for the same tid. On Linux, exec preserves the tid. This step is where I 
>> think there's a wrong invariant.
>> - Then, at some point, the base thread plan is asked its stop reason, and 
>> it'll try to use the destroyed pointer, thus crashing.
> 
> Thanks for looking at this some more.
> 
> Sadly, I think I'm still missing something.  Here's how I think things should 
> happen:
> 
> 1) The program was stopped somewhere in the original program (before exec), 
> and the user continued the process
>   a) WillResume is called and all the thread plans clear their cached thread 
> pointers.  So they will have to look them up by ID when next asked a question.
>   b) Then the process resumes.
> 2) We get a stop packet with a reason of "exec" from the stub.
>   a) We process the stop packet, and get the new thread list
>   b) We start to build the new thread list

Ack, that was confusing.  Item b) should be "We start to build the new 
ThreadList".  The object in a) is the thread list the stub tells about, either 
in the stop packet or through subsequent queries.  The object in b) is the 
ThreadList in the Process that will describe the state at this particular stop 
point.

>      1) From what you are saying, in the case of exec, regardless of whether 
> the TID matches an extant one or not, we make a new Thread *, put it into the 
> new ThreadList and delete the old Thread *.
>   c) Then we start figuring out what to do after the stop.
>      i) If the TID of the new thread is the same as any of the TID's in the 
> old thread list (and thus of the ThreadPlanStack that was managing it) when 
> the new thread gets asked what to do, it will route the question to the 
> ThreadPlanStack for that TID.
>      ii) If this happens after the new thread list is in place, 
> FindThreadByID will return the new, valid thread, and there should be no 
> problem.
> 
> So for this to fail as you described, somewhere between the time that we 
> stopped for exec and the time we finalized the new thread list and discarded 
> the old threads, somebody must have asked the thread plan stack for that TID 
> a question that triggered it to cache the old thread's Thread *.
> 
> That really shouldn't happen, precisely for the reason you see here.  When we 
> stop, we have to finish setting up the current ThreadList before we start 
> asking questions of the ThreadPlanStack for threads, like ShouldStop or 
> whatever.  It makes no sense to ask the pre-stop thread list questions for 
> the current stop, and we shouldn't do it.
> 
> Do you know where we're calling ThreadPlan::GetThread while the thread list 
> still has the old threads in it?  I'd really like to stop that if we can.  
> And if you can move those queries till after the new thread list is 
> established, then we shouldn't need any of the alternatives you suggest.
> 
> About the alternatives:
> 
> First one: I'd rather not artificially clear out the ThreadPlanStack after 
> exec.  After all, if you knew that exec on Linux preserves the TID, you could 
> write a thread plan that carried over the exec and did something interesting. 
>  Be a shame to prevent that for implementation reasons.
> 
> Second one: Adding the IsValid check won't do any harm, but it implicitly 
> lets something happen that I don't think should happen, so it doesn't feel 
> right...
> 
> Third one: If there's some good reason why you have to first check through 
> the old thread list for stop reasons, then update the thread list, then ask 
> again with no Resume in between, I think the cleanest solution is to announce 
> that by explicitly clearing the ThreadPlans' cached thread pointer.  So I 
> think this one's the best option.  But again, I'd like to understand why it 
> was necessary.
> 
> Jim
> 
> 
>> 
>> I've found three possible fixes for this (I'm pretty sure that you can come 
>> up with more):
>> 
>> - The first one is what this diff does, which is clearing all the thread 
>> plans once the gdbremote client knows there's an exec.
>> - The second one is modifying the ThreadPlan::GetThread method, which 
>> currently looks like this
>> 
>> if (m_thread)
>>     return *m_thread;
>> 
>>   ThreadSP thread_sp = m_process.GetThreadList().FindThreadByID(m_tid);
>>   m_thread = thread_sp.get();
>>   return *m_thread;
>> 
>> Changing it into
>> 
>> if (m_thread && m_thread->IsValid()) // IsValid will return false for a 
>> deleted thread
>>     return *m_thread;
>> 
>>   ThreadSP thread_sp = m_process.GetThreadList().FindThreadByID(m_tid);
>>   m_thread = thread_sp.get();
>>   return *m_thread;
>> 
>> also works, as it will make the ThreadPlan grab the new thread pointer.
>> 
>> - Another solution is to create a method ThreadPlan::PrepareForExec() which 
>> just clears the m_thread pointer in the the thread plan, and call it from 
>> ProcessGDBRemote::SetLastStopPacket, where we handle the exec.
>> 
>> I prefer the first solution, as the thread plans are all cleared in 
>> preparation for the new process.
>> 
>> 
>> Repository:
>> rG LLVM Github Monorepo
>> 
>> CHANGES SINCE LAST ACTION
>> https://reviews.llvm.org/D93874/new/
>> 
>> https://reviews.llvm.org/D93874
>> 
> 

_______________________________________________
lldb-commits mailing list
lldb-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/lldb-commits

Reply via email to