[Lldb-commits] [PATCH] D71440: Extending step-over range past calls was causing deadlocks, fix that.

Jim Ingham via Phabricator via lldb-commits Thu, 12 Dec 2019 16:13:43 -0800

jingham created this revision.
jingham added a reviewer: clayborg.
jingham added a project: LLDB.
Herald added subscribers: lldb-commits, jfb.


This change:

https://reviews.llvm.org/D58678

made lldb extend it's "next branch breakpoint" over IsCall instructions as a 
way to speed up stepping.  However, the change wasn't quite complete.  When 
lldb runs to the "next branch breakpoint" it always stops other threads when 
doing so.  That is because as much as possible we want users to be able to 
focus on the thread they are debugging, and not have events from other threads 
disrupt their attention.

It was safe to do this for code in a function because, except in very odd 
circumstances, the actions that might deadlock are always function calls.

Before this change we would run to the call, step into a function, then step 
out.  And we always run all threads when we step out of a function call.

With the change, we were extending the range past the call, but treating it as 
"code in the function", which it no longer was.  That would mean we were 
running arbitrary code with only on thread running, and that causes deadlocks 
in the target, and then the "step-over" would never complete.

This patch adds a parameter to GetIndexOfNextBranchInstruction telling whether 
we did extend past a call or not.  Then in ThreadPlanStepOverRange, we check 
this parameter and run one thread or all threads depending on whether we 
extended past a call.

I also added a test case for this behavior.  It is a little unfortunate that 
the failure of the test will really be a timeout since if we do it wrong the 
step-over doesn't return.  But I can't think of a better way of doing it.  I 
don't want to add a watchdog timer because then I'd get spurious failures on 
slow machines or fight against the overall timeouts...


Repository:
  rG LLVM Github Monorepo

https://reviews.llvm.org/D71440

Files:
  lldb/include/lldb/Core/Disassembler.h
  lldb/include/lldb/Target/ThreadPlanStepRange.h
  lldb/packages/Python/lldbsuite/test/lang/c/step_over_no_deadlock/Makefile
  
lldb/packages/Python/lldbsuite/test/lang/c/step_over_no_deadlock/TestStepOverDoesntBlock.py
  lldb/packages/Python/lldbsuite/test/lang/c/step_over_no_deadlock/locking.c
  lldb/source/Core/Disassembler.cpp
  lldb/source/Target/Process.cpp
  lldb/source/Target/ThreadPlanStepRange.cpp

Index: lldb/source/Target/ThreadPlanStepRange.cpp
===================================================================
--- lldb/source/Target/ThreadPlanStepRange.cpp
+++ lldb/source/Target/ThreadPlanStepRange.cpp
@@ -238,8 +238,18 @@
 }
 
 bool ThreadPlanStepRange::StopOthers() {
-  return (m_stop_others == lldb::eOnlyThisThread ||
-          m_stop_others == lldb::eOnlyDuringStepping);
+  switch (m_stop_others) {
+  case lldb::eOnlyThisThread:
+    return true;
+  case lldb::eOnlyDuringStepping:
+    // If there is a call in the range of the next branch breakpoint,
+    // then we should always run all threads, since a call can execute
+    // arbitrary code which might for instance take a lock that's held
+    // by another thread.
+    return !m_found_calls;
+  case lldb::eAllThreads:
+    return false;
+  }
 }
 
 InstructionList *ThreadPlanStepRange::GetInstructionsForAddress(
@@ -292,6 +302,7 @@
     GetTarget().RemoveBreakpointByID(m_next_branch_bp_sp->GetID());
     m_next_branch_bp_sp.reset();
     m_could_not_resolve_hw_bp = false;
+    m_found_calls = false;
   }
 }
 
@@ -305,6 +316,9 @@
   if (!m_use_fast_step)
     return false;
 
+  // clear the m_found_calls, we'll rediscover it for this range.
+  m_found_calls = false;
+  
   lldb::addr_t cur_addr = GetThread().GetRegisterContext()->GetPC();
   // Find the current address in our address ranges, and fetch the disassembly
   // if we haven't already:
@@ -317,9 +331,11 @@
   else {
     Target &target = GetThread().GetProcess()->GetTarget();
     const bool ignore_calls = GetKind() == eKindStepOverRange;
+    bool found_calls;
     uint32_t branch_index =
         instructions->GetIndexOfNextBranchInstruction(pc_index, target,
-                                                      ignore_calls);
+                                                      ignore_calls, 
+                                                      &m_found_calls);
 
     Address run_to_address;
 
Index: lldb/source/Target/Process.cpp
===================================================================
--- lldb/source/Target/Process.cpp
+++ lldb/source/Target/Process.cpp
@@ -5800,7 +5800,8 @@
 
   uint32_t branch_index =
       insn_list->GetIndexOfNextBranchInstruction(insn_offset, target,
-                                                 false /* ignore_calls*/);
+                                                 false /* ignore_calls*/,
+                                                 nullptr);
   if (branch_index == UINT32_MAX) {
     return retval;
   }
Index: lldb/source/Core/Disassembler.cpp
===================================================================
--- lldb/source/Core/Disassembler.cpp
+++ lldb/source/Core/Disassembler.cpp
@@ -1101,15 +1101,22 @@
 uint32_t
 InstructionList::GetIndexOfNextBranchInstruction(uint32_t start,
                                                  Target &target,
-                                                 bool ignore_calls) const {
+                                                 bool ignore_calls,
+                                                 bool *found_calls) const {
   size_t num_instructions = m_instructions.size();
 
   uint32_t next_branch = UINT32_MAX;
   size_t i;
+  
+  if (found_calls)
+    *found_calls = false;
   for (i = start; i < num_instructions; i++) {
     if (m_instructions[i]->DoesBranch()) {
-      if (ignore_calls && m_instructions[i]->IsCall())
+      if (ignore_calls && m_instructions[i]->IsCall()) {
+        if (found_calls)
+          *found_calls = true;
         continue;
+      }
       next_branch = i;
       break;
     }
Index: lldb/packages/Python/lldbsuite/test/lang/c/step_over_no_deadlock/locking.c
===================================================================
--- /dev/null
+++ lldb/packages/Python/lldbsuite/test/lang/c/step_over_no_deadlock/locking.c
@@ -0,0 +1,85 @@
+#include <pthread.h>
+#include <stdlib.h>
+#include <unistd.h>
+#include <stdio.h>
+
+pthread_mutex_t contended_mutex = PTHREAD_MUTEX_INITIALIZER;
+
+pthread_mutex_t control_mutex = PTHREAD_MUTEX_INITIALIZER;
+pthread_cond_t  control_condition;
+
+pthread_mutex_t thread_started_mutex = PTHREAD_MUTEX_INITIALIZER;
+pthread_cond_t  thread_started_condition;
+
+// This function runs in a thread.  The locking dance is to make sure that 
+// by the time the main thread reaches the pthread_join below, this thread
+// has for sure acquired the contended_mutex.  So then the call_me_to_get_lock
+// function will block trying to get the mutex, and only succeed once it
+// signals this thread, then lets it run to wake up from the cond_wait and
+// release the mutex.
+
+void *
+lock_acquirer_1 (void *input)
+{
+  pthread_mutex_lock (&contended_mutex);
+  
+  // Grab this mutex, that will ensure that the main thread
+  // is in its cond_wait for it (since that's when it drops the mutex.
+
+  pthread_mutex_lock (&thread_started_mutex);
+  pthread_mutex_unlock(&thread_started_mutex);
+
+  // Now signal the main thread that it can continue, we have the contended lock
+  // so the call to call_me_to_get_lock won't make any progress till  this
+  // thread gets a chance to run.
+
+  pthread_mutex_lock (&control_mutex);
+
+  pthread_cond_signal (&thread_started_condition);
+
+  pthread_cond_wait (&control_condition, &control_mutex);
+
+  pthread_mutex_unlock (&contended_mutex);
+  return NULL;
+}
+
+int
+call_me_to_get_lock (int ret_val)
+{
+  pthread_cond_signal (&control_condition);
+  pthread_mutex_lock (&contended_mutex);
+  return ret_val;
+}
+
+int
+get_int() {
+  return 567;
+}
+
+int main ()
+{
+  pthread_t thread_1;
+
+  pthread_cond_init (&control_condition, NULL);
+  pthread_cond_init (&thread_started_condition, NULL);
+
+  pthread_mutex_lock (&thread_started_mutex);
+
+  pthread_create (&thread_1, NULL, lock_acquirer_1, NULL);
+  
+  pthread_cond_wait (&thread_started_condition, &thread_started_mutex);
+
+  pthread_mutex_lock (&control_mutex);
+  pthread_mutex_unlock (&control_mutex);
+
+  // Break here.  At this point the other thread will have the contended_mutex,
+  // and be sitting in its cond_wait for the control condition.  So there is
+  // no way that our by-hand calling of call_me_to_get_lock will proceed
+  // without running the first thread at least somewhat.
+
+  int result = call_me_to_get_lock(get_int());
+  pthread_join (thread_1, NULL);
+
+  return 0;
+
+}
Index: lldb/packages/Python/lldbsuite/test/lang/c/step_over_no_deadlock/TestStepOverDoesntBlock.py
===================================================================
--- /dev/null
+++ lldb/packages/Python/lldbsuite/test/lang/c/step_over_no_deadlock/TestStepOverDoesntBlock.py
@@ -0,0 +1,30 @@
+"""
+Test that step over will let other threads run when necessary
+"""
+
+from __future__ import print_function
+
+
+import lldb
+from lldbsuite.test.decorators import *
+from lldbsuite.test.lldbtest import *
+from lldbsuite.test import lldbutil
+
+
+class StepOverDoesntDeadlockTestCase(TestBase):
+
+    mydir = TestBase.compute_mydir(__file__)
+
+    def test_step_over(self):
+        """Test that when step over steps over a function it lets other threads run."""
+        self.build()
+        (target, process, thread, bkpt) = lldbutil.run_to_source_breakpoint(self,
+                                                                            "without running the first thread at least somewhat",
+                                                                            lldb.SBFileSpec("locking.c"))
+        # This is just testing that the step over actually completes.
+        # If the test fails this step never return, so failure is really
+        # signaled by the test timing out.
+        
+        thread.StepOver()
+        state = process.GetState()
+        self.assertEqual(state, lldb.eStateStopped)
Index: lldb/packages/Python/lldbsuite/test/lang/c/step_over_no_deadlock/Makefile
===================================================================
--- /dev/null
+++ lldb/packages/Python/lldbsuite/test/lang/c/step_over_no_deadlock/Makefile
@@ -0,0 +1,4 @@
+C_SOURCES := locking.c
+ENABLE_THREADS := YES
+
+include Makefile.rules
Index: lldb/include/lldb/Target/ThreadPlanStepRange.h
===================================================================
--- lldb/include/lldb/Target/ThreadPlanStepRange.h
+++ lldb/include/lldb/Target/ThreadPlanStepRange.h
@@ -76,6 +76,12 @@
   lldb::BreakpointSP m_next_branch_bp_sp;
   bool m_use_fast_step;
   bool m_given_ranges_only;
+  bool m_found_calls = false; // When we set the next branch breakpoint for
+                              // step over, we now extend them past call insns
+                              // that directly return.  But if we do that we
+                              // need to run all threads, or we might cause
+                              // deadlocks.  This tells us whether we found
+                              // any calls in setting the next branch breakpoint.
 
 private:
   std::vector<lldb::DisassemblerSP> m_instruction_ranges;
Index: lldb/include/lldb/Core/Disassembler.h
===================================================================
--- lldb/include/lldb/Core/Disassembler.h
+++ lldb/include/lldb/Core/Disassembler.h
@@ -287,6 +287,10 @@
   ///     a function call (a branch that calls and returns to the next
   ///     instruction). If false, find the instruction index of any 
   ///     branch in the list.
+  ///     
+  /// @param[out] found_calls
+  ///     If non-null, this will be set to true if any calls were found in 
+  ///     extending the range.
   ///    
   /// @return
   ///     The instruction index of the first branch that is at or past
@@ -295,7 +299,8 @@
   //------------------------------------------------------------------
   uint32_t GetIndexOfNextBranchInstruction(uint32_t start,
                                            Target &target,
-                                           bool ignore_calls) const;
+                                           bool ignore_calls,
+                                           bool *found_calls) const;
 
   uint32_t GetIndexOfInstructionAtLoadAddress(lldb::addr_t load_addr,
                                               Target &target);

_______________________________________________
lldb-commits mailing list
[email protected]
https://lists.llvm.org/cgi-bin/mailman/listinfo/lldb-commits

[Lldb-commits] [PATCH] D71440: Extending step-over range past calls was causing deadlocks, fix that.

Reply via email to