Re: [Lldb-commits] [PATCH] D24610: LLDB Arm Watchpoints: Use single hardware watchpoint slot to watch multiple bytes where possible

Muhammad Omair Javaid via lldb-commits Mon, 19 Sep 2016 13:04:43 -0700

omjavaid added a comment.

Answers to comments. I will upload a updated patch after corrections and 
updates.



================
Comment at: 
packages/Python/lldbsuite/test/functionalities/watchpoint/multi_watchpoint_slots/main.c:23
@@ +22,3 @@
+    {
+        printf("About to write byteArray[%d] ...\n", i); // About to write 
byteArray
+
----------------
labath wrote:
> zturner wrote:
> > What's up with all the double spaced source code?  Is this intentional?
> Indeed. The spaces seem superfluous.
Agreed. Will be removed in next iteration.

================
Comment at: 
source/Plugins/Process/Linux/NativeRegisterContextLinux_arm.cpp:513-521
@@ -513,1 +512,11 @@
+
+  // Find out how many bytes we need to watch after 4-byte alignment boundary.
+  uint8_t watch_size = (addr & 0x03) + size;
+
+  // We cannot watch zero or more than 4 bytes after 4-byte alignment boundary.
+  if (size == 0 || watch_size > 4)
+    return LLDB_INVALID_INDEX32;
+
+  // Strip away last two bits of address for byte/half-word/word selection.
+  addr &= ~((lldb::addr_t)3);
 
----------------
labath wrote:
> zturner wrote:
> > This block of code is a bit confusing to me.  Is this equivalent to:
> > 
> > ```
> > lldb::addr_t start = llvm::alignDown(addr, 4);
> > lldb::addr_t end = addr + size;
> > if (start == end || (end-start)>4)
> >   return LLDB_INVALID_INDEX32;
> > ```
> I am not sure this is much clearer, especially, as we will later need a 
> separate varaible for `end-start` anyway.
> 
> +1 for `llvm::alignDown` though.
There is significant performance difference when we choose addr = addr & 
(~0x03); over llvm::alignDown

It will eventually effect responsiveness if we keep increasing code size like 
this. 

Even if we use -Os then alignDown is squeezed down to 8-9 instructions. I havnt 
tried clang though.

Instructions needed for addr = addr & (~0x03):
  4008bd:       48 8b 45 e8             mov    -0x18(%rbp),%rax
  4008c1:       48 83 e0 fc             and    $0xfffffffffffffffc,%rax
  4008c5:       48 89 45 e8             mov    %rax,-0x18(%rbp)

Call penalty for llvm::alignDown
400918: ba 00 00 00 00          mov    $0x0,%edx
  40091d:       48 89 ce                mov    %rcx,%rsi
  400920:       48 89 c7                mov    %rax,%rdi
  400923:       e8 ae 00 00 00          callq  4009d6 <_Z9alignDownmmm>
  400928:       49 89 c4                mov    %rax,%r12
  40092b:       48 8b 5d d8             mov    -0x28(%rbp),%rbx

Disassembly for llvm::alignDown
00000000004009d6 <_Z9alignDownmmm>:
  4009d6:       55                      push   %rbp
  4009d7:       48 89 e5                mov    %rsp,%rbp
  4009da:       48 89 7d f8             mov    %rdi,-0x8(%rbp)
  4009de:       48 89 75 f0             mov    %rsi,-0x10(%rbp)
  4009e2:       48 89 55 e8             mov    %rdx,-0x18(%rbp)
  4009e6:       48 8b 45 e8             mov    -0x18(%rbp),%rax
  4009ea:       ba 00 00 00 00          mov    $0x0,%edx
  4009ef:       48 f7 75 f0             divq   -0x10(%rbp)
  4009f3:       48 89 55 e8             mov    %rdx,-0x18(%rbp)
  4009f7:       48 8b 45 f8             mov    -0x8(%rbp),%rax
  4009fb:       48 2b 45 e8             sub    -0x18(%rbp),%rax
  4009ff:       ba 00 00 00 00          mov    $0x0,%edx
  400a04:       48 f7 75 f0             divq   -0x10(%rbp)
  400a08:       48 0f af 45 f0          imul   -0x10(%rbp),%rax
  400a0d:       48 89 c2                mov    %rax,%rdx
  400a10:       48 8b 45 e8             mov    -0x18(%rbp),%rax
  400a14:       48 01 d0                add    %rdx,%rax
  400a17:       5d                      pop    %rbp
  400a18:       c3                      retq   
  400a19:       0f 1f 80 00 00 00 00    nopl   0x0(%rax)

Number of instructions generated for alignDown with gcc -Os

  400892:       48 8b 6c 24 08          mov    0x8(%rsp),%rbp
  400897:       48 8b 4c 24 10          mov    0x10(%rsp),%rcx
  40089c:       31 d2                   xor    %edx,%edx
  40089e:       be c2 0a 40 00          mov    $0x400ac2,%esi
  4008a3:       bf a0 11 60 00          mov    $0x6011a0,%edi
  4008a8:       48 89 e8                mov    %rbp,%rax
  4008ab:       48 f7 f1                div    %rcx
  4008ae:       48 0f af c1             imul   %rcx,%rax
  4008b2:       48 89 c3                mov    %rax,%rbx

================
Comment at: source/Plugins/Process/Linux/NativeRegisterContextLinux_arm.cpp:552
@@ -559,1 +551,3 @@
 
+  uint8_t current_watch_size, new_watch_size;
+  // Calculate overall size width to be watched by current hardware watchpoint 
slot.
----------------
labath wrote:
> Looks much better. Any reason for not using `NextPowerOf2` ? Among other 
> things, it is self-documenting, so you do not need the comment above that.
so llvm::NextPowerOf2 doesnt serve the intended behaviour.

llvm::NextPowerOf2 returns 2 for 1, 4 for 2 or 3, and 8 for 4.

We just want to make sure that our new_size is a power of 2 if its already not 
which will only be the case if size turns out to be 3.


================
Comment at: source/Plugins/Process/Linux/NativeThreadLinux.cpp:202
@@ +201,3 @@
+  // Invalidate watchpoint index map to re-sync watchpoint registers.
+  m_watchpoint_index_map.clear();
+
----------------
labath wrote:
> If you add this, then the comment below becomes obsolete.
> 
> Seems like a pretty elegant solution to the incremental watchpoint update 
> problem. I am wondering whether we need to do it on every resume though. I 
> think it should be enough to do it when a watchpoint gets deleted 
> (`NTL::RemoveWatchpoint`). Also, we should throw out the implementation of 
> `NativeRegisterContextLinux_arm::ClearHardwareWatchpoint` -- it's no longer 
> necessary, and it's not even correct anymore.
This can be improved for performance I intentionally didn't do it to minimize 
changes to the generic component. 


https://reviews.llvm.org/D24610



_______________________________________________
lldb-commits mailing list
lldb-commits@lists.llvm.org
http://lists.llvm.org/cgi-bin/mailman/listinfo/lldb-commits

Re: [Lldb-commits] [PATCH] D24610: LLDB Arm Watchpoints: Use single hardware watchpoint slot to watch multiple bytes where possible

Reply via email to