DavidSpickett added a comment.

> Do you have any explanation on this from AARM

Yes I do.

  linux arch/arm64/kernel/ptrace.c:
    /*
     * The PAC bits can differ across data and instruction pointers
     * depending on TCR_EL1.TBID*, which we may make use of in future, so
     * we expose separate masks.
     */
    unsigned long mask = ptrauth_user_pac_mask();
    struct user_pac_mask uregs = {
      .data_mask = mask,
      .insn_mask = mask,
    };

So currently we'll only ever see one value, reported in both masks. The control
bit that kernel comment refers to is:

  D13.2.131 TCR_EL1, Translation Control Register (EL1)
  
  TBID0, bit [51]
  
  0b0 TCR_EL1.TBI0 applies to Instruction and Data accesses.
  0b1 TCR_EL1.TBI0 applies to Data accesses only.

This is talked about earlier in the docs:

  Supported PAC field and relation to the use of address tagging
  
  When address tagging is used
  The PAC field is Xn[54:bottom_PAC_bit].
  
  When address tagging is not used
  The PAC field is Xn[63:56, 54:bottom_PAC_bit].

The upshot of that is that you could have top byte ignore and PAC for data, but 
only PAC for instruction addresses.

PAC itself is all or nothing: at the hardware level it's either on or off. If
you wanted to not use it for one of code or data, your runtime simply chooses
not to sign any pointers of that kind, as arm64e appears to do for data
(https://developer.apple.com/documentation/security/preparing_your_app_to_work_with_pointer_authentication).

The current masks that lldb shows, which have top byte ignore included already:

  (lldb) process status --verbose
  <...>
  Addressable code address mask: 0xff7f000000000000
  Addressable data address mask: 0xff7f000000000000

So the end result is the same for us. What could happen is that a future
extension other than top byte ignore uses those bits instead of PAC, making
the PAC-specific mask 0x007f...
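
To make the difference concrete, here is a minimal sketch of what each mask would strip from a pointer. The helper name and the pointer value are made up for illustration, and it assumes a userspace address (bit 55 clear) so that clearing the masked bits is enough. It is not how lldb implements this.

```cpp
#include <cstdint>

// Mask lldb currently reports on Linux: top byte plus PAC bits [54:48].
constexpr uint64_t kReportedMask = 0xff7f000000000000ULL;
// PAC-only mask, as it would look if the top byte were claimed by
// something other than PAC or top byte ignore.
constexpr uint64_t kPacOnlyMask = 0x007f000000000000ULL;

// Hypothetical helper: clear every bit that is set in `mask`.
// Assumes a userspace address, so no sign extension is needed.
constexpr uint64_t StripNonAddressBits(uint64_t ptr, uint64_t mask) {
  return ptr & ~mask;
}

// Made-up pointer: tag 0xab in the top byte, signature 0x2a in bits [54:48].
constexpr uint64_t kSignedPtr = 0xab2a000012345678ULL;

static_assert(StripNonAddressBits(kSignedPtr, kReportedMask) ==
                  0x0000000012345678ULL,
              "combined mask leaves only the virtual address");
static_assert(StripNonAddressBits(kSignedPtr, kPacOnlyMask) ==
                  0xab00000012345678ULL,
              "PAC-only mask leaves the top byte in place");
```

With the PAC-only mask the debugger would still have to remove the top byte by some other means before using the address.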

Though I don't know how Linux would reconcile having enabled TBI for userspace
with doing that. Maybe the amount of top byte use is small enough that it could
be changed (especially the top byte of code addresses). But the chances seem
slim to me.

So back to my ideas in the previous comment.

> Assume that they're the same, which does work for Linux, for now.

Would work fine for Linux for now and probably for a long time given that 
changing the TBI setting would be seen as an ABI issue.
And if someone decided to disable TBI completely and only use PAC, this still 
works because PAC extends into the top byte.

If they do decide to disable TBI for instructions then we're still fine given 
that the mask to extract the virtual address remains
the same. Yes the PAC mask has changed but the debugger is looking to remove 
*all* non-address bits.

E.g. If we disable TBI for instruction accesses the mask is 0xff7f000000000000 
because PAC claims the top byte.
Then the mask for data accesses is 0x007f000000000000 but we add TBI to get 
0xff7f000000000000. Same result in the end.
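
The arithmetic above can be sketched as follows. The function names are hypothetical, and it assumes a 48-bit virtual address so that bottom_PAC_bit is 48, which matches the masks lldb prints in the output earlier.

```cpp
#include <cstdint>

constexpr uint64_t kTopByte = 0xff00000000000000ULL;       // bits [63:56]
constexpr uint64_t kPacBelowBit55 = 0x007f000000000000ULL; // bits [54:48]

// PAC field for one kind of access. Without TBI the PAC field extends
// into the top byte, with TBI it stops at bit 54.
constexpr uint64_t PacMask(bool tbi_enabled) {
  return tbi_enabled ? kPacBelowBit55 : (kTopByte | kPacBelowBit55);
}

// Everything the debugger has to remove: the PAC field, plus the top
// byte when TBI applies.
constexpr uint64_t NonAddressMask(bool tbi_enabled) {
  return PacMask(tbi_enabled) | (tbi_enabled ? kTopByte : 0);
}

// Instruction accesses with TBI disabled vs. data accesses with TBI enabled.
static_assert(PacMask(false) != PacMask(true), "the PAC masks differ");
static_assert(NonAddressMask(false) == 0xff7f000000000000ULL, "");
static_assert(NonAddressMask(true) == NonAddressMask(false),
              "but the bits to strip are the same");
```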

So we could just pick one of the methods and standardise on that for situations
where you don't know for sure it'll be a code address.
This will have to be `FixDataAddress`, because on Arm bit 0 of a code address
is the Thumb mode bit and fixing a code address clears it. We don't want to be
aligning all reads to 2 bytes.
(FWIW this matches what I've done so far, though that was unintentional.)

Perhaps we add a third method to make that clear, `FixAnyAddress` (name subject
to change). Then the Arm code can forward that to `FixDataAddress` and AArch64
can pick either data or code. In situations where you're sure which kind of
address you have, you can still call the code or data method directly, e.g. for
a code breakpoint on an address.
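
A rough sketch of that shape, to show the idea rather than the actual patch. The class names here are stand-ins and not lldb's real ABI plugin classes, and the AArch64 mask is hardcoded to the Linux value shown earlier instead of being read from the target.

```cpp
#include <cstdint>

using addr_t = uint64_t;

// Stand-in for the ABI plugin interface (hypothetical, not lldb's class).
struct ABISketch {
  virtual ~ABISketch() = default;
  virtual addr_t FixCodeAddress(addr_t pc) const { return pc; }
  virtual addr_t FixDataAddress(addr_t addr) const { return addr; }
  // For addresses that could refer to either code or data.
  virtual addr_t FixAnyAddress(addr_t addr) const { return addr; }
};

struct ArmSketch : ABISketch {
  // Bit 0 of a code address is the Thumb mode bit, so clear it.
  addr_t FixCodeAddress(addr_t pc) const override { return pc & ~addr_t(1); }
  // Data is the safe choice: it leaves odd addresses alone.
  addr_t FixAnyAddress(addr_t addr) const override {
    return FixDataAddress(addr);
  }
};

struct AArch64Sketch : ABISketch {
  // Assumed to be the same for code and data, as on Linux today.
  addr_t mask = 0xff7f000000000000ULL;
  addr_t FixCodeAddress(addr_t pc) const override { return pc & ~mask; }
  addr_t FixDataAddress(addr_t addr) const override { return addr & ~mask; }
  // Either would do here; data is picked arbitrarily.
  addr_t FixAnyAddress(addr_t addr) const override {
    return FixDataAddress(addr);
  }
};
```

Generic code that can't tell what an address points at would call `FixAnyAddress`, so a byte read at an odd address on Arm isn't silently moved down by one.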

> Add a method that does both fixes, on the assumption that the virtual address 
> size for code and data is the same so no harm done and all bits will be 
> removed either way.

The Arm Thumb problem means this is not going to work. (Not that those targets
are likely to care about non-address bits, but these Fix calls are made from
generic code, so it does still matter.)

> Extensively track whether addresses refer to code or data

This isn't realistic a lot of the time. There are some clear situations where
FixCode or FixData makes more sense, so we can do some of this, just not an
lldb-wide tracking framework sort of thing.

So my suggestion for a solution would be to add a FixAnyAddress alongside
FixCode and FixData, and use that whenever it could be either. Tricky things
like Arm Thumb can then choose what the most "safe" fix is.

Tell me if that logic makes sense.

> Which will mean we actually dont need two separate functions.

At the ABI plugin level we do, simply due to Arm Thumb existing. Lower down,
yeah, you could get away with reading just one of the PAC masks, but it's not
much of a saving.


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D118794/new/

https://reviews.llvm.org/D118794
