DavidSpickett added a comment.

> Do you have any explanation on this from AARM

Yes I do. From linux arch/arm64/kernel/ptrace.c:

  /*
   * The PAC bits can differ across data and instruction pointers
   * depending on TCR_EL1.TBID*, which we may make use of in future, so
   * we expose separate masks.
   */
  unsigned long mask = ptrauth_user_pac_mask();
  struct user_pac_mask uregs = {
          .data_mask = mask,
          .insn_mask = mask,
  };

So currently we'll only ever see one value, in both masks.

The control bit this refers to is:

  D13.2.131 TCR_EL1, Translation Control Register (EL1)

  TBID0, bit [51]
  0b0 TCR_EL1.TBI0 applies to Instruction and Data accesses.
  0b1 TCR_EL1.TBI0 applies to Data accesses only.

This is talked about earlier in the docs:

  Supported PAC field and relation to the use of address tagging

  When address tagging is used
    The PAC field is Xn[54:bottom_PAC_bit].
  When address tagging is not used
    The PAC field is Xn[63:56, 54:bottom_PAC_bit].

The upshot of that is that you could have top byte ignore and PAC for data, but only PAC for instruction addresses.

PAC itself is all or nothing; at the hardware level it's on or off. If you wanted to not use it for one of code or data, your runtime simply chooses not to sign any pointers of that kind. Like arm64e appears to do for data (https://developer.apple.com/documentation/security/preparing_your_app_to_work_with_pointer_authentication).

The current masks that lldb shows, which have top byte ignore included already:

  (lldb) process status --verbose
  <...>
  Addressable code address mask: 0xff7f000000000000
  Addressable data address mask: 0xff7f000000000000

So the end result is the same for us.

What could happen is that a future extension that isn't top byte ignore could use those bits instead of PAC, making the PAC specific mask 0x007f... Though I don't know how Linux would reconcile enabling TBI for userspace then doing that. Maybe the amount of top byte use is small enough that it could be changed (especially top byte of code addresses). But chances are slim, it seems to me.

So back to my ideas in the previous comment.
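
To make the mask arithmetic above concrete, here is a minimal sketch, not lldb or kernel code. It assumes 48-bit virtual addresses (so bottom_PAC_bit is 48, which is consistent with the 0xff7f000000000000 masks shown above) and only handles user-space pointers where bit 55 is clear. All of the names are made up for illustration.

```cpp
#include <cstdint>

// Assumption for this sketch: 48-bit virtual addresses, so bottom_PAC_bit is 48.
constexpr unsigned kBottomPacBit = 48;

// Bits [63:56], the byte covered by top byte ignore (TBI).
constexpr uint64_t kTopByte = 0xffULL << 56;

// PAC field when address tagging is used: Xn[54:bottom_PAC_bit].
constexpr uint64_t PacMaskWithTagging(unsigned bottom_pac_bit) {
  return ((1ULL << 55) - 1) & ~((1ULL << bottom_pac_bit) - 1);
}

// PAC field when address tagging is not used: Xn[63:56, 54:bottom_PAC_bit].
constexpr uint64_t PacMaskWithoutTagging(unsigned bottom_pac_bit) {
  return kTopByte | PacMaskWithTagging(bottom_pac_bit);
}

// Hypothetical helper: clear every masked (non-address) bit.
// Simplification: only valid for user-space pointers, where bit 55 is 0.
constexpr uint64_t StripNonAddressBits(uint64_t ptr, uint64_t mask) {
  return ptr & ~mask;
}

// PAC only, with tagging in use: the PAC specific mask.
static_assert(PacMaskWithTagging(kBottomPacBit) == 0x007f000000000000ULL,
              "PAC field is bits [54:48]");

// Adding the top byte to that mask gives the same value as PAC without tagging,
// which is why a debugger stripping all non-address bits sees one mask.
static_assert((PacMaskWithTagging(kBottomPacBit) | kTopByte) ==
                  PacMaskWithoutTagging(kBottomPacBit),
              "same mask either way");
```

For example, StripNonAddressBits(ptr, PacMaskWithoutTagging(kBottomPacBit)) clears both the top byte and the PAC field of a user-space pointer in one step.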

> Assume that they're the same, which does work for Linux, for now.

Would work fine for Linux for now and probably for a long time, given that changing the TBI setting would be seen as an ABI issue. And if someone decided to disable TBI completely and only use PAC, this still works because PAC extends into the top byte.

If they do decide to disable TBI for instructions then we're still fine, given that the mask to extract the virtual address remains the same. Yes, the PAC mask has changed, but the debugger is looking to remove *all* non-address bits.

E.g. if we disable TBI for instruction accesses, the mask is 0xff7f000000000000 because PAC claims the top byte. Then the mask for data accesses is 0x007f000000000000 but we add TBI to get 0xff7f000000000000. Same result in the end.

So we could just pick one of the methods and standardise on that for situations where you don't know for sure it'll be a code address. This will have to be `FixDataAddress` due to Arm Thumb's mode bit 0. We don't want to be aligning all reads to 2 bytes. (FWIW this matches what I've done so far, though that was unintentional)

Perhaps we add a third method to make that clear (name subject to change): `FixAnyAddress`. Then the Arm code can forward that to FixData and AArch64 can pick either data or code. For situations where you're sure, you can pick code or data, e.g. a code breakpoint on an address.

> Add a method that does both fixes, on the assumption that the virtual address
> size for code and data is the same so no harm done and all bits will be
> removed either way.

The Arm Thumb problem means this is not going to work. (Not that those targets are likely to care about non-address bits, but these Fix calls are made from generic code so it does still matter.)

> Extensively track whether addresses refer to code or data

Isn't realistic a lot of the time.
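
A rough sketch of the `FixAnyAddress` idea, to show the shape rather than the real thing. The classes below are hypothetical stand-ins, not lldb's actual ABI plugin classes. The AArch64 one assumes user-space addresses only (bit 55 clear) and takes its masks as constructor arguments instead of reading them from the target.

```cpp
#include <cstdint>

using addr_t = uint64_t;

// Hypothetical stand-in for an ABI plugin interface (not lldb's real class).
struct AbiSketch {
  virtual ~AbiSketch() = default;
  virtual addr_t FixCodeAddress(addr_t addr) const = 0;
  virtual addr_t FixDataAddress(addr_t addr) const = 0;
  // For callers that cannot tell whether addr is code or data.
  virtual addr_t FixAnyAddress(addr_t addr) const = 0;
};

// 32-bit Arm: bit 0 of a code address is the Thumb mode bit.
struct ArmSketch : AbiSketch {
  addr_t FixCodeAddress(addr_t addr) const override {
    return addr & ~addr_t(1); // drop the Thumb bit
  }
  addr_t FixDataAddress(addr_t addr) const override {
    return addr; // data addresses have no non-address bits here
  }
  // Forward to the data fix so arbitrary reads are not aligned to 2 bytes.
  addr_t FixAnyAddress(addr_t addr) const override {
    return FixDataAddress(addr);
  }
};

// AArch64: clear the masked bits. Simplified to user-space addresses only.
struct AArch64Sketch : AbiSketch {
  AArch64Sketch(addr_t code_mask, addr_t data_mask)
      : code_mask_(code_mask), data_mask_(data_mask) {}

  addr_t FixCodeAddress(addr_t addr) const override {
    return addr & ~code_mask_;
  }
  addr_t FixDataAddress(addr_t addr) const override {
    return addr & ~data_mask_;
  }
  // Either fix would do while both masks cover the same bits; data is an
  // arbitrary choice for this sketch.
  addr_t FixAnyAddress(addr_t addr) const override {
    return FixDataAddress(addr);
  }

private:
  addr_t code_mask_;
  addr_t data_mask_;
};
```

Generic code that doesn't know what kind of address it has would call `FixAnyAddress`. Code that does know, e.g. setting a code breakpoint, would still call the specific method.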
There are some clear situations where FixCode or FixData makes more sense though, so we can do some of this, just not an lldb-wide tracking framework sort of thing.

So my suggestion for a solution would be to add a FixAnyAddress alongside FixCode and FixData, and use that whenever it could be either. Tricky things like Arm Thumb can then choose what the most "safe" fix is. Tell me if that logic makes sense.

> Which will mean we actually dont need two separate functions.

At the ABI plugin level we do, simply due to Arm Thumb existing. Lower down, yeah, you could get away with reading just one of the PAC masks, but it's not much of a saving.

Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D118794/new/

https://reviews.llvm.org/D118794

_______________________________________________
lldb-commits mailing list
lldb-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/lldb-commits