On 05/11/2021 08:52, Andrea Corallo via Gcc-patches wrote:
Hi all,
this patch enables address return signature and verification based on
Armv8.1-M Pointer Authentication [1].
To sign the return address, we use the PAC R12, LR, SP instruction
upon function entry. This is signing LR using SP and storing the
result in R12. R12 will be pushed into the stack.
During function epilogue R12 will be popped and AUT R12, LR, SP will
be used to verify that the content of LR is still valid before return.
Here an example of PAC instrumented function prologue and epilogue:
pac r12, lr, sp
push {r3, r7, lr}
push {r12}
sub sp, sp, #4
Which, as shown here, generates a stack which does not preserve 8-byte
alignment.
Also, what's wrong with
pac r12, lr, sp
push {r3, r7, ip, lr}
?
Which saves 2 bytes in the prologue and ...
[...] function body
add sp, sp, #4
pop {r12}
pop {r3, r7, lr}
aut r12, lr, sp
bx lr
pop {r3, r7, ip, lr}
aut r12, lr, sp
bx lr
which saves 4 bytes in the epilogue (repeated for each instance of the
epilogue).
The patch also takes care of generating a PACBTI instruction in place
of the sequence BTI+PAC when Branch Target Identification is enabled
contextually.
What about variadic functions?
What about functions where lr is live on entry (where it's used for
passing the closure in nested functions)?
These two patches apply on top of Tejas series posted here [2].
Regressioned and arm-linux-gnu aarch64-linux-gnu bootstraped.
Best Regards
Andrea
[1]
<https://community.arm.com/developer/ip-products/processors/b/processors-ip-blog/posts/armv8-1-m-pointer-authentication-and-branch-target-identification-extension>
[2] <https://gcc.gnu.org/pipermail/gcc-patches/2021-October/581176.html>
+static bool arm_pac_enabled_for_curr_function_p (void);
I really don't like that name. There are a lot of functions with
variations of 'current function' in the name already and this creates
yet another variant. Something like
arm_current_function_pac_enabled_p() would be preferable; or, if that
really is too long, use 'current_func' which already has usage within
the compiler.
+(define_insn "pac_ip_lr_sp"
+ [(set (reg:DI IP_REGNUM)
+ (unspec:DI [(reg:DI SP_REGNUM) (reg:DI LR_REGNUM)]
+ UNSPEC_PAC_IP_LR_SP))]
+ ""
+ "pac\tr12, lr, sp")
+
+(define_insn "pacbti_ip_lr_sp"
+ [(set (reg:DI IP_REGNUM)
+ (unspec:DI [(reg:DI SP_REGNUM) (reg:DI LR_REGNUM)]
+ UNSPEC_PACBTI_IP_LR_SP))]
+ ""
+ "pacbti\tr12, lr, sp")
+
+(define_insn "aut_ip_lr_sp"
+ [(unspec:DI [(reg:DI IP_REGNUM) (reg:DI SP_REGNUM) (reg:DI LR_REGNUM)]
+ UNSPEC_AUT_IP_LR_SP)]
+ ""
+ "aut\tr12, lr, sp")
+
I think all these need a length attribute. Also, they should only be
enabled for thumb2 (certainly not in Arm state).
And when using explicit register names in an asm, prefix each name with
'%|', just in case the assembler dialect has a register name prefix.
The names are somewhat unweildy, can't we use something more usefully
descriptive, like 'pac_nop', 'pacbti_nop' and 'aut_nop', since all these
instructions are using the architectural NOP space.
Finally, I think we need some more tests that cover the various
frame-pointer flavours when used in combination with this feature and
for various corners of the PCS.
R.