Re: [RFC/RFT,V2] CFI: Add support for gcc CFI in aarch64
+ Likun On Tue, 28 Mar 2023 at 06:18, Sami Tolvanen wrote: > > On Mon, Mar 27, 2023 at 2:30 AM Peter Zijlstra wrote: > > > > On Sat, Mar 25, 2023 at 01:54:16AM -0700, Dan Li wrote: > > > > > In the compiler part[4], most of the content is the same as Sami's > > > implementation[3], except for some minor differences, mainly including: > > > > > > 1. The function typeid is calculated differently and it is difficult > > > to be consistent. > > > > This means there is an effective ABI break between the compilers, which > > is sad :-( Is there really nothing to be done about this? > > I agree, this would be unfortunate, and would also be a compatibility > issue with rustc where there's ongoing work to support > clang-compatible CFI type hashes: > > https://github.com/rust-lang/rust/pull/105452 > > Sami
Re: [RFC/RFT, V2 0/3] Add compiler support for Kernel Control Flow Integrity
Hi Kees, Sincerely sorry, I just saw this email. Embarrassingly, due to another job change, my plan was postponed again :(. I may not be able to attend this year's GCC meeting. Is there any other way to let this get some traction in GCC? I really hope someone can help with this topic. BTW, I'm still looking at this and plan to finish it by the end of this year, but it's taking too long and there's a lot of uncertainty, so please just consider this only as a backup option. Thanks, Dan. On Thu, 22 Jun 2023 at 05:54, Kees Cook wrote: > > On Sat, Mar 25, 2023 at 01:11:14AM -0700, Dan Li wrote: > > This series of patches is mainly used to support the control flow > > integrity protection of the linux kernel [1], which is similar to > > -fsanitize=kcfi in clang 16.0 [2,3]. > > > > Any suggestion please let me know :). > > Hi Dan, > > It's been a couple months, and I didn't see any other feedback on this > proposal. I was curious what the status of this work is. Are you able to > attend GNU Cauldron[1] this year? I'd love to see this get some traction > in GCC. > > Thanks! > > -Kees > > [1] https://gcc.gnu.org/wiki/cauldron2023 > > -- > Kees Cook
Re: [RFC/RFT, V2 0/3] Add compiler support for Kernel Control Flow Integrity
Hi All, Embarrassingly, due to personal reasons, I may not be able to complete the series of patches on the forward side of GCC CFI for the time being. Please forgive me for not realizing that I should have sent this help email a long time ago :( This topic has been delayed for a long time, and I would be very grateful if someone can help complete this series of patches. BTW, please let me know if there are more groups I can cc for help. Thanks! Dan. On Sat, 25 Mar 2023 at 16:11, Dan Li wrote: > > This series of patches is mainly used to support the control flow > integrity protection of the linux kernel [1], which is similar to > -fsanitize=kcfi in clang 16.0 [2,3]. > > Any suggestion please let me know :). > > Thanks, Dan. > > [1] > https://lore.kernel.org/all/20220908215504.3686827-1-samitolva...@google.com/ > [2] https://clang.llvm.org/docs/ControlFlowIntegrity.html > [3] https://reviews.llvm.org/D119296 > > Signed-off-by: Dan Li > > --- > Dan Li (3): > [PR102768] flag-types.h (enum sanitize_code): Extend sanitize_code to > 64 bits to support more features > [PR102768] Support CFI: Add basic support for Kernel Control Flow > Integrity > [PR102768] aarch64: Add support for Kernel Control Flow Integrity > > gcc/asan.h| 4 +- > gcc/c-family/c-attribs.cc | 10 +- > gcc/c-family/c-common.h | 2 +- > gcc/c/c-parser.cc | 4 +- > gcc/cfgexpand.cc | 26 ++ > gcc/cgraphunit.cc | 34 +++ > gcc/combine.cc| 1 + > gcc/common.opt| 4 +- > gcc/config/aarch64/aarch64.cc | 166 ++ > gcc/cp/typeck.cc | 2 +- > gcc/doc/invoke.texi | 36 > gcc/doc/tm.texi | 27 ++ > gcc/doc/tm.texi.in| 8 ++ > gcc/dwarf2asm.cc | 2 +- > gcc/emit-rtl.cc | 1 + > gcc/emit-rtl.h| 4 + > gcc/final.cc | 24 - > gcc/flag-types.h | 67 +++--- > gcc/gimple.cc | 11 +++ > gcc/gimple.h | 5 +- > gcc/opt-suggestions.cc| 2 +- > gcc/opts.cc | 26 +++--- > gcc/opts.h| 8 +- > gcc/output.h | 3 + > gcc/reg-notes.def | 1 + > gcc/target.def| 38 > gcc/toplev.cc | 4 + > gcc/tree-cfg.cc | 2 +- > gcc/tree.cc | 144 + > gcc/tree.h| 1 + > gcc/varasm.cc | 26 ++ > 31 files changed, 627 insertions(+), 66 deletions(-) > > -- > 2.17.1 >
[PING] AArch64: add R30_REGNUM into shrink-wrapping separate
Gentile ping for this :), thanks. Link: https://gcc.gnu.org/pipermail/gcc-patches/2022-February/590906.html R30_REGNUM could also be used as a component in shrink-wrapping separate, this patch enables it in aarch64. gcc/ChangeLog: * config/aarch64/aarch64.cc (aarch64_get_separate_components): Remove bitmap clear of R30_REGNUM. (aarch64_components_for_bb): Support R30_REGNUM as a component. gcc/testsuite/ChangeLog: * gcc.target/aarch64/shrink_wrap_separate_1.c: New test. --- gcc/config/aarch64/aarch64.cc | 4 ++-- .../gcc.target/aarch64/shrink_wrap_separate_1.c | 17 + 2 files changed, 19 insertions(+), 2 deletions(-) create mode 100644 gcc/testsuite/gcc.target/aarch64/shrink_wrap_separate_1.c diff --git a/gcc/config/aarch64/aarch64.cc b/gcc/config/aarch64/aarch64.cc index 8bcee8be9eb..6e1589b0312 100644 --- a/gcc/config/aarch64/aarch64.cc +++ b/gcc/config/aarch64/aarch64.cc @@ -8463,7 +8463,6 @@ aarch64_get_separate_components (void) if (reg1 != INVALID_REGNUM) bitmap_clear_bit (components, reg1); - bitmap_clear_bit (components, LR_REGNUM); bitmap_clear_bit (components, SP_REGNUM); return components; @@ -8500,7 +8499,8 @@ aarch64_components_for_bb (basic_block bb) /* GPRs are used in a bb if they are in the IN, GEN, or KILL sets. */ for (unsigned regno = 0; regno <= LAST_SAVED_REGNUM; regno++) if (!fixed_regs[regno] - && !crtl->abi->clobbers_full_reg_p (regno) + && (regno == R30_REGNUM + || !crtl->abi->clobbers_full_reg_p (regno)) && (TEST_HARD_REG_BIT (extra_caller_saves, regno) || bitmap_bit_p (in, regno) || bitmap_bit_p (gen, regno) diff --git a/gcc/testsuite/gcc.target/aarch64/shrink_wrap_separate_1.c b/gcc/testsuite/gcc.target/aarch64/shrink_wrap_separate_1.c new file mode 100644 index 000..34002705ace --- /dev/null +++ b/gcc/testsuite/gcc.target/aarch64/shrink_wrap_separate_1.c @@ -0,0 +1,17 @@ +/* { dg-do compile } */ +/* { dg-options "-O2 -fomit-frame-pointer -fdump-rtl-pro_and_epilogue" } */ + +void f(); + +int g(int x) +{ + if (x == 0) +{ + __asm__ ("":::"x19", "x20"); + return 1; +} + f(); + return 2; +} + +/* { dg-final { scan-rtl-dump {The components we wrap separately are \[sep 30\]} "pro_and_epilogue" } } */ -- 2.17.1
[PING^2] AArch64: add R30_REGNUM into shrink-wrapping separate
Gentile ping for this :), thanks. Link: https://gcc.gnu.org/pipermail/gcc-patches/2022-February/590906.html R30_REGNUM could also be used as a component in shrink-wrapping separate, this patch enables it in aarch64. gcc/ChangeLog: * config/aarch64/aarch64.cc (aarch64_get_separate_components): Remove bitmap clear of R30_REGNUM. (aarch64_components_for_bb): Support R30_REGNUM as a component. gcc/testsuite/ChangeLog: * gcc.target/aarch64/shrink_wrap_separate_1.c: New test. --- gcc/config/aarch64/aarch64.cc | 4 ++-- .../gcc.target/aarch64/shrink_wrap_separate_1.c | 17 + 2 files changed, 19 insertions(+), 2 deletions(-) create mode 100644 gcc/testsuite/gcc.target/aarch64/shrink_wrap_separate_1.c diff --git a/gcc/config/aarch64/aarch64.cc b/gcc/config/aarch64/aarch64.cc index 8bcee8be9eb..6e1589b0312 100644 --- a/gcc/config/aarch64/aarch64.cc +++ b/gcc/config/aarch64/aarch64.cc @@ -8463,7 +8463,6 @@ aarch64_get_separate_components (void) if (reg1 != INVALID_REGNUM) bitmap_clear_bit (components, reg1); - bitmap_clear_bit (components, LR_REGNUM); bitmap_clear_bit (components, SP_REGNUM); return components; @@ -8500,7 +8499,8 @@ aarch64_components_for_bb (basic_block bb) /* GPRs are used in a bb if they are in the IN, GEN, or KILL sets. */ for (unsigned regno = 0; regno <= LAST_SAVED_REGNUM; regno++) if (!fixed_regs[regno] - && !crtl->abi->clobbers_full_reg_p (regno) + && (regno == R30_REGNUM + || !crtl->abi->clobbers_full_reg_p (regno)) && (TEST_HARD_REG_BIT (extra_caller_saves, regno) || bitmap_bit_p (in, regno) || bitmap_bit_p (gen, regno) diff --git a/gcc/testsuite/gcc.target/aarch64/shrink_wrap_separate_1.c b/gcc/testsuite/gcc.target/aarch64/shrink_wrap_separate_1.c new file mode 100644 index 000..34002705ace --- /dev/null +++ b/gcc/testsuite/gcc.target/aarch64/shrink_wrap_separate_1.c @@ -0,0 +1,17 @@ +/* { dg-do compile } */ +/* { dg-options "-O2 -fomit-frame-pointer -fdump-rtl-pro_and_epilogue" } */ + +void f(); + +int g(int x) +{ + if (x == 0) +{ + __asm__ ("":::"x19", "x20"); + return 1; +} + f(); + return 2; +} + +/* { dg-final { scan-rtl-dump {The components we wrap separately are \[sep 30\]} "pro_and_epilogue" } } */ -- 2.17.1
[PING^3] AArch64: add R30 into shrink-wrapping separate
Gentile ping for this :), thanks. Link: https://gcc.gnu.org/pipermail/gcc-patches/2022-February/590906.html R30_REGNUM could also be used as a component in shrink-wrapping separate, this patch enables it in aarch64. gcc/ChangeLog: * config/aarch64/aarch64.cc (aarch64_get_separate_components): Remove bitmap clear of R30_REGNUM. (aarch64_components_for_bb): Support R30_REGNUM as a component. gcc/testsuite/ChangeLog: * gcc.target/aarch64/shrink_wrap_separate_1.c: New test. --- gcc/config/aarch64/aarch64.cc | 4 ++-- .../gcc.target/aarch64/shrink_wrap_separate_1.c | 17 + 2 files changed, 19 insertions(+), 2 deletions(-) create mode 100644 gcc/testsuite/gcc.target/aarch64/shrink_wrap_separate_1.c diff --git a/gcc/config/aarch64/aarch64.cc b/gcc/config/aarch64/aarch64.cc index 8bcee8be9eb..6e1589b0312 100644 --- a/gcc/config/aarch64/aarch64.cc +++ b/gcc/config/aarch64/aarch64.cc @@ -8463,7 +8463,6 @@ aarch64_get_separate_components (void) if (reg1 != INVALID_REGNUM) bitmap_clear_bit (components, reg1); - bitmap_clear_bit (components, LR_REGNUM); bitmap_clear_bit (components, SP_REGNUM); return components; @@ -8500,7 +8499,8 @@ aarch64_components_for_bb (basic_block bb) /* GPRs are used in a bb if they are in the IN, GEN, or KILL sets. */ for (unsigned regno = 0; regno <= LAST_SAVED_REGNUM; regno++) if (!fixed_regs[regno] - && !crtl->abi->clobbers_full_reg_p (regno) + && (regno == R30_REGNUM + || !crtl->abi->clobbers_full_reg_p (regno)) && (TEST_HARD_REG_BIT (extra_caller_saves, regno) || bitmap_bit_p (in, regno) || bitmap_bit_p (gen, regno) diff --git a/gcc/testsuite/gcc.target/aarch64/shrink_wrap_separate_1.c b/gcc/testsuite/gcc.target/aarch64/shrink_wrap_separate_1.c new file mode 100644 index 000..34002705ace --- /dev/null +++ b/gcc/testsuite/gcc.target/aarch64/shrink_wrap_separate_1.c @@ -0,0 +1,17 @@ +/* { dg-do compile } */ +/* { dg-options "-O2 -fomit-frame-pointer -fdump-rtl-pro_and_epilogue" } */ + +void f(); + +int g(int x) +{ + if (x == 0) +{ + __asm__ ("":::"x19", "x20"); + return 1; +} + f(); + return 2; +} + +/* { dg-final { scan-rtl-dump {The components we wrap separately are \[sep 30\]} "pro_and_epilogue" } } */ -- 2.17.1
Re: [PING] AArch64: add R30_REGNUM into shrink-wrapping separate
On 4/12/22 06:05, Richard Sandiford wrote: Dan Li writes: Gentile ping for this :), thanks. Link: https://gcc.gnu.org/pipermail/gcc-patches/2022-February/590906.html Sorry, I should have realised this at the time, but I don't think we can do this after all. The ABI requires us to set up the frame chain before assigning to the frame pointer. Sorry again for the bogus suggestion. Thanks, Richard OK, thanks Richard :) But I think I still don't quite understand it, so could the x30 be shrinked alone when we don't need the frame chain? Thanks, Dan.
[PING^3][PATCH,v2,1/1,AARCH64][PR102768] aarch64: Add compiler support for Shadow Call Stack
Gentile ping for this again. Could someone please give me a quick reply first? :), thanks. Link: https://gcc.gnu.org/pipermail/gcc-patches/2021-December/586204.html Shadow Call Stack can be used to protect the return address of a function at runtime, and clang already supports this feature[1]. To enable SCS in user mode, in addition to compiler, other support is also required (as discussed in [2]). This patch only adds basic support for SCS from the compiler side, and provides convenience for users to enable SCS. For linux kernel, only the support of the compiler is required. [1] https://clang.llvm.org/docs/ShadowCallStack.html [2] https://gcc.gnu.org/bugzilla/show_bug.cgi?id=102768 Signed-off-by: Dan Li gcc/c-family/ChangeLog: * c-attribs.c (handle_no_sanitize_shadow_call_stack_attribute): New. gcc/ChangeLog: * config/aarch64/aarch64-protos.h (aarch64_shadow_call_stack_enabled): New decl. * config/aarch64/aarch64.c (aarch64_shadow_call_stack_enabled): New. (aarch64_expand_prologue): Push x30 onto SCS before it's pushed onto stack. (aarch64_expand_epilogue): Pop x30 frome SCS. * config/aarch64/aarch64.h (TARGET_SUPPORT_SHADOW_CALL_STACK): New macro. (TARGET_CHECK_SCS_RESERVED_REGISTER): Likewise. * config/aarch64/aarch64.md (scs_push): New template. (scs_pop): Likewise. * defaults.h (TARGET_SUPPORT_SHADOW_CALL_STACK):New macro. * doc/extend.texi: Document -fsanitize=shadow-call-stack. * doc/invoke.texi: Document attribute. * flag-types.h (enum sanitize_code):Add SANITIZE_SHADOW_CALL_STACK. * opts-global.c (handle_common_deferred_options): Add SCS compile option check. * opts.c (finish_options): Likewise. gcc/testsuite/ChangeLog: * gcc.target/aarch64/shadow_call_stack_1.c: New test. * gcc.target/aarch64/shadow_call_stack_2.c: New test. * gcc.target/aarch64/shadow_call_stack_3.c: New test. * gcc.target/aarch64/shadow_call_stack_4.c: New test. --- gcc/c-family/c-attribs.c | 21 + gcc/config/aarch64/aarch64-protos.h | 1 + gcc/config/aarch64/aarch64.c | 27 +++ gcc/config/aarch64/aarch64.h | 11 + gcc/config/aarch64/aarch64.md | 18 gcc/defaults.h| 4 ++ gcc/doc/extend.texi | 7 +++ gcc/doc/invoke.texi | 29 gcc/flag-types.h | 2 + gcc/opts-global.c | 6 +++ gcc/opts.c| 12 + .../gcc.target/aarch64/shadow_call_stack_1.c | 6 +++ .../gcc.target/aarch64/shadow_call_stack_2.c | 6 +++ .../gcc.target/aarch64/shadow_call_stack_3.c | 45 +++ .../gcc.target/aarch64/shadow_call_stack_4.c | 18 15 files changed, 213 insertions(+) create mode 100644 gcc/testsuite/gcc.target/aarch64/shadow_call_stack_1.c create mode 100644 gcc/testsuite/gcc.target/aarch64/shadow_call_stack_2.c create mode 100644 gcc/testsuite/gcc.target/aarch64/shadow_call_stack_3.c create mode 100644 gcc/testsuite/gcc.target/aarch64/shadow_call_stack_4.c diff --git a/gcc/c-family/c-attribs.c b/gcc/c-family/c-attribs.c index 007b928c54b..9b3a35c06bf 100644 --- a/gcc/c-family/c-attribs.c +++ b/gcc/c-family/c-attribs.c @@ -56,6 +56,8 @@ static tree handle_cold_attribute (tree *, tree, tree, int, bool *); static tree handle_no_sanitize_attribute (tree *, tree, tree, int, bool *); static tree handle_no_sanitize_address_attribute (tree *, tree, tree, int, bool *); +static tree handle_no_sanitize_shadow_call_stack_attribute (tree *, tree, + tree, int, bool *); static tree handle_no_sanitize_thread_attribute (tree *, tree, tree, int, bool *); static tree handle_no_address_safety_analysis_attribute (tree *, tree, tree, @@ -454,6 +456,10 @@ const struct attribute_spec c_common_attribute_table[] = handle_no_sanitize_attribute, NULL }, { "no_sanitize_address",0, 0, true, false, false, false, handle_no_sanitize_address_attribute, NULL }, + { "no_sanitize_shadow_call_stack", + 0, 0, true, false, false, false, + handle_no_sanitize_shadow_call_stack_attribute, + NULL }, { "no_sanitize_thread", 0, 0, true, false, false, false, handle_no_sanitize_thread_attribute, NULL }, { "no_sanitize_undefined", 0, 0, true, false, false, false, @@ -1175,6 +1181,21 @@ handle_no_sanitize_address
Re: [PING^3][PATCH,v2,1/1,AARCH64][PR102768] aarch64: Add compiler support for Shadow Call Stack
On 1/20/22 04:02, Richard Sandiford wrote: Thanks for the patch and sorry for the (very) slow review. Thanks for the review, Richard :). +/* Handle a "no_sanitize_shadow_call_stack" attribute; arguments as in + struct attribute_spec.handler. */ +static tree +handle_no_sanitize_shadow_call_stack_attribute (tree *node, tree name, + tree, int, bool *no_add_attrs) +{ + *no_add_attrs = true; + if (TREE_CODE (*node) != FUNCTION_DECL) +warning (OPT_Wattributes, "%qE attribute ignored", name); + else +add_no_sanitize_value (*node, SANITIZE_SHADOW_CALL_STACK); + + return NULL_TREE; +} + Do we need this? I think these days the preference is to use the general "no_sanitize" attribute with an argument, which is also the syntax documented on the clang page. We have to support no_sanitize_foo attributes for some of the early sanitiser features, to avoid breaking backwards compatibility, but it doesn't look like clang ever supported no_sanitize_shadow_call_stack. Oh, "no_sanitize" is enough, I will delete this in the next version. +/* Return TRUE if shadow call stack should be enabled for the current + function, otherwise return FALSE. */ + +bool +aarch64_shadow_call_stack_enabled (void) +{ + /* This function should only be called after frame laid out. */ + gcc_assert (cfun->machine->frame.laid_out); + + if (crtl->calls_eh_return) +return false; + + /* We only deal with a function if its LR is pushed onto stack + and attribute no_sanitize_shadow_call_stack is not specified. */ (This would need to be updated if we do drop the specific attribute.) Ok. + /* Push return address to shadow call stack. */ + if (aarch64_shadow_call_stack_enabled ()) + emit_insn (gen_scs_push ()); Formatting nit: should only be indented by two spaces more than the “if”. Same below. Got it. I guess doing it like this means that we also continue to store x30 to the frame in the traditional way. And that's probably necessary, given that the saved x30 forms part of link chain. Yes, we just added an extra insn to the prologue like: + str x30, [x18], #8 stp x29, x30, [sp, #-16]! ... + if (flag_stack_usage_info) current_function_static_stack_size = constant_lower_bound (frame_size); @@ -9066,6 +9089,10 @@ aarch64_expand_epilogue (bool for_sibcall) RTX_FRAME_RELATED_P (insn) = 1; } + /* Pop return address from shadow call stack. */ + if (aarch64_shadow_call_stack_enabled ()) + emit_insn (gen_scs_pop ()); + This looks correct, but following on from the above, I guess we continue to restore x30 from the frame in the traditional way first, and then overwrite it with the SCS value. I think the output code would be slightly neater if we suppressed the first restore of x30. Yes, the current epilogue is like: ... ldp x29, x30, [sp], #16 + ldr x30, [x18, #-8]! I think may be we can keep the two x30 pops here, for two reasons: 1) The implementation of scs in clang is to pop x30 twice, it might be better to be consistent with clang here[1]. 2) If we keep the first restore of x30, it may provide some flexibility for other scenarios. For example, we can directly patch the scs_push/pop insns in the binary to turn it into a binary without scs; [1] https://github.com/llvm/llvm-project/commit/f11eb3ebe77729426e562d7d4d7ebb1d5ff2e7c8 +/* This value represents whether the shadow call stack is implemented on + the target platform. */ +#define TARGET_SUPPORT_SHADOW_CALL_STACK true + +#define TARGET_CHECK_SCS_RESERVED_REGISTER() \ + do { \ +if (!fixed_regs[R18_REGNUM]) \ + error ("%<-fsanitize=shadow-call-stack%> only " \ +"allowed with %<-ffixed-x18%>"); \ + } while (0) + Please make these target hooks instead. The first one can use DEFHOOKPOD. Ok, I will add a field to gcc_target via DEFHOOKPOD. +;; Save X30 in the X18-based POST_INC stack (consistent with clang). +(define_insn "scs_push" + [(set (mem:DI (reg:DI R18_REGNUM)) (reg:DI R30_REGNUM)) + (set (reg:DI R18_REGNUM) (plus:DI (reg:DI R18_REGNUM) (const_int 8)))] + "" + "str\\tx30, [x18], #8" + [(set_attr "type" "store_8")] +) + +;; Load X30 form the X18-based PRE_DEC stack (consistent with clang). +(define_insn "scs_pop" + [(set (reg:DI R18_REGNUM) (minus:DI (reg:DI R18_REGNUM) (const_int 8))) + (set (reg:DI R30_REGNUM) (mem:DI (reg:DI R18_REGNUM)))] + "" + "ldr\\tx30, [x18, #-8]!" + [(set_attr "type" "load_8")] +) + The two SETs here work in parallel, with the define_insn as a whole following a "read input operands, act, write output operands" sequence. The (mem: …) address therefore sees the old value of r18 rather than the new one. Also, RTL rules say that subtractions need to be written as additions of the negative. Oh, sorry, I got it wrong here. I think the pattern would therefore be something l
Re: [PING^3][PATCH,v2,1/1,AARCH64][PR102768] aarch64: Add compiler support for Shadow Call Stack
On 1/25/22 02:19, Richard Sandiford wrote: Dan Li writes: + if (flag_stack_usage_info) current_function_static_stack_size = constant_lower_bound (frame_size); @@ -9066,6 +9089,10 @@ aarch64_expand_epilogue (bool for_sibcall) RTX_FRAME_RELATED_P (insn) = 1; } + /* Pop return address from shadow call stack. */ + if (aarch64_shadow_call_stack_enabled ()) + emit_insn (gen_scs_pop ()); + This looks correct, but following on from the above, I guess we continue to restore x30 from the frame in the traditional way first, and then overwrite it with the SCS value. I think the output code would be slightly neater if we suppressed the first restore of x30. Yes, the current epilogue is like: ... ldp x29, x30, [sp], #16 + ldr x30, [x18, #-8]! I think may be we can keep the two x30 pops here, for two reasons: 1) The implementation of scs in clang is to pop x30 twice, it might be better to be consistent with clang here[1]. This doesn't seem a very compelling reason on its own, unless it's explicitly meant to be observable behaviour. (But then, observed how?) Well, probably sticking to pop x30 twice is not a good idea. AFAICT, there doesn't seem to be an explicit requirement that this behavior must be followed. BTW: Do we still need to emit the .cfi_restore 30 directive here if we want to save a pop x30? (Sorry I'm not familiar enough with DWARF.) Since the aarch64 linux kernel always enables -fno-omit-frame-pointer, the generated assembly code may be as follows: str x30, [x18], 8 ldp x29, x30, [sp], 16 .. ldr x29, [sp], 16 ## Do we still need a .cfi_restore 30 here .cfi_restore 29 .cfi_def_cfa_offset 0 ldr x30, [x18, -8]! ## Or may be a non-CFA based directive here ret 2) If we keep the first restore of x30, it may provide some flexibility for other scenarios. For example, we can directly patch the scs_push/pop insns in the binary to turn it into a binary without scs; Is that a supported (and ideally documented) use case? If it is, then it becomes important for correctness that the compiler emits exactly the opcodes in the original patch. If it isn't, then the compiler can emit other instructions that have the same effect. Oh, no, this is just a little trick that can be used for compatibility. (Maybe some scenarios such as our company sometimes need to be compatible with some non-open source binaries from different manufacturers, two pops could make life easier :). ) If this is not a consideration for our community, please ignore this request. I'd like a definitive ruling on this from the kernel folks before the patch goes in. Ok, I'll cc some kernel folks to make sure I didn't miss something. If binary patching is supposed to be possible then scs_push and scs_pop *do* need to be separate define_insns. But they also need to have some magic unspec that differentiates them from normal pushes and pops, e.g.: (set ...mem... (unspec:DI [...reg...] UNSPEC_SCS_PUSH)) so that there is no chance that the pattern would be treated as a normal move and optimised accordingly. Yeah, this template looks more appropriate if it is to be treated as a special directive. Thanks for your suggestions, Dan
Re: [PING^3][PATCH,v2,1/1,AARCH64][PR102768] aarch64: Add compiler support for Shadow Call Stack
Hi, all, Sorry for bothering. I'm trying to commit aarch64 scs code to the gcc and there is an issue that I'm not sure about, could someone give me some suggestions? (To avoid noise, I did't cc PING^3 [1] to the kernel mail list :) ) When clang enables scs, the following instructions are usually generated: str x30, [x18], 8 ldp x29, x30, [sp], 16 .. ldp x29, x30, [sp], 16 ## x30 pop ldr x30, [x18, -8]! ## x30 pop again ret The x30 register is popped twice here, Richard suggested that we can omit the first x30 pop here. AFAICT, it seems fine and also safe for SCS. But I'm not sure if I'm missing something with the kernel, could someone give some suggestions? The previous discussion can be found here [1]. [1] https://gcc.gnu.org/pipermail/gcc-patches/2022-January/589257.html Thanks a lot! Dan On 1/25/22 22:51, Dan Li wrote: On 1/25/22 02:19, Richard Sandiford wrote: Dan Li writes: + if (flag_stack_usage_info) current_function_static_stack_size = constant_lower_bound (frame_size); @@ -9066,6 +9089,10 @@ aarch64_expand_epilogue (bool for_sibcall) RTX_FRAME_RELATED_P (insn) = 1; } + /* Pop return address from shadow call stack. */ + if (aarch64_shadow_call_stack_enabled ()) + emit_insn (gen_scs_pop ()); + This looks correct, but following on from the above, I guess we continue to restore x30 from the frame in the traditional way first, and then overwrite it with the SCS value. I think the output code would be slightly neater if we suppressed the first restore of x30. Yes, the current epilogue is like: ... ldp x29, x30, [sp], #16 + ldr x30, [x18, #-8]! I think may be we can keep the two x30 pops here, for two reasons: 1) The implementation of scs in clang is to pop x30 twice, it might be better to be consistent with clang here[1]. This doesn't seem a very compelling reason on its own, unless it's explicitly meant to be observable behaviour. (But then, observed how?) Well, probably sticking to pop x30 twice is not a good idea. AFAICT, there doesn't seem to be an explicit requirement that this behavior must be followed. BTW: Do we still need to emit the .cfi_restore 30 directive here if we want to save a pop x30? (Sorry I'm not familiar enough with DWARF.) Since the aarch64 linux kernel always enables -fno-omit-frame-pointer, the generated assembly code may be as follows: str x30, [x18], 8 ldp x29, x30, [sp], 16 .. ldr x29, [sp], 16 ## Do we still need a .cfi_restore 30 here .cfi_restore 29 .cfi_def_cfa_offset 0 ldr x30, [x18, -8]! ## Or may be a non-CFA based directive here ret 2) If we keep the first restore of x30, it may provide some flexibility for other scenarios. For example, we can directly patch the scs_push/pop insns in the binary to turn it into a binary without scs; Is that a supported (and ideally documented) use case? If it is, then it becomes important for correctness that the compiler emits exactly the opcodes in the original patch. If it isn't, then the compiler can emit other instructions that have the same effect. Oh, no, this is just a little trick that can be used for compatibility. (Maybe some scenarios such as our company sometimes need to be compatible with some non-open source binaries from different manufacturers, two pops could make life easier :). ) If this is not a consideration for our community, please ignore this request. I'd like a definitive ruling on this from the kernel folks before the patch goes in. Ok, I'll cc some kernel folks to make sure I didn't miss something. If binary patching is supposed to be possible then scs_push and scs_pop *do* need to be separate define_insns. But they also need to have some magic unspec that differentiates them from normal pushes and pops, e.g.: (set ...mem... (unspec:DI [...reg...] UNSPEC_SCS_PUSH)) so that there is no chance that the pattern would be treated as a normal move and optimised accordingly. Yeah, this template looks more appropriate if it is to be treated as a special directive. Thanks for your suggestions, Dan
Re: [PING^3][PATCH,v2,1/1,AARCH64][PR102768] aarch64: Add compiler support for Shadow Call Stack
Thanks, Ard, On 1/26/22 00:10, Ard Biesheuvel wrote: On Wed, 26 Jan 2022 at 08:53, Dan Li wrote: Hi, all, Sorry for bothering. I'm trying to commit aarch64 scs code to the gcc and there is an issue that I'm not sure about, could someone give me some suggestions? (To avoid noise, I did't cc PING^3 [1] to the kernel mail list :) ) When clang enables scs, the following instructions are usually generated: str x30, [x18], 8 ldp x29, x30, [sp], 16 .. ldp x29, x30, [sp], 16 ## x30 pop ldr x30, [x18, -8]! ## x30 pop again ret The x30 register is popped twice here, Richard suggested that we can omit the first x30 pop here. AFAICT, it seems fine and also safe for SCS. But I'm not sure if I'm missing something with the kernel, could someone give some suggestions? The previous discussion can be found here [1]. [1] https://gcc.gnu.org/pipermail/gcc-patches/2022-January/589257.html As was pointed out in the discussion, binary patching is in fact a concern for the Linux kernel. E.g., Android uses generic binary builds, but we would like to be able to switch between pointer authentication and shadow call stack at boot time, rather than always support both, and take the SCS performance hit on systems that implement PAC as well. However, it seems more straight-forward to patch PACIASP and AUTIASP instructions into SCS push/pop instructions rather than the other way around, as we can force the use of these exact opcodes [in the NOP space]), as well as rely on existing unwind annotations to locate any such instruction in the binary. Well, then I think I don't need to submit a kernel patch to enable SCS for gcc :) BTW: Do we have a plan to submit patches of dynamic patch PAC into the kernel recently? So omitting the load of X30 from the ordinary stack seems fine to me. On 1/25/22 22:51, Dan Li wrote: On 1/25/22 02:19, Richard Sandiford wrote: Well, probably sticking to pop x30 twice is not a good idea. AFAICT, there doesn't seem to be an explicit requirement that Ok, I'll cc some kernel folks to make sure I didn't miss something. To Richard: Sorry for my mistake. Due to binary compatibility issues, SCS related code may not be directly merged into libgcc/glibc, do we still need to add this patch into GCC? (I'd like to finish it if that makes sense). Thanks all for your time! Dan If binary patching is supposed to be possible then scs_push and scs_pop *do* need to be separate define_insns. But they also need to have some magic unspec that differentiates them from normal pushes and pops, e.g.: (set ...mem... (unspec:DI [...reg...] UNSPEC_SCS_PUSH)) so that there is no chance that the pattern would be treated as a normal move and optimised accordingly. Yeah, this template looks more appropriate if it is to be treated as a special directive. Thanks for your suggestions, Dan
Re: [PING^3][PATCH,v2,1/1,AARCH64][PR102768] aarch64: Add compiler support for Shadow Call Stack
On 1/26/22 03:09, Ard Biesheuvel wrote: On Wed, 26 Jan 2022 at 11:40, Dan Li wrote: Thanks, Ard, On 1/26/22 00:10, Ard Biesheuvel wrote: On Wed, 26 Jan 2022 at 08:53, Dan Li wrote: Hi, all, Sorry for bothering. I'm trying to commit aarch64 scs code to the gcc and there is an issue that I'm not sure about, could someone give me some suggestions? (To avoid noise, I did't cc PING^3 [1] to the kernel mail list :) ) When clang enables scs, the following instructions are usually generated: str x30, [x18], 8 ldp x29, x30, [sp], 16 .. ldp x29, x30, [sp], 16 ## x30 pop ldr x30, [x18, -8]! ## x30 pop again ret The x30 register is popped twice here, Richard suggested that we can omit the first x30 pop here. AFAICT, it seems fine and also safe for SCS. But I'm not sure if I'm missing something with the kernel, could someone give some suggestions? The previous discussion can be found here [1]. [1] https://gcc.gnu.org/pipermail/gcc-patches/2022-January/589257.html As was pointed out in the discussion, binary patching is in fact a concern for the Linux kernel. E.g., Android uses generic binary builds, but we would like to be able to switch between pointer authentication and shadow call stack at boot time, rather than always support both, and take the SCS performance hit on systems that implement PAC as well. However, it seems more straight-forward to patch PACIASP and AUTIASP instructions into SCS push/pop instructions rather than the other way around, as we can force the use of these exact opcodes [in the NOP space]), as well as rely on existing unwind annotations to locate any such instruction in the binary. Well, then I think I don't need to submit a kernel patch to enable SCS for gcc :) Not entirely. BTW: Do we have a plan to submit patches of dynamic patch PAC into the kernel recently? At the moment, there are just some ideas floating around. I did implement a proof of concept that uses unwind data, but it hit some issues with cfi_negate_ra_state being emitted imprecisely (GCC) or not at all (Clang) in some cases. Using objtool would be another possibility. So in summary, getting SCS support proper into GCC is definitely worth the effort. Got it! And thanks for the suggestion, Ard :)
[PATCH] [PATCH, v3, 1/1, AARCH64][PR102768] aarch64: Add compiler support for Shadow Call Stack
Shadow Call Stack can be used to protect the return address of a function at runtime, and clang already supports this feature[1]. To enable SCS in user mode, in addition to compiler, other support is also required (as discussed in [2]). This patch only adds basic support for SCS from the compiler side, and provides convenience for users to enable SCS. For linux kernel, only the support of the compiler is required. [1] https://clang.llvm.org/docs/ShadowCallStack.html [2] https://gcc.gnu.org/bugzilla/show_bug.cgi?id=102768 Signed-off-by: Dan Li gcc/ChangeLog: * config/aarch64/aarch64.c (aarch64_layout_frame): Change callee_adjust when scs is enabled. (aarch64_restore_callee_saves): Avoid pop x30 twice when scs is enabled. (aarch64_expand_prologue): Push x30 onto SCS before it's pushed onto stack. (aarch64_expand_epilogue): Pop x30 frome SCS, while preventing it from being popped from the regular stack again. (aarch64_override_options_internal): Add SCS compile option check. (TARGET_HAVE_SHADOW_CALL_STACK): New hook. * config/aarch64/aarch64.h (struct GTY): Add is_scs_enabled. * config/aarch64/aarch64.md (scs_push): New template. (scs_pop): Likewise. * doc/invoke.texi: Document -fsanitize=shadow-call-stack. * doc/tm.texi: Regenerate. * doc/tm.texi.in: Add hook have_shadow_call_stack. * flag-types.h (enum sanitize_code): Add SANITIZE_SHADOW_CALL_STACK. * opts.c: Add shadow-call-stack. * target.def: New hook. * toplev.c (process_options): Add SCS compile option check. gcc/testsuite/ChangeLog: * gcc.target/aarch64/shadow_call_stack_1.c: New test. * gcc.target/aarch64/shadow_call_stack_2.c: New test. * gcc.target/aarch64/shadow_call_stack_3.c: New test. * gcc.target/aarch64/shadow_call_stack_4.c: New test. * gcc.target/aarch64/shadow_call_stack_5.c: New test. * gcc.target/aarch64/shadow_call_stack_6.c: New test. * gcc.target/aarch64/shadow_call_stack_7.c: New test. * gcc.target/aarch64/shadow_call_stack_8.c: New test. --- V3: - Change scs_push/pop to standard move patterns. - Optimize scs_pop to avoid pop x30 twice when shadow stack is enabled. gcc/config/aarch64/aarch64.c | 66 +-- gcc/config/aarch64/aarch64.h | 4 ++ gcc/config/aarch64/aarch64.md | 10 +++ gcc/doc/invoke.texi | 30 + gcc/doc/tm.texi | 5 ++ gcc/doc/tm.texi.in| 2 + gcc/flag-types.h | 2 + gcc/opts.c| 1 + gcc/target.def| 8 +++ .../gcc.target/aarch64/shadow_call_stack_1.c | 6 ++ .../gcc.target/aarch64/shadow_call_stack_2.c | 6 ++ .../gcc.target/aarch64/shadow_call_stack_3.c | 45 + .../gcc.target/aarch64/shadow_call_stack_4.c | 20 ++ .../gcc.target/aarch64/shadow_call_stack_5.c | 18 + .../gcc.target/aarch64/shadow_call_stack_6.c | 18 + .../gcc.target/aarch64/shadow_call_stack_7.c | 18 + .../gcc.target/aarch64/shadow_call_stack_8.c | 24 +++ gcc/toplev.c | 10 +++ 18 files changed, 289 insertions(+), 4 deletions(-) create mode 100644 gcc/testsuite/gcc.target/aarch64/shadow_call_stack_1.c create mode 100644 gcc/testsuite/gcc.target/aarch64/shadow_call_stack_2.c create mode 100644 gcc/testsuite/gcc.target/aarch64/shadow_call_stack_3.c create mode 100644 gcc/testsuite/gcc.target/aarch64/shadow_call_stack_4.c create mode 100644 gcc/testsuite/gcc.target/aarch64/shadow_call_stack_5.c create mode 100644 gcc/testsuite/gcc.target/aarch64/shadow_call_stack_6.c create mode 100644 gcc/testsuite/gcc.target/aarch64/shadow_call_stack_7.c create mode 100644 gcc/testsuite/gcc.target/aarch64/shadow_call_stack_8.c diff --git a/gcc/config/aarch64/aarch64.c b/gcc/config/aarch64/aarch64.c index 699c105a42a..461c205010e 100644 --- a/gcc/config/aarch64/aarch64.c +++ b/gcc/config/aarch64/aarch64.c @@ -79,6 +79,7 @@ #include "tree-ssa-loop-niter.h" #include "fractional-cost.h" #include "rtlanal.h" +#include "asan.h" /* This file should be included last. */ #include "target-def.h" @@ -7478,10 +7479,31 @@ aarch64_layout_frame (void) frame.sve_callee_adjust = 0; frame.callee_offset = 0; + /* Shadow call stack only deal with functions where the LR is pushed + onto the stack and without specifying the "no_sanitize" attribute + with the argument "shadow-call-stack". */ + frame.is_scs_enabled += (!crtl->calls_eh_return + && (sanitize_flags_p (SANITIZE_SHADOW_CALL_STACK) + && known_ge (cfun->machine->frame.reg_offset[LR_REGNUM], 0))); + + /* When sh
Re: [PING^3][PATCH,v2,1/1,AARCH64][PR102768] aarch64: Add compiler support for Shadow Call Stack
Hi, Richard, I have sent out my v3[1], and (probably) fixed the previous issues, please let me know if i got something wrong :) [1] https://gcc.gnu.org/pipermail/gcc-patches/2022-January/589471.html Thanks, Dan. On 1/25/22 02:19, Richard Sandiford wrote: Dan Li writes: + if (flag_stack_usage_info) current_function_static_stack_size = constant_lower_bound (frame_size); @@ -9066,6 +9089,10 @@ aarch64_expand_epilogue (bool for_sibcall) RTX_FRAME_RELATED_P (insn) = 1; } + /* Pop return address from shadow call stack. */ + if (aarch64_shadow_call_stack_enabled ()) + emit_insn (gen_scs_pop ()); + This looks correct, but following on from the above, I guess we continue to restore x30 from the frame in the traditional way first, and then overwrite it with the SCS value. I think the output code would be slightly neater if we suppressed the first restore of x30. Yes, the current epilogue is like: ... ldp x29, x30, [sp], #16 + ldr x30, [x18, #-8]! I think may be we can keep the two x30 pops here, for two reasons: 1) The implementation of scs in clang is to pop x30 twice, it might be better to be consistent with clang here[1]. This doesn't seem a very compelling reason on its own, unless it's explicitly meant to be observable behaviour. (But then, observed how?) 2) If we keep the first restore of x30, it may provide some flexibility for other scenarios. For example, we can directly patch the scs_push/pop insns in the binary to turn it into a binary without scs; Is that a supported (and ideally documented) use case? If it is, then it becomes important for correctness that the compiler emits exactly the opcodes in the original patch. If it isn't, then the compiler can emit other instructions that have the same effect. I'd like a definitive ruling on this from the kernel folks before the patch goes in. If binary patching is supposed to be possible then scs_push and scs_pop *do* need to be separate define_insns. But they also need to have some magic unspec that differentiates them from normal pushes and pops, e.g.: (set ...mem... (unspec:DI [...reg...] UNSPEC_SCS_PUSH)) so that there is no chance that the pattern would be treated as a normal move and optimised accordingly. +;; Save X30 in the X18-based POST_INC stack (consistent with clang). +(define_insn "scs_push" + [(set (mem:DI (reg:DI R18_REGNUM)) (reg:DI R30_REGNUM)) + (set (reg:DI R18_REGNUM) (plus:DI (reg:DI R18_REGNUM) (const_int 8)))] + "" + "str\\tx30, [x18], #8" + [(set_attr "type" "store_8")] +) + +;; Load X30 form the X18-based PRE_DEC stack (consistent with clang). +(define_insn "scs_pop" + [(set (reg:DI R18_REGNUM) (minus:DI (reg:DI R18_REGNUM) (const_int 8))) + (set (reg:DI R30_REGNUM) (mem:DI (reg:DI R18_REGNUM)))] + "" + "ldr\\tx30, [x18, #-8]!" + [(set_attr "type" "load_8")] +) + The two SETs here work in parallel, with the define_insn as a whole following a "read input operands, act, write output operands" sequence. The (mem: …) address therefore sees the old value of r18 rather than the new one. Also, RTL rules say that subtractions need to be written as additions of the negative. Oh, sorry, I got it wrong here. I think the pattern would therefore be something like: [(set (reg:DI R30_REGNUM) (mem:DI (plus:DI (reg:DI R18_REGNUM) (const_int -8] (set (reg:DI R18_REGNUM) (plus:DI (reg:DI R18_REGNUM) (const_int -8)))] However, since these are normal moves, I think we should be able to use the standard move patterns. E.g. the push could be: (define_expand "scs_push" [(set (mem:DI (pre_dec:DI (reg:DI R18_REGNUM))) (reg:DI R30_REGNUM))]) and similarly for the pop. Thanks, this template looks better. Since the push/pop of SCS and normal stack in clang are different (for example, scs push is post_inc, while normal stack is pre_dec), in order to maintain consistency, I think it can be changed to the following: (define_expand "scs_push" [(set (mem:DI (post_inc:DI (reg:DI R18_REGNUM))) (reg:DI R30_REGNUM))]) (define_expand "scs_pop" [(set (reg:DI R30_REGNUM) (mem:DI (pre_dec:DI (reg:DI R18_REGNUM]) Yeah, I copied the wrong name above, sorry. Thanks, Richard
Re: [PING^3][PATCH,v2,1/1,AARCH64][PR102768] aarch64: Add compiler support for Shadow Call Stack
On 1/31/22 08:26, Richard Sandiford wrote: Thanks for the discussion and sorry for the slow reply, was out most of last week. Dan Li writes: Thanks, Ard, On 1/26/22 00:10, Ard Biesheuvel wrote: On Wed, 26 Jan 2022 at 08:53, Dan Li wrote: Hi, all, Sorry for bothering. I'm trying to commit aarch64 scs code to the gcc and there is an issue that I'm not sure about, could someone give me some suggestions? (To avoid noise, I did't cc PING^3 [1] to the kernel mail list :) ) So omitting the load of X30 from the ordinary stack seems fine to me. OK, thanks. Let's go with that for now then. There would still be time to change our minds before GCC 12 is released, if anyone feels that patching SCS code would be useful. Reading it back, I think my previous message came across as sounding like a complaint against binary patching, which wasn't the case at all. I think it would be fine to support patching, even if it was just for a single vendor rather than expected to be a common case. It's just that, if we did want to support it, we'd need to document it as a requirement (at least within GCC) and change the implementation accordingly. Got it, then I'll implement this feature as discussed above and see if we could add additional options for SCS later. Thanks, Dan
Re: [PATCH] [PATCH,v3,1/1,AARCH64][PR102768] aarch64: Add compiler support for Shadow Call Stack
On 1/31/22 09:00, Richard Sandiford wrote: Dan Li writes: Shadow Call Stack can be used to protect the return address of a function at runtime, and clang already supports this feature[1]. /* This file should be included last. */ #include "target-def.h" @@ -7478,10 +7479,31 @@ aarch64_layout_frame (void) frame.sve_callee_adjust = 0; frame.callee_offset = 0; + /* Shadow call stack only deal with functions where the LR is pushed Typo: s/deal/deals/ Sorry for my non-standard English expression :) + onto the stack and without specifying the "no_sanitize" attribute + with the argument "shadow-call-stack". */ + frame.is_scs_enabled += (!crtl->calls_eh_return + && (sanitize_flags_p (SANITIZE_SHADOW_CALL_STACK) + && known_ge (cfun->machine->frame.reg_offset[LR_REGNUM], 0))); Nit, but normal GCC style would be to use a single chain of &&s here: frame.is_scs_enabled = (!crtl->calls_eh_return && sanitize_flags_p (SANITIZE_SHADOW_CALL_STACK) && known_ge (cfun->machine->frame.reg_offset[LR_REGNUM], 0)); Got it. + + /* When shadow call stack is enabled, the scs_pop in the epilogue will + restore x30, and we don't need to pop x30 again in the traditional + way. At this time, if candidate2 is x30, we need to adjust + max_push_offset to 256 to ensure that the offset meets the requirements + of emit_move_insn. Similarly, if candidate1 is x30, we need to set + max_push_offset to 0, because x30 is not popped up at this time, so + callee_adjust cannot be adjusted. */ HOST_WIDE_INT max_push_offset = 0; if (frame.wb_candidate2 != INVALID_REGNUM) -max_push_offset = 512; - else if (frame.wb_candidate1 != INVALID_REGNUM) +{ + if (frame.is_scs_enabled && frame.wb_candidate2 == R30_REGNUM) + max_push_offset = 256; + else + max_push_offset = 512; +} + else if ((frame.wb_candidate1 != INVALID_REGNUM) + && !(frame.is_scs_enabled && frame.wb_candidate1 == R30_REGNUM)) max_push_offset = 256; HOST_WIDE_INT const_size, const_outgoing_args_size, const_fp_offset; Maybe we should instead add separate fields for wb_push_candidate[12] and wb_pop_candidate[12]. The pop candidates would start out the same as the push candidates, but R30_REGNUM would get replaced with INVALID_REGNUM for SCS. This looks more reasonable, I'll change it in the next version. Admittedly, suppressing the restore of x30 is turning out to be a bit more difficult than I'd realised :-/ […] diff --git a/gcc/config/aarch64/aarch64.h b/gcc/config/aarch64/aarch64.h index 2792bb29adb..1610a4fd74c 100644 --- a/gcc/config/aarch64/aarch64.h +++ b/gcc/config/aarch64/aarch64.h @@ -916,6 +916,10 @@ struct GTY (()) aarch64_frame unsigned spare_pred_reg; bool laid_out; + + /* Nonzero if shadow call stack should be enabled for the current + function, otherwise return FALSE. */ “True” seems better than “Nonzero” given that this is a bool. (A lot of GCC bools were originally ints, which is why “nonzero” still appears in non-obvious places.) I think we can just drop “otherwise return FALSE”: “return” doesn't seem appropriate here, given that it's a variable. Got it, thanks for the explanation. Looks great otherwise. Thanks especially for testing the corner cases. :-) One minor thing: +/* { dg-final { scan-assembler-times "str\tx30, \\\[x18\\\], \[#|$\]?8" 2 } } */ +/* { dg-final { scan-assembler-times "ldr\tx30, \\\[x18, \[#|$\]?-8\\\]!" 2 } } */ This sort of regexp can be easier to write if you quote them using {…} rather than "…", since it reduces the number of backslashes needed. E.g.: /* { dg-final { scan-assembler-times {str\tx30, \[x18\], [#|$]?8} 2 } } */ The current version is fine too though, and is widely used. Just mentioning it in case it's useful in future. Oh, thanks Richard, I didn't notice it before. Also, [#|$]? can be written #?. Ok. Thanks, Richard
[PATCH] [PATCH, v4, 1/1, AARCH64][PR102768] aarch64: Add compiler support for Shadow Call Stack
Shadow Call Stack can be used to protect the return address of a function at runtime, and clang already supports this feature[1]. To enable SCS in user mode, in addition to compiler, other support is also required (as discussed in [2]). This patch only adds basic support for SCS from the compiler side, and provides convenience for users to enable SCS. For linux kernel, only the support of the compiler is required. [1] https://clang.llvm.org/docs/ShadowCallStack.html [2] https://gcc.gnu.org/bugzilla/show_bug.cgi?id=102768 Signed-off-by: Dan Li gcc/ChangeLog: * config/aarch64/aarch64.c (SLOT_REQUIRED): Rename wb_candidate[12] to wb_push_candidate[12]. (aarch64_layout_frame): Likewise, and change callee_adjust when scs is enabled. (aarch64_save_callee_saves): Rename wb_candidate[12] to wb_push_candidate[12]. (aarch64_restore_callee_saves): Likewise. (aarch64_get_separate_components): Likewise. (aarch64_expand_prologue): Push x30 onto SCS before it's pushed onto stack. (aarch64_expand_epilogue): Pop x30 frome SCS, while preventing it from being popped from the regular stack again. (aarch64_override_options_internal): Add SCS compile option check. (TARGET_HAVE_SHADOW_CALL_STACK): New hook. * config/aarch64/aarch64.h (struct GTY): Add is_scs_enabled, wb_pop_candidate[12], and rename wb_candidate[12] to wb_push_candidate[12]. * config/aarch64/aarch64.md (scs_push): New template. (scs_pop): Likewise. * doc/invoke.texi: Document -fsanitize=shadow-call-stack. * doc/tm.texi: Regenerate. * doc/tm.texi.in: Add hook have_shadow_call_stack. * flag-types.h (enum sanitize_code): Add SANITIZE_SHADOW_CALL_STACK. * opts.c: Add shadow-call-stack. * target.def: New hook. * toplev.c (process_options): Add SCS compile option check. gcc/testsuite/ChangeLog: * gcc.target/aarch64/shadow_call_stack_1.c: New test. * gcc.target/aarch64/shadow_call_stack_2.c: New test. * gcc.target/aarch64/shadow_call_stack_3.c: New test. * gcc.target/aarch64/shadow_call_stack_4.c: New test. * gcc.target/aarch64/shadow_call_stack_5.c: New test. * gcc.target/aarch64/shadow_call_stack_6.c: New test. * gcc.target/aarch64/shadow_call_stack_7.c: New test. * gcc.target/aarch64/shadow_call_stack_8.c: New test. --- V4: - Added wb_[push|pop]_candidates[12] to ensure push/pop can emit different registers. V3: - Change scs_push/pop to standard move patterns. - Optimize scs_pop to avoid pop x30 twice when shadow stack is enabled. gcc/config/aarch64/aarch64.c | 121 +- gcc/config/aarch64/aarch64.h | 21 ++- gcc/config/aarch64/aarch64.md | 10 ++ gcc/doc/invoke.texi | 30 + gcc/doc/tm.texi | 5 + gcc/doc/tm.texi.in| 2 + gcc/flag-types.h | 2 + gcc/opts.c| 1 + gcc/target.def| 8 ++ .../gcc.target/aarch64/shadow_call_stack_1.c | 6 + .../gcc.target/aarch64/shadow_call_stack_2.c | 6 + .../gcc.target/aarch64/shadow_call_stack_3.c | 45 +++ .../gcc.target/aarch64/shadow_call_stack_4.c | 20 +++ .../gcc.target/aarch64/shadow_call_stack_5.c | 18 +++ .../gcc.target/aarch64/shadow_call_stack_6.c | 18 +++ .../gcc.target/aarch64/shadow_call_stack_7.c | 18 +++ .../gcc.target/aarch64/shadow_call_stack_8.c | 24 gcc/toplev.c | 10 ++ 18 files changed, 332 insertions(+), 33 deletions(-) create mode 100644 gcc/testsuite/gcc.target/aarch64/shadow_call_stack_1.c create mode 100644 gcc/testsuite/gcc.target/aarch64/shadow_call_stack_2.c create mode 100644 gcc/testsuite/gcc.target/aarch64/shadow_call_stack_3.c create mode 100644 gcc/testsuite/gcc.target/aarch64/shadow_call_stack_4.c create mode 100644 gcc/testsuite/gcc.target/aarch64/shadow_call_stack_5.c create mode 100644 gcc/testsuite/gcc.target/aarch64/shadow_call_stack_6.c create mode 100644 gcc/testsuite/gcc.target/aarch64/shadow_call_stack_7.c create mode 100644 gcc/testsuite/gcc.target/aarch64/shadow_call_stack_8.c diff --git a/gcc/config/aarch64/aarch64.c b/gcc/config/aarch64/aarch64.c index 699c105a42a..f4d962917c4 100644 --- a/gcc/config/aarch64/aarch64.c +++ b/gcc/config/aarch64/aarch64.c @@ -79,6 +79,7 @@ #include "tree-ssa-loop-niter.h" #include "fractional-cost.h" #include "rtlanal.h" +#include "asan.h" /* This file should be included last. */ #include "target-def.h" @@ -7291,8 +7292,8 @@ aarch64_layout_frame (void) #define SLOT_NOT_REQUIRED (-2) #define SLOT_REQUIRED (-1) - frame.wb_candidate1 = INVALID_REGNUM; - frame.wb_c
Re: [PATCH] [PATCH,v3,1/1,AARCH64][PR102768] aarch64: Add compiler support for Shadow Call Stack
Hi, Richard, I have sent out my v4[1], please let me know if i got something wrong :). [1] https://gcc.gnu.org/pipermail/gcc-patches/2022-February/589921.html Thanks, Dan. On 1/31/22 09:00, Richard Sandiford wrote: Dan Li writes: Shadow Call Stack can be used to protect the return address of a function at runtime, and clang already supports this feature[1]. To enable SCS in user mode, in addition to compiler, other support is also required (as discussed in [2]). This patch only adds basic support for SCS from the compiler side, and provides convenience for users to enable SCS. For linux kernel, only the support of the compiler is required. [1] https://clang.llvm.org/docs/ShadowCallStack.html [2] https://gcc.gnu.org/bugzilla/show_bug.cgi?id=102768 Signed-off-by: Dan Li gcc/ChangeLog: * config/aarch64/aarch64.c (aarch64_layout_frame): Change callee_adjust when scs is enabled. (aarch64_restore_callee_saves): Avoid pop x30 twice when scs is enabled. (aarch64_expand_prologue): Push x30 onto SCS before it's pushed onto stack. (aarch64_expand_epilogue): Pop x30 frome SCS, while preventing it from being popped from the regular stack again. (aarch64_override_options_internal): Add SCS compile option check. (TARGET_HAVE_SHADOW_CALL_STACK): New hook. * config/aarch64/aarch64.h (struct GTY): Add is_scs_enabled. * config/aarch64/aarch64.md (scs_push): New template. (scs_pop): Likewise. * doc/invoke.texi: Document -fsanitize=shadow-call-stack. * doc/tm.texi: Regenerate. * doc/tm.texi.in: Add hook have_shadow_call_stack. * flag-types.h (enum sanitize_code): Add SANITIZE_SHADOW_CALL_STACK. * opts.c: Add shadow-call-stack. * target.def: New hook. * toplev.c (process_options): Add SCS compile option check. gcc/testsuite/ChangeLog: * gcc.target/aarch64/shadow_call_stack_1.c: New test. * gcc.target/aarch64/shadow_call_stack_2.c: New test. * gcc.target/aarch64/shadow_call_stack_3.c: New test. * gcc.target/aarch64/shadow_call_stack_4.c: New test. * gcc.target/aarch64/shadow_call_stack_5.c: New test. * gcc.target/aarch64/shadow_call_stack_6.c: New test. * gcc.target/aarch64/shadow_call_stack_7.c: New test. * gcc.target/aarch64/shadow_call_stack_8.c: New test. --- V3: - Change scs_push/pop to standard move patterns. - Optimize scs_pop to avoid pop x30 twice when shadow stack is enabled. gcc/config/aarch64/aarch64.c | 66 +-- gcc/config/aarch64/aarch64.h | 4 ++ gcc/config/aarch64/aarch64.md | 10 +++ gcc/doc/invoke.texi | 30 + gcc/doc/tm.texi | 5 ++ gcc/doc/tm.texi.in| 2 + gcc/flag-types.h | 2 + gcc/opts.c| 1 + gcc/target.def| 8 +++ .../gcc.target/aarch64/shadow_call_stack_1.c | 6 ++ .../gcc.target/aarch64/shadow_call_stack_2.c | 6 ++ .../gcc.target/aarch64/shadow_call_stack_3.c | 45 + .../gcc.target/aarch64/shadow_call_stack_4.c | 20 ++ .../gcc.target/aarch64/shadow_call_stack_5.c | 18 + .../gcc.target/aarch64/shadow_call_stack_6.c | 18 + .../gcc.target/aarch64/shadow_call_stack_7.c | 18 + .../gcc.target/aarch64/shadow_call_stack_8.c | 24 +++ gcc/toplev.c | 10 +++ 18 files changed, 289 insertions(+), 4 deletions(-) create mode 100644 gcc/testsuite/gcc.target/aarch64/shadow_call_stack_1.c create mode 100644 gcc/testsuite/gcc.target/aarch64/shadow_call_stack_2.c create mode 100644 gcc/testsuite/gcc.target/aarch64/shadow_call_stack_3.c create mode 100644 gcc/testsuite/gcc.target/aarch64/shadow_call_stack_4.c create mode 100644 gcc/testsuite/gcc.target/aarch64/shadow_call_stack_5.c create mode 100644 gcc/testsuite/gcc.target/aarch64/shadow_call_stack_6.c create mode 100644 gcc/testsuite/gcc.target/aarch64/shadow_call_stack_7.c create mode 100644 gcc/testsuite/gcc.target/aarch64/shadow_call_stack_8.c diff --git a/gcc/config/aarch64/aarch64.c b/gcc/config/aarch64/aarch64.c index 699c105a42a..461c205010e 100644 --- a/gcc/config/aarch64/aarch64.c +++ b/gcc/config/aarch64/aarch64.c @@ -79,6 +79,7 @@ #include "tree-ssa-loop-niter.h" #include "fractional-cost.h" #include "rtlanal.h" +#include "asan.h" /* This file should be included last. */ #include "target-def.h" @@ -7478,10 +7479,31 @@ aarch64_layout_frame (void) frame.sve_callee_adjust = 0; frame.callee_offset = 0; + /* Shadow call stack only deal with functions where the LR is pushed Typo: s/deal/deals/ + onto the stack and without specifying the "no_sa
[AArch64] Question about the condition of calls_alloca in aarch64_layout_frame
There is the following code in aarch64_layout_frame: else if (crtl->outgoing_args_size.is_constant (&const_outgoing_args_size) && frame.saved_regs_size.is_constant (&const_saved_regs_size) && const_outgoing_args_size + const_saved_regs_size < 512 && (!saves_below_hard_fp_p || const_outgoing_args_size == 0) // 1) && !(cfun->calls_alloca && frame.hard_fp_offset.is_constant (&const_fp_offset) && const_fp_offset < max_push_offset)) { /* Frame with small outgoing arguments: sub sp, sp, frame_size stp reg1, reg2, [sp, outgoing_args_size] stp reg3, reg4, [sp, outgoing_args_size + 16] */ frame.initial_adjust = frame.frame_size; frame.callee_offset = const_outgoing_args_size; } .. else if (frame.hard_fp_offset.is_constant (&const_fp_offset) && const_fp_offset < max_push_offset) { // 2) /* Frame with large outgoing arguments or SVE saves, but with a small local area: stp reg1, reg2, [sp, -hard_fp_offset]! stp reg3, reg4, [sp, 16] [sub sp, sp, below_hard_fp_saved_regs_size] [save SVE registers relative to SP] sub sp, sp, outgoing_args_size */ As described in 2), "Frame with large outgoing arguments or SVE saves, but with a small local area". But due to the condition at 1), the following code (small outgoing with a small local area) also uses the insns in 2), which is slightly different from the description: //aarch64-linux-gnu-gcc main.c -O0 -S main.s -g -w -fomit-frame-pointer #include #include #define REP9(X) X,X,X,X,X,X,X,X,X int alloc_size; int main(void) { outgoing (REP9(1)); char * y = alloca(alloc_size); return 0; } Could the condition in 1) be removed, or am I missing something? Thanks, Dan.
Re: [PATCH] [PATCH,v4,1/1,AARCH64][PR102768] aarch64: Add compiler support for Shadow Call Stack
On 2/9/22 08:08, Richard Sandiford wrote: Dan Li writes: + + /* When shadow call stack is enabled, the scs_pop in the epilogue will + restore x30, and we don't need to pop x30 again in the traditional + way. Pop candidates record the registers that need to be popped + eventually. */ + if (frame.is_scs_enabled) +{ + if (frame.wb_push_candidate2 == R30_REGNUM) + frame.wb_pop_candidate2 = INVALID_REGNUM; + else if (frame.wb_push_candidate1 == R30_REGNUM) + frame.wb_pop_candidate1 = INVALID_REGNUM; Although it makes no difference to the behaviour, I think it would be clearer to use pop rather than push in the checks here. Got it. @@ -7885,8 +7914,8 @@ aarch64_save_callee_saves (poly_int64 start_offset, bool frame_related_p = aarch64_emit_cfi_for_reg_p (regno); if (skip_wb - && (regno == cfun->machine->frame.wb_candidate1 - || regno == cfun->machine->frame.wb_candidate2)) + && (regno == cfun->machine->frame.wb_push_candidate1 + || regno == cfun->machine->frame.wb_push_candidate2)) continue; if (cfun->machine->reg_is_wrapped_separately[regno]) @@ -7996,8 +8025,8 @@ aarch64_restore_callee_saves (poly_int64 start_offset, unsigned start, rtx reg, mem; if (skip_wb - && (regno == cfun->machine->frame.wb_candidate1 - || regno == cfun->machine->frame.wb_candidate2)) + && (regno == cfun->machine->frame.wb_push_candidate1 + || regno == cfun->machine->frame.wb_push_candidate2)) Shouldn't this be using pop rather than push? There might be a little difference: - Using push candidates means that a register to be ignored in pop candidates will not be emitted again during the "restore" (pop_candidates should always be a subset of push_candidates, since popping a register without a push might not make sense). - Using pop candidates means that a registers to be ignored in pop candidates will be re-emitted during the "restore". For example, if we specify to ignore the x20 register in pop: --- a/gcc/config/aarch64/aarch64.c +++ b/gcc/config/aarch64/aarch64.c @@ -7502,6 +7502,8 @@ aarch64_layout_frame (void) frame.wb_pop_candidate1 = INVALID_REGNUM; } + if (frame.wb_pop_candidate2 == R20_REGNUM) + frame.wb_pop_candidate2 = INVALID_REGNUM; /* If candidate2 is INVALID_REGNUM, we need to adjust max_push_offset to 256 to ensure that the offset meets the requirements of emit_move_insn. Similarly, if candidate1 is INVALID_REGNUM, we need to set With the test case: int main(void) { __asm__ ("":::"x19", "x20"); return 0; } When we use "pop_candidate[12]", one more insn is emitted: 00400604 : 400604: a9bf53f3stp x19, x20, [sp, #-16]! 400608: 5280mov w0, #0x0 + 40060c: f94007f4ldr x20, [sp, #8] 400610: f84107f3ldr x19, [sp], #16 400614: d65f03c0ret But in the case of ignoring a specific register (like scs ignores x30), there is no difference between the two (because we always need to explicitly specify which registers to ignore in the parameter of aarch64_restore_callee_saves). If pop looks better here, I'd like to change it to pop in the next version :). + /* When shadow call stack is enabled, the scs_pop in the epilogue will + restore x30, we don't need to restore x30 again in the traditional + way. */ + if (cfun->machine->frame.is_scs_enabled) +aarch64_restore_callee_saves (callee_offset - sve_callee_adjust, + R0_REGNUM, R29_REGNUM, + callee_adjust != 0, &cfi_ops); + else +aarch64_restore_callee_saves (callee_offset - sve_callee_adjust, + R0_REGNUM, R30_REGNUM, + callee_adjust != 0, &cfi_ops); + Very minor, but I think it would be better to have: unsigned int last_gpr = (cfun->machine->frame.is_scs_enabled ? R29_REGNUM : R30_REGNUM); so that we don't need to repeat the other arguments. There's then less risk of the two versions getting out of sync. Got it. if (need_barrier_p) emit_insn (gen_stack_tie (stack_pointer_rtx, stack_pointer_rtx)); @@ -9066,6 +9109,17 @@ aarch64_expand_epilogue (bool for_sibcall) RTX_FRAME_RELATED_P (insn) = 1; } + /* Pop return address from shadow call stack. */ + if (cfun->machine->frame.is_scs_enabled) +{ + machine_mode mode = aarch64_reg_save_mode (R30_REGNUM); + rtx reg = gen_rtx_REG (mode, R30_REGNUM); + + insn = emit_insn (gen_scs_pop ()); + add_reg_note (insn, REG_CFA_RESTORE, reg); +
Re: [PATCH] [PATCH,v4,1/1,AARCH64][PR102768] aarch64: Add compiler support for Shadow Call Stack
On 2/10/22 01:55, Richard Sandiford wrote: There might be a little difference: - Using push candidates means that a register to be ignored in pop candidates will not be emitted again during the "restore" (pop_candidates should always be a subset of push_candidates, since popping a register without a push might not make sense). The push candidates are simply a subset of the saved registers though. Similarly, the pop candidates are simply a subset of the restored registers. So I think the requirement operates at that level: the restored registers must be a subset of the saved registers. In other circumstances it could have been the other way around: there might have been a change that stopped us from saving two registers during the allocation, but we wanted to carry on restoring two registers during the deallocation. I don't think there's a reason that the push candidates *have* to be a superset of the pop candidates (even though they are with the current change). Oh yeah, that sounds more reasonable. When we use "pop_candidate[12]", one more insn is emitted: 00400604 : 400604: a9bf53f3stp x19, x20, [sp, #-16]! 400608: 5280mov w0, #0x0 + 40060c: f94007f4ldr x20, [sp, #8] 400610: f84107f3ldr x19, [sp], #16 400614: d65f03c0ret But in the case of ignoring a specific register (like scs ignores x30), there is no difference between the two (because we always need to explicitly specify which registers to ignore in the parameter of aarch64_restore_callee_saves). I think this is the correct behaviour. If we don't want to restore a register at all then it should be excluded from the restore list somehow. In your case you're doing that be using a limit of X29_REGNUM instead of X30_REGNUM. Got it, I'll use pop candidates in the next version. FWIW, I did wonder whether aarch64_restore_callee_saves should be doing the scs pop, rather than aarch64_expand_epilogue, and in an earlier draft of the previous review I'd asked for that. It does seem conceptually cleaner, but in practice, it would probably have been awkward to implement. E.g. we'd need to explicitly stop an LDP being formed with X30 as the second register. Well, then I think I should keep it the same here :). But treating scs push and scs pop as part of the register save and restore sequences would have one advantage: it would allow the scs push and scs pop to be shrink-wrapped. Sorry for my limited knowledge of shrink warping, I don't think I get it here (I tried to find a case when compiling the kernel and some gcc test cases but I still don't have a clue.). I see that the bitmap of LR_REGNUM is cleared in aarch64_get_separate_components and scs push/pop are x18 based operations. If we handle them in aarch64_restore/save_callee_saves, could scs push/pop be shrink-wrapped in some cases? Thanks, Dan
Re: [PATCH] [PATCH,v4,1/1,AARCH64][PR102768] aarch64: Add compiler support for Shadow Call Stack
On 2/11/22 01:53, Richard Sandiford wrote: Dan Li writes: On 2/10/22 01:55, Richard Sandiford wrote: But treating scs push and scs pop as part of the register save and restore sequences would have one advantage: it would allow the scs push and scs pop to be shrink-wrapped. Sorry for my limited knowledge of shrink warping, I don't think I get it here (I tried to find a case when compiling the kernel and some gcc test cases but I still don't have a clue.). I see that the bitmap of LR_REGNUM is cleared in aarch64_get_separate_components and scs push/pop are x18 based operations. If we handle them in aarch64_restore/save_callee_saves, could scs push/pop be shrink-wrapped in some cases? Yeah, I think so. E.g. for: void f(); int g(int x) { if (x == 0) return 1; f(); return 2; } shrink wrapping would allow the scs push and pop to move along with the x30 save: g: cbnzw0, .L9 mov w0, 1 ret .L9: stp x29, x30, [sp, -16]! mov x29, sp bl f mov w0, 2 ldp x29, x30, [sp], 16 ret Thanks Richard, (to make sure I understand correctly :)) I think it means that the current patch could do a "shrink-wapping", but the X30 could not be treat as a "component", now it could gen code like: g: cbnzw0, .L9 mov w0, 1 ret .L9: str x30, [x18], 8 stp x29, x30, [sp, -16]! mov x29, sp bl f ldr x30, [x18, -8]! mov w0, 2 ldr x29, [sp], 16 ret The idea is that aarch64_save_callee_saves would treat the scs push as part of saving x30 (along with the normal store to the frame chain, when used). aarch64_restore_callee_saves would similarly treat the scs pop as the way of restoring x30 (instead of loading from the frame chain). This is in contrast to the current patch, where the scs push and pop are treated as fixed parts of the prologue and epilogue instead, and where aarch64_restore_callee_saves tries to avoid doing anything for x30. If shrink-wrapping decides to treat x30 as a separate “component”, as it does in the example above, then the scs push and pop would be emitted by aarch64_process_components instead. It would be more complex, but it would give better code. Following your idea, I made a poc to add x30 in component bitmap: diff --git a/gcc/config/aarch64/aarch64.cc b/gcc/config/aarch64/aarch64.cc index 35f6f64f5b2..fc9b5e7af54 100644 --- a/gcc/config/aarch64/aarch64.cc +++ b/gcc/config/aarch64/aarch64.cc @@ -8359,7 +8359,7 @@ aarch64_get_separate_components (void) if (reg1 != INVALID_REGNUM) bitmap_clear_bit (components, reg1); - bitmap_clear_bit (components, LR_REGNUM); bitmap_clear_bit (components, SP_REGNUM); return components; @@ -8396,7 +8396,7 @@ aarch64_components_for_bb (basic_block bb) /* GPRs are used in a bb if they are in the IN, GEN, or KILL sets. */ for (unsigned regno = 0; regno <= LAST_SAVED_REGNUM; regno++) if (!fixed_regs[regno] - && !crtl->abi->clobbers_full_reg_p (regno) + && (!crtl->abi->clobbers_full_reg_p (regno) || regno == R30_REGNUM) && (TEST_HARD_REG_BIT (extra_caller_saves, regno) || bitmap_bit_p (in, regno) || bitmap_bit_p (gen, regno) And with a test code compiled with -fno-omit-frame-pointer: void f(); int g(int x) { if (x == 0) { __asm__ ("":::"x19", "x20"); return 1; } f(); return 2; } Then it seems X30 is treat as a "component" (the test result of aarch64.exp also seems fine). g: stp x19, x20, [sp, -32]! cbnzw0, .L2 mov w0, 1 ldp x19, x20, [sp], 32 ret .L2: str x30, [sp, 16] bl f ldr x30, [sp, 16] mov w0, 2 ldp x19, x20, [sp], 32 ret And I think maybe we could handle this through three patches: 1.Keep current patch (a V5) unchanged for scs. 2.Add shrink-warpping for X30: logically this might be a separate topic, and I think more testing might be needed here (Well, I'm a little worried about if there might be other effects, since I just read this part of the code roughly yesterday). 3.Add scs push/pop to shrink-wrapping (and maybe we can do the same for the PAC code in pro/epilogue, since it's also the operation of the X30). Thanks, Dan
Re: [PATCH] [PATCH,v4,1/1,AARCH64][PR102768] aarch64: Add compiler support for Shadow Call Stack
On 2/11/22 07:35, Richard Sandiford wrote: Dan Li writes: On 2/11/22 01:53, Richard Sandiford wrote: Dan Li writes: On 2/10/22 01:55, Richard Sandiford wrote: And I think maybe we could handle this through three patches: 1.Keep current patch (a V5) unchanged for scs. 2.Add shrink-warpping for X30: logically this might be a separate topic, and I think more testing might be needed here (Well, I'm a little worried about if there might be other effects, since I just read this part of the code roughly yesterday). 3.Add scs push/pop to shrink-wrapping (and maybe we can do the same for the PAC code in pro/epilogue, since it's also the operation of the X30). Yeah, that's fair. (Like I said earlier, I wasn't asking for the shrink-wrapping change. It was just a note in passing. But as you point out, the individual shrink-wrapping support would be even more work than I'd imagined.) Got it! Thanks, Dan
[PATCH] [PATCH, v5, 1/1, AARCH64][PR102768] aarch64: Add compiler support for Shadow Call Stack
Shadow Call Stack can be used to protect the return address of a function at runtime, and clang already supports this feature[1]. To enable SCS in user mode, in addition to compiler, other support is also required (as discussed in [2]). This patch only adds basic support for SCS from the compiler side, and provides convenience for users to enable SCS. For linux kernel, only the support of the compiler is required. [1] https://clang.llvm.org/docs/ShadowCallStack.html [2] https://gcc.gnu.org/bugzilla/show_bug.cgi?id=102768 Signed-off-by: Dan Li gcc/ChangeLog: * config/aarch64/aarch64.cc (SLOT_REQUIRED): Change wb_candidate[12] to wb_push_candidate[12]. (aarch64_layout_frame): Likewise, and change callee_adjust when scs is enabled. (aarch64_save_callee_saves): Change wb_candidate[12] to wb_push_candidate[12]. (aarch64_restore_callee_saves): Change wb_candidate[12] to wb_pop_candidate[12]. (aarch64_get_separate_components): Change wb_candidate[12] to wb_push_candidate[12]. (aarch64_expand_prologue): Push x30 onto SCS before it's pushed onto stack. (aarch64_expand_epilogue): Pop x30 frome SCS, while preventing it from being popped from the regular stack again. (aarch64_override_options_internal): Add SCS compile option check. (TARGET_HAVE_SHADOW_CALL_STACK): New hook. * config/aarch64/aarch64.h (struct GTY): Add is_scs_enabled, wb_pop_candidate[12], and rename wb_candidate[12] to wb_push_candidate[12]. * config/aarch64/aarch64.md (scs_push): New template. (scs_pop): Likewise. * doc/invoke.texi: Document -fsanitize=shadow-call-stack. * doc/tm.texi: Regenerate. * doc/tm.texi.in: Add hook have_shadow_call_stack. * flag-types.h (enum sanitize_code): Add SANITIZE_SHADOW_CALL_STACK. * opts.cc: Add shadow-call-stack. * target.def: New hook. * toplev.cc (process_options): Add SCS compile option check. gcc/testsuite/ChangeLog: * gcc.target/aarch64/shadow_call_stack_1.c: New test. * gcc.target/aarch64/shadow_call_stack_2.c: New test. * gcc.target/aarch64/shadow_call_stack_3.c: New test. * gcc.target/aarch64/shadow_call_stack_4.c: New test. * gcc.target/aarch64/shadow_call_stack_5.c: New test. * gcc.target/aarch64/shadow_call_stack_6.c: New test. * gcc.target/aarch64/shadow_call_stack_7.c: New test. * gcc.target/aarch64/shadow_call_stack_8.c: New test. --- V5: - Modify part of wb_push_candidates to wb_pop_candidates. - Rebase to the mainline (20220210). V4: - Added wb_[push|pop]_candidates[12] to ensure push/pop can emit different registers. V3: - Change scs_push/pop to standard move patterns. - Optimize scs_pop to avoid pop x30 twice when shadow stack is enabled. gcc/config/aarch64/aarch64.cc | 113 +- gcc/config/aarch64/aarch64.h | 21 +++- gcc/config/aarch64/aarch64.md | 10 ++ gcc/doc/invoke.texi | 30 + gcc/doc/tm.texi | 5 + gcc/doc/tm.texi.in| 2 + gcc/flag-types.h | 2 + gcc/opts.cc | 1 + gcc/target.def| 8 ++ .../gcc.target/aarch64/shadow_call_stack_1.c | 6 + .../gcc.target/aarch64/shadow_call_stack_2.c | 6 + .../gcc.target/aarch64/shadow_call_stack_3.c | 45 +++ .../gcc.target/aarch64/shadow_call_stack_4.c | 20 .../gcc.target/aarch64/shadow_call_stack_5.c | 18 +++ .../gcc.target/aarch64/shadow_call_stack_6.c | 18 +++ .../gcc.target/aarch64/shadow_call_stack_7.c | 18 +++ .../gcc.target/aarch64/shadow_call_stack_8.c | 24 gcc/toplev.cc | 10 ++ 18 files changed, 326 insertions(+), 31 deletions(-) create mode 100644 gcc/testsuite/gcc.target/aarch64/shadow_call_stack_1.c create mode 100644 gcc/testsuite/gcc.target/aarch64/shadow_call_stack_2.c create mode 100644 gcc/testsuite/gcc.target/aarch64/shadow_call_stack_3.c create mode 100644 gcc/testsuite/gcc.target/aarch64/shadow_call_stack_4.c create mode 100644 gcc/testsuite/gcc.target/aarch64/shadow_call_stack_5.c create mode 100644 gcc/testsuite/gcc.target/aarch64/shadow_call_stack_6.c create mode 100644 gcc/testsuite/gcc.target/aarch64/shadow_call_stack_7.c create mode 100644 gcc/testsuite/gcc.target/aarch64/shadow_call_stack_8.c diff --git a/gcc/config/aarch64/aarch64.cc b/gcc/config/aarch64/aarch64.cc index e3f18fbe7da..35f6f64f5b2 100644 --- a/gcc/config/aarch64/aarch64.cc +++ b/gcc/config/aarch64/aarch64.cc @@ -80,6 +80,7 @@ #include "fractional-cost.h" #include "rtlanal.h" #include "tree-dfa.h" +#include "asan.h" /* This file should be included last. *
Re: [PATCH] [PATCH,v4,1/1,AARCH64][PR102768] aarch64: Add compiler support for Shadow Call Stack
On 2/11/22 07:35, Richard Sandiford wrote: Dan Li writes: On 2/11/22 01:53, Richard Sandiford wrote: Dan Li writes: On 2/10/22 01:55, Richard Sandiford wrote: And I think maybe we could handle this through three patches: 1.Keep current patch (a V5) unchanged for scs. 2.Add shrink-warpping for X30: logically this might be a separate topic, and I think more testing might be needed here (Well, I'm a little worried about if there might be other effects, since I just read this part of the code roughly yesterday). 3.Add scs push/pop to shrink-wrapping (and maybe we can do the same for the PAC code in pro/epilogue, since it's also the operation of the X30). Yeah, that's fair. (Like I said earlier, I wasn't asking for the shrink-wrapping change. It was just a note in passing. But as you point out, the individual shrink-wrapping support would be even more work than I'd imagined.) Hi, Richard, I have sent out the v5[1] and rebased it to mainline at the same time, please let me know if there is anything else I need to do :) [1].https://gcc.gnu.org/pipermail/gcc-patches/2022-February/590311.html Thanks, Dan
Re: [PATCH] [PATCH,v5,1/1,AARCH64][PR102768] aarch64: Add compiler support for Shadow Call Stack
On 2/15/22 10:02, Richard Sandiford wrote: Dan Li writes: Shadow Call Stack can be used to protect the return address of a function at runtime, and clang already supports this feature[1]. Looks good, thanks. However, when I bootstrap it on aarch64-linux-gnu I get: .../gcc/ubsan.cc: In function ‘bool ubsan_expand_null_ifn(gimple_stmt_iterator*)’: .../gcc/ubsan.cc:835:50: error: enumerated and non-enumerated type in conditional expression [-Werror=extra] 835 | = (flag_sanitize_recover & ((check_align ? SANITIZE_ALIGNMENT : 0) | ^~~~ .../gcc/ubsan.cc:836:51: error: enumerated and non-enumerated type in conditional expression [-Werror=extra] 836 | | (check_null ? SANITIZE_NULL : 0))) |~~~^~~ I think this is because you're taking the last available bit of the enum :-) A hacky fix is to add "+ 0" to SANITIZE_ALIGNMENT and SANITIZE_NULL in the code quoted above (i.e. the code in the error messages). That seems slightly more robust than a cast to unsigned int (say), since "+ 0" will work even if the values become 64-bit quantities in future. Richard Ah, apologize for my mistake! I specified --disable-werror in ./configure from the beginning, I didn't see this problem before. As you said, I use the following patch: diff --git a/gcc/ubsan.cc b/gcc/ubsan.cc index 5641d3cc3be..a858994c841 100644 --- a/gcc/ubsan.cc +++ b/gcc/ubsan.cc @@ -832,8 +832,8 @@ ubsan_expand_null_ifn (gimple_stmt_iterator *gsip) else { enum built_in_function bcode - = (flag_sanitize_recover & ((check_align ? SANITIZE_ALIGNMENT : 0) - | (check_null ? SANITIZE_NULL : 0))) + = (flag_sanitize_recover & ((check_align ? SANITIZE_ALIGNMENT + 0 : 0) + | (check_null ? SANITIZE_NULL + 0 : 0))) ? BUILT_IN_UBSAN_HANDLE_TYPE_MISMATCH_V1 : BUILT_IN_UBSAN_HANDLE_TYPE_MISMATCH_V1_ABORT; tree fn = builtin_decl_implicit (bcode); And tested fine in native compiling for x86_64 , I will change it in the next version. BTW: The platform I'm using is x86-64, so I'm trying to find a way to reproduce this issue when cross-compiling with aarch64, which I haven't found so far, the issue only seems to happen with native compilation. But most of the code changes are for the aarch64 platform, is it enough for me to do the following tests before submitting the patch? 1) A full compile of gcc under x86_64 platform (make; make install; make bootstrap;) 2) Test all testsuites in aarch64 cross-compile environment (make -k check) Thanks, Dan
[PATCH] [PATCH, v6, 1/1, AARCH64][PR102768] aarch64: Add compiler support for Shadow Call Stack
Shadow Call Stack can be used to protect the return address of a function at runtime, and clang already supports this feature[1]. To enable SCS in user mode, in addition to compiler, other support is also required (as discussed in [2]). This patch only adds basic support for SCS from the compiler side, and provides convenience for users to enable SCS. For linux kernel, only the support of the compiler is required. [1] https://clang.llvm.org/docs/ShadowCallStack.html [2] https://gcc.gnu.org/bugzilla/show_bug.cgi?id=102768 Signed-off-by: Dan Li gcc/ChangeLog: * config/aarch64/aarch64.cc (SLOT_REQUIRED): Change wb_candidate[12] to wb_push_candidate[12]. (aarch64_layout_frame): Likewise, and change callee_adjust when scs is enabled. (aarch64_save_callee_saves): Change wb_candidate[12] to wb_push_candidate[12]. (aarch64_restore_callee_saves): Change wb_candidate[12] to wb_pop_candidate[12]. (aarch64_get_separate_components): Change wb_candidate[12] to wb_push_candidate[12]. (aarch64_expand_prologue): Push x30 onto SCS before it's pushed onto stack. (aarch64_expand_epilogue): Pop x30 frome SCS, while preventing it from being popped from the regular stack again. (aarch64_override_options_internal): Add SCS compile option check. (TARGET_HAVE_SHADOW_CALL_STACK): New hook. * config/aarch64/aarch64.h (struct GTY): Add is_scs_enabled, wb_pop_candidate[12], and rename wb_candidate[12] to wb_push_candidate[12]. * config/aarch64/aarch64.md (scs_push): New template. (scs_pop): Likewise. * doc/invoke.texi: Document -fsanitize=shadow-call-stack. * doc/tm.texi: Regenerate. * doc/tm.texi.in: Add hook have_shadow_call_stack. * flag-types.h (enum sanitize_code): Add SANITIZE_SHADOW_CALL_STACK. * opts.cc (parse_sanitizer_options): Add shadow-call-stack and exclude SANITIZE_SHADOW_CALL_STACK. * target.def: New hook. * toplev.cc (process_options): Add SCS compile option check. * ubsan.cc (ubsan_expand_null_ifn): Enum type conversion. gcc/testsuite/ChangeLog: * gcc.target/aarch64/shadow_call_stack_1.c: New test. * gcc.target/aarch64/shadow_call_stack_2.c: New test. * gcc.target/aarch64/shadow_call_stack_3.c: New test. * gcc.target/aarch64/shadow_call_stack_4.c: New test. * gcc.target/aarch64/shadow_call_stack_5.c: New test. * gcc.target/aarch64/shadow_call_stack_6.c: New test. * gcc.target/aarch64/shadow_call_stack_7.c: New test. * gcc.target/aarch64/shadow_call_stack_8.c: New test. --- V6: - Exclude SANITIZE_SHADOW_CALL_STACK. - Fix enum type conversion error. - Rebase to the mainline (20220216). V5: - Modify part of wb_push_candidates to wb_pop_candidates. - Rebase to the mainline (20220210). V4: - Added wb_[push|pop]_candidates[12] to ensure push/pop can emit different registers. V3: - Change scs_push/pop to standard move patterns. - Optimize scs_pop to avoid pop x30 twice when shadow stack is enabled. gcc/config/aarch64/aarch64.cc | 113 +- gcc/config/aarch64/aarch64.h | 21 +++- gcc/config/aarch64/aarch64.md | 10 ++ gcc/doc/invoke.texi | 30 + gcc/doc/tm.texi | 5 + gcc/doc/tm.texi.in| 2 + gcc/flag-types.h | 2 + gcc/opts.cc | 4 +- gcc/target.def| 8 ++ .../gcc.target/aarch64/shadow_call_stack_1.c | 6 + .../gcc.target/aarch64/shadow_call_stack_2.c | 6 + .../gcc.target/aarch64/shadow_call_stack_3.c | 45 +++ .../gcc.target/aarch64/shadow_call_stack_4.c | 20 .../gcc.target/aarch64/shadow_call_stack_5.c | 18 +++ .../gcc.target/aarch64/shadow_call_stack_6.c | 18 +++ .../gcc.target/aarch64/shadow_call_stack_7.c | 18 +++ .../gcc.target/aarch64/shadow_call_stack_8.c | 24 gcc/toplev.cc | 10 ++ gcc/ubsan.cc | 4 +- 19 files changed, 330 insertions(+), 34 deletions(-) create mode 100644 gcc/testsuite/gcc.target/aarch64/shadow_call_stack_1.c create mode 100644 gcc/testsuite/gcc.target/aarch64/shadow_call_stack_2.c create mode 100644 gcc/testsuite/gcc.target/aarch64/shadow_call_stack_3.c create mode 100644 gcc/testsuite/gcc.target/aarch64/shadow_call_stack_4.c create mode 100644 gcc/testsuite/gcc.target/aarch64/shadow_call_stack_5.c create mode 100644 gcc/testsuite/gcc.target/aarch64/shadow_call_stack_6.c create mode 100644 gcc/testsuite/gcc.target/aarch64/shadow_call_stack_7.c create mode 100644 gcc/testsuite/gcc.target/aarch64/shadow_call_stack_8.c diff --git a/gcc/config/aarch64/aarch64.cc b/gcc/config/aarch64/aarch64.cc index 37ed22
Re: [PATCH] [PATCH,v5,1/1,AARCH64][PR102768] aarch64: Add compiler support for Shadow Call Stack
On 2/15/22 10:02, Richard Sandiford wrote: Dan Li writes: Shadow Call Stack can be used to protect the return address of a Looks good, thanks. However, when I bootstrap it on aarch64-linux-gnu I get: .../gcc/ubsan.cc: In function ‘bool ubsan_expand_null_ifn(gimple_stmt_iterator*)’: .../gcc/ubsan.cc:835:50: error: enumerated and non-enumerated type in conditional expression [-Werror=extra] 835 | = (flag_sanitize_recover & ((check_align ? SANITIZE_ALIGNMENT : 0) | ^~~~ .../gcc/ubsan.cc:836:51: error: enumerated and non-enumerated type in conditional expression [-Werror=extra] 836 | | (check_null ? SANITIZE_NULL : 0))) |~~~^~~ I think this is because you're taking the last available bit of the enum :-) A hacky fix is to add "+ 0" to SANITIZE_ALIGNMENT and SANITIZE_NULL in the code quoted above (i.e. the code in the error messages). That seems slightly more robust than a cast to unsigned int (say), since "+ 0" will work even if the values become 64-bit quantities in future. Richard Hi, Richard, I have sent out the v6[1]. I re-tested native/cross-compile, and tested all test cases in both cases (make -k check). The test results of this patch are consistent with the current mainline (but in my test environment there are some test cases that fail in both). The current version seems fine to me, please let me know if I'm missing something :) [1]https://gcc.gnu.org/pipermail/gcc-patches/2022-February/590602.html Thanks, Dan
[PATCH] [PATCH] AArch64: add R30_REGNUM into shrink-wrapping separate
R30_REGNUM could also be used as a component in shrink-wrapping separate, this patch enables it in aarch64. gcc/ChangeLog: * config/aarch64/aarch64.cc (aarch64_get_separate_components): Remove bitmap clear of R30_REGNUM. (aarch64_components_for_bb): Support R30_REGNUM as a component. gcc/testsuite/ChangeLog: * gcc.target/aarch64/shrink_wrap_separate_1.c: New test. --- gcc/config/aarch64/aarch64.cc | 4 ++-- .../gcc.target/aarch64/shrink_wrap_separate_1.c | 17 + 2 files changed, 19 insertions(+), 2 deletions(-) create mode 100644 gcc/testsuite/gcc.target/aarch64/shrink_wrap_separate_1.c diff --git a/gcc/config/aarch64/aarch64.cc b/gcc/config/aarch64/aarch64.cc index 8bcee8be9eb..6e1589b0312 100644 --- a/gcc/config/aarch64/aarch64.cc +++ b/gcc/config/aarch64/aarch64.cc @@ -8463,7 +8463,6 @@ aarch64_get_separate_components (void) if (reg1 != INVALID_REGNUM) bitmap_clear_bit (components, reg1); - bitmap_clear_bit (components, LR_REGNUM); bitmap_clear_bit (components, SP_REGNUM); return components; @@ -8500,7 +8499,8 @@ aarch64_components_for_bb (basic_block bb) /* GPRs are used in a bb if they are in the IN, GEN, or KILL sets. */ for (unsigned regno = 0; regno <= LAST_SAVED_REGNUM; regno++) if (!fixed_regs[regno] - && !crtl->abi->clobbers_full_reg_p (regno) + && (regno == R30_REGNUM + || !crtl->abi->clobbers_full_reg_p (regno)) && (TEST_HARD_REG_BIT (extra_caller_saves, regno) || bitmap_bit_p (in, regno) || bitmap_bit_p (gen, regno) diff --git a/gcc/testsuite/gcc.target/aarch64/shrink_wrap_separate_1.c b/gcc/testsuite/gcc.target/aarch64/shrink_wrap_separate_1.c new file mode 100644 index 000..34002705ace --- /dev/null +++ b/gcc/testsuite/gcc.target/aarch64/shrink_wrap_separate_1.c @@ -0,0 +1,17 @@ +/* { dg-do compile } */ +/* { dg-options "-O2 -fomit-frame-pointer -fdump-rtl-pro_and_epilogue" } */ + +void f(); + +int g(int x) +{ + if (x == 0) +{ + __asm__ ("":::"x19", "x20"); + return 1; +} + f(); + return 2; +} + +/* { dg-final { scan-rtl-dump {The components we wrap separately are \[sep 30\]} "pro_and_epilogue" } } */ -- 2.17.1
Re: [PATCH] [PATCH,v4,1/1,AARCH64][PR102768] aarch64: Add compiler support for Shadow Call Stack
On 2/11/22 07:35, Richard Sandiford wrote: Dan Li writes: On 2/11/22 01:53, Richard Sandiford wrote: Dan Li writes: On 2/10/22 01:55, Richard Sandiford wrote: void f(); int g(int x) { if (x == 0) { __asm__ ("":::"x19", "x20"); return 1; } f(); return 2; } Then it seems X30 is treat as a "component" (the test result of aarch64.exp also seems fine). g: stp x19, x20, [sp, -32]! cbnzw0, .L2 mov w0, 1 ldp x19, x20, [sp], 32 ret .L2: str x30, [sp, 16] bl f ldr x30, [sp, 16] mov w0, 2 ldp x19, x20, [sp], 32 ret And I think maybe we could handle this through three patches: 1.Keep current patch (a V5) unchanged for scs. 2.Add shrink-warpping for X30: logically this might be a separate topic, and I think more testing might be needed here (Well, I'm a little worried about if there might be other effects, since I just read this part of the code roughly yesterday). 3.Add scs push/pop to shrink-wrapping (and maybe we can do the same for the PAC code in pro/epilogue, since it's also the operation of the X30). Yeah, that's fair. (Like I said earlier, I wasn't asking for the shrink-wrapping change. It was just a note in passing. But as you point out, the individual shrink-wrapping support would be even more work than I'd imagined.) Hi, Richard, As said before, I try to enable R30 as component[1] in shrink-wrapping separate. The test results of this patch in x86-64 native compile/aarch64 cross-compile are consistent with the mainline (but there are still few use cases not working properly in my test environment). please let me know if I'm missing something :) [1] https://gcc.gnu.org/pipermail/gcc-patches/2022-February/590906.html
Re: [PATCH] [RFC][PR102768] aarch64: Add compiler support for Shadow Call Stack
Hi Szabolcs, First of all, apologies for my late reply (since I just had a new baby, I'm quite busy recently and also because I'm not familiar with C++ exception handling, it takes me some time to learn this part). On 11/3/21 8:00 PM, Szabolcs Nagy wrote: The 11/03/2021 00:24, Dan Li wrote: On 11/2/21 9:04 PM, Szabolcs Nagy wrote: The 11/02/2021 00:06, Dan Li via Gcc-patches wrote: Shadow Call Stack can be used to protect the return address of a function at runtime, and clang already supports this feature[1]. To enable SCS in user mode, in addition to compiler, other support is also required (as described in [2]). This patch only adds basic support for SCS from the compiler side, and provides convenience for users to enable SCS. For linux kernel, only the support of the compiler is required. [1] https://clang.llvm.org/docs/ShadowCallStack.html [2] https://gcc.gnu.org/bugzilla/show_bug.cgi?id=102768 i'm not a gcc maintainer, but i prefer such feature to be in upstream gcc instead of in a plugin. it will require update to the documentation: which should mention that it depends on -ffixed-x18 (probably that should be enforced too) which is an important abi issue: functions following the normal pcs can clobber x18 and break scs. Thanks Szabolcs, I will update the documentation in next version. It sounds reasonable to enforced -ffixed-x18 with scs, but I see that clang doesn’t do that. Maybe it is better to be consistent with clang here? i mean gcc can issue a diagnostic if -ffixed-x18 is not passed. (it seems clang rejects scs too without -ffixed-x18) Oh, yes, you are right. Clang rejects scs without -ffixed-x18[1], I should add a similar check in next version. and that there is no unwinder support. Ok, let me try to add a support for this. i assume exception handling info has to change for scs to work (to pop the shadow stack when transferring control), so either scs must require -fno-exceptions or the eh info changes must be implemented. i think the kernel does not require exceptions and does not depend on the unwinder runtime in libgcc, so this is optional for the linux kernel use-case. I recompiled a glibc and gcc runtime library with -ffixed-x18 enabled. As you said, the scs stack needs to be popped at the same time during exception handling. I saw that Clang is processed by adding ".cfi_escape 0x16, 0x12, 0x02, 0x82, 0x78" directive (x18 -= 8;) after each emit of scs push[2]. But this directive has problems when executed in libgcc: 1)context->reg[x] in uw_init_context_1 are all based on cfa, most registers have no initial values by default. 2)Address of shadow call stack (x18) cannot(and should not) be calculated based on cfa, and I did not yet find a way to assign hardware register x18 to context->reg[18]. 3)This causes libgcc to crash when parsing .cfi_escape exp because of 0 address dereference (* x18) (execute_stack_op => case DW_OP_breg18: _Unwind_GetGR) 4)uw_install_context_1 does not restore all hardware registers by default before eh return, so context->reg[18] can't write directly to hw x18. (In clang, __unw_getcontext/__unw_resume will save/restore all hardware registers, so this directive works fine in my libunwind test.) I tried to fix this problem through a patch[3], the exception handling works fine in my test environment, but I'm not sure if this fix is ppropriate for two reasons: 1)libgcc does not push/pop all registers by default during exception handling. Is this change appropriate? 2)The test case may not be able to test this patch, because the test environment requires at least on glibc/gcc runtime compiled with -ffixed-x18. May be it's better to rely on -fno-exceptions for this patch first? and If the glibc/gcc runtime also supports SCS later, the problem can be fixed at the same time. PS: I'm still not familiar enough with exception handling in libgcc/libunwind, please correct me if there are any mistakes :) [1] https://github.com/llvm/llvm-project/commit/f11eb3ebe77729426e562d7d4d7ebb1d5ff2e7c8 [2] https://reviews.llvm.org/D54609 [3] https://gcc.gnu.org/bugzilla/attachment.cgi?id=51854&action=diff
Re: [PATCH] [RFC][PR102768] aarch64: Add compiler support for Shadow Call Stack
On 11/23/21 6:51 PM, Szabolcs Nagy wrote: The 11/23/2021 16:32, Dan Li wrote: On 11/3/21 8:00 PM, Szabolcs Nagy wrote: i assume exception handling info has to change for scs to work (to pop the shadow stack when transferring control), so either scs must require -fno-exceptions or the eh info changes must be implemented. i think the kernel does not require exceptions and does not depend on the unwinder runtime in libgcc, so this is optional for the linux kernel use-case. I recompiled a glibc and gcc runtime library with -ffixed-x18 enabled. As you said, the scs stack needs to be popped at the same time during exception handling. I saw that Clang is processed by adding ".cfi_escape 0x16, 0x12, 0x02, 0x82, 0x78" directive (x18 -= 8;) after each emit of scs push[2]. But this directive has problems when executed in libgcc: 1)context->reg[x] in uw_init_context_1 are all based on cfa, most registers have no initial values by default. 2)Address of shadow call stack (x18) cannot(and should not) be calculated based on cfa, and I did not yet find a way to assign hardware register x18 to context->reg[18]. 3)This causes libgcc to crash when parsing .cfi_escape exp because of 0 address dereference (* x18) (execute_stack_op => case DW_OP_breg18: _Unwind_GetGR) 4)uw_install_context_1 does not restore all hardware registers by default before eh return, so context->reg[18] can't write directly to hw x18. (In clang, __unw_getcontext/__unw_resume will save/restore all hardware registers, so this directive works fine in my libunwind test.) I tried to fix this problem through a patch[3], the exception handling works fine in my test environment, but I'm not sure if this fix is ppropriate for two reasons: 1)libgcc does not push/pop all registers by default during exception handling. Is this change appropriate? 2)The test case may not be able to test this patch, because the test environment requires at least on glibc/gcc runtime compiled with -ffixed-x18. May be it's better to rely on -fno-exceptions for this patch first? and If the glibc/gcc runtime also supports SCS later, the problem can be fixed at the same time. i did not look at the exception handling in detail (that's difficult to understand for me too). to use scs, non-default abi is required anyway, so not supporting exceptions sounds fine to me. however it should be documented and ideally enforced (-fexceptions should be rejected, just like -fno-fixed-x18). Thanks Szabolcs, This sounds reasonable to me, and I'll fix it in the next version. i assume the linux kernel does not require -fexceptions. AFAIK, -fexceptions are not used in the linux kernel. PS: I'm still not familiar enough with exception handling in libgcc/libunwind, please correct me if there are any mistakes :) [1] https://github.com/llvm/llvm-project/commit/f11eb3ebe77729426e562d7d4d7ebb1d5ff2e7c8 [2] https://reviews.llvm.org/D54609 [3] https://gcc.gnu.org/bugzilla/attachment.cgi?id=51854&action=diff
[PATCH] [RFC, v2, 1/1, AARCH64][PR102768] aarch64: Add compiler support for Shadow Call Stack
Shadow Call Stack can be used to protect the return address of a function at runtime, and clang already supports this feature[1]. To enable SCS in user mode, in addition to compiler, other support is also required (as discussed in [2]). This patch only adds basic support for SCS from the compiler side, and provides convenience for users to enable SCS. For linux kernel, only the support of the compiler is required. [1] https://clang.llvm.org/docs/ShadowCallStack.html [2] https://gcc.gnu.org/bugzilla/show_bug.cgi?id=102768 Signed-off-by: Dan Li gcc/c-family/ChangeLog: * c-attribs.c (handle_no_sanitize_shadow_call_stack_attribute): New. gcc/ChangeLog: * config/aarch64/aarch64-protos.h (aarch64_shadow_call_stack_enabled): New decl. * config/aarch64/aarch64.c (aarch64_shadow_call_stack_enabled): New. (aarch64_expand_prologue): Push x30 onto SCS before it's pushed onto stack. (aarch64_expand_epilogue): Pop x30 frome SCS. * config/aarch64/aarch64.h (TARGET_SUPPORT_SHADOW_CALL_STACK): New macro. (TARGET_CHECK_SCS_RESERVED_REGISTER): Likewise. * config/aarch64/aarch64.md (scs_push): New template. (scs_pop): Likewise. * defaults.h (TARGET_SUPPORT_SHADOW_CALL_STACK):New macro. * doc/extend.texi: Document -fsanitize=shadow-call-stack. * doc/invoke.texi: Document attribute. * flag-types.h (enum sanitize_code):Add SANITIZE_SHADOW_CALL_STACK. * opts-global.c (handle_common_deferred_options): Add SCS compile option check. * opts.c (finish_options): Likewise. gcc/testsuite/ChangeLog: * gcc.target/aarch64/shadow_call_stack_1.c: New test. * gcc.target/aarch64/shadow_call_stack_2.c: New test. * gcc.target/aarch64/shadow_call_stack_3.c: New test. * gcc.target/aarch64/shadow_call_stack_4.c: New test. --- gcc/c-family/c-attribs.c | 21 + gcc/config/aarch64/aarch64-protos.h | 1 + gcc/config/aarch64/aarch64.c | 27 +++ gcc/config/aarch64/aarch64.h | 11 + gcc/config/aarch64/aarch64.md | 18 gcc/defaults.h| 4 ++ gcc/doc/extend.texi | 7 +++ gcc/doc/invoke.texi | 29 gcc/flag-types.h | 2 + gcc/opts-global.c | 6 +++ gcc/opts.c| 12 + .../gcc.target/aarch64/shadow_call_stack_1.c | 6 +++ .../gcc.target/aarch64/shadow_call_stack_2.c | 6 +++ .../gcc.target/aarch64/shadow_call_stack_3.c | 45 +++ .../gcc.target/aarch64/shadow_call_stack_4.c | 18 15 files changed, 213 insertions(+) create mode 100644 gcc/testsuite/gcc.target/aarch64/shadow_call_stack_1.c create mode 100644 gcc/testsuite/gcc.target/aarch64/shadow_call_stack_2.c create mode 100644 gcc/testsuite/gcc.target/aarch64/shadow_call_stack_3.c create mode 100644 gcc/testsuite/gcc.target/aarch64/shadow_call_stack_4.c diff --git a/gcc/c-family/c-attribs.c b/gcc/c-family/c-attribs.c index 007b928c54b..9b3a35c06bf 100644 --- a/gcc/c-family/c-attribs.c +++ b/gcc/c-family/c-attribs.c @@ -56,6 +56,8 @@ static tree handle_cold_attribute (tree *, tree, tree, int, bool *); static tree handle_no_sanitize_attribute (tree *, tree, tree, int, bool *); static tree handle_no_sanitize_address_attribute (tree *, tree, tree, int, bool *); +static tree handle_no_sanitize_shadow_call_stack_attribute (tree *, tree, + tree, int, bool *); static tree handle_no_sanitize_thread_attribute (tree *, tree, tree, int, bool *); static tree handle_no_address_safety_analysis_attribute (tree *, tree, tree, @@ -454,6 +456,10 @@ const struct attribute_spec c_common_attribute_table[] = handle_no_sanitize_attribute, NULL }, { "no_sanitize_address",0, 0, true, false, false, false, handle_no_sanitize_address_attribute, NULL }, + { "no_sanitize_shadow_call_stack", + 0, 0, true, false, false, false, + handle_no_sanitize_shadow_call_stack_attribute, + NULL }, { "no_sanitize_thread", 0, 0, true, false, false, false, handle_no_sanitize_thread_attribute, NULL }, { "no_sanitize_undefined", 0, 0, true, false, false, false, @@ -1175,6 +1181,21 @@ handle_no_sanitize_address_attribute (tree *node, tree name, tree, int, return NULL_TREE; } +/* Handle a "no_sanitize_shadow_call_stack" attribute; arguments as
[PATCH] [PATCH, v2, 1/1, AARCH64][PR102768] aarch64: Add compiler support for Shadow Call Stack
Shadow Call Stack can be used to protect the return address of a function at runtime, and clang already supports this feature[1]. To enable SCS in user mode, in addition to compiler, other support is also required (as discussed in [2]). This patch only adds basic support for SCS from the compiler side, and provides convenience for users to enable SCS. For linux kernel, only the support of the compiler is required. [1] https://clang.llvm.org/docs/ShadowCallStack.html [2] https://gcc.gnu.org/bugzilla/show_bug.cgi?id=102768 Signed-off-by: Dan Li gcc/c-family/ChangeLog: * c-attribs.c (handle_no_sanitize_shadow_call_stack_attribute): New. gcc/ChangeLog: * config/aarch64/aarch64-protos.h (aarch64_shadow_call_stack_enabled): New decl. * config/aarch64/aarch64.c (aarch64_shadow_call_stack_enabled): New. (aarch64_expand_prologue): Push x30 onto SCS before it's pushed onto stack. (aarch64_expand_epilogue): Pop x30 frome SCS. * config/aarch64/aarch64.h (TARGET_SUPPORT_SHADOW_CALL_STACK): New macro. (TARGET_CHECK_SCS_RESERVED_REGISTER): Likewise. * config/aarch64/aarch64.md (scs_push): New template. (scs_pop): Likewise. * defaults.h (TARGET_SUPPORT_SHADOW_CALL_STACK):New macro. * doc/extend.texi: Document -fsanitize=shadow-call-stack. * doc/invoke.texi: Document attribute. * flag-types.h (enum sanitize_code):Add SANITIZE_SHADOW_CALL_STACK. * opts-global.c (handle_common_deferred_options): Add SCS compile option check. * opts.c (finish_options): Likewise. gcc/testsuite/ChangeLog: * gcc.target/aarch64/shadow_call_stack_1.c: New test. * gcc.target/aarch64/shadow_call_stack_2.c: New test. * gcc.target/aarch64/shadow_call_stack_3.c: New test. * gcc.target/aarch64/shadow_call_stack_4.c: New test. --- gcc/c-family/c-attribs.c | 21 + gcc/config/aarch64/aarch64-protos.h | 1 + gcc/config/aarch64/aarch64.c | 27 +++ gcc/config/aarch64/aarch64.h | 11 + gcc/config/aarch64/aarch64.md | 18 gcc/defaults.h| 4 ++ gcc/doc/extend.texi | 7 +++ gcc/doc/invoke.texi | 29 gcc/flag-types.h | 2 + gcc/opts-global.c | 6 +++ gcc/opts.c| 12 + .../gcc.target/aarch64/shadow_call_stack_1.c | 6 +++ .../gcc.target/aarch64/shadow_call_stack_2.c | 6 +++ .../gcc.target/aarch64/shadow_call_stack_3.c | 45 +++ .../gcc.target/aarch64/shadow_call_stack_4.c | 18 15 files changed, 213 insertions(+) create mode 100644 gcc/testsuite/gcc.target/aarch64/shadow_call_stack_1.c create mode 100644 gcc/testsuite/gcc.target/aarch64/shadow_call_stack_2.c create mode 100644 gcc/testsuite/gcc.target/aarch64/shadow_call_stack_3.c create mode 100644 gcc/testsuite/gcc.target/aarch64/shadow_call_stack_4.c diff --git a/gcc/c-family/c-attribs.c b/gcc/c-family/c-attribs.c index 007b928c54b..9b3a35c06bf 100644 --- a/gcc/c-family/c-attribs.c +++ b/gcc/c-family/c-attribs.c @@ -56,6 +56,8 @@ static tree handle_cold_attribute (tree *, tree, tree, int, bool *); static tree handle_no_sanitize_attribute (tree *, tree, tree, int, bool *); static tree handle_no_sanitize_address_attribute (tree *, tree, tree, int, bool *); +static tree handle_no_sanitize_shadow_call_stack_attribute (tree *, tree, + tree, int, bool *); static tree handle_no_sanitize_thread_attribute (tree *, tree, tree, int, bool *); static tree handle_no_address_safety_analysis_attribute (tree *, tree, tree, @@ -454,6 +456,10 @@ const struct attribute_spec c_common_attribute_table[] = handle_no_sanitize_attribute, NULL }, { "no_sanitize_address",0, 0, true, false, false, false, handle_no_sanitize_address_attribute, NULL }, + { "no_sanitize_shadow_call_stack", + 0, 0, true, false, false, false, + handle_no_sanitize_shadow_call_stack_attribute, + NULL }, { "no_sanitize_thread", 0, 0, true, false, false, false, handle_no_sanitize_thread_attribute, NULL }, { "no_sanitize_undefined", 0, 0, true, false, false, false, @@ -1175,6 +1181,21 @@ handle_no_sanitize_address_attribute (tree *node, tree name, tree, int, return NULL_TREE; } +/* Handle a "no_sanitize_shadow_call_stack" attribute; arguments as
Re: [PATCH] [PATCH,v2,1/1,AARCH64][PR102768] aarch64: Add compiler support for Shadow Call Stack
On 12/6/21 10:41 AM, Dan Li wrote: Shadow Call Stack can be used to protect the return address of a function at runtime, and clang already supports this feature[1]. To enable SCS in user mode, in addition to compiler, other support is also required (as discussed in [2]). This patch only adds basic support for SCS from the compiler side, and provides convenience for users to enable SCS. For linux kernel, only the support of the compiler is required. [1] https://clang.llvm.org/docs/ShadowCallStack.html [2] https://gcc.gnu.org/bugzilla/show_bug.cgi?id=102768 Signed-off-by: Dan Li Add more reviewers :)
[PATCH] [PATCH] aarch64:fix redundant check in aut insn generation [PR103017] During the generation of the epilogue of aarch64(aarch64_expand_epilogue), the value of crtl->calls_eh_return does not nee
gcc/ChangeLog: * config/aarch64/aarch64.c (aarch64_expand_epilogue): * config/aarch64/aarch64.md: Signed-off-by: Dan Li --- gcc/config/aarch64/aarch64.c | 6 +- gcc/config/aarch64/aarch64.md | 3 +-- 2 files changed, 2 insertions(+), 7 deletions(-) diff --git a/gcc/config/aarch64/aarch64.c b/gcc/config/aarch64/aarch64.c index 699c105a42a..8448e56443c 100644 --- a/gcc/config/aarch64/aarch64.c +++ b/gcc/config/aarch64/aarch64.c @@ -9076,13 +9076,9 @@ aarch64_expand_epilogue (bool for_sibcall) 2) The RETAA instruction is not available before ARMv8.3-A, so if we are generating code for !TARGET_ARMV8_3 we can't use it and must explicitly authenticate. - - 3) On an eh_return path we make extra stack adjustments to update the - canonical frame address to be the exception handler's CFA. We want - to authenticate using the CFA of the function which calls eh_return. */ if (aarch64_return_address_signing_enabled () - && (for_sibcall || !TARGET_ARMV8_3 || crtl->calls_eh_return)) + && (for_sibcall || !TARGET_ARMV8_3)) { switch (aarch64_ra_sign_key) { diff --git a/gcc/config/aarch64/aarch64.md b/gcc/config/aarch64/aarch64.md index 1a39470a1fe..65ee6159d73 100644 --- a/gcc/config/aarch64/aarch64.md +++ b/gcc/config/aarch64/aarch64.md @@ -879,8 +879,7 @@ (define_insn "*do_return" { const char *ret = NULL; if (aarch64_return_address_signing_enabled () - && (TARGET_PAUTH) - && !crtl->calls_eh_return) + && (TARGET_PAUTH)) { if (aarch64_ra_sign_key == AARCH64_KEY_B) ret = "retab"; -- 2.17.1
[PATCH] [PR103017] aarch64:fix redundant check in aut insn generation
During the generation of the epilogue of aarch64(aarch64_expand_epilogue), the value of crtl->calls_eh_return does not need to be checked again. This value has been checked during aarch64_return_address_signing_enabled. gcc/ChangeLog: * config/aarch64/aarch64.c (aarch64_expand_epilogue): * config/aarch64/aarch64.md: Signed-off-by: Dan Li --- gcc/config/aarch64/aarch64.c | 6 +- gcc/config/aarch64/aarch64.md | 3 +-- 2 files changed, 2 insertions(+), 7 deletions(-) diff --git a/gcc/config/aarch64/aarch64.c b/gcc/config/aarch64/aarch64.c index 699c105a42a..8448e56443c 100644 --- a/gcc/config/aarch64/aarch64.c +++ b/gcc/config/aarch64/aarch64.c @@ -9076,13 +9076,9 @@ aarch64_expand_epilogue (bool for_sibcall) 2) The RETAA instruction is not available before ARMv8.3-A, so if we are generating code for !TARGET_ARMV8_3 we can't use it and must explicitly authenticate. - - 3) On an eh_return path we make extra stack adjustments to update the - canonical frame address to be the exception handler's CFA. We want - to authenticate using the CFA of the function which calls eh_return. */ if (aarch64_return_address_signing_enabled () - && (for_sibcall || !TARGET_ARMV8_3 || crtl->calls_eh_return)) + && (for_sibcall || !TARGET_ARMV8_3)) { switch (aarch64_ra_sign_key) { diff --git a/gcc/config/aarch64/aarch64.md b/gcc/config/aarch64/aarch64.md index 1a39470a1fe..65ee6159d73 100644 --- a/gcc/config/aarch64/aarch64.md +++ b/gcc/config/aarch64/aarch64.md @@ -879,8 +879,7 @@ (define_insn "*do_return" { const char *ret = NULL; if (aarch64_return_address_signing_enabled () - && (TARGET_PAUTH) - && !crtl->calls_eh_return) + && (TARGET_PAUTH)) { if (aarch64_ra_sign_key == AARCH64_KEY_B) ret = "retab"; -- 2.17.1
[PATCH] [RFC][PR102768] aarch64: Add compiler support for Shadow Call Stack
Shadow Call Stack can be used to protect the return address of a function at runtime, and clang already supports this feature[1]. To enable SCS in user mode, in addition to compiler, other support is also required (as described in [2]). This patch only adds basic support for SCS from the compiler side, and provides convenience for users to enable SCS. For linux kernel, only the support of the compiler is required. [1] https://clang.llvm.org/docs/ShadowCallStack.html [2] https://gcc.gnu.org/bugzilla/show_bug.cgi?id=102768 gcc/c-family/ChangeLog: * c-attribs.c (handle_no_sanitize_shadow_call_stack_attribute): gcc/ChangeLog: * config/aarch64/aarch64-protos.h (aarch64_shadow_call_stack_enabled): * config/aarch64/aarch64.c (aarch64_shadow_call_stack_enabled): (aarch64_expand_prologue): (aarch64_expand_epilogue): * config/aarch64/aarch64.h (TARGET_SUPPORT_SHADOW_CALL_STACK): * config/aarch64/aarch64.md (scs_push): (scs_pop): * defaults.h (TARGET_SUPPORT_SHADOW_CALL_STACK): * flag-types.h (enum sanitize_code): * opts.c (finish_options): Signed-off-by: Dan Li --- gcc/c-family/c-attribs.c| 21 + gcc/config/aarch64/aarch64-protos.h | 1 + gcc/config/aarch64/aarch64.c| 27 +++ gcc/config/aarch64/aarch64.h| 4 gcc/config/aarch64/aarch64.md | 18 ++ gcc/defaults.h | 4 gcc/flag-types.h| 2 ++ gcc/opts.c | 6 ++ 8 files changed, 83 insertions(+) diff --git a/gcc/c-family/c-attribs.c b/gcc/c-family/c-attribs.c index 007b928c54b..9b3a35c06bf 100644 --- a/gcc/c-family/c-attribs.c +++ b/gcc/c-family/c-attribs.c @@ -56,6 +56,8 @@ static tree handle_cold_attribute (tree *, tree, tree, int, bool *); static tree handle_no_sanitize_attribute (tree *, tree, tree, int, bool *); static tree handle_no_sanitize_address_attribute (tree *, tree, tree, int, bool *); +static tree handle_no_sanitize_shadow_call_stack_attribute (tree *, tree, + tree, int, bool *); static tree handle_no_sanitize_thread_attribute (tree *, tree, tree, int, bool *); static tree handle_no_address_safety_analysis_attribute (tree *, tree, tree, @@ -454,6 +456,10 @@ const struct attribute_spec c_common_attribute_table[] = handle_no_sanitize_attribute, NULL }, { "no_sanitize_address",0, 0, true, false, false, false, handle_no_sanitize_address_attribute, NULL }, + { "no_sanitize_shadow_call_stack", + 0, 0, true, false, false, false, + handle_no_sanitize_shadow_call_stack_attribute, + NULL }, { "no_sanitize_thread", 0, 0, true, false, false, false, handle_no_sanitize_thread_attribute, NULL }, { "no_sanitize_undefined", 0, 0, true, false, false, false, @@ -1175,6 +1181,21 @@ handle_no_sanitize_address_attribute (tree *node, tree name, tree, int, return NULL_TREE; } +/* Handle a "no_sanitize_shadow_call_stack" attribute; arguments as in + struct attribute_spec.handler. */ +static tree +handle_no_sanitize_shadow_call_stack_attribute (tree *node, tree name, + tree, int, bool *no_add_attrs) +{ + *no_add_attrs = true; + if (TREE_CODE (*node) != FUNCTION_DECL) +warning (OPT_Wattributes, "%qE attribute ignored", name); + else +add_no_sanitize_value (*node, SANITIZE_SHADOW_CALL_STACK); + + return NULL_TREE; +} + /* Handle a "no_sanitize_thread" attribute; arguments as in struct attribute_spec.handler. */ diff --git a/gcc/config/aarch64/aarch64-protos.h b/gcc/config/aarch64/aarch64-protos.h index 768e8fae136..150c015df21 100644 --- a/gcc/config/aarch64/aarch64-protos.h +++ b/gcc/config/aarch64/aarch64-protos.h @@ -893,6 +893,7 @@ void aarch64_register_pragmas (void); void aarch64_relayout_simd_types (void); void aarch64_reset_previous_fndecl (void); bool aarch64_return_address_signing_enabled (void); +bool aarch64_shadow_call_stack_enabled (void); bool aarch64_bti_enabled (void); void aarch64_save_restore_target_globals (tree); void aarch64_addti_scratch_regs (rtx, rtx, rtx *, diff --git a/gcc/config/aarch64/aarch64.c b/gcc/config/aarch64/aarch64.c index 699c105a42a..5a36a459f4e 100644 --- a/gcc/config/aarch64/aarch64.c +++ b/gcc/config/aarch64/aarch64.c @@ -79,6 +79,7 @@ #include "tree-ssa-loop-niter.h" #include "fractional-cost.h" #include "rtlanal.h" +#include "asan.h" /* This file should be included last. */ #include "target-def.h" @@ -7799,6 +7800,24 @
Re: [PATCH] [RFC][PR102768] aarch64: Add compiler support for Shadow Call Stack
On 11/2/21 9:04 PM, Szabolcs Nagy wrote: The 11/02/2021 00:06, Dan Li via Gcc-patches wrote: Shadow Call Stack can be used to protect the return address of a function at runtime, and clang already supports this feature[1]. To enable SCS in user mode, in addition to compiler, other support is also required (as described in [2]). This patch only adds basic support for SCS from the compiler side, and provides convenience for users to enable SCS. For linux kernel, only the support of the compiler is required. [1] https://clang.llvm.org/docs/ShadowCallStack.html [2] https://gcc.gnu.org/bugzilla/show_bug.cgi?id=102768 i'm not a gcc maintainer, but i prefer such feature to be in upstream gcc instead of in a plugin. it will require update to the documentation: which should mention that it depends on -ffixed-x18 (probably that should be enforced too) which is an important abi issue: functions following the normal pcs can clobber x18 and break scs. Thanks Szabolcs, I will update the documentation in next version. It sounds reasonable to enforced -ffixed-x18 with scs, but I see that clang doesn’t do that. Maybe it is better to be consistent with clang here? and that there is no unwinder support. Ok, let me try to add a support for this. the abi issue means it is unlikely to be useful in linux user space (even if libc and unwinder support is implemented), but it can be still useful in freestanding code such as the linux kernel. thanks.
[RFC/RFT 0/3] Add compiler support for Control Flow Integrity
This series of patches is mainly used to support the control flow integrity protection of the linux kernel [1], which is similar to -fsanitize=kcfi in clang 16.0 [2,3]. I hope that this feature will also support user-mode CFI in the future (at least for developers who can recompile the runtime), so I use -fsanitize=cfi as a compilation option here. Any suggestion please let me know :). Thanks, Dan. [1] https://lore.kernel.org/all/20220908215504.3686827-1-samitolva...@google.com/ [2] https://clang.llvm.org/docs/ControlFlowIntegrity.html [3] https://reviews.llvm.org/D119296 Dan Li (3): [PR102768] flag-types.h (enum sanitize_code): Extend sanitize_code to 64 bits to support more features [PR102768] Support CFI: Add new pass for Control Flow Integrity [PR102768] aarch64: Add support for Control Flow Integrity Signed-off-by: Dan Li --- gcc/Makefile.in | 1 + gcc/asan.h| 4 +- gcc/c-family/c-attribs.cc | 10 +- gcc/c-family/c-common.h | 2 +- gcc/c/c-parser.cc | 4 +- gcc/cgraphunit.cc | 34 +++ gcc/common.opt| 4 +- gcc/config/aarch64/aarch64.cc | 106 gcc/cp/typeck.cc | 2 +- gcc/doc/invoke.texi | 35 +++ gcc/doc/passes.texi | 10 + gcc/doc/tm.texi | 27 +++ gcc/doc/tm.texi.in| 8 + gcc/dwarf2asm.cc | 2 +- gcc/flag-types.h | 67 ++--- gcc/opt-suggestions.cc| 2 +- gcc/opts.cc | 26 +- gcc/opts.h| 8 +- gcc/output.h | 3 + gcc/passes.def| 1 + gcc/target.def| 39 +++ .../aarch64/control_flow_integrity_1.c| 14 ++ .../aarch64/control_flow_integrity_2.c| 25 ++ .../aarch64/control_flow_integrity_3.c| 23 ++ gcc/toplev.cc | 4 + gcc/tree-cfg.cc | 2 +- gcc/tree-cfi.cc | 229 ++ gcc/tree-pass.h | 1 + gcc/tree.cc | 144 +++ gcc/tree.h| 1 + gcc/varasm.cc | 29 +++ 31 files changed, 803 insertions(+), 64 deletions(-) create mode 100644 gcc/testsuite/gcc.target/aarch64/control_flow_integrity_1.c create mode 100644 gcc/testsuite/gcc.target/aarch64/control_flow_integrity_2.c create mode 100644 gcc/testsuite/gcc.target/aarch64/control_flow_integrity_3.c create mode 100644 gcc/tree-cfi.cc -- 2.17.1
[RFC/RFT 3/3] [PR102768] aarch64: Add support for Control Flow Integrity
In the AArch64 platform, typeid can be directly inserted in front of the function header (offset is -4). For all functions that will not be called indirectly, insert the reserved RESERVED_CFI_TYPEID (0x0) as typeid in front of them. If not, the attacker may use the instruction/data before the function as typeid to bypass CFI. All typeids ignore some bits (& AARCH64_UNALLOCATED_INSN_MASK) to avoid conflicts with the AArch64 instruction set. Signed-off-by: Dan Li gcc/ChangeLog: PR c/102768 * config/aarch64/aarch64.cc (RESERVED_CFI_TYPEID): Macro definition. (DEFAULT_CFI_TYPEID): Likewise. (AARCH64_UNALLOCATED_INSN_MASK): Likewise. (aarch64_gimple_get_func_cfi_typeid): Platform-dependent CFI function. (aarch64_calc_func_cfi_typeid): Likewise. (cgraph_indirectly_callable): Determine whether a funtion may be called indirectly. (aarch64_output_func_cfi_typeid): Platform-dependent CFI function. (TARGET_HAVE_CFI): New hook. (TARGET_CALC_FUNC_CFI_TYPEID): Likewise. (TARGET_ASM_OUTPUT_FUNC_CFI_TYPEID): Likewise. (TARGET_GIMPLE_GET_FUNC_CFI_TYPEID): Likewise. * doc/invoke.texi: Document -fsanitize=cfi. gcc/testsuite/ChangeLog: * gcc.target/aarch64/control_flow_integrity_1.c: New test. * gcc.target/aarch64/control_flow_integrity_2.c: New test. * gcc.target/aarch64/control_flow_integrity_3.c: New test. --- gcc/config/aarch64/aarch64.cc | 106 ++ gcc/doc/invoke.texi | 35 ++ .../aarch64/control_flow_integrity_1.c| 14 +++ .../aarch64/control_flow_integrity_2.c| 25 + .../aarch64/control_flow_integrity_3.c| 23 5 files changed, 203 insertions(+) create mode 100644 gcc/testsuite/gcc.target/aarch64/control_flow_integrity_1.c create mode 100644 gcc/testsuite/gcc.target/aarch64/control_flow_integrity_2.c create mode 100644 gcc/testsuite/gcc.target/aarch64/control_flow_integrity_3.c diff --git a/gcc/config/aarch64/aarch64.cc b/gcc/config/aarch64/aarch64.cc index 5c9e7791a12..2796df0cdf3 100644 --- a/gcc/config/aarch64/aarch64.cc +++ b/gcc/config/aarch64/aarch64.cc @@ -81,6 +81,7 @@ #include "rtlanal.h" #include "tree-dfa.h" #include "asan.h" +#include "ssa.h" /* This file should be included last. */ #include "target-def.h" @@ -5450,6 +5451,99 @@ aarch64_output_sve_addvl_addpl (rtx offset) return buffer; } +/* Reserved for all functions that cannot be called indirectly. */ +#define RESERVED_CFI_TYPEID 0x0U + +/* If the typeid of a function that can be called indirectly is equal to + RESERVED_CFI_TYPEID, change it to DEFAULT_CFI_TYPEID. */ +#define DEFAULT_CFI_TYPEID 0x0ADAU + +/* Mask of reserved and unallocated instructions in AArch64 platform. */ +#define AARCH64_UNALLOCATED_INSN_MASK 0xE7FFU + +/* Generate gimple insns to return the callee's typeid to a tmp var, + for aarch64, like: + __cfi_tmp = *(fptr - 4); */ + +static tree +aarch64_gimple_get_func_cfi_typeid (gimple_seq *stmts, + location_t loc, tree fptr) +{ + gimple *stmt; + tree result, rhs; + + result = create_tmp_var (integer_type_node, "__cfi_tmp"); + result = make_ssa_name (result, NULL); + + rhs = build_pointer_type (integer_type_node); + rhs = build_int_cst_type (rhs, -4); + rhs = build2 (MEM_REF, integer_type_node, fptr, rhs); + + stmt = gimple_build_assign (result, rhs); + gimple_set_location (stmt, loc); + + SSA_NAME_DEF_STMT (result) = stmt; + + gimple_seq_add_stmt (stmts, stmt); + + return result; +} + +static unsigned int +aarch64_calc_func_cfi_typeid (const_tree fntype) +{ + unsigned int hash; + + /* The value of typeid has a probability of being the same as the encoding + of an instruction. If the attacker can find the same encoding as the + typeid in the assembly code, then he has found a usable jump location. + So here, a platform-related mask is used when generating a typeid to + avoid such conflicts as much as possible. */ + hash = unified_type_hash (fntype) & AARCH64_UNALLOCATED_INSN_MASK; + + /* RESERVED_CFI_TYPEID is reserved for functions that cannot + be called indirectly. */ + if (hash == RESERVED_CFI_TYPEID) +hash = DEFAULT_CFI_TYPEID; + + return hash; +} + +static bool +cgraph_indirectly_callable (struct cgraph_node *node, + void *data ATTRIBUTE_UNUSED) +{ + if (node->externally_visible || node->address_taken) +return true; + + return false; +} + +static void +aarch64_output_func_cfi_typeid (FILE * stream, tree decl) +{ + struct cgraph_node *node; + unsigned int cur_func_typeid; + + node = cgraph_node::get (decl); + + if (!node->call_for_symbol_thunks_and_aliases (cgraph_indirectly_callable, + NULL, true)) +/* CF
[RFC/RFT 1/3] [PR102768] flag-types.h (enum sanitize_code): Extend sanitize_code to 64 bits to support more features
32-bit sanitize_code can no longer accommodate new options, extending it to 64-bit. Signed-off-by: Dan Li gcc/ChangeLog: PR c/102768 * asan.h (sanitize_flags_p): Promote to uint64_t. * common.opt: Likewise. * dwarf2asm.cc (dw2_output_indirect_constant_1): Likewise. * flag-types.h (enum sanitize_code): Likewise. * opt-suggestions.cc (option_proposer::build_option_suggestions): Likewise. * opts.cc (find_sanitizer_argument): Likewise. (report_conflicting_sanitizer_options): Likewise. (get_closest_sanitizer_option): Likewise. (parse_sanitizer_options): Likewise. (parse_no_sanitize_attribute): Likewise. * opts.h (parse_sanitizer_options): Likewise. (parse_no_sanitize_attribute): Likewise. * tree-cfg.cc (print_no_sanitize_attr_value): Likewise. gcc/c-family/ChangeLog: * c-attribs.cc (add_no_sanitize_value): Likewise. (handle_no_sanitize_attribute): Likewise. * c-common.h (add_no_sanitize_value): Likewise. gcc/c/ChangeLog: * c-parser.cc (c_parser_declaration_or_fndef): Likewise. gcc/cp/ChangeLog: * typeck.cc (get_member_function_from_ptrfunc): Likewise. --- gcc/asan.h| 4 +-- gcc/c-family/c-attribs.cc | 10 +++--- gcc/c-family/c-common.h | 2 +- gcc/c/c-parser.cc | 4 +-- gcc/common.opt| 4 +-- gcc/cp/typeck.cc | 2 +- gcc/dwarf2asm.cc | 2 +- gcc/flag-types.h | 65 --- gcc/opt-suggestions.cc| 2 +- gcc/opts.cc | 22 ++--- gcc/opts.h| 8 ++--- gcc/tree-cfg.cc | 2 +- 12 files changed, 64 insertions(+), 63 deletions(-) diff --git a/gcc/asan.h b/gcc/asan.h index d4ea49cb240..5b98172549b 100644 --- a/gcc/asan.h +++ b/gcc/asan.h @@ -233,9 +233,9 @@ asan_protect_stack_decl (tree decl) remove all flags mentioned in "no_sanitize" of DECL_ATTRIBUTES. */ static inline bool -sanitize_flags_p (unsigned int flag, const_tree fn = current_function_decl) +sanitize_flags_p (uint64_t flag, const_tree fn = current_function_decl) { - unsigned int result_flags = flag_sanitize & flag; + uint64_t result_flags = flag_sanitize & flag; if (result_flags == 0) return false; diff --git a/gcc/c-family/c-attribs.cc b/gcc/c-family/c-attribs.cc index 111a33f405a..a73e2364525 100644 --- a/gcc/c-family/c-attribs.cc +++ b/gcc/c-family/c-attribs.cc @@ -1118,23 +1118,23 @@ handle_cold_attribute (tree *node, tree name, tree ARG_UNUSED (args), /* Add FLAGS for a function NODE to no_sanitize_flags in DECL_ATTRIBUTES. */ void -add_no_sanitize_value (tree node, unsigned int flags) +add_no_sanitize_value (tree node, uint64_t flags) { tree attr = lookup_attribute ("no_sanitize", DECL_ATTRIBUTES (node)); if (attr) { - unsigned int old_value = tree_to_uhwi (TREE_VALUE (attr)); + uint64_t old_value = tree_to_uhwi (TREE_VALUE (attr)); flags |= old_value; if (flags == old_value) return; - TREE_VALUE (attr) = build_int_cst (unsigned_type_node, flags); + TREE_VALUE (attr) = build_int_cst (long_long_unsigned_type_node, flags); } else DECL_ATTRIBUTES (node) = tree_cons (get_identifier ("no_sanitize"), - build_int_cst (unsigned_type_node, flags), + build_int_cst (long_long_unsigned_type_node, flags), DECL_ATTRIBUTES (node)); } @@ -1145,7 +1145,7 @@ static tree handle_no_sanitize_attribute (tree *node, tree name, tree args, int, bool *no_add_attrs) { - unsigned int flags = 0; + uint64_t flags = 0; *no_add_attrs = true; if (TREE_CODE (*node) != FUNCTION_DECL) { diff --git a/gcc/c-family/c-common.h b/gcc/c-family/c-common.h index 52a85bfb783..eb91b9703db 100644 --- a/gcc/c-family/c-common.h +++ b/gcc/c-family/c-common.h @@ -1500,7 +1500,7 @@ extern enum flt_eval_method excess_precision_mode_join (enum flt_eval_method, enum flt_eval_method); extern int c_flt_eval_method (bool ts18661_p); -extern void add_no_sanitize_value (tree node, unsigned int flags); +extern void add_no_sanitize_value (tree node, uint64_t flags); extern void maybe_add_include_fixit (rich_location *, const char *, bool); extern void maybe_suggest_missing_token_insertion (rich_location *richloc, diff --git a/gcc/c/c-parser.cc b/gcc/c/c-parser.cc index f679d53706a..9d55ea55fa6 100644 --- a/gcc/c/c-parser.cc +++ b/gcc/c/c-parser.cc @@ -2217,7 +2217,7 @@ c_parser_declaration_or_fndef (c_parser *parser, bool fndef_ok, start_init (NULL_TREE, asm_name, global_bindings_p (), &richloc); /* A parameter is initialized, which is invalid. Don't attempt to instrument the initializer. */ - int flag_sanitize_save = flag_sanitize; +
[RFC/RFT 2/3] [PR102768] Support CFI: Add new pass for Control Flow Integrity
The CFI sanitizer enabled with -fsanitize=cfi implements a forward edge control flow integrity scheme for indirect calls, roughly similar to -fsanitize=kcfi [1] in llvm. At compile time, it appends a uniform type identifier before the first instruction of each function and inserts check code before each indirect call in a function with protection enabled. At runtime, according to the code order, the check code for each indirect call will be executed first, and it will: 1. Dynamically obtain the typeid before the callee function. 2. Compare it to the expected typeid of the current call site (caller). 3. If the two match, continue to execute the indirect call, if not, call the user-defined callback function cfi_check_failed. A typeid (type identifier) is a 32-bit constant on all platforms, whose value depends on the function's prototype, and is invariant across compilation units. However, different platforms may ignore some of the bits to avoid conflicts with instructions. If a program contains indirect calls to assembly functions, they must be manually annotated with the expected type identifiers to prevent errors. To make this easier, gcc generates a weak SHN_ABS __cfi_typeid_ symbol for each address-taken function declaration, which can be used to annotate functions in assembly as long as at least one translation unit linked into the program takes the function address. It should be noted that on different platforms, the location of typeid insertion (the offset between it and the function header) may be different, such as [1], and this patch only implements the platform-independent part. [1]: https://reviews.llvm.org/D119296 Signed-off-by: Dan Li gcc/ChangeLog: PR c/102768 * Makefile.in: Add tree-cfi.o. * cgraphunit.cc (output_decl_cfi_typeid_symbol): Output the CFI typeid corresponding to each external declaration when necessary. (output_decl_cfi_typeid_symbols): Likewise. * doc/passes.texi: Document it. * doc/tm.texi: Regenerate. * doc/tm.texi.in: New hooks. * flag-types.h (enum sanitize_code): Add SANITIZE_CONTROL_FLOW_INTEGRITY. * opts.cc (parse_sanitizer_options): Add cfi and exclude SANITIZE_CONTROL_FLOW_INTEGRITY. * output.h (default_output_func_cfi_typeid): Declare. (default_calc_func_cfi_typeid): Declare. (default_gimple_get_func_cfi_typeid): Declare. * passes.def: Add pass_cfi. * target.def: Add new hooks. * toplev.cc (process_options): Add CFI compile option check. * tree-pass.h (make_pass_cfi): Declare. * tree.cc (tree_node_sizes[): Add the unified tree type hash calculation functions. (append_unified_type_hash): Likewise. (initialize_unified_tree_type_hash_table): Likewise. (append_unified_type_name_hash): Likewise. (append_unified_type_precision_hash): Likewise. (append_unified_function_ret_and_args_hash): Likewise. (unified_type_hash): Likewise. (init_ttree): Likewise. * tree.h (unified_type_hash): Declare. * varasm.cc (assemble_start_function): Output the CFI typeid of each function. (default_output_func_cfi_typeid): New. (default_gimple_get_func_cfi_typeid): New. (default_calc_func_cfi_typeid): New. * tree-cfi.cc: New file. --- gcc/Makefile.in | 1 + gcc/cgraphunit.cc | 34 +++ gcc/doc/passes.texi | 10 ++ gcc/doc/tm.texi | 27 ++ gcc/doc/tm.texi.in | 8 ++ gcc/flag-types.h| 2 + gcc/opts.cc | 4 +- gcc/output.h| 3 + gcc/passes.def | 1 + gcc/target.def | 39 gcc/toplev.cc | 4 + gcc/tree-cfi.cc | 229 gcc/tree-pass.h | 1 + gcc/tree.cc | 144 gcc/tree.h | 1 + gcc/varasm.cc | 29 ++ 16 files changed, 536 insertions(+), 1 deletion(-) create mode 100644 gcc/tree-cfi.cc diff --git a/gcc/Makefile.in b/gcc/Makefile.in index 31ff95500c9..0d23bad6b63 100644 --- a/gcc/Makefile.in +++ b/gcc/Makefile.in @@ -1610,6 +1610,7 @@ OBJS = \ tree-call-cdce.o \ tree-cfg.o \ tree-cfgcleanup.o \ + tree-cfi.o \ tree-chrec.o \ tree-complex.o \ tree-data-ref.o \ diff --git a/gcc/cgraphunit.cc b/gcc/cgraphunit.cc index 76d541755b8..fb4999559ae 100644 --- a/gcc/cgraphunit.cc +++ b/gcc/cgraphunit.cc @@ -,6 +,37 @@ ipa_passes (void) bitmap_obstack_release (NULL); } +/* Output a weak symbol value of a decl's typeid (hash) to the + assembly file, like: + .weak __cfi_typeid_A + .set __cfi_typeid_A, 0x0ADA + typeid is platform-dependent, because the bits in typeid that conflicts + with the instruction set of the current platform needs to be ignored. */ + +static void +output_decl_cfi_typeid_symbol (FILE *stream, tree fndecl) +{ + unsigne
Re: [RFC/RFT 0/3] Add compiler support for Control Flow Integrity
On 02/09, Hongtao Liu wrote: > On Mon, Dec 19, 2022 at 3:59 PM Dan Li via Gcc-patches > wrote: > > > > This series of patches is mainly used to support the control flow > > integrity protection of the linux kernel [1], which is similar to > > -fsanitize=kcfi in clang 16.0 [2,3]. > > > > I hope that this feature will also support user-mode CFI in the > > future (at least for developers who can recompile the runtime), > > so I use -fsanitize=cfi as a compilation option here. > > > > Any suggestion please let me know :). > Do you have this series as a branch somewhere that we could also try for x86? Hi Hongtao, I haven't tried this feature on the x86 platform, if possible, I will try it in the next version. Thanks, Dan. > -- > BR, > Hongtao
Re: [RFC/RFT 0/3] Add compiler support for Control Flow Integrity
On 02/08, Peter Collingbourne wrote: > On Sun, Dec 18, 2022 at 10:06 PM Dan Li wrote: > > > > This series of patches is mainly used to support the control flow > > integrity protection of the linux kernel [1], which is similar to > > -fsanitize=kcfi in clang 16.0 [2,3]. > > > > I hope that this feature will also support user-mode CFI in the > > future (at least for developers who can recompile the runtime), > > so I use -fsanitize=cfi as a compilation option here. > > Please don't. The various CFI-related build flags are confusing enough > without also having this inconsistency between Clang and GCC. Hi Peter, Got it, as discussed before[1], in the next version I will use the same compile option. [1]. https://patchwork.kernel.org/project/linux-arm-kernel/patch/20221219061758.23321-1-ashimida.1...@gmail.com/ Thanks, Dan. > > Peter
[PING][PATCH, v2, 1/1, AARCH64][PR102768] aarch64: Add compiler support for Shadow Call Stack
Gentile ping for this :), thanks. Link: https://gcc.gnu.org/pipermail/gcc-patches/2021-December/586204.html Shadow Call Stack can be used to protect the return address of a function at runtime, and clang already supports this feature[1]. To enable SCS in user mode, in addition to compiler, other support is also required (as discussed in [2]). This patch only adds basic support for SCS from the compiler side, and provides convenience for users to enable SCS. For linux kernel, only the support of the compiler is required. [1] https://clang.llvm.org/docs/ShadowCallStack.html [2] https://gcc.gnu.org/bugzilla/show_bug.cgi?id=102768 Signed-off-by: Dan Li gcc/c-family/ChangeLog: * c-attribs.c (handle_no_sanitize_shadow_call_stack_attribute): New. gcc/ChangeLog: * config/aarch64/aarch64-protos.h (aarch64_shadow_call_stack_enabled): New decl. * config/aarch64/aarch64.c (aarch64_shadow_call_stack_enabled): New. (aarch64_expand_prologue): Push x30 onto SCS before it's pushed onto stack. (aarch64_expand_epilogue): Pop x30 frome SCS. * config/aarch64/aarch64.h (TARGET_SUPPORT_SHADOW_CALL_STACK): New macro. (TARGET_CHECK_SCS_RESERVED_REGISTER): Likewise. * config/aarch64/aarch64.md (scs_push): New template. (scs_pop): Likewise. * defaults.h (TARGET_SUPPORT_SHADOW_CALL_STACK):New macro. * doc/extend.texi: Document -fsanitize=shadow-call-stack. * doc/invoke.texi: Document attribute. * flag-types.h (enum sanitize_code):Add SANITIZE_SHADOW_CALL_STACK. * opts-global.c (handle_common_deferred_options): Add SCS compile option check. * opts.c (finish_options): Likewise. gcc/testsuite/ChangeLog: * gcc.target/aarch64/shadow_call_stack_1.c: New test. * gcc.target/aarch64/shadow_call_stack_2.c: New test. * gcc.target/aarch64/shadow_call_stack_3.c: New test. * gcc.target/aarch64/shadow_call_stack_4.c: New test. --- gcc/c-family/c-attribs.c | 21 + gcc/config/aarch64/aarch64-protos.h | 1 + gcc/config/aarch64/aarch64.c | 27 +++ gcc/config/aarch64/aarch64.h | 11 + gcc/config/aarch64/aarch64.md | 18 gcc/defaults.h| 4 ++ gcc/doc/extend.texi | 7 +++ gcc/doc/invoke.texi | 29 gcc/flag-types.h | 2 + gcc/opts-global.c | 6 +++ gcc/opts.c| 12 + .../gcc.target/aarch64/shadow_call_stack_1.c | 6 +++ .../gcc.target/aarch64/shadow_call_stack_2.c | 6 +++ .../gcc.target/aarch64/shadow_call_stack_3.c | 45 +++ .../gcc.target/aarch64/shadow_call_stack_4.c | 18 15 files changed, 213 insertions(+) create mode 100644 gcc/testsuite/gcc.target/aarch64/shadow_call_stack_1.c create mode 100644 gcc/testsuite/gcc.target/aarch64/shadow_call_stack_2.c create mode 100644 gcc/testsuite/gcc.target/aarch64/shadow_call_stack_3.c create mode 100644 gcc/testsuite/gcc.target/aarch64/shadow_call_stack_4.c diff --git a/gcc/c-family/c-attribs.c b/gcc/c-family/c-attribs.c index 007b928c54b..9b3a35c06bf 100644 --- a/gcc/c-family/c-attribs.c +++ b/gcc/c-family/c-attribs.c @@ -56,6 +56,8 @@ static tree handle_cold_attribute (tree *, tree, tree, int, bool *); static tree handle_no_sanitize_attribute (tree *, tree, tree, int, bool *); static tree handle_no_sanitize_address_attribute (tree *, tree, tree, int, bool *); +static tree handle_no_sanitize_shadow_call_stack_attribute (tree *, tree, + tree, int, bool *); static tree handle_no_sanitize_thread_attribute (tree *, tree, tree, int, bool *); static tree handle_no_address_safety_analysis_attribute (tree *, tree, tree, @@ -454,6 +456,10 @@ const struct attribute_spec c_common_attribute_table[] = handle_no_sanitize_attribute, NULL }, { "no_sanitize_address",0, 0, true, false, false, false, handle_no_sanitize_address_attribute, NULL }, + { "no_sanitize_shadow_call_stack", + 0, 0, true, false, false, false, + handle_no_sanitize_shadow_call_stack_attribute, + NULL }, { "no_sanitize_thread", 0, 0, true, false, false, false, handle_no_sanitize_thread_attribute, NULL }, { "no_sanitize_undefined", 0, 0, true, false, false, false, @@ -1175,6 +1181,21 @@ handle_no_sanitize_address_attribute (tree *node, tree name, tree
[PING^2][PATCH,v2,1/1,AARCH64][PR102768] aarch64: Add compiler support for Shadow Call Stack
Gentile ping for this again :), thanks. Link: https://gcc.gnu.org/pipermail/gcc-patches/2021-December/586204.html Shadow Call Stack can be used to protect the return address of a function at runtime, and clang already supports this feature[1]. To enable SCS in user mode, in addition to compiler, other support is also required (as discussed in [2]). This patch only adds basic support for SCS from the compiler side, and provides convenience for users to enable SCS. For linux kernel, only the support of the compiler is required. [1] https://clang.llvm.org/docs/ShadowCallStack.html [2] https://gcc.gnu.org/bugzilla/show_bug.cgi?id=102768 Signed-off-by: Dan Li gcc/c-family/ChangeLog: * c-attribs.c (handle_no_sanitize_shadow_call_stack_attribute): New. gcc/ChangeLog: * config/aarch64/aarch64-protos.h (aarch64_shadow_call_stack_enabled): New decl. * config/aarch64/aarch64.c (aarch64_shadow_call_stack_enabled): New. (aarch64_expand_prologue): Push x30 onto SCS before it's pushed onto stack. (aarch64_expand_epilogue): Pop x30 frome SCS. * config/aarch64/aarch64.h (TARGET_SUPPORT_SHADOW_CALL_STACK): New macro. (TARGET_CHECK_SCS_RESERVED_REGISTER): Likewise. * config/aarch64/aarch64.md (scs_push): New template. (scs_pop): Likewise. * defaults.h (TARGET_SUPPORT_SHADOW_CALL_STACK):New macro. * doc/extend.texi: Document -fsanitize=shadow-call-stack. * doc/invoke.texi: Document attribute. * flag-types.h (enum sanitize_code):Add SANITIZE_SHADOW_CALL_STACK. * opts-global.c (handle_common_deferred_options): Add SCS compile option check. * opts.c (finish_options): Likewise. gcc/testsuite/ChangeLog: * gcc.target/aarch64/shadow_call_stack_1.c: New test. * gcc.target/aarch64/shadow_call_stack_2.c: New test. * gcc.target/aarch64/shadow_call_stack_3.c: New test. * gcc.target/aarch64/shadow_call_stack_4.c: New test. --- gcc/c-family/c-attribs.c | 21 + gcc/config/aarch64/aarch64-protos.h | 1 + gcc/config/aarch64/aarch64.c | 27 +++ gcc/config/aarch64/aarch64.h | 11 + gcc/config/aarch64/aarch64.md | 18 gcc/defaults.h| 4 ++ gcc/doc/extend.texi | 7 +++ gcc/doc/invoke.texi | 29 gcc/flag-types.h | 2 + gcc/opts-global.c | 6 +++ gcc/opts.c| 12 + .../gcc.target/aarch64/shadow_call_stack_1.c | 6 +++ .../gcc.target/aarch64/shadow_call_stack_2.c | 6 +++ .../gcc.target/aarch64/shadow_call_stack_3.c | 45 +++ .../gcc.target/aarch64/shadow_call_stack_4.c | 18 15 files changed, 213 insertions(+) create mode 100644 gcc/testsuite/gcc.target/aarch64/shadow_call_stack_1.c create mode 100644 gcc/testsuite/gcc.target/aarch64/shadow_call_stack_2.c create mode 100644 gcc/testsuite/gcc.target/aarch64/shadow_call_stack_3.c create mode 100644 gcc/testsuite/gcc.target/aarch64/shadow_call_stack_4.c diff --git a/gcc/c-family/c-attribs.c b/gcc/c-family/c-attribs.c index 007b928c54b..9b3a35c06bf 100644 --- a/gcc/c-family/c-attribs.c +++ b/gcc/c-family/c-attribs.c @@ -56,6 +56,8 @@ static tree handle_cold_attribute (tree *, tree, tree, int, bool *); static tree handle_no_sanitize_attribute (tree *, tree, tree, int, bool *); static tree handle_no_sanitize_address_attribute (tree *, tree, tree, int, bool *); +static tree handle_no_sanitize_shadow_call_stack_attribute (tree *, tree, + tree, int, bool *); static tree handle_no_sanitize_thread_attribute (tree *, tree, tree, int, bool *); static tree handle_no_address_safety_analysis_attribute (tree *, tree, tree, @@ -454,6 +456,10 @@ const struct attribute_spec c_common_attribute_table[] = handle_no_sanitize_attribute, NULL }, { "no_sanitize_address",0, 0, true, false, false, false, handle_no_sanitize_address_attribute, NULL }, + { "no_sanitize_shadow_call_stack", + 0, 0, true, false, false, false, + handle_no_sanitize_shadow_call_stack_attribute, + NULL }, { "no_sanitize_thread", 0, 0, true, false, false, false, handle_no_sanitize_thread_attribute, NULL }, { "no_sanitize_undefined", 0, 0, true, false, false, false, @@ -1175,6 +1181,21 @@ handle_no_sanitize_address_attribute (tree *node, tree name, tree
[RFC/RFT, V2 1/3] [PR102768] flag-types.h (enum sanitize_code): Extend sanitize_code to 64 bits to support more features
32-bit sanitize_code can no longer accommodate new options, extending it to 64-bit. Signed-off-by: Dan Li gcc/ChangeLog: PR c/102768 * asan.h (sanitize_flags_p): Promote to uint64_t. * common.opt: Likewise. * dwarf2asm.cc (dw2_output_indirect_constant_1): Likewise. * flag-types.h (enum sanitize_code): Likewise. * opt-suggestions.cc (option_proposer::build_option_suggestions): Likewise. * opts.cc (find_sanitizer_argument): Likewise. (report_conflicting_sanitizer_options): Likewise. (get_closest_sanitizer_option): Likewise. (parse_sanitizer_options): Likewise. (parse_no_sanitize_attribute): Likewise. * opts.h (parse_sanitizer_options): Likewise. (parse_no_sanitize_attribute): Likewise. * tree-cfg.cc (print_no_sanitize_attr_value): Likewise. gcc/c-family/ChangeLog: * c-attribs.cc (add_no_sanitize_value): Likewise. (handle_no_sanitize_attribute): Likewise. * c-common.h (add_no_sanitize_value): Likewise. gcc/c/ChangeLog: * c-parser.cc (c_parser_declaration_or_fndef): Likewise. gcc/cp/ChangeLog: * typeck.cc (get_member_function_from_ptrfunc): Likewise. --- gcc/asan.h| 4 +-- gcc/c-family/c-attribs.cc | 10 +++--- gcc/c-family/c-common.h | 2 +- gcc/c/c-parser.cc | 4 +-- gcc/common.opt| 4 +-- gcc/cp/typeck.cc | 2 +- gcc/dwarf2asm.cc | 2 +- gcc/flag-types.h | 65 --- gcc/opt-suggestions.cc| 2 +- gcc/opts.cc | 22 ++--- gcc/opts.h| 8 ++--- gcc/tree-cfg.cc | 2 +- 12 files changed, 64 insertions(+), 63 deletions(-) diff --git a/gcc/asan.h b/gcc/asan.h index d4ea49cb240..5b98172549b 100644 --- a/gcc/asan.h +++ b/gcc/asan.h @@ -233,9 +233,9 @@ asan_protect_stack_decl (tree decl) remove all flags mentioned in "no_sanitize" of DECL_ATTRIBUTES. */ static inline bool -sanitize_flags_p (unsigned int flag, const_tree fn = current_function_decl) +sanitize_flags_p (uint64_t flag, const_tree fn = current_function_decl) { - unsigned int result_flags = flag_sanitize & flag; + uint64_t result_flags = flag_sanitize & flag; if (result_flags == 0) return false; diff --git a/gcc/c-family/c-attribs.cc b/gcc/c-family/c-attribs.cc index 111a33f405a..a73e2364525 100644 --- a/gcc/c-family/c-attribs.cc +++ b/gcc/c-family/c-attribs.cc @@ -1118,23 +1118,23 @@ handle_cold_attribute (tree *node, tree name, tree ARG_UNUSED (args), /* Add FLAGS for a function NODE to no_sanitize_flags in DECL_ATTRIBUTES. */ void -add_no_sanitize_value (tree node, unsigned int flags) +add_no_sanitize_value (tree node, uint64_t flags) { tree attr = lookup_attribute ("no_sanitize", DECL_ATTRIBUTES (node)); if (attr) { - unsigned int old_value = tree_to_uhwi (TREE_VALUE (attr)); + uint64_t old_value = tree_to_uhwi (TREE_VALUE (attr)); flags |= old_value; if (flags == old_value) return; - TREE_VALUE (attr) = build_int_cst (unsigned_type_node, flags); + TREE_VALUE (attr) = build_int_cst (long_long_unsigned_type_node, flags); } else DECL_ATTRIBUTES (node) = tree_cons (get_identifier ("no_sanitize"), - build_int_cst (unsigned_type_node, flags), + build_int_cst (long_long_unsigned_type_node, flags), DECL_ATTRIBUTES (node)); } @@ -1145,7 +1145,7 @@ static tree handle_no_sanitize_attribute (tree *node, tree name, tree args, int, bool *no_add_attrs) { - unsigned int flags = 0; + uint64_t flags = 0; *no_add_attrs = true; if (TREE_CODE (*node) != FUNCTION_DECL) { diff --git a/gcc/c-family/c-common.h b/gcc/c-family/c-common.h index 52a85bfb783..eb91b9703db 100644 --- a/gcc/c-family/c-common.h +++ b/gcc/c-family/c-common.h @@ -1500,7 +1500,7 @@ extern enum flt_eval_method excess_precision_mode_join (enum flt_eval_method, enum flt_eval_method); extern int c_flt_eval_method (bool ts18661_p); -extern void add_no_sanitize_value (tree node, unsigned int flags); +extern void add_no_sanitize_value (tree node, uint64_t flags); extern void maybe_add_include_fixit (rich_location *, const char *, bool); extern void maybe_suggest_missing_token_insertion (rich_location *richloc, diff --git a/gcc/c/c-parser.cc b/gcc/c/c-parser.cc index f679d53706a..9d55ea55fa6 100644 --- a/gcc/c/c-parser.cc +++ b/gcc/c/c-parser.cc @@ -2217,7 +2217,7 @@ c_parser_declaration_or_fndef (c_parser *parser, bool fndef_ok, start_init (NULL_TREE, asm_name, global_bindings_p (), &richloc); /* A parameter is initialized, which is invalid. Don't attempt to instrument the initializer. */ - int flag_sanitize_save = flag_sanitize; +
[RFC/RFT, V2 3/3] [PR102768] aarch64: Add support for Kernel Control Flow Integrity
In the AArch64 platform, typeid can be directly inserted in front of the function header (offset is patch_area_entry + 4), it should be assumed that patch_area_entry is the same for all functions. For all functions that will not be called indirectly, insert the reserved RESERVED_CFI_TYPEID (0x0) as typeid in front of them. If not, the attacker may use the instruction/data before the function as typeid to bypass CFI. All typeids ignore some bits (& AARCH64_UNALLOCATED_INSN_MASK) to avoid conflicts with the AArch64 instruction set (see AAPCS64 for details). Signed-off-by: Dan Li gcc/ChangeLog: * config/aarch64/aarch64.cc (RESERVED_CFI_TYPEID): Macro definition. (DEFAULT_CFI_TYPEID): Likewise. (AARCH64_UNALLOCATED_INSN_MASK): Likewise. (aarch64_calc_func_cfi_typeid): Platform-dependent CFI function. (cgraph_indirectly_callable): Determine whether a funtion may be called indirectly. (aarch64_output_func_kcfi_typeid): Platform-dependent CFI function. (aarch64_output_icall_kcfi_check): Likewise. (TARGET_HAVE_KCFI): New hook. (TARGET_CALC_FUNC_CFI_TYPEID): Likewise. (TARGET_ASM_OUTPUT_FUNC_KCFI_TYPEID): Likewise. (TARGET_ASM_OUTPUT_ICALL_KCFI_CHECK): Likewise. * doc/invoke.texi: Document -fsanitize=kcfi. --- gcc/config/aarch64/aarch64.cc | 166 ++ gcc/doc/invoke.texi | 36 2 files changed, 202 insertions(+) diff --git a/gcc/config/aarch64/aarch64.cc b/gcc/config/aarch64/aarch64.cc index 5c9e7791a12..5b55541d437 100644 --- a/gcc/config/aarch64/aarch64.cc +++ b/gcc/config/aarch64/aarch64.cc @@ -5450,6 +5450,160 @@ aarch64_output_sve_addvl_addpl (rtx offset) return buffer; } +/* Reserved for all functions that cannot be called indirectly. */ +#define RESERVED_CFI_TYPEID 0x0U + +/* If the typeid of a function that can be called indirectly is equal to + RESERVED_CFI_TYPEID, change it to DEFAULT_CFI_TYPEID. */ +#define DEFAULT_CFI_TYPEID 0x0ADAU + +/* Mask of reserved and unallocated instructions in AArch64 platform. */ +#define AARCH64_UNALLOCATED_INSN_MASK 0xE7FFU + +static unsigned int +aarch64_calc_func_cfi_typeid (const_tree fntype) +{ + unsigned int hash; + + /* The value of typeid has a probability of being the same as the encoding + of an instruction. If the attacker can find the same encoding as the + typeid in the assembly code, then he has found a usable jump location. + So here, a platform-related mask is used when generating a typeid to + avoid such conflicts as much as possible. */ + hash = unified_type_hash (fntype) & AARCH64_UNALLOCATED_INSN_MASK; + + /* RESERVED_CFI_TYPEID is reserved for functions that cannot + be called indirectly. */ + if (hash == RESERVED_CFI_TYPEID) +hash = DEFAULT_CFI_TYPEID; + + return hash; +} + +static bool +cgraph_indirectly_callable (struct cgraph_node *node, + void *data ATTRIBUTE_UNUSED) +{ + if (node->externally_visible || node->address_taken) +return true; + + return false; +} + +static void +aarch64_output_func_kcfi_typeid (FILE * stream, tree decl) +{ + struct cgraph_node *node; + unsigned int cur_func_typeid; + + node = cgraph_node::get (decl); + + if (!node->call_for_symbol_thunks_and_aliases (cgraph_indirectly_callable, +NULL, true)) +/* CFI's typeid check always considers that there is a typeid before the + target function, so it is also necessary to output typeid for functions + that cannot be called indirectly to prevent attackers from bypassing + CFI by using instructions/data before those functions. + The typeid inserted before such a function is RESERVED_CFI_TYPEID, + and the calculation of the typeid must ensure that this value is always + reserved. */ +cur_func_typeid = RESERVED_CFI_TYPEID; + else +cur_func_typeid = aarch64_calc_func_cfi_typeid (TREE_TYPE (decl)); + + fprintf (stream, "__kcfi_%s:\n", get_name (decl)); + fprintf (stream, "\t.4byte %#010x\n", cur_func_typeid); +} + +/* This function outputs assembly instructions to check cfi typeid before + indirect call (blr Xn), which may destroy x16, x17, x9 registers (according + to the AAPCS64 specification, these registers do not need to be restored + after the function call). + The assembly code output by this function is as follows: + ldurw16, [x1, #-4] + movkw17, #13570 + movkw17, #17309, lsl #16 + cmp w16, w17 + b.eq.Lkcfi8 + brk #0x8221 +.Lkcfi8: + blr x1 + */ + +static void +aarch64_output_icall_kcfi_check (rtx reg, unsigned int value) +{ + unsigned int addr_reg, scratch_reg1, scratch_reg2; + unsigned int esr, addr_index, type_index; + char label_buf[256]; + const char *label_ptr; + unsigned HOST_WIDE_INT patch_area_entry =
[RFC/RFT,V2] CFI: Add support for gcc CFI in aarch64
Based on Sami's patch[1], this patch makes the corresponding kernel configuration of CFI available when compiling the kernel with the gcc[2]. The code after enabling cfi (with -fsanitize=kcfi, -fpatchable-function-entry=3,1) is as follows: int (*p)(void); int func (int) { p(); } __kcfi_func: .4byte 0x439d3502 .global func .text .LPFE1: nop .type func, %function func: nop nop .LFB0: .. adrpx0, p add x0, x0, :lo12:p ldr x0, [x0] ldurw16, [x0, #-8] movkw17, #23592 movkw17, #17921, lsl #16 cmp w16, w17 b.eq.Lkcfi2 brk #0x8220 .Lkcfi2: blr x0 .. In the compiler part[4], most of the content is the same as Sami's implementation[3], except for some minor differences, mainly including: 1. The function typeid is calculated differently and it is difficult to be consistent. 2. A reserved typeid (such as 0x0U on the aarch64 platform) is always inserted in front of functions that should not be called indirectly. Functions that can be called indirectly will not use this hash value, which prevents instructions/data before the function from being used as a typeid by an attacker. 3. Some bits are ignored in the typeid to avoid conflicts between the typeid and the instruction set of a specific platform, thereby preventing an attacker from bypassing the CFI check by using the instruction as a typeid, such as on the aarch64 platform: * If the following instruction sequence exists: 400620: a9be7bfdstp x29, x30, [sp, #-32]! 400624: 910003fdmov x29, sp 400628: f9000bf3str x19, [sp, #16] * If the expected typeid of the indirect call is exactly 0x910003fd, the attacker can jump to the next instruction position of any "mov x29,sp" instruction (such as 0x400628 here). 4. Insert a symbol __kcfi_ before each function's typeid, which may be helpful for fine-grained KASLR implementations. I'm not sure if these distinctions need to be removed, they're kept in this version for now, and I'm happy to remove them in the next version if necessary (except for the first one). The current implementation of gcc only supports the aarch64 platform, the x86 version is still in progress. This produces the following oops on CFI failure (generated using lkdtm): / # echo CFI_FORWARD_PROTO > /sys/kernel/debug/provoke-crash/DIRECT [ 17.153364] lkdtm: Performing direct entry CFI_FORWARD_PROTO [ 17.153675] lkdtm: Calling matched prototype ... [ 17.154188] lkdtm: Calling mismatched prototype ... [ 17.154553] CFI failure at lkdtm_indirect_call+0x38/0x54 (target: lkdtm_increment_int+0x0/0x2c; expected type: 0xc59c68f1) [ 17.155627] Internal error: Oops - CFI: [#1] PREEMPT SMP [ 17.156025] Modules linked in: [ 17.156580] CPU: 1 PID: 110 Comm: sh Not tainted 6.1.0-1-g7ead10685f57-dirty #2 [ 17.156975] Hardware name: linux,dummy-virt (DT) [ 17.157290] pstate: 8045 (Nzcv daif +PAN -UAO -TCO -DIT -SSBS BTYPE=--) [ 17.157621] pc : lkdtm_indirect_call+0x38/0x54 [ 17.157902] lr : lkdtm_CFI_FORWARD_PROTO+0x48/0x7c [ 17.158142] sp : 8b083c80 [ 17.158316] x29: 8b083c80 x28: 068ccc40 x27: [ 17.158820] x26: x25: x24: 0bd069a0 [ 17.159239] x23: 053493c0 x22: 8b083df0 x21: 0012 [ 17.159673] x20: 8aa29150 x19: 06909000 x18: 8b035068 [ 17.160110] x17: c59c68f1 x16: a2d40772 x15: 0006 [ 17.160530] x14: x13: 2e2e2e2065707974 x12: 6f746f7270206465 [ 17.160947] x11: 686374616d73696d x10: 8a7254b8 x9 : 89268330 [ 17.161376] x8 : efff x7 : 8a7254b8 x6 : 8000f000 [ 17.161789] x5 : bff4 x4 : x3 : [ 17.162264] x2 : x1 : 88a4e248 x0 : 8abca500 [ 17.162865] Call trace: [ 17.163099] lkdtm_indirect_call+0x38/0x54 [ 17.163411] lkdtm_CFI_FORWARD_PROTO+0x48/0x7c [ 17.163663] lkdtm_do_action+0x48/0x5c [ 17.163876] direct_entry+0x144/0x160 [ 17.164084] full_proxy_write+0x84/0xe4 [ 17.164298] vfs_write+0xe8/0x32c The gcc-related patches[4] are based on tag: releases/gcc-12.2.0. This patch is based on tag: v6.2. Sorry for the long delay, the next version should not be that slow. Any suggestion please let me know :). Thanks, Dan. [1] https://lore.kernel.org/all/20220908215504.3686827-1-samitolva...@google.com/ [2] https://gcc.gnu.org/bugzilla/show_bug.cgi?id=107048 [3] https://reviews.llvm.org/D119296 [4] https://lore.kernel.org/all/20230325081117.93245-1-ashimida.1...@gmail.com/ Signed-off-by: Dan Li --- RFC/RFT V2: - The CFI typeid check is changed from the f
[RFC/RFT, V2 2/3] [PR102768] Support CFI: Add basic support for Kernel Control Flow Integrity
The KCFI sanitizer enabled with -fsanitize=kcfi implements a forward edge control flow integrity scheme for indirect calls, similar to -fsanitize=kcfi [1] in llvm. At compile time, it appends a uniform type identifier before the first instruction of each function and inserts check code before each indirect call in a function with protection enabled. At runtime, according to the code order, the check code for each indirect call will be executed first, and it will: 1. Dynamically obtain the typeid before the callee function. 2. Compare it to the expected typeid of the current call site (caller). 3. If the two match, continue to execute the indirect call, if not, an exception will be generated, and its user (usually the low-level code such as the OS kernel) needs to support for the exception handling. A typeid (type identifier) is a 32-bit constant on all platforms, whose value depends on the function's prototype, and is invariant across compilation units. However, different platforms may ignore some of the bits to avoid conflicts with instructions. If a program contains indirect calls to assembly functions, they must be manually annotated with the expected type identifiers to prevent errors. To make this easier, gcc generates a weak SHN_ABS __kcfi_typeid_ symbol for each address-taken function declaration, which can be used to annotate functions in assembly as long as at least one translation unit linked into the program takes the function address. It should be noted that on different platforms, the location of typeid insertion (the offset between it and the function header) may be different, such as [1], and this patch only implements the platform-independent part. [1]: https://reviews.llvm.org/D119296 Signed-off-by: Dan Li gcc/ChangeLog: PR c/102768 * cfgexpand.cc (expand_call_stmt): Add CFI_TYPEID reg note for call insn. (pass_expand::execute): Check whether enable KCFI for current function according to the attribute. * cgraphunit.cc (output_decl_kcfi_typeid_symbol): Output the CFI typeid corresponding to each external declaration when necessary. (output_decl_kcfi_typeid_symbols): Likewise. * combine.cc (distribute_notes): Add REG_CALL_CFI_TYPEID. * doc/tm.texi: Regenerate. * doc/tm.texi.in: New hooks. * emit-rtl.cc (try_split): Add REG_CALL_CFI_TYPEID. * emit-rtl.h (struct rtl_data): Add is_kcfi_enabled. * final.cc (final_scan_insn_1): Output kcfi check for indirect calls. * flag-types.h (enum sanitize_code): Add SANITIZE_CONTROL_FLOW_INTEGRITY. * gimple.cc (gimple_call_set_cfi_typeid): New. (gimple_build_call_1): Calculate cfi_typeid when gcall is created. * gimple.h (struct GTY): Add new member cfi_typeid for gcall. * opts.cc (parse_sanitizer_options): Add cfi and exclude SANITIZE_CONTROL_FLOW_INTEGRITY. * output.h (default_output_func_kcfi_typeid): Declare. (default_output_icall_kcfi_check): Declare. (default_calc_func_cfi_typeid): Declare. * reg-notes.def (REG_NOTE): Add new REG_NOTE CALL_CFI_TYPEID. * target.def: Add new hooks. * toplev.cc (process_options): Add CFI compile option check. * tree.cc (tree_node_sizes[): Add the unified tree type hash calculation functions. (append_unified_type_hash): Likewise. (initialize_unified_tree_type_hash_table): Likewise. (append_unified_type_name_hash): Likewise. (append_unified_type_precision_hash): Likewise. (append_unified_function_ret_and_args_hash): Likewise. (unified_type_hash): Likewise. (init_ttree): Likewise. * tree.h (unified_type_hash): Declare. * varasm.cc (assemble_start_function): Output the CFI typeid of each function. (default_output_func_kcfi_typeid): New. (default_output_icall_kcfi_check): New. (default_calc_func_cfi_typeid): New. --- gcc/cfgexpand.cc | 26 gcc/cgraphunit.cc | 34 +++ gcc/combine.cc | 1 + gcc/doc/tm.texi| 27 + gcc/doc/tm.texi.in | 8 +++ gcc/emit-rtl.cc| 1 + gcc/emit-rtl.h | 4 ++ gcc/final.cc | 24 +++- gcc/flag-types.h | 2 + gcc/gimple.cc | 11 gcc/gimple.h | 5 +- gcc/opts.cc| 4 +- gcc/output.h | 3 + gcc/reg-notes.def | 1 + gcc/target.def | 38 gcc/toplev.cc | 4 ++ gcc/tree.cc| 144 + gcc/tree.h | 1 + gcc/varasm.cc | 26 19 files changed, 361 insertions(+), 3 deletions(-) diff --git a/gcc/cfgexpand.cc b/gcc/cfgexpand.cc index d3cc77d2ca9..69c5fa30c7e 100644 --- a/gcc/cfgexpand.cc +++ b/gcc/cfgexpand.cc @@ -2845,6 +2845,18 @@ expand_call_stmt (gcall *stmt) add_reg_note (last, REG_CALL_NOCF_CHECK, const0_rtx); } + if (flag_san
[RFC/RFT, V2 0/3] Add compiler support for Kernel Control Flow Integrity
This series of patches is mainly used to support the control flow integrity protection of the linux kernel [1], which is similar to -fsanitize=kcfi in clang 16.0 [2,3]. Any suggestion please let me know :). Thanks, Dan. [1] https://lore.kernel.org/all/20220908215504.3686827-1-samitolva...@google.com/ [2] https://clang.llvm.org/docs/ControlFlowIntegrity.html [3] https://reviews.llvm.org/D119296 Signed-off-by: Dan Li --- Dan Li (3): [PR102768] flag-types.h (enum sanitize_code): Extend sanitize_code to 64 bits to support more features [PR102768] Support CFI: Add basic support for Kernel Control Flow Integrity [PR102768] aarch64: Add support for Kernel Control Flow Integrity gcc/asan.h| 4 +- gcc/c-family/c-attribs.cc | 10 +- gcc/c-family/c-common.h | 2 +- gcc/c/c-parser.cc | 4 +- gcc/cfgexpand.cc | 26 ++ gcc/cgraphunit.cc | 34 +++ gcc/combine.cc| 1 + gcc/common.opt| 4 +- gcc/config/aarch64/aarch64.cc | 166 ++ gcc/cp/typeck.cc | 2 +- gcc/doc/invoke.texi | 36 gcc/doc/tm.texi | 27 ++ gcc/doc/tm.texi.in| 8 ++ gcc/dwarf2asm.cc | 2 +- gcc/emit-rtl.cc | 1 + gcc/emit-rtl.h| 4 + gcc/final.cc | 24 - gcc/flag-types.h | 67 +++--- gcc/gimple.cc | 11 +++ gcc/gimple.h | 5 +- gcc/opt-suggestions.cc| 2 +- gcc/opts.cc | 26 +++--- gcc/opts.h| 8 +- gcc/output.h | 3 + gcc/reg-notes.def | 1 + gcc/target.def| 38 gcc/toplev.cc | 4 + gcc/tree-cfg.cc | 2 +- gcc/tree.cc | 144 + gcc/tree.h| 1 + gcc/varasm.cc | 26 ++ 31 files changed, 627 insertions(+), 66 deletions(-) -- 2.17.1
Re: [RFC/RFT,V2] CFI: Add support for gcc CFI in aarch64
On 03/27, Peter Zijlstra wrote: > On Sat, Mar 25, 2023 at 01:54:16AM -0700, Dan Li wrote: > > > In the compiler part[4], most of the content is the same as Sami's > > implementation[3], except for some minor differences, mainly including: > > > > 1. The function typeid is calculated differently and it is difficult > > to be consistent. > > This means there is an effective ABI break between the compilers, which > is sad :-( Is there really nothing to be done about this? Hi Peter, I'm not so sure right now. According to Sami's tip, I need to learn more about related patches. I'll make it ABI compatible in the next version if possible :) Thanks, Dan
Re: [RFC/RFT,V2] CFI: Add support for gcc CFI in aarch64
On 03/27, Sami Tolvanen wrote: > On Mon, Mar 27, 2023 at 2:30 AM Peter Zijlstra wrote: > > > > On Sat, Mar 25, 2023 at 01:54:16AM -0700, Dan Li wrote: > > > > > In the compiler part[4], most of the content is the same as Sami's > > > implementation[3], except for some minor differences, mainly including: > > > > > > 1. The function typeid is calculated differently and it is difficult > > > to be consistent. > > > > This means there is an effective ABI break between the compilers, which > > is sad :-( Is there really nothing to be done about this? > > I agree, this would be unfortunate, and would also be a compatibility > issue with rustc where there's ongoing work to support > clang-compatible CFI type hashes: > > https://github.com/rust-lang/rust/pull/105452 > > Sami Hi Sami, Thanks for the info, I need to learn about it :) Is there anything else that needs to be improved? Thanks, Dan