This patch introduces the stack clash protection options Changes since V2:
Adds two new params. The first controls the size of the guard area. This controls the threshold for when a function prologue requires probes. The second controls the probing interval -- ie, once probes are needed, how often do we emit them. These are really meant more for developers to experiment with than users. Regardless I did go ahead and document them./PARAM It also adds some sanity checking WRT combining stack clash protection with -fstack-check. Jeff
* common.opt (-fstack-clash-protection): New option. * flag-types.h (enum stack_check_type): Note difference between -fstack-check= and -fstack-clash-protection. * params.h (set_param_value): Verify stack-clash-protection params are powers of two. * params.def (PARAM_STACK_CLASH_PROTECTION_GUARD_SIZE): New PARAM. (PARAM_STACK_CLASH_PROTECTION_PROBE_INTERVAL): Likewise. * toplev.c (process_options): Issue warnings/errors for cases not handled with -fstack-clash-protection. * doc/invoke.texi (-fstack-clash-protection): Document new option. (-fstack-check): Note additional problem with -fstack-check=generic. Note that -fstack-check is primarily for Ada and refer users to -fstack-clash-protection for stack-clash-protection. Document new params for stack clash protection. * toplev.c (process_options): Handle -fstack-clash-protection. testsuite/ * gcc.dg/stack-check-2.c: New test. * lib/target-supports.exp (check_effective_target_supports_stack_clash_protection): New function. (check_effective_target_frame_pointer_for_non_leaf): Likewise. (check_effective_target_caller_implicit_probes): Likewise. diff --git a/gcc/common.opt b/gcc/common.opt index e81165c..cfaf2bc 100644 --- a/gcc/common.opt +++ b/gcc/common.opt @@ -2306,6 +2306,11 @@ fstack-check Common Alias(fstack-check=, specific, no) Insert stack checking code into the program. Same as -fstack-check=specific. +fstack-clash-protection +Common Report Var(flag_stack_clash_protection) +Insert code to probe each page of stack space as it is allocated to protect +from stack-clash style attacks + fstack-limit Common Var(common_deferred_options) Defer diff --git a/gcc/doc/invoke.texi b/gcc/doc/invoke.texi index 3e5cee8..8095dc2 100644 --- a/gcc/doc/invoke.texi +++ b/gcc/doc/invoke.texi @@ -10128,6 +10128,20 @@ compilation without. The value for compilation with profile feedback needs to be more conservative (higher) in order to make tracer effective. +@item stack-clash-protection-guard-size +The size (in bytes) of the stack guard. Acceptable values are powers of 2 +between 4096 and 134217728 and defaults to 4096. Higher values may reduce the +number of explicit probes, but a value larger than the operating system +provided guard will leave code vulnerable to stack clash style attacks. + +@item stack-clash-protection-probe-interval +Stack clash protection involves probing stack space as it is allocated. This +param controls the maximum distance (in bytes) between probes into the stack. +Acceptable values are powers of 2 between 4096 and 65536 and defaults to +4096. Higher values may reduce the number of explicit probes, but a value +larger than the operating system provided guard will leave code vulnerable to +stack clash style attacks. + @item max-cse-path-length The maximum number of basic blocks on path that CSE considers. @@ -11333,7 +11347,8 @@ target support in the compiler but comes with the following drawbacks: @enumerate @item Modified allocation strategy for large objects: they are always -allocated dynamically if their size exceeds a fixed threshold. +allocated dynamically if their size exceeds a fixed threshold. Note this +may change the semantics of some code. @item Fixed limit on the size of the static frame of functions: when it is @@ -11348,6 +11363,25 @@ generic implementation, code performance is hampered. Note that old-style stack checking is also the fallback method for @samp{specific} if no target support has been added in the compiler. +@samp{-fstack-check=} is designed for Ada's needs to detect infinite recursion +and stack overflows. @samp{specific} is an excellent choice when compiling +Ada code. It is not generally sufficient to protect against stack-clash +attacks. To protect against those you want @samp{-fstack-clash-protection}. + +@item -fstack-clash-protection +@opindex fstack-clash-protection +Generate code to prevent stack clash style attacks. When this option is +enabled, the compiler will only allocate one page of stack space at a time +and each page is accessed immediately after allocation. Thus, it prevents +allocations from jumping over any stack guard page provided by the +operating system. + +Most targets do not fully support stack clash protection. However, on +those targets @option{-fstack-clash-protection} will protect dynamic stack +allocations. @option{-fstack-clash-protection} may also provide limited +protection for static stack allocations if the target supports +@option{-fstack-check=specific}. + @item -fstack-limit-register=@var{reg} @itemx -fstack-limit-symbol=@var{sym} @itemx -fno-stack-limit diff --git a/gcc/flag-types.h b/gcc/flag-types.h index 5faade5..8874cba 100644 --- a/gcc/flag-types.h +++ b/gcc/flag-types.h @@ -166,7 +166,14 @@ enum permitted_flt_eval_methods PERMITTED_FLT_EVAL_METHODS_C11 }; -/* Type of stack check. */ +/* Type of stack check. + + Stack checking is designed to detect infinite recursion for Ada + programs. Furthermore stack checking tries to ensure that scenario + that enough stack space is left to run a signal handler. + + -fstack-check= does not prevent stack-clash style attacks. For that + you want -fstack-clash-protection. */ enum stack_check_type { /* Do not check the stack. */ diff --git a/gcc/params.c b/gcc/params.c index fab0ffa..8afe4c4 100644 --- a/gcc/params.c +++ b/gcc/params.c @@ -209,6 +209,11 @@ set_param_value (const char *name, int value, error ("maximum value of parameter %qs is %u", compiler_params[i].option, compiler_params[i].max_value); + else if ((strcmp (name, "stack-clash-protection-guard-size") == 0 + || strcmp (name, "stack-clash-protection-probe-interval") == 0) + && exact_log2 (value) == -1) + error ("value of parameter %qs must be a power of 2", + compiler_params[i].option); else set_param_value_internal ((compiler_param) i, value, params, params_set, true); diff --git a/gcc/params.def b/gcc/params.def index 6b07518..2270031 100644 --- a/gcc/params.def +++ b/gcc/params.def @@ -213,6 +213,16 @@ DEFPARAM(PARAM_STACK_FRAME_GROWTH, "Maximal stack frame growth due to inlining (in percent).", 1000, 0, 0) +DEFPARAM(PARAM_STACK_CLASH_PROTECTION_GUARD_SIZE, + "stack-clash-protection-guard-size", + "Minimum size (in bytes) of the stack guard, must be a power of 2 >= 4096.", + 4096, 4096, 134217728) + +DEFPARAM(PARAM_STACK_CLASH_PROTECTION_PROBE_INTERVAL, + "stack-clash-protection-probe-interval", + "Interval (in bytes) in which to probe the stack.", + 4096, 1024, 65536) + /* The GCSE optimization will be disabled if it would require significantly more memory than this value. */ DEFPARAM(PARAM_MAX_GCSE_MEMORY, diff --git a/gcc/testsuite/gcc.dg/stack-check-2.c b/gcc/testsuite/gcc.dg/stack-check-2.c new file mode 100644 index 0000000..196c4bb --- /dev/null +++ b/gcc/testsuite/gcc.dg/stack-check-2.c @@ -0,0 +1,66 @@ +/* The goal here is to ensure that we never consider a call to a noreturn + function as a potential tail call. + + Right now GCC discovers potential tail calls by looking at the + predecessors of the exit block. A call to a non-return function + has no successors and thus can never match that first filter. + + But that could change one day and we want to catch it. The problem + is the compiler could potentially optimize a tail call to a nonreturn + function, even if the caller has a frame. That breaks the assumption + that calls probe *sp when saving the return address that some targets + depend on to elide stack probes. */ + +/* { dg-do compile } */ +/* { dg-options "-O2 -fstack-clash-protection -fdump-tree-tailc -fdump-tree-optimized" } */ +/* { dg-require-effective-target supports_stack_clash_protection } */ + +extern void foo (void) __attribute__ ((__noreturn__)); + + +void +test_direct_1 (void) +{ + foo (); +} + +void +test_direct_2 (void) +{ + return foo (); +} + +void (*indirect)(void)__attribute__ ((noreturn)); + + +void +test_indirect_1 () +{ + (*indirect)(); +} + +void +test_indirect_2 (void) +{ + return (*indirect)();; +} + + +typedef void (*pvfn)() __attribute__ ((noreturn)); + +void (*indirect_casted)(void); + +void +test_indirect_casted_1 () +{ + (*(pvfn)indirect_casted)(); +} + +void +test_indirect_casted_2 (void) +{ + return (*(pvfn)indirect_casted)(); +} +/* { dg-final { scan-tree-dump-not "tail call" "tailc" } } */ +/* { dg-final { scan-tree-dump-not "tail call" "optimized" } } */ + diff --git a/gcc/testsuite/lib/target-supports.exp b/gcc/testsuite/lib/target-supports.exp index fe5e777..25c3c80 100644 --- a/gcc/testsuite/lib/target-supports.exp +++ b/gcc/testsuite/lib/target-supports.exp @@ -8468,3 +8468,78 @@ proc check_effective_target_arm_coproc4_ok { } { return [check_cached_effective_target arm_coproc4_ok \ check_effective_target_arm_coproc4_ok_nocache] } + +# Return 1 if the target has support for stack probing designed +# to avoid stack-clash style attacks. +# +# This is used to restrict the stack-clash mitigation tests to +# just those targets that have been explicitly supported. +# +# In addition to the prologue work on those targets, each target's +# properties should be described in the functions below so that +# tests do not become a mess of unreadable target conditions. +# +proc check_effective_target_supports_stack_clash_protection { } { + if { [istarget aarch*-*-*] || [istarget x86_64-*-*] + || [istarget i?86-*-*] || [istarget s390*-*-*] + || [istarget powerpc*-*-*] || [istarget rs6000*-*-*] } { + return 1 + } + return 0 +} + +# Return 1 if the target creates a frame pointer for non-leaf functions +# Note we ignore cases where we apply tail call optimization here. +proc check_effective_target_frame_pointer_for_non_leaf { } { + if { [istarget aarch*-*-*] } { + return 1 + } + return 0 +} + +# Return 1 if the target's calling sequence or its ABI +# create implicit stack probes at or prior to function entry. +proc check_effective_target_caller_implicit_probes { } { + + # On x86/x86_64 the call instruction itself pushes the return + # address onto the stack. That is an implicit probe of *sp. + if { [istarget x86_64-*-*] || [istarget i?86-*-*] } { + return 1 + } + + # On PPC, the ABI mandates that the address of the outer + # frame be stored at *sp. Thus each allocation of stack + # space is itself an implicit probe of *sp. + if { [istarget powerpc*-*-*] || [istarget rs6000*-*-*] } { + return 1 + } + + # s390's ABI has a register save area allocated by the + # caller for use by the callee. The mere existence does + # not constitute a probe by the caller, but when the slots + # used by the callee those stores are implicit probes. + if { [istarget s390*-*-*] } { + return 1 + } + + # Not strictly true on aarch64, but we have agreed that we will + # consider any function that pushes SP more than 3kbytes into + # the guard page as broken. This essentially means that we can + # consider the aarch64 as having a caller implicit probe at + # *(sp + 1k). + if { [istarget aarch64*-*-*] } { + return 1; + } + + return 0 +} + +# Targets that potentially realign the stack pointer often cause residual +# stack allocations and make it difficult to elimination loops or residual +# allocations for dynamic stack allocations +proc check_effective_target_callee_realigns_stack { } { + if { [istarget x86_64-*-*] || [istarget i?86-*-*] } { + return 1 + } + return 0 +} diff --git a/gcc/toplev.c b/gcc/toplev.c index e6c69a4..24a00df 100644 --- a/gcc/toplev.c +++ b/gcc/toplev.c @@ -1591,6 +1591,26 @@ process_options (void) flag_associative_math = 0; } + /* -fstack-clash-protection is not currently supported on targets + where the stack grows up. */ + if (flag_stack_clash_protection && !STACK_GROWS_DOWNWARD) + { + warning_at (UNKNOWN_LOCATION, 0, + "-fstack-clash_protection is not supproted on targets " + "where the stack grows from lower to higher addresses"); + flag_stack_clash_protection = 0; + } + + /* We can not support -fstack-check= and -fstack-clash-protection at + the same time. */ + if (flag_stack_check != NO_STACK_CHECK && flag_stack_clash_protection) + { + warning_at (UNKNOWN_LOCATION, 0, + "-fstack-check= and -fstack-clash_protection are mutually " + "exclusive. Disabling -fstack-check="); + flag_stack_check = NO_STACK_CHECK; + } + /* With -fcx-limited-range, we do cheap and quick complex arithmetic. */ if (flag_cx_limited_range) flag_complex_method = 0;