> On Sep 29, 2016, at 11:14 AM, Torsten Duwe <d...@suse.de> wrote: > > In case anybody missed it, the Linux kernel side to make use > of this has also been finished meanwhile. Of course it can not > be accepted without compiler support; and this feature patch > is much more versatile than just Linux kernel live patching > on a single architecture.
Hi Torsten, Good job moving -fprolog-pad= forward! I've reviewed your patch, and it looks OK with the minor comments inline. I've CC'ed Richard E. since you will need a global maintainer approval for this change. Ideally, I want to improve support for -fprolog-pad=N and __attribute__((prolog_pad(N))) to provide functionality to also output pad before the function label to address use-cases for s390, sparc, etc (what Jose E. Marchesi was referring to). I.e., -fprolog-pad= option would accept both -fprolog-pad=N and -fprolog-pad=M,N forms -- issue M nops before function label and N nops after function label. Similarly for __attribute__((prolog_pad(N,M))). I (or you :-) ) can attempt to implement this functionality before stage1 closes, but it should not block this initial patch. Comments on the patch below. > > Changes since the previous version > (which in turn was based on Maxim's suggestion): > > * Document the feature in *.texi > > * Automatically disable IPA-RA, like normal profiling does. > You never know in advance what the code patched in at run time will do. > Any optimisation here is potentially wrong. > > * record a prolog_nop_pad_size value specified on the command line > in each function's attributes, so that it survives an LTO pipe. > > Signed-off-by: Torsten Duwe <d...@suse.de> > > diff --git a/gcc/attribs.c b/gcc/attribs.c > index 16996e9..a5cf2aa 100644 > --- a/gcc/attribs.c > +++ b/gcc/attribs.c > @@ -365,6 +365,21 @@ decl_attributes (tree *node, tree attributes, int flags) > if (!attributes_initialized) > init_attributes (); > > + /* If we're building NOP pads because of a command line arg, note the size > + for LTO builds, unless the attribute has already been overridden. */ > + if (TREE_CODE (*node) == FUNCTION_DECL && prolog_nop_pad_size > 0) > + { > + tree pp_attr = lookup_attribute ("prolog_pad", attributes); > + if (! pp_attr ) Coding style: should be "if (!pp_attr)" Also, this clause implies that attribute takes precedence over command-line option, and the two are not attempted to be merged. I agree that this is a reasonable approach, but consider adding a warning if prolog_nop_pad_size is bigger than the value of the existing attribute. Something like ... > + { > + tree pp_size = build_int_cstu (integer_type_node, > prolog_nop_pad_size); > + > + attributes = tree_cons (get_identifier ("prolog_pad"), > + tree_cons ( NULL_TREE, pp_size, NULL_TREE), > + attributes); > + } ... here: else if (ATTRIBUTE_VALUE (pp_attr) > prolog_nop_pad_size) warning (OPT_Wattributes, "Prologue pad might be truncated: <FUNCTION_NAME>", node); Also, I suggest to issue a warning here if ipa-ra is not disabled, and adding a comment to the documentation about issues with interoperability of -fprolog-pad/__attribute__((prolog_pad)) and IPA RA.. > + } > + > /* If this is a function and the user used #pragma GCC optimize, add the > options to the attribute((optimize(...))) list. */ > if (TREE_CODE (*node) == FUNCTION_DECL && current_optimize_pragma) > diff --git a/gcc/c-family/c-common.c b/gcc/c-family/c-common.c > index f2846bb..278a99e 100644 > --- a/gcc/c-family/c-common.c > +++ b/gcc/c-family/c-common.c > @@ -393,6 +393,7 @@ static tree handle_designated_init_attribute (tree *, > tree, tree, int, bool *); > static tree handle_bnd_variable_size_attribute (tree *, tree, tree, int, bool > *); > static tree handle_bnd_legacy (tree *, tree, tree, int, bool *); > static tree handle_bnd_instrument (tree *, tree, tree, int, bool *); > +static tree handle_prolog_pad_attribute (tree *, tree, tree, int, bool *); > > static void check_nonnull_arg (void *, tree, unsigned HOST_WIDE_INT); > static bool nonnull_check_p (tree, unsigned HOST_WIDE_INT); > @@ -834,6 +835,8 @@ const struct attribute_spec c_common_attribute_table[] = > handle_bnd_legacy, false }, > { "bnd_instrument", 0, 0, true, false, false, > handle_bnd_instrument, false }, > + { "prolog_pad", 1, 1, false, true, true, > + handle_prolog_pad_attribute, false }, The patch also needs to document new attribute in doc/extend.texi > { NULL, 0, 0, false, false, false, NULL, false } > }; > > @@ -9687,6 +9690,14 @@ handle_designated_init_attribute (tree *node, tree > name, tree, int, > return NULL_TREE; > } > > +static tree > +handle_prolog_pad_attribute (tree *, tree, tree, int, > + bool *) > +{ > + /* Nothing to be done here. */ > + return NULL_TREE; > +} > + > > /* Check for valid arguments being passed to a function with FNTYPE. > There are NARGS arguments in the array ARGARRAY. LOC should be used for > diff --git a/gcc/c-family/c-opts.c b/gcc/c-family/c-opts.c > index fec58bc..0220faa 100644 > --- a/gcc/c-family/c-opts.c > +++ b/gcc/c-family/c-opts.c > @@ -536,6 +536,10 @@ c_common_handle_option (size_t scode, const char *arg, > int value, > cpp_opts->ext_numeric_literals = value; > break; > > + case OPT_fprolog_pad_: > + prolog_nop_pad_size = value; > + break; > + > case OPT_idirafter: > add_path (xstrdup (arg), AFTER, 0, true); > break; > diff --git a/gcc/c-family/c.opt b/gcc/c-family/c.opt > index 88038a0..804c654 100644 > --- a/gcc/c-family/c.opt > +++ b/gcc/c-family/c.opt > @@ -1427,6 +1427,10 @@ fpreprocessed > C ObjC C++ ObjC++ > Treat the input file as already preprocessed. > > +fprolog-pad= > +C ObjC C++ ObjC++ RejectNegative Joined UInteger Var(prolog_nop_pad_size) > Init(0) > +Pad NOPs before each function prolog > + > ftrack-macro-expansion > C ObjC C++ ObjC++ JoinedOrMissing RejectNegative UInteger > ; converted into ftrack-macro-expansion= > diff --git a/gcc/doc/invoke.texi b/gcc/doc/invoke.texi > index 2ed9285..463a5a5 100644 > --- a/gcc/doc/invoke.texi > +++ b/gcc/doc/invoke.texi > @@ -10387,6 +10387,16 @@ of the function name, it is considered to be a > match. For C99 and C++ > extended identifiers, the function name must be given in UTF-8, not > using universal character names. > > +@item -fprolog-pad=N > +@opindex fprolog-pad > +Generate a pad of N nops right at the beginning > +of each function, which can be used to patch in any desired > +instrumentation at run time, provided that the code segment > +is writeable. For run time identification, the starting addresses > +of these pads, which correspond to their respective functions, > +are additionally collected in the @code{__prolog_pads_loc} section > +of the resulting binary. ... Note that value of @code{__attribute__((prolog_pad(N)))} takes precedence over command-line option -fprolog_pad=N. > + > @end table > > > diff --git a/gcc/doc/tm.texi b/gcc/doc/tm.texi > index 745910f..f6f9d59 100644 > --- a/gcc/doc/tm.texi > +++ b/gcc/doc/tm.texi > @@ -4553,6 +4553,10 @@ will select the smallest suitable mode. > This section describes the macros that output function entry > (@dfn{prologue}) and exit (@dfn{epilogue}) code. > > +@deftypefn {Target Hook} void TARGET_ASM_PRINT_PROLOG_PAD (FILE *@var{file}, > unsigned HOST_WIDE_INT @var{pad_size}, bool @var{record_p}) > +Generate prologue pad > +@end deftypefn > + > @deftypefn {Target Hook} void TARGET_ASM_FUNCTION_PROLOGUE (FILE *@var{file}, > HOST_WIDE_INT @var{size}) > If defined, a function that outputs the assembler code for entry to a > function. The prologue is responsible for setting up the stack frame, > diff --git a/gcc/doc/tm.texi.in b/gcc/doc/tm.texi.in > index f31c763..b9585da 100644 > --- a/gcc/doc/tm.texi.in > +++ b/gcc/doc/tm.texi.in > @@ -3662,6 +3662,8 @@ will select the smallest suitable mode. > This section describes the macros that output function entry > (@dfn{prologue}) and exit (@dfn{epilogue}) code. > > +@hook TARGET_ASM_PRINT_PROLOG_PAD > + > @hook TARGET_ASM_FUNCTION_PROLOGUE > > @hook TARGET_ASM_FUNCTION_END_PROLOGUE > diff --git a/gcc/final.c b/gcc/final.c > index 55cf509..94bbf8f 100644 > --- a/gcc/final.c > +++ b/gcc/final.c > @@ -1754,6 +1754,7 @@ void > final_start_function (rtx_insn *first, FILE *file, > int optimize_p ATTRIBUTE_UNUSED) > { > + unsigned HOST_WIDE_INT pad_size = prolog_nop_pad_size; Declaration of pad_size can be moved under "if (prolog_pad_attr)" clause, since we don't need value of prolog_nop_pad_size here anymore (it's already encoded in the __attribute__). > block_depth = 0; > > this_is_asm_operands = 0; > @@ -1766,6 +1767,21 @@ final_start_function (rtx_insn *first, FILE *file, > > high_block_linenum = high_function_linenum = last_linenum; > > + tree prolog_pad_attr > + = lookup_attribute ("prolog_pad", TYPE_ATTRIBUTES (TREE_TYPE > (current_function_decl))); > + if (prolog_pad_attr) > + { > + tree prolog_pad_value = TREE_VALUE (TREE_VALUE (prolog_pad_attr)); > + > + if (tree_fits_uhwi_p (prolog_pad_value)) > + pad_size = tree_to_uhwi (prolog_pad_value); > + else > + gcc_unreachable (); > + > + } > + if (pad_size > 0) > + targetm.asm_out.print_prolog_pad (file, pad_size, true); Similarly, this clause can be moved inside "if (prolog_pad_attr)" clause. > + > if (flag_sanitize & SANITIZE_ADDRESS) > asan_function_start (); > > diff --git a/gcc/target.def b/gcc/target.def > index 20f2b32..e0a4cc4 100644 > --- a/gcc/target.def > +++ b/gcc/target.def > @@ -288,6 +288,12 @@ hidden, protected or internal visibility as specified by > @var{visibility}.", > void, (tree decl, int visibility), > default_assemble_visibility) > > +DEFHOOK > +(print_prolog_pad, > + "Generate prologue pad", > + void, (FILE *file, unsigned HOST_WIDE_INT pad_size, bool record_p), > + default_print_prolog_pad) > + > /* Output the assembler code for entry to a function. */ > DEFHOOK > (function_prologue, > diff --git a/gcc/targhooks.c b/gcc/targhooks.c > index a342277..41e9850 100644 > --- a/gcc/targhooks.c > +++ b/gcc/targhooks.c > @@ -1503,6 +1503,22 @@ default_use_by_pieces_infrastructure_p (unsigned > HOST_WIDE_INT size, > return move_by_pieces_ninsns (size, alignment, max_size + 1) < ratio; > } > > +void > +default_print_prolog_pad (FILE *file, unsigned HOST_WIDE_INT pad_size, > + bool record_p) > +{ > + if (record_p) > + fprintf (file, "1:"); > + for (unsigned i = 0; i < pad_size; ++i) > + fprintf (file, "\tnop\n"); > + if (record_p) > + { > + fprintf (file, "\t.section __prolog_pads_loc, \"a\",@progbits\n"); > + fprintf (file, "\t.quad 1b\n"); > + fprintf (file, "\t.previous\n"); > + } > +} > + > bool > default_profile_before_prologue (void) > { > diff --git a/gcc/targhooks.h b/gcc/targhooks.h > index 7687c39..6bb41c4 100644 > --- a/gcc/targhooks.h > +++ b/gcc/targhooks.h > @@ -200,6 +200,7 @@ extern bool default_use_by_pieces_infrastructure_p > (unsigned HOST_WIDE_INT, > enum by_pieces_operation, > bool); > > +extern void default_print_prolog_pad (FILE *, unsigned HOST_WIDE_INT , bool); > extern bool default_profile_before_prologue (void); > extern reg_class_t default_preferred_reload_class (rtx, reg_class_t); > extern reg_class_t default_preferred_output_reload_class (rtx, reg_class_t); > diff --git a/gcc/testsuite/c-c++-common/attribute-prolog_pad-1.c > b/gcc/testsuite/c-c++-common/attribute-prolog_pad-1.c > new file mode 100644 > index 0000000..7404dc5 > --- /dev/null > +++ b/gcc/testsuite/c-c++-common/attribute-prolog_pad-1.c > @@ -0,0 +1,15 @@ > +/* { dg-do compile } */ > + > +void f1(void) __attribute__((prolog_pad(1))); > +void f2(void) __attribute__((prolog_pad(2))); > + > +void > +f1 (void) > +{ > + f2 (); > +} > + > +void f2 (void) > +{ > + f1 (); > +} > diff --git a/gcc/toplev.c b/gcc/toplev.c > index 8979d26..2339ce8 100644 > --- a/gcc/toplev.c > +++ b/gcc/toplev.c > @@ -1580,8 +1580,10 @@ process_options (void) > } > > /* Do not use IPA optimizations for register allocation if profiler is active > + or prolog-pads are inserted for run-time instrumentation > or port does not emit prologue and epilogue as RTL. */ > - if (profile_flag || !targetm.have_prologue () || !targetm.have_epilogue ()) > + if (profile_flag || prolog_nop_pad_size || > + !targetm.have_prologue () || !targetm.have_epilogue ()) > flag_ipa_ra = 0; > > /* Enable -Werror=coverage-mismatch when -Werror and -Wno-error Thanks! -- Maxim Kuvyrkov www.linaro.org