On Fri, Mar 27, 2020 at 2:54 PM Richard Biener <richard.guent...@gmail.com> wrote: > > On Fri, Mar 27, 2020 at 2:04 PM Kito Cheng <kito.ch...@sifive.com> wrote: > > > > Hi Richard: > > > > The local variable alignment adjustment was removed at this commit: > > https://gcc.gnu.org/git/gitweb.cgi?p=gcc.git;h=26d7a5e690169ac04acde90070b0092c41b71c7e
Note this idn't really remove the adjustment but it delayed it until RTL expansion. /* Compute the byte alignment to use for DECL. Ignore alignment we can't do with expected alignment of the stack boundary. */ static unsigned int align_local_variable (tree decl, bool really_expand) { unsigned int align; if (TREE_CODE (decl) == SSA_NAME) align = TYPE_ALIGN (TREE_TYPE (decl)); else { align = LOCAL_DECL_ALIGNMENT (decl); /* Don't change DECL_ALIGN when called from estimated_stack_frame_size. That is done before IPA and could bump alignment based on host backend even for offloaded code which wants different LOCAL_DECL_ALIGNMENT. */ if (really_expand) SET_DECL_ALIGN (decl, align); here we should assert that we never lower a decls alignment since earlier optimization already made use of the alignment of a decl (which also means doing layout as far as alignment is concerned earlier will improve generated code). The documentation of LOCAL_DECL_ALIGNMENT says the purpose is to increase alignment. Only x86 implements LOCAL_DECL_ALIGNMENT, others just LOCAL_ALIGNMENT (x86 also implements that). The difference must be quite subtle ... Oh, and LOCAL_ALIGNMENT seems completely unused (besides providing the default implementation for LOCAL_DECL_ALIGNMENT). Huh. > > And it's a little bit indirectly, it called from > > estimate_stack_frame_size, call stack like: > > estimate_stack_frame_size -> expand_one_var -> add_stack_var -> > > align_local_variable > > > > I created a new pass for local variable alignment adjustment, and then > > Andrew Pinski suggest me can added into ipa_increase_alignment, > > because that doing similar things but for global variable. > > > > However the ipa_increase_alignment belong tree-vectorizer.c, which is > > weird place to adjust local variable alignment, so I moved out into a > > new file, so do you think just create a stand alone simple pass for > > doing that is better than this way? > > I do think that local variable layout probably doesn't belong in that IPA pass > but elsewhere (way earlier). But my main complaint was that the diff > doesn't show changes you made to the pass because it first and foremost > shows moving all the code. That makes reviewing the change impossible. > > So at least split the patch into one moving the pass to a separate file and > one doing the actual changes. > > Thanks, > Richard. > > > Thanks :) > > > > > > On Fri, Mar 27, 2020 at 8:33 PM Richard Biener > > <richard.guent...@gmail.com> wrote: > > > > > > On Fri, Mar 27, 2020 at 1:28 PM Kito Cheng <kito.ch...@sifive.com> wrote: > > > > > > > > - The alignment for local variable was adjust during > > > > estimate_stack_frame_size, > > > > however it seems wrong spot to adjust that. > > > > > > > > - So we must adjust at some point, and here is already a pass to adjust > > > > alignment, which is ipa-increase-alignment, used for tree > > > > vectorization. > > > > > > > > - We move out the pass from tree-vectorizer.c into > > > > ipa-increase-alignment.cc. > > > > > > > > - This patch fix gfortran.dg/pr45636.f90 for arm and riscv. > > > > > > > > - Regression test on riscv32/riscv64 and x86_64-linux-gnu, no new fail > > > > introduced. > > > > > > Sorry, but the diff is completely illegible due to the move to a > > > different file. > > > > > > I also dont' see that you remove local variable alignment adjustment from > > > estimate_stack_frame_size (but I would agree that adjusting there is > > > wrong). > > > > > > Please do not move the code and re-post. > > > > > > Richard. > > > > > > > gcc/ChangeLog > > > > > > > > PR target/90811 > > > > * Makefile.in (OBJS): Add ipa-increase-alignment.o. > > > > * tree-vectorizer.c (get_vec_alignment_for_type): Moved to > > > > ipa-increase-alignment.cc. > > > > (type_align_map): Ditto. > > > > (get_vec_alignment_for_array_type): Ditto. > > > > (get_vec_alignment_for_record_type): Ditto. > > > > (increase_alignment): Ditto. > > > > (pass_data_ipa_increase_alignment): Ditto. > > > > (pass_ipa_increase_alignment): Ditto. > > > > (make_pass_ipa_increase_alignment): Ditto. > > > > * ipa-increase-alignment.cc (get_vec_alignment_for_type) Moved > > > > from ipa-increase-alignment.cc. > > > > (type_align_map): Ditto. > > > > (get_vec_alignment_for_array_type): Ditto. > > > > (get_vec_alignment_for_record_type): Ditto. > > > > (increase_alignment): Ditto. > > > > (pass_data_ipa_increase_alignment): Ditto. > > > > (pass_ipa_increase_alignment): Ditto. > > > > (make_pass_ipa_increase_alignment): Ditto. > > > > (increase_alignment): New. > > > > (increase_alignment_local_var): New. > > > > (increase_alignment_global_var): New. > > > > --- > > > > gcc/Makefile.in | 1 + > > > > gcc/ipa-increase-alignment.cc | 251 ++++++++++++++++++++++++++++++++++ > > > > gcc/tree-vectorizer.c | 189 ------------------------- > > > > 3 files changed, 252 insertions(+), 189 deletions(-) > > > > create mode 100644 gcc/ipa-increase-alignment.cc > > > > > > > > diff --git a/gcc/Makefile.in b/gcc/Makefile.in > > > > index fa9923bb270..2c11252911c 100644 > > > > --- a/gcc/Makefile.in > > > > +++ b/gcc/Makefile.in > > > > @@ -1408,6 +1408,7 @@ OBJS = \ > > > > ipa-inline.o \ > > > > ipa-comdats.o \ > > > > ipa-visibility.o \ > > > > + ipa-increase-alignment.o \ > > > > ipa-inline-analysis.o \ > > > > ipa-inline-transform.o \ > > > > ipa-predicate.o \ > > > > diff --git a/gcc/ipa-increase-alignment.cc > > > > b/gcc/ipa-increase-alignment.cc > > > > new file mode 100644 > > > > index 00000000000..d09917c185f > > > > --- /dev/null > > > > +++ b/gcc/ipa-increase-alignment.cc > > > > @@ -0,0 +1,251 @@ > > > > +/* Vectorizer > > > > + Copyright (C) 2003-2020 Free Software Foundation, Inc. > > > > + Contributed by Dorit Naishlos <do...@il.ibm.com> > > > > + > > > > +This file is part of GCC. > > > > + > > > > +GCC is free software; you can redistribute it and/or modify it under > > > > +the terms of the GNU General Public License as published by the Free > > > > +Software Foundation; either version 3, or (at your option) any later > > > > +version. > > > > + > > > > +GCC is distributed in the hope that it will be useful, but WITHOUT ANY > > > > +WARRANTY; without even the implied warranty of MERCHANTABILITY or > > > > +FITNESS FOR A PARTICULAR PURPOSE. See the GNU General Public License > > > > +for more details. > > > > + > > > > +You should have received a copy of the GNU General Public License > > > > +along with GCC; see the file COPYING3. If not see > > > > +<http://www.gnu.org/licenses/>. */ > > > > + > > > > +#include "config.h" > > > > +#include "system.h" > > > > +#include "coretypes.h" > > > > +#include "backend.h" > > > > +#include "target.h" > > > > +#include "tree.h" > > > > +#include "gimple.h" > > > > +#include "tree-pass.h" > > > > +#include "cgraph.h" > > > > +#include "fold-const.h" > > > > +#include "gimple-iterator.h" > > > > +#include "tree-cfg.h" > > > > +#include "cfgloop.h" > > > > +#include "tree-vectorizer.h" > > > > +#include "tm_p.h" > > > > + > > > > + > > > > +/* Increase alignment of global arrays to improve vectorization > > > > potential. > > > > + TODO: > > > > + - Consider also structs that have an array field. > > > > + - Use ipa analysis to prune arrays that can't be vectorized? > > > > + This should involve global alignment analysis and in the future > > > > also > > > > + array padding. */ > > > > + > > > > +static unsigned get_vec_alignment_for_type (tree); > > > > +static hash_map<tree, unsigned> *type_align_map; > > > > + > > > > +/* Return alignment of array's vector type corresponding to scalar > > > > type. > > > > + 0 if no vector type exists. */ > > > > +static unsigned > > > > +get_vec_alignment_for_array_type (tree type) > > > > +{ > > > > + gcc_assert (TREE_CODE (type) == ARRAY_TYPE); > > > > + poly_uint64 array_size, vector_size; > > > > + > > > > + tree scalar_type = strip_array_types (type); > > > > + tree vectype = get_related_vectype_for_scalar_type (VOIDmode, > > > > scalar_type); > > > > + if (!vectype > > > > + || !poly_int_tree_p (TYPE_SIZE (type), &array_size) > > > > + || !poly_int_tree_p (TYPE_SIZE (vectype), &vector_size) > > > > + || maybe_lt (array_size, vector_size)) > > > > + return 0; > > > > + > > > > + return TYPE_ALIGN (vectype); > > > > +} > > > > + > > > > +/* Return alignment of field having maximum alignment of vector type > > > > + corresponding to it's scalar type. For now, we only consider fields > > > > whose > > > > + offset is a multiple of it's vector alignment. > > > > + 0 if no suitable field is found. */ > > > > +static unsigned > > > > +get_vec_alignment_for_record_type (tree type) > > > > +{ > > > > + gcc_assert (TREE_CODE (type) == RECORD_TYPE); > > > > + > > > > + unsigned max_align = 0, alignment; > > > > + HOST_WIDE_INT offset; > > > > + tree offset_tree; > > > > + > > > > + if (TYPE_PACKED (type)) > > > > + return 0; > > > > + > > > > + unsigned *slot = type_align_map->get (type); > > > > + if (slot) > > > > + return *slot; > > > > + > > > > + for (tree field = first_field (type); > > > > + field != NULL_TREE; > > > > + field = DECL_CHAIN (field)) > > > > + { > > > > + /* Skip if not FIELD_DECL or if alignment is set by user. */ > > > > + if (TREE_CODE (field) != FIELD_DECL > > > > + || DECL_USER_ALIGN (field) > > > > + || DECL_ARTIFICIAL (field)) > > > > + continue; > > > > + > > > > + /* We don't need to process the type further if offset is > > > > variable, > > > > + since the offsets of remaining members will also be variable. > > > > */ > > > > + if (TREE_CODE (DECL_FIELD_OFFSET (field)) != INTEGER_CST > > > > + || TREE_CODE (DECL_FIELD_BIT_OFFSET (field)) != INTEGER_CST) > > > > + break; > > > > + > > > > + /* Similarly stop processing the type if offset_tree > > > > + does not fit in unsigned HOST_WIDE_INT. */ > > > > + offset_tree = bit_position (field); > > > > + if (!tree_fits_uhwi_p (offset_tree)) > > > > + break; > > > > + > > > > + offset = tree_to_uhwi (offset_tree); > > > > + alignment = get_vec_alignment_for_type (TREE_TYPE (field)); > > > > + > > > > + /* Get maximum alignment of vectorized field/array among those > > > > members > > > > + whose offset is multiple of the vector alignment. */ > > > > + if (alignment > > > > + && (offset % alignment == 0) > > > > + && (alignment > max_align)) > > > > + max_align = alignment; > > > > + } > > > > + > > > > + type_align_map->put (type, max_align); > > > > + return max_align; > > > > +} > > > > + > > > > +/* Return alignment of vector type corresponding to decl's scalar type > > > > + or 0 if it doesn't exist or the vector alignment is lesser than > > > > + decl's alignment. */ > > > > +static unsigned > > > > +get_vec_alignment_for_type (tree type) > > > > +{ > > > > + if (type == NULL_TREE) > > > > + return 0; > > > > + > > > > + gcc_assert (TYPE_P (type)); > > > > + > > > > + static unsigned alignment = 0; > > > > + switch (TREE_CODE (type)) > > > > + { > > > > + case ARRAY_TYPE: > > > > + alignment = get_vec_alignment_for_array_type (type); > > > > + break; > > > > + case RECORD_TYPE: > > > > + alignment = get_vec_alignment_for_record_type (type); > > > > + break; > > > > + default: > > > > + alignment = 0; > > > > + break; > > > > + } > > > > + > > > > + return (alignment > TYPE_ALIGN (type)) ? alignment : 0; > > > > +} > > > > + > > > > +/* Adjust alignment for local variable. */ > > > > +static void > > > > +increase_alignment_local_var (void) > > > > +{ > > > > + size_t i; > > > > + tree var; > > > > + struct cgraph_node *node; > > > > + unsigned int align; > > > > + > > > > + FOR_EACH_FUNCTION_WITH_GIMPLE_BODY (node) > > > > + { > > > > + function *fun = node->get_fun (); > > > > + FOR_EACH_LOCAL_DECL (fun, i, var) > > > > + { > > > > + align = LOCAL_DECL_ALIGNMENT (var); > > > > + > > > > + SET_DECL_ALIGN (var, align); > > > > + } > > > > + } > > > > +} > > > > + > > > > +/* Adjust alignment for global variable, only used for tree > > > > vectorization > > > > + currently. */ > > > > +static void > > > > +increase_alignment_global_var (void) > > > > +{ > > > > + if (!(flag_section_anchors && flag_tree_loop_vectorize)) > > > > + return; > > > > + > > > > + varpool_node *vnode; > > > > + > > > > + vect_location = dump_user_location_t (); > > > > + type_align_map = new hash_map<tree, unsigned>; > > > > + > > > > + /* Increase the alignment of all global arrays for vectorization. */ > > > > + FOR_EACH_DEFINED_VARIABLE (vnode) > > > > + { > > > > + tree decl = vnode->decl; > > > > + unsigned int alignment; > > > > + > > > > + if ((decl_in_symtab_p (decl) > > > > + && !symtab_node::get (decl)->can_increase_alignment_p ()) > > > > + || DECL_USER_ALIGN (decl) || DECL_ARTIFICIAL (decl)) > > > > + continue; > > > > + > > > > + alignment = get_vec_alignment_for_type (TREE_TYPE (decl)); > > > > + if (alignment && vect_can_force_dr_alignment_p (decl, alignment)) > > > > + { > > > > + vnode->increase_alignment (alignment); > > > > + if (dump_enabled_p ()) > > > > + dump_printf (MSG_NOTE, "Increasing alignment of decl: > > > > %T\n", decl); > > > > + } > > > > + } > > > > + > > > > + delete type_align_map; > > > > +} > > > > + > > > > +/* Entry point to increase_alignment pass. */ > > > > +static unsigned int > > > > +increase_alignment (void) > > > > +{ > > > > + increase_alignment_local_var (); > > > > + increase_alignment_global_var (); > > > > + return 0; > > > > +} > > > > + > > > > + > > > > +namespace { > > > > + > > > > +const pass_data pass_data_ipa_increase_alignment = > > > > +{ > > > > + SIMPLE_IPA_PASS, /* type */ > > > > + "increase_alignment", /* name */ > > > > + OPTGROUP_LOOP | OPTGROUP_VEC, /* optinfo_flags */ > > > > + TV_IPA_OPT, /* tv_id */ > > > > + 0, /* properties_required */ > > > > + 0, /* properties_provided */ > > > > + 0, /* properties_destroyed */ > > > > + 0, /* todo_flags_start */ > > > > + 0, /* todo_flags_finish */ > > > > +}; > > > > + > > > > +class pass_ipa_increase_alignment : public simple_ipa_opt_pass > > > > +{ > > > > +public: > > > > + pass_ipa_increase_alignment (gcc::context *ctxt) > > > > + : simple_ipa_opt_pass (pass_data_ipa_increase_alignment, ctxt) > > > > + {} > > > > + > > > > + virtual unsigned int execute (function *) { return > > > > increase_alignment (); } > > > > + > > > > +}; // class pass_ipa_increase_alignment > > > > + > > > > +} // anon namespace > > > > + > > > > +simple_ipa_opt_pass * > > > > +make_pass_ipa_increase_alignment (gcc::context *ctxt) > > > > +{ > > > > + return new pass_ipa_increase_alignment (ctxt); > > > > +} > > > > diff --git a/gcc/tree-vectorizer.c b/gcc/tree-vectorizer.c > > > > index 8f9444d58a3..e9caf91e174 100644 > > > > --- a/gcc/tree-vectorizer.c > > > > +++ b/gcc/tree-vectorizer.c > > > > @@ -1340,195 +1340,6 @@ make_pass_slp_vectorize (gcc::context *ctxt) > > > > return new pass_slp_vectorize (ctxt); > > > > } > > > > > > > > - > > > > -/* Increase alignment of global arrays to improve vectorization > > > > potential. > > > > - TODO: > > > > - - Consider also structs that have an array field. > > > > - - Use ipa analysis to prune arrays that can't be vectorized? > > > > - This should involve global alignment analysis and in the future > > > > also > > > > - array padding. */ > > > > - > > > > -static unsigned get_vec_alignment_for_type (tree); > > > > -static hash_map<tree, unsigned> *type_align_map; > > > > - > > > > -/* Return alignment of array's vector type corresponding to scalar > > > > type. > > > > - 0 if no vector type exists. */ > > > > -static unsigned > > > > -get_vec_alignment_for_array_type (tree type) > > > > -{ > > > > - gcc_assert (TREE_CODE (type) == ARRAY_TYPE); > > > > - poly_uint64 array_size, vector_size; > > > > - > > > > - tree scalar_type = strip_array_types (type); > > > > - tree vectype = get_related_vectype_for_scalar_type (VOIDmode, > > > > scalar_type); > > > > - if (!vectype > > > > - || !poly_int_tree_p (TYPE_SIZE (type), &array_size) > > > > - || !poly_int_tree_p (TYPE_SIZE (vectype), &vector_size) > > > > - || maybe_lt (array_size, vector_size)) > > > > - return 0; > > > > - > > > > - return TYPE_ALIGN (vectype); > > > > -} > > > > - > > > > -/* Return alignment of field having maximum alignment of vector type > > > > - corresponding to it's scalar type. For now, we only consider fields > > > > whose > > > > - offset is a multiple of it's vector alignment. > > > > - 0 if no suitable field is found. */ > > > > -static unsigned > > > > -get_vec_alignment_for_record_type (tree type) > > > > -{ > > > > - gcc_assert (TREE_CODE (type) == RECORD_TYPE); > > > > - > > > > - unsigned max_align = 0, alignment; > > > > - HOST_WIDE_INT offset; > > > > - tree offset_tree; > > > > - > > > > - if (TYPE_PACKED (type)) > > > > - return 0; > > > > - > > > > - unsigned *slot = type_align_map->get (type); > > > > - if (slot) > > > > - return *slot; > > > > - > > > > - for (tree field = first_field (type); > > > > - field != NULL_TREE; > > > > - field = DECL_CHAIN (field)) > > > > - { > > > > - /* Skip if not FIELD_DECL or if alignment is set by user. */ > > > > - if (TREE_CODE (field) != FIELD_DECL > > > > - || DECL_USER_ALIGN (field) > > > > - || DECL_ARTIFICIAL (field)) > > > > - continue; > > > > - > > > > - /* We don't need to process the type further if offset is > > > > variable, > > > > - since the offsets of remaining members will also be variable. > > > > */ > > > > - if (TREE_CODE (DECL_FIELD_OFFSET (field)) != INTEGER_CST > > > > - || TREE_CODE (DECL_FIELD_BIT_OFFSET (field)) != INTEGER_CST) > > > > - break; > > > > - > > > > - /* Similarly stop processing the type if offset_tree > > > > - does not fit in unsigned HOST_WIDE_INT. */ > > > > - offset_tree = bit_position (field); > > > > - if (!tree_fits_uhwi_p (offset_tree)) > > > > - break; > > > > - > > > > - offset = tree_to_uhwi (offset_tree); > > > > - alignment = get_vec_alignment_for_type (TREE_TYPE (field)); > > > > - > > > > - /* Get maximum alignment of vectorized field/array among those > > > > members > > > > - whose offset is multiple of the vector alignment. */ > > > > - if (alignment > > > > - && (offset % alignment == 0) > > > > - && (alignment > max_align)) > > > > - max_align = alignment; > > > > - } > > > > - > > > > - type_align_map->put (type, max_align); > > > > - return max_align; > > > > -} > > > > - > > > > -/* Return alignment of vector type corresponding to decl's scalar type > > > > - or 0 if it doesn't exist or the vector alignment is lesser than > > > > - decl's alignment. */ > > > > -static unsigned > > > > -get_vec_alignment_for_type (tree type) > > > > -{ > > > > - if (type == NULL_TREE) > > > > - return 0; > > > > - > > > > - gcc_assert (TYPE_P (type)); > > > > - > > > > - static unsigned alignment = 0; > > > > - switch (TREE_CODE (type)) > > > > - { > > > > - case ARRAY_TYPE: > > > > - alignment = get_vec_alignment_for_array_type (type); > > > > - break; > > > > - case RECORD_TYPE: > > > > - alignment = get_vec_alignment_for_record_type (type); > > > > - break; > > > > - default: > > > > - alignment = 0; > > > > - break; > > > > - } > > > > - > > > > - return (alignment > TYPE_ALIGN (type)) ? alignment : 0; > > > > -} > > > > - > > > > -/* Entry point to increase_alignment pass. */ > > > > -static unsigned int > > > > -increase_alignment (void) > > > > -{ > > > > - varpool_node *vnode; > > > > - > > > > - vect_location = dump_user_location_t (); > > > > - type_align_map = new hash_map<tree, unsigned>; > > > > - > > > > - /* Increase the alignment of all global arrays for vectorization. */ > > > > - FOR_EACH_DEFINED_VARIABLE (vnode) > > > > - { > > > > - tree decl = vnode->decl; > > > > - unsigned int alignment; > > > > - > > > > - if ((decl_in_symtab_p (decl) > > > > - && !symtab_node::get (decl)->can_increase_alignment_p ()) > > > > - || DECL_USER_ALIGN (decl) || DECL_ARTIFICIAL (decl)) > > > > - continue; > > > > - > > > > - alignment = get_vec_alignment_for_type (TREE_TYPE (decl)); > > > > - if (alignment && vect_can_force_dr_alignment_p (decl, alignment)) > > > > - { > > > > - vnode->increase_alignment (alignment); > > > > - if (dump_enabled_p ()) > > > > - dump_printf (MSG_NOTE, "Increasing alignment of decl: > > > > %T\n", decl); > > > > - } > > > > - } > > > > - > > > > - delete type_align_map; > > > > - return 0; > > > > -} > > > > - > > > > - > > > > -namespace { > > > > - > > > > -const pass_data pass_data_ipa_increase_alignment = > > > > -{ > > > > - SIMPLE_IPA_PASS, /* type */ > > > > - "increase_alignment", /* name */ > > > > - OPTGROUP_LOOP | OPTGROUP_VEC, /* optinfo_flags */ > > > > - TV_IPA_OPT, /* tv_id */ > > > > - 0, /* properties_required */ > > > > - 0, /* properties_provided */ > > > > - 0, /* properties_destroyed */ > > > > - 0, /* todo_flags_start */ > > > > - 0, /* todo_flags_finish */ > > > > -}; > > > > - > > > > -class pass_ipa_increase_alignment : public simple_ipa_opt_pass > > > > -{ > > > > -public: > > > > - pass_ipa_increase_alignment (gcc::context *ctxt) > > > > - : simple_ipa_opt_pass (pass_data_ipa_increase_alignment, ctxt) > > > > - {} > > > > - > > > > - /* opt_pass methods: */ > > > > - virtual bool gate (function *) > > > > - { > > > > - return flag_section_anchors && flag_tree_loop_vectorize; > > > > - } > > > > - > > > > - virtual unsigned int execute (function *) { return > > > > increase_alignment (); } > > > > - > > > > -}; // class pass_ipa_increase_alignment > > > > - > > > > -} // anon namespace > > > > - > > > > -simple_ipa_opt_pass * > > > > -make_pass_ipa_increase_alignment (gcc::context *ctxt) > > > > -{ > > > > - return new pass_ipa_increase_alignment (ctxt); > > > > -} > > > > - > > > > /* If the condition represented by T is a comparison or the SSA name > > > > result of a comparison, extract the comparison's operands. > > > > Represent > > > > T as NE_EXPR <T, 0> otherwise. */ > > > > -- > > > > 2.25.2 > > > >