On Sat, Mar 17, 2018 at 11:45 AM, Richard Sandiford <richard.sandif...@linaro.org> wrote: > This PR is another regression caused by the removal of the simple_iv > check in dr_analyze_innermost for BB analysis. Without splitting out > the step, we weren't able to find an underlying object whose alignment > could be increased. > > As with PR81635, I think the simple_iv was only handling one special > case of something that ought to be more general. The more general > thing here is that if the address can be analysed as a scalar > evolution, and if all updates preserve alignment N, it's possible > to align the address to N by increasing the alignment of the base > object to N. That applies also to outer loops, and to both loop > and BB analysis. > > I wasn't sure where the new functions ought to live, but tree-data-ref.c > seemed OK since (a) that already does scev analysis on addresses and > (b) you'd want to use dr_analyze_innermost first if you were analysing > a reference. > > Tested on aarch64-linux-gnu and x86_64-linux-gnu. OK to install? > > Richard > > > 2018-03-17 Richard Sandiford <richard.sandif...@linaro.org> > > gcc/ > PR tree-optimization/84005 > * tree-data-ref.h (get_base_for_alignment): Declare. > * tree-data-ref.c (get_base_for_alignment_1): New function. > (get_base_for_alignment): Likewise. > * tree-vect-data-refs.c (vect_compute_data_ref_alignment): Use > get_base_for_alignment to find a suitable base object, instead > of always using drb->base_address. > > gcc/testsuite/ > PR tree-optimization/84005 > * gcc.dg/vect/bb-slp-1.c: Make sure there is no message about > failing to force the alignment. > > Index: gcc/tree-data-ref.h > =================================================================== > --- gcc/tree-data-ref.h 2018-01-13 18:02:00.948360274 +0000 > +++ gcc/tree-data-ref.h 2018-03-17 10:43:42.544669549 +0000 > @@ -463,6 +463,7 @@ extern bool compute_all_dependences (vec > extern tree find_data_references_in_bb (struct loop *, basic_block, > vec<data_reference_p> *); > extern unsigned int dr_alignment (innermost_loop_behavior *); > +extern tree get_base_for_alignment (tree, unsigned int *); > > /* Return the alignment in bytes that DR is guaranteed to have at all > times. */ > Index: gcc/tree-data-ref.c > =================================================================== > --- gcc/tree-data-ref.c 2018-02-14 13:14:36.268006810 +0000 > +++ gcc/tree-data-ref.c 2018-03-17 10:43:42.544669549 +0000 > @@ -5200,6 +5200,75 @@ dr_alignment (innermost_loop_behavior *d > return alignment; > } > > +/* If BASE is a pointer-typed SSA name, try to find the object that it > + is based on. Return this object X on success and store the alignment > + in bytes of BASE - &X in *ALIGNMENT_OUT. */ > + > +static tree > +get_base_for_alignment_1 (tree base, unsigned int *alignment_out) > +{ > + if (TREE_CODE (base) != SSA_NAME || !POINTER_TYPE_P (TREE_TYPE (base))) > + return NULL_TREE; > + > + gimple *def = SSA_NAME_DEF_STMT (base); > + base = analyze_scalar_evolution (loop_containing_stmt (def), base);
I think it is an error to use the def stmt of base here. Instead you need to pass down the point/loop of the use. For the same reason you also want to instantiate parameters after analyzing the evolution. In the end, if the BB to be vectorized is contained in a loop nest we want to instantiate up to the point where (eventually) a DECL we can re-align appears, but using the containing loop for now looks OK. > + /* Peel chrecs and record the minimum alignment preserved by > + all steps. */ > + unsigned int alignment = MAX_OFILE_ALIGNMENT / BITS_PER_UNIT; > + while (TREE_CODE (base) == POLYNOMIAL_CHREC) > + { > + unsigned int step_alignment = highest_pow2_factor (CHREC_RIGHT (base)); > + alignment = MIN (alignment, step_alignment); > + base = CHREC_LEFT (base); > + } > + > + /* Punt if the expression is too complicated to handle. */ > + if (tree_contains_chrecs (base, NULL) || !POINTER_TYPE_P (TREE_TYPE > (base))) > + return NULL_TREE; > + > + /* Analyze the base to which the steps we peeled were applied. */ > + poly_int64 bitsize, bitpos, bytepos; > + machine_mode mode; > + int unsignedp, reversep, volatilep; > + tree offset; > + base = get_inner_reference (build_fold_indirect_ref (base), ick :/ what happens if you simply use get_pointer_alignment here and strip down POINTER_PLUS_EXPRs to the ultimate LHS? (basically what get_pointer_alignment_1 does) After all get_base_for_alignment_1 itself only deals with plain SSA_NAMEs, not, say, &a_1->b.c or &MEM[a_1 + 4]. Thanks, Richard. > + &bitsize, &bitpos, &offset, &mode, > + &unsignedp, &reversep, &volatilep); > + if (!base || !multiple_p (bitpos, BITS_PER_UNIT, &bytepos)) > + return NULL_TREE; > + > + /* Restrict the alignment to that guaranteed by the offsets. */ > + unsigned int bytepos_alignment = known_alignment (bytepos); > + if (bytepos_alignment != 0) > + alignment = MIN (alignment, bytepos_alignment); > + if (offset) > + { > + unsigned int offset_alignment = highest_pow2_factor (offset); > + alignment = MIN (alignment, offset_alignment); > + } > + > + *alignment_out = alignment; > + return base; > +} > + > +/* Return the object whose alignment would need to be changed in order > + to increase the alignment of ADDR. Store the maximum achievable > + alignment in *MAX_ALIGNMENT. */ > + > +tree > +get_base_for_alignment (tree addr, unsigned int *max_alignment) > +{ > + tree base = get_base_for_alignment_1 (addr, max_alignment); > + if (base) > + return base; > + > + if (TREE_CODE (addr) == ADDR_EXPR) > + addr = TREE_OPERAND (addr, 0); > + *max_alignment = MAX_OFILE_ALIGNMENT / BITS_PER_UNIT; > + return addr; > +} > + > /* Recursive helper function. */ > > static bool > Index: gcc/tree-vect-data-refs.c > =================================================================== > --- gcc/tree-vect-data-refs.c 2018-03-02 09:45:40.478265383 +0000 > +++ gcc/tree-vect-data-refs.c 2018-03-17 10:43:42.544669549 +0000 > @@ -957,11 +957,11 @@ vect_compute_data_ref_alignment (struct > > if (base_alignment < vector_alignment) > { > - tree base = drb->base_address; > - if (TREE_CODE (base) == ADDR_EXPR) > - base = TREE_OPERAND (base, 0); > - if (!vect_can_force_dr_alignment_p (base, > - vector_alignment * BITS_PER_UNIT)) > + unsigned int max_alignment; > + tree base = get_base_for_alignment (drb->base_address, &max_alignment); > + if (max_alignment < vector_alignment > + || !vect_can_force_dr_alignment_p (base, > + vector_alignment * > BITS_PER_UNIT)) > { > if (dump_enabled_p ()) > { > Index: gcc/testsuite/gcc.dg/vect/bb-slp-1.c > =================================================================== > --- gcc/testsuite/gcc.dg/vect/bb-slp-1.c 2017-11-09 15:15:06.846650880 > +0000 > +++ gcc/testsuite/gcc.dg/vect/bb-slp-1.c 2018-03-17 10:43:42.541669742 > +0000 > @@ -54,5 +54,5 @@ int main (void) > return 0; > } > > +/* { dg-final { scan-tree-dump-not "can't force alignment" "slp1" } } */ > /* { dg-final { scan-tree-dump-times "basic block vectorized" 1 "slp1" } } */ > -