Richard Biener <richard.guent...@gmail.com> writes:
> On Sat, Mar 17, 2018 at 11:45 AM, Richard Sandiford
> <richard.sandif...@linaro.org> wrote:
>> Index: gcc/tree-data-ref.c
>> ===================================================================
>> --- gcc/tree-data-ref.c 2018-02-14 13:14:36.268006810 +0000
>> +++ gcc/tree-data-ref.c 2018-03-17 10:43:42.544669549 +0000
>> @@ -5200,6 +5200,75 @@ dr_alignment (innermost_loop_behavior *d
>>    return alignment;
>>  }
>>
>> +/* If BASE is a pointer-typed SSA name, try to find the object that it
>> +   is based on.  Return this object X on success and store the alignment
>> +   in bytes of BASE - &X in *ALIGNMENT_OUT.  */
>> +
>> +static tree
>> +get_base_for_alignment_1 (tree base, unsigned int *alignment_out)
>> +{
>> +  if (TREE_CODE (base) != SSA_NAME || !POINTER_TYPE_P (TREE_TYPE (base)))
>> +    return NULL_TREE;
>> +
>> +  gimple *def = SSA_NAME_DEF_STMT (base);
>> +  base = analyze_scalar_evolution (loop_containing_stmt (def), base);
>
> I think it is an error to use the def stmt of base here.  Instead you
> need to pass down the point/loop of the use.  For the same reason you
> also want to instantiate parameters after analyzing the evolution.
>
> In the end, if the BB to be vectorized is contained in a loop nest we want to
> instantiate up to the point where (eventually) a DECL we can re-align appears,
> but using the containing loop for now looks OK.

Why's that necessary/better though?  We're not interested in the
evolution of the value at the point it*s used by the potential vector
code, but in how we get from the ultimate base (if there is one) to the
definition of the SSA name.

I don't think instantiating the SCEV would give any new information,
but it could lose some.  E.g. if we have:

  x = &foo;
  do
    x += 8;
  while (*y++ < 10)
  ...potential vector use of *x...

we wouldn't be able to express the address of x after the do-while
loop, because there's nothing that counts the number of iterations.
But the uninstantiated SCEV could tell us that x = &foo + N * 8 for
some (unknown) N.

(OK, so that doesn't actually work unless we take the effort
to look through loop-closed SSA form, but still...)

>> +  /* Peel chrecs and record the minimum alignment preserved by
>> +     all steps.  */
>> +  unsigned int alignment = MAX_OFILE_ALIGNMENT / BITS_PER_UNIT;
>> +  while (TREE_CODE (base) == POLYNOMIAL_CHREC)
>> +    {
>> +      unsigned int step_alignment = highest_pow2_factor (CHREC_RIGHT 
>> (base));
>> +      alignment = MIN (alignment, step_alignment);
>> +      base = CHREC_LEFT (base);
>> +    }
>> +
>> +  /* Punt if the expression is too complicated to handle.  */
>> + if (tree_contains_chrecs (base, NULL) || !POINTER_TYPE_P (TREE_TYPE
>> (base)))
>> +    return NULL_TREE;
>> +
>> +  /* Analyze the base to which the steps we peeled were applied.  */
>> +  poly_int64 bitsize, bitpos, bytepos;
>> +  machine_mode mode;
>> +  int unsignedp, reversep, volatilep;
>> +  tree offset;
>> +  base = get_inner_reference (build_fold_indirect_ref (base),
>
> ick :/
>
> what happens if you simply use get_pointer_alignment here and
> strip down POINTER_PLUS_EXPRs to the ultimate LHS?  (basically
> what get_pointer_alignment_1 does)  After all get_base_for_alignment_1
> itself only deals with plain SSA_NAMEs, not, say, &a_1->b.c or &MEM[a_1 + 4].

Yeah, but those have (hopefully) been handled by dr_analyze_innermost
already, which should have stripped off both the constant and variable
parts of the offset.  So I think SSA names are the only interesting
input.  The problem is that once we follow the definitions of an SSA
name through CHREC_LEFTs, we get a general address again.

get_pointer_alignment and get_pointer_alignment_1 don't do what we want
because they take the alignment of the existing object into account,
whereas here we explicitly want to ignore that and only calculate the
alignment of the offset.

I guess we could pass a flag down to ignore the current alignment though?

Thanks,
Richard

Reply via email to