On Mon, Mar 19, 2018 at 10:27 PM, Richard Sandiford <richard.sandif...@linaro.org> wrote: > Richard Biener <richard.guent...@gmail.com> writes: >> On Sat, Mar 17, 2018 at 11:45 AM, Richard Sandiford >> <richard.sandif...@linaro.org> wrote: >>> Index: gcc/tree-data-ref.c >>> =================================================================== >>> --- gcc/tree-data-ref.c 2018-02-14 13:14:36.268006810 +0000 >>> +++ gcc/tree-data-ref.c 2018-03-17 10:43:42.544669549 +0000 >>> @@ -5200,6 +5200,75 @@ dr_alignment (innermost_loop_behavior *d >>> return alignment; >>> } >>> >>> +/* If BASE is a pointer-typed SSA name, try to find the object that it >>> + is based on. Return this object X on success and store the alignment >>> + in bytes of BASE - &X in *ALIGNMENT_OUT. */ >>> + >>> +static tree >>> +get_base_for_alignment_1 (tree base, unsigned int *alignment_out) >>> +{ >>> + if (TREE_CODE (base) != SSA_NAME || !POINTER_TYPE_P (TREE_TYPE (base))) >>> + return NULL_TREE; >>> + >>> + gimple *def = SSA_NAME_DEF_STMT (base); >>> + base = analyze_scalar_evolution (loop_containing_stmt (def), base); >> >> I think it is an error to use the def stmt of base here. Instead you >> need to pass down the point/loop of the use. For the same reason you >> also want to instantiate parameters after analyzing the evolution. >> >> In the end, if the BB to be vectorized is contained in a loop nest we want to >> instantiate up to the point where (eventually) a DECL we can re-align >> appears, >> but using the containing loop for now looks OK. > > Why's that necessary/better though? We're not interested in the > evolution of the value at the point it*s used by the potential vector > code, but in how we get from the ultimate base (if there is one) to the > definition of the SSA name.
Hmm, ok. > I don't think instantiating the SCEV would give any new information, > but it could lose some. E.g. if we have: > > x = &foo; > do > x += 8; > while (*y++ < 10) > ...potential vector use of *x... > > we wouldn't be able to express the address of x after the do-while > loop, because there's nothing that counts the number of iterations. > But the uninstantiated SCEV could tell us that x = &foo + N * 8 for > some (unknown) N. Not sure that it works that way. I'm not 100% sure what kind of parameters are left symbolic, and if analysis loop and instantiation "loop" are the same I guess it doesn't make any difference... > (OK, so that doesn't actually work unless we take the effort > to look through loop-closed SSA form, but still...) > >>> + /* Peel chrecs and record the minimum alignment preserved by >>> + all steps. */ >>> + unsigned int alignment = MAX_OFILE_ALIGNMENT / BITS_PER_UNIT; >>> + while (TREE_CODE (base) == POLYNOMIAL_CHREC) >>> + { >>> + unsigned int step_alignment = highest_pow2_factor (CHREC_RIGHT >>> (base)); >>> + alignment = MIN (alignment, step_alignment); >>> + base = CHREC_LEFT (base); >>> + } >>> + >>> + /* Punt if the expression is too complicated to handle. */ >>> + if (tree_contains_chrecs (base, NULL) || !POINTER_TYPE_P (TREE_TYPE >>> (base))) >>> + return NULL_TREE; >>> + >>> + /* Analyze the base to which the steps we peeled were applied. */ >>> + poly_int64 bitsize, bitpos, bytepos; >>> + machine_mode mode; >>> + int unsignedp, reversep, volatilep; >>> + tree offset; >>> + base = get_inner_reference (build_fold_indirect_ref (base), >> >> ick :/ >> >> what happens if you simply use get_pointer_alignment here and >> strip down POINTER_PLUS_EXPRs to the ultimate LHS? (basically >> what get_pointer_alignment_1 does) After all get_base_for_alignment_1 >> itself only deals with plain SSA_NAMEs, not, say, &a_1->b.c or &MEM[a_1 + 4]. > > Yeah, but those have (hopefully) been handled by dr_analyze_innermost > already, which should have stripped off both the constant and variable > parts of the offset. So I think SSA names are the only interesting > input. The problem is that once we follow the definitions of an SSA > name through CHREC_LEFTs, we get a general address again. > > get_pointer_alignment and get_pointer_alignment_1 don't do what we want > because they take the alignment of the existing object into account, > whereas here we explicitly want to ignore that and only calculate the > alignment of the offset. Ah, indeed - I missed that fact. > I guess we could pass a flag down to ignore the current alignment though? But we need the base anyway... so, given we can only ever re-align decls and never plain pointers instead of using build_fold_indirect_ref do if (TREE_CODE (base) != ADDR_EXPR) return NULL_TREE; else use TREE_OPERAND (base, 0)? Richard. > > Thanks, > Richard