On Tue, Aug 22, 2017 at 4:19 PM, Richard Sandiford
<[email protected]> wrote:
> Richard Biener <[email protected]> writes:
>> On Fri, Aug 18, 2017 at 12:30 PM, Richard Biener
>> <[email protected]> wrote:
>>> On Thu, Aug 17, 2017 at 2:24 PM, Bin.Cheng <[email protected]> wrote:
>>>> On Thu, Aug 17, 2017 at 12:35 PM, Richard Sandiford
>>>> <[email protected]> wrote:
>>>>> "Bin.Cheng" <[email protected]> writes:
>>>>>> On Wed, Aug 16, 2017 at 6:50 PM, Richard Sandiford
>>>>>> <[email protected]> wrote:
>>>>>>> "Bin.Cheng" <[email protected]> writes:
>>>>>>>> On Wed, Aug 16, 2017 at 5:00 PM, Richard Sandiford
>>>>>>>> <[email protected]> wrote:
>>>>>>>>> "Bin.Cheng" <[email protected]> writes:
>>>>>>>>>> On Wed, Aug 16, 2017 at 2:38 PM, Richard Sandiford
>>>>>>>>>> <[email protected]> wrote:
>>>>>>>>>>> The first loop in the testcase regressed after my recent changes to
>>>>>>>>>>> dr_analyze_innermost. Previously we would treat "i" as an iv even
>>>>>>>>>>> for bb analysis and end up with:
>>>>>>>>>>>
>>>>>>>>>>> DR_BASE_ADDRESS: p or q
>>>>>>>>>>> DR_OFFSET: 0
>>>>>>>>>>> DR_INIT: 0 or 4
>>>>>>>>>>> DR_STEP: 16
>>>>>>>>>>>
>>>>>>>>>>> We now always keep the step as 0 instead, so for an int "i" we'd
>>>>>>>>>>> have:
>>>>>>>>>>>
>>>>>>>>>>> DR_BASE_ADDRESS: p or q
>>>>>>>>>>> DR_OFFSET: (intptr_t) i
>>>>>>>>>>> DR_INIT: 0 or 4
>>>>>>>>>>> DR_STEP: 0
>>>>>>>>>>>
>>>>>>>>>>> This is also what we'd like to have for the unsigned "i", but the
>>>>>>>>>>> problem is that strip_constant_offset thinks that the "i + 1" in
>>>>>>>>>>> "(intptr_t) (i + 1)" could wrap and so doesn't peel off the "+ 1".
>>>>>>>>>>> The [i + 1] accesses therefore have a DR_OFFSET equal to the SSA
>>>>>>>>>>> name that holds "(intptr_t) (i + 1)", meaning that the accesses no
>>>>>>>>>>> longer seem to be related to the [i] ones.
>>>>>>>>>>
>>>>>>>>>> I didn't read the change in detail, so sorry if I misunderstood
>>>>>>>>>> the issue.
>>>>>>>>>> I made changes in scev to better fold type conversions using
>>>>>>>>>> various sources of information, for example vrp, niters,
>>>>>>>>>> undefined overflow behavior, etc. In theory this information
>>>>>>>>>> should be available to other optimizers without querying scev.
>>>>>>>>>> For the mentioned test, vrp should compute an accurate range
>>>>>>>>>> for "i" so that we can fold (intptr_t) (i + 1) without worrying
>>>>>>>>>> about overflow. Note we don't do it in generic folding because
>>>>>>>>>> (intptr_t) (i) + 1 could be more expensive (especially in the
>>>>>>>>>> case of (T)(i + j)), or because the CST part is in bigger
>>>>>>>>>> precision after conversion.
>>>>>>>>>> But such folding is wanted in several places, e.g. IVOPTs. To
>>>>>>>>>> provide such an interface, we changed tree-affine and made it
>>>>>>>>>> fold aggressively. I am curious whether it's possible to use
>>>>>>>>>> aff_tree to implement strip_constant_offset here, since
>>>>>>>>>> aggressive folding is wanted. After all, the additional chrec
>>>>>>>>>> looks a little heavyweight for the simple test.
>>>>>>>>>
>>>>>>>>> Yeah, using aff_tree does work here when the bounds are constant.
>>>>>>>>> It doesn't look like it works for things like:
>>>>>>>>>
>>>>>>>>> double p[1000];
>>>>>>>>> double q[1000];
>>>>>>>>>
>>>>>>>>> void
>>>>>>>>> f4 (unsigned int n)
>>>>>>>>> {
>>>>>>>>> for (unsigned int i = 0; i < n; i += 4)
>>>>>>>>> {
>>>>>>>>> double a = q[i] + p[i];
>>>>>>>>> double b = q[i + 1] + p[i + 1];
>>>>>>>>> q[i] = a;
>>>>>>>>> q[i + 1] = b;
>>>>>>>>> }
>>>>>>>>> }
>>>>>>>>>
>>>>>>>>> though, where the bounds on the global arrays guarantee that
>>>>>>>>> [i + 1] can't overflow, even though "n" is unconstrained.
>>>>>>>>> The patch as posted handles this case too.
>>>>>>>> BTW is this a missed optimization in value range analysis? The range
>>>>>>>> information for i should flow in a way like: array boundary -> niters
>>>>>>>> -> scev/vrp.
>>>>>>>> I think that's what niters/scev do in analysis.
>>>>>>>
>>>>>>> Yeah, maybe :-) It looks like the problem is that when SLP runs,
>>>>>>> the previous VRP pass came before loop header copying, so the (single)
>>>>>>> header has to cope with the n == 0 case. Thus we get:
>>>>>> Ah, there are several passes in between vrp and pass_ch; not sure
>>>>>> if any of them depends on vrp intensively. I would suggest
>>>>>> reordering the two passes, or a standalone VRP interface that
>>>>>> updates information for the loop region after the header is
>>>>>> copied. This is a non-trivial issue that needs to be fixed.
>>>>>> The niters analyzer relies heavily on
>>>>>> simplify_using_initial_conditions to get the same information,
>>>>>> which in my opinion should be provided by VRP. Though this won't
>>>>>> obsolete simplify_using_initial_conditions, because VRP is weak
>>>>>> in symbolic ranges...
>>>>>>
>>>>>>>
>>>>>>> Visiting statement:
>>>>>>> i_15 = ASSERT_EXPR <i_6, i_6 < n_9(D)>;
>>>>>>> Intersecting
>>>>>>> [0, n_9(D) + 4294967295] EQUIVALENCES: { i_6 } (1 elements)
>>>>>>> and
>>>>>>> [0, 0]
>>>>>>> to
>>>>>>> [0, 0] EQUIVALENCES: { i_6 } (1 elements)
>>>>>>> Intersecting
>>>>>>> [0, 0] EQUIVALENCES: { i_6 } (1 elements)
>>>>>>> and
>>>>>>> [0, 1000]
>>>>>>> to
>>>>>>> [0, 0] EQUIVALENCES: { i_6 } (1 elements)
>>>>>>> Found new range for i_15: [0, 0]
>>>>>>>
>>>>>>> Visiting statement:
>>>>>>> _3 = i_15 + 1;
>>>>>>> Match-and-simplified i_15 + 1 to 1
>>>>>>> Intersecting
>>>>>>> [1, 1]
>>>>>>> and
>>>>>>> [0, +INF]
>>>>>>> to
>>>>>>> [1, 1]
>>>>>>> Found new range for _3: [1, 1]
>>>>>>>
>>>>>>> (where _3 is the index we care about), followed by:
>>>>>>>
>>>>>>> Visiting statement:
>>>>>>> i_15 = ASSERT_EXPR <i_6, i_6 < n_9(D)>;
>>>>>>> Intersecting
>>>>>>> [0, n_9(D) + 4294967295] EQUIVALENCES: { i_6 } (1 elements)
>>>>>>> and
>>>>>>> [0, 4]
>>>>>>> to
>>>>>>> [0, n_9(D) + 4294967295] EQUIVALENCES: { i_6 } (1 elements)
>>>>>>> Intersecting
>>>>>>> [0, n_9(D) + 4294967295] EQUIVALENCES: { i_6 } (1 elements)
>>>>>>> and
>>>>>>> [0, 1000]
>>>>>>> to
>>>>>>> [0, n_9(D) + 4294967295] EQUIVALENCES: { i_6 } (1 elements)
>>>>>>> Found new range for i_15: [0, n_9(D) + 4294967295]
>>>>>>>
>>>>>>> Visiting statement:
>>>>>>> _3 = i_15 + 1;
>>>>>>> Intersecting
>>>>>>> [0, +INF]
>>>>>>> and
>>>>>>> [0, +INF]
>>>>>>> to
>>>>>>> [0, +INF]
>>>>>>> Found new range for _3: [0, +INF]
>>>>>>>
>>>>>>> I guess in this case it would be better to intersect the i_15 ranges
>>>>>>> to [0, 1000] rather than [0, n_9(D) + 4294967295].
>>>>>>>
>>>>>>> It does work if another VRP pass runs after CH. But even then,
>>>>>>> is it a good idea to rely on the range info being kept up-to-date
>>>>>>> all the way through to SLP? A lot happens in between.
>>>>>> To some extent, yes. Now I understand that SCEV uses niters
>>>>>> information to prove no_overflow. Niters analysis does infer
>>>>>> better information from array boundaries, while VRP fails to do
>>>>>> that. I don't worry much about the gap between the vrp pass and
>>>>>> slp; it's basically the same as niters. Both are analyzed at one
>>>>>> point and meant to be used by following passes. It's also not
>>>>>> common for vrp information to become invalid, given we are in SSA
>>>>>> form?
>>>>>
>>>>> I'm not worried so much about vrp information becoming invalid on
>>>>> an SSA name that existed when VRP was run. It's more a question
>>>>> of what happens about SSA names that get introduced after VRP,
>>>>> e.g. by things like dom, reassoc, PRE, etc.
>>>> For the induction variables in question, these passes shouldn't
>>>> aggressively introduce new variables, I think.
>>>>>
>>>>>> Now that the data address is not analyzed against the loop, VRP
>>>>>> would be the only information we can use, unless we add back scev
>>>>>> analysis. IMHO, the patch is doing so in a different way than
>>>>>> before. It requires an additional chrec data structure. I
>>>>>> remember the previous patch enabled more slp vectorization; is
>>>>>> that because of the "step" information from scev?
>>>>>
>>>>> Do you mean that:
>>>>>
>>>>> 2017-07-03 Richard Sandiford <[email protected]>
>>>>>
>>>>> * tree-data-ref.c (dr_analyze_innermost): Replace the "nest"
>>>>> parameter with a "loop" parameter and use it instead of the
>>>>> loop containing DR_STMT. Don't check simple_iv when doing
>>>>> BB analysis. Describe the two analysis modes in the comment.
>>>>>
>>>>> enabled more SLP vectorisation in bb-slp-pr65935.c? That was due
>>>>> to us not doing IV analysis for BB vectorisation, and ensuring that
>>>>> the step was always zero.
>>>> Which means the vectorizer code can handle an offset that hasn't
>>>> been IV-analyzed, but not the analyzed form?
>>>>>
>>>>>> In this patch, step information is simply discarded. I am
>>>>>> wondering if possible to always analyze scev within innermost loop for
>>>>>> slp while discards step information.
>>>>>
>>>>> Well, we don't calculate a step for bb analysis (i.e. it's not the
>>>>> case that the patch calculates step information and then simply
>>>>> discards it).
>>>>> I don't see how that would work. For bb analysis, the DR_OFFSET + DR_INIT
>>>>> has to give the offset for every execution of the block, not just the
>>>>> first iteration of the containing loop. So if we get back a nonzero
>>>>> step, we have to do something with it.
>>>> Yeah.
>>>>>
>>>>> But:
>>>>>
>>>>> (a) the old simple_iv analysis is more expensive than simply calling
>>>>> analyze_scev, so I don't think this is a win in terms of complexity.
>>>>>
>>>>> (b) for bb analysis, there's nothing particularly special about the
>>>>> innermost loop. It makes more sense to analyse it in the innermost
>>>>> loop for which the offset is invariant, as shown by the second
>>>>> testcase in the patch.
>>>>>
>>>>> (c) The patch helps with loop vectorisation too, since analysing the
>>>>> starting DR_OFFSET in the context of the containing loop can help
>>>>> in a similar way as analysing the full offset does for SLP.
>>>>
>>>> I have to admit I am not very keen on this method. It complicates
>>>> the structure as well as the code, mostly because dr_init is now
>>>> split into two different fields and one of them is lazily computed.
>>>>
>>>> For example:
>>>>> @@ -2974,12 +2974,12 @@ vect_vfa_segment_size (struct data_refer
>>>>> vect_no_alias_p (struct data_reference *a, struct data_reference *b,
>>>>> tree segment_length_a, tree segment_length_b)
>>>>> {
>>>>> - gcc_assert (TREE_CODE (DR_INIT (a)) == INTEGER_CST
>>>>> - && TREE_CODE (DR_INIT (b)) == INTEGER_CST);
>>>>> - if (tree_int_cst_equal (DR_INIT (a), DR_INIT (b)))
>>>>> + gcc_assert (TREE_CODE (DR_CHREC_INIT (a)) == INTEGER_CST
>>>>> + && TREE_CODE (DR_CHREC_INIT (b)) == INTEGER_CST);
>>>>> + if (tree_int_cst_equal (DR_CHREC_INIT (a), DR_CHREC_INIT (b)))
>>>>> return false;
>>>>>
>>>>> - tree seg_a_min = DR_INIT (a);
>>>>> + tree seg_a_min = DR_CHREC_INIT (a);
>>>>> tree seg_a_max = fold_build2 (PLUS_EXPR, TREE_TYPE (seg_a_min),
>>>>> seg_a_min, segment_length_a);
>>>>> /* For negative step, we need to adjust address range by TYPE_SIZE_UNIT
>>>>> @@ -2990,10 +2990,10 @@ vect_no_alias_p (struct data_reference *
>>>>> tree unit_size = TYPE_SIZE_UNIT (TREE_TYPE (DR_REF (a)));
>>>>> seg_a_min = fold_build2 (PLUS_EXPR, TREE_TYPE (seg_a_max),
>>>>> seg_a_max, unit_size);
>>>>> - seg_a_max = fold_build2 (PLUS_EXPR, TREE_TYPE (DR_INIT (a)),
>>>>> - DR_INIT (a), unit_size);
>>>>> + seg_a_max = fold_build2 (PLUS_EXPR, TREE_TYPE (DR_CHREC_INIT (a)),
>>>>> + DR_CHREC_INIT (a), unit_size);
>>>>> }
>>>>> - tree seg_b_min = DR_INIT (b);
>>>>> + tree seg_b_min = DR_CHREC_INIT (b);
>>>>> tree seg_b_max = fold_build2 (PLUS_EXPR, TREE_TYPE (seg_b_min),
>>>>> seg_b_min, segment_length_b);
>>>>> if (tree_int_cst_compare (DR_STEP (b), size_zero_node) < 0)
>>>>
>>>> Use of DR_INIT is simply replaced by DR_CHREC_INIT. Is it safe to
>>>> do so in the case of a non-zero DR_INIT? It worries me that I may
>>>> need to think twice before referring to DR_INIT, because it's not
>>>> clear when DR_OFFSET is split and DR_CHREC_INIT becomes non-zero.
>>>> It may simply be because I am too dumb to handle this. I will
>>>> leave this to richi.
>>>
>>> I'm trying to understand this a bit (not liking it very much in its
>>> current form).
>>>
>>> Can code currently using DR_OFFSET and DR_INIT simply use
>>> DR_CHREC_INIT and CHREC_LEFT (DR_CHREC_OFFSET) (or DR_CHREC_OFFSET
>>> if DR_CHREC_OFFSET is not a CHREC)? If yes, would there be any downside
>>> in doing that? If not, then why?
>
> There's nothing particularly special about the CHREC_LEFT for users of
> the drs. The chrec as a whole describes the variable part of the offset.
>
>>> That said, I'm all for making DR info more powerful. But I detest
>>> having extra fields and adding confusion as to when to use which.
>>> Thus if we can make DR_CHREC_INIT be DR_INIT, and DR_OFFSET an
>>> inline function accessing CHREC_LEFT if suitable, plus expose
>>> DR_OFFSET_CHREC, that would make me happier.
>>
>> And if we want to make it opt-in do a dr_analyze_me_harder () which will
>> re-write DR_OFFSET/INIT into the new form.
>>
>> But DR_OFFSET and DR_INIT (read) accessors would maintain their
>> semantics while DR_OFFSET_CHREC would expose the CHREC if
>> available.
>
> After the changes, the only place that actually cared about the split
> between the "old" DR_OFFSET and DR_INIT was tree-predcom.c:
> ref_at_iteration. Everywhere else just wanted the sum of OFFSET and INIT,
> and for them it would have been more convenient to have a combined field.
>
> So maybe one way of trying to avoid the confusion would be to keep
> DR_OFFSET together as the full starting offset from the base, then
> provide DR_VAR_OFFSET and DR_CONST_OFFSET as the split forms, with
> DR_VAR_OFFSET being more like an abstract value number. See the comment
> at the start of the patch below for more details.
>
> I'm not sure whether the main reason for splitting the offset was to
> make life easier for the users of the drs, or whether it was to try
> to avoid creating new trees. But then, the unconditional scev analysis
> that we did previously already generated new trees.
To make life easier, and to allow same base + variable offset but
differing const offset to be detected easily -- a[i] vs. a[i+1].
@@ -787,14 +821,14 @@ canonicalize_base_object_address (tree a
bool
dr_analyze_innermost (innermost_loop_behavior *drb, tree ref,
- struct loop *loop)
+ gimple *stmt, struct loop *loop)
{
Please amend the function comment with what STMT is about
(DR_STMT I suppose).
@@ -893,14 +927,14 @@ dr_analyze_innermost (innermost_loop_beh
init = size_binop (PLUS_EXPR, init, dinit);
base_misalignment -= TREE_INT_CST_LOW (dinit);
- split_constant_offset (offset_iv.base, &offset_iv.base, &dinit);
- init = size_binop (PLUS_EXPR, init, dinit);
-
- step = size_binop (PLUS_EXPR,
- fold_convert (ssizetype, base_iv.step),
- fold_convert (ssizetype, offset_iv.step));
-
base = canonicalize_base_object_address (base_iv.base);
+ tree offset = size_binop (PLUS_EXPR,
+ fold_convert (ssizetype, offset_iv.base),
+ init);
So you remove the split_constant_offset handling on offset_iv.base.
This may end up no longer walking and expanding def stmts of
SSA names contained therein. I suppose this is fully intended,
so that re-computing the ref address using DR_BASE/DR_OFFSET
doesn't end up expanding that redundant code? For analyses
relying on this, one now needs to resort to DR_VAR/CONST_OFFSET,
where SCEV analysis hopefully performs similar expansions?
Just something to watch out for ...
@@ -921,12 +955,12 @@ dr_analyze_innermost (innermost_loop_beh
}
drb->base_address = base;
- drb->offset = fold_convert (ssizetype, offset_iv.base);
- drb->init = init;
+ drb->offset = offset;
drb->step = step;
+ split_constant_offset (scev, &drb->var_offset, &drb->const_offset);
So is the full-fledged split_constant_offset (with its SSA name handling)
still needed here? Something to eventually address in a followup.
@@ -1490,6 +1482,7 @@ ref_at_iteration (data_reference_p dr, i
tree ref_op2 = NULL_TREE;
tree new_offset;
+ split_constant_offset (DR_OFFSET (dr), &off, &coff);
if (iter != 0)
{
new_offset = size_binop (MULT_EXPR, DR_STEP (dr), ssize_int (iter));
Likewise here? Why do you think ref_at_iteration cares? Is that because
of codegen quality? I'd have gone with coff == size_zero_node plus the
simplifications that arise from that.
Thanks,
Richard.
> Tested on aarch64-linux-gnu and x86_64-linux-gnu.
>
> Thanks,
> Richard
>
>
> 2017-08-22 Richard Sandiford <[email protected]>
>
> gcc/
> PR tree-optimization/81635
> * tree-data-ref.h (innermost_loop_behavior): Remove init field.
> Add var_offset and const_offset fields. Rename offset_alignment
> to var_offset_alignment.
> (DR_INIT): Delete.
> (DR_CONST_OFFSET, DR_VAR_OFFSET): New macros.
> (DR_OFFSET_ALIGNMENT): Replace with...
> (DR_VAR_OFFSET_ALIGNMENT): ...this new macro.
> (dr_analyze_innermost): Add a gimple * argument.
> (dr_equal_offsets_p): Delete.
> (dr_var_offsets_equal_p, dr_var_offsets_compare): Declare.
> * tree-vectorizer.h (STMT_VINFO_DR_INIT): Delete.
> (STMT_VINFO_DR_VAR_OFFSET, STMT_VINFO_DR_CONST_OFFSET): New macros.
> (STMT_VINFO_DR_OFFSET_ALIGNMENT): Replace with...
> (STMT_VINFO_DR_VAR_OFFSET_ALIGNMENT): ...this new macro.
> * tree.c (tree_ctz): Handle POLYNOMIAL_CHREC.
> * tree-data-ref.c: Include tree-ssa-loop-ivopts.h.
> (split_constant_offset): Handle POLYNOMIAL_CHREC.
> (analyze_offset_scev): New function.
> (dr_analyze_innermost): Add a gimple * argument. Update after
> changes to innermost_loop_behavior. Initialize var_offset and
> const_offset.
> (create_data_ref): Update call to dr_analyze_innermost.
> Update dump after changes to innermost_loop_behavior.
> (operator ==): Use dr_var_offsets_equal_p and compare the
> DR_CONST_OFFSETs.
> (prune_runtime_alias_test_list): Likewise.
> (comp_dr_with_seg_len_pair): Use dr_var_offsets_compare and compare
> the DR_CONST_OFFSETs.
> (create_intersect_range_checks): Use DR_OFFSET without adding
> DR_INIT.
> (dr_equal_offsets_p1, dr_equal_offsets_p): Delete.
> (dr_alignment): Use const_offset instead of init and
> var_offset_alignment instead of offset_alignment.
> * tree-if-conv.c (innermost_loop_behavior_hash::hash): Don't
> test the init field.
> (innermost_loop_behavior_hash::equal): Likewise.
> (ifcvt_memrefs_wont_trap): Likewise.
> (if_convertible_loop_p_1): Likewise.
> * tree-loop-distribution.c (build_addr_arg_loc): Use DR_OFFSET
> without adding DR_INIT.
> (build_rdg_partition_for_vertex): Don't check DR_INIT.
> (share_memory_accesses): Likewise.
> (pg_add_dependence_edges): Likewise.
> (compute_alias_check_pairs): Use dr_var_offsets_compare.
> * tree-predcom.c (aff_combination_dr_offset): Use DR_OFFSET without
> adding DR_INIT.
> (determine_offset): Likewise.
> (valid_initializer_p): Likewise.
> (find_looparound_phi): Update call to dr_analyze_innermost.
> (ref_at_iteration): Use split_constant_offset to split the offset.
> * tree-vect-data-refs.c (vect_compute_data_ref_alignment): Use
> const_offset instead of init and var_offset_alignment instead of
> offset_alignment.
> (vect_find_same_alignment_drs): Use dr_var_offsets_compare and
> compare the DR_CONST_OFFSETs.
> (dr_group_sort_cmp): Likewise.
> (vect_analyze_group_access_1): Use DR_CONST_OFFSET instead of DR_INIT.
> (vect_no_alias_p): Likewise.
> (vect_analyze_data_ref_accesses): Use dr_var_offsets_equal_p and
> compare the DR_CONST_OFFSETs.
> (vect_prune_runtime_alias_test_list): Use dr_var_offsets_compare.
> (vect_analyze_data_refs): Don't check DR_INIT and use DR_OFFSET
> without adding DR_INIT. Use DR_VAR_OFFSET_ALIGNMENT instead of
> DR_OFFSET_ALIGNMENT. Update call to dr_analyze_innermost, and
> update subsequent dump.
> (vect_create_addr_base_for_vector_ref): Use DR_OFFSET without
> adding DR_INIT.
> * tree-vect-stmts.c (vectorizable_store): Likewise.
> (vectorizable_load): Likewise. Use DR_CONST_OFFSET instead
> of DR_INIT.
>
> gcc/testsuite/
> PR tree-optimization/81635
> * gcc.dg/vect/bb-slp-pr81635.c: New test.
>
> Index: gcc/tree-data-ref.h
> ===================================================================
> --- gcc/tree-data-ref.h 2017-08-21 10:42:51.088530428 +0100
> +++ gcc/tree-data-ref.h 2017-08-22 14:54:48.630563940 +0100
> @@ -28,29 +28,44 @@ #define GCC_TREE_DATA_REF_H
> innermost_loop_behavior describes the evolution of the address of the
> memory
> reference in the innermost enclosing loop. The address is expressed as
> BASE + STEP * # of iteration, and base is further decomposed as the base
> - pointer (BASE_ADDRESS), loop invariant offset (OFFSET) and
> - constant offset (INIT). Examples, in loop nest
> + pointer (BASE_ADDRESS) and the loop invariant offset (OFFSET).
> + OFFSET is further expressed as the sum of a zero or non-constant term
> + (VAR_OFFSET) and a constant term (CONST_OFFSET). VAR_OFFSET should be
> + treated as an abstract representation; in particular, it may contain
> + chrecs. CONST_OFFSET is always an INTEGER_CST.
>
> - for (i = 0; i < 100; i++)
> - for (j = 3; j < 100; j++)
> + Examples, in loop nest
>
> - Example 1 Example 2
> - data-ref a[j].b[i][j] *(p + x + 16B + 4B * j)
> + 1: for (i = 0; i < 100; i++)
> + 2: for (j = 3; j < 100; j++)
>
> + Example 1 Example 2
> + data-ref a[j].b[i][j] *(p + x + 16B + 4B * j)
>
> - innermost_loop_behavior
> - base_address &a p
> - offset i * D_i x
> - init 3 * D_j + offsetof (b) 28
> - step D_j 4
>
> - */
> + innermost_loop_behavior
> + base_address &a p
> + offset i * D_i + 3 * D_j + offsetof (b) x + 28
> + var_offset {0, +, D_i}_1 x (or an equiv. chrec)
> + const_offset 3 * D_j + offsetof (b) 28
> + step D_j 4
> +
> + The main two uses of VAR_OFFSET and CONST_OFFSET are:
> +
> + 1. to better analyze the alignment, since CONST_OFFSET can be treated as
> + the misalignment wrt the alignment of VAR_OFFSET.
> +
> + 2. to find data references that are a constant number of bytes apart.
> + If two data references have the same BASE_ADDRESS and VAR_OFFSET,
> + the distance between them is given by the difference in their
> + CONST_OFFSETs. */
> struct innermost_loop_behavior
> {
> tree base_address;
> tree offset;
> - tree init;
> tree step;
> + tree var_offset;
> + tree const_offset;
>
> /* BASE_ADDRESS is known to be misaligned by BASE_MISALIGNMENT bytes
> from an alignment boundary of BASE_ALIGNMENT bytes. For example,
> @@ -91,7 +106,7 @@ struct innermost_loop_behavior
> /* The largest power of two that divides OFFSET, capped to a suitably
> high value if the offset is zero. This is a byte rather than a bit
> quantity. */
> - unsigned int offset_alignment;
> + unsigned int var_offset_alignment;
>
> /* Likewise for STEP. */
> unsigned int step_alignment;
> @@ -186,12 +201,13 @@ #define DR_IS_WRITE(DR) (!DR_
> #define DR_IS_CONDITIONAL_IN_STMT(DR) (DR)->is_conditional_in_stmt
> #define DR_BASE_ADDRESS(DR) (DR)->innermost.base_address
> #define DR_OFFSET(DR) (DR)->innermost.offset
> -#define DR_INIT(DR) (DR)->innermost.init
> +#define DR_VAR_OFFSET(DR) (DR)->innermost.var_offset
> +#define DR_CONST_OFFSET(DR) (DR)->innermost.const_offset
> #define DR_STEP(DR) (DR)->innermost.step
> #define DR_PTR_INFO(DR) (DR)->alias.ptr_info
> #define DR_BASE_ALIGNMENT(DR) (DR)->innermost.base_alignment
> #define DR_BASE_MISALIGNMENT(DR) (DR)->innermost.base_misalignment
> -#define DR_OFFSET_ALIGNMENT(DR) (DR)->innermost.offset_alignment
> +#define DR_VAR_OFFSET_ALIGNMENT(DR) (DR)->innermost.var_offset_alignment
> #define DR_STEP_ALIGNMENT(DR) (DR)->innermost.step_alignment
> #define DR_INNERMOST(DR) (DR)->innermost
>
> @@ -412,7 +428,8 @@ #define DDR_REVERSED_P(DDR) (DDR)->rever
> #define DDR_COULD_BE_INDEPENDENT_P(DDR) (DDR)->could_be_independent_p
>
>
> -bool dr_analyze_innermost (innermost_loop_behavior *, tree, struct loop *);
> +bool dr_analyze_innermost (innermost_loop_behavior *, tree,
> + gimple *, struct loop *);
> extern bool compute_data_dependences_for_loop (struct loop *, bool,
> vec<loop_p> *,
> vec<data_reference_p> *,
> @@ -466,8 +483,6 @@ dr_alignment (data_reference *dr)
>
> extern bool dr_may_alias_p (const struct data_reference *,
> const struct data_reference *, bool);
> -extern bool dr_equal_offsets_p (struct data_reference *,
> - struct data_reference *);
>
> extern bool runtime_alias_check_p (ddr_p, struct loop *, bool);
> extern int data_ref_compare_tree (tree, tree);
> @@ -675,4 +690,23 @@ lambda_matrix_new (int m, int n, struct
> return mat;
> }
>
> +/* Check if DRA and DRB have equal DR_VAR_OFFSETs. */
> +
> +inline bool
> +dr_var_offsets_equal_p (struct data_reference *dra,
> + struct data_reference *drb)
> +{
> + return eq_evolutions_p (DR_VAR_OFFSET (dra), DR_VAR_OFFSET (drb));
> +}
> +
> +/* Compare the DR_VAR_OFFSETs of DRA and DRB for sorting purposes,
> + returning a qsort-style result. */
> +
> +inline int
> +dr_var_offsets_compare (struct data_reference *dra,
> + struct data_reference *drb)
> +{
> + return data_ref_compare_tree (DR_VAR_OFFSET (dra), DR_VAR_OFFSET (drb));
> +}
> +
> #endif /* GCC_TREE_DATA_REF_H */
> Index: gcc/tree-vectorizer.h
> ===================================================================
> --- gcc/tree-vectorizer.h 2017-08-04 11:42:45.939105152 +0100
> +++ gcc/tree-vectorizer.h 2017-08-22 14:54:48.633563940 +0100
> @@ -740,14 +740,15 @@ #define STMT_VINFO_VEC_CONST_COND_REDUC_
>
> #define STMT_VINFO_DR_WRT_VEC_LOOP(S) (S)->dr_wrt_vec_loop
> #define STMT_VINFO_DR_BASE_ADDRESS(S) (S)->dr_wrt_vec_loop.base_address
> -#define STMT_VINFO_DR_INIT(S) (S)->dr_wrt_vec_loop.init
> #define STMT_VINFO_DR_OFFSET(S) (S)->dr_wrt_vec_loop.offset
> +#define STMT_VINFO_DR_VAR_OFFSET(S) (S)->dr_wrt_vec_loop.var_offset
> +#define STMT_VINFO_DR_CONST_OFFSET(S) (S)->dr_wrt_vec_loop.const_offset
> #define STMT_VINFO_DR_STEP(S) (S)->dr_wrt_vec_loop.step
> #define STMT_VINFO_DR_BASE_ALIGNMENT(S) (S)->dr_wrt_vec_loop.base_alignment
> #define STMT_VINFO_DR_BASE_MISALIGNMENT(S) \
> (S)->dr_wrt_vec_loop.base_misalignment
> -#define STMT_VINFO_DR_OFFSET_ALIGNMENT(S) \
> - (S)->dr_wrt_vec_loop.offset_alignment
> +#define STMT_VINFO_DR_VAR_OFFSET_ALIGNMENT(S) \
> + (S)->dr_wrt_vec_loop.var_offset_alignment
> #define STMT_VINFO_DR_STEP_ALIGNMENT(S) \
> (S)->dr_wrt_vec_loop.step_alignment
>
> Index: gcc/tree.c
> ===================================================================
> --- gcc/tree.c 2017-08-21 12:14:47.159835474 +0100
> +++ gcc/tree.c 2017-08-22 14:54:48.634563941 +0100
> @@ -2601,6 +2601,12 @@ tree_ctz (const_tree expr)
> return MIN (ret1, prec);
> }
> return 0;
> + case POLYNOMIAL_CHREC:
> + ret1 = tree_ctz (CHREC_LEFT (expr));
> + if (ret1 == 0)
> + return ret1;
> + ret2 = tree_ctz (CHREC_RIGHT (expr));
> + return MIN (ret1, ret2);
> default:
> return 0;
> }
> Index: gcc/tree-data-ref.c
> ===================================================================
> --- gcc/tree-data-ref.c 2017-08-21 10:42:51.088530428 +0100
> +++ gcc/tree-data-ref.c 2017-08-22 14:54:48.629563940 +0100
> @@ -86,6 +86,7 @@ Software Foundation; either version 3, o
> #include "expr.h"
> #include "gimple-iterator.h"
> #include "tree-ssa-loop-niter.h"
> +#include "tree-ssa-loop-ivopts.h"
> #include "tree-ssa-loop.h"
> #include "tree-ssa.h"
> #include "cfgloop.h"
> @@ -730,7 +731,19 @@ split_constant_offset (tree exp, tree *v
> *off = ssize_int (0);
> STRIP_NOPS (exp);
>
> - if (tree_is_chrec (exp)
> + if (TREE_CODE (exp) == POLYNOMIAL_CHREC)
> + {
> + split_constant_offset (CHREC_LEFT (exp), &op0, &op1);
> + if (op0 != CHREC_LEFT (exp))
> + {
> + *var = build3 (POLYNOMIAL_CHREC, type, CHREC_VAR (exp),
> + op0, CHREC_RIGHT (exp));
> + *off = op1;
> + }
> + return;
> + }
> +
> + if (automatically_generated_chrec_p (exp)
> || get_gimple_rhs_class (TREE_CODE (exp)) == GIMPLE_TERNARY_RHS)
> return;
>
> @@ -765,7 +778,28 @@ canonicalize_base_object_address (tree a
> return build_fold_addr_expr (TREE_OPERAND (addr, 0));
> }
>
> -/* Analyze the behavior of memory reference REF. There are two modes:
> +/* Analyze the scalar evolution of OFFSET in the innermost parent of
> + LOOP for which it isn't invariant. Return OFFSET itself if the
> + value is invariant or if it's too complex to analyze. */
> +
> +static tree
> +analyze_offset_scev (struct loop *loop, tree offset)
> +{
> + struct loop *inv_loop = outermost_invariant_loop_for_expr (loop, offset);
> + if (inv_loop != NULL)
> + {
> + if (loop_depth (inv_loop) == 0)
> + return offset;
> + loop = loop_outer (inv_loop);
> + }
> + tree res = analyze_scalar_evolution (loop, offset);
> + if (chrec_contains_undetermined (res))
> + return offset;
> + return res;
> +}
> +
> +/* Analyze the behavior of memory reference REF, which occurs in STMT.
> + There are two modes:
>
> - BB analysis. In this case we simply split the address into base,
> init and offset components, without reference to any containing loop.
> @@ -787,14 +821,14 @@ canonicalize_base_object_address (tree a
>
> bool
> dr_analyze_innermost (innermost_loop_behavior *drb, tree ref,
> - struct loop *loop)
> + gimple *stmt, struct loop *loop)
> {
> HOST_WIDE_INT pbitsize, pbitpos;
> tree base, poffset;
> machine_mode pmode;
> int punsignedp, preversep, pvolatilep;
> affine_iv base_iv, offset_iv;
> - tree init, dinit, step;
> + tree dinit;
> bool in_loop = (loop && loop->num);
>
> if (dump_file && (dump_flags & TDF_DETAILS))
> @@ -885,7 +919,7 @@ dr_analyze_innermost (innermost_loop_beh
> }
> }
>
> - init = ssize_int (pbitpos / BITS_PER_UNIT);
> + tree init = ssize_int (pbitpos / BITS_PER_UNIT);
>
> /* Subtract any constant component from the base and add it to INIT
> instead.
> Adjust the misalignment to reflect the amount we subtracted. */
> @@ -893,14 +927,14 @@ dr_analyze_innermost (innermost_loop_beh
> init = size_binop (PLUS_EXPR, init, dinit);
> base_misalignment -= TREE_INT_CST_LOW (dinit);
>
> - split_constant_offset (offset_iv.base, &offset_iv.base, &dinit);
> - init = size_binop (PLUS_EXPR, init, dinit);
> -
> - step = size_binop (PLUS_EXPR,
> - fold_convert (ssizetype, base_iv.step),
> - fold_convert (ssizetype, offset_iv.step));
> -
> base = canonicalize_base_object_address (base_iv.base);
> + tree offset = size_binop (PLUS_EXPR,
> + fold_convert (ssizetype, offset_iv.base),
> + init);
> + tree step = size_binop (PLUS_EXPR,
> + fold_convert (ssizetype, base_iv.step),
> + fold_convert (ssizetype, offset_iv.step));
> + tree scev = analyze_offset_scev (loop_containing_stmt (stmt), offset);
>
> /* See if get_pointer_alignment can guarantee a higher alignment than
> the one we calculated above. */
> @@ -921,12 +955,12 @@ dr_analyze_innermost (innermost_loop_beh
> }
>
> drb->base_address = base;
> - drb->offset = fold_convert (ssizetype, offset_iv.base);
> - drb->init = init;
> + drb->offset = offset;
> drb->step = step;
> + split_constant_offset (scev, &drb->var_offset, &drb->const_offset);
> drb->base_alignment = base_alignment;
> drb->base_misalignment = base_misalignment & (base_alignment - 1);
> - drb->offset_alignment = highest_pow2_factor (offset_iv.base);
> + drb->var_offset_alignment = highest_pow2_factor (drb->var_offset);
> drb->step_alignment = highest_pow2_factor (step);
>
> if (dump_file && (dump_flags & TDF_DETAILS))
> @@ -1154,7 +1188,7 @@ create_data_ref (loop_p nest, loop_p loo
> DR_IS_READ (dr) = is_read;
> DR_IS_CONDITIONAL_IN_STMT (dr) = is_conditional_in_stmt;
>
> - dr_analyze_innermost (&DR_INNERMOST (dr), memref,
> + dr_analyze_innermost (&DR_INNERMOST (dr), memref, stmt,
> nest != NULL ? loop : NULL);
> dr_analyze_indices (dr, nest, loop);
> dr_analyze_alias (dr);
> @@ -1166,15 +1200,17 @@ create_data_ref (loop_p nest, loop_p loo
> print_generic_expr (dump_file, DR_BASE_ADDRESS (dr), TDF_SLIM);
> fprintf (dump_file, "\n\toffset from base address: ");
> print_generic_expr (dump_file, DR_OFFSET (dr), TDF_SLIM);
> - fprintf (dump_file, "\n\tconstant offset from base address: ");
> - print_generic_expr (dump_file, DR_INIT (dr), TDF_SLIM);
> + fprintf (dump_file, "\n\tvariable part of offset: ");
> + print_generic_expr (dump_file, DR_VAR_OFFSET (dr), TDF_SLIM);
> + fprintf (dump_file, "\n\tconstant part of offset: ");
> + print_generic_expr (dump_file, DR_CONST_OFFSET (dr), TDF_SLIM);
> fprintf (dump_file, "\n\tstep: ");
> print_generic_expr (dump_file, DR_STEP (dr), TDF_SLIM);
> fprintf (dump_file, "\n\tbase alignment: %d", DR_BASE_ALIGNMENT (dr));
> fprintf (dump_file, "\n\tbase misalignment: %d",
> DR_BASE_MISALIGNMENT (dr));
> - fprintf (dump_file, "\n\toffset alignment: %d",
> - DR_OFFSET_ALIGNMENT (dr));
> + fprintf (dump_file, "\n\tvariable offset alignment: %d",
> + DR_VAR_OFFSET_ALIGNMENT (dr));
> fprintf (dump_file, "\n\tstep alignment: %d", DR_STEP_ALIGNMENT (dr));
> fprintf (dump_file, "\n\tbase_object: ");
> print_generic_expr (dump_file, DR_BASE_OBJECT (dr), TDF_SLIM);
> @@ -1324,11 +1360,12 @@ runtime_alias_check_p (ddr_p ddr, struct
> operator == (const dr_with_seg_len& d1,
> const dr_with_seg_len& d2)
> {
> - return operand_equal_p (DR_BASE_ADDRESS (d1.dr),
> + return (operand_equal_p (DR_BASE_ADDRESS (d1.dr),
> DR_BASE_ADDRESS (d2.dr), 0)
> - && data_ref_compare_tree (DR_OFFSET (d1.dr), DR_OFFSET (d2.dr)) == 0
> - && data_ref_compare_tree (DR_INIT (d1.dr), DR_INIT (d2.dr)) == 0
> - && data_ref_compare_tree (d1.seg_len, d2.seg_len) == 0;
> + && dr_var_offsets_equal_p (d1.dr, d2.dr)
> + && data_ref_compare_tree (DR_CONST_OFFSET (d1.dr),
> + DR_CONST_OFFSET (d2.dr)) == 0
> + && data_ref_compare_tree (d1.seg_len, d2.seg_len) == 0);
> }
>
> /* Comparison function for sorting objects of dr_with_seg_len_pair_t
> @@ -1360,17 +1397,15 @@ comp_dr_with_seg_len_pair (const void *p
> if ((comp_res = data_ref_compare_tree (DR_STEP (a2.dr),
> DR_STEP (b2.dr))) != 0)
> return comp_res;
> - if ((comp_res = data_ref_compare_tree (DR_OFFSET (a1.dr),
> - DR_OFFSET (b1.dr))) != 0)
> + if ((comp_res = dr_var_offsets_compare (a1.dr, b1.dr)) != 0)
> return comp_res;
> - if ((comp_res = data_ref_compare_tree (DR_INIT (a1.dr),
> - DR_INIT (b1.dr))) != 0)
> + if ((comp_res = data_ref_compare_tree (DR_CONST_OFFSET (a1.dr),
> + DR_CONST_OFFSET (b1.dr))) != 0)
> return comp_res;
> - if ((comp_res = data_ref_compare_tree (DR_OFFSET (a2.dr),
> - DR_OFFSET (b2.dr))) != 0)
> + if ((comp_res = dr_var_offsets_compare (a2.dr, b2.dr)) != 0)
> return comp_res;
> - if ((comp_res = data_ref_compare_tree (DR_INIT (a2.dr),
> - DR_INIT (b2.dr))) != 0)
> + if ((comp_res = data_ref_compare_tree (DR_CONST_OFFSET (a2.dr),
> + DR_CONST_OFFSET (b2.dr))) != 0)
> return comp_res;
>
> return 0;
> @@ -1455,10 +1490,9 @@ prune_runtime_alias_test_list (vec<dr_wi
>
> if (!operand_equal_p (DR_BASE_ADDRESS (dr_a1->dr),
> DR_BASE_ADDRESS (dr_a2->dr), 0)
> - || !operand_equal_p (DR_OFFSET (dr_a1->dr),
> - DR_OFFSET (dr_a2->dr), 0)
> - || !tree_fits_shwi_p (DR_INIT (dr_a1->dr))
> - || !tree_fits_shwi_p (DR_INIT (dr_a2->dr)))
> + || !dr_var_offsets_equal_p (dr_a1->dr, dr_a2->dr)
> + || !tree_fits_shwi_p (DR_CONST_OFFSET (dr_a1->dr))
> + || !tree_fits_shwi_p (DR_CONST_OFFSET (dr_a2->dr)))
> continue;
>
> /* Only merge const step data references. */
> @@ -1484,11 +1518,13 @@ prune_runtime_alias_test_list (vec<dr_wi
> continue;
>
> /* Make sure dr_a1 starts left of dr_a2. */
> - if (tree_int_cst_lt (DR_INIT (dr_a2->dr), DR_INIT (dr_a1->dr)))
> + if (tree_int_cst_lt (DR_CONST_OFFSET (dr_a2->dr),
> + DR_CONST_OFFSET (dr_a1->dr)))
> std::swap (*dr_a1, *dr_a2);
>
> bool do_remove = false;
> - wide_int diff = wi::sub (DR_INIT (dr_a2->dr), DR_INIT (dr_a1->dr));
> + wide_int diff = wi::sub (DR_CONST_OFFSET (dr_a2->dr),
> + DR_CONST_OFFSET (dr_a1->dr));
> wide_int min_seg_len_b;
> tree new_seg_len;
>
> @@ -1756,10 +1792,6 @@ create_intersect_range_checks (struct lo
> tree addr_base_b = DR_BASE_ADDRESS (dr_b.dr);
> tree offset_a = DR_OFFSET (dr_a.dr), offset_b = DR_OFFSET (dr_b.dr);
>
> - offset_a = fold_build2 (PLUS_EXPR, TREE_TYPE (offset_a),
> - offset_a, DR_INIT (dr_a.dr));
> - offset_b = fold_build2 (PLUS_EXPR, TREE_TYPE (offset_b),
> - offset_b, DR_INIT (dr_b.dr));
> addr_base_a = fold_build_pointer_plus (addr_base_a, offset_a);
> addr_base_b = fold_build_pointer_plus (addr_base_b, offset_b);
>
> @@ -1826,48 +1858,6 @@ create_runtime_alias_checks (struct loop
> }
> }
>
> -/* Check if OFFSET1 and OFFSET2 (DR_OFFSETs of some data-refs) are identical
> - expressions. */
> -static bool
> -dr_equal_offsets_p1 (tree offset1, tree offset2)
> -{
> - bool res;
> -
> - STRIP_NOPS (offset1);
> - STRIP_NOPS (offset2);
> -
> - if (offset1 == offset2)
> - return true;
> -
> - if (TREE_CODE (offset1) != TREE_CODE (offset2)
> - || (!BINARY_CLASS_P (offset1) && !UNARY_CLASS_P (offset1)))
> - return false;
> -
> - res = dr_equal_offsets_p1 (TREE_OPERAND (offset1, 0),
> - TREE_OPERAND (offset2, 0));
> -
> - if (!res || !BINARY_CLASS_P (offset1))
> - return res;
> -
> - res = dr_equal_offsets_p1 (TREE_OPERAND (offset1, 1),
> - TREE_OPERAND (offset2, 1));
> -
> - return res;
> -}
> -
> -/* Check if DRA and DRB have equal offsets. */
> -bool
> -dr_equal_offsets_p (struct data_reference *dra,
> - struct data_reference *drb)
> -{
> - tree offset1, offset2;
> -
> - offset1 = DR_OFFSET (dra);
> - offset2 = DR_OFFSET (drb);
> -
> - return dr_equal_offsets_p1 (offset1, offset2);
> -}
> -
> /* Returns true if FNA == FNB. */
>
> static bool
> @@ -5083,13 +5073,13 @@ dr_alignment (innermost_loop_behavior *d
> /* Get the alignment of BASE_ADDRESS + INIT. */
> unsigned int alignment = drb->base_alignment;
> unsigned int misalignment = (drb->base_misalignment
> - + TREE_INT_CST_LOW (drb->init));
> + + TREE_INT_CST_LOW (drb->const_offset));
> if (misalignment != 0)
> alignment = MIN (alignment, misalignment & -misalignment);
>
> /* Cap it to the alignment of OFFSET. */
> if (!integer_zerop (drb->offset))
> - alignment = MIN (alignment, drb->offset_alignment);
> + alignment = MIN (alignment, drb->var_offset_alignment);
>
> /* Cap it to the alignment of STEP. */
> if (!integer_zerop (drb->step))
> Index: gcc/tree-if-conv.c
> ===================================================================
> --- gcc/tree-if-conv.c 2017-07-13 09:25:12.666266733 +0100
> +++ gcc/tree-if-conv.c 2017-08-22 14:54:48.630563940 +0100
> @@ -149,7 +149,6 @@ innermost_loop_behavior_hash::hash (cons
>
> hash = iterative_hash_expr (e->base_address, 0);
> hash = iterative_hash_expr (e->offset, hash);
> - hash = iterative_hash_expr (e->init, hash);
> return iterative_hash_expr (e->step, hash);
> }
>
> @@ -161,8 +160,6 @@ innermost_loop_behavior_hash::equal (con
> || (!e1->base_address && e2->base_address)
> || (!e1->offset && e2->offset)
> || (e1->offset && !e2->offset)
> - || (!e1->init && e2->init)
> - || (e1->init && !e2->init)
> || (!e1->step && e2->step)
> || (e1->step && !e2->step))
> return false;
> @@ -173,9 +170,6 @@ innermost_loop_behavior_hash::equal (con
> if (e1->offset && e2->offset
> && !operand_equal_p (e1->offset, e2->offset, 0))
> return false;
> - if (e1->init && e2->init
> - && !operand_equal_p (e1->init, e2->init, 0))
> - return false;
> if (e1->step && e2->step
> && !operand_equal_p (e1->step, e2->step, 0))
> return false;
> @@ -856,8 +850,7 @@ ifcvt_memrefs_wont_trap (gimple *stmt, v
> innermost_loop_behavior *innermost = &DR_INNERMOST (a);
>
> gcc_assert (DR_STMT (a) == stmt);
> - gcc_assert (DR_BASE_ADDRESS (a) || DR_OFFSET (a)
> - || DR_INIT (a) || DR_STEP (a));
> + gcc_assert (DR_BASE_ADDRESS (a) || DR_OFFSET (a) || DR_STEP (a));
>
> master_dr = innermost_DR_map->get (innermost);
> gcc_assert (master_dr != NULL);
> @@ -1433,8 +1426,7 @@ if_convertible_loop_p_1 (struct loop *lo
> if (TREE_CODE (ref) == COMPONENT_REF
> || TREE_CODE (ref) == IMAGPART_EXPR
> || TREE_CODE (ref) == REALPART_EXPR
> - || !(DR_BASE_ADDRESS (dr) || DR_OFFSET (dr)
> - || DR_INIT (dr) || DR_STEP (dr)))
> + || !(DR_BASE_ADDRESS (dr) || DR_OFFSET (dr) || DR_STEP (dr)))
> {
> while (TREE_CODE (ref) == COMPONENT_REF
> || TREE_CODE (ref) == IMAGPART_EXPR
> Index: gcc/tree-loop-distribution.c
> ===================================================================
> --- gcc/tree-loop-distribution.c 2017-08-21 10:42:51.088530428 +0100
> +++ gcc/tree-loop-distribution.c 2017-08-22 14:54:48.630563940 +0100
> @@ -903,8 +903,7 @@ build_addr_arg_loc (location_t loc, data
> {
> tree addr_base;
>
> - addr_base = size_binop_loc (loc, PLUS_EXPR, DR_OFFSET (dr), DR_INIT (dr));
> - addr_base = fold_convert_loc (loc, sizetype, addr_base);
> + addr_base = fold_convert_loc (loc, sizetype, DR_OFFSET (dr));
>
> /* Test for a negative stride, iterating over every element. */
> if (tree_int_cst_sgn (DR_STEP (dr)) == -1)
> @@ -1289,8 +1288,7 @@ build_rdg_partition_for_vertex (struct g
>
> /* Partition can only be executed sequentially if there is any
> unknown data reference. */
> - if (!DR_BASE_ADDRESS (dr) || !DR_OFFSET (dr)
> - || !DR_INIT (dr) || !DR_STEP (dr))
> + if (!DR_BASE_ADDRESS (dr) || !DR_OFFSET (dr) || !DR_STEP (dr))
> partition->type = PTYPE_SEQUENTIAL;
>
> bitmap_set_bit (partition->datarefs, idx);
> @@ -1507,21 +1505,18 @@ share_memory_accesses (struct graph *rdg
> {
> dr1 = datarefs_vec[i];
>
> - if (!DR_BASE_ADDRESS (dr1)
> - || !DR_OFFSET (dr1) || !DR_INIT (dr1) || !DR_STEP (dr1))
> + if (!DR_BASE_ADDRESS (dr1) || !DR_OFFSET (dr1) || !DR_STEP (dr1))
> continue;
>
> EXECUTE_IF_SET_IN_BITMAP (partition2->datarefs, 0, j, bj)
> {
> dr2 = datarefs_vec[j];
>
> - if (!DR_BASE_ADDRESS (dr2)
> - || !DR_OFFSET (dr2) || !DR_INIT (dr2) || !DR_STEP (dr2))
> + if (!DR_BASE_ADDRESS (dr2) || !DR_OFFSET (dr2) || !DR_STEP (dr2))
> continue;
>
> if (operand_equal_p (DR_BASE_ADDRESS (dr1), DR_BASE_ADDRESS (dr2),
> 0)
> && operand_equal_p (DR_OFFSET (dr1), DR_OFFSET (dr2), 0)
> - && operand_equal_p (DR_INIT (dr1), DR_INIT (dr2), 0)
> && operand_equal_p (DR_STEP (dr1), DR_STEP (dr2), 0))
> return true;
> }
> @@ -1705,7 +1700,6 @@ pg_add_dependence_edges (struct graph *r
> runtime alias check. */
> if (!DR_BASE_ADDRESS (dr1) || !DR_BASE_ADDRESS (dr2)
> || !DR_OFFSET (dr1) || !DR_OFFSET (dr2)
> - || !DR_INIT (dr1) || !DR_INIT (dr2)
> || !DR_STEP (dr1) || !tree_fits_uhwi_p (DR_STEP (dr1))
> || !DR_STEP (dr2) || !tree_fits_uhwi_p (DR_STEP (dr2))
> || res == 0)
> @@ -2203,7 +2197,7 @@ compute_alias_check_pairs (struct loop *
> DR_BASE_ADDRESS (dr_b));
>
> if (comp_res == 0)
> - comp_res = data_ref_compare_tree (DR_OFFSET (dr_a), DR_OFFSET (dr_b));
> + comp_res = dr_var_offsets_compare (dr_a, dr_b);
> gcc_assert (comp_res != 0);
>
> if (latch_dominated_by_data_ref (loop, dr_a))
> Index: gcc/tree-predcom.c
> ===================================================================
> --- gcc/tree-predcom.c 2017-08-10 14:36:08.057471267 +0100
> +++ gcc/tree-predcom.c 2017-08-22 14:54:48.631563940 +0100
> @@ -666,18 +666,13 @@ suitable_reference_p (struct data_refere
> return true;
> }
>
> -/* Stores DR_OFFSET (DR) + DR_INIT (DR) to OFFSET. */
> +/* Stores DR_OFFSET (DR) to OFFSET. */
>
> static void
> aff_combination_dr_offset (struct data_reference *dr, aff_tree *offset)
> {
> - tree type = TREE_TYPE (DR_OFFSET (dr));
> - aff_tree delta;
> -
> - tree_to_aff_combination_expand (DR_OFFSET (dr), type, offset,
> - &name_expansions);
> - aff_combination_const (&delta, type, wi::to_widest (DR_INIT (dr)));
> - aff_combination_add (offset, &delta);
> + tree_to_aff_combination_expand (DR_OFFSET (dr), TREE_TYPE (DR_OFFSET (dr)),
> + offset, &name_expansions);
> }
>
> /* Determines number of iterations of the innermost enclosing loop before B
> @@ -710,8 +705,7 @@ determine_offset (struct data_reference
> /* If the references have loop invariant address, check that they access
> exactly the same location. */
> *off = 0;
> - return (operand_equal_p (DR_OFFSET (a), DR_OFFSET (b), 0)
> - && operand_equal_p (DR_INIT (a), DR_INIT (b), 0));
> + return operand_equal_p (DR_OFFSET (a), DR_OFFSET (b), 0);
> }
>
> /* Compare the offsets of the addresses, and check whether the difference
> @@ -1171,8 +1165,7 @@ valid_initializer_p (struct data_referen
> /* If the address of the reference is invariant, initializer must access
> exactly the same location. */
> if (integer_zerop (DR_STEP (root)))
> - return (operand_equal_p (DR_OFFSET (ref), DR_OFFSET (root), 0)
> - && operand_equal_p (DR_INIT (ref), DR_INIT (root), 0));
> + return operand_equal_p (DR_OFFSET (ref), DR_OFFSET (root), 0);
>
> /* Verify that this index of REF is equal to the root's index at
> -DISTANCE-th iteration. */
> @@ -1247,7 +1240,7 @@ find_looparound_phi (struct loop *loop,
> memset (&init_dr, 0, sizeof (struct data_reference));
> DR_REF (&init_dr) = init_ref;
> DR_STMT (&init_dr) = phi;
> - if (!dr_analyze_innermost (&DR_INNERMOST (&init_dr), init_ref, loop))
> + if (!dr_analyze_innermost (&DR_INNERMOST (&init_dr), init_ref, phi, loop))
> return NULL;
>
> if (!valid_initializer_p (&init_dr, ref->distance + 1, root->ref))
> @@ -1481,8 +1474,7 @@ replace_ref_with (gimple *stmt, tree new
> ref_at_iteration (data_reference_p dr, int iter,
> gimple_seq *stmts, tree niters = NULL_TREE)
> {
> - tree off = DR_OFFSET (dr);
> - tree coff = DR_INIT (dr);
> + tree off, coff;
> tree ref = DR_REF (dr);
> enum tree_code ref_code = ERROR_MARK;
> tree ref_type = NULL_TREE;
> @@ -1490,6 +1482,7 @@ ref_at_iteration (data_reference_p dr, i
> tree ref_op2 = NULL_TREE;
> tree new_offset;
>
> + split_constant_offset (DR_OFFSET (dr), &off, &coff);
> if (iter != 0)
> {
> new_offset = size_binop (MULT_EXPR, DR_STEP (dr), ssize_int (iter));
> Index: gcc/tree-vect-data-refs.c
> ===================================================================
> --- gcc/tree-vect-data-refs.c 2017-08-21 10:42:51.088530428 +0100
> +++ gcc/tree-vect-data-refs.c 2017-08-22 14:54:48.632563940 +0100
> @@ -866,7 +866,7 @@ vect_compute_data_ref_alignment (struct
> base_misalignment = (*entry)->base_misalignment;
> }
>
> - if (drb->offset_alignment < vector_alignment
> + if (drb->var_offset_alignment < vector_alignment
> || !step_preserves_misalignment_p
> /* We need to know whether the step wrt the vectorized loop is
> negative when computing the starting misalignment below. */
> @@ -928,7 +928,7 @@ vect_compute_data_ref_alignment (struct
> base_misalignment = 0;
> }
> unsigned int misalignment = (base_misalignment
> - + TREE_INT_CST_LOW (drb->init));
> + + TREE_INT_CST_LOW (drb->const_offset));
>
> /* If this is a backward running DR then first access in the larger
> vectype actually is N-1 elements before the address in the DR.
> @@ -2187,13 +2187,13 @@ vect_find_same_alignment_drs (struct dat
> return;
>
> if (!operand_equal_p (DR_BASE_ADDRESS (dra), DR_BASE_ADDRESS (drb), 0)
> - || !operand_equal_p (DR_OFFSET (dra), DR_OFFSET (drb), 0)
> + || !dr_var_offsets_equal_p (dra, drb)
> || !operand_equal_p (DR_STEP (dra), DR_STEP (drb), 0))
> return;
>
> /* Two references with distance zero have the same alignment. */
> - offset_int diff = (wi::to_offset (DR_INIT (dra))
> - - wi::to_offset (DR_INIT (drb)));
> + offset_int diff = (wi::to_offset (DR_CONST_OFFSET (dra))
> + - wi::to_offset (DR_CONST_OFFSET (drb)));
> if (diff != 0)
> {
> /* Get the wider of the two alignments. */
> @@ -2434,7 +2434,7 @@ vect_analyze_group_access_1 (struct data
> gimple *next = GROUP_NEXT_ELEMENT (vinfo_for_stmt (stmt));
> struct data_reference *data_ref = dr;
> unsigned int count = 1;
> - tree prev_init = DR_INIT (data_ref);
> + tree prev_init = DR_CONST_OFFSET (data_ref);
> gimple *prev = stmt;
> HOST_WIDE_INT diff, gaps = 0;
>
> @@ -2444,9 +2444,10 @@ vect_analyze_group_access_1 (struct data
> data-ref (supported only for loads), we vectorize only the first
> stmt, and the rest get their vectorized loads from the first
> one. */
> - if (!tree_int_cst_compare (DR_INIT (data_ref),
> - DR_INIT (STMT_VINFO_DATA_REF (
> - vinfo_for_stmt (next)))))
> + data_reference *next_ref
> + = STMT_VINFO_DATA_REF (vinfo_for_stmt (next));
> + if (!tree_int_cst_compare (DR_CONST_OFFSET (data_ref),
> + DR_CONST_OFFSET (next_ref)))
> {
> if (DR_IS_WRITE (data_ref))
> {
> @@ -2469,14 +2470,14 @@ vect_analyze_group_access_1 (struct data
> }
>
> prev = next;
> - data_ref = STMT_VINFO_DATA_REF (vinfo_for_stmt (next));
> + data_ref = next_ref;
>
> /* All group members have the same STEP by construction. */
> gcc_checking_assert (operand_equal_p (DR_STEP (data_ref), step, 0));
>
> /* Check that the distance between two accesses is equal to the type
> size. Otherwise, we have gaps. */
> - diff = (TREE_INT_CST_LOW (DR_INIT (data_ref))
> + diff = (TREE_INT_CST_LOW (DR_CONST_OFFSET (data_ref))
> - TREE_INT_CST_LOW (prev_init)) / type_size;
> if (diff != 1)
> {
> @@ -2499,7 +2500,7 @@ vect_analyze_group_access_1 (struct data
> gap in the access, GROUP_GAP is always 1. */
> GROUP_GAP (vinfo_for_stmt (next)) = diff;
>
> - prev_init = DR_INIT (data_ref);
> + prev_init = DR_CONST_OFFSET (data_ref);
> next = GROUP_NEXT_ELEMENT (vinfo_for_stmt (next));
> /* Count the number of data-refs in the chain. */
> count++;
> @@ -2715,13 +2716,10 @@ dr_group_sort_cmp (const void *dra_, con
> return cmp;
> }
>
> - /* And according to DR_OFFSET. */
> - if (!dr_equal_offsets_p (dra, drb))
> - {
> - cmp = data_ref_compare_tree (DR_OFFSET (dra), DR_OFFSET (drb));
> - if (cmp != 0)
> - return cmp;
> - }
> + /* And according to DR_VAR_OFFSET. */
> + cmp = dr_var_offsets_compare (dra, drb);
> + if (cmp != 0)
> + return cmp;
>
> /* Put reads before writes. */
> if (DR_IS_READ (dra) != DR_IS_READ (drb))
> @@ -2745,8 +2743,9 @@ dr_group_sort_cmp (const void *dra_, con
> return cmp;
> }
>
> - /* Then sort after DR_INIT. In case of identical DRs sort after stmt UID. */
> - cmp = tree_int_cst_compare (DR_INIT (dra), DR_INIT (drb));
> + /* Then sort after DR_CONST_OFFSET. In case of identical DRs sort after
> + stmt UID. */
> + cmp = tree_int_cst_compare (DR_CONST_OFFSET (dra), DR_CONST_OFFSET (drb));
> if (cmp == 0)
> return gimple_uid (DR_STMT (dra)) < gimple_uid (DR_STMT (drb)) ? -1 : 1;
> return cmp;
> @@ -2817,7 +2816,7 @@ vect_analyze_data_ref_accesses (vec_info
> if (DR_IS_READ (dra) != DR_IS_READ (drb)
> || !operand_equal_p (DR_BASE_ADDRESS (dra),
> DR_BASE_ADDRESS (drb), 0)
> - || !dr_equal_offsets_p (dra, drb)
> + || !dr_var_offsets_equal_p (dra, drb)
> || !gimple_assign_single_p (DR_STMT (dra))
> || !gimple_assign_single_p (DR_STMT (drb)))
> break;
> @@ -2835,7 +2834,8 @@ vect_analyze_data_ref_accesses (vec_info
> break;
>
> /* Do not place the same access in the interleaving chain twice. */
> - if (tree_int_cst_compare (DR_INIT (dra), DR_INIT (drb)) == 0)
> + if (tree_int_cst_compare (DR_CONST_OFFSET (dra),
> + DR_CONST_OFFSET (drb)) == 0)
> break;
>
> /* Check the types are compatible.
> @@ -2844,9 +2844,10 @@ vect_analyze_data_ref_accesses (vec_info
> TREE_TYPE (DR_REF (drb))))
> break;
>
> - /* Sorting has ensured that DR_INIT (dra) <= DR_INIT (drb). */
> - HOST_WIDE_INT init_a = TREE_INT_CST_LOW (DR_INIT (dra));
> - HOST_WIDE_INT init_b = TREE_INT_CST_LOW (DR_INIT (drb));
> + /* Sorting has ensured that
> + DR_CONST_OFFSET (dra) <= DR_CONST_OFFSET (drb). */
> + HOST_WIDE_INT init_a = TREE_INT_CST_LOW (DR_CONST_OFFSET (dra));
> + HOST_WIDE_INT init_b = TREE_INT_CST_LOW (DR_CONST_OFFSET (drb));
> gcc_assert (init_a <= init_b);
>
> /* If init_b == init_a + the size of the type * k, we have an
> @@ -2859,10 +2860,10 @@ vect_analyze_data_ref_accesses (vec_info
> /* If we have a store, the accesses are adjacent. This splits
> groups into chunks we support (we don't support vectorization
> of stores with gaps). */
> + HOST_WIDE_INT prev_init
> + = TREE_INT_CST_LOW (DR_CONST_OFFSET (datarefs_copy[i - 1]));
> if (!DR_IS_READ (dra)
> - && (init_b - (HOST_WIDE_INT) TREE_INT_CST_LOW
> - (DR_INIT (datarefs_copy[i-1]))
> - != type_size_a))
> + && (init_b - prev_init) != type_size_a)
> break;
>
> /* If the step (if not zero or non-constant) is greater than the
> @@ -2974,12 +2975,12 @@ vect_vfa_segment_size (struct data_refer
> vect_no_alias_p (struct data_reference *a, struct data_reference *b,
> tree segment_length_a, tree segment_length_b)
> {
> - gcc_assert (TREE_CODE (DR_INIT (a)) == INTEGER_CST
> - && TREE_CODE (DR_INIT (b)) == INTEGER_CST);
> - if (tree_int_cst_equal (DR_INIT (a), DR_INIT (b)))
> + gcc_assert (TREE_CODE (DR_CONST_OFFSET (a)) == INTEGER_CST
> + && TREE_CODE (DR_CONST_OFFSET (b)) == INTEGER_CST);
> + if (tree_int_cst_equal (DR_CONST_OFFSET (a), DR_CONST_OFFSET (b)))
> return false;
>
> - tree seg_a_min = DR_INIT (a);
> + tree seg_a_min = DR_CONST_OFFSET (a);
> tree seg_a_max = fold_build2 (PLUS_EXPR, TREE_TYPE (seg_a_min),
> seg_a_min, segment_length_a);
> /* For negative step, we need to adjust address range by TYPE_SIZE_UNIT
> @@ -2990,10 +2991,10 @@ vect_no_alias_p (struct data_reference *
> tree unit_size = TYPE_SIZE_UNIT (TREE_TYPE (DR_REF (a)));
> seg_a_min = fold_build2 (PLUS_EXPR, TREE_TYPE (seg_a_max),
> seg_a_max, unit_size);
> - seg_a_max = fold_build2 (PLUS_EXPR, TREE_TYPE (DR_INIT (a)),
> - DR_INIT (a), unit_size);
> + seg_a_max = fold_build2 (PLUS_EXPR, TREE_TYPE (DR_CONST_OFFSET (a)),
> + DR_CONST_OFFSET (a), unit_size);
> }
> - tree seg_b_min = DR_INIT (b);
> + tree seg_b_min = DR_CONST_OFFSET (b);
> tree seg_b_max = fold_build2 (PLUS_EXPR, TREE_TYPE (seg_b_min),
> seg_b_min, segment_length_b);
> if (tree_int_cst_compare (DR_STEP (b), size_zero_node) < 0)
> @@ -3001,8 +3002,8 @@ vect_no_alias_p (struct data_reference *
> tree unit_size = TYPE_SIZE_UNIT (TREE_TYPE (DR_REF (b)));
> seg_b_min = fold_build2 (PLUS_EXPR, TREE_TYPE (seg_b_max),
> seg_b_max, unit_size);
> - seg_b_max = fold_build2 (PLUS_EXPR, TREE_TYPE (DR_INIT (b)),
> - DR_INIT (b), unit_size);
> + seg_b_max = fold_build2 (PLUS_EXPR, TREE_TYPE (DR_CONST_OFFSET (b)),
> + DR_CONST_OFFSET (b), unit_size);
> }
>
> if (tree_int_cst_le (seg_a_max, seg_b_min)
> @@ -3148,8 +3149,7 @@ vect_prune_runtime_alias_test_list (loop
> comp_res = data_ref_compare_tree (DR_BASE_ADDRESS (dr_a),
> DR_BASE_ADDRESS (dr_b));
> if (comp_res == 0)
> - comp_res = data_ref_compare_tree (DR_OFFSET (dr_a),
> - DR_OFFSET (dr_b));
> + comp_res = dr_var_offsets_compare (dr_a, dr_b);
>
> /* Alias is known at compilation time. */
> if (comp_res == 0
> @@ -3455,7 +3455,7 @@ vect_analyze_data_refs (vec_info *vinfo,
> {
> gimple *stmt;
> stmt_vec_info stmt_info;
> - tree base, offset, init;
> + tree base, offset;
> enum { SG_NONE, GATHER, SCATTER } gatherscatter = SG_NONE;
> bool simd_lane_access = false;
> int vf;
> @@ -3488,8 +3488,7 @@ vect_analyze_data_refs (vec_info *vinfo,
> }
>
> /* Check that analysis of the data-ref succeeded. */
> - if (!DR_BASE_ADDRESS (dr) || !DR_OFFSET (dr) || !DR_INIT (dr)
> - || !DR_STEP (dr))
> + if (!DR_BASE_ADDRESS (dr) || !DR_OFFSET (dr) || !DR_STEP (dr))
> {
> bool maybe_gather
> = DR_IS_READ (dr)
> @@ -3515,7 +3514,6 @@ vect_analyze_data_refs (vec_info *vinfo,
> gcc_assert (newdr != NULL && DR_REF (newdr));
> if (DR_BASE_ADDRESS (newdr)
> && DR_OFFSET (newdr)
> - && DR_INIT (newdr)
> && DR_STEP (newdr)
> && integer_zerop (DR_STEP (newdr)))
> {
> @@ -3523,8 +3521,7 @@ vect_analyze_data_refs (vec_info *vinfo,
> {
> tree off = DR_OFFSET (newdr);
> STRIP_NOPS (off);
> - if (TREE_CODE (DR_INIT (newdr)) == INTEGER_CST
> - && TREE_CODE (off) == MULT_EXPR
> + if (TREE_CODE (off) == MULT_EXPR
> && tree_fits_uhwi_p (TREE_OPERAND (off, 1)))
> {
> tree step = TREE_OPERAND (off, 1);
> @@ -3555,7 +3552,7 @@ vect_analyze_data_refs (vec_info *vinfo,
> {
> DR_OFFSET (newdr) = ssize_int (0);
> DR_STEP (newdr) = step;
> - DR_OFFSET_ALIGNMENT (newdr)
> + DR_VAR_OFFSET_ALIGNMENT (newdr)
> = BIGGEST_ALIGNMENT;
> DR_STEP_ALIGNMENT (newdr)
> = highest_pow2_factor (step);
> @@ -3665,7 +3662,6 @@ vect_analyze_data_refs (vec_info *vinfo,
>
> base = unshare_expr (DR_BASE_ADDRESS (dr));
> offset = unshare_expr (DR_OFFSET (dr));
> - init = unshare_expr (DR_INIT (dr));
>
> if (is_gimple_call (stmt)
> && (!gimple_call_internal_p (stmt)
> @@ -3701,9 +3697,7 @@ vect_analyze_data_refs (vec_info *vinfo,
> inner loop: *(BASE + INIT + OFFSET). By construction,
> this address must be invariant in the inner loop, so we
> can consider it as being used in the outer loop. */
> - tree init_offset = fold_build2 (PLUS_EXPR, TREE_TYPE (offset),
> - init, offset);
> - tree init_addr = fold_build_pointer_plus (base, init_offset);
> + tree init_addr = fold_build_pointer_plus (base, offset);
> tree init_ref = build_fold_indirect_ref (init_addr);
>
> if (dump_enabled_p ())
> @@ -3715,7 +3709,7 @@ vect_analyze_data_refs (vec_info *vinfo,
> }
>
> if (!dr_analyze_innermost (&STMT_VINFO_DR_WRT_VEC_LOOP (stmt_info),
> - init_ref, loop))
> + init_ref, stmt, loop))
> /* dr_analyze_innermost already explained the failure. */
> return false;
>
> @@ -3728,10 +3722,14 @@ vect_analyze_data_refs (vec_info *vinfo,
> dump_printf (MSG_NOTE, "\n\touter offset from base address: ");
> dump_generic_expr (MSG_NOTE, TDF_SLIM,
> STMT_VINFO_DR_OFFSET (stmt_info));
> - dump_printf (MSG_NOTE,
> - "\n\touter constant offset from base address: ");
> + dump_printf (MSG_NOTE, "\n\tvariable part of outer offset"
> + " from base address: ");
> + dump_generic_expr (MSG_NOTE, TDF_SLIM,
> + STMT_VINFO_DR_VAR_OFFSET (stmt_info));
> + dump_printf (MSG_NOTE, "\n\tconstant part of outer offset"
> + " from base address: ");
> dump_generic_expr (MSG_NOTE, TDF_SLIM,
> - STMT_VINFO_DR_INIT (stmt_info));
> + STMT_VINFO_DR_CONST_OFFSET (stmt_info));
> dump_printf (MSG_NOTE, "\n\touter step: ");
> dump_generic_expr (MSG_NOTE, TDF_SLIM,
> STMT_VINFO_DR_STEP (stmt_info));
> @@ -3739,8 +3737,9 @@ vect_analyze_data_refs (vec_info *vinfo,
> STMT_VINFO_DR_BASE_ALIGNMENT (stmt_info));
> dump_printf (MSG_NOTE, "\n\touter base misalignment: %d\n",
> STMT_VINFO_DR_BASE_MISALIGNMENT (stmt_info));
> - dump_printf (MSG_NOTE, "\n\touter offset alignment: %d\n",
> - STMT_VINFO_DR_OFFSET_ALIGNMENT (stmt_info));
> + dump_printf (MSG_NOTE,
> + "\n\touter variable offset alignment: %d\n",
> + STMT_VINFO_DR_VAR_OFFSET_ALIGNMENT (stmt_info));
> dump_printf (MSG_NOTE, "\n\touter step alignment: %d\n",
> STMT_VINFO_DR_STEP_ALIGNMENT (stmt_info));
> }
> @@ -4055,23 +4054,18 @@ vect_create_addr_base_for_vector_ref (gi
> innermost_loop_behavior *drb = vect_dr_behavior (dr);
>
> tree data_ref_base = unshare_expr (drb->base_address);
> - tree base_offset = unshare_expr (drb->offset);
> - tree init = unshare_expr (drb->init);
> -
> + tree base_offset;
> if (loop_vinfo)
> - base_name = get_name (data_ref_base);
> + {
> + base_name = get_name (data_ref_base);
> + base_offset = fold_convert (sizetype, unshare_expr (drb->offset));
> + }
> else
> {
> - base_offset = ssize_int (0);
> - init = ssize_int (0);
> + base_offset = size_int (0);
> base_name = get_name (DR_REF (dr));
> }
>
> - /* Create base_offset */
> - base_offset = size_binop (PLUS_EXPR,
> - fold_convert (sizetype, base_offset),
> - fold_convert (sizetype, init));
> -
> if (offset)
> {
> offset = fold_build2 (MULT_EXPR, sizetype,
> Index: gcc/tree-vect-stmts.c
> ===================================================================
> --- gcc/tree-vect-stmts.c 2017-08-21 15:50:48.664709938 +0100
> +++ gcc/tree-vect-stmts.c 2017-08-22 14:54:48.633563940 +0100
> @@ -5970,9 +5970,7 @@ vectorizable_store (gimple *stmt, gimple
> stride_base
> = fold_build_pointer_plus
> (unshare_expr (DR_BASE_ADDRESS (first_dr)),
> - size_binop (PLUS_EXPR,
> - convert_to_ptrofftype (unshare_expr (DR_OFFSET (first_dr))),
> - convert_to_ptrofftype (DR_INIT (first_dr))));
> + convert_to_ptrofftype (unshare_expr (DR_OFFSET (first_dr))));
> stride_step = fold_convert (sizetype, unshare_expr (DR_STEP (first_dr)));
>
> /* For a store with loop-invariant (but other than power-of-2)
> @@ -6299,7 +6297,6 @@ vectorizable_store (gimple *stmt, gimple
> && TREE_CODE (DR_BASE_ADDRESS (first_dr)) == ADDR_EXPR
> && VAR_P (TREE_OPERAND (DR_BASE_ADDRESS (first_dr), 0))
> && integer_zerop (DR_OFFSET (first_dr))
> - && integer_zerop (DR_INIT (first_dr))
> && alias_sets_conflict_p (get_alias_set (aggr_type),
> get_alias_set (TREE_TYPE (ref_type))))
> {
> @@ -6993,9 +6990,7 @@ vectorizable_load (gimple *stmt, gimple_
> stride_base
> = fold_build_pointer_plus
> (DR_BASE_ADDRESS (first_dr),
> - size_binop (PLUS_EXPR,
> - convert_to_ptrofftype (DR_OFFSET (first_dr)),
> - convert_to_ptrofftype (DR_INIT (first_dr))));
> + convert_to_ptrofftype (DR_OFFSET (first_dr)));
> stride_step = fold_convert (sizetype, DR_STEP (first_dr));
>
> /* For a load with loop-invariant (but other than power-of-2)
> @@ -7394,7 +7389,6 @@ vectorizable_load (gimple *stmt, gimple_
> && TREE_CODE (DR_BASE_ADDRESS (first_dr)) == ADDR_EXPR
> && VAR_P (TREE_OPERAND (DR_BASE_ADDRESS (first_dr), 0))
> && integer_zerop (DR_OFFSET (first_dr))
> - && integer_zerop (DR_INIT (first_dr))
> && alias_sets_conflict_p (get_alias_set (aggr_type),
> get_alias_set (TREE_TYPE (ref_type)))
> && (alignment_support_scheme == dr_aligned
> @@ -7417,8 +7411,8 @@ vectorizable_load (gimple *stmt, gimple_
> = STMT_VINFO_DATA_REF (vinfo_for_stmt (first_stmt_for_drptr));
> tree diff = fold_convert (sizetype,
> size_binop (MINUS_EXPR,
> - DR_INIT (first_dr),
> - DR_INIT (ptrdr)));
> - DR_CONST_OFFSET (first_dr),
> + DR_CONST_OFFSET (ptrdr)));
> dataref_ptr = bump_vector_ptr (dataref_ptr, ptr_incr, gsi,
> stmt, diff);
> }
> Index: gcc/testsuite/gcc.dg/vect/bb-slp-pr81635.c
> ===================================================================
> --- /dev/null 2017-08-21 17:11:06.647307681 +0100
> +++ gcc/testsuite/gcc.dg/vect/bb-slp-pr81635.c 2017-08-22 14:54:48.628563940 +0100
> @@ -0,0 +1,56 @@
> +/* { dg-do compile } */
> +/* { dg-require-effective-target vect_unpack } */
> +
> +double p[1000];
> +double q[1000];
> +
> +void
> +f1 (void)
> +{
> + for (unsigned int i = 0; i < 1000; i += 4)
> + {
> + double a = q[i] + p[i];
> + double b = q[i + 1] + p[i + 1];
> + q[i] = a;
> + q[i + 1] = b;
> + }
> +}
> +
> +void
> +f2 (void)
> +{
> + for (unsigned int i = 0; i < 500; i += 6)
> + for (unsigned int j = 0; j < 500; j += 4)
> + {
> + double a = q[j] + p[i];
> + double b = q[j + 1] + p[i + 1];
> + q[i] = a;
> + q[i + 1] = b;
> + }
> +}
> +
> +void
> +f3 (void)
> +{
> + for (unsigned int i = 2; i < 1000; i += 4)
> + {
> + double a = q[i - 2] + p[i - 2];
> + double b = q[i - 1] + p[i - 1];
> + q[i - 2] = a;
> + q[i - 1] = b;
> + }
> +}
> +
> +void
> +f4 (unsigned int n)
> +{
> + for (unsigned int i = 0; i < n; i += 4)
> + {
> + double a = q[i] + p[i];
> + double b = q[i + 1] + p[i + 1];
> + q[i] = a;
> + q[i + 1] = b;
> + }
> +}
> +
> +/* { dg-final { scan-tree-dump-times "basic block vectorized" 4 "slp1" } } */
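As an aside for readers following the DR_INIT -> DR_CONST_OFFSET renaming: the
decomposition the patch relies on can be illustrated outside of GCC. This is a
hypothetical sketch, not GCC internals; `decomposition_holds` and its local
variable names are made up for illustration. It mirrors how the address of
`q[i + 1]` in the f1 testcase splits into a base address, a variable offset
that depends on `i`, and a constant offset of one element, which is exactly
the separation that lets the SLP pass see the `[i]` and `[i + 1]` accesses as
adjacent.

```c
#include <stdint.h>
#include <stddef.h>

static double q[1000];

/* Check that &q[i + 1] == base + var_offset + const_offset for every i the
   f1 testcase touches, where:
     base          plays the role of DR_BASE_ADDRESS (&q),
     var_offset    plays the role of DR_VAR_OFFSET ((intptr_t) i * 8),
     const_offset  plays the role of DR_CONST_OFFSET (8 for one double).  */
static int
decomposition_holds (void)
{
  for (unsigned int i = 0; i + 1 < 1000; i += 4)
    {
      char *base = (char *) q;
      intptr_t var_offset = (intptr_t) i * (intptr_t) sizeof (double);
      intptr_t const_offset = (intptr_t) sizeof (double);
      if ((char *) &q[i + 1] != base + var_offset + const_offset)
        return 0;
    }
  return 1;
}
```

Keeping `const_offset` as a separate INTEGER_CST is what makes comparisons
like the `tree_int_cst_compare (DR_CONST_OFFSET ...)` calls above cheap, even
when `var_offset` is an arbitrary SSA expression.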