On Wed, Mar 16, 2016 at 10:59 AM, Bin Cheng <[email protected]> wrote:
> Hi,
> One issue revealed in tree ifcvt is the pass stores/tracks DRs against its
> memory references in IR. This causes difficulty in identifying same memory
> references appearing in different forms. Given below example:
>
> void foo (int a[], int b[])
> {
> int i;
> for (i = 0; i < 100; i++)
> {
> if (a[i] ==0)
> a[i] = b[i]*4;
> else
> a[i] = b[i]*3;
> }
> }
>
> The gimple dump before tree ifcvt is as:
>
> <bb 2>:
>
> <bb 3>:
> # i_24 = PHI <i_19(7), 0(2)>
> # ivtmp_28 = PHI <ivtmp_23(7), 100(2)>
> _5 = (long unsigned int) i_24;
> _6 = _5 * 4;
> _8 = a_7(D) + _6;
> _9 = *_8;
> if (_9 == 0)
> goto <bb 4>;
> else
> goto <bb 5>;
>
> <bb 4>:
> _11 = b_10(D) + _6;
> _12 = *_11;
> _13 = _12 * 4;
> goto <bb 6>;
>
> <bb 5>:
> _15 = b_10(D) + _6;
> _16 = *_15;
> _17 = _16 * 3;
>
> <bb 6>:
> # cstore_1 = PHI <_13(4), _17(5)>
> *_8 = cstore_1;
> i_19 = i_24 + 1;
> ivtmp_23 = ivtmp_28 - 1;
> if (ivtmp_23 != 0)
> goto <bb 7>;
> else
> goto <bb 8>;
>
> <bb 7>:
> goto <bb 3>;
>
> <bb 8>:
> return;
>
> The two memory references "*_11" and "*_15" are actually the same thing, but
> ifcvt failed to discover this because they are recorded in different forms.
> This patch fixes the issue by recording/tracking memory reference against its
> innermost_loop_behavior if: the memory reference has innermost_loop_behavior
> and it is not a compound reference.
> BTW, PR56625 reported that this case couldn't be vectorized even tree if-conv
> can handle it. Interesting thing is at current time, it's tree if-conv that
> could not handle the case. Once it's if-converted with this patch,
> vectorizer are able to handle it too.
>
> Bootstrap and test on x86_64 and AArch64. Is it OK, not sure if it's GCC 7?
Hmm.
+ equal_p = true;
+ if (e1->base_address && e2->base_address)
+ equal_p &= operand_equal_p (e1->base_address, e2->base_address, 0);
+ if (e1->offset && e2->offset)
+ equal_p &= operand_equal_p (e1->offset, e2->offset, 0);
surely better to return false early.
I think we don't want this in tree-data-refs.h also because of ...
@@ -615,15 +619,29 @@
hash_memrefs_baserefs_and_store_DRs_read_written_info
(data_reference_p a)
data_reference_p *master_dr, *base_master_dr;
tree ref = DR_REF (a);
tree base_ref = DR_BASE_OBJECT (a);
+ innermost_loop_behavior *innermost = &DR_INNERMOST (a);
tree ca = bb_predicate (gimple_bb (DR_STMT (a)));
bool exist1, exist2;
- while (TREE_CODE (ref) == COMPONENT_REF
- || TREE_CODE (ref) == IMAGPART_EXPR
- || TREE_CODE (ref) == REALPART_EXPR)
- ref = TREE_OPERAND (ref, 0);
+ /* If reference in DR has innermost loop behavior and it is not
+ a compound memory reference, we store it to innermost_DR_map,
+ otherwise to ref_DR_map. */
+ if (TREE_CODE (ref) == COMPONENT_REF
+ || TREE_CODE (ref) == IMAGPART_EXPR
+ || TREE_CODE (ref) == REALPART_EXPR
+ || !(DR_BASE_ADDRESS (a) || DR_OFFSET (a)
+ || DR_INIT (a) || DR_STEP (a) || DR_ALIGNED_TO (a)))
+ {
+ while (TREE_CODE (ref) == COMPONENT_REF
+ || TREE_CODE (ref) == IMAGPART_EXPR
+ || TREE_CODE (ref) == REALPART_EXPR)
+ ref = TREE_OPERAND (ref, 0);
+
+ master_dr = &ref_DR_map->get_or_insert (ref, &exist1);
+ }
+ else
+ master_dr = &innermost_DR_map->get_or_insert (innermost, &exist1);
we don't want an extra hashmap but replace ref_DR_map entirely. So we'd need to
strip outermost non-variant handled-components (COMPONENT_REF, IMAGPART
and REALPART) before creating the DR (or adjust the equality function
and hashing
to disregard them which means subtracting their offset from DR_INIT.
To adjust the references we collect you'd maybe could use a callback
to get_references_in_stmt
to adjust them.
OTOH post-processing the DRs in if_convertible_loop_p_1 can be as simple as
Index: tree-if-conv.c
===================================================================
--- tree-if-conv.c (revision 234215)
+++ tree-if-conv.c (working copy)
@@ -1235,6 +1220,38 @@ if_convertible_loop_p_1 (struct loop *lo
for (i = 0; refs->iterate (i, &dr); i++)
{
+ tree *refp = &DR_REF (dr);
+ while ((TREE_CODE (*refp) == COMPONENT_REF
+ && TREE_OPERAND (*refp, 2) == NULL_TREE)
+ || TREE_CODE (*refp) == IMAGPART_EXPR
+ || TREE_CODE (*refp) == REALPART_EXPR)
+ refp = &TREE_OPERAND (*refp, 0);
+ if (refp != &DR_REF (dr))
+ {
+ tree saved_base = *refp;
+ *refp = integer_zero_node;
+
+ if (DR_INIT (dr))
+ {
+ tree poffset;
+ int punsignedp, preversep, pvolatilep;
+ machine_mode pmode;
+ HOST_WIDE_INT pbitsize, pbitpos;
+ get_inner_reference (DR_REF (dr), &pbitsize, &pbitpos, &poffset,
+ &pmode, &punsignedp, &preversep, &pvolatilep,
+ false);
+ gcc_assert (poffset == NULL_TREE);
+
+ DR_INIT (dr)
+ = wide_int_to_tree (ssizetype,
+ wi::sub (DR_INIT (dr),
+ pbitpos / BITS_PER_UNIT));
+ }
+
+ *refp = saved_base;
+ DR_REF (dr) = *refp;
+ }
+
dr->aux = XNEW (struct ifc_dr);
DR_BASE_W_UNCONDITIONALLY (dr) = false;
DR_RW_UNCONDITIONALLY (dr) = false;
Can you add a testcase for the vectorization?
Richard.
> Thanks,
> bin
>
>
> 2016-03-11 Bin Cheng <[email protected]>
>
> PR tree-optimization/56625
> PR tree-optimization/69489
> * tree-data-ref.h (innermost_loop_behavior_hash): New class for
> hashing struct innermost_loop_behavior.
> (DR_INNERMOST): New macro.
> * tree-if-conv.c (innermost_DR_map): New map.
> (ref_DR_map, baseref_DR_map): Revise the comment.
> (hash_memrefs_baserefs_and_store_DRs_read_written_info): Store DR
> to innermost_DR_map if it has innermost loop behavior and is not
> a compound reference.
> (ifcvt_memrefs_wont_trap): Get DR from innermost_DR_map if it has
> innermost loop behavior and is not a compound reference.
> (if_convertible_loop_p_1): Initialize innermost_DR_map.
> (if_convertible_loop_p): Release innermost_DR_map.
>
> gcc/testsuite/ChangeLog
> 2016-03-11 Bin Cheng <[email protected]>
>
> PR tree-optimization/56625
> PR tree-optimization/69489
> * gcc.dg/tree-ssa/ifc-pr69489-1.c: New test.
>