On December 30, 2019 3:15:14 PM GMT+01:00, Richard Sandiford 
<richard.sandif...@arm.com> wrote:
>This patch fixes an awkward corner case in which:
>
>(a) we apply if-conversion to a loop;
>
>(b) the original scalar loop doesn't have a vdef, and thus doesn't
>    need a virtual phi;
>
>(c) the vectorised main loop does need a vdef and a virtual phi (see
>below);
>
>(d) we also vectorise the epilogue; and
>
>(e) the vectorised epilogue still needs a scalar epilogue
>
>The specific case in which (c) applies is if a read-only loop is
>vectorised using IFN_LOAD_LANES, which uses clobber statements to
>mark the lifetime of the temporary array.
>
>The vectoriser relies on the SSA renamer to update virtual operands.
>All would probably be well if it postponed this update until after
>it had vectorised both the main loop and the epilogue loop.  However,
>when vectorising the epilogue, vect_do_peeling does:
>
>  create_lcssa_for_virtual_phi (loop);
>  update_ssa (TODO_update_ssa_only_virtuals);
>
>(with "loop" in this case being the to-be-vectorised epilogue loop).
>So the vectoriser puts the virtual operand into SSA form for the
>vectorised main loop as a separate step, during the early stages
>of vectorising the epilogue.
>
>I wasn't sure at first why that update_ssa was there.  It looked
>initially like it was related to create_lcssa_for_virtual_phi,
>which seemed strange when create_lcssa_for_virtual_phi keeps the
>SSA form up-to-date.  But before r241099 it had the following comment,
>which AFAICT is still the reason:
>
>  /* We might have a queued need to update virtual SSA form.  As we
>     delete the update SSA machinery below after doing a regular
>     incremental SSA update during loop copying make sure we don't
>     lose that fact.
>     ???  Needing to update virtual SSA form by renaming is unfortunate
>     but not all of the vectorizer code inserting new loads / stores
>     properly assigns virtual operands to those statements.  */
>
>The patch restores that comment since IMO it's helpful.
>
>(a), (d) and (e) mean that we copy the original un-if-converted scalar
>loop to act as the scalar epilogue.  The update_ssa above means that
>this
>copying needs to cope with any new virtual SSA names in the main loop.
>The code to do that (reasonably) assumed that one of two things was
>true:
>
>(1) the scalar loop and the vector loops don't have vdefs, and so no
>    virtual operand update is needed.  The definition that applies
>    on entry to the loops is the same in all cases.
>
>(2) the scalar loop and the vector loops have virtual phis, and so --
>    after applying create_lcssa_for_virtual_phi on the to-be-vectorised
>    epilogue loop -- the virtual operand update can be handled in the
>    same way as for normal SSA names.
>
>But (b) and (c) together mean that the scalar loop and the
>still-to-be-vectorised epilogue loop have no virtual phi that (2)
>can use.  We'd therefore keep the original vuses when duplicating,
>rather than updating them to the definition that applies on exit
>from the epilogue loop.  (Since the epilogue is still unvectorised
>and has no vdefs, the definition that applies on exit is the same
>as the one that applies on entry.)
>
>This patch therefore adds a third case: the scalar loop and
>to-be-vectorised epilogue have no virtual defs, but the main loop does.
>
>Tested on aarch64-linux-gnu and x86_64-linux-gnu.  OK to install?

Ok. 

Richard. 

>Richard
>
>
>2019-12-30  Richard Sandiford  <richard.sandif...@arm.com>
>
>gcc/
>       * tree-vect-loop-manip.c (create_lcssa_for_virtual_phi): Return
>       the incoming virtual operand definition.
>       (vect_do_peeling): When vectorizing an epilogue loop, handle the
>       case in which the main loop has a virtual phi and the epilogue
>       and scalar loops don't.  Restore an earlier comment about the
>       update_ssa call.
>
>gcc/testsuite/
>       * gcc.dg/vect/vect-epilogues-2.c: New test.
>
>Index: gcc/tree-vect-loop-manip.c
>===================================================================
>--- gcc/tree-vect-loop-manip.c 2019-12-30 14:12:25.000000000 +0000
>+++ gcc/tree-vect-loop-manip.c 2019-12-30 14:12:26.082419824 +0000
>@@ -1249,9 +1249,12 @@ slpeel_can_duplicate_loop_p (const class
>the *guard[12] routines, which assume loop closed SSA form for all PHIs
>  (but normally loop closed SSA form doesn't require virtual PHIs to be
>    in the same form).  Doing this early simplifies the checking what
>-   uses should be renamed.  */
>+   uses should be renamed.
> 
>-static void
>+   If we create a new phi after the loop, return the definition that
>+   applies on entry to the loop, otherwise return null.  */
>+
>+static tree
> create_lcssa_for_virtual_phi (class loop *loop)
> {
>   gphi_iterator gsi;
>@@ -1283,10 +1286,12 @@ create_lcssa_for_virtual_phi (class loop
>                 && !flow_bb_inside_loop_p (loop, gimple_bb (stmt)))
>               FOR_EACH_IMM_USE_ON_STMT (use_p, imm_iter)
>                 SET_USE (use_p, new_vop);
>+
>+          return PHI_ARG_DEF_FROM_EDGE (phi, loop_preheader_edge (loop));
>         }
>       break;
>       }
>-
>+  return NULL_TREE;
> }
> 
> /* Function vect_get_loop_location.
>@@ -2483,9 +2488,42 @@ vect_do_peeling (loop_vec_info loop_vinf
>class loop *prolog, *epilog = NULL, *loop = LOOP_VINFO_LOOP
>(loop_vinfo);
>   class loop *first_loop = loop;
>bool irred_flag = loop_preheader_edge (loop)->flags &
>EDGE_IRREDUCIBLE_LOOP;
>-  create_lcssa_for_virtual_phi (loop);
>+
>+  /* We might have a queued need to update virtual SSA form.  As we
>+     delete the update SSA machinery below after doing a regular
>+     incremental SSA update during loop copying make sure we don't
>+     lose that fact.
>+     ???  Needing to update virtual SSA form by renaming is
>unfortunate
>+     but not all of the vectorizer code inserting new loads / stores
>+     properly assigns virtual operands to those statements.  */
>   update_ssa (TODO_update_ssa_only_virtuals);
> 
>+  create_lcssa_for_virtual_phi (loop);
>+
>+  /* If we're vectorizing an epilogue loop, the update_ssa above will
>+     have ensured that the virtual operand is in SSA form throughout
>the
>+     vectorized main loop.  Normally it is possible to trace the
>updated
>+     vector-stmt vdefs back to scalar-stmt vdefs and vector-stmt vuses
>+     back to scalar-stmt vuses, meaning that the effect of the SSA
>update
>+     remains local to the main loop.  However, there are rare cases in
>+     which the vectorized loop has vdefs even when the original scalar
>+     loop didn't.  For example, vectorizing a load with IFN_LOAD_LANES
>+     introduces clobbers of the temporary vector array, which in turn
>+     needs new vdefs.  If the scalar loop doesn't write to memory,
>these
>+     new vdefs will be the only ones in the vector loop.
>+
>+     In that case, update_ssa will have added a new virtual phi to the
>+     main loop, which previously didn't need one.  Ensure that we
>(locally)
>+     maintain LCSSA form for the virtual operand, just as we would
>have
>+     done if the virtual phi had existed from the outset.  This makes
>it
>+     easier to duplicate the scalar epilogue loop below.  */
>+  tree vop_to_rename = NULL_TREE;
>+  if (loop_vec_info orig_loop_vinfo = LOOP_VINFO_ORIG_LOOP_INFO
>(loop_vinfo))
>+    {
>+      class loop *orig_loop = LOOP_VINFO_LOOP (orig_loop_vinfo);
>+      vop_to_rename = create_lcssa_for_virtual_phi (orig_loop);
>+    }
>+
>   if (MAY_HAVE_DEBUG_BIND_STMTS)
>     {
>       gcc_assert (!adjust_vec.exists ());
>@@ -2706,6 +2744,26 @@ vect_do_peeling (loop_vec_info loop_vinf
>        as the transformations mentioned above make less or no sense when not
>        vectorizing.  */
>       epilog = vect_epilogues ? get_loop_copy (loop) : scalar_loop;
>+      if (vop_to_rename)
>+      {
>+        /* Vectorizing the main loop can sometimes introduce a vdef to
>+           a loop that previously didn't have one; see the comment above
>+           the definition of VOP_TO_RENAME for details.  The definition
>+           D that holds on E will then be different from the definition
>+           VOP_TO_RENAME that holds during SCALAR_LOOP, so we need to
>+           rename VOP_TO_RENAME to D when copying the loop.
>+
>+           The virtual operand is in LCSSA form for the main loop,
>+           and no stmt between the main loop and E needs a vdef,
>+           so we know that D is provided by a phi rather than by a
>+           vdef on a normal gimple stmt.  */
>+        basic_block vdef_bb = e->src;
>+        gphi *vphi;
>+        while (!(vphi = get_virtual_phi (vdef_bb)))
>+          vdef_bb = get_immediate_dominator (CDI_DOMINATORS, vdef_bb);
>+        gcc_assert (vop_to_rename != gimple_phi_result (vphi));
>+        set_current_def (vop_to_rename, gimple_phi_result (vphi));
>+      }
>     epilog = slpeel_tree_duplicate_loop_to_edge_cfg (loop, epilog, e);
>       if (!epilog)
>       {
>Index: gcc/testsuite/gcc.dg/vect/vect-epilogues-2.c
>===================================================================
>--- /dev/null  2019-09-17 11:41:18.176664108 +0100
>+++ gcc/testsuite/gcc.dg/vect/vect-epilogues-2.c       2019-12-30
>14:12:26.082419824 +0000
>@@ -0,0 +1,57 @@
>+/* { dg-do compile } */
>+
>+int
>+f1 (int *x, int n)
>+{
>+  int res = 0;
>+  for (int i = 0; i < n; ++i)
>+    res += x[i * 2] == 1 ? 2 : 3;
>+  return res;
>+}
>+
>+int
>+f2 (int *x)
>+{
>+  int res = 0;
>+  for (int i = 0; i < 0x83; ++i)
>+    res += x[i * 2] == 1 ? 2 : 3;
>+  return res;
>+}
>+
>+int
>+f3 (int *x, int n)
>+{
>+  int res = 0;
>+  for (int i = 0; i < n; ++i)
>+    res += x[i * 2] == 1 ? 2 : 3;
>+  return res + x[0x100];
>+}
>+
>+int
>+f4 (int *x)
>+{
>+  int res = 0;
>+  for (int i = 0; i < 0x83; ++i)
>+    res += x[i * 2] == 1 ? 2 : 3;
>+  return res + x[0x100];
>+}
>+
>+int
>+f5 (int *x, int n, int a)
>+{
>+  int res = 0;
>+  for (int i = 0; i < n; ++i)
>+    res += x[i * 2] == 1 ? 2 : 3;
>+  x[a] += 1;
>+  return res;
>+}
>+
>+int
>+f6 (int *x, int a)
>+{
>+  int res = 0;
>+  for (int i = 0; i < 0x83; ++i)
>+    res += x[i * 2] == 1 ? 2 : 3;
>+  x[a] += 1;
>+  return res;
>+}

Reply via email to