https://gcc.gnu.org/bugzilla/show_bug.cgi?id=83008

--- Comment #19 from rguenther at suse dot de <rguenther at suse dot de> ---
On Sun, 24 Dec 2017, sergey.shalnov at intel dot com wrote:

> https://gcc.gnu.org/bugzilla/show_bug.cgi?id=83008
> 
> --- Comment #18 from sergey.shalnov at intel dot com ---
> Yes, I agree that vector_store stage has it’s own vectorization cost.
> And each vector_store has vector_construction stage. These stages are 
> different
> in gcc slp (as you know).
> To better illustrate my point of view I would like to propose a patch.
> I didn’t submit the patch to community yet because I think it is better to
> discuss it before that.
> 
> index 0ca42b4..72e3123 100644
> --- a/gcc/tree-vect-slp.c
> +++ b/gcc/tree-vect-slp.c
> @@ -1825,7 +1825,7 @@ vect_analyze_slp_cost_1 (slp_instance instance, slp_tree
> node,
>             record_stmt_cost (prologue_cost_vec, 1, vector_load,
>                               stmt_info, 0, vect_prologue);
>           else if (dt == vect_external_def)
> -           record_stmt_cost (prologue_cost_vec, 1, vec_construct,
> +           record_stmt_cost (prologue_cost_vec, ncopies_for_cost,
> vec_construct,
>                               stmt_info, 0, vect_prologue);
>         }
>      }

I'm not sure it affects the testcase in question but yes, there's
imprecision in this in that it assumes too much CSE happening.
Using ncopies_for_cost will over-estimate the cost though,
it will match the code we generate though (CSE will cleanup).
As the comment says this code should be re-engineered ;)  (it also
misses to make a splat cheap).  Sth like the following might work
for most cases:

Index: gcc/tree-vect-slp.c
===================================================================
--- gcc/tree-vect-slp.c (revision 256070)
+++ gcc/tree-vect-slp.c (working copy)
@@ -1815,18 +1815,41 @@ vect_analyze_slp_cost_1 (slp_instance in
       enum vect_def_type dt;
       if (!op || op == lhs)
        continue;
-      if (vect_is_simple_use (op, stmt_info->vinfo, &def_stmt, &dt))
+      if (vect_is_simple_use (op, stmt_info->vinfo, &def_stmt, &dt)
+         && (dt == vect_constant_def || dt == vect_external_def))
        {
          /* Without looking at the actual initializer a vector of
             constants can be implemented as load from the constant pool.
-            ???  We need to pass down stmt_info for a vector type
-            even if it points to the wrong stmt.  */
-         if (dt == vect_constant_def)
-           record_stmt_cost (prologue_cost_vec, 1, vector_load,
-                             stmt_info, 0, vect_prologue);
-         else if (dt == vect_external_def)
-           record_stmt_cost (prologue_cost_vec, 1, vec_construct,
-                             stmt_info, 0, vect_prologue);
+            When all elements are the same we can use a splat.  */
+         tree vectype = get_vectype_for_scalar_type (TREE_TYPE (op));
+         unsigned nunits = TYPE_VECTOR_SUBPARTS (vectype);
+         unsigned group_size = SLP_TREE_SCALAR_STMTS (node).length ();
+         tree elt = NULL_TREE;
+         unsigned nelt = 0;
+         for (unsigned j = 0; j < ncopies_for_cost * nunits / group_size; ++j)
+           for (unsigned k = 0; k < group_size; ++k)
+             {
+               if (nelt == 0)
+                 elt = gimple_op (SLP_TREE_SCALAR_STMTS (node)[nelt], i);
+               /* ???  We're just tracking whether all operands of a single
+                  vector initializer are the same, ideally we'd check if
+                  we emitted the same one already.  */
+               else if (elt != gimple_op (SLP_TREE_SCALAR_STMTS (node)[nelt],
+                                          i))
+                 elt = NULL_TREE;
+               nelt++;
+               if (nelt == group_size)
+                 {
+                   /* ???  We need to pass down stmt_info for a vector type
+                      even if it points to the wrong stmt.  */
+                   record_stmt_cost (prologue_cost_vec, 1,
+                                     dt == vect_external_def
+                                     ? (elt ? scalar_to_vec : vec_construct)
+                                     : vector_load,
+                                     stmt_info, 0, vect_prologue);
+                   nelt = 0;
+                 }
+             }
        }
     }

Reply via email to