The vectorizer cost model mishandles targets whose scalar stmt cost
is != 1.  It passes the scalar iteration _cost_ to routines that expect
a statement count and scale it by the target's scalar stmt cost again.
This is visible, for example, on x86_64 for all AMD archs, which use a
high scalar stmt cost (6).

I am testing the following patch to fix that.  It divides the iteration
cost by the scalar stmt cost, which can introduce roundoff errors; for
GCC 6 we might want to avoid those.
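
To make the double scaling concrete, here is a toy model of the arithmetic.
It is only a sketch and not GCC code: toy_add_stmt_cost, toy_peel_cost_before
and toy_peel_cost_after are hypothetical names, and the only assumption is
that the cost hook multiplies a statement count by the per-statement cost.

```c
#include <assert.h>

/* Hypothetical stand-in for a target cost hook: charge COUNT statements
   at STMT_COST each.  This is NOT the real add_stmt_cost.  */
unsigned
toy_add_stmt_cost (unsigned count, unsigned stmt_cost)
{
  return count * stmt_cost;
}

/* Before the patch: the iteration *cost* is passed where a statement
   count is expected, so STMT_COST is applied twice.  */
unsigned
toy_peel_cost_before (unsigned peel_iters, unsigned iter_cost,
                      unsigned stmt_cost)
{
  return toy_add_stmt_cost (peel_iters * iter_cost, stmt_cost);
}

/* After the patch: convert the iteration cost back to a statement count
   first.  The integer division truncates, which is the rounding error
   mentioned above.  */
unsigned
toy_peel_cost_after (unsigned peel_iters, unsigned iter_cost,
                     unsigned stmt_cost)
{
  assert (stmt_cost > 0);
  unsigned iter_stmts = iter_cost / stmt_cost;
  return toy_add_stmt_cost (peel_iters * iter_stmts, stmt_cost);
}
```

With a scalar stmt cost of 6, an iteration cost of 12 and 3 peeled
iterations, the "before" variant yields 216 while the "after" variant yields
36.  An iteration cost of 13 also yields 36 after the division, although the
exact value would be 39.  With a scalar stmt cost of 1 both variants agree.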

Richard.

2015-02-10  Richard Biener  <rguent...@suse.de>

        PR tree-optimization/64909
        * tree-vect-loop.c (vect_estimate_min_profitable_iters): Properly
        pass a scalar-stmt count estimate to the cost model.
        * tree-vect-data-refs.c (vect_peeling_hash_get_lowest_cost): Likewise.

        * gcc.dg/vect/costmodel/x86_64/costmodel-pr64909.c: New testcase.

Index: gcc/tree-vect-loop.c
===================================================================
--- gcc/tree-vect-loop.c        (revision 220540)
+++ gcc/tree-vect-loop.c        (working copy)
@@ -2834,6 +2834,11 @@ vect_estimate_min_profitable_iters (loop
      statements.  */
 
   scalar_single_iter_cost = vect_get_single_scalar_iteration_cost (loop_vinfo);
+  /* ???  Below we use this cost as number of stmts with scalar_stmt cost,
+     thus divide by that.  This introduces rounding errors, thus better
+     introduce a new cost kind (raw_cost?  scalar_iter_cost?). */
+  int scalar_single_iter_stmts
+    = scalar_single_iter_cost / vect_get_stmt_cost (scalar_stmt);
 
   /* Add additional cost for the peeled instructions in prologue and epilogue
      loop.
@@ -2868,10 +2873,10 @@ vect_estimate_min_profitable_iters (loop
       /* FORNOW: Don't attempt to pass individual scalar instructions to
         the model; just assume linear cost for scalar iterations.  */
       (void) add_stmt_cost (target_cost_data,
-                           peel_iters_prologue * scalar_single_iter_cost,
+                           peel_iters_prologue * scalar_single_iter_stmts,
                            scalar_stmt, NULL, 0, vect_prologue);
       (void) add_stmt_cost (target_cost_data, 
-                           peel_iters_epilogue * scalar_single_iter_cost,
+                           peel_iters_epilogue * scalar_single_iter_stmts,
                            scalar_stmt, NULL, 0, vect_epilogue);
     }
   else
@@ -2887,7 +2892,7 @@ vect_estimate_min_profitable_iters (loop
 
       (void) vect_get_known_peeling_cost (loop_vinfo, peel_iters_prologue,
                                          &peel_iters_epilogue,
-                                         scalar_single_iter_cost,
+                                         scalar_single_iter_stmts,
                                          &prologue_cost_vec,
                                          &epilogue_cost_vec);
 
Index: gcc/tree-vect-data-refs.c
===================================================================
--- gcc/tree-vect-data-refs.c   (revision 220540)
+++ gcc/tree-vect-data-refs.c   (working copy)
@@ -1184,10 +1206,13 @@ vect_peeling_hash_get_lowest_cost (_vect
     }
 
   single_iter_cost = vect_get_single_scalar_iteration_cost (loop_vinfo);
-  outside_cost += vect_get_known_peeling_cost (loop_vinfo, elem->npeel,
-                                              &dummy, single_iter_cost,
-                                              &prologue_cost_vec,
-                                              &epilogue_cost_vec);
+  outside_cost += vect_get_known_peeling_cost
+    (loop_vinfo, elem->npeel, &dummy,
+     /* ???  We use this cost as number of stmts with scalar_stmt cost,
+       thus divide by that.  This introduces rounding errors, thus better 
+       introduce a new cost kind (raw_cost?  scalar_iter_cost?). */
+     single_iter_cost / vect_get_stmt_cost (scalar_stmt),
+     &prologue_cost_vec, &epilogue_cost_vec);
 
   /* Prologue and epilogue costs are added to the target model later.
      These costs depend only on the scalar iteration cost, the
Index: gcc/testsuite/gcc.dg/vect/costmodel/x86_64/costmodel-pr64909.c
===================================================================
--- gcc/testsuite/gcc.dg/vect/costmodel/x86_64/costmodel-pr64909.c      (revision 0)
+++ gcc/testsuite/gcc.dg/vect/costmodel/x86_64/costmodel-pr64909.c      (working copy)
@@ -0,0 +1,15 @@
+/* { dg-do compile } */
+/* { dg-require-effective-target vect_int } */
+/* { dg-additional-options "-mtune=bdver1" } */
+
+unsigned short a[32];
+unsigned int b[32];
+void t()
+{
+  int i;
+  for (i=0;i<12;i++)
+    b[i]=a[i];
+}
+
+/* { dg-final { scan-tree-dump "vectorized 1 loops in function" "vect" } } */
+/* { dg-final { cleanup-tree-dump "vect" } } */
