This adds a --param to allow disabling of vectorization of
floating point inductions.  On top of -Ofast this should allow
549.fotonik3d_r to not miscompare.
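
As a rough illustration of why vectorizing a floating point induction
can change results, the sketch below compares the scalar recurrence
from the testcase with a hand-written 4-lane model of a vectorized
induction.  The names fill_lanes, count_mismatches, VF and MAX_N are
made up for this example, and the lane model is an assumption about
the general shape of the transform, not the code GCC emits.

```c
#include <assert.h>

#define VF 4        /* hypothetical vectorization factor */
#define MAX_N 1024  /* hypothetical upper bound for the scratch arrays */

/* Scalar reference: one rounded addition per iteration.  */
static void
fill_scalar (float *a, float f, float s, int n)
{
  for (int i = 0; i < n; ++i)
    {
      a[i] = f;
      f += s;
    }
}

/* Model of a vectorized induction: lane k starts at f + k*s and every
   lane advances by VF*s per vector iteration.  The additions are
   reassociated relative to fill_scalar, so rounding can differ.  */
static void
fill_lanes (float *a, float f, float s, int n)
{
  float lane[VF];
  float step = VF * s;

  for (int k = 0; k < VF; ++k)
    lane[k] = f + k * s;

  for (int i = 0; i < n; i += VF)
    for (int k = 0; k < VF && i + k < n; ++k)
      {
        a[i + k] = lane[k];
        lane[k] += step;
      }
}

/* Count the elements where the two variants are not equal.  */
static int
count_mismatches (float f, float s, int n)
{
  static float a[MAX_N], b[MAX_N];
  int mismatches = 0;

  assert (n >= 0 && n <= MAX_N);
  fill_scalar (a, f, s, n);
  fill_lanes (b, f, s, n);
  for (int i = 0; i < n; ++i)
    if (a[i] != b[i])
      ++mismatches;
  return mismatches;
}
```

With a start and step whose partial sums are all exactly representable,
for example count_mismatches (1.0f, 0.5f, 64), both variants agree.
With a step such as 0.1f the two orders of addition may round
differently, which is the kind of drift the new --param lets users
avoid.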

While I thought of a more elaborate way of disabling certain
vectorization kinds (reductions also came to mind), this
for now simply uses a --param rather than some sophisticated
-fvectorize-* scheme.

Bootstrapped and tested on x86_64-unknown-linux-gnu.  I've
verified that 549.fotonik3d_r miscompares with -Ofast -march=znver2
and passes when adding --param vect-induction-float=0 which
should be valid at least for peak (but I guess also base for
FOPTIMIZE for example).  I did not benchmark against other
workarounds (it has been said -fno-unsafe-math-optimizations
or other similar things work as well).

OK for trunk?

Thanks,
Richard.

2022-03-08  Richard Biener  <rguent...@suse.de>

        PR tree-optimization/84201
        * params.opt (-param=vect-induction-float): Add.
        * doc/invoke.texi (vect-induction-float): Document.
        * tree-vect-loop.cc (vectorizable_induction): Honor
        param_vect_induction_float.

        * gcc.dg/vect/pr84201.c: New testcase.
---
 gcc/doc/invoke.texi                 |  3 +++
 gcc/params.opt                      |  4 ++++
 gcc/testsuite/gcc.dg/vect/pr84201.c | 22 ++++++++++++++++++++++
 gcc/tree-vect-loop.cc               |  8 ++++++++
 4 files changed, 37 insertions(+)
 create mode 100644 gcc/testsuite/gcc.dg/vect/pr84201.c

diff --git a/gcc/doc/invoke.texi b/gcc/doc/invoke.texi
index b01ffab566a..a0fa5e1cf43 100644
--- a/gcc/doc/invoke.texi
+++ b/gcc/doc/invoke.texi
@@ -14989,6 +14989,9 @@ in an inner loop relative to the loop being vectorized.  The factor applied
 is the maximum of the estimated number of iterations of the inner loop and
 this parameter.  The default value of this parameter is 50.
 
+@item vect-induction-float
+Enable loop vectorization of floating point inductions.
+
 @item avoid-fma-max-bits
 Maximum number of bits for which we avoid creating FMAs.
 
diff --git a/gcc/params.opt b/gcc/params.opt
index f76f7839916..9561aa61a50 100644
--- a/gcc/params.opt
+++ b/gcc/params.opt
@@ -1176,6 +1176,10 @@ Controls how loop vectorizer uses partial vectors.  0 means never, 1 means only
 Common Joined UInteger Var(param_vect_inner_loop_cost_factor) Init(50) IntegerRange(1, 10000) Param Optimization
 The maximum factor which the loop vectorizer applies to the cost of statements in an inner loop relative to the loop being vectorized.
 
+-param=vect-induction-float=
+Common Joined UInteger Var(param_vect_induction_float) Init(1) IntegerRange(0, 1) Param Optimization
+Enable loop vectorization of floating point inductions.
+
 -param=vrp1-mode=
 Common Joined Var(param_vrp1_mode) Enum(vrp_mode) Init(VRP_MODE_VRP) Param Optimization
 --param=vrp1-mode=[vrp|ranger] Specifies the mode VRP1 should operate in.
diff --git a/gcc/testsuite/gcc.dg/vect/pr84201.c b/gcc/testsuite/gcc.dg/vect/pr84201.c
new file mode 100644
index 00000000000..1cc6d1ff13c
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/vect/pr84201.c
@@ -0,0 +1,22 @@
+/* { dg-do compile } */
+/* { dg-additional-options "-Ofast --param vect-induction-float=0" } */
+
+void foo (float *a, float f, float s, int n)
+{
+  for (int i = 0; i < n; ++i)
+    {
+      a[i] = f;
+      f += s;
+    }
+}
+
+void bar (double *a, double f, double s, int n)
+{
+  for (int i = 0; i < n; ++i)
+    {
+      a[i] = f;
+      f += s;
+    }
+}
+
+/* { dg-final { scan-tree-dump-times "vectorized 0 loops" 2 "vect" } } */
diff --git a/gcc/tree-vect-loop.cc b/gcc/tree-vect-loop.cc
index 1f30fc82ca1..7fcec12a3e9 100644
--- a/gcc/tree-vect-loop.cc
+++ b/gcc/tree-vect-loop.cc
@@ -8175,6 +8175,14 @@ vectorizable_induction (loop_vec_info loop_vinfo,
       return false;
     }
 
+  if (FLOAT_TYPE_P (vectype) && !param_vect_induction_float)
+    {
+      if (dump_enabled_p ())
+       dump_printf_loc (MSG_MISSED_OPTIMIZATION, vect_location,
+                        "floating point induction vectorization disabled\n");
+      return false;
+    }
+
   step_expr = STMT_VINFO_LOOP_PHI_EVOLUTION_PART (stmt_info);
   gcc_assert (step_expr != NULL_TREE);
   tree step_vectype = get_same_sized_vectype (TREE_TYPE (step_expr), vectype);
-- 
2.34.1
