>We'd also need to update the documentation:
>... The @samp{very-cheap} model only
>allows vectorization if the vector code would entirely replace the
>scalar code that is being vectorized. For example, if each iteration
>of a vectorized loop would only be able to handle exactly four iterations
>of the scalar loop, the @samp{very-cheap} model would only allow
>vectorization if the scalar iteration count is known to be a multiple
>of four.
Changed.
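For illustration only (this is not part of the patch, and the function name is made up), here is a minimal sketch of the kind of loop I expect the change to affect. The trip count n is unknown at compile time, so a vectorized version needs a scalar epilogue for the leftover iterations, which is the LOOP_VINFO_PEELING_FOR_NITER case. The restrict qualifiers are an assumption added so that no runtime alias check is needed.

```c
#include <stddef.h>

/* Hypothetical example, not taken from the patch or the testsuite.
   N is not a compile-time constant, so the vectorizer has to keep a
   scalar epilogue for the remaining n % VF iterations.  */
void
saxpy (float *restrict y, const float *restrict x, float a, size_t n)
{
  for (size_t i = 0; i < n; i++)
    y[i] += a * x[i];
}
```

Compiling this with something like "gcc -O2 -fvect-cost-model=very-cheap -fopt-info-vec" before and after the patch should show whether the loop gets vectorized. I have not included dump output here.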
>And since it's a change in documented behaviour, it should probably
>be in the release notes too.
I will submit a separate patch for the release notes once this lands on trunk.
Bootstrapped and regtested on x86_64-pc-linux-gnu{-m32,},
aarch64-unknown-linux-gnu{-m32,}.
Ok for trunk?
gcc/ChangeLog:
* tree-vect-loop.cc (vect_analyze_loop_costing): Enable
vectorization for LOOP_VINFO_PEELING_FOR_NITER in very cheap
cost model.
(vect_analyze_loop): Disable epilogue vectorization in very
cheap cost model.
* doc/invoke.texi: Adjust documentation for very-cheap cost model.
---
gcc/doc/invoke.texi | 11 ++++-------
gcc/tree-vect-loop.cc | 6 +++---
2 files changed, 7 insertions(+), 10 deletions(-)
diff --git a/gcc/doc/invoke.texi b/gcc/doc/invoke.texi
index b2f16b45eaf..edcadeb108a 100644
--- a/gcc/doc/invoke.texi
+++ b/gcc/doc/invoke.texi
@@ -14309,13 +14309,10 @@ counts that will likely execute faster than when executing the original
scalar loop. The @samp{cheap} model disables vectorization of
loops where doing so would be cost prohibitive for example due to
required runtime checks for data dependence or alignment but otherwise
-is equal to the @samp{dynamic} model. The @samp{very-cheap} model only
-allows vectorization if the vector code would entirely replace the
-scalar code that is being vectorized. For example, if each iteration
-of a vectorized loop would only be able to handle exactly four iterations
-of the scalar loop, the @samp{very-cheap} model would only allow
-vectorization if the scalar iteration count is known to be a multiple
-of four.
+is equal to the @samp{dynamic} model. The @samp{very-cheap} model disables
+vectorization of loops when any runtime check for data dependence or alignment
+is required, it also disables vectorization of epilogue loops but otherwise is
+equal to the @samp{cheap} model.
The default cost model depends on other optimization flags and is
either @samp{dynamic} or @samp{cheap}.
diff --git a/gcc/tree-vect-loop.cc b/gcc/tree-vect-loop.cc
index 6933f597b4d..a76d3b8ea5f 100644
--- a/gcc/tree-vect-loop.cc
+++ b/gcc/tree-vect-loop.cc
@@ -2375,8 +2375,7 @@ vect_analyze_loop_costing (loop_vec_info loop_vinfo,
a copy of the scalar code (even if we might be able to vectorize it). */
if (loop_cost_model (loop) == VECT_COST_MODEL_VERY_CHEAP
&& (LOOP_VINFO_PEELING_FOR_ALIGNMENT (loop_vinfo)
- || LOOP_VINFO_PEELING_FOR_GAPS (loop_vinfo)
- || LOOP_VINFO_PEELING_FOR_NITER (loop_vinfo)))
+ || LOOP_VINFO_PEELING_FOR_GAPS (loop_vinfo)))
{
if (dump_enabled_p ())
dump_printf_loc (MSG_MISSED_OPTIMIZATION, vect_location,
@@ -3681,7 +3680,8 @@ vect_analyze_loop (class loop *loop, gimple *loop_vectorized_call,
/* No code motion support for multiple epilogues so for now
not supported when multiple exits. */
&& !LOOP_VINFO_EARLY_BREAKS (first_loop_vinfo)
- && !loop->simduid);
+ && !loop->simduid
+ && loop_cost_model (loop) > VECT_COST_MODEL_VERY_CHEAP);
if (!vect_epilogues)
return first_loop_vinfo;
--
2.31.1