On Wed, Oct 9, 2024 at 3:27 AM liuhongt <[email protected]> wrote:
>
> >We'd also need to update the documentation:
>
> >... The @samp{very-cheap} model only
> >allows vectorization if the vector code would entirely replace the
> >scalar code that is being vectorized. For example, if each iteration
> >of a vectorized loop would only be able to handle exactly four iterations
> >of the scalar loop, the @samp{very-cheap} model would only allow
> >vectorization if the scalar iteration count is known to be a multiple
> >of four.
> Changed.
>
> >And since it's a change in documented behaviour, it should probably
> >be in the release notes too.
>
> Will submit another patch for that when it lands on trunk.
>
> Bootstrapped and regtested on x86_64-pc-linux-gnu{-m32,},
> aarch64-unknown-linux-gnu{-m32,}.
>
> Ok for trunk?
OK.
Richard.
> gcc/ChangeLog:
>
> * tree-vect-loop.cc (vect_analyze_loop_costing): Enable
> vectorization for LOOP_VINFO_PEELING_FOR_NITER in very cheap
> cost model.
> (vect_analyze_loop): Disable epilogue vectorization in very
> cheap cost model.
> * doc/invoke.texi: Adjust documentation for very-cheap cost model.
> ---
> gcc/doc/invoke.texi | 11 ++++-------
> gcc/tree-vect-loop.cc | 6 +++---
> 2 files changed, 7 insertions(+), 10 deletions(-)
>
> diff --git a/gcc/doc/invoke.texi b/gcc/doc/invoke.texi
> index b2f16b45eaf..edcadeb108a 100644
> --- a/gcc/doc/invoke.texi
> +++ b/gcc/doc/invoke.texi
> @@ -14309,13 +14309,10 @@ counts that will likely execute faster than when executing the original
> scalar loop. The @samp{cheap} model disables vectorization of
> loops where doing so would be cost prohibitive for example due to
> required runtime checks for data dependence or alignment but otherwise
> -is equal to the @samp{dynamic} model. The @samp{very-cheap} model only
> -allows vectorization if the vector code would entirely replace the
> -scalar code that is being vectorized. For example, if each iteration
> -of a vectorized loop would only be able to handle exactly four iterations
> -of the scalar loop, the @samp{very-cheap} model would only allow
> -vectorization if the scalar iteration count is known to be a multiple
> -of four.
> +is equal to the @samp{dynamic} model. The @samp{very-cheap} model disables
> +vectorization of loops when any runtime check for data dependence or alignment
> +is required, it also disables vectorization of epilogue loops but otherwise is
> +equal to the @samp{cheap} model.
>
> The default cost model depends on other optimization flags and is
> either @samp{dynamic} or @samp{cheap}.
> diff --git a/gcc/tree-vect-loop.cc b/gcc/tree-vect-loop.cc
> index 6933f597b4d..a76d3b8ea5f 100644
> --- a/gcc/tree-vect-loop.cc
> +++ b/gcc/tree-vect-loop.cc
> @@ -2375,8 +2375,7 @@ vect_analyze_loop_costing (loop_vec_info loop_vinfo,
> a copy of the scalar code (even if we might be able to vectorize it). */
> if (loop_cost_model (loop) == VECT_COST_MODEL_VERY_CHEAP
> && (LOOP_VINFO_PEELING_FOR_ALIGNMENT (loop_vinfo)
> - || LOOP_VINFO_PEELING_FOR_GAPS (loop_vinfo)
> - || LOOP_VINFO_PEELING_FOR_NITER (loop_vinfo)))
> + || LOOP_VINFO_PEELING_FOR_GAPS (loop_vinfo)))
> {
> if (dump_enabled_p ())
> dump_printf_loc (MSG_MISSED_OPTIMIZATION, vect_location,
> @@ -3681,7 +3680,8 @@ vect_analyze_loop (class loop *loop, gimple *loop_vectorized_call,
> /* No code motion support for multiple epilogues so for now
> not supported when multiple exits. */
> && !LOOP_VINFO_EARLY_BREAKS (first_loop_vinfo)
> - && !loop->simduid);
> + && !loop->simduid
> + && loop_cost_model (loop) > VECT_COST_MODEL_VERY_CHEAP);
> if (!vect_epilogues)
> return first_loop_vinfo;
>
> --
> 2.31.1
>
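
For illustration only (this is not part of the patch): a minimal sketch of
the kind of loop the documentation change is about. The function name
add_arrays is hypothetical. The iteration count n is only known at run
time, so it is not known to be a multiple of the vector width and the
vectorizer would need LOOP_VINFO_PEELING_FOR_NITER, i.e. a scalar loop to
handle the leftover iterations. Under the old very-cheap wording that
ruled vectorization out. With the change above the peeling requirement no
longer does so by itself, although the leftover iterations would stay
scalar because epilogue vectorization is disabled. The restrict qualifiers
are there so that no runtime alias check is needed, which very-cheap still
disallows. Whether a given target actually vectorizes this depends on the
remaining cost checks.

```c
#include <stddef.h>

/* Hypothetical example, not taken from the patch or the testsuite.
   n is a runtime value, so it need not be a multiple of the number of
   elements a vector iteration would handle.  */
void
add_arrays (int *restrict out, const int *restrict a,
            const int *restrict b, size_t n)
{
  for (size_t i = 0; i < n; i++)
    out[i] = a[i] + b[i];
}
```

Compiling with something like "gcc -O2 -fvect-cost-model=very-cheap
-fopt-info-vec" should report whether the loop was vectorized on the
target at hand.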