Richard Biener <rguent...@suse.de> writes:
> The following removes an optimization that wrongly triggers right now
> because it accesses LOOP_VINFO_COST_MODEL_THRESHOLD which might not be
> computed yet.
>
> Testing on x86_64 didn't reveal any testsuite coverage.
>
> Bootstrapped and tested on x86_64-unknown-linux-gnu.
>
> OK?
>
>       PR tree-optimization/120939
>       * tree-vect-loop.cc (vect_need_peeling_or_partial_vectors_p):
>       Remove eliding an epilogue based on not computed
>       LOOP_VINFO_COST_MODEL_THRESHOLD.

This regresses:

FAIL: gcc.dg/torture/pr113026-1.c   -O3 -fomit-frame-pointer -funroll-loops 
-fpeel-loops -ftracer -finline-functions   (test for bogus messages, line 10)

on aarch64-linux-gnu, with:

.../pr113026-1.c:10:12: warning: writing 1 byte into a region of size 0 
[-Wstringop-overflow=]
.../pr113026-1.c:4:6: note: at offset 16 into destination object 'dst' of size 
16

I haven't looked into why yet, but it does seem superficially similar
to PR60505, which was what this code seems to have been added to fix
(g:090cd8dc70b80183c83d9f43f1e6ab9970481efd).

Thanks,
Richard

> ---
>  gcc/tree-vect-loop.cc | 21 ++-------------------
>  1 file changed, 2 insertions(+), 19 deletions(-)
>
> diff --git a/gcc/tree-vect-loop.cc b/gcc/tree-vect-loop.cc
> index 46a6399243d..7ac61d4dce2 100644
> --- a/gcc/tree-vect-loop.cc
> +++ b/gcc/tree-vect-loop.cc
> @@ -1224,13 +1224,6 @@ static bool
>  vect_need_peeling_or_partial_vectors_p (loop_vec_info loop_vinfo)
>  {
>    unsigned HOST_WIDE_INT const_vf;
> -  HOST_WIDE_INT max_niter
> -    = likely_max_stmt_executions_int (LOOP_VINFO_LOOP (loop_vinfo));
> -
> -  unsigned th = LOOP_VINFO_COST_MODEL_THRESHOLD (loop_vinfo);
> -  if (!th && LOOP_VINFO_ORIG_LOOP_INFO (loop_vinfo))
> -    th = LOOP_VINFO_COST_MODEL_THRESHOLD (LOOP_VINFO_ORIG_LOOP_INFO
> -                                       (loop_vinfo));
>  
>    if (LOOP_VINFO_NITERS_KNOWN_P (loop_vinfo)
>        && LOOP_VINFO_PEELING_FOR_ALIGNMENT (loop_vinfo) >= 0)
> @@ -1250,18 +1243,8 @@ vect_need_peeling_or_partial_vectors_p (loop_vec_info 
> loop_vinfo)
>        VF * N + 1.  That's something of a niche case though.  */
>        || LOOP_VINFO_PEELING_FOR_GAPS (loop_vinfo)
>        || !LOOP_VINFO_VECT_FACTOR (loop_vinfo).is_constant (&const_vf)
> -      || ((tree_ctz (LOOP_VINFO_NITERS (loop_vinfo))
> -        < (unsigned) exact_log2 (const_vf))
> -       /* In case of versioning, check if the maximum number of
> -          iterations is greater than th.  If they are identical,
> -          the epilogue is unnecessary.  */
> -       && (!LOOP_REQUIRES_VERSIONING (loop_vinfo)
> -           || ((unsigned HOST_WIDE_INT) max_niter
> -               /* We'd like to use LOOP_VINFO_VERSIONING_THRESHOLD
> -                  but that's only computed later based on our result.
> -                  The following is the most conservative approximation.  */
> -               > (std::max ((unsigned HOST_WIDE_INT) th,
> -                            const_vf) / const_vf) * const_vf))))
> +      || (tree_ctz (LOOP_VINFO_NITERS (loop_vinfo))
> +       < (unsigned) exact_log2 (const_vf)))
>      return true;
>  
>    return false;

Reply via email to