Richard Biener <rguent...@suse.de> writes:
> With recent SLP vectorization patches I see RISC-V divison by zero
> for gfortran.dg/matmul_10.f90 and others in get_group_load_store_type
> which does
>
>               && can_div_trunc_p (group_size
>                                   * LOOP_VINFO_VECT_FACTOR (loop_vinfo) - gap,
>                                   nunits, &tem, &remain)
>               && (known_eq (remain, 0u)
>                   || (constant_multiple_p (nunits, remain, &num)
>                       && (vector_vector_composition_type (vectype, num,
>                                                           &half_vtype)
>                           != NULL_TREE))))
>             overrun_p = false;
>
> where for [2, 2] / [0, 2] the condition doesn't reflect what we
> are trying to test - that, when remain is zero or, when non-zero,
> nunits is a multiple of remain, we can avoid touching a gap via
> loading smaller pieces and vector composition.  Changing known_eq
> to maybe_eq wouldn't be correct I think since for example if it
> were [0, 4] then we cannot load smaller parts.  Note we know
> that remain is in [0, nunits - 1] given we compute it as
> X % nunits so maybe we can indeed use maybe_eq (remain, 0u) here
> in this particular case?

I don't think that would be safe.  In general, any case that can
happen for VLA is a case that could happen for VLS, if the compiler
had more information.  So I don't think it's safe to ignore nonzero
remainders for VLA if we wouldn't for VLS.  (E.g. in principle, that
range would allow nunits - 2 even for VLA, which might be zero for
the lowest vector length, but nonzero for others.)

> The following adds constant_multiple_if_divisor_nonzero_p to
> express the intent of no interest in the 'multiple' value for
> the case B is zero.
>
> Bootstrapped and tested on x86_64-unknown-linux-gnu.
>
> Is the reasoning that maybe_eq is "safe" sound in this particular
> case?  I think we should avoid adding 
> constant_multiple_if_divisor_nonzero_p if not absolutely necessary.

Yeah, seeing it written out, maybe it is a bit too confusing.  And when
we talked about this on irc yesterday, I was assuming that we could cope
with either possiblity (zero remainder or extra vectors) dynamically.
I now realise that we need to distinguish between them statically instead.

On that basis, maybe we should just protect the constant_multiple_p test
with known_ne (remainder, 0).

Thanks, and sorry for suggesting a dead end.

Richard

>
> Thanks,
> Richard.
>
>       * poly-int.h (constant_multiple_if_divisor_nonzero_p): New.
>       * tree-vect-stmts.cc (get_group_load_store_type): Use it.
> ---
>  gcc/poly-int.h         | 45 ++++++++++++++++++++++++++++++++++++++++++
>  gcc/tree-vect-stmts.cc |  3 ++-
>  2 files changed, 47 insertions(+), 1 deletion(-)
>
> diff --git a/gcc/poly-int.h b/gcc/poly-int.h
> index 94708165961..29337513eb8 100644
> --- a/gcc/poly-int.h
> +++ b/gcc/poly-int.h
> @@ -1986,6 +1986,51 @@ constant_multiple_p (const poly_int<N, Ca> &a,
>    return true;
>  }
>  
> +/* Return true if A is a constant multiple of B, storing the
> +   multiple in *MULTIPLE if so.  The case of B being zero is
> +   excluded but it shall not be known equal to zero.  */
> +
> +template<unsigned int N, typename Ca, typename Cb, typename Cm>
> +inline bool
> +constant_multiple_if_divisor_nonzero_p (const poly_int<N, Ca> &a,
> +                                     const poly_int<N, Cb> &b, Cm *multiple)
> +{
> +  typedef POLY_CAST (Ca, Cb) NCa;
> +  typedef POLY_CAST (Cb, Ca) NCb;
> +  typedef POLY_INT_TYPE (Ca) ICa;
> +  typedef POLY_INT_TYPE (Cb) ICb;
> +  typedef POLY_BINARY_COEFF (Ca, Cb) C;
> +
> +  C r = 0;
> +  if (b.coeffs[0] != ICb (0))
> +    {
> +      if (NCa (a.coeffs[0]) % NCb (b.coeffs[0]) != 0)
> +     return false;
> +      r = NCa (a.coeffs[0]) / NCb (b.coeffs[0]);
> +    }
> +  for (unsigned int i = 1; i < N; ++i)
> +    if (b.coeffs[i] == ICb (0))
> +      {
> +     if (a.coeffs[i] != ICa (0))
> +       return false;
> +      }
> +    else
> +      {
> +     if (NCa (a.coeffs[i]) % NCb (b.coeffs[i]) != 0)
> +       return false;
> +     if (r == 0)
> +       r = NCa (a.coeffs[i]) / NCb (b.coeffs[i]);
> +     else if (NCa (a.coeffs[i]) / NCb (b.coeffs[i]) != r)
> +       return false;
> +      }
> +
> +  if (r == 0)
> +    return false;
> +
> +  *multiple = r;
> +  return true;
> +}
> +
>  /* Return true if A is a constant multiple of B.  */
>  
>  template<unsigned int N, typename Ca, typename Cb>
> diff --git a/gcc/tree-vect-stmts.cc b/gcc/tree-vect-stmts.cc
> index f8bb637342e..2641833b3aa 100644
> --- a/gcc/tree-vect-stmts.cc
> +++ b/gcc/tree-vect-stmts.cc
> @@ -2135,7 +2135,8 @@ get_group_load_store_type (vec_info *vinfo, 
> stmt_vec_info stmt_info,
>                                 * LOOP_VINFO_VECT_FACTOR (loop_vinfo) - gap,
>                                 nunits, &tem, &remain)
>             && (known_eq (remain, 0u)
> -               || (constant_multiple_p (nunits, remain, &num)
> +               || (constant_multiple_if_divisor_nonzero_p (nunits, remain,
> +                                                           &num)
>                     && (vector_vector_composition_type (vectype, num,
>                                                         &half_vtype)
>                         != NULL_TREE))))

Reply via email to