Richard Biener <[email protected]> writes:
> With recent SLP vectorization patches I see RISC-V divison by zero
> for gfortran.dg/matmul_10.f90 and others in get_group_load_store_type
> which does
>
> && can_div_trunc_p (group_size
> * LOOP_VINFO_VECT_FACTOR (loop_vinfo) - gap,
> nunits, &tem, &remain)
> && (known_eq (remain, 0u)
> || (constant_multiple_p (nunits, remain, &num)
> && (vector_vector_composition_type (vectype, num,
> &half_vtype)
> != NULL_TREE))))
> overrun_p = false;
>
> where for [2, 2] / [0, 2] the condition doesn't reflect what we
> are trying to test - that, when remain is zero or, when non-zero,
> nunits is a multiple of remain, we can avoid touching a gap via
> loading smaller pieces and vector composition. Changing known_eq
> to maybe_eq wouldn't be correct I think since for example if it
> were [0, 4] then we cannot load smaller parts. Note we know
> that remain is in [0, nunits - 1] given we compute it as
> X % nunits so maybe we can indeed use maybe_eq (remain, 0u) here
> in this particular case?
I don't think that would be safe. In general, any case that can
happen for VLA is a case that could happen for VLS, if the compiler
had more information. So I don't think it's safe to ignore nonzero
remainders for VLA if we wouldn't for VLS. (E.g. in principle, that
range would allow nunits - 2 even for VLA, which might be zero for
the lowest vector length, but nonzero for others.)
> The following adds constant_multiple_if_divisor_nonzero_p to
> express the intent of no interest in the 'multiple' value for
> the case B is zero.
>
> Bootstrapped and tested on x86_64-unknown-linux-gnu.
>
> Is the reasoning that maybe_eq is "safe" sound in this particular
> case? I think we should avoid adding
> constant_multiple_if_divisor_nonzero_p if not absolutely necessary.
Yeah, seeing it written out, maybe it is a bit too confusing. And when
we talked about this on irc yesterday, I was assuming that we could cope
with either possiblity (zero remainder or extra vectors) dynamically.
I now realise that we need to distinguish between them statically instead.
On that basis, maybe we should just protect the constant_multiple_p test
with known_ne (remainder, 0).
Thanks, and sorry for suggesting a dead end.
Richard
>
> Thanks,
> Richard.
>
> * poly-int.h (constant_multiple_if_divisor_nonzero_p): New.
> * tree-vect-stmts.cc (get_group_load_store_type): Use it.
> ---
> gcc/poly-int.h | 45 ++++++++++++++++++++++++++++++++++++++++++
> gcc/tree-vect-stmts.cc | 3 ++-
> 2 files changed, 47 insertions(+), 1 deletion(-)
>
> diff --git a/gcc/poly-int.h b/gcc/poly-int.h
> index 94708165961..29337513eb8 100644
> --- a/gcc/poly-int.h
> +++ b/gcc/poly-int.h
> @@ -1986,6 +1986,51 @@ constant_multiple_p (const poly_int<N, Ca> &a,
> return true;
> }
>
> +/* Return true if A is a constant multiple of B, storing the
> + multiple in *MULTIPLE if so. The case of B being zero is
> + excluded but it shall not be known equal to zero. */
> +
> +template<unsigned int N, typename Ca, typename Cb, typename Cm>
> +inline bool
> +constant_multiple_if_divisor_nonzero_p (const poly_int<N, Ca> &a,
> + const poly_int<N, Cb> &b, Cm *multiple)
> +{
> + typedef POLY_CAST (Ca, Cb) NCa;
> + typedef POLY_CAST (Cb, Ca) NCb;
> + typedef POLY_INT_TYPE (Ca) ICa;
> + typedef POLY_INT_TYPE (Cb) ICb;
> + typedef POLY_BINARY_COEFF (Ca, Cb) C;
> +
> + C r = 0;
> + if (b.coeffs[0] != ICb (0))
> + {
> + if (NCa (a.coeffs[0]) % NCb (b.coeffs[0]) != 0)
> + return false;
> + r = NCa (a.coeffs[0]) / NCb (b.coeffs[0]);
> + }
> + for (unsigned int i = 1; i < N; ++i)
> + if (b.coeffs[i] == ICb (0))
> + {
> + if (a.coeffs[i] != ICa (0))
> + return false;
> + }
> + else
> + {
> + if (NCa (a.coeffs[i]) % NCb (b.coeffs[i]) != 0)
> + return false;
> + if (r == 0)
> + r = NCa (a.coeffs[i]) / NCb (b.coeffs[i]);
> + else if (NCa (a.coeffs[i]) / NCb (b.coeffs[i]) != r)
> + return false;
> + }
> +
> + if (r == 0)
> + return false;
> +
> + *multiple = r;
> + return true;
> +}
> +
> /* Return true if A is a constant multiple of B. */
>
> template<unsigned int N, typename Ca, typename Cb>
> diff --git a/gcc/tree-vect-stmts.cc b/gcc/tree-vect-stmts.cc
> index f8bb637342e..2641833b3aa 100644
> --- a/gcc/tree-vect-stmts.cc
> +++ b/gcc/tree-vect-stmts.cc
> @@ -2135,7 +2135,8 @@ get_group_load_store_type (vec_info *vinfo,
> stmt_vec_info stmt_info,
> * LOOP_VINFO_VECT_FACTOR (loop_vinfo) - gap,
> nunits, &tem, &remain)
> && (known_eq (remain, 0u)
> - || (constant_multiple_p (nunits, remain, &num)
> + || (constant_multiple_if_divisor_nonzero_p (nunits, remain,
> + &num)
> && (vector_vector_composition_type (vectype, num,
> &half_vtype)
> != NULL_TREE))))