On Fri, 30 Aug 2024, Richard Sandiford wrote: > Richard Biener <rguent...@suse.de> writes: > > With recent SLP vectorization patches I see RISC-V divison by zero > > for gfortran.dg/matmul_10.f90 and others in get_group_load_store_type > > which does > > > > && can_div_trunc_p (group_size > > * LOOP_VINFO_VECT_FACTOR (loop_vinfo) - > > gap, > > nunits, &tem, &remain) > > && (known_eq (remain, 0u) > > || (constant_multiple_p (nunits, remain, &num) > > && (vector_vector_composition_type (vectype, num, > > &half_vtype) > > != NULL_TREE)))) > > overrun_p = false; > > > > where for [2, 2] / [0, 2] the condition doesn't reflect what we > > are trying to test - that, when remain is zero or, when non-zero, > > nunits is a multiple of remain, we can avoid touching a gap via > > loading smaller pieces and vector composition. Changing known_eq > > to maybe_eq wouldn't be correct I think since for example if it > > were [0, 4] then we cannot load smaller parts. Note we know > > that remain is in [0, nunits - 1] given we compute it as > > X % nunits so maybe we can indeed use maybe_eq (remain, 0u) here > > in this particular case? > > I don't think that would be safe. In general, any case that can > happen for VLA is a case that could happen for VLS, if the compiler > had more information. So I don't think it's safe to ignore nonzero > remainders for VLA if we wouldn't for VLS. (E.g. in principle, that > range would allow nunits - 2 even for VLA, which might be zero for > the lowest vector length, but nonzero for others.) > > > The following adds constant_multiple_if_divisor_nonzero_p to > > express the intent of no interest in the 'multiple' value for > > the case B is zero. > > > > Bootstrapped and tested on x86_64-unknown-linux-gnu. > > > > Is the reasoning that maybe_eq is "safe" sound in this particular > > case? I think we should avoid adding > > constant_multiple_if_divisor_nonzero_p if not absolutely necessary. > > Yeah, seeing it written out, maybe it is a bit too confusing. And when > we talked about this on irc yesterday, I was assuming that we could cope > with either possiblity (zero remainder or extra vectors) dynamically. > I now realise that we need to distinguish between them statically instead. > > On that basis, maybe we should just protect the constant_multiple_p test > with known_ne (remainder, 0).
Ah, OK. Yes, doing && (known_eq (remain, 0u) || (known_ne (remain, 0u) && constant_multiple_p (nunits, remain, &num) && (vector_vector_composition_type (vectype, num, &half_vtype) != NULL_TREE)))) overrun_p = false; is obviously safe as well and it resolves the two cases I was seeing. I think for the other latent RISC-V issue we can then sync up the vectorizable_load case and add the corresponding known_ne there, too, possibly fixing that issue as well. > Thanks, and sorry for suggesting a dead end. No worries. Thanks, Richard. > Richard > > > > > Thanks, > > Richard. > > > > * poly-int.h (constant_multiple_if_divisor_nonzero_p): New. > > * tree-vect-stmts.cc (get_group_load_store_type): Use it. > > --- > > gcc/poly-int.h | 45 ++++++++++++++++++++++++++++++++++++++++++ > > gcc/tree-vect-stmts.cc | 3 ++- > > 2 files changed, 47 insertions(+), 1 deletion(-) > > > > diff --git a/gcc/poly-int.h b/gcc/poly-int.h > > index 94708165961..29337513eb8 100644 > > --- a/gcc/poly-int.h > > +++ b/gcc/poly-int.h > > @@ -1986,6 +1986,51 @@ constant_multiple_p (const poly_int<N, Ca> &a, > > return true; > > } > > > > +/* Return true if A is a constant multiple of B, storing the > > + multiple in *MULTIPLE if so. The case of B being zero is > > + excluded but it shall not be known equal to zero. */ > > + > > +template<unsigned int N, typename Ca, typename Cb, typename Cm> > > +inline bool > > +constant_multiple_if_divisor_nonzero_p (const poly_int<N, Ca> &a, > > + const poly_int<N, Cb> &b, Cm *multiple) > > +{ > > + typedef POLY_CAST (Ca, Cb) NCa; > > + typedef POLY_CAST (Cb, Ca) NCb; > > + typedef POLY_INT_TYPE (Ca) ICa; > > + typedef POLY_INT_TYPE (Cb) ICb; > > + typedef POLY_BINARY_COEFF (Ca, Cb) C; > > + > > + C r = 0; > > + if (b.coeffs[0] != ICb (0)) > > + { > > + if (NCa (a.coeffs[0]) % NCb (b.coeffs[0]) != 0) > > + return false; > > + r = NCa (a.coeffs[0]) / NCb (b.coeffs[0]); > > + } > > + for (unsigned int i = 1; i < N; ++i) > > + if (b.coeffs[i] == ICb (0)) > > + { > > + if (a.coeffs[i] != ICa (0)) > > + return false; > > + } > > + else > > + { > > + if (NCa (a.coeffs[i]) % NCb (b.coeffs[i]) != 0) > > + return false; > > + if (r == 0) > > + r = NCa (a.coeffs[i]) / NCb (b.coeffs[i]); > > + else if (NCa (a.coeffs[i]) / NCb (b.coeffs[i]) != r) > > + return false; > > + } > > + > > + if (r == 0) > > + return false; > > + > > + *multiple = r; > > + return true; > > +} > > + > > /* Return true if A is a constant multiple of B. */ > > > > template<unsigned int N, typename Ca, typename Cb> > > diff --git a/gcc/tree-vect-stmts.cc b/gcc/tree-vect-stmts.cc > > index f8bb637342e..2641833b3aa 100644 > > --- a/gcc/tree-vect-stmts.cc > > +++ b/gcc/tree-vect-stmts.cc > > @@ -2135,7 +2135,8 @@ get_group_load_store_type (vec_info *vinfo, > > stmt_vec_info stmt_info, > > * LOOP_VINFO_VECT_FACTOR (loop_vinfo) - gap, > > nunits, &tem, &remain) > > && (known_eq (remain, 0u) > > - || (constant_multiple_p (nunits, remain, &num) > > + || (constant_multiple_if_divisor_nonzero_p (nunits, remain, > > + &num) > > && (vector_vector_composition_type (vectype, num, > > &half_vtype) > > != NULL_TREE)))) > -- Richard Biener <rguent...@suse.de> SUSE Software Solutions Germany GmbH, Frankenstrasse 146, 90461 Nuernberg, Germany; GF: Ivo Totev, Andrew McDonald, Werner Knoblich; (HRB 36809, AG Nuernberg)